J Korean Med Sci. 2021 Dec 27; 36(50)


Formulating Hypotheses for Different Study Designs

Durga Prasanna Misra

1 Department of Clinical Immunology and Rheumatology, Sanjay Gandhi Postgraduate Institute of Medical Sciences, Lucknow, India.

Armen Yuri Gasparyan

2 Departments of Rheumatology and Research and Development, Dudley Group NHS Foundation Trust (Teaching Trust of the University of Birmingham, UK), Russells Hall Hospital, Dudley, UK.

Olena Zimba

3 Department of Internal Medicine #2, Danylo Halytsky Lviv National Medical University, Lviv, Ukraine.

Marlen Yessirkepov

4 Department of Biology and Biochemistry, South Kazakhstan Medical Academy, Shymkent, Kazakhstan.

Vikas Agarwal

George D. Kitas

5 Centre for Epidemiology versus Arthritis, University of Manchester, Manchester, UK.

Generating a testable working hypothesis is the first step towards conducting original research. Such research may prove or disprove the proposed hypothesis. Case reports, case series, online surveys and other observational studies, clinical trials, and narrative reviews help to generate hypotheses. Observational and interventional studies help to test hypotheses. A good hypothesis is usually based on previous evidence-based reports. Hypotheses without evidence-based justification and a priori ideas are not received favourably by the scientific community. Original research to test a hypothesis should be carefully planned to ensure appropriate methodology and adequate statistical power. While hypotheses can challenge conventional thinking and may be controversial, they should not be destructive. A hypothesis should be tested by ethically sound experiments with meaningful ethical and clinical implications. The coronavirus disease 2019 pandemic has brought into sharp focus numerous hypotheses, some of which were proven (e.g. effectiveness of corticosteroids in those with hypoxia) while others were disproven (e.g. effectiveness of hydroxychloroquine and ivermectin).


DEFINING WORKING AND STANDALONE SCIENTIFIC HYPOTHESES

Science is the systematized description of natural truths and facts. Routine observations of existing life phenomena lead to creative thinking and the generation of ideas about mechanisms of such phenomena and related human interventions. Such ideas presented in a structured format can be viewed as hypotheses. After generating a hypothesis, it is necessary to test it to prove its validity. Thus, a hypothesis can be defined as a proposed mechanism of a naturally occurring event or a proposed outcome of an intervention. 1 , 2

Hypothesis testing requires choosing the most appropriate methodology and giving the study adequate statistical power to “prove” or “disprove” the hypothesis within predetermined and widely accepted levels of certainty. This entails sample size calculation that often takes into account previously published observations and pilot studies. 2 , 3 In the era of digitization, hypothesis generation and testing may benefit from the availability of numerous platforms for data dissemination, social networking, and expert validation. Related expert evaluations may reveal strengths and limitations of proposed ideas at early stages of post-publication promotion, preventing the implementation of unsupported controversial points. 4
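The sample size calculation mentioned above can be sketched in a few lines. This is a minimal illustration only, assuming a two-sided comparison of two group means with a common standard deviation and the usual normal approximation; the function name `sample_size_per_group` and the pilot values (a difference of 5 units, a standard deviation of 10 units) are hypothetical and not taken from the article.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means.

    Normal-approximation formula:
        n = 2 * (z_{1 - alpha/2} + z_{power})^2 * sd^2 / delta^2
    delta: smallest difference worth detecting (e.g. from a pilot study)
    sd:    assumed common standard deviation of the outcome
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    n = 2 * (z_alpha + z_power) ** 2 * sd ** 2 / delta ** 2
    return ceil(n)  # round up to a whole participant


# Hypothetical pilot estimates: detect a 5-unit difference, SD of 10 units.
n_per_group = sample_size_per_group(delta=5, sd=10)
print(n_per_group)  # 63
```

Halving the detectable difference roughly quadruples the required sample, which is why a realistic effect estimate from earlier reports or pilot data matters so much at the planning stage.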

Thus, hypothesis generation is an important initial step in the research workflow, reflecting accumulating evidence and experts' stance. In this article, we overview the genesis and importance of scientific hypotheses and their relevance in the era of the coronavirus disease 2019 (COVID-19) pandemic.

DO WE NEED HYPOTHESES FOR ALL STUDY DESIGNS?

Broadly, research can be categorized as primary or secondary. In the context of medicine, primary research may include real-life observations of disease presentations and outcomes. Single case descriptions, which often lead to new ideas and hypotheses, serve as important starting points or justifications for case series and cohort studies. The importance of case descriptions is particularly evident in the context of the COVID-19 pandemic when unique, educational case reports have heralded a new era in clinical medicine. 5

Case series serve a similar purpose to single case reports, but are based on a slightly larger quantum of information. Observational studies, including online surveys, describe the existing phenomena at a larger scale, often involving various control groups. Observational studies include variable-scale epidemiological investigations at different time points. Interventional studies detail the results of therapeutic interventions.

Secondary research is based on already published literature and does not directly involve human or animal subjects. Review articles are generated by secondary research. These could be systematic reviews, which follow methods akin to primary research but with the unit of study being published papers rather than humans or animals. Systematic reviews have a rigid structure with a mandatory search strategy encompassing multiple databases, systematic screening of search results against pre-defined inclusion and exclusion criteria, critical appraisal of study quality and an optional component of collating results across studies quantitatively to derive summary estimates (meta-analysis). 6 Narrative reviews, on the other hand, have a more flexible structure. Systematic literature searches to minimise bias in selection of articles are highly recommended but not mandatory. 7 Narrative reviews are influenced by the viewpoint of the authors, who may preferentially analyse selected sets of articles. 8

In relation to primary research, case studies and case series are generally not driven by a working hypothesis. Rather, they serve as a basis to generate a hypothesis. Observational or interventional studies should have a hypothesis for choosing research design and sample size. The results of observational and interventional studies further lead to the generation of new hypotheses, testing of which forms the basis of future studies. Review articles, on the other hand, may not be hypothesis-driven, but form fertile ground to generate future hypotheses for evaluation. Fig. 1 summarizes which type of studies are hypothesis-driven and which lead on to hypothesis generation.

[Fig. 1: study types that are hypothesis-driven and study types that lead to hypothesis generation; image not included]

STANDARDS OF WORKING AND SCIENTIFIC HYPOTHESES

A review of the published literature did not enable the identification of clearly defined standards for working and scientific hypotheses. It is essential to distinguish influential versus not influential hypotheses, evidence-based hypotheses versus a priori statements and ideas, ethical versus unethical, or potentially harmful ideas. The following points are proposed for consideration while generating working and scientific hypotheses. 1 , 2 Table 1 summarizes these points.

Evidence-based data

A scientific hypothesis should have a sound basis on previously published literature as well as the scientist's observations. Randomly generated (a priori) hypotheses are unlikely to be proven. A thorough literature search should form the basis of a hypothesis based on published evidence. 7

Unless a scientific hypothesis can be tested, it can neither be proven nor be disproven. Therefore, a scientific hypothesis should be amenable to testing with the available technologies and the present understanding of science.

Supported by pilot studies

If a hypothesis is based purely on a novel observation by the scientist in question, it should be grounded in some preliminary studies that support it. For example, if a drug that targets a specific cell population is hypothesized to be useful in a particular disease setting, then there must be some preliminary evidence that the specific cell population plays a role in driving that disease process.

Testable by ethical studies

The hypothesis should be testable by experiments that are ethically acceptable. 9 For example, a hypothesis that parachutes reduce mortality from falls from an airplane cannot be tested using a randomized controlled trial. 10 This is because it is obvious that all those jumping from a flying plane without a parachute would likely die. Similarly, the hypothesis that smoking tobacco causes lung cancer cannot be tested by a clinical trial that makes people take up smoking (since there is considerable evidence for the health hazards associated with smoking). Instead, long-term observational studies comparing outcomes in those who smoke and those who do not, as was performed in the landmark epidemiological case control study by Doll and Hill, 11 are more ethical and practical.

Balance between scientific temper and controversy

Novel findings, including novel hypotheses, particularly those that challenge established norms, are bound to face resistance before they gain wider acceptance. Such resistance is inevitable until such findings are proven with appropriate scientific rigor. However, hypotheses that generate controversy are generally unwelcome. For example, at the time the pandemic of human immunodeficiency virus (HIV) and AIDS was taking root, there were numerous deniers who refused to believe that HIV caused AIDS. 12 , 13 Similarly, at a time when climate change is causing catastrophic changes to weather patterns worldwide, denial that climate change is occurring and consequent attempts to block action on climate change are certainly unwelcome. 14 The denialism and misinformation during the COVID-19 pandemic, including unfortunate examples of vaccine hesitancy, are more recent examples of controversial hypotheses not backed by science. 15 , 16 An example of a controversial hypothesis that was a revolutionary scientific breakthrough was the hypothesis put forth by Warren and Marshall that Helicobacter pylori causes peptic ulcers. Initially, the hypothesis that a microorganism could cause gastritis and gastric ulcers faced immense resistance. Only when the scientists who proposed the hypothesis ingested H. pylori to induce gastritis in themselves could they convince the wider world about their hypothesis. Such was the impact of the hypothesis that Barry Marshall and Robin Warren were awarded the Nobel Prize in Physiology or Medicine in 2005 for this discovery. 17 , 18

DISTINGUISHING THE MOST INFLUENTIAL HYPOTHESES

Influential hypotheses are those that have stood the test of time. An archetype of an influential hypothesis is that proposed by Edward Jenner in the eighteenth century that cowpox infection protects against smallpox. While this observation had been reported for nearly a century before this time, it had not been suitably tested and publicised until Jenner conducted his experiments on a young boy by demonstrating protection against smallpox after inoculation with cowpox. 19 These experiments were the basis for widespread smallpox immunization strategies worldwide in the 20th century which resulted in the elimination of smallpox as a human disease today. 20

Other influential hypotheses are those which have been read and cited widely. An example of this is the hygiene hypothesis proposing an inverse relationship between infections in early life and allergies or autoimmunity in adulthood. An analysis reported that this hypothesis had been cited more than 3,000 times on Scopus. 1

LESSONS LEARNED FROM HYPOTHESES AMIDST THE COVID-19 PANDEMIC

The COVID-19 pandemic devastated the world like no other in recent memory. During this period, various hypotheses emerged, understandably so considering the public health emergency situation with innumerable deaths and suffering for humanity. Within weeks of the first reports of COVID-19, aberrant immune system activation was identified as a key driver of organ dysfunction and mortality in this disease. 21 Consequently, numerous drugs that suppress the immune system or abrogate the activation of the immune system were hypothesized to have a role in COVID-19. 22 One of the earliest drugs hypothesized to have a benefit was hydroxychloroquine. Hydroxychloroquine was proposed to interfere with Toll-like receptor activation and consequently ameliorate the aberrant immune system activation leading to pathology in COVID-19. 22 The drug was also hypothesized to have a prophylactic role in preventing infection or disease severity in COVID-19. It was also touted as a wonder drug for the disease by many prominent international figures. However, later studies which were well-designed randomized controlled trials failed to demonstrate any benefit of hydroxychloroquine in COVID-19. 23 , 24 , 25 , 26 Subsequently, azithromycin 27 , 28 and ivermectin 29 were hypothesized as potential therapies for COVID-19, but were not supported by evidence from randomized controlled trials. The role of vitamin D in preventing disease severity was also proposed, but has not been proven definitively until now. 30 , 31 On the other hand, randomized controlled trials identified the evidence supporting dexamethasone 32 and interleukin-6 pathway blockade with tocilizumab as effective therapies for COVID-19 in specific situations such as at the onset of hypoxia. 33 , 34 Clues towards the apparent effectiveness of various drugs against severe acute respiratory syndrome coronavirus 2 in vitro but their ineffectiveness in vivo have recently been identified. 
Many of these drugs are weak, lipophilic bases and some others induce phospholipidosis which results in apparent in vitro effectiveness due to non-specific off-target effects that are not replicated inside living systems. 35 , 36

Another hypothesis proposed was the association of the routine policy of vaccination with Bacillus Calmette-Guerin (BCG) with lower deaths due to COVID-19. This hypothesis emerged in the middle of 2020 when COVID-19 was still taking root in many parts of the world. 37 , 38 Subsequently, many countries which had lower deaths at that time point went on to have higher mortality, comparable to other areas of the world. Furthermore, the hypothesis that BCG vaccination reduced COVID-19 mortality was a classic example of ecological fallacy. Associations between population level events (ecological studies; in this case, BCG vaccination and COVID-19 mortality) cannot be directly extrapolated to the individual level. Furthermore, such associations cannot per se be attributed as causal in nature, and can only serve to generate hypotheses that need to be tested at the individual level. 39
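A toy calculation may help to show why population-level associations cannot be extrapolated to individuals. The sketch below uses entirely invented numbers (not BCG or COVID-19 data): three hypothetical populations in which the population averages of an exposure and an outcome rise together, while within every population the individual-level relationship runs in the opposite direction. The helper `pearson_r` is written out by hand for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented (exposure, outcome) pairs for individuals in three populations.
populations = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

# Ecological view: one data point per population (mean exposure, mean outcome).
mean_exposure = [sum(x for x, _ in rows) / len(rows) for rows in populations.values()]
mean_outcome = [sum(y for _, y in rows) / len(rows) for rows in populations.values()]
ecological_r = pearson_r(mean_exposure, mean_outcome)

# Individual view: correlation between exposure and outcome within each population.
within_r = {
    name: pearson_r([x for x, _ in rows], [y for _, y in rows])
    for name, rows in populations.items()
}

print(ecological_r)  # 1.0
print(within_r)      # -1.0 in each population
```

The population averages are perfectly positively correlated, yet within each population the correlation is perfectly negative, so the ecological association says nothing reliable about any individual.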

IS TRADITIONAL PEER REVIEW EFFICIENT FOR EVALUATION OF WORKING AND SCIENTIFIC HYPOTHESES?

Traditionally, publication after peer review has been considered the gold standard before any new idea finds acceptability amongst the scientific community. Getting a work (including a working or scientific hypothesis) reviewed by experts in the field before experiments are conducted to prove or disprove it helps to refine the idea further as well as improve the experiments planned to test the hypothesis. 40 A route towards this has been the emergence of journals dedicated to publishing hypotheses such as the Central Asian Journal of Medical Hypotheses and Ethics. 41 Another means of publishing hypotheses is through registered research protocols detailing the background, hypothesis, and methodology of a particular study. If such protocols are published after peer review, then the journal commits to publishing the completed study irrespective of whether the study hypothesis is proven or disproven. 42 In the post-pandemic world, online research methods such as online surveys powered via social media channels such as Twitter and Instagram might serve as critical tools to generate as well as to preliminarily test the appropriateness of hypotheses for further evaluation. 43 , 44

Some radical hypotheses might be difficult to publish after traditional peer review. These hypotheses might only become acceptable to the scientific community after they are tested in research studies. Preprints might be a way to disseminate such controversial and ground-breaking hypotheses. 45 However, scientists might prefer to keep their hypotheses confidential for fear of plagiarism of ideas, avoiding online posting and publishing until they have tested the hypotheses.

SUGGESTIONS ON GENERATING AND PUBLISHING HYPOTHESES

Publication of hypotheses is important; however, a balance is required between scientific temper and controversy. Journal editors and reviewers might keep in mind these specific points, summarized in Table 2 and detailed hereafter, while judging the merit of hypotheses for publication. Keeping in mind the ethical principle of primum non nocere, a hypothesis should be published only if it is testable in a manner that is ethically appropriate. 46 Such hypotheses should be grounded in reality and lend themselves to further testing to either prove or disprove them. It must be considered that subsequent experiments to prove or disprove a hypothesis have an equal chance of failing or succeeding, akin to tossing a coin. A pre-conceived belief that a hypothesis is unlikely to be proven correct should not form the basis of rejection of such a hypothesis for publication. In this context, hypotheses generated after a thorough literature search to identify knowledge gaps or based on concrete clinical observations on a considerable number of patients (as opposed to random observations on a few patients) are more likely to be acceptable for publication by peer-reviewed journals. Also, hypotheses should be considered for publication or rejection based on their implications for science at large rather than whether the subsequent experiments to test them end up with results in favour of or against the original hypothesis.

Hypotheses form an important part of the scientific literature. The COVID-19 pandemic has reiterated the importance and relevance of hypotheses for dealing with public health emergencies and highlighted the need for evidence-based and ethical hypotheses. A good hypothesis is testable in a relevant study design, backed by preliminary evidence, and has positive ethical and clinical implications. General medical journals might consider publishing hypotheses as a specific article type to enable more rapid advancement of science.

Disclosure: The authors have no potential conflicts of interest to disclose.

Author Contributions:

  • Data curation: Gasparyan AY, Misra DP, Zimba O, Yessirkepov M, Agarwal V, Kitas GD.


1.2 - The 7 Step Process of Statistical Hypothesis Testing

We will cover the seven steps one by one.

Step 1: State the Null Hypothesis

The null hypothesis can be thought of as the opposite of the "guess" the researchers made. In the example presented in the previous section, the biologist "guesses" plant height will be different for the various fertilizers. So the null hypothesis would be that there will be no difference among the groups of plants. Specifically, in more statistical language the null for an ANOVA is that the means are the same. We state the null hypothesis as:

\(H_0 \colon \mu_1 = \mu_2 = \cdots = \mu_T\)

for \(T\) levels of an experimental treatment.

Step 2: State the Alternative Hypothesis

\(H_A \colon \text{ treatment level means not all equal}\)

The alternative hypothesis is stated in this way so that if the null is rejected, there are many alternative possibilities.

For example, \(\mu_1\ne \mu_2 = \cdots = \mu_T\) is one possibility, as is \(\mu_1=\mu_2\ne\mu_3= \cdots =\mu_T\). Many people make the mistake of stating the alternative hypothesis as \(\mu_1\ne\mu_2\ne\cdots\ne\mu_T\), which says that every mean differs from every other mean. This is a possibility, but only one of many possibilities. A simple way of thinking about this is that at least one mean differs from at least one of the others. To cover all alternative outcomes, we resort to a verbal statement of "not all equal" and then follow up with mean comparisons to find out where differences among means exist. In our example, a possible outcome would be that fertilizer 1 results in plants that are exceptionally tall, but fertilizers 2, 3, and the control group may not differ from one another.

Step 3: Set \(\alpha\)

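If we look at what can happen in a hypothesis test, we can construct the following contingency table of the decision made against the true state of \(H_0\):

  • Reject \(H_0\) when \(H_0\) is true: Type I error, with probability \(\alpha\)
  • Reject \(H_0\) when \(H_0\) is false: correct decision, with probability \(1-\beta\) (the power of the test)
  • Fail to reject \(H_0\) when \(H_0\) is true: correct decision, with probability \(1-\alpha\)
  • Fail to reject \(H_0\) when \(H_0\) is false: Type II error, with probability \(\beta\)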

You should be familiar with Type I and Type II errors from your introductory courses. It is important to note that we want to set \(\alpha\) before the experiment (a priori) because the Type I error is the more grievous error to make. The typical value of \(\alpha\) is 0.05, establishing a 95% confidence level. For this course, we will assume \(\alpha = 0.05\), unless stated otherwise.

Step 4: Collect Data

Remember the importance of recognizing whether data is collected through an experimental design or observational study.

Step 5: Calculate a test statistic

For categorical treatment level means, we use an F-statistic, named after R.A. Fisher. We will explore the mechanics of computing the F-statistic beginning in Lesson 2. The F-value we get from the data is labeled \(F_{\text{calculated}}\).
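As a preview of those mechanics, here is a minimal sketch of how \(F_{\text{calculated}}\) can be obtained for a one-way layout: the between-group mean square divided by the within-group mean square. The function name `one_way_f` and the plant heights are invented for illustration and are not the course's data set.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    n_levels = len(groups)      # T, the number of treatment levels
    n_total = len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (value - sum(group) / len(group)) ** 2
        for group in groups
        for value in group
    )

    df_between = n_levels - 1
    df_within = n_total - n_levels
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Invented plant heights for three treatment levels.
heights = [
    [20, 21, 22],  # control
    [24, 25, 26],  # hypothetical fertilizer 1
    [28, 29, 30],  # hypothetical fertilizer 2
]

f_calculated, df_between, df_within = one_way_f(heights)
print(f_calculated, df_between, df_within)  # 48.0 2 6
```

A large value such as this arises when the group means are far apart relative to the spread of observations inside each group.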

Step 6: Construct Acceptance / Rejection regions

As with all other test statistics, a threshold (critical) value of F is established. This F-value can be obtained from statistical tables or software and is referred to as \(F_{\text{critical}}\) or \(F_\alpha\). As a reminder, this critical value is the minimum value of the test statistic (in this case \(F_{\text{calculated}}\)) for us to reject the null.

The F-distribution, \(F_\alpha\), and the location of acceptance/rejection regions are shown in the graph below:

Step 7: Based on Steps 5 and 6, draw a conclusion about \(H_0\)

If \(F_{\text{calculated}}\) is larger than \(F_\alpha\), then you are in the rejection region and you can reject the null hypothesis with \(\left(1-\alpha \right)\) level of confidence.

Note that modern statistical software condenses Steps 6 and 7 by providing a p-value. The p-value here is the probability of getting an \(F_{\text{calculated}}\) at least as large as the one you observe, assuming the null hypothesis is true. If by chance \(F_{\text{calculated}} = F_\alpha\), then the p-value would be exactly equal to \(\alpha\). With larger \(F_{\text{calculated}}\) values, we move further into the rejection region and the p-value becomes less than \(\alpha\). So, the decision rule is as follows:

If the p-value obtained from the ANOVA is less than \(\alpha\), then reject \(H_0\) in favor of \(H_A\).
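The idea of a p-value as "how often would an F this large arise if the null were true" can be illustrated without tables. The sketch below is not the F-distribution calculation that ANOVA software performs; it swaps in a permutation approximation, shuffling the observations between groups to mimic the null hypothesis. The function names, the invented heights, the number of permutations, and the seed are all assumptions made for the example.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        return float("inf")  # no within-group spread at all
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Share of label shufflings whose F is at least as large as the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    sizes = [len(g) for g in groups]
    pooled = [v for g in groups for v in g]
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # under H0 the group labels are interchangeable
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            at_least_as_large += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (at_least_as_large + 1) / (n_permutations + 1)


alpha = 0.05
heights = [[20, 21, 22], [24, 25, 26], [28, 29, 30]]  # invented data
p_value = permutation_p_value(heights)
reject_null = p_value < alpha
print(p_value, reject_null)
```

With groups as clearly separated as these, very few shufflings reproduce an F as large as the observed one, so the estimated p-value falls below \(\alpha\) and the decision rule rejects \(H_0\).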


Statistical Analysis: Developing and Testing Hypotheses

Statistical hypothesis testing is sometimes known as confirmatory data analysis. It is a way of drawing inferences from data. In the process, you develop a hypothesis or theory about what you might see in your research. You then test that hypothesis against the data that you collect.

Hypothesis testing is generally used when you want to compare two groups, or compare a group against an idealised position.

Before You Start: Developing A Research Hypothesis

Before you can do any kind of research in social science fields such as management, you need a research question or hypothesis. Research is generally designed to either answer a research question or consider a research hypothesis. These two are closely linked, and generally one or the other is used, rather than both.

A research question is the question that your research sets out to answer. For example:

Do men and women like ice cream equally?

Do men and women like the same flavours of ice cream?

What are the main problems in the market for ice cream?

How can the market for ice cream be segmented and targeted?

Research hypotheses are statements of what you believe you will find in your research.

These are then tested statistically during the research to see if your belief is correct. Examples include:

Men and women like ice cream to different extents.

Men and women like different flavours of ice cream.

Men are more likely than women to like mint ice cream.

Women are more likely than men to like chocolate ice cream.

Both men and women prefer strawberry to vanilla ice cream.

Relationships vs Differences

Research hypotheses can be expressed in terms of differences between groups, or relationships between variables. However, these are two sides of the same coin: almost any hypothesis could be set out in either way.

For example:

There is a relationship between gender and liking ice cream OR

Men are more likely to like ice cream than women.

Testing Research Hypotheses

The purpose of statistical hypothesis testing is to use a sample to draw inferences about a population.

Testing research hypotheses requires a number of steps:

Step 1. Define your research hypothesis

The first step in any hypothesis testing is to identify your hypothesis, which you will then go on to test. How you define your hypothesis may affect the type of statistical testing that you do, so it is important to be clear about it. In particular, consider whether you are going to hypothesise simply that there is a relationship or speculate about the direction of the relationship.

Using the examples above:

‘There is a relationship between gender and liking ice cream’ is a non-directional hypothesis. You have simply specified that there is a relationship, not whether men or women like ice cream more.

However, ‘men are more likely to like ice cream than women’ is directional: you have specified which gender is more likely to like ice cream.

Generally, it is better not to specify direction unless you are moderately sure about it.

Step 2. Define the null hypothesis

The null hypothesis is basically a statement of what you are hoping to disprove: the opposite of your ‘guess’ about the relationship. For example, in the hypotheses above, the null hypothesis would be:

Men and women like ice cream equally, or

There is no relationship between gender and ice cream.

This also defines your ‘alternative hypothesis’ which is your ‘test hypothesis’ (men like ice cream more than women). Your null hypothesis is generally that there is no difference, because this is the simplest position.

The purpose of hypothesis testing is to see whether your data give you grounds to reject the null hypothesis. If you cannot reject the null hypothesis, you continue to work on the assumption that it holds, although this does not prove that it is true.

Step 3. Develop a summary measure that describes your variable of interest for each group you wish to compare

Our page on Simple Statistical Analysis describes several summary measures, including two of the most common, mean and median.

The next step in your hypothesis testing is to develop a summary measure for each of your groups. For example, to test the gender differences in liking for ice cream, you might ask people how much they liked ice cream on a scale of 1 to 5. Alternatively, you might have data about the number of times that ice creams are consumed each week in the summer months.

You then need to produce a summary measure for each group, usually mean and standard deviation. These may be similar for each group, or quite different.
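A short sketch of this step, using Python's standard `statistics` module. The ratings are invented 1-to-5 scores for two small groups, purely for illustration.

```python
from statistics import mean, stdev

# Invented ratings of ice cream on a scale of 1 to 5.
ratings = {
    "men": [3, 4, 2, 5, 4],
    "women": [4, 5, 3, 5, 4],
}

# One summary row per group: sample size, mean, and sample standard deviation.
summary = {
    group: {"n": len(scores), "mean": mean(scores), "sd": stdev(scores)}
    for group, scores in ratings.items()
}

for group, stats in summary.items():
    print(f"{group}: n={stats['n']}, mean={stats['mean']:.2f}, sd={stats['sd']:.2f}")
# men: n=5, mean=3.60, sd=1.14
# women: n=5, mean=4.20, sd=0.84
```

Here the means differ by 0.6 points, but the standard deviations are of a similar size to that gap, which is exactly why a formal test is needed before reading anything into the difference.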

Step 4. Choose a reference distribution and calculate a test statistic

To decide whether there is a genuine difference between the two groups, you have to use a reference distribution against which to measure the values from the two groups.

The most common source of reference distributions is a standard distribution such as the normal distribution or t-distribution. These two are closely related: the t-distribution is used when the standard deviation has to be estimated from the sample, whereas the normal distribution assumes that it is known. There is more about this in our page on Statistical Distributions.

You then compare the summary data from the two groups by using them to calculate a test statistic. There is a standard formula for every test statistic and reference distribution. The test and reference distribution depend on your data and the purpose of your testing (see below).

The test that you use to compare your groups will depend on how many groups you have, the type of data that you have collected, and how reliable your data are. In general, you would use different tests for comparing two groups than you would for comparing three or more.

Our page Surveys and Survey Design explains that there are two types of answer scale, continuous and categorical. Age, for example, is a continuous scale, although it can also be grouped into categories. You may also find it helpful to read our page on Types of Data.

Gender is a category scale.

For a continuous scale, you can use the mean values of the two groups that you are comparing.

For a category scale, you need to use the median values.

Source: Easterby-Smith, Thorpe and Jackson, Management Research 4th Edition

One- or Two-Tailed Test

The other thing that you have to decide is whether you use what is known as a ‘one-tailed’ or ‘two-tailed’ test.

This allows you to compare differences between groups in either one or both directions.

In practice, this boils down to whether your research hypothesis is expressed as ‘x is likely to be more than y’, or ‘x is likely to be different from y’. If you are confident of the direction of the difference (that is, you are sure that the only options are that ‘x is likely to be more than y’ or ‘x and y are the same’), then your test will be one-tailed. If not, it will be two-tailed.

If there is any doubt, it is better to use a two-tailed test.

You should only use a one-tailed test when you are certain about the direction of the difference, and it doesn’t matter if you are wrong.

The graph under Step 5 shows a two-tailed test.
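To make the one-tailed versus two-tailed choice concrete, here is a minimal sketch in Python. It assumes a standard normal reference distribution and a made-up test statistic of 1.8; the function name `p_value` is ours for illustration, not part of any statistics package.

```python
from statistics import NormalDist


def p_value(z, tails=2):
    """Return the p-value of a z statistic under a standard normal distribution."""
    if tails == 1:
        # One-tailed: only 'x is more than y' counts as evidence.
        return 1 - NormalDist().cdf(z)
    # Two-tailed: a difference in either direction counts.
    return 2 * (1 - NormalDist().cdf(abs(z)))


z = 1.8  # hypothetical test statistic
one_tailed = p_value(z, tails=1)
two_tailed = p_value(z, tails=2)
print(f"one-tailed p = {one_tailed:.4f}, two-tailed p = {two_tailed:.4f}")
```

With this hypothetical statistic, the one-tailed p-value falls below 5% while the two-tailed one does not, which is why the choice has to be made before you look at the data.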

If you are not very confident about the quality of the data collected, for example because the inputting was done quickly and cheaply, or because the data have not been checked, then you may prefer to use the median, even if the data are continuous, to avoid any problems with outliers. This makes the tests more robust, and the results more reliable.
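As a quick illustration of why the median is more robust, the sketch below uses Python's standard `statistics` module on made-up ages, where one value has been mistyped (27 entered as 270). The data are hypothetical.

```python
from statistics import mean, median

clean_ages = [21, 23, 24, 25, 27]
typo_ages = [21, 23, 24, 25, 270]  # hypothetical data-entry error: 27 typed as 270

# The mean is dragged upwards by the single outlier; the median is unchanged.
print(mean(clean_ages), median(clean_ages))
print(mean(typo_ages), median(typo_ages))
```

A single bad entry moves the mean a long way but leaves the median where it was.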

Our page on correlations suggests that you may also want to plot a scattergraph before undertaking any further analysis. This will also help you to identify any outliers or potential problems with the data.

Calculating the Test Statistic

For each type of test, there is a standard formula for the test statistic. For example, for the t-test, it is:

(M1-M2)/SE(diff)

M1 is the mean of the first group

M2 is the mean of the second group

SE(diff) is the standard error of the difference, which is calculated from the standard deviation and the sample size of each group.

The formula for calculating the standard error of the difference between means is:

SE(diff) = √(sd²/n_a + sd²/n_b)

where:

  • sd² = the square of the standard deviation of the source population (i.e., the variance);
  • n_a = the size of sample A; and
  • n_b = the size of sample B.
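The calculation can be sketched in a few lines of Python. This is only an illustration: the two samples are made up, and because the variance of the source population is rarely known, the sketch estimates sd² with the pooled sample variance, which is an assumption on our part.

```python
from math import sqrt
from statistics import mean, variance


def pooled_variance(a, b):
    """Estimate sd^2 by pooling the sample variances of both groups (an assumption)."""
    n_a, n_b = len(a), len(b)
    return ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)


def t_statistic(a, b):
    """Compute (M1 - M2) / SE(diff) for two independent samples."""
    sd2 = pooled_variance(a, b)
    se_diff = sqrt(sd2 / len(a) + sd2 / len(b))
    return (mean(a) - mean(b)) / se_diff


group_a = [5, 7, 8, 9, 11]  # hypothetical scores, mean 8
group_b = [3, 5, 6, 7, 9]   # hypothetical scores, mean 6
t = t_statistic(group_a, group_b)
print(round(t, 3))
```

You would then compare the resulting value of t with the critical value for your chosen significance level.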

Step 5. Identify Acceptance and Rejection Regions

The final part of the test is to see if your test statistic is significant, in other words, whether you are going to accept or reject your null hypothesis. You need to consider first what level of significance is required. This is the risk you are prepared to take of rejecting the null hypothesis when it is actually true.

The significance level is usually set at either 5% or 1%. The p-value is the probability of getting a result at least as extreme as yours if the null hypothesis were true, and a result is described as significant when its p-value is below the chosen level.

NOTE: the significance level is sometimes expressed as p < 0.05 or p < 0.01.

For more about significance, you may like to read our page on Significance and Confidence Intervals.

The graph below shows a reference distribution (this one could be either the normal or the t-distribution) with the acceptance and rejection regions marked. It also shows the critical values. µ is the mean. For more about this, you may like to read our page on Statistical Distributions.

Reference distribution showing acceptance and rejection regions, critical values and mean.

The critical values are identified from published statistical tables for your reference distribution, which are available for different levels of significance.

If your test statistic falls within either of the two rejection regions (that is, it is greater than the higher critical value, or less than the lower one), you will reject the null hypothesis. You can therefore accept your alternative hypothesis.
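If you do not have statistical tables to hand, the critical values for a normal reference distribution can be computed directly. The sketch below uses `NormalDist` from Python's standard library and assumes a two-tailed test; the function names are ours. For a t-distribution you would need tables or a statistics package such as SciPy, because the critical values also depend on the sample size.

```python
from statistics import NormalDist


def critical_values(alpha=0.05):
    """Return the (lower, upper) critical values for a two-tailed test."""
    upper = NormalDist().inv_cdf(1 - alpha / 2)
    return -upper, upper


def reject_null(test_statistic, alpha=0.05):
    """True if the test statistic falls in either rejection region."""
    lower, upper = critical_values(alpha)
    return test_statistic < lower or test_statistic > upper


lower, upper = critical_values(0.05)
print(round(lower, 2), round(upper, 2))
print(reject_null(2.5), reject_null(1.0))
```

Note that the same test statistic can be significant at the 5% level but not at the 1% level, because the rejection regions shrink as the significance level gets stricter.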

Step 6. Draw Conclusions and Inferences

The final step is to draw conclusions.

If your test statistic fell within the rejection region, and you have rejected the null hypothesis, you can therefore conclude that there is a gender difference in liking for ice cream, using the example above.

Types of Error

There are four possible outcomes from statistical testing (see table):

The groups are different, and you conclude that they are different (correct result)

The groups are different, but you conclude that they are not (Type II error)

The groups are the same, but you conclude that they are different (Type I error)

The groups are the same, and you conclude that they are the same (correct result).

Type I errors are generally considered more important than Type II, because they have the potential to change the status quo.

For example, if you wrongly conclude that a new medical treatment is effective, doctors are likely to move to providing that treatment. Patients may receive the treatment instead of an alternative that could have fewer side effects, and pharmaceutical companies may stop looking for an alternative treatment.
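A small simulation can show what a Type I error looks like in practice. In the sketch below, both groups are drawn from exactly the same population, so every 'significant' result is a false alarm. The sample size, number of trials and random seed are arbitrary choices for illustration, and the standard deviation is assumed to be known (equal to 1).

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def type_i_error_rate(trials=2000, n=30, alpha=0.05, seed=0):
    """Estimate how often two identical groups are wrongly declared different."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    se_diff = sqrt(1 / n + 1 / n)  # standard deviation is 1 by construction
    false_positives = 0
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        z = (mean(a) - mean(b)) / se_diff
        if abs(z) > critical:
            false_positives += 1  # groups are the same, but we concluded otherwise
    return false_positives / trials


rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimated rate should come out close to the significance level you chose, which is what the significance level means: the proportion of false alarms you are prepared to tolerate.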

There are statistical software packages available that will carry out all these tests for you. However, if you have never studied statistics, and you’re not very confident about what you’re doing, you are probably best off discussing it with a statistician or consulting a detailed statistical textbook.

Poorly executed statistical analysis can invalidate very good research.  It is much better to find someone to help you. However, this page will help you to understand your friendly statistician!

The Craft of Writing a Strong Hypothesis

Deeptanshu D

Writing a hypothesis is one of the essential elements of a scientific research paper. It needs to be to the point, clearly communicating what your research is trying to accomplish. A blurry, drawn-out, or complexly-structured hypothesis can confuse your readers. Or worse, the editor and peer reviewers.

A captivating hypothesis is not too intricate. This blog will take you through the process so that, by the end of it, you have a better idea of how to convey your research paper's intent in just one sentence.

What is a Hypothesis?

The first step in your scientific endeavor, a hypothesis, is a strong, concise statement that forms the basis of your research. It is not the same as a thesis statement, which is a brief summary of your research paper.

The sole purpose of a hypothesis is to predict your paper's findings, data, and conclusion. It comes from a place of curiosity and intuition. When you write a hypothesis, you're essentially making an educated guess based on prior scientific knowledge and evidence, which is then supported or refuted through the scientific method.

The reason for undertaking research is to observe a specific phenomenon. A hypothesis, therefore, lays out what the said phenomenon is. And it does so through two variables, an independent and dependent variable.

The independent variable is the cause behind the observation, while the dependent variable is the effect of the cause. A good example of this is “mixing red and blue forms purple.” In this hypothesis, mixing red and blue is the independent variable as you're combining the two colors at your own will. The formation of purple is the dependent variable as, in this case, it is conditional to the independent variable.

Different Types of Hypotheses‌

Types of hypotheses

Some would stand by the notion that there are only two types of hypotheses: a Null hypothesis and an Alternative hypothesis. While that may have some truth to it, it would be better to fully distinguish the most common forms as these terms come up so often, which might leave you out of context.

Apart from Null and Alternative, there are Complex, Simple, Directional, Non-Directional, Statistical, and Associative and Causal hypotheses. They don't necessarily have to be exclusive, as one hypothesis can tick many boxes, but knowing the distinctions between them will make it easier for you to construct your own.

1. Null hypothesis

A null hypothesis proposes no relationship between two variables. Denoted by H0, it is a negative statement like “Attending physiotherapy sessions does not affect athletes' on-field performance.” Here, the author claims physiotherapy sessions have no effect on on-field performances. Any apparent effect in the data is attributed to chance.

2. Alternative hypothesis

Considered to be the opposite of a null hypothesis, an alternative hypothesis is denoted as H1 or Ha. It explicitly states that the independent variable affects the dependent variable. A good alternative hypothesis example is “Attending physiotherapy sessions improves athletes' on-field performance.” or “Water evaporates at 100 °C.” The alternative hypothesis further branches into directional and non-directional.

  • Directional hypothesis: A hypothesis that states the result would be either positive or negative is called a directional hypothesis. It accompanies H1 with either the ‘<’ or ‘>’ sign.
  • Non-directional hypothesis: A non-directional hypothesis only claims an effect on the dependent variable. It does not clarify whether the result would be positive or negative. The sign for a non-directional hypothesis is ‘≠’.

3. Simple hypothesis

A simple hypothesis is a statement made to reflect the relation between exactly two variables. One independent and one dependent. Consider the example, “Smoking is a prominent cause of lung cancer." The dependent variable, lung cancer, is dependent on the independent variable, smoking.

4. Complex hypothesis

In contrast to a simple hypothesis, a complex hypothesis implies the relationship between multiple independent and dependent variables. For instance, “Individuals who eat more fruits tend to have higher immunity, lower cholesterol, and higher metabolism.” The independent variable is eating more fruits, while the dependent variables are higher immunity, lower cholesterol, and higher metabolism.

5. Associative and causal hypothesis

Associative and causal hypotheses don't specify how many variables there will be. They define the relationship between the variables. In an associative hypothesis, a change in any one variable, dependent or independent, is accompanied by a change in the others. In a causal hypothesis, the independent variable directly affects the dependent.

6. Empirical hypothesis

Also referred to as the working hypothesis, an empirical hypothesis claims a theory's validation via experiments and observation. This way, the statement appears justifiable and different from a wild guess.

Say, the hypothesis is “Women who take iron tablets face a lesser risk of anemia than those who take vitamin B12.” This is an example of an empirical hypothesis where the researcher formulates the statement after assessing a group of women who take iron tablets and charting the findings.

7. Statistical hypothesis

The point of a statistical hypothesis is to test an already existing hypothesis by studying a population sample. A hypothesis like “44% of the Indian population belong in the age group of 22-27” leverages evidence to prove or disprove a particular statement.

Characteristics of a Good Hypothesis

Writing a hypothesis is essential as it can make or break your research for you. That includes your chances of getting published in a journal. So when you're designing one, keep an eye out for these pointers:

  • A research hypothesis has to be simple yet clear to look justifiable enough.
  • It has to be testable: your research would be rendered pointless if the hypothesis is too far-fetched or cannot be tested with the technology available.
  • It has to be precise about the results —what you are trying to do and achieve through it should come out in your hypothesis.
  • A research hypothesis should be self-explanatory, leaving no doubt in the reader's mind.
  • If you are developing a relational hypothesis, you need to include the variables and establish an appropriate relationship among them.
  • A hypothesis must keep and reflect the scope for further investigations and experiments.

Separating a Hypothesis from a Prediction

Outside of academia, hypothesis and prediction are often used interchangeably. In research writing, this is not only confusing but also incorrect. And although a hypothesis and prediction are guesses at their core, there are many differences between them.

A hypothesis is an educated guess or even a testable prediction validated through research. It aims to analyze the gathered evidence and facts to define a relationship between variables and put forth a logical explanation behind the nature of events.

Predictions are assumptions or expected outcomes made without any backing evidence. They are more fictionally inclined regardless of where they originate from.

For this reason, a hypothesis holds much more weight than a prediction. It sticks to the scientific method rather than pure guesswork. "Planets revolve around the Sun." is an example of a hypothesis as it is based on previous knowledge and observed trends. Additionally, we can test it through the scientific method.

Whereas "COVID-19 will be eradicated by 2030." is a prediction. Even though it results from past trends, we can't prove or disprove it. So, the only way this gets validated is to wait and watch if COVID-19 cases end by 2030.

Finally, How to Write a Hypothesis

Quick tips on writing a hypothesis

1.  Be clear about your research question

A hypothesis should instantly address the research question or the problem statement. To do so, you need to ask a question. Understand the constraints of your undertaken research topic and then formulate a simple and topic-centric problem. Only after that can you develop a hypothesis and further test for evidence.

2. Carry out a recce

Once you have your research's foundation laid out, it would be best to conduct preliminary research. Go through previous theories, academic papers, data, and experiments before you start curating your research hypothesis. It will give you an idea of your hypothesis's viability or originality.

Making use of references from relevant research papers helps draft a good research hypothesis. SciSpace Discover offers a repository of over 270 million research papers to browse through and gain a deeper understanding of related studies on a particular topic. Additionally, you can use SciSpace Copilot , your AI research assistant, for reading any lengthy research paper and getting a more summarized context of it. A hypothesis can be formed after evaluating many such summarized research papers. Copilot also offers explanations for theories and equations, explains paper in simplified version, allows you to highlight any text in the paper or clip math equations and tables and provides a deeper, clear understanding of what is being said. This can improve the hypothesis by helping you identify potential research gaps.

3. Create a 3-dimensional hypothesis

Variables are an essential part of any reasonable hypothesis. So, identify your independent and dependent variable(s) and form a correlation between them. The ideal way to do this is to write the hypothetical assumption in the ‘if-then' form. If you use this form, make sure that you state the predefined relationship between the variables.

In another way, you can choose to present your hypothesis as a comparison between two variables. Here, you must specify the difference you expect to observe in the results.

4. Write the first draft

Now that everything is in place, it's time to write your hypothesis. For starters, create the first draft. In this version, write what you expect to find from your research.

Clearly separate your independent and dependent variables and the link between them. Don't fixate on syntax at this stage. The goal is to ensure your hypothesis addresses the issue.

5. Proof your hypothesis

After preparing the first draft of your hypothesis, you need to inspect it thoroughly. It should tick all the boxes, like being concise, straightforward, relevant, and accurate. Your final hypothesis has to be well-structured as well.

Research projects are an exciting and crucial part of being a scholar. And once you have your research question, you need a great hypothesis to begin conducting research. Thus, knowing how to write a hypothesis is very important.

Now that you have a firmer grasp on what a good hypothesis constitutes, the different kinds there are, and what process to follow, you will find it much easier to write your hypothesis, which ultimately helps your research.

Now it's easier than ever to streamline your research workflow with SciSpace Discover . Its integrated, comprehensive end-to-end platform for research allows scholars to easily discover, write and publish their research and fosters collaboration.

It includes everything you need, including a repository of over 270 million research papers across disciplines, SEO-optimized summaries and public profiles to show your expertise and experience.

If you found these tips on writing a research hypothesis useful, head over to our blog on Statistical Hypothesis Testing to learn about the top researchers, papers, and institutions in this domain.

Frequently Asked Questions (FAQs)

1. What is the definition of hypothesis?

According to the Oxford dictionary, a hypothesis is defined as “An idea or explanation of something that is based on a few known facts, but that has not yet been proved to be true or correct”.

2. What is an example of hypothesis?

The hypothesis is a statement that proposes a relationship between two or more variables. An example: "If we increase the number of new users who join our platform by 25%, then we will see an increase in revenue."

3. What is an example of null hypothesis?

A null hypothesis is a statement that there is no relationship between two variables. The null hypothesis is written as H0. The null hypothesis states that there is no effect. For example, if you're studying whether or not a particular type of exercise increases strength, your null hypothesis will be "there is no difference in strength between people who exercise and people who don't."

4. What are the types of research?

• Fundamental research

• Applied research

• Qualitative research

• Quantitative research

• Mixed research

• Exploratory research

• Longitudinal research

• Cross-sectional research

• Field research

• Laboratory research

• Fixed research

• Flexible research

• Action research

• Policy research

• Classification research

• Comparative research

• Causal research

• Inductive research

• Deductive research

5. How to write a hypothesis?

• Your hypothesis should be able to predict the relationship and outcome.

• Avoid wordiness by keeping it simple and brief.

• Your hypothesis should contain observable and testable outcomes.

• Your hypothesis should be relevant to the research question.

6. What are the 2 types of hypothesis?

• Null hypotheses are used to test the claim that "there is no difference between two groups of data".

• Alternative hypotheses test the claim that "there is a difference between two data groups".

7. Difference between research question and research hypothesis?

A research question is a broad, open-ended question you will try to answer through your research. A hypothesis is a statement based on prior research or theory that you expect to be true due to your study. Example - Research question: What are the factors that influence the adoption of the new technology? Research hypothesis: There is a positive relationship between age, education and income level with the adoption of the new technology.

8. What is plural for hypothesis?

The plural of hypothesis is hypotheses. Here's an example of how it would be used in a statement, "Numerous well-considered hypotheses are presented in this part, and they are supported by tables and figures that are well-illustrated."

9. What is the red queen hypothesis?

The red queen hypothesis in evolutionary biology states that species must constantly evolve to avoid extinction because if they don't, they will be outcompeted by other species that are evolving. Leigh Van Valen first proposed it in 1973; since then, it has been tested and substantiated many times.

10. Who is known as the father of null hypothesis?

The father of the null hypothesis is Sir Ronald Fisher. His 1925 book Statistical Methods for Research Workers popularized significance testing, and he introduced the term "null hypothesis" itself in The Design of Experiments (1935).

11. When to reject null hypothesis?

You need to find a significant difference between your two populations to reject the null hypothesis. You can determine that by running statistical tests such as an independent sample t-test or a dependent sample t-test. You should reject the null hypothesis if the p-value is less than your chosen significance level, which is commonly 0.05.

A Beginner’s Guide to Hypothesis Testing in Business

  • 30 Mar 2021

Becoming a more data-driven decision-maker can bring several benefits to your organization, enabling you to identify new opportunities to pursue and threats to abate. Rather than allowing subjective thinking to guide your business strategy, backing your decisions with data can empower your company to become more innovative and, ultimately, profitable.

If you’re new to data-driven decision-making, you might be wondering how data translates into business strategy. The answer lies in generating a hypothesis and verifying or rejecting it based on what various forms of data tell you.

Below is a look at hypothesis testing and the role it plays in helping businesses become more data-driven.

What Is Hypothesis Testing?

To understand what hypothesis testing is, it’s important first to understand what a hypothesis is.

A hypothesis or hypothesis statement seeks to explain why something has happened, or what might happen, under certain conditions. It can also be used to understand how different variables relate to each other. Hypotheses are often written as if-then statements; for example, “If this happens, then this will happen.”

Hypothesis testing, then, is a statistical means of testing an assumption stated in a hypothesis. While the specific methodology leveraged depends on the nature of the hypothesis and data available, hypothesis testing typically uses sample data to extrapolate insights about a larger population.

Hypothesis Testing in Business

When it comes to data-driven decision-making, there’s a certain amount of risk that can mislead a professional. This could be due to flawed thinking or observations, incomplete or inaccurate data, or the presence of unknown variables. The danger in this is that, if major strategic decisions are made based on flawed insights, it can lead to wasted resources, missed opportunities, and catastrophic outcomes.

The real value of hypothesis testing in business is that it allows professionals to test their theories and assumptions before putting them into action. This essentially allows an organization to verify its analysis is correct before committing resources to implement a broader strategy.

As one example, consider a company that wishes to launch a new marketing campaign to revitalize sales during a slow period. Doing so could be an incredibly expensive endeavor, depending on the campaign’s size and complexity. The company, therefore, may wish to test the campaign on a smaller scale to understand how it will perform.

In this example, the hypothesis that’s being tested would fall along the lines of: “If the company launches a new marketing campaign, then it will translate into an increase in sales.” It may even be possible to quantify how much of a lift in sales the company expects to see from the effort. Pending the results of the pilot campaign, the business would then know whether it makes sense to roll it out more broadly.

Key Considerations for Hypothesis Testing

1. Alternative Hypothesis and Null Hypothesis

In hypothesis testing, the hypothesis that’s being tested is known as the alternative hypothesis. Often, it’s expressed as a correlation or statistical relationship between variables. The null hypothesis, on the other hand, is a statement that’s meant to show there’s no statistical relationship between the variables being tested. It’s typically the exact opposite of whatever is stated in the alternative hypothesis.

For example, consider a company’s leadership team that historically and reliably sees $12 million in monthly revenue. They want to understand if reducing the price of their services will attract more customers and, in turn, increase revenue.

In this case, the alternative hypothesis may take the form of a statement such as: “If we reduce the price of our flagship service by five percent, then we’ll see an increase in sales and realize revenues greater than $12 million in the next month.”

The null hypothesis, on the other hand, would indicate that revenues wouldn’t increase from the base of $12 million, or might even decrease.

2. Significance Level and P-Value

Statistically speaking, if you were to run the same scenario 100 times, you’d likely receive somewhat different results each time. If you were to plot these results in a distribution plot, you’d see the most likely outcome is at the tallest point in the graph, with less likely outcomes falling to the right and left of that point.

With this in mind, imagine you’ve completed your hypothesis test and have your results, which indicate there may be a correlation between the variables you were testing. To understand your results' significance, you’ll need to identify a p-value for the test, which helps note how confident you are in the test results.

In statistics, the p-value is the probability that, assuming the null hypothesis is correct, you would still observe results that are at least as extreme as the results of your hypothesis test. The smaller the p-value, the stronger the evidence against the null hypothesis, and the greater the significance of your results.
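One way to see what "at least as extreme, assuming the null hypothesis is correct" means is a permutation test. The sketch below is an illustration only: the weekly sales figures are made up, and the function name is ours. If the price change truly made no difference, the group labels are interchangeable, so the code reshuffles the labels in every possible way and counts how often the gap between group means is at least as large as the one observed.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(a, b):
    """Exact two-sided permutation p-value for a difference in means."""
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        # Count relabelings whose gap is at least as large as the observed gap.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


reduced_price = [12, 14, 15, 16]  # hypothetical weekly sales at the lower price
regular_price = [8, 9, 10, 11]    # hypothetical weekly sales at the usual price
p = permutation_p_value(reduced_price, regular_price)
print(round(p, 4))
```

Here only 2 of the 70 possible relabelings produce a gap as large as the observed one, so a difference this big would be unusual if the null hypothesis were true.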

3. One-Sided vs. Two-Sided Testing

When it’s time to test your hypothesis, it’s important to leverage the correct testing method. The two most common hypothesis testing methods are one-sided and two-sided tests, or one-tailed and two-tailed tests, respectively.

Typically, you’d leverage a one-sided test when you have a strong conviction about the direction of change you expect to see due to your hypothesis test. You’d leverage a two-sided test when you’re less confident in the direction of change.

4. Sampling

To perform hypothesis testing in the first place, you need to collect a sample of data to be analyzed. Depending on the question you’re seeking to answer or investigate, you might collect samples through surveys, observational studies, or experiments.

A survey involves asking a series of questions to a random population sample and recording self-reported responses.

Observational studies involve a researcher observing a sample population and collecting data as it occurs naturally, without intervention.

Finally, an experiment involves dividing a sample into multiple groups, one of which acts as the control group. For each non-control group, the variable being studied is manipulated to determine how the data collected differs from that of the control group.
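For the experimental approach, the split into control and non-control groups is usually made at random so that the groups are comparable. A minimal sketch of that step is shown below; the participant IDs, the fixed seed, and the function name `assign_groups` are assumptions made for illustration.

```python
import random


def assign_groups(participants, seed=None):
    """Randomly split participants into a control group and a treatment group."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participant IDs 1 to 20; the seed only makes the split repeatable.
control, treatment = assign_groups(range(1, 21), seed=42)
print("Control:", sorted(control))
print("Treatment:", sorted(treatment))
```

The variable being studied would then be changed for the treatment group only, and the results compared with those of the control group.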

Learn How to Perform Hypothesis Testing

Hypothesis testing is a complex process involving different moving pieces that can allow an organization to effectively leverage its data and inform strategic decisions.

If you’re interested in better understanding hypothesis testing and the role it can play within your organization, one option is to complete a course that focuses on the process. Doing so can lay the statistical and analytical foundation you need to succeed.

Do you want to learn more about hypothesis testing? Explore Business Analytics —one of our online business essentials courses —and download our Beginner’s Guide to Data & Analytics .

How to Write a Great Hypothesis

Hypothesis Definition, Format, Examples, and Tips

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Amy Morin, LCSW, is a psychotherapist and international bestselling author. Her books, including "13 Things Mentally Strong People Don't Do," have been translated into more than 40 languages. Her TEDx talk,  "The Secret of Becoming Mentally Strong," is one of the most viewed talks of all time.

A hypothesis is a tentative statement about the relationship between two or more variables. It is a specific, testable prediction about what you expect to happen in a study. It is a preliminary answer to your question that helps guide the research process.

Consider a study designed to examine the relationship between sleep deprivation and test performance. The hypothesis might be: "This study is designed to assess the hypothesis that sleep-deprived people will perform worse on a test than individuals who are not sleep-deprived."

At a Glance

A hypothesis is crucial to scientific research because it offers a clear direction for what the researchers are looking to find. This allows them to design experiments to test their predictions and add to our scientific knowledge about the world. This article explores how a hypothesis is used in psychology research, how to write a good hypothesis, and the different types of hypotheses you might use.

The Hypothesis in the Scientific Method

In the scientific method, whether it involves research in psychology, biology, or some other area, a hypothesis represents what the researchers think will happen in an experiment. The scientific method involves the following steps:

  • Forming a question
  • Performing background research
  • Creating a hypothesis
  • Designing an experiment
  • Collecting data
  • Analyzing the results
  • Drawing conclusions
  • Communicating the results

The hypothesis is a prediction, but it involves more than a guess. Most of the time, the hypothesis begins with a question which is then explored through background research. At this point, researchers then begin to develop a testable hypothesis.

Unless you are creating an exploratory study, your hypothesis should always explain what you expect to happen.

In a study exploring the effects of a particular drug, the hypothesis might be that researchers expect the drug to have some type of effect on the symptoms of a specific illness. In psychology, the hypothesis might focus on how a certain aspect of the environment might influence a particular behavior.

Remember, a hypothesis does not have to be correct. While the hypothesis predicts what the researchers expect to see, the goal of the research is to determine whether this guess is right or wrong. When conducting an experiment, researchers might explore numerous factors to determine which ones might contribute to the ultimate outcome.

In many cases, researchers may find that the results of an experiment do not support the original hypothesis. When writing up these results, the researchers might suggest other options that should be explored in future studies.

Researchers often draw a hypothesis from a specific theory or build on previous research. For example, prior research has shown that stress can impact the immune system. So a researcher might hypothesize: "People with high stress levels will be more likely to contract a common cold after being exposed to the virus than people who have low stress levels."

In other instances, researchers might look at commonly held beliefs or folk wisdom. "Birds of a feather flock together" is one example of a folk adage that a psychologist might try to investigate. The researcher might pose a specific hypothesis that "People tend to select romantic partners who are similar to them in interests and educational level."

Elements of a Good Hypothesis

So how do you write a good hypothesis? When trying to come up with a hypothesis for your research or experiments, ask yourself the following questions:

  • Is your hypothesis based on your research on a topic?
  • Can your hypothesis be tested?
  • Does your hypothesis include independent and dependent variables?

Before you come up with a specific hypothesis, spend some time doing background research. Once you have completed a literature review, start thinking about potential questions you still have. Pay attention to the discussion section in the journal articles you read. Many authors will suggest questions that still need to be explored.

How to Formulate a Good Hypothesis

To form a hypothesis, you should take these steps:

  • Collect as many observations about a topic or problem as you can.
  • Evaluate these observations and look for possible causes of the problem.
  • Create a list of possible explanations that you might want to explore.
  • After you have developed some possible hypotheses, think of ways that you could confirm or disprove each hypothesis through experimentation. This is known as falsifiability.

In the scientific method, falsifiability is an important part of any valid hypothesis. To test a claim scientifically, it must be possible for the claim to be proven false.

Students sometimes mistake falsifiability for the idea that something is false, which is not the case. Falsifiability means that if something were false, it would be possible to demonstrate that it is false.

One of the hallmarks of pseudoscience is that it makes claims that cannot be refuted or proven false.

The Importance of Operational Definitions

A variable is a factor or element that can be changed and manipulated in ways that are observable and measurable. However, the researcher must also define how the variable will be manipulated and measured in the study.

Operational definitions are specific definitions for all relevant factors in a study. This process helps make vague or ambiguous concepts detailed and measurable.

For example, a researcher might operationally define the variable "test anxiety" as the results of a self-report measure of anxiety experienced during an exam. A "study habits" variable might be defined by the amount of studying that actually occurs as measured by time.

These precise descriptions are important because many things can be measured in various ways. Clearly defining these variables and how they are measured helps ensure that other researchers can replicate your results.
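One way to make operational definitions concrete is to write down, for each variable, the construct, the measure, and the unit. The sketch below is illustrative only: the class name `OperationalDefinition`, its fields, and the units chosen are assumptions made for this example, not part of any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationalDefinition:
    """Links an abstract construct to the concrete way it is measured."""
    construct: str  # the concept named in the hypothesis
    measure: str    # the instrument or procedure used to observe it
    unit: str       # the unit in which the measurement is recorded

    def describe(self) -> str:
        return f"'{self.construct}' is measured by {self.measure} ({self.unit})"


# Hypothetical definitions modelled on the two examples in the text.
test_anxiety = OperationalDefinition(
    construct="test anxiety",
    measure="a self-report measure of anxiety experienced during an exam",
    unit="total score",
)
study_habits = OperationalDefinition(
    construct="study habits",
    measure="logged study time",
    unit="minutes per week",
)

for definition in (test_anxiety, study_habits):
    print(definition.describe())
```

Recording the measure and unit next to the construct makes it easier for another researcher to see exactly what would need to be repeated in a replication.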

Replicability

One of the basic principles of any type of scientific research is that the results must be replicable.

Replication means repeating an experiment in the same way to produce the same results. By clearly detailing the specifics of how the variables were measured and manipulated, other researchers can better understand the results and repeat the study if needed.

Some variables are more difficult than others to define. For example, how would you operationally define a variable such as aggression? For obvious ethical reasons, researchers cannot create a situation in which a person behaves aggressively toward others.

To measure this variable, the researcher must devise a measurement that assesses aggressive behavior without harming others. The researcher might utilize a simulated task to measure aggressiveness in this situation.

Hypothesis Checklist

  • Does your hypothesis focus on something that you can actually test?
  • Does your hypothesis include both an independent and dependent variable?
  • Can you manipulate the variables?
  • Can your hypothesis be tested without violating ethical standards?

The hypothesis you use will depend on what you are investigating and hoping to find. Some of the main types of hypotheses that you might use include:

  • Simple hypothesis: This type of hypothesis suggests there is a relationship between one independent variable and one dependent variable.
  • Complex hypothesis: This type suggests a relationship between three or more variables, such as two independent variables and one dependent variable.
  • Null hypothesis: This hypothesis suggests no relationship exists between two or more variables.
  • Alternative hypothesis: This hypothesis states the opposite of the null hypothesis.
  • Statistical hypothesis: This hypothesis uses statistical analysis to evaluate a representative population sample and then generalizes the findings to the larger group.
  • Logical hypothesis: This hypothesis assumes a relationship between variables without collecting data or evidence.

A hypothesis often follows a basic format of "If {this happens} then {this will happen}." One way to structure your hypothesis is to describe what will happen to the dependent variable if you change the independent variable.

The basic format might be: "If {these changes are made to a certain independent variable}, then we will observe {a change in a specific dependent variable}."

A few examples of simple hypotheses:

  • "Students who eat breakfast will perform better on a math exam than students who do not eat breakfast."
  • "Students who experience test anxiety before an English exam will get lower scores than students who do not experience test anxiety."
  • "Motorists who talk on the phone while driving will be more likely to make errors on a driving course than those who do not talk on the phone."
  • "Children who receive a new reading intervention will have higher reading scores than children who do not receive the intervention."

Examples of a complex hypothesis include:

  • "People with high-sugar diets and sedentary activity levels are more likely to develop depression."
  • "Younger people who are regularly exposed to green, outdoor areas have better subjective well-being than older adults who have limited exposure to green spaces."

Examples of a null hypothesis include:

  • "There is no difference in anxiety levels between people who take St. John's wort supplements and those who do not."
  • "There is no difference in scores on a memory recall task between children and adults."
  • "There is no difference in aggression levels between children who play first-person shooter games and those who do not."

Examples of an alternative hypothesis:

  • "People who take St. John's wort supplements will have less anxiety than those who do not."
  • "Adults will perform better on a memory task than children."
  • "Children who play first-person shooter games will show higher levels of aggression than children who do not." 
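To show how a null and an alternative hypothesis like the ones above could be confronted with data, here is a minimal sketch of a permutation test. The anxiety scores are made up, and the function name `permutation_p_value`, the number of resamples, and the 0.05 threshold are assumptions chosen for illustration.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_resamples=5_000, seed=0):
    """One-sided p-value for H1: mean(treated) < mean(control).

    Under H0 the group labels are interchangeable, so we shuffle them
    and count how often a difference at least as extreme appears.
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if diff <= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


# Made-up anxiety scores (higher = more anxious), for illustration only.
supplement = [12, 15, 11, 14, 10, 13, 12, 11]
no_supplement = [16, 18, 15, 17, 19, 14, 16, 18]

p = permutation_p_value(supplement, no_supplement)
print(f"p = {p:.4f}")
print("reject H0 at 0.05" if p < 0.05 else "fail to reject H0 at 0.05")
```

A small p-value means that a difference this large would rarely appear if the null hypothesis were true. It does not prove the alternative hypothesis.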

Collecting Data on Your Hypothesis

Once a researcher has formed a testable hypothesis, the next step is to select a research design and start collecting data. The research method depends largely on exactly what they are studying. There are two basic types of research methods: descriptive research and experimental research.

Descriptive Research Methods

Descriptive research methods such as case studies, naturalistic observations, and surveys are often used when conducting an experiment is difficult or impossible. These methods are best used to describe different aspects of a behavior or psychological phenomenon.

Once a researcher has collected data using descriptive methods, a correlational study can examine how the variables are related. This research method might be used to investigate a hypothesis that is difficult to test experimentally.

Experimental Research Methods

Experimental methods are used to demonstrate causal relationships between variables. In an experiment, the researcher systematically manipulates a variable of interest (known as the independent variable) and measures the effect on another variable (known as the dependent variable).

Unlike correlational studies, which can only be used to determine if there is a relationship between two variables, experimental methods can be used to determine the actual nature of the relationship: whether changes in one variable actually cause another to change.
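The contrast between the two approaches can be sketched in code. In a correlational study both variables are only observed, so the analysis reduces to measuring their association. In an experiment the researcher decides who receives which condition, typically by random assignment. The data, the function names `pearson_r` and `randomly_assign`, and the sample sizes below are assumptions for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def randomly_assign(participants, seed=0):
    """Split participants into two groups whose sizes differ by at most one."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Correlational study: both variables are only observed (made-up data).
hours_slept = [4, 5, 6, 7, 8, 9]
test_score = [55, 60, 58, 70, 75, 74]
r = pearson_r(hours_slept, test_score)
print(f"r = {r:.2f}")

# Experiment: the researcher decides who gets which condition.
sleep_deprived, rested = randomly_assign(range(1, 21))
print("sleep-deprived group:", sorted(sleep_deprived))
print("rested group:", sorted(rested))
```

Random assignment is what allows a difference between the groups to be attributed to the manipulated variable rather than to pre-existing differences between participants.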

The hypothesis is a critical part of any scientific exploration. It represents what researchers expect to find in a study or experiment. In situations where the hypothesis is unsupported by the research, the research still has value. Such research helps us better understand how different aspects of the natural world relate to one another. It also helps us develop new hypotheses that can then be tested in the future.



Research Method


What is a Hypothesis – Types, Examples and Writing Guide


What is a Hypothesis

Definition:

A hypothesis is an educated guess or proposed explanation for a phenomenon, based on some initial observations or data. It is a tentative statement that can be tested and potentially proven or disproven through further investigation and experimentation.

Hypotheses are often used in scientific research to guide the design of experiments and the collection and analysis of data. They are an essential element of the scientific method, as they allow researchers to make predictions about the outcome of their experiments and to test those predictions to determine their accuracy.

Types of Hypothesis

The main types of hypothesis are as follows:

Research Hypothesis

A research hypothesis is a statement that predicts a relationship between variables. It is usually formulated as a specific statement that can be tested through research, and it is often used in scientific research to guide the design of experiments.

Null Hypothesis

The null hypothesis is a statement that assumes there is no significant difference or relationship between variables. It is often used as a starting point for testing the research hypothesis, and if the results of the study reject the null hypothesis, it suggests that there is a significant difference or relationship between variables.

Alternative Hypothesis

An alternative hypothesis is a statement that assumes there is a significant difference or relationship between variables. It is often used as an alternative to the null hypothesis and is tested against the null hypothesis to determine which statement is more accurate.

Directional Hypothesis

A directional hypothesis is a statement that predicts the direction of the relationship between variables. For example, a researcher might predict that increasing the amount of exercise will result in a decrease in body weight.

Non-directional Hypothesis

A non-directional hypothesis is a statement that predicts the relationship between variables but does not specify the direction. For example, a researcher might predict that there is a relationship between the amount of exercise and body weight, but they do not specify whether increasing or decreasing exercise will affect body weight.
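In statistical testing, the difference between a directional and a non-directional hypothesis commonly shows up as a one-sided versus a two-sided p-value. The sketch below computes both with an exact permutation test on made-up weight-change data. The function name `exact_permutation_p_values` and all numbers are assumptions for illustration, and full enumeration is only practical for small samples.

```python
from itertools import combinations


def exact_permutation_p_values(group_a, group_b):
    """Exact one- and two-sided p-values for a difference in means.

    Enumerates every way of splitting the pooled data into groups of the
    original sizes, which is only practical for small samples.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    lower = two_sided = count = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = sum_a / n_a - (total - sum_a) / n_b
        count += 1
        # Directional: group A is predicted to be lower than group B.
        if diff <= observed + 1e-12:
            lower += 1
        # Non-directional: a difference in either direction counts.
        if abs(diff) >= abs(observed) - 1e-12:
            two_sided += 1
    return lower / count, two_sided / count


# Made-up weight change in kg over 12 weeks, for illustration only.
more_exercise = [-3.1, -2.4, -1.8, -2.9, -0.5, -2.2]
less_exercise = [-0.4, 0.3, -1.1, 0.8, -0.2, 0.5]

p_directional, p_non_directional = exact_permutation_p_values(
    more_exercise, less_exercise
)
print(f"directional (one-sided) p     = {p_directional:.4f}")
print(f"non-directional (two-sided) p = {p_non_directional:.4f}")
```

When the observed difference lies in the predicted direction, the one-sided p-value is never larger than the two-sided one. This is why the direction should be fixed before the data are collected.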

Statistical Hypothesis

A statistical hypothesis is a statement that assumes a particular statistical model or distribution for the data. It is often used in statistical analysis to test the significance of a particular result.

Composite Hypothesis

A composite hypothesis is a statement that assumes more than one condition or outcome. It can be divided into several sub-hypotheses, each of which represents a different possible outcome.

Empirical Hypothesis

An empirical hypothesis is a statement that is based on observed phenomena or data. It is often used in scientific research to develop theories or models that explain the observed phenomena.

Simple Hypothesis

A simple hypothesis is a statement that assumes only one outcome or condition. It is often used in scientific research to test a single variable or factor.

Complex Hypothesis

A complex hypothesis is a statement that assumes multiple outcomes or conditions. It is often used in scientific research to test the effects of multiple variables or factors on a particular outcome.

Applications of Hypothesis

Hypotheses are used in various fields to guide research and make predictions about the outcomes of experiments or observations. Here are some examples of how hypotheses are applied in different fields:

  • Science: In scientific research, hypotheses are used to test the validity of theories and models that explain natural phenomena. For example, a hypothesis might be formulated to test the effects of a particular variable on a natural system, such as the effects of climate change on an ecosystem.
  • Medicine: In medical research, hypotheses are used to test the effectiveness of treatments and therapies for specific conditions. For example, a hypothesis might be formulated to test the effects of a new drug on a particular disease.
  • Psychology: In psychology, hypotheses are used to test theories and models of human behavior and cognition. For example, a hypothesis might be formulated to test the effects of a particular stimulus on the brain or behavior.
  • Sociology: In sociology, hypotheses are used to test theories and models of social phenomena, such as the effects of social structures or institutions on human behavior. For example, a hypothesis might be formulated to test the effects of income inequality on crime rates.
  • Business: In business research, hypotheses are used to test the validity of theories and models that explain business phenomena, such as consumer behavior or market trends. For example, a hypothesis might be formulated to test the effects of a new marketing campaign on consumer buying behavior.
  • Engineering: In engineering, hypotheses are used to test the effectiveness of new technologies or designs. For example, a hypothesis might be formulated to test the efficiency of a new solar panel design.

How to write a Hypothesis

Here are the steps to follow when writing a hypothesis:

Identify the Research Question

The first step is to identify the research question that you want to answer through your study. This question should be clear, specific, and focused. It should be something that can be investigated empirically and that has some relevance or significance in the field.

Conduct a Literature Review

Before writing your hypothesis, it’s essential to conduct a thorough literature review to understand what is already known about the topic. This will help you to identify the research gap and formulate a hypothesis that builds on existing knowledge.

Determine the Variables

The next step is to identify the variables involved in the research question. A variable is any characteristic or factor that can vary or change. There are two types of variables: independent and dependent. The independent variable is the one that is manipulated or changed by the researcher, while the dependent variable is the one that is measured or observed as a result of the independent variable.

Formulate the Hypothesis

Based on the research question and the variables involved, you can now formulate your hypothesis. A hypothesis should be a clear and concise statement that predicts the relationship between the variables. It should be testable through empirical research and based on existing theory or evidence.

Write the Null Hypothesis

The null hypothesis is the opposite of the alternative hypothesis, which is the hypothesis that you are testing. The null hypothesis states that there is no significant difference or relationship between the variables. It is important to write the null hypothesis because it allows you to compare your results with what would be expected by chance.

Refine the Hypothesis

After formulating the hypothesis, it’s important to refine it and make it more precise. This may involve clarifying the variables, specifying the direction of the relationship, or making the hypothesis more testable.

Examples of Hypothesis

Here are a few examples of hypotheses in different fields:

  • Psychology: “Increased exposure to violent video games leads to increased aggressive behavior in adolescents.”
  • Biology: “Higher levels of carbon dioxide in the atmosphere will lead to increased plant growth.”
  • Sociology: “Individuals who grow up in households with higher socioeconomic status will have higher levels of education and income as adults.”
  • Education: “Implementing a new teaching method will result in higher student achievement scores.”
  • Marketing: “Customers who receive a personalized email will be more likely to make a purchase than those who receive a generic email.”
  • Physics: “An increase in temperature will cause an increase in the volume of a gas, assuming all other variables remain constant.”
  • Medicine: “Consuming a diet high in saturated fats will increase the risk of developing heart disease.”

Purpose of Hypothesis

The purpose of a hypothesis is to provide a testable explanation for an observed phenomenon or a prediction of a future outcome based on existing knowledge or theories. A hypothesis is an essential part of the scientific method and helps to guide the research process by providing a clear focus for investigation. It enables scientists to design experiments or studies to gather evidence and data that can support or refute the proposed explanation or prediction.

The formulation of a hypothesis is based on existing knowledge, observations, and theories, and it should be specific, testable, and falsifiable. A specific hypothesis helps to define the research question, which is important in the research process as it guides the selection of an appropriate research design and methodology. Testability of the hypothesis means that it can be proven or disproven through empirical data collection and analysis. Falsifiability means that the hypothesis should be formulated in such a way that it can be proven wrong if it is incorrect.

In addition to guiding the research process, the testing of hypotheses can lead to new discoveries and advancements in scientific knowledge. When a hypothesis is supported by the data, it can be used to develop new theories or models to explain the observed phenomenon. When a hypothesis is not supported by the data, it can help to refine existing theories or prompt the development of new hypotheses to explain the phenomenon.

When to use Hypothesis

Here are some common situations in which hypotheses are used:

  • In scientific research, hypotheses are used to guide the design of experiments and to help researchers make predictions about the outcomes of those experiments.
  • In social science research, hypotheses are used to test theories about human behavior, social relationships, and other phenomena.
  • In business, hypotheses can be used to guide decisions about marketing, product development, and other areas. For example, a hypothesis might be that a new product will sell well in a particular market, and this hypothesis can be tested through market research.

Characteristics of Hypothesis

Here are some common characteristics of a hypothesis:

  • Testable: A hypothesis must be able to be tested through observation or experimentation. This means that it must be possible to collect data that will either support or refute the hypothesis.
  • Falsifiable: A hypothesis must be able to be proven false if it is not supported by the data. If a hypothesis cannot be falsified, then it is not a scientific hypothesis.
  • Clear and concise: A hypothesis should be stated in a clear and concise manner so that it can be easily understood and tested.
  • Based on existing knowledge: A hypothesis should be based on existing knowledge and research in the field. It should not be based on personal beliefs or opinions.
  • Specific: A hypothesis should be specific in terms of the variables being tested and the predicted outcome. This will help to ensure that the research is focused and well-designed.
  • Tentative: A hypothesis is a tentative statement or assumption that requires further testing and evidence to be confirmed or refuted. It is not a final conclusion or assertion.
  • Relevant: A hypothesis should be relevant to the research question or problem being studied. It should address a gap in knowledge or provide a new perspective on the issue.

Advantages of Hypothesis

Hypotheses have several advantages in scientific research and experimentation:

  • Guides research: A hypothesis provides a clear and specific direction for research. It helps to focus the research question, select appropriate methods and variables, and interpret the results.
  • Predictive power: A hypothesis makes predictions about the outcome of research, which can be tested through experimentation. This allows researchers to evaluate the validity of the hypothesis and make new discoveries.
  • Facilitates communication: A hypothesis provides a common language and framework for scientists to communicate with one another about their research. This helps to facilitate the exchange of ideas and promotes collaboration.
  • Efficient use of resources: A hypothesis helps researchers to use their time, resources, and funding efficiently by directing them towards specific research questions and methods that are most likely to yield results.
  • Provides a basis for further research: A hypothesis that is supported by data provides a basis for further research and exploration. It can lead to new hypotheses, theories, and discoveries.
  • Increases objectivity: A hypothesis can help to increase objectivity in research by providing a clear and specific framework for testing and interpreting results. This can reduce bias and increase the reliability of research findings.

Limitations of Hypothesis

Some limitations of hypotheses are as follows:

  • Limited to observable phenomena: Hypotheses are limited to observable phenomena and cannot account for unobservable or intangible factors. This means that some research questions may not be amenable to hypothesis testing.
  • May be inaccurate or incomplete: Hypotheses are based on existing knowledge and research, which may be incomplete or inaccurate. This can lead to flawed hypotheses and erroneous conclusions.
  • May be biased: Hypotheses may be biased by the researcher’s own beliefs, values, or assumptions. This can lead to selective interpretation of data and a lack of objectivity in research.
  • Cannot prove causation on its own: When a hypothesis is supported by observational data, this shows only a correlation between variables. It does not by itself prove causation, which requires controlled experimentation and further analysis.
  • Limited to specific contexts: Hypotheses are limited to specific contexts and may not be generalizable to other situations or populations. This means that results may not be applicable in other contexts or may require further testing.
  • May be affected by chance: Hypotheses may be affected by chance or random variation, which can obscure or distort the true relationship between variables.

About the author


Muhammad Hassan

Researcher, Academic Writer, Web developer


How to Write a Strong Hypothesis | Guide & Examples

Published on 6 May 2022 by Shona McCombes .

A hypothesis is a statement that can be tested by scientific research. If you want to test a relationship between two or more variables, you need to write hypotheses before you start your experiment or data collection.


A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess – it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations, and statistical analysis of data).

Variables in hypotheses

Hypotheses propose a relationship between two or more variables. An independent variable is something the researcher changes or controls. A dependent variable is something the researcher observes and measures.

In this example, the independent variable is exposure to the sun – the assumed cause. The dependent variable is the level of happiness – the assumed effect.


Step 1: Ask a question

Writing a hypothesis begins with a research question that you want to answer. The question should be focused, specific, and researchable within the constraints of your project.

Step 2: Do some preliminary research

Your initial answer to the question should be based on what is already known about the topic. Look for theories and previous studies to help you form educated assumptions about what your research will find.

At this stage, you might construct a conceptual framework to identify which variables you will study and what you think the relationships are between them. Sometimes, you’ll have to operationalise more complex constructs.

Step 3: Formulate your hypothesis

Now you should have some idea of what you expect to find. Write your initial answer to the question in a clear, concise sentence.

Step 4: Refine your hypothesis

You need to make sure your hypothesis is specific and testable. There are various ways of phrasing a hypothesis, but all the terms you use should have clear definitions, and the hypothesis should contain:

  • The relevant variables
  • The specific group being studied
  • The predicted outcome of the experiment or analysis

Step 5: Phrase your hypothesis in three ways

To identify the variables, you can write a simple prediction in if … then form. The first part of the sentence states the independent variable and the second part states the dependent variable.

In academic research, hypotheses are more commonly phrased in terms of correlations or effects, where you directly state the predicted relationship between variables.

If you are comparing two groups, the hypothesis can state what difference you expect to find between them.
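The three phrasings described above can be generated from the same ingredients: a population, an independent variable, a dependent variable, and a predicted direction. The sketch below is a hypothetical helper (the class `Hypothesis` and its method names are assumptions, and the templates assume plural noun phrases). It is meant to show the structure, not to replace careful wording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    population: str        # the specific group being studied
    independent: str       # plural noun phrase for the assumed cause
    dependent: str         # plural noun phrase for the assumed effect
    positive: bool = True  # predicted direction of the relationship

    def if_then(self) -> str:
        change = "rise" if self.positive else "fall"
        return (f"If {self.independent} increase among {self.population}, "
                f"then {self.dependent} will {change}.")

    def effect(self) -> str:
        sign = "positive" if self.positive else "negative"
        return (f"Among {self.population}, {self.independent} have a "
                f"{sign} effect on {self.dependent}.")

    def group_difference(self) -> str:
        comparison = "higher" if self.positive else "lower"
        return (f"{self.population.capitalize()} with high {self.independent} "
                f"will have {comparison} {self.dependent} than those with "
                f"low {self.independent}.")


# Hypothetical example values, not taken from the article.
h = Hypothesis(
    population="first-year students",
    independent="weekly study hours",
    dependent="exam scores",
)
for phrasing in (h.if_then(), h.effect(), h.group_difference()):
    print(phrasing)
```

Writing all three versions side by side is a quick check that the variables, the group, and the predicted outcome are stated consistently.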

Step 6: Write a null hypothesis

If your research involves statistical hypothesis testing, you will also have to write a null hypothesis. The null hypothesis is the default position that there is no association between the variables. The null hypothesis is written as H0, while the alternative hypothesis is H1 or Ha.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses, by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.
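As a small worked illustration of "how likely it is that a pattern could have arisen by chance", the sketch below runs an exact binomial test. The scenario (15 of 20 participants preferring one of two options), the function name `binomial_p_value`, and the 0.05 threshold are assumptions for this example.

```python
from math import comb


def binomial_p_value(successes, trials, p_null=0.5):
    """One-sided exact p-value: P(X >= successes) if H0 (p = p_null) is true."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Made-up result: 15 of 20 participants chose the new design.
p = binomial_p_value(successes=15, trials=20)
alpha = 0.05  # conventional significance threshold, chosen before the study
print(f"p = {p:.4f}")
print("reject H0" if p < alpha else "fail to reject H0")
```

Here the null hypothesis is that participants have no preference (each option is chosen with probability 0.5), and the p-value is the probability of seeing 15 or more choices for the new design under that assumption.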

A hypothesis is not just a guess. It should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations, and statistical analysis of data).

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (‘x affects y because …’).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study, the statistical hypotheses correspond logically to the research hypothesis.
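As an illustration, suppose the research hypothesis predicts that a treatment changes the average value of some outcome. Writing \mu_T and \mu_C for the population means under treatment and control (symbols introduced here for the example only), a non-directional pair and a directional pair could be stated as:

```latex
% Non-directional (two-sided) pair
H_0 : \mu_T = \mu_C \qquad \text{versus} \qquad H_1 : \mu_T \neq \mu_C

% Directional (one-sided) pair, predicting a higher mean under treatment
H_0 : \mu_T \le \mu_C \qquad \text{versus} \qquad H_1 : \mu_T > \mu_C
```

In both pairs the two statements are mutually exclusive and together cover every possible value of the parameters, which is what lets the data count against one of them.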

Cite this Scribbr article


McCombes, S. (2022, May 06). How to Write a Strong Hypothesis | Guide & Examples. Scribbr. Retrieved 22 April 2024, from https://www.scribbr.co.uk/research-methods/hypothesis-writing/

A concise guide to reproducible research using secondary data

Chapter 2 Formulating a hypothesis

formulate and test a working hypothesis

“There is no single best way to develop a research idea.” ( Pischke 2012 )

2.1 How do you develop a research question and formulate a hypothesis?

You decide to undertake a scientific project. Where do you start? First, you need to find a research question that interests you and formulate a hypothesis. We will introduce some key terminology, steps you can take, and examples of how to develop research questions.

What if someone assigns a topic to me? For students attending undergraduate and graduate courses that often pick topics from a list, all of these steps are equally important and necessary. You still need to formulate a research question and a hypothesis. And it is important to clarify the relevance of your topic for yourself.

When thinking about a research question, you need to identify a topic that is:

  • Relevant, important in the world and interesting to you as a researcher: Does working on the topic excite you? You will spend many hours thinking about it and working on it. Therefore, it should be interesting and engaging enough to motivate your continued work on it.
  • Specific: not too broad and not too narrow
  • Feasible to research within a given time frame: Is it possible to answer the research question given your time budget, data and additional resources?

How do you find a topic or develop a feasible research idea in the first place? Finding an idea is not difficult; the critical part is finding a good idea. How do you do that? There is no single way to get an idea. Rather, people come up with potential ideas in a myriad of ways (see, for example, Varian (2016)).

You can find inspiration by

  • Looking at insights from the world around you: your own life and experiences, and the behavior of people around you
  • Talking to people around you, experts, other students, family members
  • Talking to individuals outside your field (non-economists)
  • Talking to professionals working in the area you are interested in (you may use social media and professional platforms like LinkedIn or Twitter to make contact)
  • Reading journal articles from other non-economic social sciences and the medical literature
  • What are the issues being discussed?
  • How do these issues affect people’s lives?

In addition you could

  • Go to virtual and in-person seminars, for example, the Essen Health Economics Seminar
  • Look at abstracts of scientific articles and working papers
  • Look at the literature in a specific field you are interested in, for example, screening complete issues of journals or editorials about certain research advancements. By reading this literature you might come up with the idea on how to extend and refine previous research.

Once you have identified a research question that is of interest to you, you need to define a hypothesis.

2.2 What is a hypothesis?

A hypothesis is a statement that introduces your research question and suggests the results you might find. It is an educated guess. You start by posing an economic question and formulate a hypothesis about this question. Then you test it with your data and empirical analysis, and the evidence either supports or contradicts the hypothesis. It constitutes the main basis of your scientific investigation and you should be careful when creating it.

2.2.1 Develop a hypothesis

Before you formulate your hypothesis, read up on the topic of interest. This should provide you with sufficient information to narrow down your research question. Once you find your question you need to develop a hypothesis, which contains a statement of your expectations regarding your research question’s results. You propose to prove your hypothesis with your research by testing the relationship between two variables of interest. Thus, a hypothesis should be testable with the data at hand. There are two types of hypotheses: alternative or null. Null states that there is no effect. Alternative states that there is an effect.

There is an alternative view on this that suggests one should not look at the literature too early on in the idea-generating process to not be influenced and shaped by someone else’s ideas ( Varian 2016 ) . According to this view you can spend some time (i.e. a few weeks) trying to develop your own original idea. Even if you end up with an idea that has already been pursued by someone else, this will still provide you with good practice in developing publishable ideas. After you have developed an idea and made sure that it was not yet investigated in the literature, you can start conducting a systematic literature review. By doing this, you can find some other interesting insights from the work of others that you can synthesize in your own work to produce something novel and original.

2.2.2 Identify relevant literature

For your research project you will need to identify and collect previous relevant literature. It should involve a thorough search of the keywords in relevant databases and journals. Place emphasis on articles from high-ranking journals with significant numbers of citations. This will give you an indication of the most influential and important work in the field. Once you identify and collect the relevant literature for your topic, you will need to critically synthesize it in your literature review.

When you perform your literature review, consider theories that may inform your research question. For example, when studying physician behavior you may consider principal-agent theory.

2.2.3 Research question or literature review: the chicken or the egg problem?

Whether you start reading the literature first or by developing an idea may depend on your level (graduate student, early career researcher) and other goals. However, thinking freely about what you like to investigate first may help to critically develop a feasible and interesting research question.

We highlight an example of how to start by investigating the real world and subsequently pose a research question (“How to Write a Strong Hypothesis Steps and Examples” 2019; “Developing Strong Research Questions Criteria and Examples” 2019; Schilbach 2019). For example, you notice that people spend an extensive amount of time looking at their smartphones. Maybe you engage in the same behavior yourself. In addition, you read the BBC News article "Social media damages teenagers’ mental health, report says".

Figure: Social media and mental health

Source: BBC

You decide to translate this article and your observations into a research question: How does social media use affect mental health? Before you formulate your hypothesis, read up on the topic of interest. Read economic, medical and other social science literature on the topic. There is likely to be a vast amount of literature from non-economic fields that do research on your topic of interest, for example, psychology or neuroscience. Familiarize yourself with it and master it. Do not get distracted by scientific methodologies and techniques that might not seem up to par with economic studies (small sample sizes, endogeneity, uncovering association rather than causation, etc.), but rather focus on suggestions of potential mechanisms.

A hypothesis is then your research question distilled into a one sentence statement, which presents your expectations regarding the results. You propose to prove your hypothesis by testing the relationship between two variables of interest with the data at hand. There are two types of hypotheses: alternative or null. The null hypothesis states that there is no effect. The alternative hypothesis states that there is an effect.

A hypothesis related to the above-stated research question could be: The increased use of social media among teenagers leads to (is associated with) worse mental health outcomes, i.e. increased incidence of depression, eating disorders, worse well-being and lower self-esteem. It suggests a direction of a relationship that you expect to find that is guided by your observations and existing evidence. It is testable with scientific research methods by using statistical analysis of the relevant data.

Your hypothesis suggests a relationship between two variables: social media use (your independent variable \(X\) ) and mental health (dependent variable \(Y\) ). It could be framed in terms of correlation (is associated with) or causation (leads to). This should be reflected in the choice of scientific investigation you decide to undertake.

The null hypothesis is: There is no relationship between social media use among teenagers and their mental health .
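As a rough formal sketch of the correlational version of this pair of hypotheses, the statements can be written in terms of a population correlation. The symbol \(\rho_{XY}\) (the correlation between social media use \(X\) and a mental health score \(Y\)) is an assumption introduced here for illustration and does not appear in the text above.

```latex
H_{0}: \rho_{XY} = 0
\qquad \text{versus} \qquad
H_{a}: \rho_{XY} \neq 0
```

If \(Y\) is scored so that higher values mean better mental health, the directional hypothesis stated earlier (more use, worse outcomes) would correspond to the one-sided alternative \(H_{a}: \rho_{XY} < 0\).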

2.3 Resources box

2.3.1 How to develop strong research questions

  • The form of the research process
  • Varian, H. R. (2016). How to build an economic model in your spare time. The American Economist, 61(1), 81-90.

2.3.2 Identify relevant literature from major general interest and field literature

To identify the relevant literature you can

  • use academic search engines such as Google Scholar, Web of Science, EconLit, PubMed.
  • search working paper series such as the National Bureau of Economic Research , NetEc or IZA
  • search more general resource sites such as Resources for Economists
  • go to the library/use library database

2.3.3 Assess the quality of a journal article

Several rankings may help to assess the quality of research you consider

  • Journals of general interest and by field in economics and management:
      • For German-speaking countries, consider the VWL / BWL Handelsblatt Ranking for economics and management
      • The German Association of Management Scholars provides an expert-based ranking: VHB JourQual 3.0, Teilranking Management im Gesundheitswesen
      • Web of Science Impact Factors
      • Scimago
  • Health Economics, Health Services and Health Care Management Research: Health Economics Journals List
  • Be aware that like in any other domain there are predatory publishing practices .

Use tools to investigate how a journal article is connected to other works

  • Citationgecko
  • Connected papers
  • scite_ – a tool to get a first impression whether a study is disputed or academic consensus

2.3.4 Organize your literature

  • Zotero (free of charge)
  • Mendeley (free of charge)
  • EndNote (potentially free of charge via your university)
  • Citavi (potentially free of charge via your university)
  • BibTeX if you work with TeX
  • Excel spreadsheet

2.4 Checklist to get started with formulating your hypothesis

  • Find an interesting and relevant research topic, if not assigned
  • Absorb all the information you can easily obtain from various sources within and outside the academic literature
  • Formulate one compelling research question
  • Find the best available empirical and theoretical evidence that is related to your research question
  • Formulate a hypothesis
  • Check whether data are available for analysis
  • Challenge your idea with your fellows or senior researchers

2.5 Example: Hellerstein ( 1998 )

As an illustration of the research process of formulating a hypothesis, designing a study, running a study, collecting and analyzing the data and, finally, reporting the study, we provide an example by replicating Judith K. Hellerstein’s paper “The Importance of the Physician in the Generic versus Trade-Name Prescription Decision” that was published in 1998 in the RAND Journal of Economics.

Hellerstein’s 1998 paper has shaped the discussion about behavioral factors of physician decisions and pharmaceutical markets for more than two decades. As of 27 March 2022, the study had received 448 citations on Google Scholar, including recent mentions in top field journals such as Journal of Public Economics (2021), Journal of Health Economics (2019), and Health Economics (2019).

Figure 2.1: Connected graph of Hellerstein ( 1998 ) , February 2022

Figure 2.1 shows a connected graph of prior and derivative works related to the study.

The work has impacted the literature researching the role of physician behavior and its influence on access, adoption and diffusion of health services, moral hazard and incentives in prescription and treatment decisions and the influence of different payment schemes, and a vast body of literature studying the pharmaceutical market.

The research that has been influenced by Hellerstein includes evidence on:

  • generic drug entries and market efficiency
  • the effectiveness of pharmaceutical promotion
  • the effectiveness of price regulations
  • the role of patents and dynamics of market segmentation

At the end of each chapter, we demonstrate insights into this study that we replicate.

2.5.1 Context of the study - escalating health expenditures

In the United States, total prescription drug expenditure in 2020 amounted to about 358.7 billion US dollars (Statista n.d.). Prescribing generic drugs instead of more expensive brand-name versions is one option for reducing total health care expenditure. Generic drugs are bioequivalent in their active ingredients and can help contain prescription expenditure (Kesselheim 2008), as they are between 20% and 90% cheaper than their trade-name alternatives (Dunne et al. 2013).

2.5.2 Research question - How does a patient’s insurance status influence the physician’s choice between generic compared to brand-name drugs?

Physicians are faced with a multitude of medication options, including the choice between generic and trade-name drugs. Physicians ideally act as agents for their patients to identify the best available treatment option based on their needs. Choosing the best treatment entails cost of coordination and cognition. The prescription of generic drugs may serve as an example to what extent physicians customize treatments according to patients’ needs with regards to cost. From an economic point of view we may expect that once a generic drug is available, a perfectly rational agent (i.e. physician) would prescribe a generic drug instead of the trade-name version if therapeutically identical ( Dranove 1989 ) . This leads to the following research question: “Do physicians vary their prescription decisions on a patient-by-patient basis or do they systematically prescribe the same version, trade-name or generic, to all patients?” .

Hellerstein’s 1998 study examines two hypotheses:

  • The physician prescribing choice influences the selection of a generic over a brand-name drug
  • The patient’s insurance status influences the physician’s choice between generic and brand-name drugs.

For the purpose of this example and in the replication exercise we focus on the second aspect.

2.5.3 Hypothesis

The paper formulates the following hypothesis:

Physicians are more likely to prescribe generics to patients who do not have insurance coverage for prescription pharmaceuticals (moral hazard in insurance)

Hellerstein (1998) discusses that, based on insurance status, some patients may demand certain care more than others. If, for example, the prescription drug is reimbursed by the patient’s health insurance, this may cause overconsumption. This behavior can potentially differ by the patient’s insurance scheme. A patient who has no insurance and, thus, does not get any reimbursement for prescription drugs might have a stronger incentive to demand cheaper generic drugs (Danzon and Furukawa 2011) than a patient with insurance that covers prescription drugs, either generic or trade-name. Given that the United States has different insurance schemes with varying prescription drug coverage, it is of interest to investigate the role of a patient’s insurance status in the physician’s choice between generic and brand-name drugs.

Hellerstein (1998) treats a patient’s insurance status as a way of dividing the study population into groups for which the choice between generic and brand-name drugs differs. She suggests that there is a relationship between the prescription of a generic drug and the insurance status of a patient (Hellerstein 1998).

Providing answers to a research question requires formulating and testing a hypothesis. Based on logic, theory or previous research, a hypothesis proposes an expected relationship within the given data. According to her research question, Hellerstein hypothesizes that: Physicians are more likely to prescribe generics to patients who do not have insurance coverage for prescription pharmaceuticals.

Specifically, she writes “if there is moral hazard in insurance when it comes to physician prescription behavior, there will be differences in the propensity of physicians to prescribe low-cost generic drugs, and these differences will be (partially) a function of the insurance held by the patient. In particular, if moral hazard exists, patients with extensive insurance coverage for prescription drugs (like those on Medicaid in 1989) should receive prescriptions written for generic drugs less frequently than patients with no prescription drug coverage.” ( Hellerstein 1998, 113 )

Based on Hellerstein’s considerations, we expect the effect of the insurance status on whether a patient receives a generic to be different from zero. To obtain a testable null hypothesis, we reformulate this relationship so that we reject the hypothesis if our expectations are correct. This means, if we expect to see an effect of insurance on prescriptions of generics, our null hypothesis is that insurance status has no effect on the outcome (prescription of generic drugs). No moral hazard arises from having obtained insurance.
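One possible way to write this null hypothesis down is as a restriction on a coefficient in a binary-outcome model. The sketch below is an illustration only: the symbols \(\text{generic}_{i}\), \(\text{insured}_{i}\), \(x_{i}\), \(\alpha\), \(\beta\), \(\gamma\) and the link function \(F\) are assumptions introduced here, not the specification used in the paper.

```latex
\Pr(\text{generic}_{i} = 1) = F\left(\alpha + \beta \, \text{insured}_{i} + \gamma' x_{i}\right),
\qquad
H_{0}: \beta = 0
\quad \text{versus} \quad
H_{a}: \beta \neq 0
```

Under this notation, moral hazard as described in the quotation above would show up as \(\beta < 0\): patients with prescription drug coverage receive generics less often.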

Statistics LibreTexts

9.E: Hypothesis Testing with One Sample (Exercises)

These are homework exercises to accompany the Textmap created for "Introductory Statistics" by OpenStax.

9.1: Introduction

9.2: Null and Alternative Hypotheses

Some of the following statements refer to the null hypothesis, some to the alternate hypothesis.

State the null hypothesis, \(H_{0}\), and the alternative hypothesis, \(H_{a}\), in terms of the appropriate parameter \((\mu \text{ or } p)\).

  • The mean number of years Americans work before retiring is 34.
  • At most 60% of Americans vote in presidential elections.
  • The mean starting salary for San Jose State University graduates is at least $100,000 per year.
  • Twenty-nine percent of high school seniors get drunk each month.
  • Fewer than 5% of adults ride the bus to work in Los Angeles.
  • The mean number of cars a person owns in her lifetime is not more than ten.
  • About half of Americans prefer to live away from cities, given the choice.
  • Europeans have a mean paid vacation each year of six weeks.
  • The chance of developing breast cancer is under 11% for women.
  • Private universities' mean tuition cost is more than $20,000 per year.

Answers:

  • \(H_{0}: \mu = 34; H_{a}: \mu \neq 34\)
  • \(H_{0}: p \leq 0.60; H_{a}: p > 0.60\)
  • \(H_{0}: \mu \geq 100,000; H_{a}: \mu < 100,000\)
  • \(H_{0}: p = 0.29; H_{a}: p \neq 0.29\)
  • \(H_{0}: p = 0.05; H_{a}: p < 0.05\)
  • \(H_{0}: \mu \leq 10; H_{a}: \mu > 10\)
  • \(H_{0}: p = 0.50; H_{a}: p \neq 0.50\)
  • \(H_{0}: \mu = 6; H_{a}: \mu \neq 6\)
  • \(H_{0}: p \geq 0.11; H_{a}: p < 0.11\)
  • \(H_{0}: \mu \leq 20,000; H_{a}: \mu > 20,000\)
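The pattern behind these answers can be sketched in a few lines of Python. This is an illustration under one stated convention: the null hypothesis always carries the equality and the alternative is its strict complement. Some answers above write the null with a plain "=" instead, which is an equally common convention. The names `PAIRS` and `hypotheses` are hypothetical helpers, not part of any library.

```python
# Pair each claim comparison with the operators used in H0 and Ha.
# Assumption: H0 keeps the equality and Ha is its strict complement.
PAIRS = {
    "=": ("=", "!="),
    "!=": ("=", "!="),
    "<=": ("<=", ">"),
    ">": ("<=", ">"),
    ">=": (">=", "<"),
    "<": (">=", "<"),
}


def hypotheses(param: str, op: str, value: str) -> tuple[str, str]:
    """Return (H0, Ha) strings for a claim such as 'mu <= 10'."""
    null_op, alt_op = PAIRS[op]
    return f"H0: {param} {null_op} {value}", f"Ha: {param} {alt_op} {value}"


# Claims taken from the statements above.
print(hypotheses("mu", "=", "34"))      # mean years worked is 34
print(hypotheses("p", "<=", "0.60"))    # at most 60% vote
print(hypotheses("mu", ">", "20000"))   # tuition is more than $20,000
```

Note that a claim with a strict inequality, such as "more than $20,000", ends up in the alternative hypothesis, while a claim containing equality ends up in the null.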

Over the past few decades, public health officials have examined the link between weight concerns and teen girls' smoking. Researchers surveyed a group of 273 randomly selected teen girls living in Massachusetts (between 12 and 15 years old). After four years the girls were surveyed again. Sixty-three said they smoked to stay thin. Is there good evidence that more than thirty percent of the teen girls smoke to stay thin? The alternative hypothesis is:

  • \(p < 0.30\)
  • \(p \leq 0.30\)
  • \(p \geq 0.30\)
  • \(p > 0.30\)

A statistics instructor believes that fewer than 20% of Evergreen Valley College (EVC) students attended the opening night midnight showing of the latest Harry Potter movie. She surveys 84 of her students and finds that 11 attended the midnight showing. An appropriate alternative hypothesis is:

  • \(p = 0.20\)
  • \(p > 0.20\)
  • \(p < 0.20\)
  • \(p \leq 0.20\)

Previously, an organization reported that teenagers spent 4.5 hours per week, on average, on the phone. The organization thinks that, currently, the mean is higher. Fifteen randomly chosen teenagers were asked how many hours per week they spend on the phone. The sample mean was 4.75 hours with a sample standard deviation of 2.0. Conduct a hypothesis test. The null and alternative hypotheses are:

  • \(H_{0}: \bar{x} = 4.5, H_{a}: \bar{x} > 4.5\)
  • \(H_{0}: \mu \geq 4.5, H_{a}: \mu < 4.5\)
  • \(H_{0}: \mu = 4.75, H_{a}: \mu > 4.75\)
  • \(H_{0}: \mu = 4.5, H_{a}: \mu > 4.5\)

9.3: Outcomes and the Type I and Type II Errors

State the Type I and Type II errors in complete sentences given the following statements.

  • The mean number of cars a person owns in his or her lifetime is not more than ten.
  • Private universities' mean tuition cost is more than $20,000 per year.

Answers:

  • Type I error: We conclude that the mean is not 34 years, when it really is 34 years. Type II error: We conclude that the mean is 34 years, when in fact it really is not 34 years.
  • Type I error: We conclude that more than 60% of Americans vote in presidential elections, when the actual percentage is at most 60%. Type II error: We conclude that at most 60% of Americans vote in presidential elections when, in fact, more than 60% do.
  • Type I error: We conclude that the mean starting salary is less than $100,000, when it really is at least $100,000. Type II error: We conclude that the mean starting salary is at least $100,000 when, in fact, it is less than $100,000.
  • Type I error: We conclude that the proportion of high school seniors who get drunk each month is not 29%, when it really is 29%. Type II error: We conclude that the proportion of high school seniors who get drunk each month is 29% when, in fact, it is not 29%.
  • Type I error: We conclude that fewer than 5% of adults ride the bus to work in Los Angeles, when the percentage that do is really 5% or more. Type II error: We conclude that 5% or more adults ride the bus to work in Los Angeles when, in fact, fewer than 5% do.
  • Type I error: We conclude that the mean number of cars a person owns in his or her lifetime is more than 10, when in reality it is not more than 10. Type II error: We conclude that the mean number of cars a person owns in his or her lifetime is not more than 10 when, in fact, it is more than 10.
  • Type I error: We conclude that the proportion of Americans who prefer to live away from cities is not about half, though the actual proportion is about half. Type II error: We conclude that the proportion of Americans who prefer to live away from cities is half when, in fact, it is not half.
  • Type I error: We conclude that the duration of paid vacations each year for Europeans is not six weeks, when in fact it is six weeks. Type II error: We conclude that the duration of paid vacations each year for Europeans is six weeks when, in fact, it is not.
  • Type I error: We conclude that the proportion is less than 11%, when it is really at least 11%. Type II error: We conclude that the proportion of women who develop breast cancer is at least 11%, when in fact it is less than 11%.
  • Type I error: We conclude that the average tuition cost at private universities is more than $20,000, though in reality it is at most $20,000. Type II error: We conclude that the average tuition cost at private universities is at most $20,000 when, in fact, it is more than $20,000.
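To make the two error types concrete, the sketch below computes their probabilities for the first statement (\(H_{0}: \mu = 34\)) using a two-sided z-test. The population standard deviation (10), the sample size (100) and the "true" mean (36) are hypothetical values chosen purely for illustration; none of them come from the exercise. The function name `type_ii_error_two_sided` is likewise an assumption.

```python
from statistics import NormalDist


def type_ii_error_two_sided(mu0, mu_true, sigma, n, alpha=0.05):
    """P(fail to reject H0: mu = mu0) when the true mean is mu_true."""
    se = sigma / n ** 0.5                          # standard error of the sample mean
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    lower, upper = mu0 - z_crit * se, mu0 + z_crit * se
    sampling = NormalDist(mu_true, se)             # distribution of the sample mean
    # H0 is not rejected whenever the sample mean lands between the bounds.
    return sampling.cdf(upper) - sampling.cdf(lower)


alpha = 0.05  # Type I error rate: chosen by the researcher in advance
beta = type_ii_error_two_sided(mu0=34, mu_true=36, sigma=10, n=100, alpha=alpha)
print(f"P(Type I error)  = {alpha:.3f}")
print(f"P(Type II error) = {beta:.3f}")
print(f"Power            = {1 - beta:.3f}")
```

The Type I error rate is fixed by the choice of \(\alpha\), whereas the Type II error rate depends on how far the true mean is from the hypothesized one, on the variability, and on the sample size.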

For statements a-j in Exercise 9.109 , answer the following in complete sentences.

  • State a consequence of committing a Type I error.
  • State a consequence of committing a Type II error.

When a new drug is created, the pharmaceutical company must subject it to testing before receiving the necessary permission from the Food and Drug Administration (FDA) to market the drug. Suppose the null hypothesis is “the drug is unsafe.” What is the Type II Error?

  • To conclude the drug is safe when, in fact, it is unsafe.
  • Not to conclude the drug is safe when, in fact, it is safe.
  • To conclude the drug is safe when, in fact, it is safe.
  • Not to conclude the drug is unsafe when, in fact, it is unsafe.

A statistics instructor believes that fewer than 20% of Evergreen Valley College (EVC) students attended the opening midnight showing of the latest Harry Potter movie. She surveys 84 of her students and finds that 11 of them attended the midnight showing. The Type I error is to conclude that the percent of EVC students who attended is ________.

  • at least 20%, when in fact, it is less than 20%.
  • 20%, when in fact, it is 20%.
  • less than 20%, when in fact, it is at least 20%.
  • less than 20%, when in fact, it is less than 20%.

It is believed that Lake Tahoe Community College (LTCC) Intermediate Algebra students get less than seven hours of sleep per night, on average. A survey of 22 LTCC Intermediate Algebra students generated a mean of 7.24 hours with a standard deviation of 1.93 hours. At a level of significance of 5%, do LTCC Intermediate Algebra students get less than seven hours of sleep per night, on average?

The Type II error is not to reject that the mean number of hours of sleep LTCC students get per night is at least seven when, in fact, the mean number of hours

  • is more than seven hours.
  • is at most seven hours.
  • is at least seven hours.
  • is less than seven hours.

Previously, an organization reported that teenagers spent 4.5 hours per week, on average, on the phone. The organization thinks that, currently, the mean is higher. Fifteen randomly chosen teenagers were asked how many hours per week they spend on the phone. The sample mean was 4.75 hours with a sample standard deviation of 2.0. Conduct a hypothesis test. The Type I error is:

  • to conclude that the current mean hours per week is higher than 4.5, when in fact, it is higher
  • to conclude that the current mean hours per week is higher than 4.5, when in fact, it is the same
  • to conclude that the mean hours per week currently is 4.5, when in fact, it is higher
  • to conclude that the mean hours per week currently is no higher than 4.5, when in fact, it is not higher

9.4: Distribution Needed for Hypothesis Testing

It is believed that Lake Tahoe Community College (LTCC) Intermediate Algebra students get less than seven hours of sleep per night, on average. A survey of 22 LTCC Intermediate Algebra students generated a mean of 7.24 hours with a standard deviation of 1.93 hours. At a level of significance of 5%, do LTCC Intermediate Algebra students get less than seven hours of sleep per night, on average? The distribution to be used for this test is \(\bar{X} \sim\) ________________

  • \(N\left(7.24, \frac{1.93}{\sqrt{22}}\right)\)
  • \(N\left(7.24, 1.93\right)\)

9.5: Rare Events, the Sample, Decision and Conclusion

The National Institute of Mental Health published an article stating that in any one-year period, approximately 9.5 percent of American adults suffer from depression or a depressive illness. Suppose that in a survey of 100 people in a certain town, seven of them suffered from depression or a depressive illness. Conduct a hypothesis test to determine if the true proportion of people in that town suffering from depression or a depressive illness is lower than the percent in the general adult American population.

  • Is this a test of one mean or proportion?
  • State the null and alternative hypotheses. \(H_{0}\) : ____________________ \(H_{a}\) : ____________________
  • Is this a right-tailed, left-tailed, or two-tailed test?
  • What symbol represents the random variable for this test?
  • In words, define the random variable for this test.
  • \(x =\) ________________
  • \(n =\) ________________
  • \(p′ =\) _____________
  • Calculate \(\sigma_{x} =\) __________. Show the formula set-up.
  • State the distribution to use for the hypothesis test.
  • Find the \(p\text{-value}\).
  • Reason for the decision:
  • Conclusion (write out in a complete sentence):
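The calculation steps in this exercise can be sketched with Python's standard library. This is one way to carry out a left-tailed one-proportion z-test using the normal approximation, not an official solution. The function name `one_proportion_z_test_left` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test_left(x, n, p0):
    """Left-tailed z-test of H0: p >= p0 against Ha: p < p0."""
    p_hat = x / n                       # sample proportion p'
    se = sqrt(p0 * (1 - p0) / n)        # standard deviation of p' under H0
    z = (p_hat - p0) / se
    p_value = NormalDist().cdf(z)       # area in the left tail
    return p_hat, se, z, p_value


# Figures from the exercise: 7 of 100 surveyed, national rate 9.5%.
p_hat, se, z, p_value = one_proportion_z_test_left(x=7, n=100, p0=0.095)
print(f"p' = {p_hat:.3f}, sigma = {se:.4f}, z = {z:.3f}, p-value = {p_value:.4f}")
```

With these inputs the p-value comes out at about 0.197, which is larger than a 5% significance level, so the sample alone would not justify rejecting the null hypothesis.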

9.6: Additional Information and Full Hypothesis Test Examples

For each of the word problems, use a solution sheet to do the hypothesis test. The solution sheet is found in [link] . Please feel free to make copies of the solution sheets. For the online version of the book, it is suggested that you copy the .doc or the .pdf files.

If you are using a Student's \(t\) - distribution for one of the following homework problems, you may assume that the underlying population is normally distributed. (In general, you must first prove that assumption, however.)

A particular brand of tires claims that its deluxe tire averages at least 50,000 miles before it needs to be replaced. From past studies of this tire, the standard deviation is known to be 8,000. A survey of owners of that tire design is conducted. From the 28 tires surveyed, the mean lifespan was 46,500 miles with a standard deviation of 9,800 miles. Using \(\alpha = 0.05\), is the data highly inconsistent with the claim?

  • \(H_{0}: \mu \geq 50,000\)
  • \(H_{a}: \mu < 50,000\)
  • Let \(\bar{X} =\) the average lifespan of a brand of tires.
  • normal distribution
  • \(z = -2.315\)
  • \(p\text{-value} = 0.0103\)
  • Check student’s solution.
  • alpha: 0.05
  • Decision: Reject the null hypothesis.
  • Reason for decision: The \(p\text{-value}\) is less than 0.05.
  • Conclusion: There is sufficient evidence to conclude that the mean lifespan of the tires is less than 50,000 miles.
  • \((43,537, 49,463)\)
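The figures in this solution can be reproduced with a short sketch. It assumes a z-test is appropriate because the population standard deviation (8,000) is taken as known, and it assumes the reported interval is a 95% confidence interval. The function names `z_test_left` and `z_interval` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_left(sample_mean, mu0, sigma, n):
    """Left-tailed z-test of H0: mu >= mu0 with known sigma."""
    se = sigma / sqrt(n)
    z = (sample_mean - mu0) / se
    return z, NormalDist().cdf(z)


def z_interval(sample_mean, sigma, n, level=0.95):
    """Confidence interval for the mean with known sigma."""
    se = sigma / sqrt(n)
    margin = NormalDist().inv_cdf(0.5 + level / 2) * se
    return sample_mean - margin, sample_mean + margin


z, p_value = z_test_left(sample_mean=46_500, mu0=50_000, sigma=8_000, n=28)
low, high = z_interval(sample_mean=46_500, sigma=8_000, n=28)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print(f"95% CI: ({low:,.0f}, {high:,.0f})")
```

The sample standard deviation of 9,800 miles is not used here, because the known population value takes precedence in a z-test.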

From generation to generation, the mean age when smokers first start to smoke varies. However, the standard deviation of that age remains constant at around 2.1 years. A survey of 40 smokers of this generation was done to see if the mean starting age is at least 19. The sample mean was 18.1 with a sample standard deviation of 1.3. Do the data support the claim at the 5% level?

The cost of a daily newspaper varies from city to city. However, the variation among prices remains steady with a standard deviation of 20¢. A study was done to test the claim that the mean cost of a daily newspaper is $1.00. Twelve costs yield a mean cost of 95¢ with a standard deviation of 18¢. Do the data support the claim at the 1% level?

  • \(H_{0}: \mu = \$1.00\)
  • \(H_{a}: \mu \neq \$1.00\)
  • Let \(\bar{X} =\) the average cost of a daily newspaper.
  • \(z = -0.866\)
  • \(p\text{-value} = 0.3865\)
  • \(\alpha: 0.01\)
  • Decision: Do not reject the null hypothesis.
  • Reason for decision: The \(p\text{-value}\) is greater than 0.01.
  • Conclusion: There is insufficient evidence to reject the claim that the mean cost of daily papers is $1. The mean cost could be $1.
  • \((\$0.84, \$1.06)\)

An article in the San Jose Mercury News stated that students in the California state university system take 4.5 years, on average, to finish their undergraduate degrees. Suppose you believe that the mean time is longer. You conduct a survey of 49 students and obtain a sample mean of 5.1 with a sample standard deviation of 1.2. Do the data support your claim at the 1% level?

The mean number of sick days an employee takes per year is believed to be about ten. Members of a personnel department do not believe this figure. They randomly survey eight employees. The number of sick days they took for the past year are as follows: 12; 4; 15; 3; 11; 8; 6; 8. Let \(x =\) the number of sick days they took for the past year. Should the personnel team believe that the mean number is ten?

  • \(H_{0}: \mu = 10\)
  • \(H_{a}: \mu \neq 10\)
  • Let \(\bar{X} =\) the mean number of sick days an employee takes per year.
  • Student’s t-distribution
  • \(t = -1.12\)
  • \(p\text{-value} = 0.300\)
  • \(\alpha: 0.05\)
  • Reason for decision: The \(p\text{-value}\) is greater than 0.05.
  • Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the mean number of sick days is not ten.
  • \((4.9443, 11.806)\)
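When raw data are given, the test statistic can be computed directly from the sample. The sketch below, with illustrative variable names, uses the Python standard library to find the sample mean, sample standard deviation, and t statistic for the sick-day data. The standard library has no Student's t distribution, so the p-value itself is not computed here; it would need a t table, a calculator, or a statistics package.

```python
from math import sqrt
from statistics import mean, stdev

# Sick days reported by the eight surveyed employees.
sick_days = [12, 4, 15, 3, 11, 8, 6, 8]
mu_0 = 10  # hypothesized mean under H0

n = len(sick_days)
x_bar = mean(sick_days)
s = stdev(sick_days)  # sample standard deviation (divides by n - 1)

t_stat = (x_bar - mu_0) / (s / sqrt(n))
df = n - 1  # degrees of freedom for the one-sample t-test

print(f"mean = {x_bar}, s = {s:.4f}, t = {t_stat:.2f}, df = {df}")
```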

In 1955, Life Magazine reported that the 25 year-old mother of three worked, on average, an 80 hour week. Recently, many groups have been studying whether or not the women's movement has, in fact, resulted in an increase in the average work week for women (combining employment and at-home work). Suppose a study was done to determine if the mean work week has increased. 81 women were surveyed with the following results. The sample mean was 83; the sample standard deviation was ten. Does it appear that the mean work week has increased for women at the 5% level?

Your statistics instructor claims that 60 percent of the students who take her Elementary Statistics class go through life feeling more enriched. For some reason that she can't quite figure out, most people don't believe her. You decide to check this out on your own. You randomly survey 64 of her past Elementary Statistics students and find that 34 feel more enriched as a result of her class. Now, what do you think?

  • \(H_{0}: p \geq 0.6\)
  • \(H_{a}: p < 0.6\)
  • Let \(P′ =\) the proportion of students who feel more enriched as a result of taking Elementary Statistics.
  • normal for a single proportion
  • \(p\text{-value} = 0.1308\)
  • Conclusion: There is insufficient evidence to conclude that less than 60 percent of her students feel more enriched.

The “plus-4s” confidence interval is \((0.411, 0.648)\)
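The p-value and the “plus-4s” interval for this proportion problem can be checked as follows. This is a sketch under the usual assumptions: the normal approximation for a single proportion, and a plus-four interval built by adding two successes and two failures at the 95% level. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p_0 = 0.60       # proportion claimed under H0
n = 64           # students surveyed
successes = 34   # students who feel more enriched

# Left-tailed one-proportion z-test (Ha: p < 0.6).
p_hat = successes / n
z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)
p_value = NormalDist().cdf(z)

# Plus-four interval: add 2 successes and 2 failures, then use the usual formula.
n_adj = n + 4
p_adj = (successes + 2) / n_adj
margin = NormalDist().inv_cdf(0.975) * sqrt(p_adj * (1 - p_adj) / n_adj)
plus_four_ci = (p_adj - margin, p_adj + margin)

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print(f"plus-four 95% CI = ({plus_four_ci[0]:.3f}, {plus_four_ci[1]:.3f})")
```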

A Nissan Motor Corporation advertisement read, “The average man’s I.Q. is 107. The average brown trout’s I.Q. is 4. So why can’t man catch brown trout?” Suppose you believe that the brown trout’s mean I.Q. is greater than four. You catch 12 brown trout. A fish psychologist determines the I.Q.s as follows: 5; 4; 7; 3; 6; 4; 5; 3; 6; 3; 8; 5. Conduct a hypothesis test of your belief.

Refer to Exercise 9.119. Conduct a hypothesis test to see if your decision and conclusion would change if your belief were that the brown trout’s mean I.Q. is not four.

  • \(H_{0}: \mu = 4\)
  • \(H_{a}: \mu \neq 4\)
  • Let \(\bar{X} =\) the average I.Q. of a set of brown trout.
  • two-tailed Student's t-test
  • \(t = 1.95\)
  • \(p\text{-value} = 0.076\)
  • Reason for decision: The \(p\text{-value}\) is greater than 0.05
  • Conclusion: There is insufficient evidence to conclude that the average IQ of brown trout is not four.
  • \((3.8865,5.9468)\)

According to an article in Newsweek, the natural ratio of girls to boys is 100:105. In China, the birth ratio is 100:114 (46.7% girls). Suppose you don’t believe the reported figures of the percent of girls born in China. You conduct a study. In this study, you count the number of girls and boys born in 150 randomly chosen recent births. There are 60 girls and 90 boys born of the 150. Based on your study, do you believe that the percent of girls born in China is 46.7?

A poll done for Newsweek found that 13% of Americans have seen or sensed the presence of an angel. A contingent doubts that the percent is really that high. It conducts its own survey. Out of 76 Americans surveyed, only two had seen or sensed the presence of an angel. As a result of the contingent’s survey, would you agree with the Newsweek poll? In complete sentences, also give three reasons why the two polls might give different results.

  • \(H_{a}: p < 0.13\)
  • Let \(P′ =\) the proportion of Americans who have seen or sensed angels
  • \(z = -2.688\)
  • \(p\text{-value} = 0.0036\)
  • Reason for decision: The \(p\text{-value}\) is less than 0.05.
  • Conclusion: There is sufficient evidence to conclude that the percentage of Americans who have seen or sensed an angel is less than 13%.

The “plus-4s” confidence interval is \((0.0022, 0.0978)\)

The mean work week for engineers in a start-up company is believed to be about 60 hours. A newly hired engineer hopes that it’s shorter. She asks ten engineering friends in start-ups for the lengths of their mean work weeks. Based on the results that follow, should she count on the mean work week to be shorter than 60 hours?

Data (length of mean work week): 70; 45; 55; 60; 65; 55; 55; 60; 50; 55.

Use the “Lap time” data for Lap 4 (see [link]) to test the claim that Terri finishes Lap 4, on average, in less than 129 seconds. Use all twenty races given.

  • \(H_{0}: \mu \geq 129\)
  • \(H_{a}: \mu < 129\)
  • Let \(\bar{X} =\) the average time in seconds that Terri finishes Lap 4.
  • Student's t-distribution
  • \(t = 1.209\)
  • Conclusion: There is insufficient evidence to conclude that Terri’s mean lap time is less than 129 seconds.
  • \((128.63, 130.37)\)

Use the “Initial Public Offering” data (see [link]) to test the claim that the mean offer price was $18 per share. Do not use all the data. Use your random number generator to randomly survey 15 prices.

The following questions were written by past students. They are excellent problems!

"Asian Family Reunion," by Chau Nguyen

Every two years it comes around.

We all get together from different towns.

In my honest opinion,

It's not a typical family reunion.

Not forty, or fifty, or sixty,

But how about seventy companions!

The kids would play, scream, and shout

One minute they're happy, another they'll pout.

The teenagers would look, stare, and compare

From how they look to what they wear.

The men would chat about their business

That they make more, but never less.

Money is always their subject

And there's always talk of more new projects.

The women get tired from all of the chats

They head to the kitchen to set out the mats.

Some would sit and some would stand

Eating and talking with plates in their hands.

Then come the games and the songs

And suddenly, everyone gets along!

With all that laughter, it's sad to say

That it always ends in the same old way.

They hug and kiss and say "good-bye"

And then they all begin to cry!

I say that 60 percent shed their tears

But my mom counted 35 people this year.

She said that boys and men will always have their pride,

So we won't ever see them cry.

I myself don't think she's correct,

So could you please try this problem to see if you object?

  • \(H_{0}: p = 0.60\)
  • \(H_{a}: p < 0.60\)
  • Let \(P′ =\) the proportion of family members who shed tears at a reunion.
  • \(z = -1.71\)
  • Reason for decision: \(p\text{-value} < \alpha\)
  • Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the proportion of family members who shed tears at a reunion is less than 0.60. However, the test is weak because the \(p\text{-value}\) and alpha are quite close, so other tests should be done.
  • We are 95% confident that between 38.29% and 61.71% of family members will shed tears at a family reunion. \((0.3829, 0.6171)\). The “plus-4s” confidence interval (see chapter 8) is \((0.3861, 0.6139)\)

Note that here the “large-sample” 1-PropZTest provides the approximate \(p\text{-value}\) of 0.0438. Whenever a \(p\text{-value}\) based on a normal approximation is close to the level of significance, the exact \(p\text{-value}\) based on binomial probabilities should be calculated whenever possible. This is beyond the scope of this course.
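For readers who want to see what the exact calculation looks like, here is a minimal sketch. It assumes the sample is the 70 family members from the poem, 35 of whom cried, and it sums binomial probabilities for 35 or fewer criers under the null proportion of 0.60. The variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n = 70       # family members at the reunion
x = 35       # number counted crying
p_0 = 0.60   # proportion claimed under H0

# Large-sample (normal approximation) p-value for the left-tailed test.
z = (x / n - p_0) / sqrt(p_0 * (1 - p_0) / n)
approx_p = NormalDist().cdf(z)

# Exact p-value: P(X <= x) when X ~ Binomial(n, p_0).
exact_p = sum(comb(n, k) * p_0**k * (1 - p_0) ** (n - k) for k in range(x + 1))

print(f"normal approximation: {approx_p:.4f}")
print(f"exact binomial:       {exact_p:.4f}")
```

Comparing the two printed values against \(\alpha = 0.05\) shows whether the decision depends on the approximation.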

"The Problem with Angels," by Cyndy Dowling

Although this problem is wholly mine,

The catalyst came from the magazine, Time.

On the magazine cover I did find

The realm of angels tickling my mind.

Inside, 69% I found to be

In angels, Americans do believe.

Then, it was time to rise to the task,

Ninety-five high school and college students I did ask.

Viewing all as one group,

Random sampling to get the scoop.

So, I asked each to be true,

"Do you believe in angels?" Tell me, do!

Hypothesizing at the start,

Totally believing in my heart

That the proportion who said yes

Would be equal on this test.

Lo and behold, seventy-three did arrive,

Out of the sample of ninety-five.

Now your job has just begun,

Solve this problem and have some fun.

"Blowing Bubbles," by Sondra Prull

Studying stats just made me tense,

I had to find some sane defense.

Some light and lifting simple play

To float my math anxiety away.

Blowing bubbles lifts me high

Takes my troubles to the sky.

POIK! They're gone, with all my stress

Bubble therapy is the best.

The label said each time I blew

The average number of bubbles would be at least 22.

I blew and blew and this I found

From 64 blows, they all are round!

But the number of bubbles in 64 blows

Varied widely, this I know.

20 per blow became the mean

They deviated by 6, and not 16.

From counting bubbles, I sure did relax

But now I give to you your task.

Was 22 a reasonable guess?

Find the answer and pass this test!

  • \(H_{0}: \mu \geq 22\)
  • \(H_{a}: \mu < 22\)
  • Let \(\bar{X} =\) the mean number of bubbles per blow.
  • \(t = -2.667\)
  • \(p\text{-value} = 0.00486\)
  • Conclusion: There is sufficient evidence to conclude that the mean number of bubbles per blow is less than 22.
  • \((18.501, 21.499)\)

"Dalmatian Darnation," by Kathy Sparling

A greedy dog breeder named Spreckles

Bred puppies with numerous freckles

The Dalmatians he sought

Possessed spot upon spot

The more spots, he thought, the more shekels.

His competitors did not agree

That freckles would increase the fee.

They said, “Spots are quite nice

But they don't affect price;

One should breed for improved pedigree.”

The breeders decided to prove

This strategy was a wrong move.

Breeding only for spots

Would wreak havoc, they thought.

His theory they want to disprove.

They proposed a contest to Spreckles

Comparing dog prices to freckles.

In records they looked up

One hundred one pups:

Dalmatians that fetched the most shekels.

They asked Mr. Spreckles to name

An average spot count he'd claim

To bring in big bucks.

Said Spreckles, “Well, shucks,

It's for one hundred one that I aim.”

Said an amateur statistician

Who wanted to help with this mission.

“Twenty-one for the sample

Standard deviation's ample:

They examined one hundred and one

Dalmatians that fetched a good sum.

They counted each spot,

Mark, freckle and dot

And tallied up every one.

Instead of one hundred one spots

They averaged ninety six dots

Can they muzzle Spreckles’

Obsession with freckles

Based on all the dog data they've got?

"Macaroni and Cheese, please!!" by Nedda Misherghi and Rachelle Hall

As a poor starving student I don't have much money to spend for even the bare necessities. So my favorite and main staple food is macaroni and cheese. It's high in taste and low in cost and nutritional value.

One day, as I sat down to determine the meaning of life, I got a serious craving for this, oh, so important, food of my life. So I went down the street to Greatway to get a box of macaroni and cheese, but it was SO expensive! $2.02 !!! Can you believe it? It made me stop and think. The world is changing fast. I had thought that the mean cost of a box (the normal size, not some super-gigantic-family-value-pack) was at most $1, but now I wasn't so sure. However, I was determined to find out. I went to 53 of the closest grocery stores and surveyed the prices of macaroni and cheese. Here are the data I wrote in my notebook:

Price per box of Mac and Cheese:

  • 5 stores @ $2.02
  • 15 stores @ $0.25
  • 3 stores @ $1.29
  • 6 stores @ $0.35
  • 4 stores @ $2.27
  • 7 stores @ $1.50
  • 5 stores @ $1.89
  • 8 stores @ $0.75

I could see that the cost varied but I had to sit down to figure out whether or not I was right. If it does turn out that this mouth-watering dish is at most $1, then I'll throw a big cheesy party in our next statistics lab, with enough macaroni and cheese for just me. (After all, as a poor starving student I can't be expected to feed our class of animals!)

  • \(H_{0}: \mu \leq 1\)
  • \(H_{a}: \mu > 1\)
  • Let \(\bar{X} =\) the mean cost in dollars of macaroni and cheese in a certain town.
  • Student's \(t\)-distribution
  • \(t = 0.340\)
  • \(p\text{-value} = 0.36756\)
  • Conclusion: The mean cost could be $1, or less. At the 5% significance level, there is insufficient evidence to conclude that the mean price of a box of macaroni and cheese is more than $1.
  • \((0.8291, 1.241)\)
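Because the prices are given as a frequency table, the data must be expanded (or weighted) before computing the sample statistics. The following sketch expands the table into a list of 53 prices and computes the t statistic. The variable names are illustrative, and the last entry is assumed to be $0.75.

```python
from math import sqrt
from statistics import mean, stdev

# (price in dollars, number of stores) pairs from the notebook data.
price_counts = [
    (2.02, 5), (0.25, 15), (1.29, 3), (0.35, 6),
    (2.27, 4), (1.50, 7), (1.89, 5), (0.75, 8),
]

# Expand the frequency table into one price per store.
prices = [price for price, count in price_counts for _ in range(count)]

mu_0 = 1.00  # hypothesized mean price under H0 (at most $1)
n = len(prices)
x_bar = mean(prices)
s = stdev(prices)  # sample standard deviation (divides by n - 1)

t_stat = (x_bar - mu_0) / (s / sqrt(n))

print(f"n = {n}, mean = {x_bar:.4f}, s = {s:.4f}, t = {t_stat:.3f}")
```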

"William Shakespeare: The Tragedy of Hamlet, Prince of Denmark," by Jacqueline Ghodsi

THE CHARACTERS (in order of appearance):

  • HAMLET, Prince of Denmark and student of Statistics
  • POLONIUS, Hamlet’s tutor
  • HORATIO, friend to Hamlet and fellow student

Scene: The great library of the castle, in which Hamlet does his lessons

(The day is fair, but the face of Hamlet is clouded. He paces the large room. His tutor, Polonius, is reprimanding Hamlet regarding the latter’s recent experience. Horatio is seated at the large table at right stage.)

POLONIUS: My Lord, how cans’t thou admit that thou hast seen a ghost! It is but a figment of your imagination!

HAMLET: I beg to differ; I know of a certainty that five-and-seventy in one hundred of us, condemned to the whips and scorns of time as we are, have gazed upon a spirit of health, or goblin damn’d, be their intents wicked or charitable.

POLONIUS: If thou doest insist upon thy wretched vision then let me invest your time; be true to thy work and speak to me through the reason of the null and alternate hypotheses. (He turns to Horatio.) Did not Hamlet himself say, “What piece of work is man, how noble in reason, how infinite in faculties?” Then let not this foolishness persist. Go, Horatio, make a survey of three-and-sixty and discover what the true proportion be. For my part, I will never succumb to this fantasy, but deem man to be devoid of all reason should thy proposal of at least five-and-seventy in one hundred hold true.

HORATIO (to Hamlet): What should we do, my Lord?

HAMLET: Go to thy purpose, Horatio.

HORATIO: To what end, my Lord?

HAMLET: That you must teach me. But let me conjure you by the rights of our fellowship, by the consonance of our youth, by the obligation of our ever-preserved love, be even and direct with me, whether I am right or no.

(Horatio exits, followed by Polonius, leaving Hamlet to ponder alone.)

(The next day, Hamlet awaits anxiously the presence of his friend, Horatio. Polonius enters and places some books upon the table just a moment before Horatio enters.)

POLONIUS: So, Horatio, what is it thou didst reveal through thy deliberations?

HORATIO: In a random survey, for which purpose thou thyself sent me forth, I did discover that one-and-forty believe fervently that the spirits of the dead walk with us. Before my God, I might not this believe, without the sensible and true avouch of mine own eyes.

POLONIUS: Give thine own thoughts no tongue, Horatio. (Polonius turns to Hamlet.) But look to’t I charge you, my Lord. Come Horatio, let us go together, for this is not our test. (Horatio and Polonius leave together.)

HAMLET: To reject, or not reject, that is the question: whether ‘tis nobler in the mind to suffer the slings and arrows of outrageous statistics, or to take arms against a sea of data, and, by opposing, end them. (Hamlet resignedly attends to his task.)

(Curtain falls)

"Untitled," by Stephen Chen

I've often wondered how software is released and sold to the public. Ironically, I work for a company that sells products with known problems. Unfortunately, most of the problems are difficult to create, which makes them difficult to fix. I usually use the test program X, which tests the product, to try to create a specific problem. When the test program is run to make an error occur, the likelihood of generating an error is 1%.

So, armed with this knowledge, I wrote a new test program Y that will generate the same error that test program X creates, but more often. To find out if my test program is better than the original, so that I can convince the management that I'm right, I ran my test program to find out how often I can generate the same error. When I ran my test program 50 times, I generated the error twice. While this may not seem much better, I think that I can convince the management to use my test program instead of the original test program. Am I right?

  • \(H_{0}: p = 0.01\)
  • \(H_{a}: p > 0.01\)
  • Let \(P′ =\) the proportion of errors generated
  • Normal for a single proportion
  • Decision: Reject the null hypothesis
  • Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the proportion of errors generated is more than 0.01.

The “plus-4s” confidence interval is \((0.004, 0.144)\).

"Japanese Girls’ Names," by Kumi Furuichi

It used to be very typical for Japanese girls’ names to end with “ko.” (The trend might have started around my grandmothers’ generation and its peak might have been around my mother’s generation.) “Ko” means “child” in Chinese characters. Parents would name their daughters with “ko” attaching to other Chinese characters which have meanings that they want their daughters to become, such as Sachiko—happy child, Yoshiko—a good child, Yasuko—a healthy child, and so on.

However, I noticed recently that only two out of nine of my Japanese girlfriends at this school have names which end with “ko.” More and more, parents seem to have become creative, modernized, and, sometimes, westernized in naming their children.

I have a feeling that, while 70 percent or more of my mother’s generation would have names with “ko” at the end, the proportion has dropped among my peers. I wrote down all my Japanese friends’, ex-classmates’, co-workers’, and acquaintances’ names that I could remember. Following are the names. (Some are repeats.) Test to see if the proportion has dropped for this generation.

Ai, Akemi, Akiko, Ayumi, Chiaki, Chie, Eiko, Eri, Eriko, Fumiko, Harumi, Hitomi, Hiroko, Hiroko, Hidemi, Hisako, Hinako, Izumi, Izumi, Junko, Junko, Kana, Kanako, Kanayo, Kayo, Kayoko, Kazumi, Keiko, Keiko, Kei, Kumi, Kumiko, Kyoko, Kyoko, Madoka, Maho, Mai, Maiko, Maki, Miki, Miki, Mikiko, Mina, Minako, Miyako, Momoko, Nana, Naoko, Naoko, Naoko, Noriko, Rieko, Rika, Rika, Rumiko, Rei, Reiko, Reiko, Sachiko, Sachiko, Sachiyo, Saki, Sayaka, Sayoko, Sayuri, Seiko, Shiho, Shizuka, Sumiko, Takako, Takako, Tomoe, Tomoe, Tomoko, Touko, Yasuko, Yasuko, Yasuyo, Yoko, Yoko, Yoko, Yoshiko, Yoshiko, Yoshiko, Yuka, Yuki, Yuki, Yukiko, Yuko, Yuko.

"Phillip’s Wish," by Suzanne Osorio

My nephew likes to play

Chasing the girls makes his day.

He asked his mother

If it is okay

To get his ear pierced.

She said, “No way!”

To poke a hole through your ear,

Is not what I want for you, dear.

He argued his point quite well,

Says even my macho pal, Mel,

Has gotten this done.

It’s all just for fun.

C’mon please, mom, please, what the hell.

Again Phillip complained to his mother,

Saying half his friends (including their brothers)

Are piercing their ears

And they have no fears

He wants to be like the others.

She said, “I think it’s much less.

We must do a hypothesis test.

And if you are right,

I won’t put up a fight.

But, if not, then my case will rest.”

We proceeded to call fifty guys

To see whose prediction would fly.

Nineteen of the fifty

Said piercing was nifty

And earrings they’d occasionally buy.

Then there’s the other thirty-one,

Who said they’d never have this done.

So now this poem’s finished.

Will his hopes be diminished,

Or will my nephew have his fun?

  • \(H_{0}: p = 0.50\)
  • \(H_{a}: p < 0.50\)
  • Let \(P′ =\) the proportion of friends that has a pierced ear.
  • \(z = -1.70\)
  • \(p\text{-value} = 0.0448\)
  • Reason for decision: The \(p\text{-value}\) is less than 0.05. (However, they are very close.)
  • Conclusion: There is sufficient evidence to support the claim that less than 50% of his friends have pierced ears.
  • Confidence Interval: \((0.245, 0.515)\): The “plus-4s” confidence interval is \((0.259, 0.519)\).

"The Craven," by Mark Salangsang

Once upon a morning dreary

In stats class I was weak and weary.

Pondering over last night’s homework

Whose answers were now on the board

This I did and nothing more.

While I nodded nearly napping

Suddenly, there came a tapping.

As someone gently rapping,

Rapping my head as I snore.

Quoth the teacher, “Sleep no more.”

“In every class you fall asleep,”

The teacher said, his voice was deep.

“So a tally I’ve begun to keep

Of every class you nap and snore.

The percentage being forty-four.”

“My dear teacher I must confess,

While sleeping is what I do best.

The percentage, I think, must be less,

A percentage less than forty-four.”

This I said and nothing more.

“We’ll see,” he said and walked away,

And fifty classes from that day

He counted till the month of May

The classes in which I napped and snored.

The number he found was twenty-four.

At a significance level of 0.05,

Please tell me am I still alive?

Or did my grade just take a dive

Plunging down beneath the floor?

Upon thee I hereby implore.

Toastmasters International cites a report by Gallup Poll that 40% of Americans fear public speaking. A student believes that less than 40% of students at her school fear public speaking. She randomly surveys 361 schoolmates and finds that 135 report they fear public speaking. Conduct a hypothesis test to determine if the percent at her school is less than 40%.

  • \(H_{0}: p = 0.40\)
  • \(H_{a}: p < 0.40\)
  • Let \(P′ =\) the proportion of schoolmates who fear public speaking.
  • \(z = -1.01\)
  • \(p\text{-value} = 0.1563\)
  • Conclusion: There is insufficient evidence to support the claim that less than 40% of students at the school fear public speaking.
  • Confidence Interval: \((0.3241, 0.4240)\): The “plus-4s” confidence interval is \((0.3257, 0.4250)\).

Sixty-eight percent of online courses taught at community colleges nationwide were taught by full-time faculty. To test if 68% also represents California’s percent for full-time faculty teaching the online classes, Long Beach City College (LBCC) in California, was randomly selected for comparison. In the same year, 34 of the 44 online courses LBCC offered were taught by full-time faculty. Conduct a hypothesis test to determine if 68% represents California. NOTE: For more accurate results, use more California community colleges and this past year's data.

According to an article in Bloomberg Businessweek, New York City's most recent adult smoking rate is 14%. Suppose that a survey is conducted to determine this year’s rate. Nine out of 70 randomly chosen N.Y. City residents reply that they smoke. Conduct a hypothesis test to determine if the rate is still 14% or if it has decreased.

  • \(H_{0}: p = 0.14\)
  • \(H_{a}: p < 0.14\)
  • Let \(P′ =\) the proportion of NYC residents that smoke.
  • \(z = -0.2756\)
  • \(p\text{-value} = 0.3914\)
  • At the 5% significance level, there is insufficient evidence to conclude that the proportion of NYC residents who smoke is less than 0.14.
  • Confidence Interval: \((0.0502, 0.2070)\): The “plus-4s” confidence interval (see chapter 8) is \((0.0676, 0.2297)\).

The mean age of De Anza College students in a previous term was 26.6 years old. An instructor thinks the mean age for online students is older than 26.6. She randomly surveys 56 online students and finds that the sample mean is 29.4 with a standard deviation of 2.1. Conduct a hypothesis test.

Registered nurses earned an average annual salary of $69,110. For that same year, a survey was conducted of 41 California registered nurses to determine if the annual salary is higher than $69,110 for California nurses. The sample average was $71,121 with a sample standard deviation of $7,489. Conduct a hypothesis test.

  • \(H_{0}: \mu = 69,110\)
  • \(H_{a}: \mu > 69,110\)
  • Let \(\bar{X} =\) the mean salary in dollars for California registered nurses.
  • \(t = 1.719\)
  • \(p\text{-value}: 0.0466\)
  • Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the mean salary of California registered nurses exceeds $69,110.
  • \(($68,757, $73,485)\)

La Leche League International reports that the mean age of weaning a child from breastfeeding is age four to five worldwide. In America, most nursing mothers wean their children much earlier. Suppose a random survey is conducted of 21 U.S. mothers who recently weaned their children. The mean weaning age was nine months (3/4 year) with a standard deviation of 4 months. Conduct a hypothesis test to determine if the mean weaning age in the U.S. is less than four years old.

Over the past few decades, public health officials have examined the link between weight concerns and teen girls' smoking. Researchers surveyed a group of 273 randomly selected teen girls living in Massachusetts (between 12 and 15 years old). After four years the girls were surveyed again. Sixty-three said they smoked to stay thin. Is there good evidence that more than thirty percent of the teen girls smoke to stay thin?

After conducting the test, your decision and conclusion are

  • Reject \(H_{0}\): There is sufficient evidence to conclude that more than 30% of teen girls smoke to stay thin.
  • Do not reject \(H_{0}\): There is not sufficient evidence to conclude that less than 30% of teen girls smoke to stay thin.
  • Do not reject \(H_{0}\): There is not sufficient evidence to conclude that more than 30% of teen girls smoke to stay thin.
  • Reject \(H_{0}\): There is sufficient evidence to conclude that less than 30% of teen girls smoke to stay thin.

A statistics instructor believes that fewer than 20% of Evergreen Valley College (EVC) students attended the opening night midnight showing of the latest Harry Potter movie. She surveys 84 of her students and finds that 11 of them attended the midnight showing.

At a 1% level of significance, an appropriate conclusion is:

  • There is insufficient evidence to conclude that the percent of EVC students who attended the midnight showing of Harry Potter is less than 20%.
  • There is sufficient evidence to conclude that the percent of EVC students who attended the midnight showing of Harry Potter is more than 20%.
  • There is sufficient evidence to conclude that the percent of EVC students who attended the midnight showing of Harry Potter is less than 20%.
  • There is insufficient evidence to conclude that the percent of EVC students who attended the midnight showing of Harry Potter is at least 20%.

Previously, an organization reported that teenagers spent 4.5 hours per week, on average, on the phone. The organization thinks that, currently, the mean is higher. Fifteen randomly chosen teenagers were asked how many hours per week they spend on the phone. The sample mean was 4.75 hours with a sample standard deviation of 2.0. Conduct a hypothesis test.

At a significance level of \(\alpha = 0.05\), what is the correct conclusion?

  • There is enough evidence to conclude that the mean number of hours is more than 4.75
  • There is enough evidence to conclude that the mean number of hours is more than 4.5
  • There is not enough evidence to conclude that the mean number of hours is more than 4.5
  • There is not enough evidence to conclude that the mean number of hours is more than 4.75

Hypothesis testing: For the following ten exercises, answer each question.

State the null and alternate hypothesis.

State the \(p\text{-value}\).

State \(\alpha\).

What is your decision?

Write a conclusion.

Answer any other questions asked in the problem.
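The decision step in the list above follows a single rule: reject the null hypothesis when the p-value is less than \(\alpha\). As a small illustration, the hypothetical helper `decide` below applies that rule. The function name and return strings are assumptions made for this sketch, and the two p-values come from solutions in this exercise set.

```python
def decide(p_value: float, alpha: float) -> str:
    """Return the test decision for a given p-value and significance level."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    # Reject H0 only when the p-value falls below the significance level.
    return "Reject H0" if p_value < alpha else "Do not reject H0"


print(decide(0.0114, 0.05))  # p-value from the stock-ownership solution
print(decide(0.9203, 0.05))  # p-value from the natural-gas solution
```

The helper reports only the decision; the conclusion must still be written in terms of the problem.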

According to the Centers for Disease Control and Prevention website, in 2011 at least 18% of high school students have smoked a cigarette. An Introduction to Statistics class in Davies County, KY conducted a hypothesis test at the local high school (a medium sized–approximately 1,200 students–small city demographic) to determine if the local high school’s percentage was lower. One hundred fifty students were chosen at random and surveyed. Of the 150 students surveyed, 82 have smoked. Use a significance level of 0.05 and using appropriate statistical evidence, conduct a hypothesis test and state the conclusions.

A recent survey in the N.Y. Times Almanac indicated that 48.8% of families own stock. A broker wanted to determine if this survey could be valid. He surveyed a random sample of 250 families and found that 142 owned some type of stock. At the 0.05 significance level, can the survey be considered to be accurate?

  • \(H_{0}: p = 0.488\) \(H_{a}: p \neq 0.488\)
  • \(p\text{-value} = 0.0114\)
  • \(\alpha = 0.05\)
  • Reject the null hypothesis.
  • At the 5% level of significance, there is enough evidence to conclude that the proportion of families that own stock is not 48.8%.
  • The survey does not appear to be accurate.
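Because the alternative hypothesis here is two-sided, the p-value counts both tails. A minimal sketch of the calculation, assuming the normal approximation for a single proportion and using illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

p_0 = 0.488      # proportion reported by the Almanac (H0)
n = 250          # families surveyed by the broker
successes = 142  # families that own stock

p_hat = successes / n
z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)

# Two-tailed test (Ha: p != 0.488): double the area beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"p' = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

Since the p-value is below 0.05, the result agrees with the decision to reject the null hypothesis.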

Driver error can be listed as the cause of approximately 54% of all fatal auto accidents, according to the American Automobile Association. Thirty randomly selected fatal accidents are examined, and it is determined that 14 were caused by driver error. Using \(\alpha = 0.05\), is the AAA proportion accurate?

The US Department of Energy reported that 51.7% of homes were heated by natural gas. A random sample of 221 homes in Kentucky found that 115 were heated by natural gas. Does the evidence support the claim for Kentucky at the \(\alpha = 0.05\) level? Are the results applicable across the country? Why?

  • \(H_{0}: p = 0.517\) \(H_{a}: p \neq 0.517\)
  • \(p\text{-value} = 0.9203\).
  • \(\alpha = 0.05\).
  • Do not reject the null hypothesis.
  • At the 5% significance level, there is not enough evidence to conclude that the proportion of homes in Kentucky that are heated by natural gas differs from 0.517.
  • However, we cannot generalize this result to the entire nation. First, the sample’s population is only the state of Kentucky. Second, it is reasonable to assume that homes in the extreme north and south will have extreme high usage and low usage, respectively. We would need to expand our sample base to include these possibilities if we wanted to generalize this claim to the entire nation.

For Americans using library services, the American Library Association claims that at most 67% of patrons borrow books. The library director in Owensboro, Kentucky feels this is not true, so she asked a local college statistics class to conduct a survey. The class randomly selected 100 patrons and found that 82 borrowed books. Did the class demonstrate that the percentage was higher in Owensboro, KY? Use \(\alpha = 0.01\) level of significance. What is the possible proportion of patrons that do borrow books from the Owensboro Library?

The Weather Underground reported that the mean amount of summer rainfall for the northeastern US is at least 11.52 inches. Ten cities in the northeast are randomly selected and the mean rainfall amount is calculated to be 7.42 inches with a standard deviation of 1.3 inches. At the \(\alpha = 0.05\) level, can it be concluded that the mean rainfall was below the reported average? What if \(\alpha = 0.01\)? Assume the amount of summer rainfall follows a normal distribution.

  • \(H_{0}: \mu \geq 11.52\) \(H_{a}: \mu < 11.52\)
  • \(p\text{-value} = 0.000002\) which is almost 0.
  • At the 5% significance level, there is enough evidence to conclude that the mean amount of summer rain in the northeastern US is less than 11.52 inches, on average.
  • We would make the same conclusion if alpha was 1% because the \(p\text{-value}\) is almost 0.
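
When only summary statistics are reported, the left-tailed t-test used here can be computed directly from the formula. This is a minimal sketch assuming SciPy is available; the helper name `left_tailed_t_test` is hypothetical and not part of any library.

```python
from math import sqrt
from scipy.stats import t

def left_tailed_t_test(sample_mean, sample_sd, n, mu0):
    """Return (t statistic, p-value) for H0: mu >= mu0 against Ha: mu < mu0."""
    se = sample_sd / sqrt(n)                 # standard error of the mean
    t_stat = (sample_mean - mu0) / se
    p_value = t.cdf(t_stat, df=n - 1)        # area to the left of t_stat
    return t_stat, p_value

# Rainfall exercise: mean 7.42, standard deviation 1.3, n = 10, claimed mean 11.52
t_stat, p_value = left_tailed_t_test(7.42, 1.3, 10, 11.52)
print(f"t = {t_stat:.3f}, p-value = {p_value:.7f}")
```

The resulting p-value is far below both 0.05 and 0.01, which is why the conclusion does not change between the two significance levels.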

A survey in the N.Y. Times Almanac finds the mean commute time (one way) is 25.4 minutes for the 15 largest US cities. The Austin, TX chamber of commerce feels that Austin’s commute time is less and wants to publicize this fact. The mean for 25 randomly selected commuters is 22.1 minutes with a standard deviation of 5.3 minutes. At the \(\alpha = 0.10\) level, is the Austin, TX commute significantly less than the mean commute time for the 15 largest US cities?

A report by the Gallup Poll found that a woman visits her doctor, on average, at most 5.8 times each year. A random sample of 20 women results in these yearly visit totals

3; 2; 1; 3; 7; 2; 9; 4; 6; 6; 8; 0; 5; 6; 4; 2; 1; 3; 4; 1

At the \(\alpha = 0.05\) level, can it be concluded that the mean number of visits is higher than 5.8 per year?

  • \(H_{0}: \mu \leq 5.8\) \(H_{a}: \mu > 5.8\)
  • \(p\text{-value} = 0.9987\)
  • At the 5% level of significance, there is not enough evidence to conclude that a woman visits her doctor, on average, more than 5.8 times a year.
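
With the raw data available, the same test can be run from the sample itself. The sketch below assumes NumPy and SciPy are installed and computes the right-tailed p-value from the t-distribution; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Yearly visit totals from the exercise
visits = np.array([3, 2, 1, 3, 7, 2, 9, 4, 6, 6, 8, 0, 5, 6, 4, 2, 1, 3, 4, 1])
mu0 = 5.8                              # value stated in H0

n = visits.size
x_bar = visits.mean()
s = visits.std(ddof=1)                 # sample standard deviation
t_stat = (x_bar - mu0) / (s / np.sqrt(n))
# Right-tailed test: the p-value is the area to the right of t_stat
p_value = stats.t.sf(t_stat, df=n - 1)

print(f"mean = {x_bar:.2f}, t = {t_stat:.3f}, p-value = {p_value:.4f}")
```

The sample mean lies below 5.8, so the t-statistic is negative and the right-tail area is close to 1, consistent with the decision not to reject the null hypothesis.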

According to the N.Y. Times Almanac the mean family size in the U.S. is 3.18. A sample of a college math class resulted in the following family sizes:

5; 4; 5; 4; 4; 3; 6; 4; 3; 3; 5; 5; 6; 3; 3; 2; 7; 4; 5; 2; 2; 2; 3; 2

At \(\alpha = 0.05\) level, is the class’ mean family size greater than the national average? Does the Almanac result remain valid? Why?

The student academic group on a college campus claims that freshman students study at least 2.5 hours per day, on average. One Introduction to Statistics class was skeptical. The class took a random sample of 30 freshman students and found a mean study time of 137 minutes with a standard deviation of 45 minutes. At α = 0.01 level, is the student academic group’s claim correct?

  • \(H_{0}: \mu \geq 150\) \(H_{a}: \mu < 150\)
  • \(p\text{-value} = 0.0622\)
  • \(\alpha = 0.01\)
  • At the 1% significance level, there is not enough evidence to conclude that freshmen students study less than 2.5 hours per day, on average.
  • The student academic group’s claim appears to be correct.

Formulating and Testing Hypotheses

  • Gary A. Wobeser

The term hypothesis has been mentioned several times in the preceding chapters. The definition that will be used here is that a hypothesis is a proposition set forth as explanation for the occurrence of a specified phenomenon. The basis of scientific investigation is the collection of information that is used either to formulate or to test hypotheses. One assesses the important variables and tries to build a model or hypothesis that explains the observed phenomenon. In general, a hypothesis is formulated by rephrasing the objective of a study as a statement, e.g., if the objective of an investigation is to determine if a pesticide is safe, the resulting hypothesis might be “the pesticide is not safe”, or alternatively that “the pesticide is safe”. A hypothesis is a statistical hypothesis only if it is stated in terms related to the distribution of populations. The general hypothesis above might be refined to: “this pesticide, when used as directed, has no effect on the average number of robins in an area”, which is a testable hypothesis. The hypothesis to be tested is called the null hypothesis (H0). The alternative hypothesis (H1) for the above example would be “this pesticide, when used as directed, has an effect on the average number of robins in an area”. In testing a hypothesis, H0 is considered to be true unless the sample data indicate otherwise (i.e., the pesticide is innocent unless proven guilty). Testing cannot prove H0 to be true, but the results can cause it to be rejected. In accepting or rejecting H0, two types of error may be made. If H0 is rejected when, in fact, it is true, a type 1 error has been committed. If H0 is not true and the test fails to reject it, a type 2 error has been made.

“Research in the field, through study of disease as it manifests itself in nature, is an important and independent approach to solution of medical problems. Modern medical progress has been so thoroughly associated with research in the biological laboratory, and it has been so largely a development of the experimental method, that this other and older method has come in recent years to be overshadowed” (Gordon, 1950)

Author information

Authors and affiliations.

Western College of Veterinary Medicine, University of Saskatchewan, Saskatoon, Canada

Gary A. Wobeser

Copyright information

© 1994 Springer Science+Business Media New York

About this chapter

Wobeser, G.A. (1994). Formulating and Testing Hypotheses. In: Investigation and Management of Disease in Wild Animals. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-5609-8_6

DOI : https://doi.org/10.1007/978-1-4757-5609-8_6

Publisher Name : Springer, Boston, MA

Print ISBN : 978-1-4757-5611-1

Online ISBN : 978-1-4757-5609-8

eBook Packages : Springer Book Archive

Statology

Statistics Made Easy

How to Write Hypothesis Test Conclusions (With Examples)

A hypothesis test is used to test whether or not some hypothesis about a population parameter is true.

To perform a hypothesis test in the real world, researchers obtain a random sample from the population and perform a hypothesis test on the sample data, using a null and alternative hypothesis:

  • Null Hypothesis (H0): The sample data occurs purely from chance.
  • Alternative Hypothesis (HA): The sample data is influenced by some non-random cause.

If the p-value of the hypothesis test is less than some significance level (e.g. α = .05), then we reject the null hypothesis.

Otherwise, if the p-value is not less than some significance level then we fail to reject the null hypothesis.

When writing the conclusion of a hypothesis test, we typically include:

  • Whether we reject or fail to reject the null hypothesis.
  • The significance level.
  • A short explanation in the context of the hypothesis test.

For example, we would write:

We reject the null hypothesis at the 5% significance level. There is sufficient evidence to support the claim that…

Or, we would write:

We fail to reject the null hypothesis at the 5% significance level. There is not sufficient evidence to support the claim that…

The following examples show how to write a hypothesis test conclusion in both scenarios.

Example 1: Reject the Null Hypothesis Conclusion

Suppose a biologist believes that a certain fertilizer will cause plants to grow more during a one-month period than they normally do, which is currently 20 inches. To test this, she applies the fertilizer to each of the plants in her laboratory for one month.

She then performs a hypothesis test at a 5% significance level using the following hypotheses:

  • H0: μ = 20 inches (the fertilizer will have no effect on the mean plant growth)
  • HA: μ > 20 inches (the fertilizer will cause mean plant growth to increase)

Suppose the p-value of the test turns out to be 0.002.

Here is how she would report the results of the hypothesis test:

We reject the null hypothesis at the 5% significance level. There is sufficient evidence to support the claim that this particular fertilizer causes plants to grow more during a one-month period than they normally do.

Example 2: Fail to Reject the Null Hypothesis Conclusion

Suppose the manager of a manufacturing plant wants to test whether or not some new method changes the number of defective widgets produced per month, which is currently 250. To test this, he measures the mean number of defective widgets produced before and after using the new method for one month.

He performs a hypothesis test at a 10% significance level using the following hypotheses:

  • H0: μ_after = μ_before (the mean number of defective widgets is the same before and after using the new method)
  • HA: μ_after ≠ μ_before (the mean number of defective widgets produced is different before and after using the new method)

Suppose the p-value of the test turns out to be 0.27.

Here is how he would report the results of the hypothesis test:

We fail to reject the null hypothesis at the 10% significance level. There is not sufficient evidence to support the claim that the new method leads to a change in the number of defective widgets produced per month.
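
The reporting pattern used in both examples can be sketched as a small helper. The function name `write_conclusion` and its parameters are hypothetical and chosen only for illustration; the rule applied is the one stated earlier (reject when the p-value is less than the significance level).

```python
def write_conclusion(p_value, alpha, claim):
    """Build a hypothesis-test conclusion in the two-sentence style used above."""
    level = f"{alpha:.0%}"   # e.g. 0.05 -> "5%"
    if p_value < alpha:
        return (f"We reject the null hypothesis at the {level} significance level. "
                f"There is sufficient evidence to support the claim that {claim}.")
    return (f"We fail to reject the null hypothesis at the {level} significance level. "
            f"There is not sufficient evidence to support the claim that {claim}.")

# Example 1: p-value 0.002 at the 5% level
print(write_conclusion(0.002, 0.05, "this fertilizer causes plants to grow more"))
# Example 2: p-value 0.27 at the 10% level
print(write_conclusion(0.27, 0.10, "the new method changes the number of defective widgets"))
```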

Additional Resources

The following tutorials provide additional information about hypothesis testing:

  • Introduction to Hypothesis Testing
  • 4 Examples of Hypothesis Testing in Real Life
  • How to Write a Null Hypothesis

Hey there. My name is Zach Bobbitt. I have a Master of Science degree in Applied Statistics and I’ve worked on machine learning algorithms for professional businesses in both healthcare and retail. I’m passionate about statistics, machine learning, and data visualization and I created Statology to be a resource for both students and teachers alike.  My goal with this site is to help you learn statistics through using simple terms, plenty of real-world examples, and helpful illustrations.

Definition of a Hypothesis

What it is and how it's used in sociology

A hypothesis is a prediction of what will be found at the outcome of a research project and is typically focused on the relationship between two different variables studied in the research. It is usually based on both theoretical expectations about how things work and already existing scientific evidence.

Within social science, a hypothesis can take two forms. It can predict that there is no relationship between two variables, in which case it is a null hypothesis. Or, it can predict the existence of a relationship between variables, which is known as an alternative hypothesis.

In either case, the variable that is thought to either affect or not affect the outcome is known as the independent variable, and the variable that is thought to either be affected or not is the dependent variable.

Researchers seek to determine whether or not their hypothesis, or hypotheses if they have more than one, will prove true. Sometimes they do, and sometimes they do not. Either way, the research is considered successful if one can conclude whether or not a hypothesis is true. 

Null Hypothesis

A researcher has a null hypothesis when she or he believes, based on theory and existing scientific evidence, that there will not be a relationship between two variables. For example, when examining what factors influence a person's highest level of education within the U.S., a researcher might expect that place of birth, number of siblings, and religion would not have an impact on the level of education. This would mean the researcher has stated three null hypotheses.

Alternative Hypothesis

Taking the same example, a researcher might expect that the economic class and educational attainment of one's parents, and the race of the person in question, are likely to have an effect on one's educational attainment. Existing evidence and social theories that recognize the connections between wealth and cultural resources, and how race affects access to rights and resources in the U.S., would suggest that both economic class and educational attainment of one's parents would have a positive effect on educational attainment. In this case, economic class and educational attainment of one's parents are independent variables, and one's educational attainment is the dependent variable: it is hypothesized to be dependent on the other two.

Conversely, an informed researcher would expect that being a race other than white in the U.S. is likely to have a negative impact on a person's educational attainment. This would be characterized as a negative relationship, wherein being a person of color has a negative effect on one's educational attainment. In reality, this hypothesis proves true, with the exception of Asian Americans, who go to college at a higher rate than whites do. However, Blacks and Hispanics and Latinos are far less likely than whites and Asian Americans to go to college.

Formulating a Hypothesis

Formulating a hypothesis can take place at the very beginning of a research project, or after a bit of research has already been done. Sometimes a researcher knows right from the start which variables she is interested in studying, and she may already have a hunch about their relationships. Other times, a researcher may have an interest in a particular topic, trend, or phenomenon, but he may not know enough about it to identify variables or formulate a hypothesis.

Whenever a hypothesis is formulated, the most important thing is to be precise about what one's variables are, what the nature of the relationship between them might be, and how one can go about conducting a study of them.

Updated by Nicki Lisa Cole, Ph.D

Understanding Hypothesis Testing

Hypothesis testing involves formulating assumptions about population parameters based on sample statistics and rigorously evaluating these assumptions against empirical evidence. This article sheds light on the significance of hypothesis testing and the critical steps involved in the process.

What is Hypothesis Testing?

Hypothesis testing is a statistical method used to make a statistical decision from experimental data. A hypothesis is an assumption that we make about a population parameter. Hypothesis testing evaluates two mutually exclusive statements about a population to determine which statement is best supported by the sample data.

Example: You claim that the average height in a class is 30, or that boys are taller than girls. Each of these is an assumption, and we need a statistical way to check it: a mathematical conclusion about whether what we are assuming is supported by the data.

Defining Hypotheses

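  • Null hypothesis (H0): the default statement that no effect or difference exists, for example that a population mean μ equals a stated value.
  • Alternative hypothesis (H1): the competing statement that an effect or difference exists.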

Key Terms of Hypothesis Testing

  • Level of significance (α): the risk of rejecting the null hypothesis when it is true (a Type I error); 0.05 is a common choice.

  • P-value: The p-value, or calculated probability, is the probability of obtaining results at least as extreme as those observed when the null hypothesis (H0) is true. If the p-value is less than the chosen significance level, you reject the null hypothesis, i.e. you conclude that the sample supports the alternative hypothesis.
  • Test Statistic: The test statistic is a numerical value calculated from sample data during a hypothesis test, used to determine whether to reject the null hypothesis. It is compared to a critical value or converted to a p-value to make decisions about the statistical significance of the observed results.
  • Critical value: The critical value in statistics is a threshold or cutoff point used to determine whether to reject the null hypothesis in a hypothesis test.
  • Degrees of freedom: Degrees of freedom are associated with the variability or freedom one has in estimating a parameter. They are related to the sample size and determine the shape of the reference distribution, such as the t-distribution.

Why do we use Hypothesis Testing?

Hypothesis testing is an important procedure in statistics. It evaluates two mutually exclusive statements about a population to determine which is better supported by the sample data. When we say that a finding is statistically significant, it is hypothesis testing that justifies the statement.

One-Tailed and Two-Tailed Test

One tailed test focuses on one direction, either greater than or less than a specified value. We use a one-tailed test when there is a clear directional expectation based on prior knowledge or theory. The critical region is located on only one side of the distribution curve. If the sample falls into this critical region, the null hypothesis is rejected in favor of the alternative hypothesis.

One-Tailed Test

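There are two types of one-tailed test:

  • Left-tailed (left-sided) test: the alternative hypothesis states that the parameter is less than the null value, e.g. H0: μ ≥ 50 against H1: μ < 50.
  • Right-tailed (right-sided) test: the alternative hypothesis states that the parameter is greater than the null value, e.g. H0: μ ≤ 50 against H1: μ > 50.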

Two-Tailed Test

A two-tailed test considers both directions, greater than and less than a specified value. We use a two-tailed test when there is no specific directional expectation and we want to detect any significant difference.

For a hypothesized value μ0, the hypotheses are H0: μ = μ0 against H1: μ ≠ μ0.

What are Type 1 and Type 2 errors in Hypothesis Testing?

In hypothesis testing, Type I and Type II errors are two possible errors that researchers can make when drawing conclusions about a population based on a sample of data. These errors are associated with the decisions made regarding the null hypothesis and the alternative hypothesis.

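  • Type I error: rejecting the null hypothesis when it is in fact true. Its probability is the significance level α.
  • Type II error: failing to reject the null hypothesis when it is in fact false. Its probability is usually denoted β.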
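
The two error types can be illustrated by simulation. The sketch below is a hedged illustration, not part of the original tutorial: the sample size, effect size, number of trials, and seed are all hypothetical choices, and it assumes NumPy and SciPy are installed.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)        # fixed seed so the run is repeatable
alpha, n, trials = 0.05, 30, 5000     # hypothetical settings for the illustration

def rejection_rate(true_mean):
    """Share of simulated samples in which H0: mu = 0 is rejected."""
    samples = rng.normal(loc=true_mean, scale=1.0, size=(trials, n))
    result = stats.ttest_1samp(samples, popmean=0.0, axis=1)
    return np.mean(result.pvalue < alpha)

# H0 is true here, so every rejection is a Type I error
type_1_rate = rejection_rate(true_mean=0.0)
# H0 is false here, so every failure to reject is a Type II error
type_2_rate = 1 - rejection_rate(true_mean=0.5)

print(f"Estimated Type I error rate:  {type_1_rate:.3f}")
print(f"Estimated Type II error rate: {type_2_rate:.3f}")
```

The estimated Type I rate should sit near the chosen α, while the Type II rate depends on the true effect size and the sample size.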

How does Hypothesis Testing work?

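Step 1 – Define null and alternative hypotheses

State the null hypothesis (H0), which represents no effect, and the alternative hypothesis (H1), which represents an effect or difference.

We first identify the problem about which we want to make an assumption, keeping in mind that the two hypotheses must contradict one another. The tests described here assume normally distributed data.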

Step 2 – Choose significance level

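Select a significance level (α), typically 0.05, as the threshold for rejecting the null hypothesis. It is the probability of a Type I error that we are willing to accept.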

Step 3 – Collect and Analyze data.

Gather relevant data through observation or experimentation. Analyze the data using appropriate statistical methods to obtain a test statistic.

Step 4 – Calculate Test Statistic

In this step the data are evaluated and a score is computed from their characteristics. The choice of the test statistic depends on the type of hypothesis test being conducted.

There are various hypothesis tests, each suited to a different goal, such as the Z-test, Chi-square test, t-test, and so on.

  • Z-test: used when the population mean and standard deviation are known.
  • t-test: more appropriate when the population standard deviation is unknown and the sample size is small.
  • Chi-square test: used for categorical data or for testing independence in contingency tables.
  • F-test: often used in analysis of variance (ANOVA) to compare variances or test the equality of means across multiple groups.

Our dataset is small, so the t-test is more appropriate for testing our hypothesis.

T-statistic is a measure of the difference between the means of two groups relative to the variability within each group. It is calculated as the difference between the sample means divided by the standard error of the difference. It is also known as the t-value or t-score.

Step 5 – Compare Test Statistic

In this stage, we decide whether to reject or fail to reject the null hypothesis. There are two ways to make this decision.

Method A: Using Critical values

Comparing the test statistic with the tabulated critical value, we have:

  • If Test Statistic > Critical Value: Reject the null hypothesis.
  • If Test Statistic ≤ Critical Value: Fail to reject the null hypothesis.

Note: Critical values are predetermined threshold values used to make a decision in hypothesis testing. To determine them, we typically refer to a statistical distribution table, such as the normal distribution or t-distribution table. For a two-tailed test, the absolute value of the test statistic is compared with the critical value.

Method B: Using P-values

We can also come to a conclusion using the p-value:

  • If p ≤ α: Reject the null hypothesis.
  • If p > α: Fail to reject the null hypothesis.

Note: The p-value is the probability of obtaining a test statistic as extreme as, or more extreme than, the one observed in the sample, assuming the null hypothesis is true. To determine the p-value, we typically refer to a statistical distribution table, such as the normal distribution or t-distribution table, or use statistical software.
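
Critical values and p-values can also be obtained from software instead of printed tables. The following sketch assumes SciPy is installed; the observed statistic `t_obs` and the degrees of freedom are hypothetical values used only to show that Method A and Method B lead to the same decision.

```python
from scipy import stats

alpha = 0.05
df = 9                                   # hypothetical degrees of freedom

# Two-tailed critical values: put alpha / 2 in each tail
z_crit = stats.norm.ppf(1 - alpha / 2)   # standard normal
t_crit = stats.t.ppf(1 - alpha / 2, df)  # t-distribution with df degrees of freedom

t_obs = 2.5                              # hypothetical observed t statistic
p_value = 2 * stats.t.sf(abs(t_obs), df) # two-tailed p-value

reject_by_critical = abs(t_obs) > t_crit # Method A
reject_by_p_value = p_value <= alpha     # Method B

print(f"z critical = {z_crit:.3f}, t critical = {t_crit:.3f}")
print(f"p-value = {p_value:.4f}, Method A rejects: {reject_by_critical}, "
      f"Method B rejects: {reject_by_p_value}")
```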

Step 6 – Interpret the Results

At last, we can conclude our experiment using method A or B.

Calculating test statistic

To validate our hypothesis about a population parameter we use statistical functions. For normally distributed data, we use the z-score, the p-value, and the level of significance (α) to weigh the evidence for our hypothesis.

1. Z-statistics:

Used when the population mean and standard deviation are known.

z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}

  • x̄ is the sample mean,
  • μ represents the population mean,
  • σ is the population standard deviation,
  • and n is the size of the sample.

2. T-Statistics

The t-test is used when the population standard deviation is unknown and the sample is small (n < 30).

The t-statistic is given by:

t = \frac{\bar{x} - \mu}{s/\sqrt{n}}

  • t = t-score,
  • x̄ = sample mean
  • μ = population mean,
  • s = standard deviation of the sample,
  • n = sample size

3. Chi-Square Test

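The Chi-Square test for independence is used for categorical (non-normally distributed) data:

\chi^2 = \sum \frac{(O_{ij} - E_{ij})^2}{E_{ij}}

  • O_{ij} is the observed frequency in cell ij,
  • i, j are the row and column indices respectively,
  • E_{ij} is the expected frequency in cell ij, calculated as (row total × column total) / total observations.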
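
The chi-square statistic can be illustrated on a small contingency table. This sketch assumes NumPy and SciPy are installed; the 2×2 table of counts is hypothetical and does not come from the article.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical observed counts (rows: two groups, columns: two outcomes)
observed = np.array([[30, 10],
                     [20, 40]])

# correction=False disables Yates' continuity correction so the result
# matches the plain formula sum((O - E)^2 / E)
chi2, p_value, dof, expected = chi2_contingency(observed, correction=False)

# The same statistic computed directly from the formula
manual = ((observed - expected) ** 2 / expected).sum()

print(f"chi-square = {chi2:.4f}, dof = {dof}, p-value = {p_value:.6f}")
print(f"manual chi-square = {manual:.4f}")
```

The `expected` array returned by SciPy holds the E_{ij} values, each equal to the row total times the column total divided by the grand total.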

Real life Hypothesis Testing example

Let’s examine hypothesis testing using two real life situations,

Case A: Does a New Drug Affect Blood Pressure?

Imagine a pharmaceutical company has developed a new drug that they believe can effectively lower blood pressure in patients with hypertension. Before bringing the drug to market, they need to conduct a study to assess its impact on blood pressure.

  • Before Treatment: 120, 122, 118, 130, 125, 128, 115, 121, 123, 119
  • After Treatment: 115, 120, 112, 128, 122, 125, 110, 117, 119, 114

Step 1: Define the Hypothesis

  • Null Hypothesis (H0): The new drug has no effect on blood pressure.
  • Alternate Hypothesis (H1): The new drug has an effect on blood pressure.

Step 2: Define the Significance level

Let’s set the significance level at 0.05: we reject the null hypothesis if the evidence suggests less than a 5% chance of observing the results due to random variation.

Step 3: Compute the test statistic

Using a paired t-test, analyze the data to obtain a test statistic and a p-value.

The test statistic (here, the t-statistic) is calculated from the differences between blood pressure measurements before and after treatment.

t = m/(s/√n)

  • m = mean of the differences d_i, where d_i = X_after,i − X_before,i
  • s = standard deviation of the differences d_i
  • n = sample size

Here, m = -3.9, s ≈ 1.37 and n = 10.

Using the formula for the paired t-test, the t-statistic is -9.

Step 4: Find the p-value

With a t-statistic of -9 and degrees of freedom df = 9, you can find the p-value using statistical software or a t-distribution table.

Thus, p-value = 8.538051223166285e-06

Step 5: Result

  • If the p-value is less than or equal to 0.05, the researchers reject the null hypothesis.
  • If the p-value is greater than 0.05, they fail to reject the null hypothesis.

Conclusion: Since the p-value (8.538051223166285e-06) is less than the significance level (0.05), the researchers reject the null hypothesis. There is statistically significant evidence that the average blood pressure before and after treatment with the new drug is different.

Python Implementation of Hypothesis Testing

Let’s implement hypothesis testing in Python, testing whether a new drug affects blood pressure. For this example, we will use a paired t-test from the scipy.stats library.

SciPy is a Python library for scientific and mathematical computation.

We will implement our first real-life problem (Case A) in Python.
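
A minimal sketch of the Case A analysis follows, assuming NumPy and SciPy are installed. It uses `scipy.stats.ttest_rel` for the paired t-test and repeats the calculation by hand with the formula t = m/(s/√n); the variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Blood pressure readings from Case A
before = np.array([120, 122, 118, 130, 125, 128, 115, 121, 123, 119])
after = np.array([115, 120, 112, 128, 122, 125, 110, 117, 119, 114])

alpha = 0.05   # significance level chosen in Step 2

# Paired (dependent-samples) t-test on the differences after - before
t_stat, p_value = stats.ttest_rel(after, before)

# Manual check using t = m / (s / sqrt(n))
d = after - before
m = d.mean()                 # mean of the differences
s = d.std(ddof=1)            # sample standard deviation of the differences
manual_t = m / (s / np.sqrt(d.size))

decision = "Reject" if p_value <= alpha else "Fail to reject"
print(f"m = {m:.1f}, s = {s:.2f}")
print(f"t-statistic (SciPy) = {t_stat:.4f}, manual = {manual_t:.4f}")
print(f"p-value = {p_value:.3e}")
print(f"{decision} the null hypothesis at alpha = {alpha}")
```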

In the above example, given the t-statistic of approximately -9 and an extremely small p-value, the results indicate a strong case to reject the null hypothesis at a significance level of 0.05.

  • The results suggest that the new drug has a significant effect on lowering blood pressure.
  • The negative t-statistic indicates that the mean blood pressure after treatment is significantly lower than the mean before treatment.

Case B : Cholesterol level in a population

Data: A sample of 25 individuals is taken, and their cholesterol levels are measured.

Cholesterol Levels (mg/dL): 205, 198, 210, 190, 215, 205, 200, 192, 198, 205, 198, 202, 208, 200, 205, 198, 205, 210, 192, 205, 198, 205, 210, 192, 205.

Population Mean (μ): 200 mg/dL

Population Standard Deviation (σ): 5 mg/dL (given for this problem)

Step 1: Define the Hypothesis

  • Null Hypothesis (H0): The average cholesterol level in a population is 200 mg/dL.
  • Alternate Hypothesis (H1): The average cholesterol level in a population is different from 200 mg/dL.

Step 2: Define the Significance Level

As the direction of deviation is not given, we assume a two-tailed test. From a normal distribution table (z-table), the critical values for a significance level of 0.05 (two-tailed) are approximately -1.96 and 1.96.

Step 3: Compute the test statistic

The sample mean of the 25 measurements is 202.04 mg/dL, so

z = \frac{202.04 - 200}{5/\sqrt{25}} = 2.04

Step 4: Result

Since the absolute value of the test statistic (2.04) is greater than the critical value (1.96), we reject the null hypothesis and conclude that there is statistically significant evidence that the average cholesterol level in the population is different from 200 mg/dL.
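
The Case B calculation can be reproduced with a short z-test sketch, assuming NumPy and SciPy are installed. The data, population mean, and population standard deviation are those given above; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import norm

# Cholesterol levels (mg/dL) from Case B
levels = np.array([205, 198, 210, 190, 215, 205, 200, 192, 198, 205,
                   198, 202, 208, 200, 205, 198, 205, 210, 192, 205,
                   198, 205, 210, 192, 205])

mu0 = 200      # population mean under H0
sigma = 5      # population standard deviation (given)
alpha = 0.05

x_bar = levels.mean()
z = (x_bar - mu0) / (sigma / np.sqrt(levels.size))

critical = norm.ppf(1 - alpha / 2)      # two-tailed critical value
p_value = 2 * norm.sf(abs(z))           # two-tailed p-value
reject = abs(z) > critical

print(f"sample mean = {x_bar:.2f}, z = {z:.2f}")
print(f"critical value = {critical:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Fail to reject H0")
```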

Limitations of Hypothesis Testing

  • Although a useful technique, hypothesis testing does not offer a comprehensive grasp of the topic being studied. Without fully reflecting the intricacy or whole context of the phenomena, it concentrates on certain hypotheses and statistical significance.
  • The accuracy of hypothesis testing results is contingent on the quality of available data and the appropriateness of statistical methods used. Inaccurate data or poorly formulated hypotheses can lead to incorrect conclusions.
  • Relying solely on hypothesis testing may cause analysts to overlook significant patterns or relationships in the data that are not captured by the specific hypotheses being tested. This limitation underscores the importance of complementing hypothesis testing with other analytical approaches.

Hypothesis testing stands as a cornerstone in statistical analysis, enabling data scientists to navigate uncertainties and draw credible inferences from sample data. By systematically defining null and alternative hypotheses, choosing significance levels, and leveraging statistical tests, researchers can assess the validity of their assumptions. The article also elucidates the critical distinction between Type I and Type II errors, providing a comprehensive understanding of the nuanced decision-making process inherent in hypothesis testing. The real-life example of testing a new drug’s effect on blood pressure using a paired T-test showcases the practical application of these principles, underscoring the importance of statistical rigor in data-driven decision-making.

Frequently Asked Questions (FAQs)

1. what are the 3 types of hypothesis test.

There are three types of hypothesis tests: right-tailed, left-tailed, and two-tailed. Right-tailed tests assess if a parameter is greater, left-tailed if lesser. Two-tailed tests check for non-directional differences, greater or lesser.

2.What are the 4 components of hypothesis testing?

  • Null hypothesis (H₀): No effect or difference exists.
  • Alternative hypothesis (H₁): An effect or difference exists.
  • Significance level (α): Risk of rejecting the null hypothesis when it is true (Type I error).
  • Test statistic: Numerical value representing the observed evidence against the null hypothesis.

3. What is hypothesis testing in ML?

It is a statistical method for evaluating the performance and validity of machine learning models. It tests specific hypotheses about model behavior, such as whether features influence predictions or whether a model generalizes well to unseen data.

4. What is the difference between Pytest and Hypothesis in Python?

Pytest is a general-purpose testing framework for Python code, while Hypothesis is a property-based testing framework for Python that generates test cases from specified properties of the code.

COMMENTS

  1. Hypothesis Testing

    Hypothesis testing example. You want to test whether there is a relationship between gender and height. Based on your knowledge of human physiology, you formulate a hypothesis that men are, on average, taller than women. To test this hypothesis, you restate it as: H₀: Men are, on average, not taller than women. Hₐ: Men are, on average, taller ...

  2. Formulating Hypotheses for Different Study Designs

    Formulating Hypotheses for Different Study Designs. Generating a testable working hypothesis is the first step towards conducting original research. Such research may prove or disprove the proposed hypothesis. Case reports, case series, online surveys and other observational studies, clinical trials, and narrative reviews help to generate ...

  3. 1.1: The Working Hypothesis

    1.1: The Working Hypothesis. Using the scientific method, before any statistical analysis can be conducted, a researcher must generate a guess, or hypothesis about what is going on. The process begins with a Working Hypothesis. This is a direct statement of the research idea. For example, a plant biologist may think that plant height may be ...

  4. (PDF) FORMULATING AND TESTING HYPOTHESIS

    The researcher states a hypothesis to be tested, formulates an analysis plan, analyzes sample data according to the plan, and accepts or rejects the null hypothesis, based on results of the ...

  5. 1.2

    Step 7: Based on Steps 5 and 6, draw a conclusion about H₀. If F_calculated is larger than F_α, then you are in the rejection region and you can reject the null hypothesis with (1 − α) level of confidence. Note that modern statistical software condenses Steps 6 and 7 by providing a p-value. The p-value here is the probability of getting ...

  6. Working hypothesis

    A working hypothesis is a hypothesis that is provisionally accepted as a basis for further ongoing research in the hope that a tenable theory will be produced, even if the hypothesis ultimately fails. Like all hypotheses, a working hypothesis is constructed as a statement of expectations, which can be linked to deductive, exploratory research in empirical investigation and is often used as a ...

  7. How Do You Formulate (Important) Hypotheses?

    Building on the ideas in Chap. 1, we describe formulating, testing, and revising hypotheses as a continuing cycle of clarifying what you want to study, making predictions about what you might find together with developing your reasons for these predictions, imagining tests of these predictions, revising your predictions and rationales, and so ...

  8. Developing and Testing Hypotheses

    Testing Research Hypotheses. The purpose of statistical hypothesis testing is to use a sample to draw inferences about a population. Testing research hypotheses requires a number of steps: Step 1. Define your research hypothesis. The first step in any hypothesis testing is to identify your hypothesis, which you will then go on to test.

  9. Research Hypothesis: Definition, Types, Examples and Quick Tips

    6. Empirical hypothesis. Also referred to as the working hypothesis, an empirical hypothesis claims a theory's validation via experiments and observation. This way, the statement appears justifiable and different from a wild guess. Say, the hypothesis is "Women who take iron tablets face a lesser risk of anemia than those who take vitamin B12."

  10. A Beginner's Guide to Hypothesis Testing in Business

    3. One-Sided vs. Two-Sided Testing. When it's time to test your hypothesis, it's important to leverage the correct testing method. The two most common hypothesis testing methods are one-sided and two-sided tests, or one-tailed and two-tailed tests, respectively. Typically, you'd leverage a one-sided test when you have a strong conviction ...

  11. Hypothesis: Definition, Examples, and Types

    A hypothesis is a tentative statement about the relationship between two or more variables. It is a specific, testable prediction about what you expect to happen in a study. It is a preliminary answer to your question that helps guide the research process. Consider a study designed to examine the relationship between sleep deprivation and test ...

  12. What is a Hypothesis

    Definition: Hypothesis is an educated guess or proposed explanation for a phenomenon, based on some initial observations or data. It is a tentative statement that can be tested and potentially proven or disproven through further investigation and experimentation. Hypothesis is often used in scientific research to guide the design of experiments ...

  13. How to Write a Strong Hypothesis

    Step 5: Phrase your hypothesis in three ways. To identify the variables, you can write a simple prediction in if … then form. The first part of the sentence states the independent variable and the second part states the dependent variable. If a first-year student starts attending more lectures, then their exam scores will improve.

  14. Chapter 2 Formulating a hypothesis

    A hypothesis is a statement that introduces your research question and suggests the results you might find. It is an educated guess. You start by posing an economic question and formulate a hypothesis about this question. Then you test it with your data and empirical analysis and either accept or reject the hypothesis.

  15. Scientific hypothesis

    Countless hypotheses have been developed and tested throughout the history of science. Several examples include the idea that living organisms develop from nonliving matter, which formed the basis of spontaneous generation, a hypothesis that ultimately was disproved (first in 1668, with the experiments of Italian physician Francesco Redi, and later in 1859, with the experiments of French ...

  16. 9.E: Hypothesis Testing with One Sample (Exercises)

    An Introduction to Statistics class in Davies County, KY conducted a hypothesis test at the local high school (a medium-sized, approximately 1,200 students, small-city demographic) to determine if the local high school's percentage was lower. One hundred fifty students were chosen at random and surveyed.

  17. Formulating and Testing Hypotheses

    A hypothesis is a statistical hypothesis only if it is stated in terms related to the distribution of populations. The general hypothesis above might be refined to: " this pesticide, when used as directed, has no effect on the average number of robins in an area ", which is a testable hypothesis. The hypothesis to be tested is called the ...

  18. How to Write Hypothesis Test Conclusions (With Examples)

    H0: μafter = μbefore (the mean number of defective widgets is the same before and after using the new method) HA: μafter ≠ μbefore (the mean number of defective widgets produced is different before and after using the new method) Suppose the p-value of the test turns out to be 0.27. Here is how he would report the results of the ...

  19. What a Hypothesis Is and How to Formulate One

    A hypothesis is a prediction of what will be found at the outcome of a research project and is typically focused on the relationship between two different variables studied in the research. It is usually based on both theoretical expectations about how things work and already existing scientific evidence. Within social science, a hypothesis can ...

  20. Understanding Hypothesis Testing

    Step 3: Compute the test statistic. The test statistic is calculated using the z formula, which gives Z = 2.039999999999992. Step 4: Result. Since the absolute value of the test statistic (2.04) is greater than the critical value (1.96), we reject the null hypothesis.