what is inductive argument in critical thinking

Guide To Inductive & Deductive Reasoning

Induction vs. Deduction

October 15, 2008, by The Critical Thinking Co. Staff

Induction and deduction are pervasive elements in critical thinking. They are also somewhat misunderstood terms. Arguments based on experience or observation are best expressed inductively, while arguments based on laws or rules are best expressed deductively. Most arguments are mainly inductive. In fact, inductive reasoning usually comes much more naturally to us than deductive reasoning.

Inductive reasoning moves from specific details and observations (typically of nature) to the more general underlying principles or processes that explain them (e.g., Newton's Law of Gravity). It is open-ended and exploratory, especially at the beginning. The premises of an inductive argument are believed to support the conclusion, but do not ensure it. Thus, the conclusion of an induction is regarded as a hypothesis. In the inductive method, also called the scientific method, observation of nature is the authority.

In contrast, deductive reasoning typically moves from general truths to specific conclusions. It opens with an expansive explanation (statements known or believed to be true) and continues with predictions for specific observations supporting it. Deductive reasoning is narrow in nature and is concerned with testing or confirming a hypothesis. It is dependent on its premises. For example, a false premise can lead to a false result, and inconclusive premises will also yield an inconclusive conclusion. Deductive reasoning leads to a confirmation (or not) of our original theories. When the premises are true and the reasoning is valid, it guarantees the correctness of the conclusion. Logic is the authority in the deductive method.

If you can strengthen your argument or hypothesis by adding another piece of information, you are using inductive reasoning. If you cannot improve your argument by adding more evidence, you are employing deductive reasoning.

Pursuing Truth: A Guide to Critical Thinking

Chapter 2: Arguments

The fundamental tool of the critical thinker is the argument. For a good example of what we are not talking about, consider a bit from a famous sketch by Monty Python’s Flying Circus: 3

2.1 Identifying Arguments

People often use “argument” to refer to a dispute or quarrel between people. In critical thinking, an argument is defined as

A set of statements, one of which is the conclusion and the others are the premises.

There are three important things to remember here:

  • Arguments contain statements.
  • They have a conclusion.
  • They have at least one premise.

Arguments contain statements, or declarative sentences. Statements, unlike questions or commands, have a truth value. Statements assert that the world is a particular way; questions do not. For example, if someone asked you what you did after dinner yesterday evening, you wouldn’t accuse them of lying. When the world is the way that the statement says that it is, we say that the statement is true. If the statement is not true, it is false.

One of the statements in the argument is called the conclusion. The conclusion is the statement that is intended to be proved. Consider the following argument:

Calculus II will be no harder than Calculus I. Susan did well in Calculus I. So, Susan should do well in Calculus II.

Here the conclusion is that Susan should do well in Calculus II. The other two sentences are premises. Premises are the reasons offered for believing that the conclusion is true.

2.1.1 Standard Form

Now, to make the argument easier to evaluate, we will put it into what is called “standard form.” To put an argument in standard form, write each premise on a separate, numbered line. Draw a line underneath the last premise, then write the conclusion underneath the line.

  1. Calculus II will be no harder than Calculus I.
  2. Susan did well in Calculus I.
  ------------------------------------------------
  Susan should do well in Calculus II.

Now that we have the argument in standard form, we can talk about premise 1, premise 2, and all clearly be referring to the same thing.

2.1.2 Indicator Words

Unfortunately, when people present arguments, they rarely put them in standard form. So, we have to decide which statement is intended to be the conclusion, and which are the premises. Don’t make the mistake of assuming that the conclusion comes at the end. The conclusion is often at the beginning of the passage, but could even be in the middle. A better way to identify premises and conclusions is to look for indicator words. Indicator words are words that signal that the statement following the indicator is a premise or conclusion. The example above used a common indicator word for a conclusion, ‘so.’ The other common conclusion indicator, as you can probably guess, is ‘therefore.’ This table lists the indicator words you might encounter.

Each argument will likely use only one indicator word or phrase. When the conclusion is at the end, it will generally be preceded by a conclusion indicator. Everything else, then, is a premise. When the conclusion comes at the beginning, the next sentence will usually be introduced by a premise indicator. All of the following sentences will also be premises.

For example, here’s our previous argument rewritten to use a premise indicator:

Susan should do well in Calculus II, because Calculus II will be no harder than Calculus I, and Susan did well in Calculus I.
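The indicator-word idea can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the word lists contain just the three indicators named above (‘so,’ ‘therefore,’ and ‘because’), and the function name find_indicators is made up for this sketch.

```python
import re

# Only the indicator words named in the text; a fuller list would be longer.
CONCLUSION_INDICATORS = ("so", "therefore")
PREMISE_INDICATORS = ("because",)


def find_indicators(passage):
    """Return (word, role) pairs for each indicator word found, in order."""
    found = []
    for word in re.findall(r"[a-z']+", passage.lower()):
        if word in CONCLUSION_INDICATORS:
            found.append((word, "conclusion"))
        elif word in PREMISE_INDICATORS:
            found.append((word, "premise"))
    return found


print(find_indicators(
    "Susan should do well in Calculus II, because Calculus II will be "
    "no harder than Calculus I, and Susan did well in Calculus I."
))
```

A word match is only a hint. ‘So’ in “so much” and ‘because’ in an explanation are not argument indicators, so the passage still has to be read for what it is trying to prove.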

Sometimes, an argument will contain no indicator words at all. In that case, the best thing to do is to determine which of the statements would logically follow from the others. If there is one, then it is the conclusion. Here is an example:

Spot is a mammal. All dogs are mammals, and Spot is a dog.

The first sentence logically follows from the others, so it is the conclusion. When using this method, we are forced to assume that the person giving the argument is rational and logical, which might not be true.

2.1.3 Non-Arguments

One thing that complicates our task of identifying arguments is that there are many passages that, although they look like arguments, are not arguments. The most common types are:

  • Explanations
  • Mere assertions
  • Conditional statements
  • Loosely connected statements

Explanations can be tricky, because they often use one of our indicator words. Consider this passage:

Abraham Lincoln died because he was shot.

If this were an argument, then the conclusion would be that Abraham Lincoln died, since the other statement is introduced by a premise indicator. If this is an argument, though, it’s a strange one. Do you really think that someone would be trying to prove that Abraham Lincoln died? Surely everyone knows that he is dead. On the other hand, there might be people who don’t know how he died. This passage does not attempt to prove that something is true, but instead attempts to explain why it is true. To determine if a passage is an explanation or an argument, first find the statement that looks like the conclusion. Next, ask yourself if everyone likely already believes that statement to be true. If the answer to that question is yes, then the passage is an explanation.

Mere assertions are obviously not arguments. If a professor tells you simply that you will not get an A in her course this semester, she has not given you an argument. This is because she hasn’t given you any reasons to believe that the statement is true. If there are no premises, then there is no argument.

Conditional statements are sentences that have the form “If…, then….” A conditional statement asserts that if something is true, then something else would be true also. For example, imagine you are told, “If you have the winning lottery ticket, then you will win ten million dollars.” What is being claimed to be true, that you have the winning lottery ticket, or that you will win ten million dollars? Neither. The only thing claimed is the entire conditional. Conditionals can be premises, and they can be conclusions. They can be parts of arguments, but they cannot, on their own, be arguments themselves.

Finally, consider this passage:

I woke up this morning, then took a shower and got dressed. After breakfast, I worked on chapter 2 of the critical thinking text. I then took a break and drank some more coffee….

This might be a description of my day, but it’s not an argument. There’s nothing in the passage that plays the role of a premise or a conclusion. The passage doesn’t attempt to prove anything. Remember that arguments need a conclusion; there must be something that is the statement to be proved. Lacking that, it simply isn’t an argument, no matter how much it looks like one.

2.2 Evaluating Arguments

The first step in evaluating an argument is to determine what kind of argument it is. We initially categorize arguments as either deductive or inductive, defined roughly in terms of their goals. In deductive arguments, the truth of the premises is intended to absolutely establish the truth of the conclusion. For inductive arguments, the truth of the premises is only intended to establish the probable truth of the conclusion. We’ll focus on deductive arguments first, then examine inductive arguments in later chapters.

Once we have established that an argument is deductive, we then ask if it is valid. To say that an argument is valid is to claim that there is a very special logical relationship between the premises and the conclusion, such that if the premises are true, then the conclusion must also be true. Another way to state this is

An argument is valid if and only if it is impossible for the premises to be true and the conclusion false.

An argument is invalid if and only if it is not valid.
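For arguments simple enough to be written with a few true-or-false letters, the definition can be checked mechanically by trying every combination of truth values. The sketch below is an illustration under stated assumptions: the letters A and B, the helper implies, and the function is_valid are invented here, and “if…then” is read as the material conditional of propositional logic.

```python
from itertools import product


def implies(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b


def is_valid(premises, conclusion, variables):
    """True if no assignment makes every premise true and the conclusion false."""
    for values in product([True, False], repeat=len(variables)):
        world = dict(zip(variables, values))
        if all(p(world) for p in premises) and not conclusion(world):
            return False  # true premises with a false conclusion: invalid
    return True


# Form 1: If A then B. A. Therefore, B.
form_1 = is_valid(
    premises=[lambda w: implies(w["A"], w["B"]), lambda w: w["A"]],
    conclusion=lambda w: w["B"],
    variables=["A", "B"],
)

# Form 2: If A then B. B. Therefore, A.
form_2 = is_valid(
    premises=[lambda w: implies(w["A"], w["B"]), lambda w: w["B"]],
    conclusion=lambda w: w["A"],
    variables=["A", "B"],
)

print(form_1, form_2)
```

The check never asks whether A or B is actually true. It only asks whether true premises could ever sit alongside a false conclusion, which is all that validity claims.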

Note that claiming that an argument is valid is not the same as claiming that it has a true conclusion, nor is it to claim that the argument has true premises. Claiming that an argument is valid is claiming nothing more than that the premises, if they were true, would be enough to make the conclusion true. For example, is the following argument valid or not?

  1. If pigs fly, then an increase in the minimum wage will be approved next term.
  2. Pigs fly.
  ------------------------------------------------
  An increase in the minimum wage will be approved next term.

The argument is indeed valid. If the two premises were true, then the conclusion would have to be true also. What about this argument?

  1. All dogs are mammals.
  2. Spot is a mammal.
  ------------------------------------------------
  Spot is a dog.

In this case, both of the premises are true and the conclusion is true. The question to ask, though, is whether the premises absolutely guarantee that the conclusion is true. The answer here is no. The two premises could be true and the conclusion false if Spot were a cat, whale, etc.

Neither of these arguments is good. The second fails because it is invalid. The two premises don’t prove that the conclusion is true. The first argument is valid, however. So, the premises would prove that the conclusion is true, if those premises were themselves true. Unfortunately (or fortunately, I guess, considering what would be dropping from the sky), pigs don’t fly.

These examples give us two important ways that deductive arguments can fail. They can fail because they are invalid, or because they have at least one false premise. Of course, these are not mutually exclusive; an argument can be both invalid and have a false premise.

If the argument is valid, and has all true premises, then it is a sound argument. Sound arguments always have true conclusions.

A sound argument is a deductively valid argument with all true premises.

Inductive arguments are never valid, since the premises only establish the probable truth of the conclusion. So, we evaluate inductive arguments according to their strength. A strong inductive argument is one in which the truth of the premises really does make the conclusion probably true. An argument is weak if the truth of the premises fails to establish the probable truth of the conclusion.

There is a significant difference between valid/invalid and strong/weak. If an argument is not valid, then it is invalid. The two categories are mutually exclusive and exhaustive. There can be no such thing as an argument being more valid than another valid argument. Validity is all or nothing. Inductive strength, however, is on a continuum. A strong inductive argument can be made stronger with the addition of another premise. More evidence can raise the probability of the conclusion. A valid argument cannot be made more valid with an additional premise. Why not? If the argument is valid, then the premises were enough to absolutely guarantee the truth of the conclusion. Adding another premise won’t give any more guarantee of truth than was already there. If it could, then the guarantee wasn’t absolute before, and the original argument wasn’t valid in the first place.
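One way to picture the continuum of inductive strength is Laplace's rule of succession. It is not part of the original text, and it rests on strong assumptions (independent trials and no prior reason to favor either outcome). Under those assumptions, after s successes in n trials the estimated chance that the next trial succeeds is (s + 1) / (n + 2). The function name rule_of_succession is chosen for this sketch.

```python
from fractions import Fraction


def rule_of_succession(successes, trials):
    """Laplace's estimate that the next trial succeeds: (s + 1) / (n + 2)."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return Fraction(successes + 1, trials + 2)


# Each added favorable observation plays the role of one more supporting premise.
for n in (1, 10, 100):
    print(n, float(rule_of_succession(n, n)))
```

Each estimate is larger than the one before it, yet none reaches 1. That matches the contrast drawn above: more evidence can strengthen an inductive argument, but it never turns it into a guarantee.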

2.3 Counterexamples

One way to prove an argument to be invalid is to use a counterexample. A counterexample is a consistent story in which the premises are true and the conclusion false. Consider the argument above:

By pointing out that Spot could have been a cat, I have told a story in which the premises are true, but the conclusion is false.

Here’s another one:

  1. If it is raining, then the sidewalks are wet.
  2. The sidewalks are wet.
  ------------------------------------------------
  It is raining.

The sprinklers might have been on. If so, then the sidewalks would be wet, even if it weren’t raining.
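For an argument this small, the hunt for a counterexample can be automated by listing every combination of truth values. The following is a minimal sketch, not the author's method: the letters R and W and the function find_counterexample are labels invented for the illustration, and “if…then” is treated as the material conditional.

```python
from itertools import product


def find_counterexample(premises, conclusion, variables):
    """Return the first assignment with true premises and a false conclusion, or None."""
    for values in product([True, False], repeat=len(variables)):
        world = dict(zip(variables, values))
        if all(p(world) for p in premises) and not conclusion(world):
            return world
    return None


# R = "it is raining", W = "the sidewalks are wet".
premises = [
    lambda w: (not w["R"]) or w["W"],  # if it is raining, then the sidewalks are wet
    lambda w: w["W"],                  # the sidewalks are wet
]
conclusion = lambda w: w["R"]          # it is raining

print(find_counterexample(premises, conclusion, ["R", "W"]))
# {'R': False, 'W': True}
```

The assignment it finds is the sprinkler story: the sidewalks are wet, but it is not raining. Because the search covers every assignment, a result of None would mean no counterexample exists for that propositional encoding. That does not carry over to arguments that cannot be encoded this simply.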

Counterexamples can be very useful for demonstrating invalidity. Keep in mind, though, that validity can never be proved with the counterexample method. If the argument is valid, then it will be impossible to give a counterexample to it. If you can’t come up with a counterexample, however, that does not prove the argument to be valid. It may only mean that you’re not creative enough.

  • An argument is a set of statements; one is the conclusion, the rest are premises.
  • The conclusion is the statement that the argument is trying to prove.
  • The premises are the reasons offered for believing the conclusion to be true.
  • Explanations, conditional sentences, and mere assertions are not arguments.
  • Deductive reasoning attempts to absolutely guarantee the truth of the conclusion.
  • Inductive reasoning attempts to show that the conclusion is probably true.
  • In a valid argument, it is impossible for the premises to be true and the conclusion false.
  • In an invalid argument, it is possible for the premises to be true and the conclusion false.
  • A sound argument is valid and has all true premises.
  • An inductively strong argument is one in which the truth of the premises makes the truth of the conclusion probable.
  • An inductively weak argument is one in which the truth of the premises does not make the conclusion probably true.
  • A counterexample is a consistent story in which the premises of an argument are true and the conclusion is false. Counterexamples can be used to prove that arguments are deductively invalid.

(Cleese and Chapman 1980).

FVTC Library Resources

Critical & Creative Thinking - OER & More Resources: Inductive Arguments

Writing an Inductive Argument

  • How to Write an Effective Inductive Argument
  • Essay Tips: Inductive Argument Examples

Inductive Argument

  • Deductive and Inductive Arguments | Internet Encyclopedia of Philosophy

"An inductive argument can be affected by acquiring new premises (evidence), but a deductive argument cannot be. For example, this is a reasonably strong inductive argument: ... If the arguer believes that the truth of the premises definitely establishes the truth of the conclusion, then the argument is deductive."

  • Examples of Inductive Reasoning (Click link and scroll down for examples)

"The term 'inductive reasoning' refers to reasoning that takes specific information and makes a broader generalization that is considered probable, allowing for the fact that the conclusion may not be accurate."

  • Last Updated: Mar 13, 2024 3:04 PM
  • URL: https://library.fvtc.edu/Thinking


Jessie Ball duPont Library

Critical Thinking Skills: What Is Inductive Reasoning?

Inductive reasoning: conclusion merely likely

Inductive reasoning begins with observations that are specific and limited in scope, and proceeds to a generalized conclusion that is likely, but not certain, in light of accumulated evidence. You could say that inductive reasoning moves from the specific to the general. Much scientific research is carried out by the inductive method: gathering evidence, seeking patterns, and forming a hypothesis or theory to explain what is seen.

Conclusions reached by the inductive method are not logical necessities; no amount of inductive evidence guarantees the conclusion. This is because there is no way to know that all the possible evidence has been gathered, and that there exists no further bit of unobserved evidence that might invalidate my hypothesis. Thus, while the newspapers might report the conclusions of scientific research as absolutes, scientific literature itself uses more cautious language, the language of inductively reached, probable conclusions:

What we have seen is the ability of these cells to feed the blood vessels of tumors and to heal the blood vessels surrounding wounds. The findings suggest that these adult stem cells may be an ideal source of cells for clinical therapy. For example, we can envision the use of these stem cells for therapies against cancer tumors [...].

Because inductive conclusions are not logical necessities, inductive arguments are not simply true. Rather, they are cogent: that is, the evidence seems complete, relevant, and generally convincing, and the conclusion is therefore probably true. Nor are inductive arguments simply false; rather, they are not cogent.

It is an important difference from deductive reasoning that, while inductive reasoning cannot yield an absolutely certain conclusion, it can actually increase human knowledge (it is ampliative). It can make predictions about future events or as-yet unobserved phenomena.

For example, Albert Einstein observed the movement of a pocket compass when he was five years old and became fascinated with the idea that something invisible in the space around the compass needle was causing it to move. This observation, combined with additional observations (of moving trains, for example) and the results of logical and mathematical tools (deduction), resulted in a rule that fit his observations and could predict events that were as yet unobserved.

  • Last Updated: Jul 28, 2020 5:53 PM
  • URL: https://library.sewanee.edu/critical_thinking


21 Arguments VI: Inductive Arguments

I. Introduction 

The last chapter introduced the distinction between deductive and inductive arguments. Deductive arguments are those whose conclusion is supposed to follow with logical necessity from the premises, while inductive arguments are those that aim to establish a conclusion as only being probably true, given the premises.

To define arguments in this way is to define them in terms of their aim or intent. That means there is a question as to whether the intention succeeds or not. One of the things that marks out inductive from deductive arguments is that there are degrees of success possible for inductive arguments. That’s because we’re reasoning about claims that involve probability and uncertainty. Deductive arguments, by contrast, are like on-off switches. Either they work logically, or they don’t.

So, let’s start this chapter by defining terms that will let us talk about successful and unsuccessful inductive arguments. For this we use the language of strength and weakness, which, just like real strength and weakness, come in degrees.

  • A strong argument is an inductive argument that succeeds in having its conclusion be probably true, given the truth of the premises.
  • A weak argument is an inductive argument that fails in having its conclusion be probably true, even given the truth of the premises.

With this in mind, let’s next see how we can identify inductive arguments. Then we’ll put these things together and see how we can determine when the arguments we’ve identified are strong or weak.

II. Identifying Inductive Arguments

There’s an art to knowing when you’ve got an inductive argument on your hands, and with practice you can become good at it. The general idea is to think about the reasons being offered for a claim and to ask whether someone offering those reasons is doing so in a way that says or implies that there is some uncertainty involved, and thus some degree of probability of the conclusion being true. Here are some things to look for that can help you do this.

1. Just as the parts of an argument often have indicator words that clue you in to what you’re looking at, there are certain words and phrases that often – but not always! – show up to make clear that a conclusion is being inductively rather than deductively drawn. These include the following (but I bet you can think of other words that function in a similar way once you see how these work):

Inductive Argument Indicator Words

there’s a good chance that

in all likelihood

2. Arguments whose conclusion is a claim about the future, where it is made based on the way things have typically happened in the past, are inductive. That’s because, even if experience provides a good basis for making predictions, the future is not fully knowable and can always surprise us.

Arguments for Claims about the Future

It rained yesterday, and it’s raining today. Therefore, it will rain tomorrow.

The Pirates did not make it to the playoffs the last three years in a row. So, they probably won’t make it to the playoffs next year either.

The sun will rise in the east tomorrow, for it has risen in the east every day for the last 4.5 billion years.

The Democrats will likely pick up seats in Congress, because midterm election years tend to favor the party that’s in the minority.

3. Generalizations are arguments that involve making a claim about a large group of things (or people) based on what is known about a small subset of that group. These are always inductive. Lots of claims in social sciences, health sciences, and the like, are supported by arguments of this kind. We’ll go into these in detail in the next chapter.

Examples of Generalizations

You have shortness of breath, coughing, and have lost your sense of smell. Those are widespread symptoms among people diagnosed with COVID-19. There’s thus a good chance that you are infected with COVID-19.

You should eat more fiber in your diet, for a study of several hundred people on a high-fiber diet showed a variety of improved health outcomes, and you should try to have good health outcomes.

A large survey of people that looked at their salaries and levels of educational achievement found that most of those with higher salaries had completed at least a four-year college degree. Therefore, people who complete a college degree tend to earn more than those who do not.

A vaccine was tested on a sample of several thousand people and it proved effective. Therefore, the vaccine will be effective for the entire population.

I met a person from the Upper Peninsula of Michigan the other day, and she had a very strange accent. I bet all those people up there talk strangely.
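The examples above differ sharply in how much of the group was actually observed, from several thousand people down to one. A rough way to see why that matters is the standard margin-of-error approximation for a sample proportion. It is not drawn from the text, and it assumes a simple random sample, which none of the examples states. The function name margin_of_error and the default z value of 1.96 (roughly a 95% level) are choices made for this sketch.

```python
import math


def margin_of_error(sample_proportion, sample_size, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = sample_proportion
    return z * math.sqrt(p * (1 - p) / sample_size)


# Same observed proportion, increasingly large samples.
for n in (10, 300, 5000):
    print(n, round(margin_of_error(0.5, n), 3))
```

The margin shrinks as the sample grows, which is one reason a generalization from several thousand people is stronger than one from a handful. The approximation is unreliable for very small samples and says nothing about whether the sample was representative, so a generalization from a single encounter gets no support from it at all.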

4. Analogies/analogical arguments are arguments in which a prediction is made about how two things (or people or groups) are likely to be similar based on known similarities between them. These are always inductive. (These will also be dealt with in more detail in the chapter after the one on generalizations.)

Examples of Analogies/Analogical Arguments

Cats have eyes that are physiologically very similar to human eyes. When mascara was put into cats’ eyes in a study, they became mildly inflamed. Thus, it’s likely that the same mascara will inflame human eyes if it comes into contact with them.

Dogs have four legs and like to bark. Cats have four legs. Therefore, cats probably like to bark too.

McDonald’s has inexpensive burgers and fries, and it pays its workers low wages. Burger King has inexpensive burgers and fries. So I bet it pays its workers low wages too.

Note: the first of these examples of analogies is a real-world example of how animals are used to test the safety of cosmetic products. There is a factual issue about the effectiveness of such testing as well as an ethical issue about whether it is inhumane or not. The analogical argument given here presumes that such testing is an effective way to determine the truth of non-evaluative claims, such as the conclusion in this argument about mascara and its effect on human eyes. But even if such non-evaluative arguments are strong, they do not settle one way or another the issue of the ethical acceptability of such testing.

Phil-P102 Critical Thinking and Applied Ethics Copyright © 2020 by R. Matthew Shockey is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License , except where otherwise noted.


Inductive Reasoning

LESSON SUMMARY

This lesson shows how to recognize and construct an inductive argument. These arguments move from specific facts to general conclusions by using common sense and/or past experience.

Induction is the process of reasoning from the specific (particular facts or instances) to the general (principles, theories, rules). It uses two premises that support the probable truth of the conclusion. Thus, an inductive argument looks like this: If A is true and B is true, then C is probably true.

How can you determine or measure what is probable or improbable? By using two things:

1. past experience

2. common sense

Past experience tells you what you might be able to expect. For instance, "for the past three weeks, my colleague has shown up a half hour late for work. Today, she will probably be late, too." Common sense allows you to draw an inference, or a "smart guess," based on the premises, such as, "They need five people on the team. I'm one of the strongest of the seven players at the tryouts. It's likely that I will be picked for the team."

Because you must make a leap from the premises to the truth of the conclusion, inductive reasoning is more likely to fail and produce fallacies, such as a hasty conclusion fallacy (see Lesson 15 to learn about these fallacies). Even so, most reasoning is inductive. One of the basic theories of modern biology, cell theory, is a product of inductive reasoning. It states that because every organism that has been observed is made up of cells, it is most likely that all living things are made up of cells.

There are two forms of inductive arguments. Those that compare one thing, event, or idea to another to see if they are similar are called comparative arguments. Those that try to determine cause from effect are causal arguments.


Unit 1: What Is Philosophy?

LOGOS: Critical Thinking, Arguments, and Fallacies

Heather Wilburn, Ph.D

Critical Thinking:

With respect to critical thinking, it seems that everyone uses this phrase. Yet, there is a fear that this is becoming a buzz-word (i.e. a word or phrase you use because it’s popular or enticing in some way). Ultimately, this means that we may be using the phrase without a clear sense of what we even mean by it. So, here we are going to think about what this phrase might mean and look at some examples. As a former colleague of mine, Henry Imler, explains:

By critical thinking, we refer to thinking that is recursive in nature. Any time we encounter new information or new ideas, we double back and rethink our prior conclusions on the subject to see if any other conclusions are better suited. Critical thinking can be contrasted with Authoritarian thinking. This type of thinking seeks to preserve the original conclusion. Here, thinking and conclusions are policed, as to question the system is to threaten the system. And threats to the system demand a defensive response. Critical thinking is short-circuited in authoritarian systems so that the conclusions are conserved instead of being open for revision. [1]

A condition for being recursive is to be open and not arrogant. If we come to a point where we think we have a handle on what is True, we are no longer open to consider, discuss, or accept information that might challenge our Truth. One becomes closed off and rejects everything that is different or strange, out of sync with one’s own Truth. To be open and recursive entails a sense of thinking about your beliefs in a critical and reflective way, so that you have a chance to either strengthen your belief system or revise it if needed. I have been teaching philosophy and humanities classes for nearly 20 years; critical thinking is the single most important skill you can develop. In close but second place is communication. In my view, communication skills follow as a natural result of critical thinking because you are attempting to think through and articulate stronger and rationally justified views. At the risk of sounding cliche, education isn’t about instilling content; it is about learning how to think.

In your philosophy classes your own ideas and beliefs will very likely be challenged. This does not mean that you will be asked to abandon your beliefs, but it does mean that you might be asked to defend them. Additionally, your mind will probably be twisted and turned about, which can be an uncomfortable experience. Yet, if at all possible, you should cherish these experiences and allow them to help you grow as a thinker. To be challenged and perplexed is difficult; however, it is worthwhile because it compels deeper thinking and more significant levels of understanding. In turn, thinking itself can transform us not only in thought, but in our beliefs and our actions. Hannah Arendt, a social and political philosopher who came to the United States in exile during WWII, relates the transformative elements of philosophical thinking to Socrates. She writes:

Socrates…who is commonly said to have believed in the teachability of virtue, seems to have held that talking and thinking about piety, justice, courage, and the rest were liable to make men more pious, more just, more courageous, even though they were not given definitions or “values” to direct their further conduct. [2]

Thinking and communication are transformative insofar as these activities have the potential to alter our perspectives and, thus, change our behavior. In fact, Arendt connects the ability to think critically and reflectively to morality. As she notes above, morality does not have to give a predetermined set of rules to affect our behavior. Instead, morality can also be related to the open and sometimes perplexing conversations we have with others (and ourselves) about moral issues and moral character traits. Theodor W. Adorno, another philosopher who came to the United States in exile during WWII, argues that autonomous thinking (i.e. thinking for oneself) is crucial if we want to prevent the occurrence of another event like Auschwitz, a concentration camp where over 1 million individuals died during the Holocaust. [3] To think autonomously entails reflective and critical thinking: a type of thinking rooted in philosophical activity and a type of thinking that questions and challenges social norms and the status quo. In this sense thinking is critical of what is, allowing us to think beyond what is and to think about what ought to be, or what ought not be. This is one of the transformative elements of philosophical activity and one that is useful in promoting justice and ethical living.

With respect to the meaning of education, the German philosopher Hegel uses the term bildung, which means education or upbringing, to indicate the differences between the traditional type of education that focuses on facts and memorization, and education as transformative. Allen Wood explains how Hegel uses the term bildung: it is “a process of self-transformation and an acquisition of the power to grasp and articulate the reasons for what one believes or knows.” [4] If we think back through all of our years of schooling, particularly those subject matters that involve the teacher passing on information that is to be memorized and repeated, most of us would be hard pressed to recall anything substantial. However, if the focus of education is on how to think and on the development of skills that include analyzing, synthesizing, and communicating ideas and problems, most of us will use those skills whether we are in the field of philosophy, politics, business, nursing, computer programming, or education. In this sense, philosophy can help you develop a strong foundational skill set that will be marketable for your individual paths. While philosophy is not the only subject that will foster these skills, its method is one that heavily focuses on the types of activities that will help you develop such skills.

Let’s turn to discuss arguments. Arguments consist of a set of statements, which are claims that something is or is not the case, or is either true or false. The conclusion of your argument is a statement that is being argued for, or the point of view being argued for. The other statements serve as evidence or support for your conclusion; we refer to these statements as premises. It’s important to keep in mind that a statement is either true or false, so questions, commands, or exclamations are not statements. If we are thinking critically we will not accept a statement as true or false without good reason(s), so our premises are important here. Keep in mind the idea that supporting statements are called premises and the statement that is being supported is called the conclusion. Here are a couple of examples:

Example 1: Capital punishment is morally justifiable since it restores some sense of balance to victims or victims’ families.

Let’s break it down so it’s easier to see in what we might call a typical argument form:

Premise: Capital punishment restores some sense of balance to victims or victims’ families.

Conclusion: Capital punishment is morally justifiable.

Example 2 : Because innocent people are sometimes found guilty and potentially executed, capital punishment is not morally justifiable.

Premise: Innocent people are sometimes found guilty and potentially executed.

Conclusion: Capital punishment is not morally justifiable.

It is worth noting the use of the terms “since” and “because” in these arguments. Terms or phrases like these often serve as signifiers that we are looking at evidence, or a premise.

Check out another example:

Example 3 : All human beings are mortal. Heather is a human being. Therefore, Heather is mortal.

Premise 1: All human beings are mortal.

Premise 2: Heather is a human being.

Conclusion: Heather is mortal.

In this example, there are a couple of things worth noting: First, there can be more than one premise. In fact, you could have a rather complex argument with several premises. If you’ve written an argumentative paper you may have encountered arguments that are rather complex. Second, just as the arguments prior had signifiers to show that we are looking at evidence, this argument has a signifier (i.e. therefore) to demonstrate the argument’s conclusion.

So many arguments!!! Are they all equally good?

No, arguments are not equally good; there are many ways to make a faulty argument. In fact, there are a lot of different types of arguments and, to some extent, the type of argument can help us figure out if the argument is a good one. For a full elaboration of arguments, take a logic class! Here’s a brief version:

Deductive Arguments: in a deductive argument the conclusion necessarily follows from the premises. Take argument Example 3 above. It is absolutely necessary that Heather is a mortal, if she is a human being and if mortality is a specific condition for being human. We know that all humans die, so that’s tight evidence. This argument would be a very good argument; it is valid (i.e. the conclusion necessarily follows from the premises) and it is sound (i.e. it is valid and all the premises are true).

Inductive Arguments : in an inductive argument the conclusion likely (at best) follows from the premises. Let’s have an example:

Example 4 : 98.9% of all TCC students like pizza. You are a TCC student. Thus, you like pizza.

Premise 1: 98.9% of all TCC students like pizza

Premise 2: You are a TCC student.

Conclusion: You like pizza. (*Thus is a conclusion indicator)

In this example, the conclusion doesn’t necessarily follow; it likely follows. But you might be part of that 1.1% for whatever reason. Inductive arguments are good arguments if they are strong. So, instead of saying an inductive argument is valid, we say it is strong. You can also use the term sound to describe the truth of the premises, if they are true. Let’s suppose they are true and you absolutely love Hideaway pizza. Let’s also assume you are a TCC student. So, the argument is really strong and it is sound.

There are many types of inductive argument, including: causal arguments, arguments based on probabilities or statistics, arguments that are supported by analogies, and arguments that are based on some type of authority figure. So, when you encounter an argument based on one of these types, think about how strong the argument is. If you want to see examples of the different types, a web search (or a logic class!) will get you where you need to go.

Some arguments are faulty, not necessarily because of the truth or falsity of the premises, but because they rely on psychological and emotional ploys. These are bad arguments because people shouldn’t accept your conclusion if you are using scare tactics or distracting and manipulating reasoning. Arguments that have this issue are called fallacies. There are a lot of fallacies, so, again, if you want to know more a web search will be useful. We are going to look at several that seem to be the most relevant for our day-to-day experiences.

  • Inappropriate Appeal to Authority : We are definitely going to use authority figures in our lives (e.g. doctors, lawyers, mechanics, financial advisors, etc.), but we need to make sure that the authority figure is a reliable one.

Things to look for here might include: reputation in the field, not holding widely controversial views, experience, education, and the like. So, if we take an authority figure’s word and they’re not legit, we’ve committed the fallacy of appeal to authority.

Example 5 : I think I am going to take my investments to Voya. After all, Steven Adams advocates for Voya in an advertisement I recently saw.

If we look at the criteria for evaluating arguments that appeal to authority figures, it is pretty easy to see that Adams is not an expert in the finance field. Thus, this is an inappropriate appeal to authority.

  • Slippery Slope Arguments : Slippery slope arguments are found everywhere it seems. The essential characteristic of a slippery slope argument is that it uses problematic premises to argue that doing ‘x’ will ultimately lead to other actions that are extreme, unlikely, and disastrous. You can think of this type of argument as a faulty chain of events or domino effect type of argument.

Example 6 : If you don’t study for your philosophy exam you will not do well on the exam. This will lead to you failing the class. The next thing you know you will have lost your scholarship, dropped out of school, and will be living on the streets without any chance of getting a job.

While you should certainly study for your philosophy exam, if you don’t it is unlikely that this will lead to your full economic demise.

One challenge to evaluating slippery slope arguments is that they are predictions, so we cannot be certain about what will or will not actually happen. But this chain of events type of argument should be assessed in terms of whether the outcome will likely follow if action ‘x’ is pursued.

  • Faulty Analogy : We often make arguments based on analogy and these can be good arguments. But we often use faulty reasoning with analogies and this is what we want to learn how to avoid.

When evaluating an argument that is based on an analogy here are a few things to keep in mind: you want to look at the relevant similarities and the relevant differences between the things that are being compared. As a general rule, if there are more differences than similarities the argument is likely weak.

Example 7 : Alcohol is legal. Therefore, we should legalize marijuana too.

So, the first step here is to identify the two things being compared, which are alcohol and marijuana. Next, note relevant similarities and differences. These might include effects on health, community safety, economic factors, criminal justice factors, and the like.

This is probably not the best argument in support of marijuana legalization. It would seem that one could just as easily conclude that since marijuana is illegal, alcohol should be too. In fact, one might find that alcohol is an often abused and highly problematic drug for many people, so it is too risky to legalize marijuana if it is similar to alcohol.

  • Appeal to Emotion : Arguments should be based on reason and evidence, not emotional tactics. When we use an emotional tactic, we are essentially trying to manipulate someone into accepting our position by evoking pity or fear, when our positions should actually be backed by reasonable and justifiable evidence.

Example 8 : Officer please don’t give me a speeding ticket. My girlfriend broke up with me last night, my alarm didn’t go off this morning, and I’m late for class.

While this is a really horrible start to one’s day, being broken up with and an alarm malfunctioning are not justifiable reasons for speeding.

Example 9 : Professor, I’d like you to remember that my mother is a dean here at TCC. I’m sure that she will be very disappointed if I don’t receive an A in your class.

This is a scare tactic and is not a good way to make an argument. Scare tactics can come in the form of psychological or physical threats; both forms are to be avoided.

  • Appeal to Ignorance : This fallacy occurs when our argument relies on lack of evidence when evidence is actually needed to support a position.

Example 10 : No one has proven that sasquatch doesn’t exist; therefore it does exist.

Example 11 : No one has proven God exists; therefore God doesn’t exist.

The key here is that lack of evidence against something cannot be an argument for it, and lack of evidence for something cannot be an argument against it. Lack of evidence can only show that we are ignorant of the facts.

  • Straw Man : A straw man argument is a specific type of argument that is intended to weaken an opponent’s position so that it is easier to refute. So, we create a weaker version of the original argument (i.e. a straw man argument), so when we present it everyone will agree with us and denounce the original position.

Example 12 : Women are crazy arguing for equal treatment. No one wants women hanging around men’s locker rooms or saunas.

This is a misrepresentation of arguments for equal treatment. Women (and others arguing for equal treatment) are not trying to obtain equal access to men’s locker rooms or saunas.

The best way to avoid this fallacy is to make sure that you are not oversimplifying or misrepresenting others’ positions. Even if we don’t agree with a position, we want to make the strongest case against it and this can only be accomplished if we can refute the actual argument, not a weakened version of it. So, let’s all bring the strongest arguments we have to the table!

  • Red Herring : A red herring is a distraction or a change in subject matter. Sometimes this is subtle, but if you find yourself feeling lost in the argument, take a close look and make sure there is not an attempt to distract you.

Example 13 : Can you believe that so many people are concerned with global warming? The real threat to our country is terrorism.

It could be the case that both global warming and terrorism are concerns for us. But the red herring fallacy is committed when someone tries to distract you from the argument at hand by bringing up another issue or side-stepping a question. Politicians are masters at this, by the way.

  • Appeal to the Person : This fallacy is also referred to as the ad hominem fallacy. We commit this fallacy when we dismiss someone’s argument or position by attacking them instead of refuting the premises or support for their argument.

Example 14 : I am not going to listen to what Professor ‘X’ has to say about the history of religion. He told one of his previous classes he wasn’t religious.

The problem here is that the student is dismissing course material based on the professor’s religious views and not evaluating the course content on its own ground.

To avoid this fallacy, make sure that you target the argument or their claims and not the person making the argument in your rebuttal.

  • Hasty Generalization : We make and use generalizations on a regular basis and in all types of decisions. We rely on generalizations when trying to decide which schools to apply to, which phone is the best for us, which neighborhood we want to live in, what type of job we want, and so on. Generalizations can be strong and reliable, but they can also be fallacious. There are three main ways in which a generalization can commit a fallacy: your sample size is too small, your sample is not representative of the group you are making a generalization about, or your data could be outdated.

Example 15 : I had horrible customer service at the last Starbucks I was at. It is clear that Starbucks employees do not care about their customers. I will never visit another Starbucks again.

The problem with this generalization is that the claim made about all Starbucks is based on one experience. While it is tempting to not spend your money where people are rude to their customers, this is only one employee and presumably doesn’t reflect all employees or the company as a whole. So, to make this a stronger generalization we would want to have a larger sample size (multiple horrible experiences) to support the claim. Let’s look at a second hasty generalization:

Example 16 : I had horrible customer service at the Starbucks on 81st street. It is clear that Starbucks employees do not care about their customers. I will never visit another Starbucks again.

The problem with this generalization mirrors the previous problem in that the claim is based on only one experience. But there’s an additional issue here as well, which is that the claim is based on an experience at one location. To make a claim about the whole company, our sample group needs to be larger than one and it needs to come from a variety of locations.

  • Begging the Question : An argument begs the question when the argument’s premises assume the conclusion, instead of providing support for the conclusion. One common form of begging the question is referred to as circular reasoning.

Example 17 : Of course everyone wants to see the new Marvel movie, because it is the most popular movie right now!

The conclusion here is that everyone wants to see the new Marvel movie, but the premise simply assumes that is the case by claiming it is the most popular movie. Remember the premise should give reasons for the conclusion, not merely assume it to be true.

  • Equivocation : In the English language there are many words that have different meanings (e.g. bank, good, right, steal, etc.). When we use the same word but shift its meaning without explaining this move to our audience, we equivocate, and this is a fallacy. So, if you must use the same word more than once and with more than one meaning, you need to explain that you’re shifting the meaning you intend. Most of the time, though, it is just easier to use a different word.

Example 18 : Yes, philosophy helps people argue better, but should we really encourage people to argue? There is enough hostility in the world.

Here, argue is used in two different senses. The meaning of the first refers to the philosophical meaning of argument (i.e. premises and a conclusion), whereas the second sense is in line with the common use of argument (i.e. yelling between two or more people, etc.).

[1] Henry Imler, ed., Phronesis An Ethics Primer with Readings, (2018). 7-8.
[2] Hannah Arendt, “Thinking and Moral Considerations,” Social Research, 38:3 (1971: Autumn): 431.
[3] Theodor W. Adorno, “Education After Auschwitz,” in Can One Live After Auschwitz, ed. by Rolf Tiedemann, trans. by Rodney Livingstone (Stanford: Stanford University Press, 2003): 23.
[4] Allen W. Wood, “Hegel on Education,” in Philosophers on Education: New Historical Perspectives, ed. Amelie O. Rorty (London: Routledge 1998): 302.

LOGOS: Critical Thinking, Arguments, and Fallacies Copyright © 2020 by Heather Wilburn, Ph.D is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Inductive Logic

An inductive logic is a logic of evidential support. In a deductive logic, the premises of a valid deductive argument logically entail the conclusion, where logical entailment means that every logically possible state of affairs that makes the premises true must make the conclusion true as well. Thus, the premises of a valid deductive argument provide total support for the conclusion. An inductive logic extends this idea to weaker arguments. In a good inductive argument, the truth of the premises provides some degree of support for the truth of the conclusion, where this degree-of-support might be measured via some numerical scale. By analogy with the notion of deductive entailment, the notion of inductive degree-of-support might mean something like this: among the logically possible states of affairs that make the premises true, the conclusion must be true in (at least) proportion r of them—where r is some numerical measure of the support strength.
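The proportion-of-states reading of degree-of-support can be made concrete with a small sketch. It assumes a toy propositional language with three atoms and treats every truth assignment as an equally weighted "state of affairs"; that equal weighting, and the names `support`, `deductive`, and `partial`, are illustrative assumptions rather than anything the article commits to.

```python
from itertools import product

def support(premises, conclusion, atoms):
    """Proportion of truth assignments satisfying all premises
    that also satisfy the conclusion (assumes equal weighting)."""
    worlds = [dict(zip(atoms, values))
              for values in product([True, False], repeat=len(atoms))]
    premise_worlds = [w for w in worlds if all(p(w) for p in premises)]
    if not premise_worlds:
        raise ValueError("premises are jointly unsatisfiable")
    favorable = sum(1 for w in premise_worlds if conclusion(w))
    return favorable / len(premise_worlds)

atoms = ["p", "q", "r"]

# Deductive case: from "p" and "if p then q", infer "q".
deductive = support(
    premises=[lambda w: w["p"], lambda w: (not w["p"]) or w["q"]],
    conclusion=lambda w: w["q"],
    atoms=atoms,
)

# Weaker case: from "p or q or r", infer "p".
partial = support(
    premises=[lambda w: w["p"] or w["q"] or w["r"]],
    conclusion=lambda w: w["p"],
    atoms=atoms,
)

print(deductive, partial)
```

In the first argument the conclusion holds in every state that makes the premises true, so the proportion is 1, which matches deductive entailment. In the second, the conclusion holds in only 4 of the 7 states that satisfy the premise, so the support is partial.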

If a logic of good inductive arguments is to be of any real value, the measure of support it articulates should be up to the task. Presumably, the logic should at least satisfy the following condition:

Criterion of Adequacy (CoA): The logic should make it likely (as a matter of logic) that as evidence accumulates, the total body of true evidence claims will eventually come to indicate, via the logic’s measure of support, that false hypotheses are probably false and that true hypotheses are probably true.

The CoA stated here may strike some readers as surprisingly strong. Given a specific logic of evidential support, how might it be shown to satisfy such a condition? Section 4 will show precisely how this condition is satisfied by the logic of evidential support articulated in Sections 1 through 3 of this article.

This article will focus on the kind of approach to inductive logic most widely studied by epistemologists and logicians in recent years. This approach employs conditional probability functions to represent measures of the degree to which evidence statements support hypotheses. Presumably, hypotheses should be empirically evaluated based on what they say (or imply) about the likelihood that evidence claims will be true. A straightforward theorem of probability theory, called Bayes’ Theorem, articulates the way in which what hypotheses say about the likelihoods of evidence claims influences the degree to which hypotheses are supported by those evidence claims. Thus, this approach to the logic of evidential support is often called a Bayesian Inductive Logic or a Bayesian Confirmation Theory. This article will first provide a detailed explication of a Bayesian approach to inductive logic. It will then examine the extent to which this logic may pass muster as an adequate logic of evidential support for hypotheses. In particular, we will see how such a logic may be shown to satisfy the Criterion of Adequacy stated above.
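As a rough illustration of how Bayes’ Theorem turns what hypotheses say about the evidence into degrees of support, here is a minimal sketch. It assumes a finite set of mutually exclusive, jointly exhaustive hypotheses; the function name `posteriors`, the labels `h1` and `h2`, and all the numbers are hypothetical, chosen only to show the arithmetic.

```python
def posteriors(priors, likelihoods):
    """Bayes' theorem over mutually exclusive, jointly exhaustive hypotheses.

    priors[h]      = P(h), plausibility of h before the evidence
    likelihoods[h] = P(e | h), how likely h says the evidence e is
    returns        P(h | e) for each h
    """
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())  # P(e), by the law of total probability
    if total == 0:
        raise ValueError("evidence has probability zero under every hypothesis")
    return {h: joint[h] / total for h in joint}

# Hypothetical numbers, for illustration only.
priors = {"h1": 0.5, "h2": 0.5}
likelihoods = {"h1": 0.8, "h2": 0.2}
result = posteriors(priors, likelihoods)
print(result)
```

With equal priors, the hypothesis that makes the evidence four times as likely ends up with four times the posterior probability of its rival.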

Sections 1 through 3 present all of the main ideas underlying the (Bayesian) probabilistic logic of evidential support. These three sections should suffice to provide an adequate understanding of the subject. Section 5 extends this account to cases where the implications of hypotheses about evidence claims (called likelihoods) are vague or imprecise. After reading Sections 1 through 3, the reader may safely skip directly to Section 5, bypassing the rather technical account in Section 4 of how the CoA is satisfied.

Section 4 is for the more advanced reader who wants an understanding of how this logic may bring about convergence to the true hypothesis as evidence accumulates. This result shows that the Criterion of Adequacy is indeed satisfied: as evidence accumulates, false hypotheses will very probably come to have evidential support values (as measured by their posterior probabilities) that approach 0; and as this happens, a true hypothesis may very probably acquire evidential support values (as measured by its posterior probability) that approach 1.
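The convergence claim can be illustrated, though not proved, with a small simulation. The sketch below assumes a deliberately simple setup that is not from the article: two rival hypotheses about a coin's bias, independent tosses generated by the true one, and equal priors. The names `posterior_of_true` and `sigmoid` and the bias values are hypothetical.

```python
import math
import random

def sigmoid(x):
    """Numerically stable conversion from log-odds to probability."""
    if x >= 0:
        return 1 / (1 + math.exp(-x))
    z = math.exp(x)
    return z / (1 + z)

def posterior_of_true(n_tosses, true_bias=0.7, rival_bias=0.5,
                      prior_true=0.5, seed=0):
    """Simulate tosses from `true_bias` and return the posterior probability
    of the true hypothesis (against one rival) after each toss."""
    rng = random.Random(seed)
    # Accumulate log-odds so long evidence streams do not underflow.
    log_odds = math.log(prior_true / (1 - prior_true))
    history = []
    for _ in range(n_tosses):
        heads = rng.random() < true_bias
        if heads:
            log_odds += math.log(true_bias / rival_bias)
        else:
            log_odds += math.log((1 - true_bias) / (1 - rival_bias))
        history.append(sigmoid(log_odds))
    return history

history = posterior_of_true(1000)
print(history[9], history[99], history[-1])
```

A single run only shows one evidence stream; the result described above is a claim about what happens with high probability across such streams, which is why Section 4 is needed to establish it.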

1. Inductive Arguments


Let us begin by considering some common kinds of examples of inductive arguments. Consider the following two arguments:

Example 1. Every raven in a random sample of 3200 ravens is black. This strongly supports the following conclusion: All ravens are black.

Example 2. 62 percent of voters in a random sample of 400 registered voters (polled on February 20, 2004) said that they favor John Kerry over George W. Bush for President in the 2004 Presidential election. This supports with a probability of at least .95 the following conclusion: Between 57 percent and 67 percent of all registered voters favor Kerry over Bush for President (at or around the time the poll was taken).

This kind of argument is often called an induction by enumeration . It is closely related to the technique of statistical estimation. We may represent the logical form of such arguments semi-formally as follows:

Premise: In random sample S consisting of n members of population B, the proportion of members that have attribute A is r.

Therefore, with degree of support p,

Conclusion: The proportion of all members of B that have attribute A is between \(r-q\) and \(r+q\) (i.e., lies within margin of error q of r).

Let’s lay out this argument more formally. The premise breaks down into three separate statements: [ 1 ]

Any inductive logic that treats such arguments should address two challenges. (1) It should tell us which enumerative inductive arguments should count as good inductive arguments. In particular, it should tell us how to determine the appropriate degree p to which such premises inductively support the conclusion, for a given margin of error q. (2) It should demonstrably satisfy the CoA. That is, it should be provable (as a metatheorem) that if a conclusion expressing the approximate proportion for an attribute in a population is true, then it is very likely that sufficiently numerous random samples of the population will provide true premises for good inductive arguments that confer degrees of support p approaching 1 for that true conclusion, where, on pain of triviality, these sufficiently numerous samples are only a tiny fraction of a large population. The supplement on Enumerative Inductions: Bayesian Estimation and Convergence shows precisely how a Bayesian account of enumerative induction may meet these two challenges.
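The numbers in Example 2 can be checked against the schema with a short calculation. This sketch uses the standard normal-approximation margin of error from classical statistics, which is an assumption made here for illustration and not the Bayesian estimation method the supplement develops; the function name `margin_of_error` is hypothetical.

```python
import math

def margin_of_error(r, n, z=1.96):
    """Normal-approximation margin of error q for a sample proportion r
    from a random sample of size n. z=1.96 gives roughly 95% coverage."""
    return z * math.sqrt(r * (1 - r) / n)

# Example 2: 62 percent of a random sample of 400 voters.
r, n = 0.62, 400
q = margin_of_error(r, n)
interval = (r - q, r + q)
print(round(q, 4), interval)
```

The computed margin is a little under 5 percentage points, so the resulting interval falls inside the 57 to 67 percent range stated in Example 2, consistent with its "at least .95" degree of support.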

Enumerative induction is, however, rather limited in scope. This form of induction is only applicable to the support of claims involving simple universal conditionals (i.e., claims of form ‘All B s are A s’) and claims about the proportion of an attribute in a population (i.e., claims of form ‘the frequency of A s among the B s is r ’). But, many important empirical hypotheses are not reducible to this simple form, and the evidence for these hypotheses is not composed of an enumeration of such instances. Consider, for example, the Newtonian Theory of Mechanics:

All objects remain at rest or in uniform motion unless acted upon by some external force. An object’s acceleration (i.e., the rate at which its motion changes from rest or from uniform motion) is in the same direction as the force exerted on it; and the rate at which the object accelerates due to a force is equal to the magnitude of the force divided by the object’s mass. If an object exerts a force on another object, the second object exerts an equal amount of force on the first object, but in the opposite direction to the force exerted by the first object.

The evidence for (and against) this theory is not gotten by examining a randomly selected subset of objects and the forces acting upon them. Rather, the theory is tested by calculating what this theory says (or implies) about observable phenomena in a wide variety of specific situations—e.g., ranging from simple collisions between small bodies to the trajectories of planets and comets—and then seeing whether those phenomena occur in the way that the theory says they will. This approach to testing hypotheses and theories is ubiquitous, and should be captured by an adequate inductive logic.

More generally, for a wide range of cases where inductive reasoning is important, enumerative induction is inadequate. Rather, the kind of evidential reasoning that judges the likely truth of hypotheses on the basis of what they say (or imply) about the evidence is more appropriate. Consider the kinds of inferences jury members are supposed to make, based on the evidence presented at a murder trial. The inference to probable guilt or innocence is based on a patchwork of evidence of various kinds. It almost never involves consideration of a randomly selected sequence of past situations when people like the accused committed similar murders. Or, consider how a doctor diagnoses her patient on the basis of his symptoms. Although the frequency of occurrence of various diseases when similar symptoms have been present may play a role, this is clearly not the whole story. Diagnosticians commonly employ a form of hypothesis evaluation: would the hypothesis that the patient has a brain tumor account for his symptoms? Or are these symptoms more likely the result of a minor stroke? Or may some other hypothesis better account for the patient’s symptoms? Thus, a fully adequate account of inductive logic should explicate the logic of hypothesis evaluation, through which a hypothesis or theory may be tested on the basis of what it says (or "predicts") about observable phenomena. In Section 3 we will see how a kind of probabilistic inductive logic called "Bayesian Inference" or "Bayesian Confirmation Theory" captures such reasoning. The full logical structure of such arguments will be spelled out in that section.

2. Inductive Logic and Inductive Probabilities

Perhaps the oldest and best understood way of representing partial belief, uncertain inference, and inductive support is in terms of probability and the equivalent notion odds . Mathematicians have studied probability for over 350 years, but the concept is certainly much older. In recent times a number of other, related representations of partial belief and uncertain inference have emerged. Some of these approaches have found useful application in computer based artificial intelligence systems that perform inductive inferences in expert domains such as medical diagnosis. Nevertheless, probabilistic representations have predominated in such application domains. So, in this article we will focus exclusively on probabilistic representations of inductive support. A brief comparative description of some of the most prominent alternative representations of uncertainty and support-strength can be found in the supplement Some Prominent Approaches to the Representation of Uncertain Inference .

The mathematical study of probability originated with Blaise Pascal and Pierre de Fermat in the mid-17th century. From that time through the early 19th century, as the mathematical theory continued to develop, probability theory was primarily applied to the assessment of risk in games of chance and to drawing simple statistical inferences about characteristics of large populations (e.g., to compute appropriate life insurance premiums based on mortality rates). In the early 19th century Pierre-Simon Laplace made further theoretical advances and showed how to apply probabilistic reasoning to a much wider range of scientific and practical problems. Since that time probability has become an indispensable tool in the sciences, business, and many other areas of modern life.

Throughout the development of probability theory various researchers appear to have thought of it as a kind of logic. But the first extended treatment of probability as an explicit part of logic was George Boole’s The Laws of Thought (1854). John Venn followed two decades later with an alternative empirical frequentist account of probability in The Logic of Chance (1876). Not long after that the whole discipline of logic was transformed by new developments in deductive logic.

In the late 19th and early 20th centuries Frege, followed by Russell and Whitehead, showed how deductive logic may be represented in the kind of rigorous formal system we now call quantified predicate logic. For the first time logicians had a fully formal deductive logic powerful enough to represent all valid deductive arguments that arise in mathematics and the sciences. In this logic the validity of deductive arguments depends only on the logical structure of the sentences involved. This development in deductive logic spurred some logicians to attempt to apply a similar approach to inductive reasoning. The idea was to extend the deductive entailment relation to a notion of probabilistic entailment for cases where premises provide less than conclusive support for conclusions. These partial entailments are expressed in terms of conditional probabilities, probabilities of the form \(P[C \pmid B] = r\) (read “the probability of C given B is r”), where P is a probability function, C is a conclusion sentence, B is a conjunction of premise sentences, and r is the probabilistic degree of support that premises B provide for conclusion C. Attempts to develop such a logic vary somewhat with regard to the ways in which they attempt to emulate the paradigm of formal deductive logic.

Some inductive logicians have tried to follow the deductive paradigm by attempting to specify inductive support probabilities solely in terms of the syntactic structures of premise and conclusion sentences. In deductive logic the syntactic structure of the sentences involved completely determines whether premises logically entail a conclusion. So these inductive logicians have attempted to follow suit. In such a system each sentence confers a syntactically specified degree of support on each of the other sentences of the language. Thus, the inductive probabilities in such a system are logical in the sense that they depend on syntactic structure alone. This kind of conception was articulated to some extent by John Maynard Keynes in his Treatise on Probability (1921). Rudolf Carnap pursued this idea with greater rigor in his Logical Foundations of Probability (1950) and in several subsequent works (e.g., Carnap 1952). (For details of Carnap’s approach see the section on logical probability in the entry on interpretations of the probability calculus , in this Encyclopedia .)

In the inductive logics of Keynes and Carnap, Bayes’ theorem, a straightforward theorem of probability theory, plays a central role in expressing how evidence comes to bear on hypotheses. Bayes’ theorem expresses how the probability of a hypothesis h on the evidence e , \(P[h \pmid e]\), depends on the probability that e should occur if h is true, \(P[e \pmid h]\), and on the probability of hypothesis h prior to taking the evidence into account, \(P[h]\) (called the prior probability of h ). (Later we’ll examine Bayes’ theorem in detail.) So, such approaches might well be called Bayesian logicist inductive logics. Other prominent Bayesian logicist attempts to develop a probabilistic inductive logic include the works of Jeffreys (1939), Jaynes (1968), and Rosenkrantz (1981).

It is now widely held that the core idea of this syntactic approach to Bayesian logicism is fatally flawed—that syntactic logical structure cannot be the sole determiner of the degree to which premises inductively support conclusions. A crucial facet of the problem faced by syntactic Bayesian logicism involves how the logic is supposed to apply in scientific contexts where the conclusion sentence is some scientific hypothesis or theory, and the premises are evidence claims. The difficulty is that in any probabilistic logic that satisfies the usual axioms for probabilities, the inductive support for a hypothesis must depend in part on its prior probability . This prior probability represents (arguably) how plausible the hypothesis is taken to be on the basis of considerations other than the observational and experimental evidence (e.g., perhaps due to various plausibility arguments). A syntactic Bayesian logicist must tell us how to assign values to these pre-evidential prior probabilities of hypotheses in a way that relies only on the syntactic logical structure of the hypothesis, perhaps based on some measure of syntactic simplicity. There are severe problems with getting this idea to work. Various kinds of examples seem to show that such an approach must assign intuitively quite unreasonable prior probabilities to hypotheses in specific cases (see the footnote cited near the end of Section 3.2 for details). Furthermore, for this idea to apply to the evidential support of real scientific theories, scientists would have to formalize theories in a way that makes their relevant syntactic structures apparent, and then evaluate theories solely on that syntactic basis (together with their syntactic relationships to evidence statements). Are we to evaluate alternative theories of gravitation, and alternative quantum theories, this way? This seems an extremely dubious approach to the evaluation of real scientific hypotheses and theories. 
Thus, it seems that logical structure alone may not suffice for the inductive evaluation of scientific hypotheses. (This issue will be treated in more detail in Section 3 , after we first see how probabilistic logics employ Bayes’ theorem to represent the evidential support for hypotheses as a function of prior probabilities together with evidential likelihoods .)

At about the time that the syntactic Bayesian logicist idea was developing, an alternative conception of probabilistic inductive reasoning was also emerging. This approach is now generally referred to as the Bayesian subjectivist or personalist approach to inductive reasoning (see, e.g., Ramsey 1926; De Finetti 1937; Savage 1954; Edwards, Lindman, & Savage 1963; Jeffrey 1983, 1992; Howson & Urbach 1993; Joyce 1999). This approach treats inductive probability as a measure of an agent’s degree-of-belief that a hypothesis is true, given the truth of the evidence. This approach was originally developed as part of a larger normative theory of belief and action known as Bayesian decision theory . The principal idea is that the strength of an agent’s desires for various possible outcomes should combine with her belief-strengths regarding claims about the world to produce optimally rational decisions. Bayesian subjectivists provide a logic of decision that captures this idea, and they attempt to justify this logic by showing that in principle it leads to optimal decisions about which of various risky alternatives should be pursued. On the Bayesian subjectivist or personalist account of inductive probability, inductive probability functions represent the subjective (or personal) belief-strengths of ideally rational agents, the kind of belief strengths that figure into rational decision making. (See the section on subjective probability in the entry on interpretations of the probability calculus , in this Encyclopedia .)

Elements of a logicist conception of inductive logic live on today as part of the general approach called Bayesian inductive logic. However, among philosophers and statisticians the term ‘Bayesian’ is now most closely associated with the subjectivist or personalist account of belief and decision. And the term ‘Bayesian inductive logic’ has come to carry the connotation of a logic that involves purely subjective probabilities. This usage is misleading since, for inductive logics, the Bayesian/non-Bayesian distinction should really turn on whether the logic gives Bayes’ theorem a prominent role or largely eschews the use of Bayes’ theorem in inductive inferences (as do the classical approaches to statistical inference developed by R. A. Fisher (1922) and by Neyman & Pearson (1967)). Indeed, any inductive logic that employs the same probability functions to represent both the probabilities of evidence claims due to hypotheses and the probabilities of hypotheses due to those evidence claims must be a Bayesian inductive logic in this broader sense, because Bayes’ theorem follows directly from the axioms that each probability function must satisfy, and Bayes’ theorem expresses a necessary connection between the probabilities of evidence claims due to hypotheses and the probabilities of hypotheses due to those evidence claims.

In this article the probabilistic inductive logic we will examine is a Bayesian inductive logic in this broader sense. This logic will not presuppose the subjectivist Bayesian theory of belief and decision, and will avoid the objectionable features of the syntactic version of Bayesian logicism. We will see that there are good reasons to distinguish inductive probabilities from degree-of-belief probabilities and from purely syntactic logical probabilities . So, the probabilistic logic articulated in this article will be presented in a way that depends on neither of these conceptions of what the probability functions are . However, this version of the logic will be general enough that it may be fitted to a Bayesian subjectivist or Bayesian syntactic-logicist program, if one desires to do that.

All logics derive from the meanings of terms in sentences. What we now recognize as formal deductive logic rests on the meanings (i.e., the truth-functional properties) of the standard logical terms. These logical terms, and the symbols we will employ to represent them, are as follows:

  • ‘not’, ‘\({\nsim}\)’;
  • ‘and’, ‘\(\cdot\)’;
  • ‘inclusive or’, ‘\(\vee\)’;
  • truth-functional ‘if-then’, ‘\(\supset\)’;
  • ‘if and only if’, ‘\(\equiv\)’;
  • ‘all’, ‘\(\forall\)’, and
  • ‘some’, ‘\(\exists\)’;
  • the identity relation, ‘=’.

The meanings of all other terms, the non-logical terms such as names and predicate and relational expressions, are permitted to “float free”. That is, the logical validity of deductive arguments depends neither on the meanings of the name and predicate and relation terms, nor on the truth-values of sentences containing them. It merely supposes that these non-logical terms are meaningful, and that sentences containing them have truth-values. Deductive logic then tells us that the logical structures of some sentences—i.e., the syntactic arrangements of their logical terms—preclude them from being jointly true of any possible state of affairs. This is the notion of logical inconsistency . The notion of logical entailment is inter-definable with it. A collection of premise sentences logically entails a conclusion sentence just when the negation of the conclusion is logically inconsistent with those premises.

An inductive logic must, it seems, deviate from the paradigm provided by deductive logic in several significant ways. For one thing, logical entailment is an absolute, all-or-nothing relationship between sentences, whereas inductive support comes in degrees-of-strength. For another, although the notion of inductive support is analogous to the deductive notion of logical entailment , and is arguably an extension of it, there seems to be no inductive logic extension of the notion of logical inconsistency —at least none that is inter-definable with inductive support in the way that logical inconsistency is inter-definable with logical entailment . Indeed, it turns out that when the unconditional probability of \((B\cdot{\nsim}A)\) is very nearly 0 (i.e., when \((B\cdot{\nsim}A)\) is “nearly inconsistent”), the degree to which B inductively supports A , \(P[A \pmid B]\), may range anywhere between 0 and 1.
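
The last point can be checked with simple arithmetic. The sketch below is illustrative only: the function name `support` and the specific probability values are assumptions chosen to show that holding the probability of \((B\cdot{\nsim}A)\) near 0 leaves \(P[A \pmid B]\) unconstrained.

```python
def support(p_b_and_a, p_b_and_not_a):
    """Return P[A | B] = P[B·A] / (P[B·A] + P[B·~A]), assuming P[B] > 0."""
    return p_b_and_a / (p_b_and_a + p_b_and_not_a)

# Hypothetical value: the unconditional probability of (B·~A) is tiny,
# so (B·~A) is "nearly inconsistent".
tiny = 1e-9

# Vary the unconditional probability of (B·A) while (B·~A) stays fixed.
values = [support(p_b_and_a, tiny) for p_b_and_a in (0.0, 1e-9, 1e-3)]
for v in values:
    print(v)
```

With these numbers the support for A by B is 0 in the first case, one half in the second, and very close to 1 in the third, even though the probability of \((B\cdot{\nsim}A)\) never changes.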

Another notable difference is that when B logically entails A , adding a premise C cannot undermine the logical entailment—i.e., \((C\cdot B)\) must logically entail A as well. This property of logical entailment is called monotonicity . But inductive support is nonmonotonic . In general, depending on what \(A, B\), and C mean, adding a premise C to B may substantially raise the degree of support for A , or may substantially lower it, or may leave it completely unchanged—i.e., \(P[A \pmid (C\cdot B)]\) may have a value much larger than \(P[A \pmid B]\), or may have a much smaller value, or it may have the same, or nearly the same value as \(P[A \pmid B]\).
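
A toy model illustrates all three possibilities. The population below, its counts, and the helper `prob` are hypothetical and serve only to show nonmonotonic behavior; here A is "flies", B is "is a bird", and three different sentences play the role of the added premise C.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population: (kind, flies, count). Bats are the only non-birds.
groups = [
    ("canary", True, 30),
    ("sparrow", True, 50),
    ("sparrow", False, 10),
    ("penguin", False, 10),
    ("bat", True, 20),
]

# "ringed" is assigned independently of everything else (half of each group).
worlds = [
    {"kind": kind, "flies": flies, "ringed": ringed, "weight": count}
    for (kind, flies, count), ringed in product(groups, (True, False))
]

def prob(conclusion, premise):
    """Conditional probability of conclusion given premise in the toy model."""
    premise_weight = sum(w["weight"] for w in worlds if premise(w))
    joint = sum(w["weight"] for w in worlds if premise(w) and conclusion(w))
    return Fraction(joint, premise_weight)

is_bird = lambda w: w["kind"] != "bat"
flies = lambda w: w["flies"]

base = prob(flies, is_bird)                                              # P[A | B]
with_canary = prob(flies, lambda w: is_bird(w) and w["kind"] == "canary")    # raised
with_penguin = prob(flies, lambda w: is_bird(w) and w["kind"] == "penguin")  # lowered
with_ringed = prob(flies, lambda w: is_bird(w) and w["ringed"])              # unchanged

print(base, with_canary, with_penguin, with_ringed)
```

In this model learning that the bird is a canary raises the support for "flies", learning that it is a penguin lowers it, and learning that it is ringed leaves it where it was.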

In a formal treatment of probabilistic inductive logic, inductive support is represented by conditional probability functions defined on sentences of a formal language L . These conditional probability functions are constrained by certain rules or axioms that are sensitive to the meanings of the logical terms (i.e., ‘not’, ‘and’, ‘or’, etc., the quantifiers ‘all’ and ‘some’, and the identity relation). The axioms apply without regard for what the other terms of the language may mean. In essence the axioms specify a family of possible support functions , \(\{P_{\beta}, P_{\gamma}, \ldots ,P_{\delta}, \ldots \}\) for a given language L . Although each support function satisfies these same axioms, the further issue of which among them provides an appropriate measure of inductive support is not settled by the axioms alone. That may depend on additional factors, such as the meanings of the non-logical terms (i.e., the names and predicate expressions) of the language.

A good way to specify the axioms of the logic of inductive support functions is as follows. These axioms are apparently weaker than the usual axioms for conditional probabilities. For instance, the usual axioms assume that conditional probability values are restricted to real numbers between 0 and 1. The following axioms do not assume this, but only that support functions assign some real numbers as values for support strengths. However, it turns out that the following axioms suffice to derive all the usual axioms for conditional probabilities (including the usual restriction to values between 0 and 1). We draw on these weaker axioms only to forestall some concerns about whether the support function axioms may assume too much, or may be overly restrictive.

Let L be a language for predicate logic with identity, and let ‘\(\vDash\)’ be the standard logical entailment relation—i.e., the expression ‘\(B \vDash A\)’ says “ B logically entails A ” and the expression ‘\(\vDash A\)’ says “ A is a tautology”. A support function is a function \(P_{\alpha}\) from pairs of sentences of L to real numbers that satisfies the following axioms:

  • (1) \(P_{\alpha}[E \pmid F] \ne P_{\alpha}[G \pmid H]\) for at least some sentences \(E, F, G\), and H .

For all sentences \(A, B, C\), and D :

  • (2) If \(B \vDash A\), then \(P_{\alpha}[A \pmid B] \ge P_{\alpha}[C \pmid D]\);
  • (3) \(P_{\alpha}[A \pmid (B \cdot C)] = P_{\alpha}[A \pmid (C \cdot B)]\);
  • (4) If \(C \vDash{\nsim}(B \cdot A)\), then either \[P_{\alpha}[(A \vee B) \pmid C] = P_{\alpha}[A \pmid C] + P_{\alpha}[B \pmid C]\] or else \[P_{\alpha}[E \pmid C] = P_{\alpha}[C \pmid C]\] for every sentence E ;
  • (5) \(P_{\alpha}[(A \cdot B) \pmid C] = P_{\alpha}[A \pmid (B \cdot C)] \times P_{\alpha}[B \pmid C]\).
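
To see that the axioms can be jointly satisfied, it may help to check them on a small model. The sketch below makes several simplifying assumptions that are not part of the article: sentences are represented as sets of three hypothetical "possible worlds", entailment is modeled as set inclusion, conjunction and disjunction as intersection and union, and the support function is a ratio of world weights with the convention that a zero-weight premise supports everything to degree 1.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical weights on three possible worlds (they sum to 1).
weights = {"w1": Fraction(1, 2), "w2": Fraction(1, 3), "w3": Fraction(1, 6)}
WORLDS = frozenset(weights)

def all_sentences():
    """Every proposition in the model, i.e. every subset of WORLDS."""
    items = sorted(WORLDS)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def measure(s):
    return sum((weights[w] for w in s), Fraction(0))

def P(a, b):
    """Support of a by b; a zero-weight premise supports everything fully."""
    if measure(b) == 0:
        return Fraction(1)
    return measure(a & b) / measure(b)

SENTENCES = all_sentences()

def axiom1():
    return any(P(e, f) != P(g, h)
               for e, f, g, h in product(SENTENCES, repeat=4))

def axiom2():
    # b <= a is set inclusion, standing in for "B entails A".
    return all(P(a, b) >= P(c, d)
               for a, b, c, d in product(SENTENCES, repeat=4) if b <= a)

def axiom3():
    return all(P(a, b & c) == P(a, c & b)
               for a, b, c in product(SENTENCES, repeat=3))

def axiom4():
    for a, b, c in product(SENTENCES, repeat=3):
        if not (a & b & c):  # C entails ~(B·A)
            additive = P(a | b, c) == P(a, c) + P(b, c)
            degenerate = all(P(e, c) == P(c, c) for e in SENTENCES)
            if not (additive or degenerate):
                return False
    return True

def axiom5():
    return all(P(a & b, c) == P(a, b & c) * P(b, c)
               for a, b, c in product(SENTENCES, repeat=3))

results = {1: axiom1(), 2: axiom2(), 3: axiom3(), 4: axiom4(), 5: axiom5()}
print(results)
```

Exact fractions are used so that the equality checks are not disturbed by floating-point rounding. The model covers only a propositional fragment; it says nothing about quantifiers or identity.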

This axiomatization takes conditional probability as basic, as seems appropriate for evidential support functions . (These functions agree with the more usual unconditional probability functions when the latter are defined—just let \(P_{\alpha}[A] = P_{\alpha}[A \pmid (D \vee{\nsim}D)]\). However, these axioms permit conditional probabilities \(P_{\alpha}[A \pmid C]\) to remain defined even when condition statement C has probability 0—i.e., even when \(P_{\alpha}[C \pmid (D\vee{\nsim}D)] = 0\).)

Notice that conditional probability functions apply only to pairs of sentences, a conclusion sentence and a premise sentence. So, in probabilistic inductive logic we represent finite collections of premises by conjoining them into a single sentence. Rather than say,

A is supported to degree r by the set of premises \(\{B_1\), \(B_2\), \(B_3\),…, \(B_n\}\),

we instead say that

A is supported to degree r by the conjunctive premise \((((B_1\cdot B_2)\cdot B_3)\cdot \ldots \cdot B_n)\),

and write this as

\[P_{\alpha}[A \pmid (((B_1\cdot B_2)\cdot B_3)\cdot \ldots \cdot B_n)] = r.\]

The above axioms are quite weak. For instance, they do not say that logically equivalent sentences are supported by all other sentences to the same degree; rather, that result is derivable from these axioms (see result 6 below). Nor do these axioms say that logically equivalent sentences support all other sentences to the same degree; rather, that result is also derivable (see result 8 below). Indeed, from these axioms all of the usual theorems of probability theory may be derived. The following results are particularly useful in probabilistic logic. Their derivations from these axioms are provided in note 2. [ 2 ]

  1. If \(B \vDash A\), then \(P_{\alpha}[A \pmid B] = 1\).
  2. If \(C \vDash{\nsim}(B\cdot A)\), then either \[P_{\alpha}[(A \vee B) \pmid C] = P_{\alpha}[A \pmid C] + P_{\alpha}[B \pmid C]\] or else \(P_{\alpha}[E \pmid C] = 1\) for every sentence E .
  3. \(P_{\alpha}[{\nsim}A \pmid B] = 1 - P_{\alpha}[A \pmid B]\) or else \(P_{\alpha}[C \pmid B] = 1\) for every sentence C .
  4. \(1 \ge P_{\alpha}[A \pmid B] \ge 0\).
  5. If \(B \vDash A\), then \(P_{\alpha}[A \pmid C] \ge P_{\alpha}[B \pmid C]\).
  6. If \(B \vDash A\) and \(A \vDash B\), then \(P_{\alpha}[A \pmid C] = P_{\alpha}[B \pmid C]\).
  7. If \(C \vDash B\), then \(P_{\alpha}[(A\cdot B) \pmid C] = P_{\alpha}[(B\cdot A) \pmid C] = P_{\alpha}[A \pmid C]\).
  8. If \(C \vDash B\) and \(B \vDash C\), then \(P_{\alpha}[A \pmid B] = P_{\alpha}[A \pmid C]\).
  9. If \(P_{\alpha}[B \pmid C] \gt 0\), then \[P_{\alpha}[A \pmid (B\cdot C)] = P_{\alpha}[B \pmid (A\cdot C)] \times \frac{P_{\alpha}[A \pmid C]}{P_{\alpha}[B \pmid C]}\] (this is a simple form of Bayes’ theorem).
  10. \(P_{\alpha}[(A\vee B) \pmid C] = P_{\alpha}[A \pmid C] + P_{\alpha}[B \pmid C] - P_{\alpha}[(A\cdot B) \pmid C]\).
  11. If \(\{B_1 , \ldots ,B_n\}\) is any finite set of sentences such that for each pair \(B_i\) and \(B_j, C \vDash{\nsim}(B_{i}\cdot B_{j})\) (i.e., the members of the set are mutually exclusive, given C ), then either \(P_{\alpha}[D \pmid C] = 1\) for every sentence D , or \[ P_{\alpha}[((B_1\vee B_2)\vee \ldots \vee B_n) \pmid C] = \sum ^{n}_{i=1} P_{\alpha}[B_i \pmid C]. \]
  12. If \(\{B_1 , \ldots ,B_n , \ldots \}\) is any countably infinite set of sentences such that for each pair \(B_i\) and \(B_j, C \vDash{\nsim}(B_{i}\cdot B_{j})\), then either \(P_{\alpha}[D \pmid C] = 1\) for every sentence D , or [ 3 ] \[ \lim_n P_{\alpha}[((B_1\vee B_2)\vee \ldots \vee B_n) \pmid C] = \sum^{\infty}_{i=1} P_{\alpha}[B_i \pmid C]. \]
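
As an illustration of the simple form of Bayes’ theorem listed above, the following Python sketch works through a diagnostic case. All of the numbers are hypothetical, as is the function name `bayes`; A stands for a hypothesis (the patient has the disease), B for the evidence (a positive test), and C for background information. The value of \(P_{\alpha}[B \pmid C]\) is obtained from the law of total probability, which is among the usual theorems of probability theory.

```python
def bayes(likelihood, prior, evidence_prob):
    """P[A | B·C] = P[B | A·C] * P[A | C] / P[B | C], for P[B | C] > 0."""
    if evidence_prob <= 0:
        raise ValueError("Bayes' theorem in this form requires P[B | C] > 0")
    return likelihood * prior / evidence_prob

# Hypothetical inputs, chosen only for illustration.
prior = 0.01            # P[A | C]: the disease is rare given the background
likelihood = 0.90       # P[B | A·C]: positive test if the disease is present
false_positive = 0.05   # P[B | ~A·C]: positive test if the disease is absent

# Total probability: P[B | C] = P[B | A·C] P[A | C] + P[B | ~A·C] P[~A | C].
evidence_prob = likelihood * prior + false_positive * (1 - prior)

posterior = bayes(likelihood, prior, evidence_prob)  # P[A | B·C]
print(f"P[A | B·C] is about {posterior:.3f}")
```

With these inputs the posterior is a little over 0.15: the positive test raises the support for the hypothesis well above its prior value of 0.01, yet the hypothesis remains improbable because the prior was so low.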

Let us now briefly consider each axiom to see how plausible it is as a constraint on a quantitative measure of inductive support, and how it extends the notion of deductive entailment. First notice that each degree-of-support function \(P_{\alpha}\) on L measures support strength with some real number values, but the axioms don’t explicitly restrict these values to lie between 0 and 1. It turns out that all support values must lie between 0 and 1, but this follows from the axioms, rather than being assumed by them. The scaling of inductive support via the real numbers is surely a reasonable way to go.

Axiom 1 is a non-triviality requirement. It says that the support values cannot be the same for all sentence pairs. This axiom merely rules out the trivial support function that assigns the same amount of support to each sentence by every sentence. One might replace this axiom with the following rule:

But this alternative rule turns out to be derivable from axiom 1 together with the other axioms.

Axiom 2 asserts that when B logically entails A, the support of A by B is as strong as support can possibly be. This comports with the idea that an inductive support function is a generalization of the deductive entailment relation, where the premises of deductive entailments provide the strongest possible support for their conclusions.

Axiom 3 merely says that \((B \cdot C)\) supports sentences to precisely the same degree that \((C \cdot B)\) supports them. This is an especially weak axiom. But taken together with the other axioms, it suffices to entail that logically equivalent sentences support all sentences to precisely the same degree.

Axiom 4 says that inductive support adds up in a plausible way. When C logically entails the incompatibility of A and B , i.e., when no possible state of affairs can make both A and B true together, the degrees of support that C provides to each of them individually must sum to the support it provides to their disjunction. The only exception is in those cases where C acts like a logical contradiction and supports all sentences to the maximum possible degree (in deductive logic a logical contradiction logically entails every sentence).

To understand what axiom 5 says, think of a support function \(P_{\alpha}\) as describing a measure on possible states of affairs. Read each degree-of-support expression of form ‘\(P_{\alpha}[D \pmid E] = r\)’ to say that the proportion of states of affairs in which D is true among those states of affairs where E is true is r . Read this way, axiom 5 then says the following. Suppose B is true in proportion q of all the states of affairs where C is true, and suppose A is true in fraction r of those states where B and C are true together. Then A and B should be true together in what proportion of all the states where C is true? In fraction r (the \((A\cdot B)\) part) of proportion q (the B portion) of all those states where C is true.
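
As a purely illustrative instance (the numbers are assumptions chosen for easy arithmetic), suppose C is true in 100 equally weighted states of affairs, B is true in 40 of them, and A is true in 10 of those 40. Then \(q = 0.4\), \(r = 0.25\), and

\[P_{\alpha}[(A \cdot B) \pmid C] = P_{\alpha}[A \pmid (B \cdot C)] \times P_{\alpha}[B \pmid C] = 0.25 \times 0.4 = 0.1,\]

which matches a direct count: A and B are true together in 10 of the 100 states where C is true.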

The degree to which a sentence B supports a sentence A may well depend on what these sentences mean. In particular it will usually depend on the meanings we associate with the non-logical terms (those terms other than the logical terms not, and, or, etc., the quantifiers, and identity), that is, on the meanings of the names, and the predicate and relation terms of the language. For example, we should want

\[P_{\alpha}[\textrm{George is not married} \pmid \textrm{George is a bachelor}] = 1,\]

given the usual meanings of ‘bachelor’ and ‘married’, since “all bachelors are unmarried” is analytically true—i.e. no empirical evidence is required to establish this connection. (In the formal language for predicate logic, if we associate the meaning “is married” with predicate term ‘ M ’, the meaning “is a bachelor” with the predicate term ‘ B ’, and take the name term ‘ g ’ to refer to George, then we should want \(P_{\alpha}[{\nsim}Mg \pmid Bg] = 1\), since \(\forall x (Bx \supset{\nsim}Mx)\) is analytically true on this meaning assignment to the non-logical terms.) So, let’s associate with each individual support function \(P_{\alpha}\) a specific assignment of meanings ( primary intensions ) to all the non-logical terms of the language. (However, evidential support functions should not presuppose meaning assignments in the sense of so-called secondary intensions —e.g., those associated with rigid designators across possible states of affairs. For, we should not want a confirmation function \(P_{\alpha}\) to make

since we presumably want the inductive logic to draw on explicit empirical evidence to support the claim that water is made of H₂O. Thus, the meanings of terms we associate with a support function should only be their primary intensions, not their secondary intensions.)

In the context of inductive logic it makes good sense to supplement the above axioms with two additional axioms. Here is the first of them:

  • (6) If A is an axiom of set theory or any other piece of pure mathematics employed by the sciences, or if A is analytically true (i.e., if the truth of A depends only on the meanings of the words it contains, where the specific meanings for names and predicates are those associated with the particular support function \(P_{\alpha})\), then, for all sentences C , \(P_{\alpha}[A \pmid C] = P_{\alpha}[C \pmid C]\) (i.e., \(P_{\alpha}[A \pmid C] = 1)\).

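Here is how axiom 6 applies to the above example, yielding \(P_{\alpha}[{\nsim}Mg \pmid Bg] = 1\) when the meaning assignments to non-logical terms associated with support function \(P_{\alpha}\) make \(\forall x(Bx \supset{\nsim}Mx)\) analytically true. From axiom 6 (followed by results 7, 5, and 4) we have

\[1 = P_{\alpha}[\forall x(Bx \supset{\nsim}Mx) \pmid Bg] = P_{\alpha}[(\forall x(Bx \supset{\nsim}Mx) \cdot Bg) \pmid Bg] \le P_{\alpha}[{\nsim}Mg \pmid Bg] \le 1;\]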

thus, \(P_{\alpha}[{\nsim}Mg \pmid Bg] = 1\). The idea behind axiom 6 is that inductive logic is about evidential support for contingent claims. Nothing can count as empirical evidence for or against non-contingent truths. In particular, analytic truths should be maximally supported by all premises C .

One important respect in which inductive logic should follow the deductive paradigm is that the logic should not presuppose the truth of contingent statements. If a statement C is contingent, then some other statements should be able to count as evidence against C . Otherwise, a support function \(P_{\alpha}\) will take C and all of its logical consequences to be supported to degree 1 by all possible evidence claims. This is no way for an inductive logic to behave. The whole idea of inductive logic is to provide a measure of the extent to which premise statements indicate the likely truth-values of contingent conclusion statements. This idea won’t work properly if the truth-values of some contingent statements are presupposed by assigning them support value 1 on every possible premise. Such probability assignments would make the inductive logic enthymematic by hiding significant premises in inductive support relationships. It would be analogous to permitting deductive arguments to count as valid in cases where the explicitly stated premises are insufficient to logically entail the conclusion, but where the validity of the argument is permitted to depend on additional unstated premises. This is not how a rigorous approach to deductive logic should work, and it should not be a common practice in a rigorous approach to inductive logic.

Nevertheless, it is common practice for probabilistic logicians to sweep provisionally accepted contingent claims under the rug by assigning them probability 1 (regardless of the fact that no explicit evidence for them is provided). This practice saves the trouble of repeatedly writing a given contingent sentence B as a premise, since \(P_{\gamma}[A \pmid B\cdot C]\) will equal \(P_{\gamma}[A \pmid C]\) whenever \(P_{\gamma}[B \pmid C] = 1\). Although this convention is useful, such probability functions should be considered mere abbreviations for proper, logically explicit, non-enthymematic, inductive support relations. Thus, properly speaking, an inductive support function \(P_{\alpha}\) should not assign probability 1 to a sentence on every possible premise unless that sentence is either (i) logically true, or (ii) an axiom of set theory or some other piece of pure mathematics employed by the sciences, or (iii) unless according to the interpretation of the language that \(P_{\alpha}\) presupposes, the sentence is analytic (and so outside the realm of evidential support). Thus, we adopt the following version of the so-called “axiom of regularity”.

  • (7) If, for all C, \(P_{\alpha}[A \pmid C] = P_{\alpha}[C \pmid C]\) (i.e., \(P_{\alpha}[A \pmid C] = 1\)), then A must be a logical truth or an axiom of set theory or some other piece of pure mathematics employed by the sciences, or A must be analytically true (according to the meanings of the terms of L associated with support function \(P_{\alpha}\)).

Axioms 6 and 7 taken together say that a support function \(P_{\alpha}\) counts as non-contingently true, and so not subject to empirical support, just those sentences that are assigned probability 1 on every possible premise.

Some Bayesian logicists have proposed that an inductive logic might be made to depend solely on the logical form of sentences, as is the case for deductive logic. The idea is, effectively, to supplement axioms 1–7 with additional axioms that depend only on the logical structures of sentences, and to introduce enough such axioms to reduce the number of possible support functions to a single uniquely best support function. It is now widely agreed that this project cannot be carried out in a plausible way. Perhaps support functions should obey some rules in addition to axioms 1–7. But it is doubtful that any plausible collection of additional rules can suffice to determine a single, uniquely qualified support function. Later, in Section 3 , we will briefly return to this issue, after we develop a more detailed account of how inductive probabilities capture the relationship between hypotheses and evidence.

Axioms 1–7 for conditional probability functions merely place formal constraints on what may properly count as a degree of support function . Each function \(P_{\alpha}\) that satisfies these axioms may be viewed as a possible way of applying the notion of inductive support to a language L that respects the meanings of the logical terms, much as each possible truth-value assignment for a language represents a possible way of assigning truth-values to its sentences in a way that respects the meanings of the logical terms. The issue of which of the possible truth-value assignments to a language represents the actual truth or falsehood of its sentences depends on more than this. It depends on the meanings of the non-logical terms and on the state of the actual world. Similarly, the degree to which some sentences actually support others in a fully meaningful language must rely on something more than the mere satisfaction of the axioms for support functions. It must, at least, rely on what the sentences of the language mean, and perhaps on much more besides. But, what more? Perhaps a better understanding of what inductive probability is may provide some help by filling out our conception of what inductive support is about. Let’s pause to discuss two prominent views—two interpretations of the notion of inductive probability.

One kind of non-syntactic logicist reading of inductive probability takes each support function \(P_{\alpha}\) to be a measure on possible states of affairs. The idea is that, given a fully meaningful language (associated with support function \(P_{\alpha}\)) ‘\(P_{\alpha}[A \pmid B] = r\)’ says that among those states of affairs in which B is true, A is true in proportion r of them. There will not generally be a single privileged way to define such a measure on possible states of affairs. Rather, each of a number of functions \(P_{\alpha}\), \(P_{\beta}\), \(P_{\gamma}\),…, etc., that satisfy the constraints imposed by axioms 1–7 may represent a viable measure of the inferential import of the propositions expressed by sentences of the language. This idea needs more fleshing out, of course. The next section will provide some indication of how that might go.

Subjectivist Bayesians offer an alternative reading of the support functions. First, they usually take unconditional probability as basic, and take conditional probabilities as defined in terms of unconditional probabilities: the conditional probability ‘\(P_{\alpha}[A \pmid B]\)’ is defined as a ratio of unconditional probabilities:

\[P_{\alpha}[A \pmid B] = \frac{P_{\alpha}[A\cdot B]}{P_{\alpha}[B]}.\]

Subjectivist Bayesians take each unconditional probability function \(P_{\alpha}\) to represent the belief-strengths or confidence-strengths of an ideally rational agent, \(\alpha\). On this understanding ‘\(P_{\alpha}[A] = r\)’ says, “the strength of \(\alpha\)’s belief (or confidence) that A is true is r”. Subjectivist Bayesians usually tie such belief strengths to how much money (or how many units of utility) the agent would be willing to bet on A turning out to be true. Roughly, the idea is this. Suppose that an ideally rational agent \(\alpha\) would be willing to accept a wager that would yield (no less than) $u if A turns out to be true and would lose him $1 if A turns out to be false. Then, under reasonable assumptions about the agent’s desire for money, it can be shown that the agent’s belief strength that A is true should be

\[P_{\alpha}[A] = \frac{1}{u + 1}.\]

And it can further be shown that any function \(P_{\alpha}\) that expresses such betting-related belief-strengths on all statements in agent \(\alpha\)’s language must satisfy axioms for unconditional probabilities analogous to axioms 1–5. [ 4 ] Moreover, it can be shown that any function \(P_{\beta}\) that satisfies these axioms is a possible rational belief function for some ideally rational agent \(\beta\). These relationships between belief-strengths and the desirability of outcomes (e.g., gaining money or goods on bets) are at the core of subjectivist Bayesian decision theory . Subjectivist Bayesians usually take inductive probability to just be this notion of probabilistic belief-strength .
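The betting reading can be illustrated numerically. A wager that wins $u if A is true and loses $1 if A is false has zero expected gain exactly when the agent's belief strength in A is 1/(u + 1). The following is a minimal Python sketch of that arithmetic only; the function names are illustrative assumptions, and the sketch ignores the "reasonable assumptions about the agent's desire for money" mentioned above by treating dollars as utilities.

```python
def belief_strength_from_wager(u: float) -> float:
    """Break-even belief strength for a bet that wins $u if A is true
    and loses $1 if A is false (hypothetical helper, dollars = utility)."""
    if u <= 0:
        raise ValueError("the payoff u must be positive")
    return 1.0 / (u + 1.0)


def expected_gain(p: float, u: float) -> float:
    """Expected gain of the same bet for an agent whose belief strength in A is p."""
    return p * u - (1.0 - p) * 1.0


if __name__ == "__main__":
    for u in (1.0, 3.0, 9.0):
        p = belief_strength_from_wager(u)
        print(f"u = {u}: belief strength = {p}, expected gain = {expected_gain(p, u)}")
```

An agent whose belief strength exceeds 1/(u + 1) sees a positive expected gain and should accept the bet; one whose belief strength falls below it should decline.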

Undoubtedly real agents do believe some claims more strongly than others. And, arguably, the belief strengths of real agents can be measured on a probabilistic scale between 0 and 1, at least approximately. And clearly the inductive support of a hypothesis by evidence should influence the strength of an agent’s belief in the truth of that hypothesis—that’s the point of engaging in inductive reasoning, isn’t it? However, there is good reason for caution about viewing inductive support functions as Bayesian belief-strength functions, as we’ll see a bit later. So, perhaps an agent’s support function is not simply identical to his belief function, and perhaps the relationship between inductive support and belief-strength is somewhat more complicated.

In any case, some account of what support functions are supposed to represent is clearly needed. The belief function account and the logicist account (in terms of measures on possible states of affairs) are two attempts to provide this account. But let us put this interpretative issue aside for now. One may be able to get a better handle on what inductive support functions really are after one sees how the inductive logic that draws on them is supposed to work.

3. The Application of Inductive Probabilities to the Evaluation of Scientific Hypotheses

One of the most important applications of an inductive logic is its treatment of the evidential evaluation of scientific hypotheses. The logic should capture the structure of evidential support for all sorts of scientific hypotheses, ranging from simple diagnostic claims (e.g., “the patient is infected with HIV”) to complex scientific theories about the fundamental nature of the world, such as quantum mechanics or the theory of relativity. This section will show how evidential support functions (a.k.a. Bayesian confirmation functions) represent the evidential evaluation of scientific hypotheses and theories. This logic is essentially comparative. The evaluation of a hypothesis depends on how strongly evidence supports it over alternative hypotheses.

Consider some collection of mutually incompatible, alternative hypotheses (or theories) about a common subject matter, \(\{h_1, h_2 , \ldots \}\). The collection of alternatives may be very simple, e.g., {“the patient has HIV”, “the patient is free of HIV”}. Or, when the physician is trying to determine which among a range of diseases is causing the patient’s symptoms, the collection of alternatives may consist of a long list of possible disease hypotheses. For the cosmologist, the collection of alternatives may consist of several distinct gravitational theories, or several empirically distinct variants of the “same” theory. Whenever two variants of a hypothesis (or theory) differ in empirical import, they count as distinct hypotheses. (This should not be confused with the converse positivistic assertion that theories with the same empirical content are really the same theory. Inductive logic doesn’t necessarily endorse that view.)

The collection of competing hypotheses (or theories) to be evaluated by the logic may be finite in number, or may be countably infinite. No realistic language contains more than a countable number of expressions; so it suffices for a logic to apply to a countably infinite number of sentences. From a purely logical perspective the collection of competing alternatives may consist of every rival hypothesis (or theory) about a given subject matter that can be expressed within a given language (e.g., all possible theories of the origin and evolution of the universe expressible in English and contemporary mathematics). In practice, alternative hypotheses (or theories) will often be constructed and evidentially evaluated over a long period of time. The logic of evidential support works in much the same way regardless of whether all alternative hypotheses are considered together, or only a few alternative hypotheses are available at a time.

Evidence for scientific hypotheses consists of the results of specific experiments or observations. For a given experiment or observation, let ‘\(c\)’ represent a description of the relevant conditions under which it is performed, and let ‘\(e\)’ represent a description of the result of the experiment or observation, the evidential outcome of conditions \(c\).

The logical connection between scientific hypotheses and the evidence often requires the mediation of background information and auxiliary hypotheses. Let ‘\(b\)’ represent whatever background and auxiliary hypotheses are required to connect each hypothesis \(h_i\) among the competing hypotheses \(\{h_1, h_2 , \ldots \}\) to the evidence. Although the claims expressed by the auxiliary hypotheses within \(b\) may themselves be subject to empirical evaluation, they should be the kinds of claims that are not at issue in the evaluation of the alternative hypotheses in the collection \(\{h_1, h_2 , \ldots \}\). Rather, each of the alternative hypotheses under consideration draws on the same background and auxiliaries to logically connect to the evidential events. (If competing hypotheses \(h_i\) and \(h_j\) draw on distinct auxiliary hypotheses \(a_i\) and \(a_j\), respectively, in making logical contact with evidential claims, then the following treatment should be applied to the respective conjunctive hypotheses, \((h_{i}\cdot a_{i})\) and \((h_{j}\cdot a_{j})\), since these alternative conjunctive hypotheses will constitute the empirically distinct alternatives at issue.)

In cases where a hypothesis is deductively related to an outcome \(e\) of an observational or experimental condition \(c\) (via background and auxiliaries \(b\)), we will have either \(h_i\cdot b\cdot c \vDash e\) or \(h_i\cdot b\cdot c \vDash{\nsim}e\) . For example, \(h_i\) might be the Newtonian Theory of Gravitation. A test of the theory might involve a condition statement \(c\) that describes the results of some earlier measurements of Jupiter’s position, and that describes the means by which the next position measurement will be made; the outcome description \(e\) states the result of this additional position measurement; and the background information (and auxiliary hypotheses) \(b\) might state some already well confirmed theory about the workings and accuracy of the devices used to make the position measurements. Then, from \(h_i\cdot b\cdot c\) we may calculate the specific outcome \(e\) we expect to find; thus, the following logical entailment holds: \(h_i\cdot b\cdot c \vDash e\) . Then, provided that the experimental and observational conditions stated by \(c\) are in fact true, if the evidential outcome described by \(e\) actually occurs, the resulting conjoint evidential claim \((c\cdot e)\) may be considered good evidence for \(h_i\), given \(b\). (This method of theory evaluation is called the hypothetical-deductive approach to evidential support.) On the other hand, when from \(h_i\cdot b\cdot c\) we calculate some outcome incompatible with the observed evidential outcome \(e\), then the following logical entailment holds: \(h_i\cdot b\cdot c \vDash{\nsim}e\). In that case, from deductive logic alone we must also have that \(b\cdot c\cdot e \vDash{\nsim}h_i\) ; thus, \(h_i\) is said to be falsified by \(b\cdot c\cdot e\). 
The Bayesian account of evidential support we will be describing below extends this deductivist approach to include cases where the hypothesis \(h_i\) (and its alternatives) may not be deductively related to the evidence, but may instead imply that the evidential outcome is likely or unlikely to some specific degree r. That is, the Bayesian approach applies to cases where we may have neither \(h_i\cdot b\cdot c \vDash e\) nor \(h_i\cdot b\cdot c \vDash{\nsim}e\), but may instead only have \(P[e \pmid h_i\cdot b\cdot c] = r\), where r is some “entailment strength” between 0 and 1.

Before going on to describe the logic of evidential support in more detail, perhaps a few more words are in order about the background knowledge and auxiliary hypotheses, represented here by ‘\(b\)’. Duhem (1906) and Quine (1953) are generally credited with alerting inductive logicians to the importance of auxiliary hypotheses in connecting scientific hypotheses and theories to empirical evidence. (See the entry on Pierre Duhem.) They point out that scientific hypotheses often make little contact with evidence claims on their own. Rather, in most cases scientific hypotheses make testable predictions only relative to background information and auxiliary hypotheses that tie them to the evidence. (Some specific examples of such auxiliary hypotheses will be provided in the next subsection.) Typically auxiliaries are highly confirmed hypotheses from other scientific domains. They often describe the operating characteristics of various devices (e.g., measuring instruments) used to make observations or conduct experiments. Their credibility is usually not at issue in the testing of hypothesis \(h_i\) against its competitors, because \(h_i\) and its alternatives usually rely on the same auxiliary hypotheses to tie them to the evidence. But even when an auxiliary hypothesis is already well-confirmed, we cannot simply assume that it is unproblematic, or just known to be true. Rather, the evidential support or refutation of a hypothesis \(h_i\) is relative to whatever auxiliaries and background information (in \(b\)) are being supposed in the confirmational context. In other contexts the auxiliary hypotheses used to test \(h_i\) may themselves be among a collection of alternative hypotheses that are subject to evidential support or refutation. 
Furthermore, to the extent that competing hypotheses employ different auxiliary hypotheses in accounting for evidence, the evidence only tests each such hypothesis in conjunction with its distinct auxiliaries against alternative hypotheses packaged with their distinct auxiliaries, as described earlier. Thus, what counts as a hypothesis to be tested , \(h_i\), and what counts as auxiliary hypotheses and background information, \(b\), may depend on the epistemic context—on what class of alternative hypotheses are being tested by a collection of experiments or observations, and on what claims are presupposed in that context. No statement is intrinsically a test hypothesis , or intrinsically an auxiliary hypothesis or background condition . Rather, these categories are roles statements may play in a particular epistemic context.

In a probabilistic inductive logic the degree to which the evidence \((c\cdot e)\) supports a hypothesis \(h_i\) relative to background and auxiliaries \(b\) is represented by the posterior probability of \(h_i\), \(P_{\alpha}[h_i \pmid b\cdot c\cdot e]\), according to an evidential support function \(P_{\alpha}\). It turns out that the posterior probability of a hypothesis depends on just two kinds of factors: (1) its prior probability, \(P_{\alpha}[h_i \pmid b]\), together with the prior probabilities of its competitors, \(P_{\alpha}[h_j \pmid b]\), \(P_{\alpha}[h_k \pmid b]\), etc.; and (2) the likelihood of evidential outcomes \(e\) according to \(h_i\) in conjunction with \(b\) and \(c\), \(P[e \pmid h_i\cdot b\cdot c]\), together with the likelihoods of these same evidential outcomes according to competing hypotheses, \(P[e \pmid h_j\cdot b\cdot c]\), \(P[e \pmid h_k\cdot b\cdot c]\), etc. We will now examine each of these factors in some detail. Following that we will see precisely how the values of posterior probabilities depend on the values of likelihoods and prior probabilities.

In probabilistic inductive logic the likelihoods carry the empirical import of hypotheses. A likelihood is a support function probability of form \(P[e \pmid h_i\cdot b\cdot c]\). It expresses how likely it is that outcome \(e\) will occur according to hypothesis \(h_i\) together with the background and auxiliaries \(b\) and the experimental (or observational) conditions \(c\). [ 5 ] If a hypothesis together with auxiliaries and experimental/observation conditions deductively entails an evidence claim, the axioms of probability make the corresponding likelihood objective in the sense that every support function must agree on its values: \(P[e \pmid h_i\cdot b\cdot c] = 1\) if \(h_i\cdot b\cdot c \vDash e\); \(P[e \pmid h_i\cdot b\cdot c] = 0\) if \(h_i\cdot b\cdot c \vDash{\nsim}e\). However, in many cases a hypothesis \(h_i\) will not be deductively related to the evidence, but will only imply it probabilistically. There are several ways this might happen: (1) hypothesis \(h_i\) may itself be an explicitly probabilistic or statistical hypothesis; (2) an auxiliary statistical hypothesis, as part of the background b , may connect hypothesis \(h_i\) to the evidence; (3) the connection between the hypothesis and the evidence may be somewhat loose or imprecise, not mediated by explicit statistical claims, but nevertheless objective enough for the purposes of evidential evaluation. Let’s briefly consider examples of the first two kinds. We’ll treat case (3) in Section 5 , which addresses the issue of vague and imprecise likelihoods.

The hypotheses being tested may themselves be statistical in nature. Among the simplest examples of statistical hypotheses and their role in likelihoods are hypotheses about the chance characteristics of coin-tossing. Let \(h_{[r]}\) be a hypothesis that says a specific coin has a propensity (or objective chance) r for coming up heads on normal tosses, and let \(b\) say that such tosses are probabilistically independent of one another. Let \(c\) state that the coin is tossed n times in the normal way; and let \(e\) say that on these tosses the coin comes up heads m times. In cases like this the value of the likelihood of the outcome \(e\) on hypothesis \(h_{[r]}\) for condition \(c\) is given by the well-known binomial formula:

\[P[e \pmid h_{[r]}\cdot b\cdot c] = \frac{n!}{m!\,(n-m)!}\, r^{m} (1-r)^{n-m}.\]

There are, of course, more complex cases of likelihoods involving statistical hypotheses. Consider, for example, the hypothesis that plutonium 233 nuclei have a half-life of 20 minutes—i.e., that the propensity (or objective chance ) for a Pu-233 nucleus to decay within a 20 minute period is 1/2. The full statistical model for the lifetime of such a system says that the propensity (or objective chance ) for that system to remain intact (i.e., to not decay) within any time period x is governed by the formula \(1/2^{x/\tau}\), where \(\tau\) is the half-life of such a system. Let \(h\) be a hypothesis that says that this statistical model applies to Pu-233 nuclei with \(\tau = 20\) minutes; let \(c\) say that some specific Pu-233 nucleus is intact within a decay detector (of some specific kind) at an initial time \(t_0\); let \(e\) say that no decay of this same Pu-233 nucleus is detected by the later time \(t\); and let \(b\) say that the detector is completely accurate (it always registers a real decay, and it never registers false-positive detections). Then, the associated likelihood of \(e\) given \(h\) and \(c\) is this: \(P[e \pmid h\cdot b\cdot c] = 1/2^{(t - t_0)/\tau}\), where the value of \(\tau\) is 20 minutes.
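Both direct-inference likelihoods just described are simple to compute. The sketch below is illustrative only: the function names are assumptions, the coin-toss likelihood uses the standard binomial form (the number of ways to get m heads in n tosses, times r^m (1 - r)^(n - m)), and the decay likelihood uses the survival formula 1/2^((t - t_0)/τ) stated above, with the perfectly accurate detector assumed in \(b\).

```python
from math import comb


def binomial_likelihood(m: int, n: int, r: float) -> float:
    """P[e | h_[r].b.c]: chance of exactly m heads in n independent tosses
    of a coin whose chance of heads on each toss is r."""
    if not 0 <= m <= n:
        raise ValueError("need 0 <= m <= n")
    return comb(n, m) * r**m * (1.0 - r) ** (n - m)


def survival_likelihood(elapsed_minutes: float, half_life_minutes: float = 20.0) -> float:
    """P[e | h.b.c]: chance that a nucleus with the given half-life is still
    intact (no decay detected) after elapsed_minutes = t - t_0."""
    if elapsed_minutes < 0:
        raise ValueError("elapsed time must be non-negative")
    return 0.5 ** (elapsed_minutes / half_life_minutes)


if __name__ == "__main__":
    # Likelihood of 5 heads in 10 tosses under the fair-coin hypothesis h_[0.5].
    print(binomial_likelihood(5, 10, 0.5))
    # Likelihood of "no decay detected" after one and after two half-lives.
    print(survival_likelihood(20.0), survival_likelihood(40.0))
```

Because these values follow from the explicit statistical content of the hypotheses, every support function that takes the hypotheses at face value should assign the same numbers.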

An auxiliary statistical hypothesis, as part of the background \(b\), may be required to connect hypothesis \(h_i\) to the evidence. For example, a blood test for HIV has a known false-positive rate and a known true-positive rate. Suppose the false-positive rate is .05; i.e., the test tends to incorrectly show the blood sample to be positive for HIV in 5% of all cases where HIV is not present. And suppose that the true-positive rate is .99; i.e., the test tends to correctly show the blood sample to be positive for HIV in 99% of all cases where HIV really is present. When a particular patient’s blood is tested, the hypotheses under consideration are “this patient is infected with HIV”, \(h\), and “this patient is not infected with HIV”, \({\nsim}h\). In this context the known test characteristics function as background information, \(b\). The experimental condition \(c\) merely states that this particular patient was subjected to this specific kind of blood test for HIV, which was processed by the lab using proper procedures. Let us suppose that the outcome \(e\) states that the result is a positive test result for HIV. The relevant likelihoods, then, are \(P[e \pmid h\cdot b\cdot c] = .99\) and \(P[e \pmid {\nsim}h\cdot b\cdot c] = .05\).

In this example the values of the likelihoods are entirely due to the statistical characteristics of the accuracy of the test, which is carried by the background/auxiliary information \(b\). The hypothesis \(h\) being tested by the evidence is not itself statistical.

This kind of situation may, of course, arise for much more complex hypotheses. The alternative hypotheses of interest may be deterministic physical theories, say Newtonian Gravitation Theory and some specific alternatives. Some of the experiments that test this theory rely on somewhat imprecise measurements that have known statistical error characteristics, which are expressed as part of the background or auxiliary hypotheses, \(b\). For example, the auxiliary \(b\) may describe the error characteristics of a device that measures the torque imparted to a quartz fiber, where the measured torque is used to assess the strength of the gravitational force between test masses. In that case \(b\) may say that for this kind of device the measurement errors are normally distributed about whatever value a given gravitational theory predicts, with some specified standard deviation that is characteristic of the device. This results in specific values \(r_i\) for the likelihoods, \(P[e \pmid h_i\cdot b\cdot c] = r_i\), for each of the various gravitational theories, \(h_i\), being tested.
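To make the role of the error model concrete, here is a minimal Python sketch under stated assumptions. It takes the outcome \(e\) to be the claim that the measured value falls within a reported interval, and it computes each theory's likelihood as the normal probability of that interval centered on the theory's predicted value. The predicted values, the standard deviation, and the interval are invented for illustration and do not come from any real torsion-balance experiment.

```python
from math import erf, sqrt


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def interval_likelihood(lo: float, hi: float, predicted: float, sd: float) -> float:
    """P[e | h_i.b.c], where e says the measurement lies in [lo, hi] and b says
    errors are normally distributed about the predicted value with std. dev. sd."""
    return normal_cdf(hi, predicted, sd) - normal_cdf(lo, predicted, sd)


# Hypothetical numbers: two rival theories predict different torque values
# (arbitrary units), and the device has a known standard deviation.
predictions = {"h_1": 10.0, "h_2": 10.5}
device_sd = 0.2
observed_lo, observed_hi = 9.9, 10.1

likelihoods = {
    name: interval_likelihood(observed_lo, observed_hi, mu, device_sd)
    for name, mu in predictions.items()
}

if __name__ == "__main__":
    for name, r_i in likelihoods.items():
        print(f"P[e | {name}.b.c] = {r_i:.4f}")
```

Neither theory entails the outcome, yet the shared error model in \(b\) gives each of them a definite likelihood, and the outcome favors the theory whose prediction lies closer to the observed interval.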

Likelihoods that arise from explicit statistical claims—either within the hypotheses being tested, or from explicit statistical background claims that tie the hypotheses to the evidence—are often called direct inference likelihoods . Such likelihoods should be completely objective. So, all evidential support functions should agree on their values, just as all support functions agree on likelihoods when evidence is logically entailed. Direct inference likelihoods are logical in an extended, non-deductive sense. Indeed, some logicians have attempted to spell out the logic of direct inferences in terms of the logical form of the sentences involved. [ 6 ] But regardless of whether that project succeeds, it seems reasonable to take likelihoods of this sort to have highly objective or intersubjectively agreed values.

Not all likelihoods of interest in confirmational contexts are warranted deductively or by explicitly stated statistical claims. In such cases the likelihoods may have vague, imprecise values, but values that are determinate enough to still underwrite an objective evaluation of hypotheses on the evidence. In Section 5 we’ll consider such cases, where no underlying statistical theory is involved, but where likelihoods are determinate enough to play their standard role in the evidential evaluation of scientific hypotheses. However, the proper treatment of such cases will be more easily understood after we have first seen how the logic works when likelihoods are precisely known (such as cases where the likelihood values are endorsed by explicit statistical hypotheses and/or explicit statistical auxiliaries). In any case, the likelihoods that relate hypotheses to evidence claims in many scientific contexts will have such objective values. So, although a variety of different support functions \(P_{\alpha}\), \(P_{\beta}\),…, \(P_{\gamma}\), etc., may be needed to represent the differing “inductive proclivities” of the various members of a scientific community, for now we will consider cases where all evidential support functions agree on the values of the likelihoods. For, the likelihoods represent the empirical content of a scientific hypothesis, what the hypothesis (together with experimental conditions, \(c\), and background and auxiliaries \(b\)) says or probabilistically implies about the evidence. Thus, the empirical objectivity of a science relies on a high degree of objectivity or intersubjective agreement among scientists on the numerical values of likelihoods.

To see the point more vividly, imagine what a science would be like if scientists disagreed widely about the values of likelihoods. Each practitioner interprets a theory to say quite different things about how likely it is that various possible evidence statements will turn out to be true. Whereas scientist \(\alpha\) takes theory \(h_1\) to probabilistically imply that event \(e\) is highly likely, his colleague \(\beta\) understands the empirical import of \(h_1\) to say that \(e\) is very unlikely. And, conversely, \(\alpha\) takes competing theory \(h_2\) to probabilistically imply that \(e\) is very unlikely, whereas \(\beta\) reads \(h_2\) to say that \(e\) is extremely likely. So, for \(\alpha\) the evidential outcome \(e\) supplies strong support for \(h_1\) over \(h_2\), because

\[P_{\alpha}[e \pmid h_1\cdot b\cdot c] \gg P_{\alpha}[e \pmid h_2\cdot b\cdot c].\]

But his colleague \(\beta\) takes outcome \(e\) to show just the opposite, that \(h_2\) is strongly supported over \(h_1\), because

\[P_{\beta}[e \pmid h_1\cdot b\cdot c] \ll P_{\beta}[e \pmid h_2\cdot b\cdot c].\]

If this kind of situation were to occur often, or for significant evidence claims in a scientific domain, it would make a shambles of the empirical objectivity of that science. It would completely undermine the empirical testability of such hypotheses and theories within that scientific domain. Under these circumstances, although each scientist employs the same sentences to express a given theory \(h_i\), each understands the empirical import of these sentences so differently that \(h_i\) as understood by \(\alpha\) is an empirically different theory than \(h_i\) as understood by \(\beta\). (Indeed, arguably, \(\alpha\) must take at least one of the two sentences, \(h_1\) or \(h_2\), to express a different proposition than does \(\beta\).) Thus, the empirical objectivity of the sciences requires that experts should be in close agreement about the values of the likelihoods. [ 7 ]

For now we will suppose that the likelihoods have objective or intersubjectively agreed values, common to all agents in a scientific community. We mark this agreement by dropping the subscript ‘\(\alpha\)’, ‘\(\beta\)’, etc., from expressions that represent likelihoods, since all support functions under consideration are supposed to agree on the values for likelihoods. One might worry that this supposition is overly strong. There are legitimate scientific contexts where, although scientists should have enough of a common understanding of the empirical import of hypotheses to assign quite similar values to likelihoods, precise agreement on their numerical values may be unrealistic. This point is right in some important kinds of cases. So later, in Section 5, we will see how to relax the supposition that precise likelihood values are available, and see how the logic works in such cases. But for now the main ideas underlying probabilistic inductive logic will be more easily explained if we focus on those contexts where objective or intersubjectively agreed likelihoods are available. Later we will see that much the same logic continues to apply in contexts where the values of likelihoods may be somewhat vague, or where members of the scientific community disagree to some extent about their values.

An adequate treatment of the likelihoods calls for the introduction of one additional notational device. Scientific hypotheses are generally tested by a sequence of experiments or observations conducted over a period of time. To explicitly represent the accumulation of evidence, let the series of sentences \(c_1\), \(c_2\), …, \(c_n\), describe the conditions under which a sequence of experiments or observations are conducted. And let the corresponding outcomes of these observations be represented by sentences \(e_1\), \(e_2\), …, \(e_n\). We will abbreviate the conjunction of the first n descriptions of experimental or observational conditions by ‘\(c^n\)’, and abbreviate the conjunction of descriptions of their outcomes by ‘\(e^n\)’. Then, for a stream of n observations or experiments and their outcomes, the likelihoods take the form \(P[e^n \pmid h_{i}\cdot b\cdot c^{n}] = r\), for appropriate values of \(r\). In many cases the likelihood of the evidence stream will be equal to the product of the likelihoods of the individual outcomes:

\[P[e^n \pmid h_{i}\cdot b\cdot c^{n}] = P[e_1 \pmid h_{i}\cdot b\cdot c_1] \times \cdots \times P[e_n \pmid h_{i}\cdot b\cdot c_n].\]

When this equality holds, the individual bits of evidence are said to be probabilistically independent on the hypothesis (together with auxiliaries) . In the following account of the logic of evidential support, such probabilistic independence will not be assumed, except in those places where it is explicitly invoked.
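As a small illustration of the independent case, the Python sketch below multiplies individual outcome likelihoods to obtain the likelihood of a whole evidence stream. It reuses the coin-tossing setting as an assumed example: each \(e_k\) reports heads ("H") or tails ("T") on toss k, and the hypothesis gives heads a chance r on every toss. The function names are hypothetical, and the product rule applies only when independence on the hypothesis actually holds.

```python
from math import prod


def outcome_likelihood(outcome: str, r: float) -> float:
    """P[e_k | h_[r].b.c_k] for a single toss: r for heads, 1 - r for tails."""
    if outcome not in ("H", "T"):
        raise ValueError("outcome must be 'H' or 'T'")
    return r if outcome == "H" else 1.0 - r


def stream_likelihood(outcomes: str, r: float) -> float:
    """P[e^n | h_[r].b.c^n] as the product of the individual likelihoods.
    Valid only when the outcomes are probabilistically independent on h_[r].b."""
    return prod(outcome_likelihood(o, r) for o in outcomes)


if __name__ == "__main__":
    stream = "HHTHH"
    for r in (0.5, 0.8):
        print(f"r = {r}: likelihood of {stream} = {stream_likelihood(stream, r)}")
```

Note that this is the likelihood of one particular ordered sequence of outcomes; the binomial likelihood of "m heads in n tosses" sums this quantity over all orderings with m heads.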

The probabilistic logic of evidential support represents the net support of a hypothesis by the posterior probability of the hypothesis, \(P_{\alpha}[h_i \pmid b\cdot c^{n}\cdot e^{n}]\). The posterior probability represents the net support for the hypothesis that results from the evidence, \(c^n \cdot e^n\), together with whatever plausibility considerations are taken to be relevant to the assessment of \(h_i\). Whereas the likelihoods are the means through which evidence contributes to the posterior probability of a hypothesis, all other relevant plausibility considerations are represented by a separate factor, called the prior probability of the hypothesis: \(P_{\alpha}[h_i \pmid b]\). The prior probability represents the weight of any important considerations not captured by the evidential likelihoods. Any relevant considerations that go beyond the evidence itself may be explicitly stated within expression \(b\) (in addition to whatever auxiliary hypotheses \(b\) may contain in support of the likelihoods). Thus, the prior probability of \(h_i\) may depend explicitly on the content of \(b\). It turns out that posterior probabilities depend only on the values of evidential likelihoods together with the values of prior probabilities.

As an illustration of the role of prior probabilities, consider the HIV test example described in the previous section. What the physician and the patient want to know is the value of the posterior probability, \(P_{\alpha}[h \pmid b\cdot c\cdot e]\), that the patient has HIV, \(h\), given the evidence of the positive test, \(c\cdot e\), and given the error rates of the test, described within \(b\). The value of this posterior probability depends on the likelihood (due to the error rates) of this patient obtaining a true-positive result, \(P[e \pmid h\cdot b\cdot c] = .99\), and of obtaining a false-positive result, \(P[e \pmid {\nsim}h\cdot b\cdot c] = .05\). In addition, the value of the posterior probability depends on how plausible it is that the patient has HIV prior to taking the test results into account, \(P_{\alpha}[h \pmid b]\). In the context of medical diagnosis, this prior probability is usually assessed on the basis of the base rate for HIV in the patient’s risk group (i.e., whether the patient is an IV drug user, has unprotected sex with multiple partners, etc.). On a rigorous approach to the logic, such information and its risk-relevance should be explicitly stated within the background information \(b\). To see the importance of this information, consider the following numerical results (which may be calculated using the formula called Bayes’ Theorem, presented in the next section). If the base rate for the patient’s risk group is relatively high, say \(P_{\alpha}[h \pmid b] = .10\), then the positive test result yields a posterior probability value for his having HIV of \(P_{\alpha}[h \pmid b\cdot c\cdot e] = .69\). However, if the patient is in a very low risk group, say \(P_{\alpha}[h \pmid b] = .001\), then a positive test result only raises the posterior probability of his having an HIV infection to \(P_{\alpha}[h \pmid b\cdot c\cdot e] = .02\). 
This posterior probability is much higher than the prior probability of .001, but should not worry the patient too much. This positive test result may well be due to the comparatively high false-positive rate for the test, rather than to the presence of HIV. This sort of test, with a false-positive rate as large as .05, is best used as a screening test; a positive result warrants conducting a second, more rigorous, less error-prone test.
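The two posterior values quoted in this example can be checked with a short calculation. What follows is a minimal Python sketch, assuming that the only alternatives are that the patient has HIV or does not; the function name `posterior` and its parameter names are illustrative and do not come from the article.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """Posterior probability of h given a positive test result.

    Assumes just two alternatives, h and not-h, so that the probability
    of a positive result is a weighted average of the two likelihoods.
    """
    # Probability of a positive result, averaged over h and not-h.
    expectedness = (true_positive_rate * prior
                    + false_positive_rate * (1.0 - prior))
    # Bayes' Theorem: likelihood times prior, divided by expectedness.
    return true_positive_rate * prior / expectedness


# Error rates from the example: true-positive .99, false-positive .05.
for base_rate in (0.10, 0.001):
    value = posterior(base_rate, 0.99, 0.05)
    print(f"prior {base_rate}: posterior {value:.2f}")
```

With the high base rate the sketch gives roughly .69, and with the low base rate roughly .02, matching the figures in the text.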

More generally, in the evidential evaluation of scientific hypotheses and theories, prior probabilities represent assessments of non-evidential plausibility weightings among hypotheses. However, because the strengths of such plausibility assessments may vary among members of a scientific community, critics often brand such assessments as merely subjective, and take their role in Bayesian inference to be highly problematic. Bayesian inductivists counter that plausibility assessments play an important, legitimate role in the sciences, especially when evidence cannot suffice to distinguish among some alternative hypotheses. And, they argue, the epithet “merely subjective” is unwarranted. Such plausibility assessments are often backed by extensive arguments that may draw on forceful conceptual considerations.

Scientists often bring plausibility arguments to bear in assessing competing views. Although such arguments are seldom decisive, they may bring the scientific community into widely shared agreement, especially with regard to the implausibility of some logically possible alternatives. This seems to be the primary epistemic role of thought experiments. Consider, for example, the kinds of plausibility arguments that have been brought to bear on the various interpretations of quantum theory (e.g., those related to the measurement problem). These arguments go to the heart of conceptual issues that were central to the original development of the theory. Many of these issues were first raised by those scientists who made the greatest contributions to the development of quantum theory, in their attempts to get a conceptual hold on the theory and its implications.

Given any body of evidence, it is fairly easy to cook up a host of logically possible alternative hypotheses that make the evidence as probable as desired. In particular, it is easy to cook up hypotheses that logically entail any given body of evidence, providing likelihood values equal to 1 for all the available evidence. Although most of these cooked-up hypotheses will be laughably implausible, evidential likelihoods cannot rule them out. But the only factors other than likelihoods that figure into the values of posterior probabilities for hypotheses are the values of their prior probabilities; so only prior probability assessments provide a place for the Bayesian logic to bring important plausibility considerations to bear. Thus, the Bayesian logic can only give implausible hypotheses their due via prior probability assessments.

It turns out that the mathematical structure of Bayesian inference makes prior probabilities especially well-suited to represent plausibility assessments among competing hypotheses. For, in the fully fleshed out account of evidential support for hypotheses (spelled out below), it will turn out that only ratios of prior probabilities for competing hypotheses, \(P_{\alpha}[h_j \pmid b] / P_{\alpha}[h_i \pmid b]\), together with ratios of likelihoods, \(P_{\alpha}[e \pmid h_j\cdot b\cdot c] / P_{\alpha}[e \pmid h_i\cdot b\cdot c]\), play essential roles. The ratio of prior probabilities is well-suited to represent how much more (or less) plausible hypothesis \(h_j\) is than competing hypothesis \(h_i\). Furthermore, the plausibility arguments on which this comparative assessment is based may be explicitly stated within \(b\). So, given that an inductive logic needs to incorporate well-considered plausibility assessments (e.g., in order to lay low wildly implausible alternative hypotheses), the comparative assessment of Bayesian prior probabilities seems well-suited to do the job.

Thus, although prior probabilities may be subjective in the sense that agents may disagree on the relative strengths of plausibility arguments, the priors used in scientific contexts need not represent mere subjective whims. Rather, the comparative strengths of the priors for hypotheses should be supported by arguments about how much more plausible one hypothesis is than another. The important role of plausibility assessments is captured by such received bits of scientific wisdom as the well-known scientific aphorism, extraordinary claims require extraordinary evidence. That is, it takes especially strong evidence, in the form of extremely high values for (ratios of) likelihoods, to overcome the extremely low pre-evidential plausibility values possessed by some hypotheses. In the next section we’ll see precisely how this idea works, and we’ll return to it again in Section 3.4.

When sufficiently strong evidence becomes available, it turns out that the contributions of prior plausibility assessments to the values of posterior probabilities may be substantially “washed out”, overridden by the evidence. That is, provided the prior probability of a true hypothesis isn’t assessed to be too close to zero, the influence of the values of the prior probabilities will very probably fade away as evidence accumulates. In Section 4 we’ll see precisely how this kind of Bayesian convergence to the true hypothesis works. Thus, it turns out that prior plausibility assessments play their most important role when the distinguishing evidence represented by the likelihoods remains weak.

One more point before moving on to the logic of Bayes’ Theorem. Some Bayesian logicists have maintained that posterior probabilities of hypotheses should be determined by syntactic logical form alone. The idea is that the likelihoods might reasonably be specified in terms of syntactic logical form; so if syntactic form might be made to determine the values of prior probabilities as well, then inductive logic would be fully “formal” in the same way that deductive logic is “formal”. Keynes and Carnap tried to implement this idea through syntactic versions of the principle of indifference—the idea that syntactically similar hypotheses should be assigned the same prior probability values. Carnap showed how to carry out this project in detail, but only for extremely simple formal languages. Most logicians now take the project to have failed because of a fatal flaw with the whole idea that reasonable prior probabilities can be made to depend on logical form alone. Semantic content should matter. Goodmanian grue-predicates provide one way to illustrate this point. [ 8 ] Furthermore, as suggested earlier, for this idea to apply to the evidential support of real scientific theories, scientists would have to assess the prior probabilities of each alternative theory based only on its syntactic structure. That seems an unreasonable way to proceed. Are we to evaluate the prior probabilities of alternative theories of gravitation, or for alternative quantum theories, by exploring only their syntactic structures, with absolutely no regard for their content—with no regard for what they say about the world? This seems an extremely dubious approach to the evaluation of real scientific theories. Logical structure alone cannot, and should not suffice for determining reasonable prior probability values for real scientific theories. Moreover, real scientific hypotheses and theories are inevitably subject to plausibility considerations based on what they say about the world. 
Prior probabilities are well-suited to represent the comparative weight of plausibility considerations for alternative hypotheses. But no reasonable assessment of comparative plausibility can derive solely from the logical form of hypotheses.

We will return to a discussion of prior probabilities a bit later. Let’s now see how Bayesian logic combines likelihoods with prior probabilities to yield posterior probabilities for hypotheses.

Any probabilistic inductive logic that draws on the usual rules of probability theory to represent how evidence supports hypotheses must be a Bayesian inductive logic in the broad sense. For, Bayes’ Theorem follows directly from the usual axioms of probability theory. Its importance derives from the relationship it expresses between hypotheses and evidence. It shows how evidence, via the likelihoods, combines with prior probabilities to produce posterior probabilities for hypotheses. We now examine several forms of Bayes’ Theorem, each derivable from axioms 1–5.

The simplest version of Bayes’ Theorem as it applies to evidence for a hypothesis goes like this:

Bayes’ Theorem: Simple Form

\[P_{\alpha}[h_i \pmid e] = \frac{P_{\alpha}[e \pmid h_i] \times P_{\alpha}[h_i]}{P_{\alpha}[e]}\]

This equation expresses the posterior probability of hypothesis \(h_i\) due to evidence \(e\), \(P_{\alpha}[h_i \pmid e]\), in terms of the likelihood of the evidence on that hypothesis, \(P_{\alpha}[e \pmid h_i]\), the prior probability of the hypothesis, \(P_{\alpha}[h_i]\), and the simple probability of the evidence, \(P_{\alpha}[e]\). The factor \(P_{\alpha}[e]\) is often called the expectedness of the evidence. Written this way, the theorem suppresses the experimental (or observational) conditions, \(c\), and all background information and auxiliary hypotheses, \(b\). As discussed earlier, both of these terms play an important role in logically connecting the hypothesis at issue, \(h_i\), to the evidence \(e\). In scientific contexts the objectivity of the likelihoods, \(P_{\alpha}[e \pmid h_i\cdot b \cdot c]\), almost always depends on such terms. So, although the suppression of experimental (or observational) conditions and auxiliary hypotheses is a common practice in accounts of Bayesian inference, the treatment below, and throughout the remainder of this article, will make the role of these terms explicit.

The subscript \(\alpha\) on the evidential support function \(P_{\alpha}\) is there to remind us that more than one such function exists. A host of distinct probability functions satisfy axioms 1–5, so each of them satisfies Bayes’ Theorem. Some of these probability functions may provide a better fit with our intuitive conception of how the evidential support for hypotheses should work. Nevertheless, there are bound to be reasonable differences among Bayesian agents regarding the initial plausibility of a hypothesis \(h_i\). This diversity in initial plausibility assessments is represented by diverse values for prior probabilities for the hypothesis: \(P_{\alpha}[h_i]\), \(P_{\beta}[h_i]\), \(P_{\gamma}[h_i]\), etc. This usually results in diverse values for posterior probabilities for hypotheses: \(P_{\alpha}[h_i \pmid e]\), \(P_{\beta}[h_i \pmid e]\), \(P_{\gamma}[h_i \pmid e]\), etc. So it is important to keep the diversity among evidential support functions in mind.

Here is how the Simple Form of Bayes’ Theorem looks when terms for the experimental (or observational) conditions, \(c\), and the background information and auxiliary hypotheses \(b\) are made explicit:

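Bayes’ Theorem: Simple Form with explicit Experimental Conditions, Background Information and Auxiliary Hypotheses

\[P_{\alpha}[h_i \pmid b\cdot c\cdot e] = \frac{P[e \pmid h_i\cdot b\cdot c] \times P_{\alpha}[h_i \pmid b]}{P_{\alpha}[e \pmid b\cdot c]} \times \frac{P_{\alpha}[c \pmid h_i\cdot b]}{P_{\alpha}[c \pmid b]}\]

\[= \frac{P[e \pmid h_i\cdot b\cdot c] \times P_{\alpha}[h_i \pmid b]}{P_{\alpha}[e \pmid b\cdot c]} \quad \text{when } P_{\alpha}[c \pmid h_i\cdot b] = P_{\alpha}[c \pmid b].\]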

This version of the theorem determines the posterior probability of the hypothesis, \(P_{\alpha}[h_i \pmid b\cdot c\cdot e]\), from the value of the likelihood of the evidence according to that hypothesis (taken together with background and auxiliaries and the experimental conditions), \(P[e \pmid h_i\cdot b\cdot c]\), the value of the prior probability of the hypothesis (on background and auxiliaries), \(P_{\alpha}[h_i \pmid b]\), and the value of the expectedness of the evidence (on background and auxiliaries and the experimental conditions), \(P_{\alpha}[e \pmid b\cdot c]\). Notice that in the factor for the likelihood, \(P[e \pmid h_i\cdot b\cdot c]\), the subscript \(\alpha\) has been dropped. This marks the fact that in scientific contexts the likelihood of an evidential outcome \(e\) on the hypothesis together with explicit background and auxiliary hypotheses and the description of the experimental conditions, \(h_i\cdot b\cdot c\), is usually objectively determinate. This factor represents what the hypothesis (in conjunction with background and auxiliaries) objectively says about the likelihood of possible evidential outcomes of the experimental conditions. So, all reasonable support functions should agree on the values for likelihoods. (Section 5 will treat cases where the likelihoods may lack this kind of objectivity.)

This version of Bayes’ Theorem includes a term that represents the ratio of the likelihood of the experimental conditions on the hypothesis and background information (and auxiliaries) to the “likelihood” of the experimental conditions on the background (and auxiliaries) alone: \(P_{\alpha}[c \pmid h_i\cdot b]/ P_{\alpha}[c \pmid b]\). Arguably the value of this term should be 1, or very nearly 1, since the truth of the hypothesis at issue should not significantly affect how likely it is that the experimental conditions are satisfied. If various alternative hypotheses assign significantly different likelihoods to the experimental conditions themselves, then such conditions should more properly be included as part of the evidential outcome \(e\).

Both the prior probability of the hypothesis and the expectedness tend to be somewhat subjective factors in that various agents from the same scientific community may legitimately disagree on what values these factors should take. Bayesian logicians usually accept the apparent subjectivity of the prior probabilities of hypotheses, but find the subjectivity of the expectedness to be more troubling. This is due at least in part to the fact that in a Bayesian logic of evidential support the value of the expectedness cannot be determined independently of likelihoods and prior probabilities of hypotheses. That is, when, for each member of a collection of alternative hypotheses, the likelihood \(P[e \pmid h_j\cdot b\cdot c]\) has an objective (or intersubjectively agreed) value, the expectedness is constrained by the following equation (where the sum ranges over a mutually exclusive and exhaustive collection of alternative hypotheses \(\{h_1, h_2 , \ldots ,h_m , \ldots \}\), which may be finite or countably infinite):

\[P_{\alpha}[e \pmid b\cdot c] = \sum_j P[e \pmid h_j\cdot b\cdot c] \times P_{\alpha}[h_j \pmid b\cdot c],\]

where \(P_{\alpha}[h_j \pmid b\cdot c] = P_{\alpha}[h_j \pmid b]\) whenever \(P_{\alpha}[c \pmid h_j\cdot b] = P_{\alpha}[c \pmid b]\).

This equation shows that the values for the prior probabilities together with the values of the likelihoods uniquely determine the value for the expectedness of the evidence. Furthermore, it implies that the value of the expectedness must lie between the largest and smallest of the various likelihood values implied by the alternative hypotheses. However, the precise value of the expectedness can only be calculated this way when every alternative to hypothesis \(h_j\) is specified. In cases where some alternative hypotheses remain unspecified (or undiscovered), the value of the expectedness is constrained in principle by the totality of possible alternative hypotheses, but there is no way to figure out precisely what its value should be.
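The claim that the expectedness is a prior-weighted average of the likelihoods, and so lies between the smallest and largest likelihood, can be illustrated numerically. The following is a minimal Python sketch, assuming a finite, mutually exclusive and exhaustive list of hypotheses whose priors sum to 1; the function name `expectedness` and the sample numbers (taken from the HIV example with a .10 base rate) are used only for illustration.

```python
def expectedness(likelihoods, priors):
    """Probability of the evidence as a prior-weighted sum of likelihoods.

    likelihoods[j] plays the role of P[e | h_j.b.c] and priors[j] the role
    of the prior of h_j. The hypotheses are assumed mutually exclusive
    and exhaustive, so the priors must sum to 1.
    """
    if len(likelihoods) != len(priors):
        raise ValueError("need one prior per likelihood")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 for an exhaustive partition")
    return sum(lik * pri for lik, pri in zip(likelihoods, priors))


# Two hypotheses: h (likelihood .99, prior .10) and not-h (.05, .90).
likelihoods = [0.99, 0.05]
priors = [0.10, 0.90]
value = expectedness(likelihoods, priors)
print(value)
# The weighted average cannot fall outside the range of the likelihoods.
print(min(likelihoods) <= value <= max(likelihoods))
```

If some alternative hypotheses are left unspecified, the list of likelihoods is incomplete and this sum cannot be computed, which is the difficulty the text describes.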

Troubles with determining a numerical value for the expectedness of the evidence may be circumvented by appealing to another form of Bayes’ Theorem, a ratio form that compares hypotheses one pair at a time:

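Bayes’ Theorem: Ratio Form

\[\frac{P_{\alpha}[h_j \pmid b\cdot c\cdot e]}{P_{\alpha}[h_i \pmid b\cdot c\cdot e]} = \frac{P[e \pmid h_j\cdot b\cdot c]}{P[e \pmid h_i\cdot b\cdot c]} \times \frac{P_{\alpha}[h_j \pmid b]}{P_{\alpha}[h_i \pmid b]} \times \frac{P_{\alpha}[c \pmid h_j\cdot b]}{P_{\alpha}[c \pmid h_i\cdot b]} \tag{9}\]

\[= \frac{P[e \pmid h_j\cdot b\cdot c]}{P[e \pmid h_i\cdot b\cdot c]} \times \frac{P_{\alpha}[h_j \pmid b]}{P_{\alpha}[h_i \pmid b]} \quad \text{when } P_{\alpha}[c \pmid h_j\cdot b] = P_{\alpha}[c \pmid h_i\cdot b].\]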

The clause \(P_{\alpha}[c \pmid h_j\cdot b] = P_{\alpha}[c \pmid h_i\cdot b]\) says that the experimental (or observation) condition described by \(c\) is as likely on \((h_i\cdot b)\) as on \((h_j\cdot b)\) — i.e., the experimental or observation conditions are no more likely according to one hypothesis than according to the other. [ 9 ]

This Ratio Form of Bayes’ Theorem expresses how much more plausible, on the evidence, one hypothesis is than another. Notice that the likelihood ratios carry the full import of the evidence. The evidence influences the evaluation of hypotheses in no other way. The only other factor that influences the value of the ratio of posterior probabilities is the ratio of the prior probabilities. When the likelihoods are fully objective, any subjectivity that affects the ratio of posteriors can only arise via subjectivity in the ratio of the priors.

This version of Bayes’ Theorem shows that in order to evaluate the posterior probability ratios for pairs of hypotheses, the prior probabilities of hypotheses need not be evaluated absolutely; only their ratios are needed. That is, with regard to the priors, the Bayesian evaluation of hypotheses only relies on how much more plausible one hypothesis is than another (due to considerations expressed within \(b\)). This kind of Bayesian evaluation of hypotheses is essentially comparative in that only ratios of likelihoods and ratios of prior probabilities are ever really needed for the assessment of scientific hypotheses. Furthermore, we will soon see that the absolute values of the posterior probabilities of hypotheses entirely derive from the posterior probability ratios provided by the Ratio Form of Bayes’ Theorem.

When the evidence consists of a collection of n distinct experiments or observations, we may explicitly represent this fact by replacing the term ‘\(c\)’ by the conjunction of experimental or observational conditions, \((c_1\cdot c_2\cdot \ldots \cdot c_n)\), and replacing the term ‘\(e\)’ by the conjunction of their respective outcomes, \((e_1\cdot e_2\cdot \ldots \cdot e_n)\). For notational convenience, let’s use the term ‘\(c^n\)’ to abbreviate the conjunction of the n experimental conditions, and the term ‘\(e^n\)’ to abbreviate the corresponding conjunction of their n respective outcomes. Relative to any given hypothesis \(h\), the evidential outcomes of distinct experiments or observations will usually be probabilistically independent of one another, and also independent of the experimental conditions for one another. In that case we have:

\[P[e^n \pmid h\cdot b\cdot c^{n}] = P[e_1 \pmid h\cdot b\cdot c_1] \times \cdots \times P[e_n \pmid h\cdot b\cdot c_n].\]

When the Ratio Form of Bayes’ Theorem is extended to explicitly represent the evidence as consisting of a collection of n distinct experiments (or observations) and their respective outcomes, it takes the following form.

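Bayes’ Theorem: Ratio Form for a Collection of n Distinct Evidence Claims

\[\frac{P_{\alpha}[h_j \pmid b\cdot c^{n}\cdot e^{n}]}{P_{\alpha}[h_i \pmid b\cdot c^{n}\cdot e^{n}]} = \frac{P[e^n \pmid h_j\cdot b\cdot c^{n}]}{P[e^n \pmid h_i\cdot b\cdot c^{n}]} \times \frac{P_{\alpha}[h_j \pmid b]}{P_{\alpha}[h_i \pmid b]} \tag{9*}\]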

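Furthermore, when evidence claims are probabilistically independent of one another, we have

\[\frac{P_{\alpha}[h_j \pmid b\cdot c^{n}\cdot e^{n}]}{P_{\alpha}[h_i \pmid b\cdot c^{n}\cdot e^{n}]} = \frac{P[e_1 \pmid h_j\cdot b\cdot c_1]}{P[e_1 \pmid h_i\cdot b\cdot c_1]} \times \cdots \times \frac{P[e_n \pmid h_j\cdot b\cdot c_n]}{P[e_n \pmid h_i\cdot b\cdot c_n]} \times \frac{P_{\alpha}[h_j \pmid b]}{P_{\alpha}[h_i \pmid b]} \tag{9**}\]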

Let’s consider a simple example of how the Ratio Form of Bayes’ Theorem applies to a collection of independent evidential events. Suppose we possess a warped coin and want to determine its propensity for heads when tossed in the usual way. Consider two hypotheses, \(h_{[p]}\) and \(h_{[q]}\), which say that the propensities for the coin to come up heads on the usual kinds of tosses are \(p\) and \(q\), respectively. Let \(c^n\) report that the coin is tossed n times in the normal way, and let \(e^n\) report that precisely m occurrences of heads have resulted. Supposing that the outcomes of such tosses are probabilistically independent (asserted by \(b\)), the respective likelihoods take the binomial form

\[P[e^n \pmid h_{[r]}\cdot b\cdot c^{n}] = \frac{n!}{m! \times (n-m)!} \times r^m (1-r)^{n-m},\]

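with \(r\) standing in for \(p\) and for \(q\), respectively. Then, Equation 9** yields the following formula, where the likelihood ratio is the ratio of the respective binomial terms (the binomial coefficients cancel):

\[\frac{P_{\alpha}[h_{[p]} \pmid b\cdot c^{n}\cdot e^{n}]}{P_{\alpha}[h_{[q]} \pmid b\cdot c^{n}\cdot e^{n}]} = \frac{p^m (1-p)^{n-m}}{q^m (1-q)^{n-m}} \times \frac{P_{\alpha}[h_{[p]} \pmid b]}{P_{\alpha}[h_{[q]} \pmid b]}.\]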

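When, for instance, the coin is tossed \(n = 100\) times and comes up heads \(m = 72\) times, the evidence for hypothesis \(h_{[1/2]}\) as compared to \(h_{[3/4]}\) is given by the likelihood ratio

\[\frac{P[e^n \pmid h_{[1/2]}\cdot b\cdot c^{n}]}{P[e^n \pmid h_{[3/4]}\cdot b\cdot c^{n}]} = \frac{(1/2)^{72}(1/2)^{28}}{(3/4)^{72}(1/4)^{28}} \approx .000056269.\]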

In that case, even if the prior plausibility considerations (expressed within \(b\)) make it 100 times more plausible that the coin is fair than that it is warped towards heads with propensity 3/4 (i.e., even if \(P_{\alpha}[h_{[1/2]} \pmid b] / P_{\alpha}[h_{[3/4]} \pmid b] = 100\)), the evidence provided by these tosses makes the hypothesis that the coin is fair only about 6/1000ths as plausible as the hypothesis that it is warped towards heads with propensity 3/4:

\[\frac{P_{\alpha}[h_{[1/2]} \pmid b\cdot c^{n}\cdot e^{n}]}{P_{\alpha}[h_{[3/4]} \pmid b\cdot c^{n}\cdot e^{n}]} = \frac{(1/2)^{100}}{(3/4)^{72}(1/4)^{28}} \times 100 \approx .0056269 \approx \frac{6}{1000}.\]

Thus, such evidence strongly refutes the “fairness hypothesis” relative to the “3/4-heads hypothesis”, provided the assessment of prior plausibilities doesn’t make the latter hypothesis too extremely implausible to begin with. Notice, however, that strong refutation is not absolute refutation. Additional evidence could reverse this trend towards the refutation of the fairness hypothesis.
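The numbers in the coin example can be reproduced directly. The following is a minimal Python sketch, assuming independent tosses as stated in the example; the function name `binomial_likelihood_ratio` is illustrative, and the calculation works with logarithms only to avoid multiplying many small factors.

```python
import math


def binomial_likelihood_ratio(p, q, n, m):
    """Ratio of the likelihood of m heads in n tosses on h_[p] to that on h_[q].

    The binomial coefficient is the same in numerator and denominator,
    so it cancels and is not computed.
    """
    log_ratio = (m * (math.log(p) - math.log(q))
                 + (n - m) * (math.log(1 - p) - math.log(1 - q)))
    return math.exp(log_ratio)


# 100 tosses, 72 heads: fair coin (1/2) compared with propensity 3/4.
likelihood_ratio = binomial_likelihood_ratio(0.5, 0.75, n=100, m=72)

# Prior ratio of 100 in favor of fairness, as supposed in the example.
prior_ratio = 100.0
posterior_ratio = likelihood_ratio * prior_ratio

print(f"likelihood ratio: {likelihood_ratio:.9f}")
print(f"posterior ratio:  {posterior_ratio:.7f}")
```

The likelihood ratio comes out at about .000056, so even a 100-to-1 prior advantage leaves the fairness hypothesis at roughly 6/1000ths the plausibility of its rival.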

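This example employs repetitions of the same kind of experiment—repeated tosses of a coin. But the point holds more generally. If, as the evidence increases, the likelihood ratios

\[\frac{P[e^n \pmid h_j\cdot b\cdot c^{n}]}{P[e^n \pmid h_i\cdot b\cdot c^{n}]}\]

approach 0, then the Ratio Forms of Bayes’ Theorem, Equations \(9^*\) and \(9^{**}\), show that the posterior probability of \(h_j\) must approach 0 as well, since

\[P_{\alpha}[h_j \pmid b\cdot c^{n}\cdot e^{n}] \le \frac{P_{\alpha}[h_j \pmid b\cdot c^{n}\cdot e^{n}]}{P_{\alpha}[h_i \pmid b\cdot c^{n}\cdot e^{n}]}.\]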

Such evidence comes to strongly refute \(h_j\), with little regard for its prior plausibility value. Indeed, Bayesian induction turns out to be a version of eliminative induction, and Equations \(9^*\) and \(9^{**}\) begin to illustrate this. For, suppose that \(h_i\) is the true hypothesis, and consider what happens to each of its false competitors, \(h_j\). If enough evidence becomes available to drive each of the likelihood ratios

\[\frac{P[e^n \pmid h_j\cdot b\cdot c^{n}]}{P[e^n \pmid h_i\cdot b\cdot c^{n}]}\]

toward 0 (as n increases), then Equation \(9^*\) says that each false \(h_j\) will become effectively refuted: each of their posterior probabilities will approach 0 (as n increases). As a result, the posterior probability of \(h_i\) must approach 1. The next two equations show precisely how this works.

If we sum the ratio versions of Bayes’ Theorem in Equation \(9^*\) over all alternatives to hypothesis \(h_i\) (including the catch-all alternative \(h_K\), if appropriate), we get the Odds Form of Bayes’ Theorem. By definition, the odds against a statement \(A\) given \(B\) is related to the probability of \(A\) given \(B\) as follows:

\[\Omega_{\alpha}[{\nsim}A \pmid B] = \frac{P_{\alpha}[{\nsim}A \pmid B]}{P_{\alpha}[A \pmid B]} = \frac{1 - P_{\alpha}[A \pmid B]}{P_{\alpha}[A \pmid B]}.\]

This notion of odds gives rise to the following version of Bayes’ Theorem:

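Bayes’ Theorem: Odds Form

\[\Omega_{\alpha}[{\nsim}h_i \pmid b\cdot c^{n}\cdot e^{n}] = \sum_{j \ne i} \frac{P[e^n \pmid h_j\cdot b\cdot c^{n}]}{P[e^n \pmid h_i\cdot b\cdot c^{n}]} \times \frac{P_{\alpha}[h_j \pmid b]}{P_{\alpha}[h_i \pmid b]} + \frac{P_{\alpha}[e^n \pmid h_K\cdot b\cdot c^{n}]}{P[e^n \pmid h_i\cdot b\cdot c^{n}]} \times \frac{P_{\alpha}[h_K \pmid b]}{P_{\alpha}[h_i \pmid b]} \tag{10}\]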

where the factor following the ‘ + ’ sign is only required in cases where a catch-all alternative hypothesis, \(h_K\), is needed.

Recall that when we have a finite collection of concrete alternative hypotheses available, \(\{h_1, h_2 , \ldots ,h_m\}\), but where this set of alternatives is not exhaustive (where additional, unarticulated, undiscovered alternative hypotheses may exist), the catch-all alternative hypothesis \(h_K\) is just the denial of each of the concrete alternatives, \(({\nsim}h_1\cdot{\nsim}h_2\cdot \ldots \cdot{\nsim}h_m)\). Generally, the likelihood of evidence claims relative to a catch-all hypothesis will not enjoy the same kind of objectivity possessed by the likelihoods for concrete alternative hypotheses. So, we leave the subscript \(\alpha\) attached to the likelihood for the catch-all hypothesis to indicate this lack of objectivity.

Although the catch-all hypothesis may lack objective likelihoods, the influence of the catch-all term in Bayes’ Theorem diminishes as additional concrete hypotheses are articulated. That is, as new hypotheses are discovered they are “peeled off” of the catch-all. So, when a new hypothesis \(h_{m+1}\) is formulated and made explicit, the old catch-all hypothesis \(h_K\) is replaced by a new catch-all, \(h_{K*}\), of form \(({\nsim}h_1\cdot{\nsim}h_2\cdot \ldots \cdot{\nsim}h_{m}\cdot{\nsim}h_{m+1})\); and the prior probability for the new catch-all hypothesis is gotten by diminishing the prior of the old catch-all: \(P_{\alpha}[h_{K*} \pmid b] = P_{\alpha}[h_K \pmid b] - P_{\alpha}[h_{m+1} \pmid b]\). Thus, the influence of the catch-all term should diminish towards 0 as new alternative hypotheses are made explicit. [ 10 ]

If increasing evidence drives towards 0 the likelihood ratios comparing each competitor \(h_j\) with hypothesis \(h_i\), then the odds against \(h_i\), \(\Omega_{\alpha}[{\nsim}h_i \pmid b\cdot c^{n}\cdot e^{n}]\), will approach 0 (provided that priors of catch-all terms, if needed, approach 0 as well, as new alternative hypotheses are made explicit and peeled off). And, as \(\Omega_{\alpha}[{\nsim}h_i \pmid b\cdot c^{n}\cdot e^{n}]\) approaches 0, the posterior probability of \(h_i\) goes to 1. This derives from the fact that the odds against \(h_i\) is related to its posterior probability by the following formula:

Bayes’ Theorem: General Probabilistic Form

\[P_{\alpha}[h_i \pmid b\cdot c^{n}\cdot e^{n}] = \frac{1}{1 + \Omega_{\alpha}[{\nsim}h_i \pmid b\cdot c^{n}\cdot e^{n}]} \tag{11}\]

The odds against a hypothesis depends only on the values of ratios of posterior probabilities, which entirely derive from the Ratio Form of Bayes’ Theorem. Thus, we see that the individual value of the posterior probability of a hypothesis depends only on the ratios of posterior probabilities, which come from the Ratio Form of Bayes’ Theorem. The Ratio Form of Bayes’ Theorem therefore captures all the essential features of the Bayesian evaluation of hypotheses. It shows how the impact of evidence (in the form of likelihood ratios) combines with comparative plausibility assessments of hypotheses (in the form of ratios of prior probabilities) to provide a net assessment of the extent to which hypotheses are refuted or supported via contests with their rivals.
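The route from ratios to an absolute posterior probability can be sketched in a few lines. The following Python sketch assumes a finite list of concrete rivals and no catch-all hypothesis; the function name `posterior_from_ratios` is illustrative. It sums the products of likelihood ratios and prior ratios to get the odds against \(h_i\), then converts those odds to a posterior probability.

```python
def posterior_from_ratios(likelihood_ratios, prior_ratios):
    """Posterior probability of h_i computed only from ratios.

    likelihood_ratios[j] is the likelihood of the evidence on rival h_j
    divided by its likelihood on h_i; prior_ratios[j] is the prior of h_j
    divided by the prior of h_i. The rivals are assumed to exhaust the
    alternatives to h_i (no catch-all term is included in this sketch).
    """
    if len(likelihood_ratios) != len(prior_ratios):
        raise ValueError("need one prior ratio per likelihood ratio")
    # Odds against h_i: sum of the posterior ratios of its rivals.
    odds_against = sum(lr * pr
                       for lr, pr in zip(likelihood_ratios, prior_ratios))
    # Convert odds against h_i into the probability of h_i.
    return 1.0 / (1.0 + odds_against)


# HIV example with base rate .10: the single rival is not-h.
value = posterior_from_ratios([0.05 / 0.99], [0.90 / 0.10])
print(f"{value:.4f}")
```

Applied to the HIV example with a .10 base rate, the ratio route returns the same value, about .69, as the direct calculation.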

There is a result, a kind of Bayesian Convergence Theorem, that shows that if \(h_i\) (together with \(b\cdot c^n\)) is true, then the likelihood ratios

\[\frac{P[e^n \pmid h_j\cdot b\cdot c^{n}]}{P[e^n \pmid h_i\cdot b\cdot c^{n}]}\]

comparing evidentially distinguishable alternative hypothesis \(h_j\) to \(h_i\) will very probably approach 0 as evidence accumulates (i.e., as n increases). Let’s call this result the Likelihood Ratio Convergence Theorem . When this theorem applies, Equation \(9^*\) shows that the posterior probability of a false competitor \(h_j\) will very probably approach 0 as evidence accumulates, regardless of the value of its prior probability \(P_{\alpha}[h_j \pmid b]\). As this happens to each of \(h_i\)’s false competitors, Equations 10 and 11 say that the posterior probability of the true hypothesis, \(h_i\), will approach 1 as evidence increases. [ 11 ] Thus, Bayesian induction is at bottom a version of induction by elimination , where the elimination of alternatives comes by way of likelihood ratios approaching 0 as evidence accumulates. Thus, when the Likelihood Ratio Convergence Theorem applies, the Criterion of Adequacy for an Inductive Logic described at the beginning of this article will be satisfied: As evidence accumulates, the degree to which the collection of true evidence statements comes to support a hypothesis, as measured by the logic, should very probably come to indicate that false hypotheses are probably false and that true hypotheses are probably true. We will examine this Likelihood Ratio Convergence Theorem in Section 4 . [ 12 ]

A view called Likelihoodism relies on likelihood ratios in much the same way as the Bayesian logic articulated above. However, Likelihoodism attempts to avoid the use of prior probabilities. For an account of this alternative view, see the supplement Likelihood Ratios, Likelihoodism, and the Law of Likelihood. For more discussion of Bayes’ Theorem and its application, see the entries on Bayes’ Theorem and on Bayesian Epistemology in this Encyclopedia.

Given that a scientific community should largely agree on the values of the likelihoods, any significant disagreement among them with regard to the values of posterior probabilities of hypotheses should derive from disagreements over their assessments of values for the prior probabilities of those hypotheses. We saw in Section 3.3 that the Bayesian logic of evidential support need only rely on assessments of ratios of prior probabilities —on how much more plausible one hypothesis is than another. Thus, the logic of evidential support only requires that scientists can assess the comparative plausibilities of various hypotheses. Presumably, in scientific contexts the comparative plausibility values for hypotheses should depend on explicit plausibility arguments, not merely on privately held opinions. (Formally, the logic may represent comparative plausibility arguments by explicit statements expressed within \(b\).) It would be highly unscientific for a member of the scientific community to disregard or dismiss a hypothesis that other members take to be a reasonable proposal with only the comment, “don’t ask me to give my reasons, it’s just my opinion”. Even so, agents may be unable to specify precisely how much more strongly the available plausibility arguments support a hypothesis over an alternative; so prior probability ratios for hypotheses may be vague. Furthermore, agents in a scientific community may disagree about how strongly the available plausibility arguments support a hypothesis over a rival hypothesis; so prior probability ratios may be somewhat diverse as well.

Both the vagueness of comparative plausibility assessments for individual agents and the diversity of such assessments among the community of agents can be represented formally by sets of support functions, \(\{P_{\alpha}, P_{\beta}, \ldots \}\), that agree on the values for the likelihoods but encompass a range of values for the (ratios of) prior probabilities of hypotheses. Vagueness and diversity are somewhat different issues, but they may be represented in much the same way. Let’s briefly consider each in turn.

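Assessments of the prior plausibilities of hypotheses will often be vague, that is, not subject to the kind of precise quantitative treatment that a Bayesian version of probabilistic inductive logic may seem to require for prior probabilities. So, it may seem that the kind of assessment of prior probabilities required to get the Bayesian algorithm going cannot be accomplished in practice. To see how Bayesian inductivists address this worry, first recall the Ratio Form of Bayes’ Theorem, Equation \(9^*\):

\[\frac{P_{\alpha}[h_j \pmid b\cdot c^{n}\cdot e^{n}]}{P_{\alpha}[h_i \pmid b\cdot c^{n}\cdot e^{n}]} = \frac{P[e^n \pmid h_j\cdot b\cdot c^{n}]}{P[e^n \pmid h_i\cdot b\cdot c^{n}]} \times \frac{P_{\alpha}[h_j \pmid b]}{P_{\alpha}[h_i \pmid b]}.\]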

Recall that this Ratio Form of the theorem captures the essential features of the logic of evidential support, even though it only provides a value for the ratio of the posterior probabilities. Notice that the ratio form of the theorem easily accommodates situations where we don’t have precise numerical values for prior probabilities. It only depends on our ability to assess how much more or less plausible alternative hypothesis \(h_j\) is than hypothesis \(h_i\): only the value of the ratio \(P_{\alpha}[h_j \pmid b] / P_{\alpha}[h_i \pmid b]\) need be assessed; the values of the individual prior probabilities are not needed. Such comparative plausibilities are much easier to assess than specific numerical values for the prior probabilities of individual hypotheses. When combined with the ratio of likelihoods, this ratio of priors suffices to yield an assessment of the ratio of posterior plausibilities,

\[\frac{P_{\alpha}[h_j \pmid b\cdot c^{n}\cdot e^{n}]}{P_{\alpha}[h_i \pmid b\cdot c^{n}\cdot e^{n}]}.\]

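Although such posterior ratios don’t supply values for the posterior probabilities of individual hypotheses, they place a crucial constraint on the posterior support of hypothesis \(h_j\), since

\[P_{\alpha}[h_j \pmid b\cdot c^{n}\cdot e^{n}] \le \frac{P_{\alpha}[h_j \pmid b\cdot c^{n}\cdot e^{n}]}{P_{\alpha}[h_i \pmid b\cdot c^{n}\cdot e^{n}]} = \frac{P[e^n \pmid h_j\cdot b\cdot c^{n}]}{P[e^n \pmid h_i\cdot b\cdot c^{n}]} \times \frac{P_{\alpha}[h_j \pmid b]}{P_{\alpha}[h_i \pmid b]}.\]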

This Ratio Form of Bayes’ Theorem tolerates a good deal of vagueness or imprecision in assessments of the ratios of prior probabilities. In practice one need only assess bounds for these prior plausibility ratios to achieve meaningful results. Given a prior ratio in a specific interval,

\[q \le \frac{P_{\alpha}[h_j \pmid b]}{P_{\alpha}[h_i \pmid b]} \le r,\]

a likelihood ratio

\[\frac{P[e^n \pmid h_j\cdot b\cdot c^{n}]}{P[e^n \pmid h_i\cdot b\cdot c^{n}]} = \LR^n\]

results in a posterior support ratio in the interval

\[(\LR^n \times q) \le \frac{P_{\alpha}[h_j \pmid b\cdot c^{n}\cdot e^{n}]}{P_{\alpha}[h_i \pmid b\cdot c^{n}\cdot e^{n}]} \le (\LR^n \times r).\]

(Technically each probabilistic support function assigns a specific numerical value to each pair of sentences; so when we write an inequality like

\[q \le \frac{P_{\alpha}[h_j \pmid b]}{P_{\alpha}[h_i \pmid b]} \le r,\]

we are really referring to a set of probability functions \(P_{\alpha}\), a vagueness set, for which the inequality holds. Thus, technically, the Bayesian logic employs sets of probabilistic support functions to represent the vagueness in comparative plausibility values for hypotheses.)

Observe that if the likelihood ratio values \(\LR^n\) approach 0 as the amount of evidence \(e^n\) increases, the interval of values for the posterior probability ratio must become tighter as the upper bound \((\LR^n\times r)\) approaches 0. Furthermore, the absolute degree of support for \(h_j\), \(P_{\alpha}[h_j \pmid b\cdot c^{n}\cdot e^{n}]\), must also approach 0.
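The way a vague prior ratio turns into bounds on the posterior ratio can be illustrated numerically. The following is a minimal Python sketch; the function name `posterior_ratio_bounds` is illustrative, and the prior-ratio interval from 10 to 100 is a hypothetical choice applied to the likelihood ratio from the coin example (about .000056269).

```python
def posterior_ratio_bounds(likelihood_ratio, prior_ratio_low, prior_ratio_high):
    """Interval for the posterior ratio of h_j to h_i.

    A prior ratio known only to lie between prior_ratio_low and
    prior_ratio_high is scaled by the likelihood ratio, so the posterior
    ratio lies between the two scaled endpoints.
    """
    if not 0 <= prior_ratio_low <= prior_ratio_high:
        raise ValueError("need 0 <= prior_ratio_low <= prior_ratio_high")
    return (likelihood_ratio * prior_ratio_low,
            likelihood_ratio * prior_ratio_high)


# Hypothetical vague prior: fairness is between 10 and 100 times more
# plausible than the 3/4-heads hypothesis before any tosses.
low, high = posterior_ratio_bounds(0.000056269, 10.0, 100.0)
print(f"posterior ratio between {low:.8f} and {high:.7f}")

# The upper endpoint also bounds the rival's own posterior probability.
print(f"posterior of the fairness hypothesis is at most {min(1.0, high):.7f}")
```

As the likelihood ratio shrinks with more evidence, both endpoints shrink with it, which is why the initial vagueness matters less and less.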

This observation is really useful. For, it can be shown that when \(h_{i}\cdot b\cdot c^{n}\) is true and \(h_j\) is empirically distinct from \(h_i\), the continual pursuit of evidence is very likely to result in evidential outcomes \(e^n\) that (as n increases) yield values of likelihood ratios \(P[e^n \pmid h_{j}\cdot b\cdot c^{n}] / P[e^n \pmid h_{i}\cdot b\cdot c^{n}]\) that approach 0 as the amount of evidence increases. This result, called the Likelihood Ratio Convergence Theorem , will be investigated in more detail in Section 4 . When that kind of convergence towards 0 for likelihood ratios occurs, the upper bound on the posterior probability ratio also approaches 0, driving the posterior probability of \(h_j\) to approach 0 as well, effectively refuting hypothesis \(h_j\). Thus, false competitors of a true hypothesis will effectively be eliminated by increasing evidence. As this happens, Equations 9* through 11 show that the posterior probability \(P_{\alpha}[h_i \pmid b\cdot c^{n}\cdot e^{n}]\) of the true hypothesis \(h_i\) approaches 1.

Thus, Bayesian logic of inductive support for hypotheses is a form of eliminative induction, where the evidence effectively refutes false alternatives to the true hypothesis. Because of its eliminative nature, the Bayesian logic of evidential support doesn’t require precise values for prior probabilities. It only needs to draw on bounds on the values of comparative plausibility ratios, and these bounds only play a significant role while evidence remains fairly weak. If the true hypothesis is assessed to be comparatively plausible (due to plausibility arguments contained in b ), then plausibility assessments give it a leg-up over alternatives. If the true hypothesis is assessed to be comparatively implausible, the plausibility assessments merely slow down the rate at which it comes to dominate its rivals, reflecting the idea that extraordinary hypotheses require extraordinary evidence (or an extraordinary accumulation of evidence) to overcome their initial implausibilities. Thus, as evidence accumulates, the agent’s vague initial plausibility assessments transform into quite sharp posterior probabilities that indicate their strong refutation or support by the evidence.

When the various agents in a community may widely disagree over the non-evidential plausibilities of hypotheses, the Bayesian logic of evidential support may represent this kind of diversity across the community of agents as a collection of the agents’ vagueness sets of support functions. Let’s call such a collection of support functions a diversity set. That is, a diversity set is just a set of support functions \(P_{\alpha}\) that cover the ranges of values for comparative plausibility assessments for pairs of competing hypotheses

\[ q \le \frac{P_{\alpha}[h_j \pmid b]}{P_{\alpha}[h_i \pmid b]} \le r \]

as assessed by the scientific community. But, once again, if accumulating evidence drives the likelihood ratios comparing various alternative hypotheses to the true hypothesis towards 0, the range of support functions in a diversity set will come to near agreement, near 0, on the values for posterior probabilities of false competitors of the true hypothesis. So, not only does such evidence firm up each agent’s vague initial plausibility assessment, it also brings the whole community into agreement on the near refutation of empirically distinct competitors of a true hypothesis. As this happens, the posterior probability of the true hypothesis may approach 1. The Likelihood Ratio Convergence Theorem implies that this kind of convergence to the truth should very probably happen, provided that the true hypothesis is empirically distinct enough from its rivals.
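
A small Python sketch may help to show how a diversity set comes into near agreement. Everything specific here is an assumption made for illustration: the two hypotheses are hypothetical claims about a coin's chance of heads (0.5 for the true \(h_i\), 0.6 for the rival \(h_j\)), they are treated as the only two alternatives, and the evidence is taken to contain exactly half heads, as the true hypothesis would lead one to expect.

```python
import math


def likelihood_ratio(heads, tosses, p_j, p_i):
    """P[e^n|h_j] / P[e^n|h_i] for independent tosses, computed in log space."""
    tails = tosses - heads
    log_lr = (heads * math.log(p_j / p_i)
              + tails * math.log((1 - p_j) / (1 - p_i)))
    return math.exp(log_lr)


def posterior_of_rival(prior_ratio, lr):
    """Posterior of h_j when h_i and h_j are the only hypotheses (an assumption)."""
    ratio = prior_ratio * lr
    return ratio / (1.0 + ratio)


def posterior_spread(prior_ratios, lr):
    """Smallest and largest posterior for h_j across a set of prior ratios."""
    posteriors = [posterior_of_rival(r, lr) for r in prior_ratios]
    return min(posteriors), max(posteriors)


if __name__ == "__main__":
    # Prior ratios P[h_j|b] / P[h_i|b] held by five hypothetical agents.
    diversity_set = [0.01, 0.1, 1.0, 10.0, 100.0]
    for n in (0, 100, 1000):
        lr = likelihood_ratio(n // 2, n, p_j=0.6, p_i=0.5)
        lo, hi = posterior_spread(diversity_set, lr)
        print(f"n = {n}: posterior of h_j ranges over [{lo:.3g}, {hi:.3g}]")
```

Before any evidence the agents' posteriors for \(h_j\) span almost the whole unit interval; as n grows the whole range collapses towards 0.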

One more point about prior probabilities and Bayesian convergence should be mentioned before proceeding to Section 4. Some subjectivist versions of Bayesian induction seem to suggest that an agent’s prior plausibility assessments for hypotheses should stay fixed once-and-for-all, and that all plausibility updating should be brought about via the likelihoods in accord with Bayes’ Theorem. Critics argue that this is unreasonable. The members of a scientific community may quite legitimately revise their (comparative) prior plausibility assessments for hypotheses from time to time as they rethink plausibility arguments and bring new considerations to bear. This seems a natural part of the conceptual development of a science. It turns out that such reassessments of the comparative plausibilities of hypotheses pose no difficulty for the probabilistic inductive logic discussed here. Such reassessments may be represented by the addition or modification of explicit statements that modify the background information b. Such reassessments may result in (non-Bayesian) transitions to new vagueness sets for individual agents and new diversity sets for the community. The logic of Bayesian induction (as described here) has nothing to say about what values the prior plausibility assessments for hypotheses should have; and it places no restrictions on how they might change over time. Provided that the series of reassessments of (comparative) prior plausibilities doesn’t happen to diminish the (comparative) prior plausibility value of the true hypothesis towards zero (or, at least, doesn’t do so too quickly), the Likelihood Ratio Convergence Theorem implies that the evidence will very probably bring the posterior probabilities of empirically distinct rivals of the true hypothesis to approach 0 via decreasing likelihood ratios; and as this happens, the posterior probability of the true hypothesis will head towards 1.

(Those interested in a Bayesian account of Enumerative Induction and the estimation of values for relative frequencies of attributes in populations should see the supplement, Enumerative Inductions: Bayesian Estimation and Convergence .)

4. The Likelihood Ratio Convergence Theorem

In this section we will investigate the Likelihood Ratio Convergence Theorem. This theorem shows that under certain reasonable conditions, when hypothesis \(h_i\) (in conjunction with auxiliaries in b) is true and an alternative hypothesis \(h_j\) is empirically distinct from \(h_i\) on some possible outcomes of experiments or observations described by conditions \(c_k\), then it is very likely that a long enough sequence of such experiments and observations \(c^n\) will produce a sequence of outcomes \(e^n\) that yields likelihood ratios \(P[e^n \pmid h_{j}\cdot b\cdot c^{n}] / P[e^n \pmid h_{i}\cdot b\cdot c^{n}]\) that approach 0, favoring \(h_i\) over \(h_j\), as evidence accumulates (i.e., as n increases). This theorem places an explicit lower bound on the “rate of probable convergence” of these likelihood ratios towards 0. That is, it puts a lower bound on how likely it is, if \(h_i\) is true, that a stream of outcomes will occur that yields likelihood ratio values against \(h_j\) as compared to \(h_i\) that lie within any specified small distance above 0.

The theorem itself does not require the full apparatus of Bayesian probability functions. It draws only on likelihoods. Neither the statement of the theorem nor its proof employs prior probabilities of any kind. So even likelihoodists, who eschew the use of Bayesian prior probabilities, may embrace this result. Given the forms of Bayes’ Theorem, 9*-11 from the previous section, the Likelihood Ratio Convergence Theorem further implies the likely convergence to 0 of the posterior probabilities of false competitors of a true hypothesis. That is, when the ratios \(P[e^n \pmid h_{j}\cdot b\cdot c^{n}] / P[e^n \pmid h_{i}\cdot b\cdot c^{n}]\) approach 0 for increasing n, the Ratio Form of Bayes’ Theorem, Equation 9*, says that the posterior probability of \(h_j\) must also approach 0 as evidence accumulates, regardless of the value of its prior probability. So, support functions in collections representing vague prior plausibilities for an individual agent (i.e., a vagueness set) and representing the diverse range of priors for a community of agents (i.e., a diversity set) will come to agree on the near 0 posterior probability of empirically distinct false rivals of a true hypothesis. And as the posterior probabilities of false competitors fall, the posterior probability of the true hypothesis heads towards 1. Thus, the theorem establishes that the inductive logic of probabilistic support functions satisfies the Criterion of Adequacy (CoA) suggested at the beginning of this article.

The Likelihood Ratio Convergence Theorem merely provides some sufficient conditions for probable convergence. But likelihood ratios may well converge towards 0 (in the way described by the theorem) even when the antecedent conditions of the theorem are not satisfied. This theorem overcomes many of the objections raised by critics of Bayesian convergence results. First, this theorem does not employ second-order probabilities; it says nothing about the probability of a probability. It only concerns the probability of a particular disjunctive sentence that expresses a disjunction of various possible sequences of experimental or observational outcomes. The theorem does not require evidence to consist of sequences of events that, according to the hypothesis, are identically distributed (like repeated tosses of a die). The result is most easily expressed in cases where the individual outcomes of a sequence of experiments or observations are probabilistically independent, given each hypothesis. So that is the version that will be presented in this section. However, a version of the theorem also holds when the individual outcomes of the evidence stream are not probabilistically independent, given the hypotheses. (This more general version of the theorem will be presented in a supplement on the Probabilistic Refutation Theorem, below, where the proofs of both versions are provided.) In addition, this result does not rely on supposing that the probability functions involved are countably additive. Furthermore, the explicit lower bounds on the rate of convergence provided by this result mean that there is no need to wait for the infinitely long run before convergence occurs (as some critics seem to think).

It is sometimes claimed that Bayesian convergence results only work when an agent locks in values for the prior probabilities of hypotheses once-and-for-all, and then updates posterior probabilities from there only by conditioning on evidence via Bayes’ Theorem. The Likelihood Ratio Convergence Theorem, however, applies even if agents revise their prior probability assessments over time. Such non-Bayesian shifts from one support function (or vagueness set) to another may arise from new plausibility arguments or from reassessments of the strengths of old ones. The Likelihood Ratio Convergence Theorem itself only involves the values of likelihoods. So, provided such reassessments don’t push the prior probability of the true hypothesis towards 0 too rapidly, the theorem implies that the posterior probabilities of each empirically distinct false competitor will very probably approach 0 as evidence increases. [13]

To specify the details of the Likelihood Ratio Convergence Theorem we’ll need a few additional notational conventions and definitions. Here they are.

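For a given sequence of n experiments or observations \(c^n\), consider the set of those possible sequences of outcomes that would result in likelihood ratios for \(h_j\) over \(h_i\) that are less than some chosen small number \(\varepsilon \gt 0\). This set is represented by the expression,

\[ \left\{ e^n : \frac{P[e^n \pmid h_{j}\cdot b\cdot c^{n}]}{P[e^n \pmid h_{i}\cdot b\cdot c^{n}]} \lt \varepsilon \right\}. \]

Placing the disjunction symbol ‘\(\vee\)’ in front of this expression yields an expression,

\[ \vee \left\{ e^n : \frac{P[e^n \pmid h_{j}\cdot b\cdot c^{n}]}{P[e^n \pmid h_{i}\cdot b\cdot c^{n}]} \lt \varepsilon \right\}, \]

that we’ll use to represent the disjunction of all outcome sequences \(e^n\) in this set. So,

\[ \vee \left\{ e^n : \frac{P[e^n \pmid h_{j}\cdot b\cdot c^{n}]}{P[e^n \pmid h_{i}\cdot b\cdot c^{n}]} \lt \varepsilon \right\} \]

is just a particular sentence that says, in effect, “one of the sequences of outcomes of the first n experiments or observations will occur that makes the likelihood ratio for \(h_j\) over \(h_i\) less than \(\varepsilon\)”.

The Likelihood Ratio Convergence Theorem says that under certain conditions (covered in detail below), the likelihood of a disjunctive sentence of this sort, given that ‘\(h_{i}\cdot b\cdot c^{n}\)’ is true,

\[ P\left[ \vee \left\{ e^n : \frac{P[e^n \pmid h_{j}\cdot b\cdot c^{n}]}{P[e^n \pmid h_{i}\cdot b\cdot c^{n}]} \lt \varepsilon \right\} \pmid h_{i}\cdot b\cdot c^{n} \right], \]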

must be at least \(1-(\psi /n)\), for some explicitly calculable term \(\psi\). Thus, the true hypothesis \(h_i\) probabilistically implies that as the amount of evidence, n , increases, it becomes highly likely (as close to 1 as you please) that one of the outcome sequences \(e^n\) will occur that yields a likelihood ratio \(P[e^n \pmid h_{j}\cdot b\cdot c^{n}] / P[e^n \pmid h_{i}\cdot b\cdot c^{n}]\) less than \(\varepsilon\); and this holds for any specific value of \(\varepsilon\) you may choose. As this happens, the posterior probability of \(h_i\)’s false competitor, \(h_j\), must approach 0, as required by the Ratio Form of Bayes’ Theorem, Equation 9* .

The term \(\psi\) in the lower bound of this probability depends on a measure of the empirical distinctness of the two hypotheses \(h_j\) and \(h_i\) for the proposed sequence of experiments and observations \(c^n\). To specify this measure we need to contemplate the collection of possible outcomes of each experiment or observation. So, consider some sequence of experimental or observational conditions described by sentences \(c_1,c_2 ,\ldots ,c_n\). Corresponding to each condition \(c_k\) there will be some range of possible alternative outcomes. Let \(O_{k} = \{o_{k1},o_{k2},\ldots ,o_{kw}\}\) be a set of statements describing the alternative possible outcomes for condition \(c_k\). (The number of alternative outcomes will usually differ for distinct experiments among those in the sequence \(c_1 ,\ldots ,c_n\); so, the value of w may depend on \(c_k\).) For each hypothesis \(h_j\), the alternative outcomes of \(c_k\) in \(O_k\) are mutually exclusive and exhaustive, so we have:

\[ P[o_{ku}\cdot o_{kv} \pmid h_{j}\cdot b\cdot c_{k}] = 0 \text{ for } u \ne v, \quad\text{and}\quad \sum_{u=1}^{w} P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}] = 1. \]

We now let expressions of form ‘\(e_k\)’ act as variables that range over the possible outcomes of condition \(c_k\); that is, \(e_k\) ranges over the members of \(O_k\). As before, ‘\(c^n\)’ denotes the conjunction of the first n test conditions, \((c_1\cdot c_2\cdot \ldots \cdot c_n)\), and ‘\(e^n\)’ represents possible sequences of corresponding outcomes, \((e_1\cdot e_2\cdot \ldots \cdot e_n)\). Let’s use the expression ‘\(E^n\)’ to represent the set of all possible outcome sequences that may result from the sequence of conditions \(c^n\). So, for each hypothesis \(h_j\) (including \(h_i\)), \(\sum_{e^n\in E^n} P[e^n \pmid h_{j}\cdot b\cdot c^{n}] = 1\).

Everything introduced in this subsection is mere notational convention. No substantive suppositions (other than the axioms of probability theory) have yet been introduced. The version of the Likelihood Ratio Convergence Theorem I’ll present below does, however, draw on one substantive supposition, although a rather weak one. The next subsection will discuss that supposition in detail.

In most scientific contexts the outcomes in a stream of experiments or observations are probabilistically independent of one another relative to each hypothesis under consideration, or can at least be divided up into probabilistically independent parts. For our purposes probabilistic independence of evidential outcomes on a hypothesis divides neatly into two types.

Definition: Independent Evidence Conditions:

  • A sequence of outcomes \(e^k\) is condition-independent of a condition for an additional experiment or observation \(c_{k+1}\), given \(h\cdot b\) together with its own conditions \(c^k\), if and only if \[ P[e^k \pmid h\cdot b\cdot c^{k }\cdot c_{ k+1}] = P[e^k \pmid h\cdot b\cdot c^k] . \]
  • An individual outcome \(e_k\) is result-independent of a sequence of other observations and their outcomes \((c^{k-1}\cdot e^{k-1})\), given \(h\cdot b\) and its own condition \(c_k\), if and only if \[ P[e_k \pmid h\cdot b\cdot c_k\cdot(c^{k-1 }\cdot e^{ k-1})] = P[e_k \pmid h\cdot b\cdot c_k] . \]

When these two conditions hold, the likelihood for an evidence sequence may be decomposed into the product of the likelihoods for individual experiments or observations. To see how the two independence conditions affect the decomposition, first consider the following formula, which holds even when neither independence condition is satisfied:

\[ \tag{12} P[e^n \pmid h_{j}\cdot b\cdot c^{n}] = \prod_{k=1}^{n} P[e_k \pmid h_{j}\cdot b\cdot c^{n}\cdot e^{k-1}]. \]

When condition-independence holds, the likelihood of the whole evidence stream parses into a product of likelihoods that probabilistically depend on only past observation conditions and their outcomes. They do not depend on the conditions for other experiments whose outcomes are not yet specified. Here is the formula:

\[ \tag{13} P[e^n \pmid h_{j}\cdot b\cdot c^{n}] = \prod_{k=1}^{n} P[e_k \pmid h_{j}\cdot b\cdot c_k\cdot (c^{k-1}\cdot e^{k-1})]. \]

Finally, whenever both independence conditions are satisfied we have the following relationship between the likelihood of the evidence stream and the likelihoods of individual experiments or observations:

\[ \tag{14} P[e^n \pmid h_{j}\cdot b\cdot c^{n}] = \prod_{k=1}^{n} P[e_k \pmid h_{j}\cdot b\cdot c_{k}]. \]

(For proofs of Equations 12–14 see the supplement Immediate Consequences of Independent Evidence Conditions .)
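
When both independence conditions hold, computing the likelihood of an evidence stream is just a matter of multiplying the likelihoods of the individual outcomes. The Python sketch below illustrates this under stated assumptions: a hypothesis is represented (hypothetically) as a list of dictionaries, one per experiment \(c_k\), each mapping the possible outcomes in \(O_k\) to their likelihoods. The outcome labels and numbers are made up for illustration. The sketch also checks that the likelihoods of all sequences in \(E^n\) sum to 1.

```python
from itertools import product
from math import isclose, prod


def sequence_likelihood(model, outcomes):
    """Likelihood of an outcome sequence as a product of per-experiment likelihoods.

    model    : list of dicts; model[k] maps each outcome of experiment k
               to its likelihood on h.b.c_k (values in each dict sum to 1)
    outcomes : one outcome label per experiment, in order
    """
    return prod(dist[o] for dist, o in zip(model, outcomes))


def all_outcome_sequences(model):
    """Every possible outcome sequence e^n for the given experiments."""
    return product(*(dist.keys() for dist in model))


if __name__ == "__main__":
    # Illustrative likelihoods for three experiments under one hypothesis.
    model_h = [
        {"o11": 0.2, "o12": 0.8},
        {"o21": 0.5, "o22": 0.3, "o23": 0.2},
        {"o31": 0.9, "o32": 0.1},
    ]
    print(sequence_likelihood(model_h, ("o12", "o23", "o31")))
    total = sum(sequence_likelihood(model_h, e)
                for e in all_outcome_sequences(model_h))
    print("sums to 1:", isclose(total, 1.0))
```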

In scientific contexts the evidence can almost always be divided into parts that satisfy both clauses of the Independent Evidence Condition with respect to each alternative hypothesis. To see why, let us consider each independence condition more carefully.

Condition-independence says that the mere addition of a new observation condition \(c_{k+1}\), without specifying one of its outcomes , does not alter the likelihood of the outcomes \(e^k\) of other experiments \(c^k\). To appreciate the significance of this condition, imagine what it would be like if it were violated. Suppose hypothesis \(h_j\) is some statistical theory, say, for example, a quantum theory of superconductivity. The conditions expressed in \(c^k\) describe a number of experimental setups, perhaps conducted in numerous labs throughout the world, that test a variety of aspects of the theory (e.g., experiments that test electrical conductivity in different materials at a range of temperatures). An outcome sequence \(e^k\) describes the results of these experiments. The violation of condition-independence would mean that merely adding to \(h_{j}\cdot b\cdot c^{k}\) a statement \(c_{k+1}\) describing how an additional experiment has been set up, but with no mention of its outcome, changes how likely the evidence sequence \(e^k\) is taken to be. What \((h_j\cdot b)\) says via likelihoods about the outcomes \(e^k\) of experiments \(c^k\) differs as a result of merely supplying a description of another experimental arrangement, \(c_{k+1}\). Condition-independence , when it holds, rules out such strange effects.

Result-independence says that the description of previous test conditions together with their outcomes is irrelevant to the likelihoods of outcomes for additional experiments. If this condition were widely violated, then in order to specify the most informed likelihoods for a given hypothesis one would need to include information about volumes of past observations and their outcomes. What a hypothesis says about future cases would depend on how past cases have gone. Such dependence had better not happen on a large scale. Otherwise, the hypothesis would be fairly useless, since its empirical import in each specific case would depend on taking into account volumes of past observational and experimental results. However, even if such dependencies occur, provided they are not too pervasive, result-independence can be accommodated rather easily by packaging each collection of result-dependent data together, treating it like a single extended experiment or observation. The result-independence condition will then be satisfied by letting each term ‘\(c_k\)’ in the statement of the independence condition represent a conjunction of test conditions for a collection of result-dependent tests, and by letting each term ‘\(e_k\)’ (and each term ‘\(o_{ku}\)’) stand for a conjunction of the corresponding result-dependent outcomes. Thus, by packaging result-dependent data together in this way, the result-independence condition is satisfied by those (conjunctive) statements that describe the separate, result-independent chunks. [ 14 ]

The version of the Likelihood Ratio Convergence Theorem we will examine depends only on the Independent Evidence Conditions (together with the axioms of probability theory). It draws on no other assumptions. Indeed, an even more general version of the theorem can be established, a version that draws on neither of the Independent Evidence Conditions . However, the Independent Evidence Conditions will be satisfied in almost all scientific contexts, so little will be lost by assuming them. (And the presentation will run more smoothly if we side-step the added complications needed to explain the more general result.)

From this point on, let us assume that the following versions of the Independent Evidence Conditions hold.

Assumption: Independent Evidence Assumptions . For each hypothesis h and background b under consideration, we assume that the experiments and observations can be packaged into condition statements, \(c_1 ,\ldots ,c_k, c_{k+1},\ldots\), and possible outcomes in a way that satisfies the following conditions:

  • Each sequence of possible outcomes \(e^k\) of a sequence of conditions \(c^k\) is condition-independent of additional conditions \(c_{k+1}\)—i.e., \[P[e^k \pmid h\cdot b\cdot c^{k}\cdot c_{k+1}] = P[e^k \pmid h\cdot b\cdot c^k].\]
  • Each possible outcome \(e_k\) of condition \(c_k\) is result-independent of sequences of other observations and possible outcomes \((c^{k-1}\cdot e^{k-1})\)—i.e., \[P[e_k \pmid h\cdot b\cdot c_k\cdot(c^{k-1}\cdot e^{k-1})] = P[e_k \pmid h\cdot b\cdot c_k].\]

We now have all that is needed to begin to state the Likelihood Ratio Convergence Theorem .

The Likelihood Ratio Convergence Theorem comes in two parts. The first part applies only to those experiments or observations \(c_k\) within the total evidence stream \(c^n\) for which some of the possible outcomes have 0 likelihood of occurring according to hypothesis \(h_j\) but have non-0 likelihood of occurring according to \(h_i\). Such outcomes are highly desirable. If they occur, the likelihood ratio comparing \(h_j\) to \(h_i\) will become 0, and \(h_j\) will be falsified . So-called crucial experiments are a special case of this, where for at least one possible outcome \(o_{ku}\), \(P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] = 1\) and \(P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}] = 0\). In the more general case \(h_i\) together with b says that one of the outcomes of \(c_k\) is at least minimally probable, whereas \(h_j\) says that this outcome is impossible—i.e., \(P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] \gt 0\) and \(P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}] = 0\). It will be convenient to define a term for this situation.

Definition: Full Outcome Compatibility. Let’s call \(h_j\) fully outcome-compatible with \(h_i\) on experiment or observation \(c_k\) just when, for each of its possible outcomes \(e_k\), if \(P[e_k \pmid h_{i}\cdot b\cdot c_{k}] \gt 0\), then \(P[e_k \pmid h_{j}\cdot b\cdot c_{k}] \gt 0\). Equivalently, \(h_j\) fails to be fully outcome-compatible with \(h_i\) on experiment or observation \(c_k\) just when, for at least one of its possible outcomes \(e_k\), \(P[e_k \pmid h_{i}\cdot b\cdot c_{k}] \gt 0\) but \(P[e_k \pmid h_{j}\cdot b\cdot c_{k}] = 0\).
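
The definition translates directly into a check on the two hypotheses' likelihoods for a single experiment. The following Python sketch is only an illustration: the dictionary representation, the function names, and the numbers in the example are assumptions introduced here. The second function computes the likelihood, according to \(h_i\), of getting some outcome that \(h_j\) rules out.

```python
def fully_outcome_compatible(p_i, p_j):
    """True when h_j is fully outcome-compatible with h_i on one experiment.

    p_i, p_j map each possible outcome of c_k to its likelihood on
    h_i.b.c_k and h_j.b.c_k respectively.
    """
    return all(p_j.get(o, 0.0) > 0 for o, p in p_i.items() if p > 0)


def falsifying_likelihood(p_i, p_j):
    """Likelihood on h_i of an outcome to which h_j assigns likelihood 0."""
    return sum(p for o, p in p_i.items() if p > 0 and p_j.get(o, 0.0) == 0)


if __name__ == "__main__":
    # Illustrative numbers: h_i allows a rare event that h_j forbids.
    p_i = {"event": 0.001, "no event": 0.999}
    p_j = {"event": 0.0, "no event": 1.0}
    print(fully_outcome_compatible(p_i, p_j))  # False
    print(falsifying_likelihood(p_i, p_j))
```

Note that the relation is not symmetric: in the example \(h_i\) assigns positive likelihood to every outcome, so \(h_i\) is fully outcome-compatible with \(h_j\) even though the converse fails.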

The first part of the Likelihood Ratio Convergence Theorem applies to that part of the total stream of evidence (i.e., that subsequence of the total evidence stream) on which hypothesis \(h_j\) fails to be fully outcome-compatible with hypothesis \(h_i\); the second part of the theorem applies to the remaining part of the total stream of evidence, that subsequence of the total evidence stream on which \(h_j\) is fully outcome-compatible with \(h_i\). It turns out that these two kinds of cases must be treated differently. (This is due to the way in which the expected information content for empirically distinguishing between the two hypotheses will be measured for experiments and observations that are fully outcome compatible; this measure of information content blows up (becomes infinite) for experiments and observations that fail to be fully outcome compatible.) Thus, the following part of the convergence theorem applies to just that part of the total stream of evidence that consists of experiments and observations that fail to be fully outcome compatible for the pair of hypotheses involved. Here, then, is the first part of the convergence theorem.

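Likelihood Ratio Convergence Theorem 1 (The Falsification Theorem): Suppose that the total stream of evidence \(c^n\) contains precisely m experiments or observations on which \(h_j\) fails to be fully outcome-compatible with \(h_i\). And suppose that the Independent Evidence Conditions hold for evidence stream \(c^n\) with respect to each of these two hypotheses. Furthermore, suppose there is a lower bound \(\delta \gt 0\) such that for each \(c_k\) on which \(h_j\) fails to be fully outcome-compatible with \(h_i\),

\[ P[\vee \{ o_{ku}: P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}] = 0\} \pmid h_{i}\cdot b\cdot c_{k}] \ge \delta; \]

that is, \(h_i\) together with \(b\cdot c_k\) says, with likelihood at least as large as \(\delta\), that one of the outcomes will occur that \(h_j\) says cannot occur. Then,

\[ \begin{aligned} & P[\vee \{ e^n: P[e^n \pmid h_{j}\cdot b\cdot c^{n}] / P[e^n \pmid h_{i}\cdot b\cdot c^{n}] = 0\} \pmid h_{i}\cdot b\cdot c^{n}] \\ &\quad = P[\vee \{ e^n: P[e^n \pmid h_{j}\cdot b\cdot c^{n}] = 0\} \pmid h_{i}\cdot b\cdot c^{n}] \\ &\quad \ge 1-(1-\delta)^m, \end{aligned} \]

which approaches 1 for large m. (For proof see Proof of the Falsification Theorem.)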

In other words, we only suppose that for each of m observations \(c_k\), \(h_i\) says that \(c_k\) has at least a small likelihood \(\delta\) of producing one of the outcomes \(o_{ku}\) that \(h_j\) says is impossible. If the number m of such experiments or observations is large enough (or if the lower bound \(\delta\) on the likelihoods of getting such outcomes is large enough), and if \(h_i\) (together with \(b\cdot c^n\)) is true, then it is highly likely that one of the outcomes held to be impossible by \(h_j\) will actually occur. If one of these outcomes does occur, then the likelihood ratio for \(h_j\) over \(h_i\) will become 0. According to Bayes’ Theorem, when this happens, \(h_j\) is absolutely refuted by the evidence: its posterior probability becomes 0.

The Falsification Theorem is quite commonsensical. First, notice that if there is a crucial experiment in the evidence stream, the theorem is completely obvious. That is, suppose for the specific experiment \(c_k\) (in evidence stream \(c^n)\) there are two incompatible possible outcomes \(o_{kv}\) and \(o_{ku}\) such that \(P[o_{kv} \pmid h_{j}\cdot b\cdot c_{k}] = 1\) and \(P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] = 1\). Then, clearly, \(P[\vee \{ o_{ku}: P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}] = 0\} \pmid h_{i}\cdot b\cdot c_{k}] = 1\), since \(o_{ku}\) is one of the \(o_{ku}\) such that \(P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}] = 0\). So, where a crucial experiment is available, the theorem applies with \(m = 1\) and \(\delta = 1\).

The theorem is equally commonsensical for cases where no crucial experiment is available. To see what it says in such cases, consider an example. Let \(h_i\) be some theory that implies a specific rate of proton decay, but a rate so low that there is only a very small probability that any particular proton will decay in a given year. Consider an alternative theory \(h_j\) that implies that protons never decay. If \(h_i\) is true, then for a persistent enough sequence of observations (i.e., if proper detectors can keep trillions of protons under observation for long enough), eventually a proton decay will almost surely be detected. When this happens, the likelihood ratio becomes 0. Thus, the posterior probability of \(h_j\) becomes 0.

It is instructive to plug some specific values into the formula given by the Falsification Theorem, to see what the convergence rate might look like. For example, the theorem tells us that if we compare any pair of hypotheses \(h_i\) and \(h_j\) on an evidence stream \(c^n\) that contains at least \(m = 19\) observations or experiments, where each has a likelihood \(\delta \ge .10\) of yielding a falsifying outcome, then the likelihood (on \(h_{i}\cdot b\cdot c^{n}\)) of obtaining an outcome sequence \(e^n\) that yields likelihood-ratio

\[ \frac{P[e^n \pmid h_{j}\cdot b\cdot c^{n}]}{P[e^n \pmid h_{i}\cdot b\cdot c^{n}]} = 0 \]

will be at least as large as \((1 - (1-.1)^{19}) = .865\). (The reader is invited to try other values of \(\delta\) and m.)
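
For readers who want to try other values, a short Python sketch of the bound \(1-(1-\delta)^m\) follows. The function names are assumptions made for this illustration; the second function simply searches for the smallest m at which the bound reaches a chosen target.

```python
def falsification_lower_bound(delta, m):
    """Lower bound 1 - (1 - delta)^m on the likelihood of a falsifying outcome."""
    if not 0 < delta <= 1:
        raise ValueError("delta must lie in (0, 1]")
    if m < 0:
        raise ValueError("m must be non-negative")
    return 1.0 - (1.0 - delta) ** m


def observations_needed(delta, target):
    """Smallest m for which the lower bound is at least `target` (0 <= target < 1)."""
    if not 0 <= target < 1:
        raise ValueError("target must lie in [0, 1)")
    m = 0
    while falsification_lower_bound(delta, m) < target:
        m += 1
    return m


if __name__ == "__main__":
    print(f"{falsification_lower_bound(0.10, 19):.3f}")  # the example in the text
    for delta in (0.01, 0.05, 0.10, 0.50):
        m = observations_needed(delta, 0.95)
        print(f"delta = {delta}: bound reaches 0.95 at m = {m}")
```

The simple loop is adequate for moderate values of \(\delta\); for extremely small \(\delta\) a closed-form calculation with logarithms would be faster.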

A comment about the need for and usefulness of such convergence theorems is in order, now that we’ve seen one. Given some specific pair of scientific hypotheses \(h_i\) and \(h_j\) one may directly compute the likelihood, given \((h_{i}\cdot b\cdot c^{n})\), that a proposed sequence of experiments or observations \(c^n\) will result in one of the sequences of outcomes that would yield low likelihood ratios. So, given a specific pair of hypotheses and a proposed sequence of experiments, we don’t need a general Convergence Theorem to tell us the likelihood of obtaining refuting evidence. The specific hypotheses \(h_i\) and \(h_j\) tell us this themselves . They tell us the likelihood of obtaining each specific outcome stream, including those that either refute the competitor or produce a very small likelihood ratio for it. Furthermore, after we’ve actually performed an experiment and recorded its outcome, all that matters is the actual ratio of likelihoods for that outcome. Convergence theorems become moot.

The point of the Likelihood Ratio Convergence Theorem (both the Falsification Theorem and the part of the theorem still to come) is to assure us in advance of considering any specific pair of hypotheses that if the possible evidence streams that test hypotheses have certain characteristics which reflect the empirical distinctness of the two hypotheses, then it is highly likely that one of the sequences of outcomes will occur that yields a very small likelihood ratio. These theorems provide finite lower bounds on how quickly such convergence is likely to occur. Thus, they show that the CoA is satisfied in advance of our using the logic to test specific pairs of hypotheses against one another.

The Falsification Theorem applies whenever the evidence stream includes possible outcomes that may falsify the alternative hypothesis. However, it completely ignores the influence of any experiments or observations in the evidence stream on which hypothesis \(h_j\) is fully outcome-compatible with hypothesis \(h_i\). We now turn to a theorem that applies to those evidence streams (or to parts of evidence streams) consisting only of experiments and observations on which hypothesis \(h_j\) is fully outcome-compatible with hypothesis \(h_i\). Evidence streams of this kind contain no possibly falsifying outcomes. In such cases the only outcomes of an experiment or observation \(c_k\) for which hypothesis \(h_j\) may specify 0 likelihoods are those for which hypothesis \(h_i\) specifies 0 likelihoods as well.

Hypotheses whose connection with the evidence is entirely statistical in nature will usually be fully outcome-compatible on the entire evidence stream. So, evidence streams of this kind are undoubtedly much more common in practice than those containing possibly falsifying outcomes. Furthermore, whenever an entire stream of evidence contains some mixture of experiments and observations on which the hypotheses are not fully outcome compatible along with others on which they are fully outcome compatible , we may treat the experiments and observations for which full outcome compatibility holds as a separate subsequence of the entire evidence stream, to see the likely impact of that part of the evidence in producing values for likelihood ratios.

To cover evidence streams (or subsequences of evidence streams) consisting entirely of experiments or observations on which \(h_j\) is fully outcome-compatible with hypothesis \(h_i\) we will first need to identify a useful way to measure the degree to which hypotheses are empirically distinct from one another on such evidence. Consider some particular sequence of outcomes \(e^n\) that results from observations \(c^n\). The likelihood ratio \(P[e^n \pmid h_{j}\cdot b\cdot c^{n}] / P[e^n \pmid h_{i}\cdot b\cdot c^{n}]\) itself measures the extent to which the outcome sequence distinguishes between \(h_i\) and \(h_j\). But as a measure of the power of evidence to distinguish among hypotheses, raw likelihood ratios provide a rather lopsided scale, a scale that ranges from 0 to infinity with the midpoint, where \(e^n\) doesn’t distinguish at all between \(h_i\) and \(h_j\), at 1. So, rather than using raw likelihood ratios to measure the ability of \(e^n\) to distinguish between hypotheses, it proves more useful to employ a symmetric measure. The logarithm of the likelihood ratio provides such a measure.

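Definition: QI (the Quality of the Information). For each experiment or observation \(c_k\), define the quality of the information provided by possible outcome \(o_{ku}\) for distinguishing \(h_j\) from \(h_i\), given b, as follows (where henceforth we take “logs” to be base-2):

\[ \QI[o_{ku} \pmid h_i /h_j \pmid b\cdot c_k] = \log\left[\frac{P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}]}{P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}]}\right]. \]

Similarly, for the sequence of experiments or observations \(c^n\), define the quality of the information provided by possible outcome \(e^n\) for distinguishing \(h_j\) from \(h_i\), given b, as follows:

\[ \QI[e^n \pmid h_i /h_j \pmid b\cdot c^n] = \log\left[\frac{P[e^n \pmid h_{i}\cdot b\cdot c^{n}]}{P[e^n \pmid h_{j}\cdot b\cdot c^{n}]}\right]. \]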

That is, QI is the base-2 logarithm of the likelihood ratio for \(h_i\) over that for \(h_j\).

So, we’ll measure the Quality of the Information an outcome would yield in distinguishing between two hypotheses as the base-2 logarithm of the likelihood ratio. This is clearly a symmetric measure of the outcome’s evidential strength at distinguishing between the two hypotheses. On this measure hypotheses \(h_i\) and \(h_j\) assign the same likelihood value to a given outcome \(o_{ku}\) just when \(\QI[o_{ku} \pmid h_i /h_j \pmid b\cdot c_k] = 0\). Thus, QI measures information on a logarithmic scale that is symmetric about the natural no-information midpoint, 0. This measure is set up so that positive information favors \(h_i\) over \(h_j\), and negative information favors \(h_j\) over \(h_i\).
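As a small numerical illustration of this scale, the following Python sketch computes QI for single outcomes. The function name `qi` and the coin-toss likelihood values are hypothetical, chosen only for illustration; they do not come from the text.

```python
import math


def qi(p_i: float, p_j: float) -> float:
    """Base-2 log of the likelihood ratio P[o | h_i.b.c] / P[o | h_j.b.c].

    Assumes both likelihoods are strictly positive.
    """
    return math.log2(p_i / p_j)


# Hypothetical likelihoods for one toss: h_i gives heads chance 0.5, h_j gives 0.25.
qi_heads = qi(0.5, 0.25)  # positive: heads favors h_i over h_j
qi_tails = qi(0.5, 0.75)  # negative: tails favors h_j over h_i
qi_equal = qi(0.3, 0.3)   # zero: the outcome does not distinguish the hypotheses

print(qi_heads, qi_tails, qi_equal)
```

Swapping the roles of the two hypotheses flips the sign of QI, which is the symmetry about 0 that the raw likelihood ratio lacks.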

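Given the Independent Evidence Assumptions with respect to each hypothesis, it’s easy to show that the QI for a sequence of outcomes is just the sum of the QIs of the individual outcomes in the sequence:

\[\QI[e^n \pmid h_i /h_j \pmid b\cdot c^n] = \sum_{k=1}^{n} \QI[e_k \pmid h_i /h_j \pmid b\cdot c_k].\]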

Probability theorists measure the expected value of a quantity by first multiplying each of its possible values by their probabilities of occurring, and then summing these products. Thus, the expected value of QI is given by the following formula:

Definition: EQI, the Expected Quality of the Information. We adopt the convention that if \(P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] = 0\), then the term \(\QI[o_{ku} \pmid h_i /h_j \pmid b\cdot c_k] \times P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] = 0\). This convention will make good sense in the context of the following definition because, whenever the outcome \(o_{ku}\) has 0 probability of occurring according to \(h_i\) (together with \(b\cdot c_k)\), it makes good sense to give it 0 impact on the ability of the evidence to distinguish between \(h_j\) and \(h_i\) when \(h_i\) (together with \(b\cdot c_k)\) is true. Also notice that the full outcome-compatibility of \(h_j\) with \(h_i\) on \(c_k\) means that whenever \(P[e_k \pmid h_{j}\cdot b\cdot c_{k}] = 0\), we must have \(P[e_k \pmid h_{i}\cdot b\cdot c_{k}] = 0\) as well; so whenever the denominator would be 0 in the term

\[\QI[o_{ku} \pmid h_i /h_j \pmid b\cdot c_k] \times P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] = \log \frac{P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}]}{P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}]} \times P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}],\]

the convention just described makes the term

\[\QI[o_{ku} \pmid h_i /h_j \pmid b\cdot c_k] \times P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] = 0.\]

Thus the following notion is well-defined:

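For \(h_j\) fully outcome-compatible with \(h_i\) on experiment or observation \(c_k\), define

\[\EQI[c_k \pmid h_i /h_j \pmid b] = \sum_{u} \QI[o_{ku} \pmid h_i /h_j \pmid b\cdot c_k] \times P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}].\]

Also, for \(h_j\) fully outcome-compatible with \(h_i\) on each experiment and observation in the sequence \(c^n\), define

\[\EQI[c^n \pmid h_i /h_j \pmid b] = \sum_{\{e^n\}} \QI[e^n \pmid h_i /h_j \pmid b\cdot c^n] \times P[e^n \pmid h_{i}\cdot b\cdot c^{n}].\]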

The EQI of an experiment or observation is the Expected Quality of its Information for distinguishing \(h_i\) from \(h_j\) when \(h_i\) is true. It is a measure of the expected evidential strength of the possible outcomes of an experiment or observation at distinguishing between the hypotheses when \(h_i\) (together with \(b\cdot c)\) is true. Whereas QI measures the ability of each particular outcome or sequence of outcomes to empirically distinguish hypotheses, EQI measures the tendency of experiments or observations to produce distinguishing outcomes. It can be shown that EQI tracks empirical distinctness in a very precise way. We return to this in a moment.
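The expectation just described can be sketched in a few lines of Python. This is a minimal illustration, not the author's own procedure: the function name `eqi` and the two outcome distributions are hypothetical. Mathematically, the quantity computed is the Kullback-Leibler divergence (in bits) of the \(h_j\) likelihoods from the \(h_i\) likelihoods.

```python
import math


def eqi(p_i, p_j):
    """Expected QI of one experiment, with the expectation taken under h_i.

    p_i[u] and p_j[u] are the likelihoods of outcome u according to h_i and h_j.
    Assumes h_j is fully outcome-compatible with h_i: p_j[u] > 0 whenever p_i[u] > 0.
    """
    total = 0.0
    for pi, pj in zip(p_i, p_j):
        if pi == 0.0:
            continue  # convention from the text: such terms contribute 0
        total += math.log2(pi / pj) * pi
    return total


# Hypothetical coin-toss likelihoods: outcomes are (heads, tails).
p_i = [0.5, 0.5]
p_j = [0.25, 0.75]
value = eqi(p_i, p_j)
print(value)
```

Although tails has negative QI in this example, the probability-weighted sum is positive, since heads contributes more than tails takes away.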

It is easily seen that the EQI for a sequence of observations \(c^n\) is just the sum of the EQIs of the individual observations \(c_k\) in the sequence:

\[\EQI[c^n \pmid h_i /h_j \pmid b] = \sum_{k=1}^{n} \EQI[c_k \pmid h_i /h_j \pmid b].\]

(For proof see the supplement Proof that the EQI for \(c^n\) is the sum of the EQI for the individual \(c_k\).)
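The additivity claim can be checked numerically on a small case. The sketch below is only an illustration under stated assumptions: the two experiments and their likelihood values are hypothetical, and independence is imposed by multiplying the per-experiment likelihoods to obtain the likelihood of each outcome sequence.

```python
import itertools
import math


def eqi(p_i, p_j):
    """Expected QI of a single experiment (zero-probability outcomes contribute 0)."""
    return sum(math.log2(pi / pj) * pi for pi, pj in zip(p_i, p_j) if pi > 0.0)


def eqi_sequence(experiments):
    """EQI over whole outcome sequences, assuming independent experiments."""
    total = 0.0
    index_ranges = [range(len(p_i)) for p_i, _ in experiments]
    for outcome in itertools.product(*index_ranges):
        # Likelihood of the whole sequence under each hypothesis (independence).
        prob_i = math.prod(experiments[k][0][u] for k, u in enumerate(outcome))
        prob_j = math.prod(experiments[k][1][u] for k, u in enumerate(outcome))
        if prob_i > 0.0:
            total += math.log2(prob_i / prob_j) * prob_i
    return total


# Hypothetical experiments: (likelihoods under h_i, likelihoods under h_j).
experiments = [
    ([0.5, 0.5], [0.25, 0.75]),
    ([0.2, 0.3, 0.5], [0.3, 0.3, 0.4]),
]

joint = eqi_sequence(experiments)
summed = sum(eqi(p_i, p_j) for p_i, p_j in experiments)
average = summed / len(experiments)  # the per-observation average of the EQIs
print(joint, summed, average)
```

The brute-force value over all outcome sequences and the sum of the per-experiment values agree up to floating-point rounding.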

This suggests that it may be useful to average the values of the \(\EQI[c_k \pmid h_i /h_j \pmid b]\) over the number of observations n to obtain a measure of the average expected quality of the information among the experiments and observations that make up the evidence stream \(c^n\).

Definition: The Average Expected Quality of Information. For \(h_j\) fully outcome-compatible with \(h_i\) on each experiment and observation in the evidence stream \(c^n\), define the average expected quality of information, \(\bEQI\), from \(c^n\) for distinguishing \(h_j\) from \(h_i\), given \(h_i\cdot b\), as follows:

\[\bEQI[c^n \pmid h_i /h_j \pmid b] = \frac{1}{n} \sum_{k=1}^{n} \EQI[c_k \pmid h_i /h_j \pmid b].\]

It turns out that the value of \(\EQI[c_k \pmid h_i /h_j \pmid b]\) cannot be less than 0; and it must be greater than 0 just in case \(h_i\) is empirically distinct from \(h_j\) on at least one outcome \(o_{ku}\), in the sense that \(P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] \ne P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}]\). The same goes for the average, \(\bEQI[c^n \pmid h_i /h_j \pmid b]\).

Theorem: Nonnegativity of EQI.

\(\EQI[c_k \pmid h_i /h_j \pmid b] \ge 0\); and \(\EQI[c_k \pmid h_i /h_j \pmid b] \gt 0\) if and only if for at least one of its possible outcomes \(o_{ku}\),

\[P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] \ne P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}].\]

As a result, \(\bEQI[c^n \pmid h_i /h_j \pmid b] \ge 0\); and \(\bEQI[c^n \pmid h_i /h_j \pmid b] \gt 0\) if and only if at least one experiment or observation \(c_k\) has at least one possible outcome \(o_{ku}\) such that

\[P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] \ne P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}].\]

(For proof, see the supplement The Effect on EQI of Partitioning the Outcome Space More Finely—Including Proof of the Nonnegativity of EQI .)

In fact, the more finely one partitions the outcome space \(O_{k} = \{o_{k1},\ldots ,o_{kv},\ldots ,o_{kw}\}\) into distinct outcomes that differ on likelihood ratio values, the larger EQI becomes. [ 15 ] This shows that EQI tracks empirical distinctness in a precise way. The importance of the Nonnegativity of EQI result for the Likelihood Ratio Convergence Theorem will become clear in a moment.
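Both the nonnegativity result and the effect of partitioning can be illustrated on a toy outcome space. In this sketch the helper names `eqi` and `lump` and all likelihood values are hypothetical; it checks a single example and proves nothing in general.

```python
import math


def eqi(p_i, p_j):
    """Expected QI of one experiment (zero-probability outcomes contribute 0)."""
    return sum(math.log2(pi / pj) * pi for pi, pj in zip(p_i, p_j) if pi > 0.0)


def lump(dist, u, v):
    """Merge outcomes u and v into one coarser outcome, keeping the rest."""
    merged = dist[u] + dist[v]
    rest = [p for k, p in enumerate(dist) if k not in (u, v)]
    return [merged] + rest


# Hypothetical likelihoods over three outcomes; outcomes 0 and 1 have
# different likelihood ratios, so merging them discards information.
p_i = [0.2, 0.3, 0.5]
p_j = [0.3, 0.3, 0.4]

fine = eqi(p_i, p_j)                            # three-outcome partition
coarse = eqi(lump(p_i, 0, 1), lump(p_j, 0, 1))  # two-outcome partition
identical = eqi(p_i, p_i)                       # no empirical distinctness

print(fine, coarse, identical)
```

Here the finer partition yields the larger EQI, the coarser one is still positive, and EQI is exactly 0 when the two hypotheses assign the same likelihood to every outcome.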

We are now in a position to state the second part of the Likelihood Ratio Convergence Theorem . It applies to all evidence streams not containing possibly falsifying outcomes for \(h_j\) when \(h_i\) holds—i.e., it applies to all evidence streams for which \(h_j\) is fully outcome-compatible with \(h_i\) on each \(c_k\) in the stream.

Likelihood Ratio Convergence Theorem 2—The Probabilistic Refutation Theorem.

Suppose the evidence stream \(c^n\) contains only experiments or observations on which \(h_j\) is fully outcome-compatible with \(h_i\); that is, suppose that for each condition \(c_k\) in sequence \(c^n\), for each of its possible outcomes \(o_{ku}\), either \(P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] = 0\) or \(P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}] \gt 0\). In addition (as a slight strengthening of the previous supposition), for some \(\gamma \gt 0\) a number smaller than \(1/e^2\) (\(\approx .135\); where ‘\(e\)’ is the base of the natural logarithm), suppose that for each possible outcome \(o_{ku}\) of each observation condition \(c_k\) in \(c^n\), either \(P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] = 0\) or

\[\frac{P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}]}{P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}]} \ge \gamma.\]

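And suppose that the Independent Evidence Conditions hold for evidence stream \(c^n\) with respect to each of these hypotheses. Now, choose any positive \(\varepsilon \lt 1\), as small as you like, but large enough (for the number of observations \(n\) being contemplated) that the value of

\[\bEQI[c^n \pmid h_i /h_j \pmid b] \gt -\frac{\log \varepsilon}{n}.\]

Then:

\[P\left[\vee \left\{e^n : \frac{P[e^n \pmid h_{j}\cdot b\cdot c^{n}]}{P[e^n \pmid h_{i}\cdot b\cdot c^{n}]} \lt \varepsilon \right\} \pmid h_{i}\cdot b\cdot c^{n}\right] \gt 1 - \frac{1}{n}\, \frac{(\log \gamma)^2}{\left(\bEQI[c^n \pmid h_i /h_j \pmid b] + \frac{\log \varepsilon}{n}\right)^2}.\]

For \(\varepsilon = 1/2^m\) and \(\gamma = 1/2^q\), this formula becomes,

\[P\left[\vee \left\{e^n : \frac{P[e^n \pmid h_{j}\cdot b\cdot c^{n}]}{P[e^n \pmid h_{i}\cdot b\cdot c^{n}]} \lt \frac{1}{2^m} \right\} \pmid h_{i}\cdot b\cdot c^{n}\right] \gt 1 - \frac{1}{n}\, \frac{q^2}{\left(\bEQI[c^n \pmid h_i /h_j \pmid b] - \frac{m}{n}\right)^2}.\]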

(For proof see the supplement Proof of the Probabilistic Refutation Theorem .)

This theorem provides sufficient conditions for the likely refutation of false alternatives via exceedingly small likelihood ratios. The conditions under which this happens characterize the degree to which the hypotheses involved are empirically distinct from one another. The theorem says that when these conditions are met, according to hypothesis \(h_i\) (taken together with \(b\cdot c^n)\), the likelihood is near 1 that one of the outcome sequences \(e^n\) will occur for which the likelihood ratio is smaller than \(\varepsilon\) (for any value of \(\varepsilon\) you may choose). The likelihood of getting such an evidential outcome \(e^n\) is quite close to 1; that is, it is no more than the amount

\[\frac{1}{n}\, \frac{(\log \gamma)^2}{\left(\bEQI[c^n \pmid h_i /h_j \pmid b] + \frac{\log \varepsilon}{n}\right)^2}\]

below 1. (Notice that this amount below 1 goes to 0 as \(n\) increases.)
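To see how the amount below 1 shrinks as \(n\) grows, here is a rough Python sketch. It assumes the lower bound has the Chebyshev-style form \(1 - \frac{1}{n}(\log \gamma)^2 / (\bEQI + (\log \varepsilon)/n)^2\) with base-2 logs; that form, the function name `refutation_lower_bound`, and every numerical value below are assumptions made for illustration only.

```python
import math


def refutation_lower_bound(n, avg_eqi, epsilon, gamma):
    """Assumed lower bound on the likelihood, according to h_i, of obtaining
    an outcome sequence whose likelihood ratio is below epsilon.

    avg_eqi is the average expected quality of information per observation.
    """
    margin = avg_eqi + math.log2(epsilon) / n
    if margin <= 0:
        # The theorem's antecedent requires avg_eqi > -(log epsilon) / n.
        raise ValueError("n is too small for this epsilon and avg_eqi")
    return 1.0 - (math.log2(gamma) ** 2) / (n * margin ** 2)


# Hypothetical values: epsilon = 1/2^10, gamma = 1/2^3 (below 1/e^2), avg_eqi = 0.2.
epsilon, gamma, avg_eqi = 2.0 ** -10, 2.0 ** -3, 0.2

bound_1k = refutation_lower_bound(1_000, avg_eqi, epsilon, gamma)
bound_10k = refutation_lower_bound(10_000, avg_eqi, epsilon, gamma)
print(bound_1k, bound_10k)
```

For small \(n\) a bound of this form can be negative and so uninformative; it becomes informative only once \(n\) is large relative to \((\log \gamma)^2\) and the squared margin.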

It turns out that in almost every case (for almost any pair of hypotheses) the actual likelihood of obtaining such evidence (i.e., evidence that has a likelihood ratio value less than \(\varepsilon)\) will be much closer to 1 than this factor indicates. [ 16 ] Thus, the theorem provides an overly cautious lower bound on the likelihood of obtaining small likelihood ratios. It shows that the larger the value of \(\bEQI\) for an evidence stream, the more likely that stream is to produce a sequence of outcomes that yield a very small likelihood ratio value. But even if \(\bEQI\) remains quite small, a long enough evidence stream, n , of such low-grade evidence will, nevertheless, almost surely produce an outcome sequence having a very small likelihood ratio value. [ 17 ]

Notice that the antecedent condition of the theorem, that “either \(P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}] = 0\) or

\[\frac{P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}]}{P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}]} \ge \gamma,\]

for some \(\gamma \gt 0\) but less than \(1/e^2\) (\(\approx .135\))”, does not favor hypothesis \(h_i\) over \(h_j\) in any way. The condition only rules out the possibility that some outcomes might furnish extremely strong evidence against \(h_j\) relative to \(h_i\), by making \(P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}] = 0\) or by making

\[\frac{P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}]}{P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}]}\]

less than some quite small \(\gamma\). This condition is only needed because our measure of evidential distinguishability, QI, blows up when the ratio

\[\frac{P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}]}{P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}]}\]

is extremely small. Furthermore, this condition is really no restriction at all on possible experiments or observations. If \(c_k\) has some possible outcome sentence \(o_{ku}\) that would make

\[\frac{P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}]}{P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}]} \lt \gamma\]

(for a given small \(\gamma\) of interest), one may disjunctively lump \(o_{ku}\) together with some other outcome sentence \(o_{kv}\) for \(c_k\). Then, the antecedent condition of the theorem will be satisfied, but with the sentence ‘\((o_{ku} \vee o_{kv})\)’ treated as a single outcome. It can be proved that the only effect of such “disjunctive lumping” is to make \(\bEQI\) smaller than it would otherwise be (whereas larger values of \(\bEQI\) are more desirable). If the too strongly refuting disjunct \(o_{ku}\) actually occurs when the experiment or observation \(c_k\) is conducted, all the better, since this results in a likelihood ratio

\[\frac{P[o_{ku} \pmid h_{j}\cdot b\cdot c_{k}]}{P[o_{ku} \pmid h_{i}\cdot b\cdot c_{k}]}\]

smaller than \(\gamma\) on that particular evidential outcome. We merely failed to take this more strongly refuting possibility into account when computing our lower bound on the likelihood that refutation via likelihood ratios would occur.
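The lumping manoeuvre can be made concrete with a short Python sketch. The helper names `satisfies_gamma_condition` and `lump`, the choice \(\gamma = 0.125\), and the likelihood values are all hypothetical; the point is only to show one outcome that violates the ratio condition and how merging it with another outcome restores the condition.

```python
def satisfies_gamma_condition(p_i, p_j, gamma):
    """True if, for every outcome, either P_i = 0 or P_j / P_i >= gamma."""
    return all(pi == 0.0 or pj / pi >= gamma for pi, pj in zip(p_i, p_j))


def lump(dist, u, v):
    """Treat the disjunction of outcomes u and v as a single outcome."""
    merged = dist[u] + dist[v]
    rest = [p for k, p in enumerate(dist) if k not in (u, v)]
    return [merged] + rest


gamma = 0.125  # hypothetical choice, below 1/e^2

# Hypothetical likelihoods: outcome 0 has ratio 0.01 / 0.4 = 0.025 < gamma.
p_i = [0.4, 0.1, 0.5]
p_j = [0.01, 0.39, 0.6]

before = satisfies_gamma_condition(p_i, p_j, gamma)
after = satisfies_gamma_condition(lump(p_i, 0, 1), lump(p_j, 0, 1), gamma)
print(before, after)
```

After lumping, the merged outcome has a likelihood ratio of roughly 0.8, so the antecedent condition holds for the coarser outcome space.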

The point of the two Convergence Theorems explored in this section is to assure us, in advance of the consideration of any specific pair of hypotheses, that if the possible evidence streams that test them have certain characteristics which reflect their evidential distinguishability, it is highly likely that outcomes yielding small likelihood ratios will result. These theorems provide finite lower bounds on how quickly convergence is likely to occur. Thus, there is no need to wait through some infinitely long run for convergence to occur. Indeed, for any evidence sequence on which the probability distributions are at all well behaved, the actual likelihood of obtaining outcomes that yield small likelihood ratio values will inevitably be much higher than the lower bounds given by Theorems 1 and 2.

In sum, according to Theorems 1 and 2, each hypothesis \(h_i\) says, via likelihoods, that given enough observations, it is very likely to dominate its empirically distinct rivals in a contest of likelihood ratios. The true hypothesis speaks truthfully about this, and its competitors lie. Even a sequence of observations with an extremely low average expected quality of information is very likely to do the job if that evidential sequence is long enough. Thus (by Equation 9*), as evidence accumulates, the degree of support for false hypotheses will very probably approach 0, indicating that they are probably false; and as this happens, (by Equations 10 and 11) the degree of support for the true hypothesis will approach 1, indicating its probable truth. Thus, the Criterion of Adequacy (CoA) is satisfied.

Up to this point we have been supposing that likelihoods possess objective or agreed numerical values. Although this supposition is often satisfied in scientific contexts, there are important settings where it is unrealistic, where hypotheses only support vague likelihood values, and where there is enough ambiguity in what hypotheses say about evidential claims that the scientific community cannot agree on precise values for the likelihoods of evidential claims. [ 18 ] Let us now see how the supposition of precise, agreed likelihood values may be relaxed in a reasonable way.

Recall why agreement, or near agreement, on precise values for likelihoods is so important to the scientific enterprise. To the extent that members of a scientific community disagree on the likelihoods, they disagree about the empirical content of their hypotheses, about what each hypothesis says about how the world is likely to be. This can lead to disagreement about which hypotheses are refuted or supported by a given body of evidence. Similarly, to the extent that the values of likelihoods are only vaguely implied by hypotheses as understood by an individual agent, that agent may be unable to determine which of several hypotheses is refuted or supported by a given body of evidence.

We have seen, however, that the individual values of likelihoods are not really crucial to the way evidence impacts hypotheses. Rather, as Equations 9–11 show, it is ratios of likelihoods that do the heavy lifting. So, even if two support functions \(P_{\alpha}\) and \(P_{\beta}\) disagree on the values of individual likelihoods, they may, nevertheless, largely agree on the refutation or support that accrues to various rival hypotheses, provided that the following condition, the Directional Agreement Condition, is satisfied:

  • whenever possible outcome sequence \(e^n\) makes \[\frac{P_{\alpha}[e^n \pmid h_{j}\cdot b\cdot c^{n}]}{P_{\alpha}[e^n \pmid h_{i}\cdot b\cdot c^{n}]} \lt 1,\] it also makes \[\frac{P_{\beta}[e^n \pmid h_{j}\cdot b\cdot c^{n}]}{P_{\beta}[e^n \pmid h_{i}\cdot b\cdot c^{n}]} \lt 1;\]
  • whenever possible outcome sequence \(e^n\) makes \[\frac{P_{\alpha}[e^n \pmid h_{j}\cdot b\cdot c^{n}]}{P_{\alpha}[e^n \pmid h_{i}\cdot b\cdot c^{n}]} \gt 1,\] it also makes \[\frac{P_{\beta}[e^n \pmid h_{j}\cdot b\cdot c^{n}]}{P_{\beta}[e^n \pmid h_{i}\cdot b\cdot c^{n}]} \gt 1;\]
  • each of these likelihood ratios is either close to 1 for both of these support functions, or is quite far from 1 for both of them. [ 19 ]

When this condition holds, the evidence will support \(h_i\) over \(h_j\) according to \(P_{\alpha}\) just in case it does so for \(P_{\beta}\) as well, although the strength of support may differ. Furthermore, although the rate at which the likelihood ratios increase or decrease on a stream of evidence may differ for the two support functions, the impact of the cumulative evidence should ultimately affect their refutation or support in much the same way.
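The first two clauses of the condition lend themselves to a mechanical check. The following Python sketch is a simplified illustration: the function name `directionally_agree` and the ratio values are hypothetical, each list holds the likelihood ratios (\(h_j\) over \(h_i\)) that one support function assigns to the same possible outcome sequences, and the third, deliberately vague clause about closeness to 1 is not modelled.

```python
def directionally_agree(ratios_alpha, ratios_beta):
    """Check clauses 1 and 2: whenever a ratio is below 1 (or above 1)
    for P_alpha, the matching ratio for P_beta lies on the same side of 1."""
    for r_a, r_b in zip(ratios_alpha, ratios_beta):
        if r_a < 1 and not r_b < 1:
            return False
        if r_a > 1 and not r_b > 1:
            return False
    return True


# Hypothetical likelihood ratios for two possible outcome sequences.
alpha = [0.5, 2.0]
beta_same_direction = [0.8, 1.5]   # different strengths, same directions
beta_conflicting = [1.2, 1.5]      # first sequence now favors h_j instead

print(directionally_agree(alpha, beta_same_direction))
print(directionally_agree(alpha, beta_conflicting))
```

In the first comparison the two support functions disagree about strength but not about which hypothesis each outcome sequence favors; in the second they disagree about direction, which is the case the condition excludes.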

When likelihoods are vague or diverse, we may take an approach similar to the one we employed for vague and diverse prior plausibility assessments. We may extend the vagueness sets for individual agents to include a collection of inductive support functions that cover the range of values for likelihood ratios of evidence claims (as well as cover the ranges of comparative support strengths for hypotheses due to plausibility arguments within \(b\), as represented by ratios of prior probabilities). Similarly, we may extend the diversity sets for communities of agents to include support functions that cover the ranges of likelihood ratio values that arise within the vagueness sets of members of the scientific community.

This broadening of vagueness and diversity sets to accommodate vague and diverse likelihood values makes no trouble for the convergence to truth results for hypotheses. Provided that the Directional Agreement Condition is satisfied by all support functions in an extended vagueness or diversity set under consideration, the Likelihood Ratio Convergence Theorem applies to each individual support function in that set, because the proof of that theorem doesn’t depend on the supposition that likelihoods are objective or have intersubjectively agreed values. Rather, it applies to each individual support function \(P_{\alpha}\). The only possible problem with applying this result across a range of support functions is that when their values for likelihoods differ, function \(P_{\alpha}\) may disagree with \(P_{\beta}\) on which of the hypotheses is favored by a given sequence of evidence. That can happen because different support functions may represent the evidential import of hypotheses differently, by specifying different likelihood values for the very same evidence claims. So, an evidence stream that favors \(h_i\) according to \(P_{\alpha}\) may instead favor \(h_j\) according to \(P_{\beta}\). However, when the Directional Agreement Condition holds for a given collection of support functions, this problem cannot arise. Directional Agreement means that the evidential import of hypotheses is similar enough for \(P_{\alpha}\) and \(P_{\beta}\) that a sequence of outcomes may favor a hypothesis according to \(P_{\alpha}\) only if it does so for \(P_{\beta}\) as well.

Thus, when the Directional Agreement Condition holds for all support functions in a vagueness or diversity set that is extended to include vague or diverse likelihoods, and provided that enough evidentially distinguishing experiments or observations can be performed, all support functions in the extended vagueness or diversity set will very probably come to agree that the likelihood ratios for empirically distinct false competitors of a true hypothesis are extremely small. As that happens, the community comes to agree on the refutation of these competitors, and the true hypothesis rises to the top of the heap. [ 20 ]

What if the true hypothesis has evidentially equivalent rivals? Their posterior probabilities must rise as well. In that case we are only assured that the disjunction of the true hypothesis with its evidentially equivalent rivals will be driven to 1 as evidence lays low its evidentially distinct rivals. The true hypothesis will itself approach 1 only if either it has no evidentially equivalent rivals, or whatever equivalent rivals it does have can be laid low by plausibility arguments of a kind that don’t depend on the evidential likelihoods, but only show up via the comparative plausibility assessments represented by ratios of prior probabilities.
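A toy calculation with Bayes' theorem illustrates this situation. Everything in the sketch is hypothetical: three coin-bias hypotheses, of which \(h_1\) and \(h_2\) assign identical likelihoods (they are evidentially equivalent) and \(h_3\) does not, a set of prior probabilities, and a fixed record of 100 heads in 200 tosses.

```python
def posteriors(priors, likelihoods):
    """Posterior probabilities via Bayes' theorem over a finite partition."""
    weights = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical chances of heads: h_1 (taken as true), h_2 (equivalent), h_3 (distinct).
heads_chance = [0.5, 0.5, 0.25]
priors = [0.3, 0.3, 0.4]
heads, tails = 100, 100  # a fixed, hypothetical evidence record

# Likelihood of the record under each hypothesis, assuming independent tosses.
likelihoods = [c ** heads * (1 - c) ** tails for c in heads_chance]
post = posteriors(priors, likelihoods)

print(post)
```

The evidence drives the posterior of the distinct rival toward 0 and the disjunction of \(h_1\) with \(h_2\) toward 1, while the split between \(h_1\) and \(h_2\) stays at the ratio fixed by their priors.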

  • Enumerative Inductions: Bayesian Estimation and Convergence
  • Some Prominent Approaches to the Representation of Uncertain Inference
  • Likelihood Ratios, Likelihoodism, and the Law of Likelihood
  • Immediate Consequences of the Independent Evidence Conditions
  • Proof of the Falsification Theorem
  • Proof that the EQI for \(c^n\) is the sum of EQI for the individual \(c_k\)
  • The Effect on EQI of Partitioning the Outcome Space More Finely—Including Proof of the Nonnegativity of EQI
  • Proof of the Probabilistic Refutation Theorem
  • Boole, George, 1854, The Laws of Thought , London: MacMillan. Republished in 1958 by Dover: New York.
  • Bovens, Luc and Stephan Hartmann, 2003, Bayesian Epistemology , Oxford: Oxford University Press. doi:10.1093/0199269750.001.0001
  • Carnap, Rudolf, 1950, Logical Foundations of Probability , Chicago: University of Chicago Press.
  • –––, 1952, The Continuum of Inductive Methods , Chicago: University of Chicago Press.
  • –––, 1963, “Replies and Systematic Expositions”, in The Philosophy of Rudolf Carnap, Paul Arthur Schilpp (ed.), La Salle, IL: Open Court.
  • Chihara, Charles S., 1987, “Some Problems for Bayesian Confirmation Theory”, British Journal for the Philosophy of Science , 38(4): 551–560. doi:10.1093/bjps/38.4.551
  • Christensen, David, 1999, “Measuring Confirmation”, Journal of Philosophy , 96(9): 437–61. doi:10.2307/2564707
  • –––, 2004, Putting Logic in its Place: Formal Constraints on Rational Belief , Oxford: Oxford University Press. doi:10.1093/0199263256.001.0001
  • De Finetti, Bruno, 1937, “La Prévision: Ses Lois Logiques, Ses Sources Subjectives”, Annales de l’Institut Henri Poincaré , 7: 1–68; translated by Henry E. Kyburg, Jr. as “Foresight. Its Logical Laws, Its Subjective Sources”, in Studies in Subjective Probability , Henry E. Kyburg, Jr. and H.E. Smokler (eds.), Robert E. Krieger Publishing Company, 1980.
  • Dowe, David L., Steve Gardner, and Graham Oppy, 2007, “Bayes, Not Bust! Why Simplicity is No Problem for Bayesians”, British Journal for the Philosophy of Science , 58(4): 709–754. doi:10.1093/bjps/axm033
  • Dubois, Didier J. and Henri Prade, 1980, Fuzzy Sets and Systems , (Mathematics in Science and Engineering, 144), New York: Academic Press.
  • –––, 1990, “An Introduction to Possibilistic and Fuzzy Logics”, in Glenn Shafer and Judea Pearl (eds.), Readings in Uncertain Reasoning, San Mateo, CA: Morgan Kaufmann, 742–761.
  • Duhem, P., 1906, La theorie physique. Son objet et sa structure , Paris: Chevalier et Riviere; translated by P.P. Wiener, The Aim and Structure of Physical Theory , Princeton, NJ: Princeton University Press, 1954.
  • Earman, John, 1992, Bayes or Bust? A Critical Examination of Bayesian Confirmation Theory , Cambridge, MA: MIT Press.
  • Edwards, A.W.F., 1972, Likelihood: an account of the statistical concept of likelihood and its application to scientific inference , Cambridge: Cambridge University Press.
  • Edwards, Ward, Harold Lindman, and Leonard J. Savage, 1963, “Bayesian Statistical Inference for Psychological Research”, Psychological Review , 70(3): 193–242. doi:10.1037/h0044139
  • Eells, Ellery, 1985, “Problems of Old Evidence”, Pacific Philosophical Quarterly , 66(3–4): 283–302. doi:10.1111/j.1468-0114.1985.tb00254.x
  • –––, 2006, “Confirmation Theory”, in Sarkar and Pfeifer 2006.
  • Eells, Ellery and Branden Fitelson, 2000, “Measuring Confirmation and Evidence”, Journal of Philosophy , 97(12): 663–672. doi:10.2307/2678462
  • Field, Hartry H., 1977, “Logic, Meaning, and Conceptual Role”, Journal of Philosophy , 74(7): 379–409. doi:10.2307/2025580
  • Fisher, R.A., 1922, “On the Mathematical Foundations of Theoretical Statistics”, Philosophical Transactions of the Royal Society, series A , 222(594–604): 309–368. doi:10.1098/rsta.1922.0009
  • Fitelson, Branden, 1999, “The Plurality of Bayesian Measures of Confirmation and the Problem of Measure Sensitivity”, Philosophy of Science , 66: S362–S378. doi:10.1086/392738
  • –––, 2001, “A Bayesian Account of Independent Evidence with Applications”, Philosophy of Science , 68(S3): S123–S140. doi:10.1086/392903
  • –––, 2002, “Putting the Irrelevance Back Into the Problem of Irrelevant Conjunction”, Philosophy of Science , 69(4): 611–622. doi:10.1086/344624
  • –––, 2006, “Inductive Logic”, in Sarkar and Pfeifer 2006.
  • –––, 2006, “Logical Foundations of Evidential Support”, Philosophy of Science , 73(5): 500–512. doi:10.1086/518320
  • –––, 2007, “Likelihoodism, Bayesianism, and Relational Confirmation”, Synthese , 156(3): 473–489. doi:10.1007/s11229-006-9134-9
  • Fitelson, Branden and James Hawthorne, 2010, “How Bayesian Confirmation Theory Handles the Paradox of the Ravens”, in Eells and Fetzer (eds.), The Place of Probability in Science , Open Court. [ Fitelson & Hawthorne 2010 preprint available from the author (PDF) ]
  • Forster, Malcolm and Elliott Sober, 2004, “Why Likelihood”, in Mark L. Taper and Subhash R. Lele (eds.), The Nature of Scientific Evidence , Chicago: University of Chicago Press.
  • Friedman, Nir and Joseph Y. Halpern, 1995, “Plausibility Measures: A User’s Guide”, in UAI 95: Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence , 175–184.
  • Gaifman, Haim and Marc Snir, 1982, “Probabilities Over Rich Languages, Testing and Randomness”, Journal of Symbolic Logic , 47(3): 495–548. doi:10.2307/2273587
  • Gillies, Donald, 2000, Philosophical Theories of Probability , London: Routledge.
  • Glymour, Clark N., 1980, Theory and Evidence , Princeton, NJ: Princeton University Press.
  • Goodman, Nelson, 1983, Fact, Fiction, and Forecast, 4th edition, Cambridge, MA: Harvard University Press.
  • Hacking, Ian, 1965, Logic of Statistical Inference , Cambridge: Cambridge University Press.
  • –––, 1975, The Emergence of Probability: a Philosophical Study of Early Ideas about Probability, Induction and Statistical Inference , Cambridge: Cambridge University Press. doi:10.1017/CBO9780511817557
  • –––, 2001, An Introduction to Probability and Inductive Logic , Cambridge: Cambridge University Press. doi:10.1017/CBO9780511801297
  • Hájek, Alan, 2003a, “What Conditional Probability Could Not Be”, Synthese, 137(3): 273–323. doi:10.1023/B:SYNT.0000004904.91112.16
  • –––, 2003b, “Interpretations of the Probability Calculus”, in the Stanford Encyclopedia of Philosophy , (Summer 2003 Edition), Edward N. Zalta (ed.), URL = < https://plato.stanford.edu/archives/sum2003/entries/probability-interpret/ >
  • –––, 2005, “Scotching Dutch Books?” Philosophical Perspectives , 19 (Epistemology): 139–151. doi:10.1111/j.1520-8583.2005.00057.x
  • –––, 2007, “The Reference Class Problem is Your Problem Too”, Synthese , 156(3): 563–585. doi:10.1007/s11229-006-9138-5
  • Halpern, Joseph Y., 2003, Reasoning About Uncertainty , Cambridge, MA: MIT Press.
  • Harper, William L., 1976, “Rational Belief Change, Popper Functions and Counterfactuals”, in Harper and Hooker 1976: 73–115. doi:10.1007/978-94-010-1853-1_5
  • Harper, William L. and Clifford Alan Hooker (eds.), 1976, Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science, volume I Foundations and Philosophy of Epistemic Applications of Probability Theory , (The Western Ontario Series in Philosophy of Science, 6a), Dordrecht: Reidel. doi:10.1007/978-94-010-1853-1
  • Hawthorne, James, 1993, “Bayesian Induction is Eliminative Induction”, Philosophical Topics , 21(1): 99–138. doi:10.5840/philtopics19932117
  • –––, 1994, “On the Nature of Bayesian Convergence”, PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association 1994, 1: 241–249. doi:10.1086/psaprocbienmeetp.1994.1.193029
  • –––, 2005, “Degree-of-Belief and Degree-of-Support: Why Bayesians Need Both Notions”, Mind, 114(454): 277–320. doi:10.1093/mind/fzi277
  • –––, 2009, “The Lockean Thesis and the Logic of Belief”, in Franz Huber and Christoph Schmidt-Petri (eds.), Degrees of Belief , (Synthese Library, 342), Dordrecht: Springer, pp. 49–74. doi:10.1007/978-1-4020-9198-8_3
  • Hawthorne, James and Luc Bovens, 1999, “The Preface, the Lottery, and the Logic of Belief”, Mind , 108(430): 241–264. doi:10.1093/mind/108.430.241
  • Hawthorne, James and Branden Fitelson, 2004, “Discussion: Re-solving Irrelevant Conjunction With Probabilistic Independence”, Philosophy of Science , 71(4): 505–514. doi:10.1086/423626
  • Hellman, Geoffrey, 1997, “Bayes and Beyond”, Philosophy of Science , 64(2): 191–221. doi:10.1086/392548
  • Hempel, Carl G., 1945, “Studies in the Logic of Confirmation”, Mind, 54(213): 1–26, 54(214): 97–121. doi:10.1093/mind/LIV.213.1 doi:10.1093/mind/LIV.214.97
  • Horwich, Paul, 1982, Probability and Evidence , Cambridge: Cambridge University Press. doi:10.1017/CBO9781316494219
  • Howson, Colin, 1997, “A Logic of Induction”, Philosophy of Science , 64(2): 268–290. doi:10.1086/392551
  • –––, 2000, Hume’s Problem: Induction and the Justification of Belief , Oxford: Oxford University Press. doi:10.1093/0198250371.001.0001
  • –––, 2002, “Bayesianism in Statistics”, in Swinburne 2002: 39–71. doi:10.5871/bacad/9780197263419.003.0003
  • –––, 2007, “Logic With Numbers”, Synthese , 156(3): 491–512. doi:10.1007/s11229-006-9135-8
  • Howson, Colin and Peter Urbach, 1993, Scientific Reasoning: The Bayesian Approach , La Salle, IL: Open Court. [3rd edition, 2005.]
  • Huber, Franz, 2005a, “Subjective Probabilities as Basis for Scientific Reasoning?” British Journal for the Philosophy of Science , 56(1): 101–116. doi:10.1093/phisci/axi105
  • –––, 2005b, “What Is the Point of Confirmation?” Philosophy of Science , 72(5): 1146–1159. doi:10.1086/508961
  • Jaynes, Edwin T., 1968, “Prior Probabilities”, IEEE Transactions on Systems Science and Cybernetics , SSC–4(3): 227–241. doi:10.1109/TSSC.1968.300117
  • Jeffrey, Richard C., 1983, The Logic of Decision , 2nd edition, Chicago: University of Chicago Press.
  • –––, 1987, “Alias Smith and Jones: The Testimony of the Senses”, Erkenntnis , 26(3): 391–399. doi:10.1007/BF00167725
  • –––, 1992, Probability and the Art of Judgment , New York: Cambridge University Press. doi:10.1017/CBO9781139172394
  • –––, 2004, Subjective Probability: The Real Thing , Cambridge: Cambridge University Press. doi:10.1017/CBO9780511816161
  • Jeffreys, Harold, 1939, Theory of Probability , Oxford: Oxford University Press.
  • Joyce, James M., 1998, “A Nonpragmatic Vindication of Probabilism”, Philosophy of Science , 65(4): 575–603. doi:10.1086/392661
  • –––, 1999, The Foundations of Causal Decision Theory , New York: Cambridge University Press. doi:10.1017/CBO9780511498497
  • –––, 2003, “Bayes’ Theorem”, in the Stanford Encyclopedia of Philosophy , (Summer 2003 Edition), Edward N. Zalta (ed.), URL = < https://plato.stanford.edu/archives/win2003/entries/bayes-theorem/ >
  • –––, 2004, “Bayesianism”, in Alfred R. Mele and Piers Rawling (eds.), The Oxford Handbook of Rationality , Oxford: Oxford University Press, pp. 132–153. doi:10.1093/0195145399.003.0008
  • –––, 2005, “How Probabilities Reflect Evidence”, Philosophical Perspectives , 19: 153–179. doi:10.1111/j.1520-8583.2005.00058.x
  • Kaplan, Mark, 1996, Decision Theory as Philosophy , Cambridge: Cambridge University Press.
  • Kelly, Kevin T., Oliver Schulte, and Cory Juhl, 1997, “Learning Theory and the Philosophy of Science”, Philosophy of Science , 64(2): 245–267. doi:10.1086/392550
  • Keynes, John Maynard, 1921, A Treatise on Probability , London: Macmillan and Co.
  • Kolmogorov, A.N., 1956, Foundations of the Theory of Probability (Grundbegriffe der Wahrscheinlichkeitsrechnung), 2nd edition, New York: Chelsea Publishing Company.
  • Koopman, B.O., 1940, “The Bases of Probability”, Bulletin of the American Mathematical Society , 46(10): 763–774. Reprinted in H. Kyburg and H. Smokler (eds.), 1980, Studies in Subjective Probability , 2nd edition, Huntington, NY: Krieger Publ. Co. [ Koopman 1940 available online ]
  • Kyburg, Henry E., Jr., 1974, The Logical Foundations of Statistical Inference , Dordrecht: Reidel. doi:10.1007/978-94-010-2175-3
  • –––, 1977, “Randomness and the Right Reference Class”, Journal of Philosophy , 74(9): 501–520. doi:10.2307/2025794
  • –––, 1978, “An Interpolation Theorem for Inductive Relations”, Journal of Philosophy, 75: 93–98.
  • –––, 2006, “Belief, Evidence, and Conditioning”, Philosophy of Science , 73(1): 42–65. doi:10.1086/510174
  • Lange, Marc, 1999, “Calibration and the Epistemological Role of Bayesian Conditionalization”, Journal of Philosophy , 96(6): 294–324. doi:10.2307/2564680
  • –––, 2002, “Okasha on Inductive Scepticism”, The Philosophical Quarterly , 52(207): 226–232. doi:10.1111/1467-9213.00264
  • Laudan, Larry, 1997, “How About Bust? Factoring Explanatory Power Back into Theory Evaluation”, Philosophy of Science , 64(2): 206–216. doi:10.1086/392553
  • Lenhard, Johannes, 2006, “Models and Statistical Inference: The Controversy Between Fisher and Neyman-Pearson”, British Journal for the Philosophy of Science, 57(1): 69–91. doi:10.1093/bjps/axi152
  • Levi, Isaac, 1967, Gambling with Truth: An Essay on Induction and the Aims of Science , New York: Knopf.
  • –––, 1977, “Direct Inference”, Journal of Philosophy , 74(1): 5–29. doi:10.2307/2025732
  • –––, 1978, “Confirmational Conditionalization”, Journal of Philosophy , 75(12): 730–737. doi:10.2307/2025516
  • –––, 1980, The Enterprise of Knowledge: An Essay on Knowledge, Credal Probability, and Chance , Cambridge, MA: MIT Press.
  • Lewis, David, 1980, “A Subjectivist’s Guide to Objective Chance”, in Richard C. Jeffrey, (ed.), Studies in Inductive Logic and Probability , vol. 2, Berkeley: University of California Press, 263–293.
  • Maher, Patrick, 1993, Betting on Theories , Cambridge: Cambridge University Press.
  • –––, 1996, “Subjective and Objective Confirmation”, Philosophy of Science , 63(2): 149–174. doi:10.1086/289906
  • –––, 1997, “Depragmatized Dutch Book Arguments”, Philosophy of Science , 64(2): 291–305. doi:10.1086/392552
  • –––, 1999, “Inductive Logic and the Ravens Paradox”, Philosophy of Science , 66(1): 50–70. doi:10.1086/392676
  • –––, 2004, “Probability Captures the Logic of Scientific Confirmation”, in Christopher Hitchcock (ed.), Contemporary Debates in Philosophy of Science , Oxford: Blackwell, 69–93.
  • –––, 2005, “Confirmation Theory”, The Encyclopedia of Philosophy , 2nd edition, Donald M. Borchert (ed.), Detroit: Macmillan.
  • –––, 2006a, “The Concept of Inductive Probability”, Erkenntnis , 65(2): 185–206. doi:10.1007/s10670-005-5087-5
  • –––, 2006b, “A Conception of Inductive Logic”, Philosophy of Science , 73(5): 513–523. doi:10.1086/518321
  • –––, 2010, “Bayesian Probability”, Synthese , 172(1): 119–127. doi:10.1007/s11229-009-9471-6
  • Mayo, Deborah G., 1996, Error and the Growth of Experimental Knowledge , Chicago: University of Chicago Press.
  • –––, 1997, “Duhem’s Problem, the Bayesian Way, and Error Statistics, or ‘What’s Belief Got to do with It?’”, Philosophy of Science , 64(2): 222–244. doi:10.1086/392549
  • Mayo Deborah and Aris Spanos, 2006, “Severe Testing as a Basic Concept in a Neyman-Pearson Philosophy of Induction“, British Journal for the Philosophy of Science , 57(2): 323–357. doi:10.1093/bjps/axl003
  • McGee, Vann, 1994, “Learning the Impossible”, in E. Eells and B. Skyrms (eds.), Probability and Conditionals: Belief Revision and Rational Decision , New York: Cambridge University Press, 179–200.
  • McGrew, Timothy J., 2003, “Confirmation, Heuristics, and Explanatory Reasoning”, British Journal for the Philosophy of Science , 54: 553–567.
  • McGrew, Lydia and Timothy McGrew, 2008, “Foundationalism, Probability, and Mutual Support”, Erkenntnis , 68(1): 55–77. doi:10.1007/s10670-007-9062-1
  • Neyman, Jerzy and E.S. Pearson, 1967, Joint Statistical Papers , Cambridge: Cambridge University Press.
  • Norton, John D., 2003, “A Material Theory of Induction”, Philosophy of Science , 70(4): 647–670. doi:10.1086/378858
  • –––, 2007, “Probability Disassembled”, British Journal for the Philosophy of Science , 58(2): 141–171. doi:10.1093/bjps/axm009
  • Okasha, Samir, 2001, “What Did Hume Really Show About Induction?”, The Philosophical Quarterly , 51(204): 307–327. doi:10.1111/1467-9213.00231
  • Popper, Karl, 1968, The Logic of Scientific Discovery , 3 rd edition, London: Hutchinson.
  • Quine, W.V., 1953, “Two Dogmas of Empiricism”, in From a Logical Point of View , New York: Harper Torchbooks. Routledge Encyclopedia of Philosophy, Version 1.0, London: Routledge
  • Ramsey, F.P., 1926, “Truth and Probability”, in Foundations of Mathematics and other Essays , R.B. Braithwaite (ed.), Routledge & P. Kegan,1931, 156–198. Reprinted in Studies in Subjective Probability , H. Kyburg and H. Smokler (eds.), 2 nd ed., R.E. Krieger Publishing Company, 1980, 23–52. Reprinted in Philosophical Papers , D.H. Mellor (ed.), Cambridge: University Press, Cambridge, 1990,
  • Reichenbach, Hans, 1938, Experience and Prediction: An Analysis of the Foundations and the Structure of Knowledge , Chicago: University of Chicago Press.
  • Rényi, Alfred, 1970, Foundations of Probability , San Francisco, CA: Holden-Day.
  • Rosenkrantz, R.D., 1981, Foundations and Applications of Inductive Probability , Atascadero, CA: Ridgeview Publishing.
  • Roush, Sherrilyn , 2004, “Discussion Note: Positive Relevance Defended”, Philosophy of Science , 71(1): 110–116. doi:10.1086/381416
  • –––, 2006, “Induction, Problem of”, Sarkar and Pfeifer 2006..
  • –––, 2006, Tracking Truth: Knowledge, Evidence, and Science , Oxford: Oxford University Press.
  • Royall, Richard M., 1997, Statistical Evidence: A Likelihood Paradigm , New York: Chapman & Hall/CRC.
  • Salmon, Wesley C., 1966, The Foundations of Scientific Inference , Pittsburgh, PA: University of Pittsburgh Press.
  • –––, 1975, “Confirmation and Relevance”, in H. Feigl and G. Maxwell (eds.), Induction, Probability, and Confirmation , (Minnesota Studies in the Philosophy of Science, 6), Minneapolis: University of Minnesota Press, 3–36.
  • Sarkar, Sahotra and Jessica Pfeifer (eds.), 2006, The Philosophy of Science: An Encyclopedia , 2 volumes, New York: Routledge.
  • Savage, Leonard J., 1954, The Foundations of Statistics , John Wiley (2nd ed., New York: Dover 1972).
  • Savage, Leonard J., et al., 1962, The Foundations of Statistical Inference , London: Methuen.
  • Schlesinger, George N., 1991, The Sweep of Probability , Notre Dame, IN: Notre Dame University Press.
  • Seidenfeld, Teddy, 1978, “Direct Inference and Inverse Inference”, Journal of Philosophy , 75(12): 709–730. doi:10.2307/2025515
  • –––, 1992, “R.A. Fisher’s Fiducial Argument and Bayes’ Theorem”, Statistical Science , 7(3): 358–368. doi:10.1214/ss/1177011232
  • Shafer, Glenn, 1976, A Mathematical Theory of Evidence , Princeton, NJ: Princeton University Press.
  • –––, 1990, “Perspectives on the Theory and Practice of Belief Functions”, International Journal of Approximate Reasoning , 4(5–6): 323–362. doi:10.1016/0888-613X(90)90012-Q
  • Skyrms, Brian, 1984, Pragmatics and Empiricism , New Haven, CT: Yale University Press.
  • –––, 1990, The Dynamics of Rational Deliberation , Cambridge, MA: Harvard University Press.
  • –––, 2000, Choice and Chance: An Introduction to Inductive Logic , 4 th edition, Belmont, CA: Wadsworth, Inc.
  • Sober, Elliott, 2002, “Bayesianism—Its Scope and Limits”, in Swinburne 2002: 21–38. doi:10.5871/bacad/9780197263419.003.0002
  • Spohn, Wolfgang, 1988, “Ordinal Conditional Functions: A Dynamic Theory of Epistemic States”, in William L. Harper and Brian Skyrms (eds.), Causation in Decision, Belief Change, and Statistics , vol. 2, Dordrecht: Reidel, 105–134. doi:10.1007/978-94-009-2865-7_6
  • Strevens, Michael, 2004, “Bayesian Confirmation Theory: Inductive Logic, or Mere Inductive Framework?” Synthese , 141(3): 365–379. doi:10.1023/B:SYNT.0000044991.73791.f7
  • Suppes, Patrick, 2007, “Where do Bayesian Priors Come From?” Synthese , 156(3): 441–471. doi:10.1007/s11229-006-9133-x
  • Swinburne, Richard, 2002, Bayes’ Theorem , Oxford: Oxford University Press. doi:10.5871/bacad/9780197263419.001.0001
  • Talbot, W., 2001, “Bayesian Epistemology”, in the Stanford Encyclopedia of Philosophy , (Fall 2001 Edition), Edward N. Zalta (ed.), URL = < https://plato.stanford.edu/archives/fall2001/entries/epistemology-bayesian/ >
  • Teller, Paul, 1976, “Conditionalization, Observation, and Change of Preference”, in Harper and Hooker 1976: 205–259. doi:10.1007/978-94-010-1853-1_9
  • Van Fraassen, Bas C., 1983, “Calibration: A Frequency Justification for Personal Probability ”, in R.S. Cohen and L. Laudan (eds.), Physics, Philosophy, and Psychoanalysis: Essays in Honor of Adolf Grunbaum , Dordrecht: Reidel. doi:10.1007/978-94-009-7055-7_15
  • Venn, John, 1876, The Logic of Chance , 2 nd ed., Macmillan and co; reprinted, New York, 1962.
  • Vineberg, Susan, 2006, “Dutch Book Argument”, Sarkar and Pfeifer 2006..
  • Vranas, Peter B.M., 2004, “Hempel’s Raven Paradox: A Lacuna in the Standard Bayesian Solution”, British Journal for the Philosophy of Science , 55(3): 545–560. doi:10.1093/bjps/55.3.545
  • Weatherson, Brian, 1999, “Begging the Question and Bayesianism”, Studies in History and Philosophy of Science [Part A] , 30(4): 687–697. doi:10.1016/S0039-3681(99)00020-5
  • Williamson, Jon, 2007, “Inductive Influence”, British Journal for Philosophy of Science , 58(4): 689–708. doi:10.1093/bjps/axm032
  • Zadeh, Lotfi A., 1965, “Fuzzy Sets”, Information and Control , 8(3): 338–353. doi:10.1016/S0019-9958(65)90241-X
  • –––, 1978, “Fuzzy Sets as a Basis for a Theory of Possibility”, Fuzzy Sets and Systems , vol. 1, 3–28.
  • Confirmation and Induction . Really nice overview by Franz Huber in the Internet Encyclopedia of Philosophy .
  • Inductive Logic , (in PDF), by Branden Fitelson, Philosophy of Science: An Encyclopedia , (J. Pfeifer and S. Sarkar, eds.), Routledge. An extensive encyclopedia article on inductive logic.
  • Teaching Theory of Knowledge: Probability and Induction . A very extensive outline of issues in Probability and Induction, each topic accompanied by a list of relevant books and articles (without links), compiled by Brad Armendt and Martin Curd.
  • Probabilistic Confirmation Theory and Bayesian Reasoning . An annotated bibliography of influential works compiled by Timothy McGrew.
  • Bayesian Networks Without Tears , (in PDF), by Eugene Charniak (Computer Science and Cognitive Science, Brown University). An introductory article on Bayesian inference.
  • Miscellany of Works on Probabilistic Thinking . A collection of on-line articles on Subjective Probability and probabilistic reasoning by Richard Jeffrey and by several other philosophers writing on related issues.
  • Fitelson’s course on Confirmation Theory . Main page of Branden Fitelson’s course on Confirmation Theory. The Syllabus provides an extensive list of links to readings. The Notes, Handouts, & Links page has Fitelson’s weekly course notes and some links to useful internet resources on confirmation theory.
  • Fitelson’s course on Probability and Induction . Main page of Branden Fitelson’s course on Probability and Induction. The Syllabus provides an extensive list of links to readings on the subject. The Notes & Handouts page has Fitelson’s powerpoint slides for each of his lectures and some links to handouts for the course. The Links page contains links to some useful internet resources.

Bayes’ Theorem | epistemology: Bayesian | probability, interpretations of

Acknowledgments

Thanks to Alan Hájek, Jim Joyce, and Edward Zalta for many valuable comments and suggestions. The editors and author also thank Greg Stokley and Philippe van Basshuysen for carefully reading an earlier version of the entry and identifying a number of typographical errors.

Copyright © 2018 by James Hawthorne < hawthorne @ ou . edu >


The Stanford Encyclopedia of Philosophy is copyright © 2024 by The Metaphysics Research Lab , Department of Philosophy, Stanford University

Library of Congress Catalog Data: ISSN 1095-5054


Kinds of Arguments

Contemporary Western philosophy treats arguments as coming in two main types, deductive and inductive. This section introduces the basic distinction between them.

Deductive arguments are arguments in which the premises (if true) guarantee the truth of the conclusion. The conclusion of a successful deductive argument cannot possibly be false, assuming its premises are true. This is what it means to label an argument as “valid” in logic. The form or structure of a deductive argument is the essential aspect to consider. Somewhat counter-intuitively, the premises do not need to be actually true for the argument to be valid.

Arguments are a linguistic representation of an inference. So, using slightly different terminology, we can define deductive inferences. In a successful deductive inference, the premises and the denial of the conclusion constitute an inconsistent set of statements. An alternative way to describe the same relation: in a successful deductive inference, the truth of the premises makes the falsity of the conclusion logically impossible. A successful deductive inference is valid.

Deductive Example

1) All dogs are mammals.

2) All mammals breathe air.

_______________________________________________

SO: All dogs breathe air.

Inductive arguments are arguments with premises which make it likely that the conclusion is true but don’t absolutely guarantee its truth. Inductive arguments are by far the most common type of argument we see in our daily lives. We can assess inductive arguments along a spectrum of successful (stronger) to unsuccessful (weaker). In a more successful (stronger) argument, the premises, if true, make the conclusion probably true, with a high degree of likelihood. It is important to remember that inductive arguments can never fully guarantee the truth of the conclusion.

Using slightly different terminology, we can consider inductive inferences, referring to the actual thinking process in someone’s mind. In a successful inductive inference, the truth of the premises makes the falsity of the conclusion possible, but unlikely. Inductive inferences can be evaluated as “stronger” or “weaker” depending on the probability.

Inductive Example

1) The Interstate Bridge is regularly inspected by qualified engineers.

2) Vehicles have been driving over it for years.

SO: It will be safe to drive over it tomorrow.

One thing that makes applying the distinction between deductive and inductive arguments a bit tricky is this: we can’t look only at the premises OR only at the conclusion. Instead, we need to focus on the relationship between the premise(s) and the conclusion to tell what kind of argument we have.

A further contributor to trickiness: we can’t be distracted by the question of whether the statements are true or false. To classify an argument as deductive or inductive, we need to grant that the premises are true in a hypothetical way. We have to ask the question, “If those premises were true, would it be IMPOSSIBLE for the conclusion to be false?” If so, it is a deductive argument. Or we ask, “If those premises were true, would it be UNLIKELY, but still possible, that the conclusion is false?” If so, it is an inductive argument.

As an example, consider this valid deductive argument:

1) All clouds are made out of spun sugar.

2) Anything made out of spun sugar is high in calories.

SO: All clouds are high in calories.

This argument is deductively successful because the truth of the premises would make the falsity of the conclusion impossible. Odd, isn’t it?
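The point that validity depends on form rather than on the truth of the premises can be illustrated with a small brute-force check. The sketch below is only an illustration under stated assumptions: the helper names (`subsets`, `is_valid`) and the three-element domain are hypothetical, and a search over a small domain shows only that no counterexample exists at that size. It treats “All A are B” as a subset relation and looks for a case with true premises and a false conclusion.

```python
from itertools import product

def subsets(domain):
    """Yield every subset of the domain as a frozenset."""
    for mask in product([False, True], repeat=len(domain)):
        yield frozenset(x for x, keep in zip(domain, mask) if keep)

def is_valid(premises, conclusion, domain):
    """Return False if some choice of A, B, C makes every premise true
    and the conclusion false (a counterexample); otherwise return True."""
    all_sets = list(subsets(domain))
    for a, b, c in product(all_sets, repeat=3):
        if all(p(a, b, c) for p in premises) and not conclusion(a, b, c):
            return False
    return True

# Hypothetical small domain of three unnamed objects.
DOMAIN = ["x1", "x2", "x3"]

# Form of the clouds argument: All A are B; all B are C; so all A are C.
chain_form = is_valid(
    premises=[lambda a, b, c: a <= b, lambda a, b, c: b <= c],
    conclusion=lambda a, b, c: a <= c,
    domain=DOMAIN,
)

# A different form for contrast: All A are B; all C are B; so all A are C.
shared_middle_form = is_valid(
    premises=[lambda a, b, c: a <= b, lambda a, b, c: c <= b],
    conclusion=lambda a, b, c: a <= c,
    domain=DOMAIN,
)

print(chain_form, shared_middle_form)
```

The first form has no counterexample in this domain, whatever the categories stand for, which is why the spun-sugar argument counts as valid. The second form does have one, so its premises would not guarantee its conclusion.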

Some arguments are presented with premises missing. In those cases, the determination of deductive or inductive will depend on how that premise is filled in.

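For example: I had an apple for lunch, so I had something healthy! If the missing premise is filled in as “All apples are healthy,” the argument is deductive; if it is filled in as “Apples are usually healthy,” the argument is inductive.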

Exercise: Deductive or Inductive?

Determine if the following arguments are deductive or inductive. It is a good idea to put the arguments in standard form first, so you are clear about the relation between premises and conclusion.

Critical Thinking in Academic Research Copyright © 2022 by Cindy Gruwell and Robin Ewing is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License , except where otherwise noted.


Humanities LibreTexts

5.2: Cogency and Strong Arguments


  • Golden West College via NGE Far Press


Strength and Weakness

Inductive arguments are said to be either strong or weak. There’s no absolute cut-off between strength and weakness, but some arguments will be very strong and others very weak, so the distinction is still useful even if it is not precise. A strong argument is one where, if the premises were true, the conclusion would be very likely to be true. A weak argument is one where the conclusion does not follow from the premises (i.e. even if the premises were true, there would still be a good chance that the conclusion could be false.)

Most arguments in courts of law attempt to be strong arguments; they are generally not attempts at valid arguments.

So, the following example is a strong argument.

John was found with a gun in his hand, running from the apartment where Tom's body was found. Three witnesses heard a gunshot right before they saw John run out. The gun in John's possession matched the ballistics on the bullet pulled from Tom's head. John had written a series of threatening letters to Tom. In prison, John confessed to his cellmate that he had killed Tom. Therefore, John is the murderer of Tom.

Given that all the premises were true, it would be very likely that the conclusion would be true.

Importantly, strength has nothing to do with the actual truth of the premises!

This is something people frequently forget, so it’s worth repeating: A STRONG ARGUMENT NEEDN’T HAVE ANY TRUE PREMISES! ALL THE PREMISES OF A STRONG ARGUMENT CAN BE FALSE!

The argument is strong because: if the premises WERE true, the conclusion would be likely to be true.

So the following arguments are strong:

98% of Dominicans have superpowers. Lucy is Dominican. I saw Lucy leap from the top of a tall building last week and walk away unscathed. Lucy has superpowers.

People from the lost continent of Atlantis have been manipulating the world’s governments for years by placing Atlantean wizards in positions of power. Whenever possible, they place an Atlantean wizard in the executive position of the most powerful government on earth. They did this in the Roman empire, the Mongol empire, and the British empire. Currently, the United States is the most powerful country on earth. Barack Obama was born in Hawai’i, where about 45% of the people are actually Atlanteans. While he was a Senator from Illinois, he received over 10 billion dollars in funds from a mysterious holding company called “Atlantis Incorporated.” Several journalists claim that they have seen Barack Obama perform feats of magic. For example, Shep Smith of Fox News said he saw Barack Obama walk on water. Barack Obama is clearly an Atlantean wizard.

Two leading researchers in genetics have found that, in every sample of DNA they looked at, there were traces of kryptonite. They examined 1600 samples, from 1600 separate individuals, including an equal distribution from all continents. The results were then replicated in another, larger study of 2700 samples, also taken from all continents. We conclude, then, that normal DNA contains kryptonite.

Cogency: If an argument is strong and all its premises are true, the argument is said to be cogent.

The following arguments are weak. The premises provide little, if any, evidence for the conclusions:

I saw your boyfriend last night and he was talking to another girl. So he’s cheating on you.

Senator Bonham served 8 years in the military, whereas his opponent, Mr. Malham, never served one day of military service. So you should vote for Senator Bonham.

More people buy Juff ™ brand peanut butter than any other brand, so you should buy Juff ™!

It’s notable, again, that the truth of the premises is irrelevant. A weak argument can have true premises and a true conclusion. What makes it weak is that the premises do not provide good reason to believe the conclusion.

Induction

All of the argument forms we have looked at so far have been deductively valid. That meant, we said, that the conclusion follows of necessity if the premises are true. But to what extent can we ever be sure of the truth of those premises? Inductive argumentation is a less certain, more realistic, more familiar way of reasoning that we all use, all the time. Inductive argumentation recognizes, for instance, that a premise like “All horses have four legs” comes from our previous experience of horses. If one day we were to encounter a three-legged horse, deductive logic would tell us that “All horses have four legs” is false, at which point the premise becomes rather useless for a deducer. In fact, deductive logic tells us that once the premise “All horses have four legs” is false, then even if we know there are many, many four-legged horses in the world, and even when we go to the track and see hordes of four-legged horses, all we can really be certain of is that “There is at least one four-legged horse.”

Inductive logic allows for the more realistic premise, “The vast majority of horses have four legs”. And inductive logic can use this premise to infer other useful information, like “If I’m going to get Chestnut booties for Christmas, I should probably get four of them.” The trick is to recognize a certain amount of uncertainty in the truth of the conclusion, something for which deductive logic does not allow. In real life, however, inductive logic is used much more frequently and (hopefully) with some success. Let’s take a look at some of the uses of inductive reasoning.

Predicting the Future

We constantly use inductive reasoning to predict the future. We do this by compiling evidence based on past observations, and by assuming that the future will resemble the past. For instance, I make the observation that every other time I have gone to sleep at night, I have woken up in the morning. There is actually no certainty that this will happen, but I make the inference because of the fact that this is what has happened every other time. In fact, it is not the case that “All people who go to sleep at night wake up in the morning”. But I’m not going to lose any sleep over that. And we do the same thing when our experience has been less consistent. For instance, I might make the assumption that, if there’s someone at the door, the dog will bark. But it’s not outside the realm of possibility that the dog is asleep, has gone out for a walk, or has been persuaded not to bark by a clever intruder with sedative-laced bacon. I make the assumption that if there’s someone at the door, the dog will bark, because that is what usually happens.

Explaining Common Occurrences

We also use inductive reasoning to explain things that commonly happen. For instance, if I’m about to start an exam and notice that Bill is not here, I might explain this to myself with the reason that Bill is stuck in traffic. I might base this on the reasoning that being stuck in traffic is a common excuse for being late, or because I know that Bill never accounts for traffic when he’s estimating how long it will take him to get somewhere. Again, that Bill is actually stuck in traffic is not certain, but I have some good reasons to think it’s probable. We use this kind of reasoning to explain past events as well. For instance, if I read somewhere that 1986 was a particularly good year for tomatoes, I assume that 1986 also had some ideal combination of rainfall, sun, and consistently warm temperatures. Although it’s possible that a scientific madman circled the globe planting tomatoes wherever he could in 1986, inductive reasoning would tell me that the former, environmental explanation is more likely. (But I could be wrong.)

Generalizing

Often we would like to make general claims, but in fact it would be very difficult to prove any general claim with any certainty. The only way to do so would be to observe every single case of something about which we wanted to make an observation. This would be, in fact, the only way to prove such assertions as, “All swans are white”. Without being able to observe every single swan in the universe, I can never make that claim with certainty. Inductive logic, on the other hand, allows us to make the claim, with a certain amount of modesty.

Inductive Generalization

Inductive generalization allows us to make general claims, despite being unable to actually observe every single member of a class in order to make a certainly true general statement. We see this in scientific studies, population surveys, and in our own everyday reasoning. Take for example a drug study. Some doctor or other wants to know how many people will go blind if they take a certain amount of some drug for so many years. If they determine that 5% of people in the study go blind, they then assume that 5% of all people who take the drug for that many years will go blind. Likewise, if I survey a random group of people and ask them what their favourite colour is, and 75% of them say “purple”, then I assume that purple is the favourite colour of 75% of people. But we have to be careful when we make an inductive generalization. When you tell me that 75% of people really like purple, I’m going to want to know whether you took that survey outside a Justin Bieber concert.

Let’s take an example. Let’s say I asked a class of 400 students whether or not they think logic is a valuable course, and 90% of them said yes. I can make an inductive argument like this:

(P1) 90% of 400 students believe that logic is a valuable course.

(C) Therefore 90% of students believe that logic is a valuable course.

There are certain things I need to take into account in judging the quality of this argument. For instance, did I ask this in a logic course? Did the respondents have to raise their hands so that the professor could see them, or was the survey taken anonymously? Are there enough students in the course to justify using them as a representative group for students in general?

If I did, in fact, make a class of 400 logic students raise their hands in response to the question of whether logic is a valuable course, then we can identify a couple of problems with this argument. The first is bias. We can assume that anyone enrolled in a logic course is more likely to see it as valuable than any random student. I have therefore skewed the argument in favour of logic courses. I can also question whether the students were answering the question honestly. Perhaps if they are trying to save the professor’s feelings, they are more likely to raise their hands and assure her that the logic course is a valuable one.

Now let’s say I’ve avoided those problems. I have ensured that the 400 students I have asked are randomly selected, say, by soliciting email responses from randomly selected students from the university’s entire student population. Then the argument looks stronger.

Another problem we might have with the argument is whether I have asked enough students so that the whole population is well-represented. If the student body as a whole consists of 400 students, my argument is very strong. If the student body numbers in the tens of thousands, I might want to ask a few more before assuming that the opinions of a few mirror those of the many. This would be a problem with my sample size.
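The effect of sample size can be made concrete with a rough calculation. The sketch below is not part of the original argument: it assumes a simple random sample, uses the standard normal-approximation margin of error for a proportion at roughly 95% confidence (z = 1.96), and applies a finite population correction. The function name `margin_of_error` and the population figure of 40,000 are illustrative assumptions.

```python
import math

def margin_of_error(p_hat, n, population, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes simple random sampling without replacement and the normal
    approximation; z = 1.96 corresponds to roughly 95% confidence.
    """
    if not 0 < n <= population:
        raise ValueError("sample size must be between 1 and the population size")
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    # Finite population correction: shrinks to 0 when everyone is surveyed.
    if population > 1:
        fpc = math.sqrt((population - n) / (population - 1))
    else:
        fpc = 0.0
    return z * standard_error * fpc

# 90% of 400 respondents said yes, as in the example above.
whole_body = margin_of_error(0.9, 400, population=400)      # everyone was asked
large_body = margin_of_error(0.9, 400, population=40_000)   # hypothetical size
bigger_sample = margin_of_error(0.9, 1_600, population=40_000)

print(whole_body, large_body, bigger_sample)
```

When the 400 students are the whole student body, the margin is zero, which matches the claim that the argument is then very strong. For a much larger student body the margin is nonzero and shrinks as more students are asked. None of this addresses bias, which a larger sample does not fix.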

Let’s take another example. Now I’m going to run a scientific study, in which I will pay someone $50 to take a drug with unknown effects and see if it makes them blind. In order to control for other variables, I open the study only to white males between the ages of 18 and 25.

A bad inductive argument would say:

(P1) 40% of 1000 people who took the drug went blind.

(C) Therefore 40% of people who take the drug will go blind.

A better inductive argument would make a more modest claim:

(P1) 40% of the 1000 people who took the drug went blind.

(C) Therefore 40% of white males between the ages of 18 and 25 who take the drug will go blind.

The point behind this example is to show how inductive reasoning imposes an important limitation on the possible conclusions a study or a survey can make. In order to make good generalizations, we need to ensure that our sample is representative, non-biased, and sufficiently sized.

Statistical Syllogism

Where in an inductive generalization we saw a statistic about a sample applied to a more general group, we can also use statistics to go from the general to the particular. For instance, if I know that most computer science majors are male, and that some random individual with the androgynous name “Cameron” is a computer science major, then we can be reasonably certain that Cameron is male. We tend to represent the uncertainty by qualifying the conclusion with the word “probably”. If, on the other hand, we wanted to say that something is unlikely, for example that Cameron is female, we could use “probably not”. It is also possible to temper our conclusion with other similar qualifying words.

Let’s take an example.

(P1) Of the 133 people found guilty of homicide last year in Canada, 79% were jailed.

(P2) Socrates was found guilty of homicide last year in Canada.

(C) Therefore, Socrates was probably jailed.

In this case we can be reasonably sure that Socrates is currently rotting in prison. Now the certainty of our conclusion seems to be dependent on the statistics we’re dealing with. There are definitely more certain and more uncertain cases.

(P1) In the last election, 50% of voting Americans voted for Obama, while 48% voted for Romney.

(P2) Jim is a voting American.

(C) Therefore, Jim probably voted for Obama.

Clearly, this argument is not as strong as the first. On these figures, it is only slightly more likely that Jim voted for Obama than that he voted for Romney. In this case we might want to revise our conclusion to say:

(C) Therefore, it is slightly more likely that Jim voted for Obama than that he voted for Romney.

In other cases, the likelihood that something is or is not the case approaches certainty. For example:

(P1) There is a 0.00000059% chance you will die on any single flight, assuming you use one of the most poorly rated airlines.

(P2) I’m flying to Paris next week.

(C) The odds against my dying on my flight are more than a million to one.
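The percentage in (P1) can be converted into a “1 in N” figure to see how it compares with a million to one. This is only an arithmetic sketch; the function name `one_in_n` is hypothetical, and the percentage is taken from the premise as given, not independently verified.

```python
def one_in_n(percent):
    """Convert a percentage chance into N, where the chance is 1 in N."""
    probability = percent / 100
    return 1 / probability

# Figure quoted in (P1): a 0.00000059% chance per flight.
n = one_in_n(0.00000059)
print(f"about 1 in {n:,.0f}")
```

On the stated figure, the chance works out to roughly 1 in 169 million, comfortably beyond a million to one.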

Note that in all of these examples, nothing is ever stated with absolute certainty. It is possible to improve the chances that our conclusions will be accurate by being more specific, or finding out more information. We would know more about Jim’s voting strategy, for instance, if we knew where he lived, his previous voting habits, or if we simply asked him for whom he voted (in which case, we might also want to know how often Jim lies).

Induction by Shared Properties

Induction by shared properties involves noting the similarity between two things with respect to their properties, and inferring from this that they may share other properties.

A familiar example of this is how a company might recommend products to you based on other customers’ purchases. Amazon.com tells me, for instance, that customers who bought the complete Sex and the City DVD series also bought Lipstick Jungle and Twilight.

Assuming that people buy things because they like them, we can rephrase this as:

(P1) There are a large number of people who, if they like Sex and the City and Twilight, will also like Lipstick Jungle.

I could also make the following observation:

(P2) I like Sex and the City and Twilight.

And then infer from these two premises that:

(C) I would also like Lipstick Jungle.

And I did. In general, induction by shared properties assumes that if something has properties w, x, y, and z, and if something else has properties w, x, and y, then it’s reasonable to assume that that something else also has property z. Note that in the above example all of the properties were actually preferences with regard to entertainment. The kinds of properties involved in the comparison can and will make an argument better or worse. Let’s consider a worse induction.
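The general pattern can be sketched with sets. This is a minimal illustration, not a formal measure of inductive strength: the function name `shared_fraction` is an assumption made for this example, and it only counts overlapping properties without judging whether they are relevant to the target.

```python
def shared_fraction(known, candidate, target):
    """Fraction of the known thing's other properties that the candidate shares.

    `known` has the target property; we ask how closely `candidate`
    resembles it on every property except the target.
    """
    if target not in known:
        raise ValueError("the known thing must have the target property")
    basis = known - {target}
    if not basis:
        raise ValueError("need at least one property besides the target")
    return len(basis & candidate) / len(basis)


first = {"w", "x", "y", "z"}
second = {"w", "x", "y"}
print(shared_fraction(first, second, "z"))  # full overlap on w, x, y
```

A high fraction only shows that the two things are alike in the listed respects. Whether those respects have anything to do with the target property is a separate question that the count cannot answer.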

(P1) Lisa is tall, has blonde hair, has blue eyes, and rocks out to Nirvana on weekends.

(P2) Gina is tall, has blonde hair, and has blue eyes.

(C) Therefore Gina probably rocks out to Nirvana on weekends.

In this case the properties don’t seem to be related in the same way as in the first example. While the first three are physical characteristics, the last property instead indicates to us that Lisa is stuck in a ’90s grunge phase. Gina, though she shares several properties with Lisa, might not share the same undying love for Kurt Cobain. Let’s try a stronger argument.

(P1) Bob and Dick both wear plaid shirts all the time, wear large plastic-rimmed glasses, and listen to bands you’ve never heard of.

(P2) Bob drinks PBR.

(C) Dick probably also drinks PBR.

Here we can identify the qualities that Bob and Dick have in common as symptoms of hipsterism. The fact that Bob drinks PBR is another symptom of this affectation. Given that Dick is exhibiting most of the same symptoms, the idea that Dick would also drink PBR is a reasonable assumption to make.

Practical Uses

A procedure very much like Induction by Shared Properties is performed by nurses and doctors when they diagnose a patient’s condition. Their thinking goes like this:

(P1) Patients who have elephantitus display an increased heart rate, elevated blood pressure, a rash on their skin, and a strong desire to visit the elephant pen at the zoo.

(P2) The patient here in front of me has an increased heart rate, elevated blood pressure, and a strong desire to visit the elephant pen at the zoo.

(C) It is probable, therefore, that the patient here in front of me has elephantitus.

The more closely a patient’s symptoms match the ‘textbook definition’ of a given disease, the more likely it is that the patient has that disease. Caregivers then treat the patient for the disease that they think the patient probably has. If the disease doesn’t respond to the treatment, or the patient starts to present different symptoms, then they consider other conditions with similar symptoms that the patient is likely to have.
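The matching step can be sketched as a ranking of candidate conditions by how much of each textbook profile the patient displays. This is an illustration only, not a clinical method: the function name `rank_conditions` and the second profile, "condition B", are hypothetical and were made up for contrast.

```python
def rank_conditions(profiles, observed):
    """Rank conditions by the fraction of their textbook symptoms that were observed."""
    scores = {
        name: len(symptoms & observed) / len(symptoms)
        for name, symptoms in profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


profiles = {
    # Profile from (P1) above.
    "elephantitus": {
        "increased heart rate",
        "elevated blood pressure",
        "skin rash",
        "desire to visit the elephant pen",
    },
    # Hypothetical profile, invented for this example.
    "condition B": {"increased heart rate", "fever", "cough"},
}

# Symptoms from (P2) above.
observed = {
    "increased heart rate",
    "elevated blood pressure",
    "desire to visit the elephant pen",
}

for name, score in rank_conditions(profiles, observed):
    print(f"{name}: {score:.0%} of textbook symptoms present")
```

If treatment fails or new symptoms appear, the observed set changes and the ranking is recomputed, which mirrors how caregivers move on to other conditions with similar symptoms.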

Induction by Shared Relations

Induction by shared relations is much like induction by shared properties, except that what is shared is not properties but relations. A simple example is the causal relation, from which we might make an inductive argument like this:

(P1) Percocet, Oxycontin and Morphine reduce pain, cause drowsiness, and may be habit forming.

(P2) Heroin also reduces pain and causes drowsiness.

(C) Heroin is probably also habit forming.

In this case pain relief, drowsiness, and addiction are all assumed to be effects caused by the drugs listed. Using an induction by shared relations, we can conclude that if heroin, like the other drugs, reduces pain and causes drowsiness, it is probably also habit forming.

Another interesting example is the relations we have with other people. For instance, Facebook knows everything about you. But let’s focus on the “friends with” relation. Facebook compares your friends with the friends of your friends in order to determine who else you might actually know. The induction goes a little like this:

(P1) Donna is friends with Brandon, Kelly, Steve, and Brenda.

(P2) David is friends with Brandon, Kelly, and Steve.

(C) David probably also knows Brenda.

We could strengthen that argument if we knew that Brandon, Kelly, Steve, and Brenda were all friends with each other as well. We could also make an alternate conclusion based on the same argument above:

(C) David probably also knows Donna.

They do, after all, know at least three of the same people. They’ve probably run into each other at some point.
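Both conclusions can be reproduced with a small mutual-friends count. This is a sketch of the reasoning only and makes no claim about how Facebook actually computes its suggestions. The function name `suggestions` and the dictionary layout are assumptions made for this example.

```python
def suggestions(graph, person):
    """Suggest people `person` may know, scored by number of mutual friends."""
    mine = graph[person]
    scores = {}
    for other, theirs in graph.items():
        if other == person or other in mine:
            continue
        mutual = len(mine & theirs)
        if mutual == 0:
            continue
        # Someone with many of the same friends is a candidate themselves...
        scores[other] = max(scores.get(other, 0), mutual)
        # ...and so are their friends that `person` does not yet list.
        for candidate in theirs - mine - {person}:
            scores[candidate] = max(scores.get(candidate, 0), mutual)
    # Highest score first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Friend lists from (P1) and (P2) above.
friends = {
    "Donna": {"Brandon", "Kelly", "Steve", "Brenda"},
    "David": {"Brandon", "Kelly", "Steve"},
}

print(suggestions(friends, "David"))
```

Brenda and Donna both come out with three mutual friends, which is why either conclusion can be drawn from the same premises.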

Critical Thinking: A Simple Guide and Why It’s Important

Critical Thinking: A Simple Guide and Why It’s Important was originally published on Ivy Exec.

Strong critical thinking skills are crucial for career success, regardless of educational background. They underpin astute and effective decision-making and add an invaluable dimension to professional growth.

At its essence, critical thinking is the ability to analyze, evaluate, and synthesize information in a logical and reasoned manner. It’s not merely about accumulating knowledge but harnessing it effectively to make informed decisions and solve complex problems. In the dynamic landscape of modern careers, honing this skill is paramount.

The Impact of Critical Thinking on Your Career

☑ Problem-Solving Mastery

Visualize critical thinking as the Sherlock Holmes of your career journey. It facilitates swift problem resolution akin to a detective unraveling a mystery. By methodically analyzing situations and deconstructing complexities, critical thinkers emerge as adept problem solvers, rendering them invaluable assets in the workplace.

☑ Refined Decision-Making

Navigating dilemmas in your career path resembles traversing uncertain terrain. Critical thinking acts as a dependable GPS, steering you toward informed decisions. It involves weighing options, evaluating potential outcomes, and confidently choosing the most favorable path forward.

☑ Enhanced Teamwork Dynamics

Within collaborative settings, critical thinkers stand out as proactive contributors. They engage in scrutinizing ideas, proposing enhancements, and fostering meaningful contributions. Consequently, the team evolves into a dynamic hub of ideas, with the critical thinker recognized as the architect behind its success.

☑ Communication Prowess

Effective communication is the cornerstone of professional interactions. Critical thinking enriches communication skills, enabling the clear and logical articulation of ideas. Whether in emails, presentations, or casual conversations, individuals adept in critical thinking exude clarity, earning appreciation for their ability to convey thoughts seamlessly.

☑ Adaptability and Resilience

Perceptive individuals adept in critical thinking display resilience in the face of unforeseen challenges. Instead of succumbing to panic, they assess situations, recalibrate their approaches, and persist in moving forward despite adversity.

☑ Fostering Innovation

Innovation is the lifeblood of progressive organizations, and critical thinking serves as its catalyst. Proficient critical thinkers possess the ability to identify overlooked opportunities, propose inventive solutions, and streamline processes, thereby positioning their organizations at the forefront of innovation.

☑ Confidence Amplification

Critical thinkers exude confidence derived from honing their analytical skills. This self-assurance radiates during job interviews, presentations, and daily interactions, catching the attention of superiors and propelling career advancement.

So, how can one cultivate and harness this invaluable skill?

✅ Developing Curiosity and Inquisitiveness:

Embrace a curious mindset by questioning the status quo and exploring topics beyond your immediate scope. Cultivate an inquisitive approach to everyday situations. Encourage a habit of asking “why” and “how” to deepen understanding. Curiosity fuels the desire to seek information and alternative perspectives.

✅ Practice Reflection and Self-Awareness:

Engage in reflective thinking by assessing your thoughts, actions, and decisions. Regularly introspect to understand your biases, assumptions, and cognitive processes. Cultivate self-awareness to recognize personal prejudices or cognitive biases that might influence your thinking. This allows for a more objective analysis of situations.

✅ Strengthening Analytical Skills:

Practice breaking down complex problems into manageable components. Analyze each part systematically to understand the whole picture. Develop skills in data analysis, statistics, and logical reasoning. This includes understanding correlation versus causation, interpreting graphs, and evaluating statistical significance.

✅ Engaging in Active Listening and Observation:

Actively listen to diverse viewpoints without immediately forming judgments. Allow others to express their ideas fully before responding. Observe situations attentively, noticing details that others might overlook. This habit enhances your ability to analyze problems more comprehensively.

✅ Encouraging Intellectual Humility and Open-Mindedness:

Foster intellectual humility by acknowledging that you don’t know everything. Be open to learning from others, regardless of their position or expertise. Cultivate open-mindedness by actively seeking out perspectives different from your own. Engage in discussions with people holding diverse opinions to broaden your understanding.

✅ Practicing Problem-Solving and Decision-Making:

Engage in regular problem-solving exercises that challenge you to think creatively and analytically. This can include puzzles, riddles, or real-world scenarios. When making decisions, consciously evaluate available information, consider various alternatives, and anticipate potential outcomes before reaching a conclusion.

✅ Continuous Learning and Exposure to Varied Content:

Read extensively across diverse subjects and formats, exposing yourself to different viewpoints, cultures, and ways of thinking. Engage in courses, workshops, or seminars that stimulate critical thinking skills. Seek out opportunities for learning that challenge your existing beliefs.

✅ Engage in Constructive Disagreement and Debate:

Encourage healthy debates and discussions in which differing opinions are aired respectfully. This practice fosters the ability to defend your viewpoints logically while also being open to changing your perspective based on valid arguments. Embrace disagreement as an opportunity to learn rather than a conflict to win. Engaging in constructive debate sharpens your ability to evaluate and counter arguments effectively.

✅ Utilize Problem-Based Learning and Real-World Applications:

Engage in problem-based learning activities that simulate real-world challenges. Work on projects or scenarios that require critical thinking skills to develop practical problem-solving approaches. Apply critical thinking in real-life situations whenever possible. This could involve analyzing news articles, evaluating product reviews, or dissecting marketing strategies to understand their underlying rationale.

In conclusion, critical thinking is the linchpin of a successful career journey. It empowers individuals to navigate complexities, make informed decisions, and innovate in their respective domains. Embracing and honing this skill isn’t just an advantage; it’s a necessity in a world where adaptability and sound judgment reign supreme.

So, as you traverse your career path, remember that the ability to think critically is not just an asset but the differentiator that propels you toward excellence.
