Perceptual Set In Psychology: Definition & Examples

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD, is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.


Perceptual set in psychology refers to a mental predisposition or readiness to perceive stimuli in a particular way based on previous experiences, expectations, beliefs, and context. It influences how we interpret and make sense of sensory information, shaping our perception and understanding of the world.

Perceptual set theory stresses the idea of perception as an active process involving selection, inference, and interpretation (known as top-down processing).

The concept of perceptual set is important to the active process of perception.  Allport (1955) defined perceptual set as:

“A perceptual bias or predisposition or readiness to perceive particular features of a stimulus.”

Perceptual set is a tendency to perceive or notice some aspects of the available sensory data and ignore others. According to Vernon (1955), perceptual set works in two ways:

  • The perceiver has certain expectations and focuses attention on particular aspects of the sensory data. Vernon calls this a “Selector”.
  • The perceiver knows how to classify, understand and name selected data and what inferences to draw from it. Vernon calls this an “Interpreter”.

It has been found that a number of variables, or factors, influence perceptual set, and set in turn influences perception. The factors include:

  • Expectations
  • Emotion
  • Motivation
  • Culture

Expectation and Perceptual Set

(a) Bruner & Minturn (1955) illustrated how expectation could influence set by showing participants an ambiguous figure “13” set in the context of either letters or numbers, for example between the letters A and C or between the numbers 12 and 14.


The physical stimulus “13” is the same in each case but is perceived differently because of the influence of the context in which it appears. We EXPECT to see a letter in the context of other letters of the alphabet, whereas we EXPECT to see numbers in the context of other numbers.

(b) We may fail to notice printing/writing errors for the same reason. For example:

1. “The Cat Sat on the Map and Licked its Whiskers”.


(a) and (b) are examples of interaction between expectation and past experience.

(c) A study by Bugelski and Alampay (1961) using the “rat-man” ambiguous figure also demonstrated the importance of expectation in inducing set. Participants were shown either a series of animal pictures or neutral pictures prior to exposure to the ambiguous picture. They found participants were significantly more likely to perceive the ambiguous picture as a rat if they had had prior exposure to animal pictures.


Motivation / Emotion and Perceptual Set

Allport (1955) has distinguished 6 types of motivational-emotional influence on perception:

(i) bodily needs (e.g. physiological needs) (ii) reward and punishment (iii) emotional connotation (iv) individual values (v) personality (vi) the value of objects.

(a) Sanford (1936) deprived participants of food for varying lengths of time, up to 4 hours, and then showed them ambiguous pictures. Participants were more likely to interpret the pictures as something to do with food if they had been deprived of food for a longer period of time.

Similarly, Gilchrist & Nesberg (1952) found that participants who had gone without food for the longest periods were more likely to rate pictures of food as brighter. This effect did not occur with non-food pictures.

(b) A more recent study into the effect of emotion on perception was carried out by Kunst-Wilson & Zajonc (1980). Participants were repeatedly presented with geometric figures, but at levels of exposure too brief to permit recognition.

Then, on each of a series of test trials, participants were presented with a pair of geometric forms, one of which had previously been presented and one of which was brand new. For each pair, participants had to answer two questions: (a) Which of the two had previously been presented? (a recognition test); and (b) Which of the two was more attractive? (a feeling test).

The hypothesis for this study was based on a well-known finding that the more we are exposed to a stimulus, the more familiar we become with it and the more we like it.  Results showed no discrimination on the recognition test – they were completely unable to tell old forms from new ones, but participants could discriminate on the feeling test, as they consistently favored old forms over new ones. Thus information that is unavailable for conscious recognition seems to be available to an unconscious system that is linked to affect and emotion.

Culture and Perceptual Set


Elephant drawing: split-view and top-view perspective. The split elephant drawing was generally preferred by African children and adults.

(a) Deregowski (1972) investigated whether pictures are seen and understood in the same way in different cultures. His findings suggest that perceiving perspective in drawings is in fact a specific cultural skill, which is learned rather than automatic. He found that people from several cultures prefer drawings which don't show perspective, but instead are split so as to show both sides of an object at the same time.

In one study he found a fairly consistent preference among African children and adults for split-type drawings over perspective drawings. Split-type drawings show all the important features of an object which could not normally be seen at once from that perspective. Perspective drawings give just one view of an object. Deregowski argued that this split-style representation is universal and is found in European children before they are taught differently.

(b) Hudson (1960) noted difficulties among South African Bantu workers in interpreting depth cues in pictures. Such cues are important because they convey information about the spatial relationships among the objects in pictures. A person using depth cues will extract a different meaning from a picture than a person not using such cues.

Hudson tested pictorial depth perception by showing participants a picture like the one below. A correct interpretation is that the hunter is trying to spear the antelope, which is nearer to him than the elephant. An incorrect interpretation is that the elephant is nearer and about to be speared. The picture contains two depth cues: overlapping objects and known size of objects. Questions were asked in the participants' native language, such as:

What do you see? Which is nearer, the antelope or the elephant? What is the man doing?

The results indicated that both children and adults found it difficult to perceive depth in the pictures.


The cross-cultural studies seem to indicate that history and culture play an important part in how we perceive our environment. Perceptual set is concerned with the active nature of perceptual processes and clearly there may be a difference cross-culturally in the kinds of factors that affect perceptual set and the nature of the effect.

Allport, F. H. (1955). Theories of perception and the concept of structure. New York: Wiley.

Bruner, J. S., & Minturn, A. L. (1955). Perceptual identification and perceptual organisation. Journal of General Psychology, 53, 21-28.

Bugelski, B. R., & Alampay, D. A. (1961). The role of frequency in developing perceptual sets. Canadian Journal of Psychology, 15, 205-211.

Deregowski, J. B., Muldrow, E. S., & Muldrow, W. F. (1972). Pictorial recognition in a remote Ethiopian population. Perception, 1, 417-425.

Gilchrist, J. C., & Nesberg, L. S. (1952). Need and perceptual change in need-related objects. Journal of Experimental Psychology, 44(6).

Hudson, W. (1960). Pictorial depth perception in sub-cultural groups in Africa. Journal of Social Psychology, 52, 183-208.

Kunst-Wilson, W. R., & Zajonc, R. B. (1980). Affective discrimination of stimuli that cannot be recognised. Science, 207, 557-558.

Necker, L. (1832). LXI. Observations on some remarkable optical phenomena seen in Switzerland; and on an optical phenomenon which occurs on viewing a figure of a crystal or geometrical solid. The London and Edinburgh Philosophical Magazine and Journal of Science, 1(5), 329-337.

Sanford, R. N. (1936). The effect of abstinence from food upon imaginal processes: a preliminary experiment. Journal of Psychology: Interdisciplinary and Applied, 2, 129-136.

Vernon, M. D. (1955). The functions of schemata in perceiving. Psychological Review, 62(3).

Why should people be skeptical when evaluating the accuracy of their perceptual set?

People should be skeptical when evaluating the accuracy of their perceptual set because it can lead to biased and subjective interpretations of reality. It can limit our ability to consider alternative perspectives or recognize new information that challenges our beliefs. Awareness of our perceptual sets and actively questioning them allows for more open-mindedness, critical thinking, and a more accurate understanding of the world.


5.6 The Gestalt Principles of Perception

Learning objectives.

By the end of this section, you will be able to:

  • Explain the figure-ground relationship
  • Define Gestalt principles of grouping
  • Describe how perceptual set is influenced by an individual’s characteristics and mental state

   In the early part of the 20th century, Max Wertheimer published a paper demonstrating that individuals perceived motion in rapidly flickering static images—an insight that came to him as he used a child’s toy tachistoscope. Wertheimer, and his assistants Wolfgang Köhler and Kurt Koffka, who later became his partners, believed that perception involved more than simply combining sensory stimuli. This belief led to a new movement within the field of psychology known as Gestalt psychology. The word gestalt literally means form or pattern, but its use reflects the idea that the whole is different from the sum of its parts. In other words, the brain creates a perception that is more than simply the sum of available sensory inputs, and it does so in predictable ways. Gestalt psychologists translated these predictable ways into principles by which we organize sensory information. As a result, Gestalt psychology has been extremely influential in the area of sensation and perception (Rock & Palmer, 1990).

Gestalt perspectives in psychology investigate ambiguous stimuli to determine where and how the brain resolves these ambiguities. They also aim to understand sensation and perception as the processing of information in groups or wholes, rather than as wholes constructed from many small parts. This perspective has been supported by modern cognitive science through fMRI research demonstrating that some parts of the brain, specifically the lateral occipital lobe and the fusiform gyrus, are involved in the processing of whole objects, as opposed to the primary occipital areas that process individual elements of stimuli (Kubilius, Wagemans & Op de Beeck, 2011).

One Gestalt principle is the figure-ground relationship. According to this principle, we tend to segment our visual world into figure and ground. Figure is the object or person that is the focus of the visual field, while the ground is the background. As the figure below shows, our perception can vary tremendously, depending on what is perceived as figure and what is perceived as ground. Presumably, our ability to interpret sensory information depends on what we label as figure and what we label as ground in any particular case, although this assumption has been called into question (Peterson & Gibson, 1994; Vecera & O’Reilly, 1998).

An illustration shows two identical black face-like shapes that face towards one another, and one white vase-like shape that occupies all of the space in between them. Depending on which part of the illustration is focused on, either the black shapes or the white shape may appear to be the object of the illustration, leaving the other(s) perceived as negative space.

The concept of figure-ground relationship explains why this image can be perceived either as a vase or as a pair of faces.

   Another Gestalt principle for organizing sensory stimuli into meaningful perception is proximity . This principle asserts that things that are close to one another tend to be grouped together, as the figure below illustrates.

The Gestalt principle of proximity suggests that you see (a) one block of dots on the left side and (b) three columns on the right side.

   How we read something provides another illustration of the proximity concept. For example, we read this sentence like this, notl iket hiso rt hat. We group the letters of a given word together because there are no spaces between the letters, and we perceive words because there are spaces between each word. Here are some more examples: Cany oum akes enseo ft hiss entence? What doth es e wor dsmea n?
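The reading example can be made concrete with a small sketch. It is only an illustration of grouping by gaps, not a model of human vision: the function names and the `max_gap` threshold are assumptions introduced here. Elements are placed in the same group whenever the gap to their neighbor is small, and whitespace plays the role of the gap when reading.

```python
def group_by_proximity(positions, max_gap):
    """Group 1-D positions: a new group starts whenever the gap to the
    previous element is larger than max_gap (a hypothetical threshold)."""
    groups = []
    for p in sorted(positions):
        if groups and p - groups[-1][-1] <= max_gap:
            groups[-1].append(p)   # close enough: same group
        else:
            groups.append([p])     # too far away: start a new group
    return groups


def group_letters(text):
    """Treat spaces as the gaps that separate letter groups (words)."""
    return text.split()


# Three clusters of dots separated by wide gaps.
print(group_by_proximity([0, 1, 2, 10, 11, 12, 20, 21], max_gap=2))

# The same letters, grouped by where the spaces fall.
print(group_letters("not like this"))
print(group_letters("notl iket hiso"))
```

In the last call the letters are unchanged, but the spacing forces a different grouping, which is why the misplaced spaces make the sentence hard to read.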

We might also use the principle of similarity to group things in our visual fields. According to this principle, things that are alike tend to be grouped together (figure below). For example, when watching a football game, we tend to group individuals based on the colors of their uniforms. When watching an offensive drive, we can get a sense of the two teams simply by grouping along this dimension.

When looking at this array of dots, we likely perceive alternating rows of colors. We are grouping these dots according to the principle of similarity.

   Two additional Gestalt principles are the law of continuity (or good continuation) and closure. The law of continuity suggests that we are more likely to perceive continuous, smooth flowing lines rather than jagged, broken lines (figure below). The principle of closure states that we organize our perceptions into complete objects rather than as a series of parts (figure below).

Good continuation would suggest that we are more likely to perceive this as two overlapping lines, rather than four lines meeting in the center.

Closure suggests that we will perceive a complete circle and rectangle rather than a series of segments.

   According to Gestalt theorists, pattern perception, or our ability to discriminate among different figures and shapes, occurs by following the principles described above. You probably feel fairly certain that your perception accurately matches the real world, but this is not always the case. Our perceptions are based on perceptual hypotheses: educated guesses that we make while interpreting sensory information. These hypotheses are informed by a number of factors, including our personalities, experiences, and expectations. We use these hypotheses to generate our perceptual set. For instance, research has demonstrated that those who are given verbal priming produce a biased interpretation of complex ambiguous figures (Goolkasian & Woodbury, 2010).

Template Approach

Ulrich Neisser (1967), author of one of the first cognitive psychology textbooks, suggested that pattern recognition would be much simpler if all the patterns we experienced were identical. According to this theory, it is easier to recognize something if it matches exactly what we have perceived before. The real environment, however, is endlessly dynamic and produces countless combinations of orientation and size. So how is it that we can still read a letter g whether it is capitalized, lowercase, or in someone else's handwriting? Neisser suggested that the brain categorizes information by creating mental templates, stored models of all possible categorizable patterns (Radvansky & Ashcraft, 2014). When a computer reads your debit card information, it compares the information you enter to a template of what the number should look like (a specific number of digits, no letters or symbols…). The template view of perception easily explains how we recognize pieces of our environment, but it cannot explain why we are still able to recognize things when they are not viewed from the same angle, from the same distance, or in the same context.
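The card-number comparison can be sketched in a few lines. This is a minimal illustration under stated assumptions: the 16-digit pattern and the name `matches_template` are hypothetical and are not taken from any real payment system.

```python
import re

# Hypothetical stored template: exactly 16 digits, no letters or symbols.
CARD_TEMPLATE = re.compile(r"[0-9]{16}")


def matches_template(entry: str) -> bool:
    """Return True only if the whole entry fits the stored template exactly."""
    return CARD_TEMPLATE.fullmatch(entry) is not None


print(matches_template("1234567812345678"))      # fits the template exactly
print(matches_template("1234 5678 1234 5678"))   # same digits, different layout
print(matches_template("12345678123456ab"))      # contains letters
```

The second entry carries the same digits as the first but is rejected because its layout differs from the template. That rigidity is the weakness of pure template matching described above: any change in form defeats the match.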

To address the shortfalls of the template model of perception, the feature detection approach to visual perception suggests that we recognize specific features of what we are looking at, for example the straight lines in an H versus the curved line of a C. Rather than matching an entire template-like pattern for the capital letter H, we identify the elemental features that are present in the H. Several people have proposed theories of feature-based pattern recognition. One of these, described by Selfridge (1959), is known as the pandemonium model. It suggests that perceived information is processed through several stages by what Selfridge described as mental demons, who shout out loud as they attempt to identify patterns in the stimuli. Pattern demons sit at the lowest level of perception; after they identify patterns, computational demons further analyze features, such as straight or curved lines, to match them to templates. At the highest level of discrimination, cognitive demons allow stimuli to be categorized in terms of context and other higher-order classifications, and the decision demon decides, among all the demons shouting about what the stimulus is, which will be selected for interpretation.


Selfridge’s pandemonium model, showing the various levels of demons that make estimations and pass the information on to the next level before the decision demon makes the best estimate of what the stimulus is. Adapted from Lindsay and Norman (1972).

Although Selfridge's idea of layers of shouting demons that make up our ability to discriminate features of our environment may sound fanciful, the model actually incorporates several ideas that are important for pattern recognition. First, at its foundation, this model is a feature detection model that incorporates higher levels of processing as the information is processed in time. Second, the Selfridge model of many different shouting demons incorporates ideas of parallel processing, suggesting that many different forms of stimuli can be analyzed and processed to some extent at the same time. Third and finally, the model suggests that perception is, in a very real sense, a series of problem-solving procedures in which we take bits of information and piece them together to create something we are able to recognize and classify as meaningful.
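A toy version of the layered "shouting" idea might look like the sketch below. It is a simplification under stated assumptions, not Selfridge's specification: the feature inventory, the three function names, and the scoring rule (matching features minus mismatched ones) are all invented here for illustration.

```python
# Hypothetical feature inventory for a few capital letters (illustrative only).
LETTER_FEATURES = {
    "H": {"vertical_left", "vertical_right", "horizontal_middle"},
    "C": {"open_curve"},
    "L": {"vertical_left", "horizontal_bottom"},
    "E": {"vertical_left", "horizontal_top", "horizontal_middle", "horizontal_bottom"},
}


def feature_demons(stimulus_features):
    """Lowest level: each feature demon reports whether its feature is present."""
    known = set().union(*LETTER_FEATURES.values())
    return {f for f in known if f in stimulus_features}


def cognitive_demons(detected):
    """Each letter demon 'shouts' with a loudness that rises with matching
    features and falls with missing or unexpected ones (assumed rule)."""
    shouts = {}
    for letter, wanted in LETTER_FEATURES.items():
        hits = len(wanted & detected)
        mismatches = len(wanted ^ detected)
        shouts[letter] = hits - mismatches
    return shouts


def decision_demon(shouts):
    """Top level: pick the letter whose demon shouts loudest."""
    return max(shouts, key=shouts.get)


stimulus = {"vertical_left", "vertical_right", "horizontal_middle"}
shouts = cognitive_demons(feature_demons(stimulus))
print(shouts)
print(decision_demon(shouts))
```

Each letter demon computes its score independently of the others, which is the parallel-processing idea noted above. Nothing in this sketch uses context, so it is purely data-driven.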

Beyond sounding initially improbable because it rests on a series of fictional shouting demons, Selfridge’s demon model of feature detection is mainly criticized for being primarily a bottom-up, or data-driven, processing system. This means that all of the feature detection and processing for discrimination comes from what we take in from the environment. Modern cognitive science has argued against strictly bottom-up processing models, suggesting that context plays an extremely important role in determining what you perceive and how you discriminate between stimuli. To build on previous models, cognitive scientists proposed an additional top-down, or conceptually driven, account in which higher-level knowledge, such as the context in which something tends to occur or a person's expectations, influences lower-level processes.
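One common way to illustrate the interaction of the two directions is to weight the bottom-up evidence by a top-down expectation and then normalize. The sketch below assumes this simple multiplicative scheme; the function name `perceive` and all of the numbers are hypothetical and are not drawn from any study cited here.

```python
def perceive(evidence, expectation):
    """Weight bottom-up evidence by top-down expectation, then normalize
    so the resulting scores sum to 1 (an assumed, simplified scheme)."""
    scores = {label: evidence[label] * expectation[label] for label in evidence}
    total = sum(scores.values())
    weighted = {label: score / total for label, score in scores.items()}
    best = max(weighted, key=weighted.get)
    return best, weighted


# An ambiguous character: the sensory data fit a letter and a number equally well.
ambiguous = {"letter": 0.5, "number": 0.5}

# Hypothetical expectations created by the surrounding context.
among_letters = {"letter": 0.9, "number": 0.1}
among_numbers = {"letter": 0.1, "number": 0.9}

print(perceive(ambiguous, among_letters)[0])
print(perceive(ambiguous, among_numbers)[0])
```

The sensory evidence is identical in both calls; only the expectation changes, and with it the interpretation that wins.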

Finally, the most modern theories that attempt to describe how information is processed for perception and discrimination are known as connectionist models. Connectionist models involve an enormous number of mathematical computations that work in parallel across a series of interrelated, web-like structures, using top-down and bottom-up processes to narrow down the most probable solution for the discrimination. Each unit in a connectionist layer is massively connected, in a giant web, with many or all of the units in the next layer of discrimination. Within these models, even if there are not many features present in the stimulus, the number of computations in a single run of discrimination becomes incredibly large because of all the connections that exist between units and layers.
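To give a rough sense of scale, the sketch below counts the connections in a fully connected stack of layers and runs a single layer's computation. It is a minimal illustration under stated assumptions: the layer sizes, the weights, and the sigmoid activation are chosen for the example, and only the feed-forward direction is shown.

```python
import math


def count_connections(layer_sizes):
    """Connections when every unit links to every unit in the next layer."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


def forward(inputs, weights):
    """One fully connected layer: each output unit sums all weighted inputs
    and squashes the result with a sigmoid (an assumed activation)."""
    return [
        1 / (1 + math.exp(-sum(w * x for w, x in zip(row, inputs))))
        for row in weights
    ]


print(count_connections([4, 3, 2]))          # a tiny network
print(count_connections([100, 100, 100]))    # a modest one is already large

# Two input units feeding two output units (hypothetical weights).
print(forward([1.0, 0.0], [[0.0, 0.0], [2.0, -1.0]]))
```

Each connection contributes one multiplication per pass, so the amount of computation grows with the product of adjacent layer sizes rather than with the number of features in the stimulus.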

The Depths of Perception: Bias, Prejudice, and Cultural Factors

   In this chapter, you have learned that perception is a complex process. Built from sensations, but influenced by our own experiences, biases, prejudices, and cultures, perceptions can be very different from person to person. Research suggests that implicit racial prejudice and stereotypes affect perception. For instance, several studies have demonstrated that non-Black participants identify weapons faster and are more likely to identify non-weapons as weapons when the image of the weapon is paired with the image of a Black person (Payne, 2001; Payne, Shimizu, & Jacoby, 2005). Furthermore, White individuals’ decisions to shoot an armed target in a video game are made more quickly when the target is Black (Correll, Park, Judd, & Wittenbrink, 2002; Correll, Urland, & Ito, 2006). This research is important, considering the number of very high-profile cases in the last few decades in which young Black people were killed by people who claimed to believe that the unarmed individuals were armed and/or represented some threat to their personal safety.

Gestalt theorists have been incredibly influential in the areas of sensation and perception. Gestalt principles such as figure-ground relationship, grouping by proximity or similarity, the law of good continuation, and closure are all used to help explain how we organize sensory information. Our perceptions are not infallible, and they can be influenced by bias, prejudice, and other factors.

References:

Openstax Psychology text by Kathryn Dumper, William Jenkins, Arlene Lacombe, Marilyn Lovett and Marion Perlmutter licensed under CC BY v4.0. https://openstax.org/details/books/psychology

Review Questions:

1. According to the principle of ________, objects that occur close to one another tend to be grouped together.

a. similarity

b. good continuation

c. proximity

2. Our tendency to perceive things as complete objects rather than as a series of parts is known as the principle of ________.

d. similarity

3. According to the law of ________, we are more likely to perceive smoothly flowing lines rather than choppy or jagged lines.

4. The main point of focus in a visual display is known as the ________.

b. perceptual set

Critical Thinking Question:

1. The central tenet of Gestalt psychology is that the whole is different from the sum of its parts. What does this mean in the context of perception?

2. Take a look at the following figure. How might you influence whether people see a duck or a rabbit?

A drawing appears to be a duck when viewed horizontally and a rabbit when viewed vertically.

Personal Application Question:

1. Have you ever listened to a song on the radio and sung along only to find out later that you have been singing the wrong lyrics? Once you found the correct lyrics, did your perception of the song change?


Key Takeaways

1. This means that perception cannot be understood completely simply by combining the parts. Rather, the relationship that exists among those parts (which would be established according to the principles described in this chapter) is important in organizing and interpreting sensory information into a perceptual set.

2. Playing on their expectations could be used to influence what they were most likely to see. For instance, telling a story about Peter Rabbit and then presenting this image would bias perception along rabbit lines.

closure:  organizing our perceptions into complete objects rather than as a series of parts

figure-ground relationship:  segmenting our visual world into figure and ground

Gestalt psychology:  field of psychology based on the idea that the whole is different from the sum of its parts

good continuation:  (also, continuity) we are more likely to perceive continuous, smooth flowing lines rather than jagged, broken lines

pattern perception:  ability to discriminate among different figures and shapes

perceptual hypothesis:  educated guess used to interpret sensory information

principle of closure:  organize perceptions into complete objects rather than as a series of parts

proximity:  things that are close to one another tend to be grouped together

similarity:  things that are alike tend to be grouped together



Key Theories On The Psychology Of Perception

Perception is defined as “the process or result of becoming aware of objects, relationships, and events by means of the senses, which includes such activities as recognizing, observing, and discriminating.” It allows us to notice and then interpret stimuli around us so we can understand and respond accordingly. While perception may seem simple, it’s actually a complex and highly individualized process with many psychological components and implications. Below, we’ll cover the basics of perception psychology along with a few of the leading theories on this topic.

How we perceive the world around us

Let’s start with a brief overview of the basic mechanisms of perception—that is, the ways in which we’re able to perceive the world around us. Scientists now recognize seven senses that humans can use to gather information about our surroundings:

  • Visual perception: sight perceived through the eyes
  • Auditory perception: sounds perceived through the ears
  • Gustatory perception: awareness of flavor and taste on the tongue 
  • Olfactory perception: smelling via the nose
  • Tactile perception: awareness of sensation on the skin
  • Vestibular sense: perception of balance and motion
  • Proprioception: perception of the body’s position in space

A brief introduction to perception psychology

Perception psychology is a division of cognitive psychology that studies how humans receive and understand the information delivered through the senses. As mentioned above, perception is a network of bodily systems and sense organs that receive information and then process it. As we interact with the physical world, our brains interpret this information to make sense of what we experience. 

Our brains also automatically attempt to group perceptions to help us understand and interpret our world. There are six main principles the human mind uses to organize what it perceives:

Similarity

grouping things that look like each other. Items with the same shape, size, and/or color make up parts of perceived patterns that appear to belong together.

Proximity

grouping things according to how physically close they are to each other. The closer together they are, the more likely the brain will identify them as a group, even if they don’t have any connection to each other.

Continuity

the tendency to perceive individual elements as a whole rather than a series of parts

Inclusiveness

perceiving all elements of an image before recognizing the parts of it. For example, you may sense a car before recognizing the color, make, or who is inside.

Closure

seeing a partial image and filling in the gaps of what is believed should be there. This ability allows one to overlook a partial understanding and perceive the situation in its entirety despite missing information.

Pragnanz

a tendency to simplify complex stimuli into a simple pattern. An example is looking at a complex building and being aware of the front door while not registering the structure’s many other features.

Main perception psychology theories today

Psychologists and researchers continue to explore the nuances of this complex field. As of today, here’s a brief overview of some of the key perception psychology theories out there. Note that none of these completely explains the process in every instance; this field of study is ongoing.

Perception psychology according to Bruner

Jerome S. Bruner was an American psychologist who theorized that people go through various processes before they form opinions about what they have observed. According to Bruner, people use different informational cues to ultimately define their perceptions. This information-seeking continues until the individual comes across a familiar part and the mind categorizes it. If signals are distorted or do not fit a person’s initial perceptions, the images are forgotten or ignored while a picture forms on the most familiar perceptions. 

Perception psychology according to Gibson

James J. Gibson is another American psychologist who studied perception psychology. Gibson is known in particular for his direct theory of visual perception, also called the “bottom-up” theory. He believed we can explain visual perception solely in terms of the environment, beginning with a sensory stimulus. In each stage of the perceptual process, the eyes send signals to the brain to continue analyzing until it can conclude what the person is seeing.

Gibson theorized that the starting point of visual perception begins with the pattern of light that reaches our eyes. These signals then form the basis of our understanding of perceptions because they convey unambiguous information about the spatial layout we perceive. He further defined perception according to what he called affordances. He identified six affordances of perception, including:

  • Optical array: the patterns of light that travel from the environment to the eyes
  • Relative brightness: the perception that brighter, more evident objects are closer than darker, out-of-focus objects
  • Texture gradient: The grain of texture becomes less defined as an object recedes, indicating that the object may be further in the distance.
  • Relative size: Objects that are farther away will appear smaller.
  • Superimposition: When one image partially blocks another, the viewer sees the first image as being closer to them. Superimposition is similar to inattentional blindness , in which the eye cannot see an object because another object fully engages it.
  • Height in the visual field: Objects that are further away from the viewer typically appear higher in the visual field.

Perception psychology according to Gregory

Richard Langton Gregory was a British psychologist and Emeritus Professor of Neuropsychology at the University of Bristol. Gregory was also the author of the constructivist theory of perception, or the "top-down" theory—which takes the opposite approach of Gibson’s “bottom-up” theory. It assumes that our cognitive processes—including memory and perception—result from our continuously generating hypotheses about the world from the top down. In other words, we recognize patterns by understanding the context in which we perceive them. 

Consider handwriting as an example. The handwriting of many individuals can be difficult for others to read; however, if we can pick out a few words here or there, it helps us understand the text’s context, and that helps us figure out the words we could not read. In other words, Gregory's theory assumes we have previous knowledge of what we are perceiving in addition to the stimulus itself. Because stimuli are often ambiguous, perceiving them correctly requires a higher level of cognition: we must draw on stored knowledge or past experiences to help us understand our perceptions. He believed perception is based on our accumulated knowledge, and that we actively construct perceptions whether they’re correct or not, though an incorrect hypothesis can lead to errors in perception.

Exploring perception with a therapist

The way we perceive objects, individuals, events, and our environment can have a significant impact on our mood, emotions, and behaviors. In some cases, our perceptions can be distorted, which can lead to distressing feelings or even symptoms of a mental health condition like depression or anxiety. Talk therapy, and cognitive behavioral therapy (CBT) in particular, is one way to learn how to recognize any cognitive distortions you may be experiencing and shift your thoughts in a more realistic, balanced, and healthy direction.

Regularly attending in-person therapy sessions is not possible for everyone. Some may not have adequate provider options in their area, while others may have trouble commuting to and from in-office sessions. In cases like these, online therapy can represent a viable alternative. A platform like BetterHelp can match you with a licensed therapist who you can meet with via video, phone, and/or in-app messaging, all from the comfort of home. Research suggests that virtual therapy is “no less efficacious” than the in-person variety in many cases, so you can generally feel confident in selecting whichever format may work best for you.

What are examples of perception in psychology?

Some examples of types of perception include taste perception, such as being able to identify various flavors in what you’re eating, or visual perception, such as being able to identify and distinguish between a rock, a tree, and a flower. 

What is the simple definition of perception?

The simple, specific meaning of perception is how we use our five senses—plus our senses of balance and our perception of our own body position—to experience the world around us. Perception involves actions like seeing, touching, tasting, and smelling in order to take in our surroundings and then using automatic neural processing to make sense of them.

What are the 4 stages of perception?

The perception process involves four basic stages. First, the individual is exposed to a stimulus through their environment and becomes aware of it through one or more of their perception skills, or senses. Second, their brain registers the stimulus based on the information gathered through the sense(s). Next, the information is organized based on a person’s existing knowledge and beliefs. Finally, the person interprets the stimulus based on their own knowledge and beliefs, such as a good or bad smell, a dangerous or non-dangerous animal, a pleasant or grating sound, etc. 

What is perceptual psychology in simple terms?

Perceptual psychology is made up of various theories from studies over the years about why and how we take in information from the environments around us and perceive things in a certain way. There are many elements that go into why a person may perceive something the way they do, such as existing knowledge, beliefs, culture, and even mental health. Perceptual psychologists study these unconscious processes that contribute to a person’s perception. 

What are 4 examples of perception?

Perception refers to how we see and make sense of the world around us. Four examples include seeing a sunset, smelling a fragrant flower, hearing music playing, and touching a soft blanket.

What is an example of perception in human behavior?

Perception is how our sensory organs detect or perceive stimuli in our surroundings. An example of perception as it relates to human behavior is two people seeing a dog and reacting differently based on their past experiences, knowledge, and beliefs. One might react in fear because they were chased by a dog as a child or in disgust because they think all dogs smell bad. The other might react in excitement and go to interact with the dog because they have a beloved dog of their own at home.

What is an example of perception effect?

There are several different perception effects the human mind uses to categorize, organize, and make sense of the world, and of which we are typically not consciously aware. For example, we’re likely to unconsciously group things that resemble each other—such as objects of the same shape, size, or color—because our brain tells us that they belong together.

What are the 3 factors that influence perception?

There are many different factors that can affect perception, so much so that the field of perception psychology, a subfield of cognitive psychology, is devoted to examining and understanding them. A few examples of factors that could influence the way an individual perceives something include past experiences, prior knowledge, and cultural values.

What is an example of perception and personality?

The way we perceive words and sounds, sights, smells, tastes, and other forms of stimuli is influenced by our personality. For example, the words one person perceives through auditory signals and then interprets to find offensive may be welcomed by another due to a natural tendency toward humor, optimism/pessimism, etc. 

What is perception and why is it important?

Overall, perception is our ability to identify stimuli in the world around us and interpret them according to our own values, personality, culture, and other factors. It’s important because it’s the means through which we sense and then interpret the world around us.

Perceptual Sets in Psychology

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Sean is a fact-checker and researcher with experience in sociology, field research, and data analytics.

A perceptual set refers to a predisposition to perceive things in a certain way. In other words, we often tend to notice only certain aspects of an object or situation while ignoring other details.

What Is a Perceptual Set?

When it comes to our perceptions of the world around us, you might assume that what you see is what you get. However, research shows that the way you perceive the world through all of your senses is heavily influenced (and biased) by your own past experiences, expectations, motivations, beliefs, emotions, and even your culture.

For example, think about the last time you started a new class. Did you have any expectations at the outset that might have influenced your experience in the class? If you expect a class to be boring, are you more likely to be uninterested in class?

In psychology, this is what is known as a perceptual set.

A perceptual set is basically a tendency to view things only in a certain way.

What exactly is a perceptual set, why does it happen, and how does it influence how we perceive the world around us?

How do psychologists define perceptual sets?

"Perception can also be influenced by an individual's expectations, motives, and interests. The term perceptual set refers to the tendency to perceive objects or situations from a particular frame of reference," explain Susan Nolan and Sandra Hockenbury, authors of the textbook Discovering Psychology.

Sometimes, perceptual sets can be helpful. They often lead us to make fairly accurate conclusions about what exists in the world around us. In cases where we find ourselves wrong, we often develop new perceptual sets that are more accurate.

Sometimes, our perceptual sets can lead us astray.  

If you have a strong interest in military aircraft, for example, an odd cloud formation in the distance might be interpreted as a fleet of fighter jets; whereas, someone else may see it as a group of migrating ducks in flight.

In one experiment that illustrates this tendency, participants were presented with different non-words, such as "sael." Those who were told that they would be reading boating-related words read the word as "sail," while those who were told to expect animal-related words read it as "seal."

A perceptual set is a good example of what is known as top-down processing. In top-down processing, perceptions begin with the most general and move toward the more specific. Such perceptions are heavily influenced by context, expectations, and prior knowledge.

If we expect something to appear in a certain way, we are more likely to perceive it according to our expectations.

Existing schemas, mental frameworks, and concepts often guide perceptual sets. For example, people have a strong schema for faces, making it easier to recognize familiar human faces in the world around us. It also means that when we look at an ambiguous image, we are more likely to see it as a face than some other type of object.

Researchers have also found that when multiple items appear in a single visual scene, perceptual sets will often lead people to miss additional items after locating the first one. For example, airport security officers might be likely to spot a water bottle in a bag but then miss that the bag also contains a firearm.  

Forces of Influence 

Below are examples of various forces of influence:

  • Motivation can play an important role in perceptual sets and how we interpret the world around us. If we are rooting for our favorite sports team, we might be motivated to view members of the opposing team as overly aggressive, weak, or incompetent. In one classic experiment, researchers deprived participants of food for several hours. When they were later shown a set of ambiguous images, those who had been food-deprived were far more likely to interpret the images as food-related objects. Because they were hungry, they were more motivated to see the images in a certain way.
  • Expectations also play an important role. If we expect people to behave in certain ways in certain situations, these expectations can influence how we perceive these people and their roles. One of the classic experiments on the impact of expectation on perceptual sets involved showing participants either a series of numbers or letters. Then, the participants were shown an ambiguous image that could either be interpreted as the number 13 or the letter B. Those who had viewed the numbers were more likely to see it as a 13, while those who had viewed the letters were more likely to see it as the letter B.
  • Culture also influences how we perceive people, objects, and situations. Surprisingly, researchers have found that people from different cultures even tend to perceive perspective and depth cues differently.
  • Emotions can have a dramatic impact on how we perceive the world around us. For example, if we are angry, we might be more likely to perceive hostility in others. One experiment demonstrated that when people came to associate a nonsense syllable with mild electrical shocks, they experienced physiological reactions to the syllable even when it was presented subliminally.
  • Attitudes can also have a powerful influence on perception. In one experiment, Gordon Allport demonstrated that prejudice could have an influence on how quickly people categorize people of various races.

Real-Life Examples

Researchers have shown that perceptual sets can have a dramatic impact on day-to-day life.

In one experiment, young children were found to enjoy french fries more when they were served in a McDonald's bag rather than just a plain white bag. In another example, people who were told that an image was of the famed "Loch Ness monster" were more likely to see the mythical creature in the picture, while others who did not have the expectation of seeing a sea creature, saw only a curved tree trunk.

"Once we have formed a wrong idea about reality, we have more difficulty seeing the truth."

As previously mentioned, our perceptual set for faces is so strong that it actually causes us to see faces where there are none. Consider how people often describe seeing a face on the moon or in many of the inanimate objects that we encounter in our everyday lives.

As you can see, perception is not simply a matter of seeing what is in the world around us. A variety of factors can influence how we take in information and how we interpret it, as stimuli are filtered through our personal knowledge, expectations, emotions, and context.

Biggs A, Adamo S, Dowd E, Mitroff S. Examining perceptual and conceptual set biases in multiple-target visual search. Attention, Perception, & Psychophysics. 2015;77(3):844-855. doi:10.3758/s13414-014-0822-0

Nolan SA, Hockenbury SE. Discovering Psychology. Worth Publishers; 2021.

Hardy M, Heyes S. Beginning Psychology. Oxford University Press; 1999.

Gaspelin N, Luck SJ. "Top-down" does not mean "voluntary". J Cogn. 2018;1(1):25. doi:10.5334/joc.28

Sanford R. The effects of abstinence from food upon imaginal processes: A preliminary experiment. J Psychol. 1936;2(1):129-136. doi:10.1080/00223980.1936.9917447

Bruner J, Minturn A. Perceptual identification and perceptual organization. J Gen Psychol. 1955;53(1):21-28. doi:10.1080/00221309.1955.9710133

de Bruïne G, Vredeveldt A, van Koppen PJ. Cross-cultural differences in object recognition: Comparing asylum seekers from Sub-Saharan Africa and a matched Western European control group. Appl Cogn Psychol. 2018;32(4):463-473. doi:10.1002/acp.3419

Lazarus RS, McCleary RA. Autonomic discrimination without awareness: A study of subception. Psychological Review. 1951;58(2):113-122. doi:10.1037/h0054104

Barlow FK, Hornsey MJ, Thai M, Sengupta NK, Sibley CG. The wallpaper effect: The contact hypothesis fails for minority group members who live in areas with a high proportion of majority group members. PLoS One. 2013;8(12):e82228. doi:10.1371/journal.pone.0082228

Solomon MR, Russell-Bennett R, Previte J. Consumer Behaviour: Buying, Having, Being. Frenchs Forest, NSW: 2013.

Campbell S. The Loch Ness Monster: The Evidence. Prometheus Books; 1997.

Myers DG. Psychology. 7th ed. Worth Publishers; 2004.


Social Sci LibreTexts

5.7: Gestalt Principles of Perception

Learning Objectives

  • Explain the figure-ground relationship
  • Define Gestalt principles of grouping
  • Describe how perceptual set is influenced by an individual’s characteristics and mental state

In the early part of the 20th century, Max Wertheimer published a paper demonstrating that individuals perceived motion in rapidly flickering static images, an insight that came to him as he used a child’s toy tachistoscope. Wertheimer, and his assistants Wolfgang Köhler and Kurt Koffka, who later became his partners, believed that perception involved more than simply combining sensory stimuli. This belief led to a new movement within the field of psychology known as Gestalt psychology. The word gestalt literally means form or pattern, but its use reflects the idea that the whole is different from the sum of its parts. In other words, the brain creates a perception that is more than simply the sum of available sensory inputs, and it does so in predictable ways. Gestalt psychologists translated these predictable ways into principles by which we organize sensory information. As a result, Gestalt psychology has been extremely influential in the area of sensation and perception (Rock & Palmer, 1990).

One Gestalt principle is the figure-ground relationship. According to this principle, we tend to segment our visual world into figure and ground. Figure is the object or person that is the focus of the visual field, while the ground is the background. As the figure shows, our perception can vary tremendously, depending on what is perceived as figure and what is perceived as ground. Presumably, our ability to interpret sensory information depends on what we label as figure and what we label as ground in any particular case, although this assumption has been called into question (Peterson & Gibson, 1994; Vecera & O’Reilly, 1998).

An illustration shows two identical black face-like shapes that face towards one another, and one white vase-like shape that occupies all of the space in between them. Depending on which part of the illustration is focused on, either the black shapes or the white shape may appear to be the object of the illustration, leaving the other(s) perceived as negative space.

Another Gestalt principle for organizing sensory stimuli into meaningful perception is proximity. This principle asserts that things that are close to one another tend to be grouped together, as the figure below illustrates.

Illustration A shows thirty-six dots in six evenly-spaced rows and columns. Illustration B shows thirty-six dots in six evenly-spaced rows but with the columns separated into three sets of two columns.

How we read something provides another illustration of the proximity concept. For example, we read this sentence like this, notl iket hiso rt hat. We group the letters of a given word together because there are no spaces between the letters, and we perceive words because there are spaces between each word. Here are some more examples: Cany oum akes enseo ft hiss entence? What doth es e wor dsmea n?
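As a rough illustration of the proximity principle, the sketch below groups dot positions along one dimension by the size of the gaps between them. The function name `group_by_proximity` and the `max_gap` threshold are hypothetical choices made for this example; it is a toy analogy, not a model of how the visual system actually groups stimuli.

```python
def group_by_proximity(positions, max_gap):
    """Group 1-D positions: a new group starts whenever the gap
    to the previous position is larger than max_gap (an assumed threshold)."""
    groups = []
    for x in sorted(positions):
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)   # close enough: same group
        else:
            groups.append([x])     # too far away: start a new group
    return groups


# Column positions for six columns of dots (illustrative values)
even_columns = [0, 2, 4, 6, 8, 10]    # evenly spaced
paired_columns = [0, 1, 4, 5, 8, 9]   # three sets of two columns

print(group_by_proximity(even_columns, max_gap=2.5))
# [[0, 2, 4, 6, 8, 10]]
print(group_by_proximity(paired_columns, max_gap=2.5))
# [[0, 1], [4, 5], [8, 9]]
```

With even spacing, no gap stands out and everything falls into one group; when the spacing varies, the larger gaps split the columns into three pairs, much as the dots in the second illustration appear to form three sets.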

We might also use the principle of similarity to group things in our visual fields. According to this principle, things that are alike tend to be grouped together. For example, when watching a football game, we tend to group individuals based on the colors of their uniforms. When watching an offensive drive, we can get a sense of the two teams simply by grouping along this dimension.

An illustration shows six rows of six dots each. The rows of dots alternate between blue and white colored dots.

Two additional Gestalt principles are the law of continuity (or good continuation) and closure. The law of continuity suggests that we are more likely to perceive continuous, smooth flowing lines rather than jagged, broken lines (Figure 4). The principle of closure states that we organize our perceptions into complete objects rather than as a series of parts (Figure 5).

An illustration shows two lines of diagonal dots that cross in the middle in the general shape of an “X.”

According to Gestalt theorists, pattern perception, or our ability to discriminate among different figures and shapes, occurs by following the principles described above. You probably feel fairly certain that your perception accurately matches the real world, but this is not always the case. Our perceptions are based on perceptual hypotheses: educated guesses that we make while interpreting sensory information. These hypotheses are informed by a number of factors, including our personalities, experiences, and expectations. We use these hypotheses to generate our perceptual set. For instance, research has demonstrated that those who are given verbal priming produce a biased interpretation of complex ambiguous figures (Goolkasian & Woodbury, 2010).

DIG DEEPER: The Depths of Perception - Bias, Prejudice, and Cultural Factors

In this chapter, you have learned that perception is a complex process. Built from sensations, but influenced by our own experiences, biases, prejudices, and cultures, perceptions can be very different from person to person. Research suggests that implicit racial prejudice and stereotypes affect perception. For instance, several studies have demonstrated that non-Black participants identify weapons faster and are more likely to identify non-weapons as weapons when the image of the weapon is paired with the image of a Black person (Payne, 2001; Payne, Shimizu, & Jacoby, 2005). Furthermore, White individuals’ decisions to shoot an armed target in a video game are made more quickly when the target is Black (Correll, Park, Judd, & Wittenbrink, 2002; Correll, Urland, & Ito, 2006). This research is important, considering the number of very high-profile cases in the last few decades in which young Blacks were killed by people who claimed to believe that the unarmed individuals were armed and/or represented some threat to their personal safety.

Gestalt theorists have been incredibly influential in the areas of sensation and perception. Gestalt principles such as figure-ground relationship, grouping by proximity or similarity, the law of good continuation, and closure are all used to help explain how we organize sensory information. Our perceptions are not infallible, and they can be influenced by bias, prejudice, and other factors.

Contributors and Attributions

Rose M. Spielman with many significant contributors. The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the creative commons license and may not be reproduced without the prior and express written consent of Rice University. For questions regarding this license, please contact [email protected]. Textbook content produced by OpenStax College is licensed under a Creative Commons Attribution License 4.0. Download for free at http://cnx.org/contents/[email protected].

Logo for Open Oregon Educational Resources

6.1 The Process of Perception

Learning Objectives

By the end of this section, you will be able to:

  • Discuss how salience influences the selection of perceptual information.
  • Explain the ways in which we organize perceptual information.

Perception is the process of selecting, organizing, and interpreting sensory information. This cognitive and psychological process begins with receiving stimuli through our primary senses (vision, hearing, touch, taste, and smell). This information is then passed along to corresponding areas of the brain and organized into our existing structures and patterns, and then interpreted based on previous experiences (Figure 6.1). How we perceive the people and objects around us directly affects our communication. We respond differently to an object or person that we perceive favorably than we do to something or someone we find unfavorable. But how do we filter through the mass amounts of incoming information, organize it, and make meaning from what makes it through our perceptual filters and into our social realities?

Circular graphic showing the three aspects of the process of perception; selection, organization, and interpretation

Selecting Information

We take in information through all five of our senses, but our perceptual field (the world around us) includes so many stimuli that it is impossible for our brains to process and make sense of it all. So, as information comes in through our senses, various factors influence what actually continues on through the perception process (Fiske & Taylor, 1991). Selecting is the first part of the perception process, in which we focus our attention on certain incoming sensory information (Figure 6.2). Think about how, out of many other possible stimuli to pay attention to, you may hear a familiar voice in the hallway, see a pair of shoes you want to buy from across the mall, or smell something cooking for dinner when you get home from work. We quickly cut through and push to the background all kinds of sights, smells, sounds, and other stimuli, but how do we decide what to select and what to leave out?

A group of coworkers talking at a crowded conference.

We tend to pay attention to information that is salient. Salience is the degree to which something attracts our attention in a particular context. The thing attracting our attention can be abstract, like a concept, or concrete, like an object. For example, a person’s identity as a Native American may become salient when they are protesting at the Columbus Day parade in Denver, Colorado. Or a bright flashlight shining in your face while camping at night is sure to be salient. The degree of salience depends on three features (Fiske & Taylor, 1991). We tend to find things salient when they are visually or aurally stimulating, when they meet our needs or interests, or when they do or don’t meet our expectations.

Visual and Aural Stimulation

It is probably not surprising to learn that visually and/or aurally stimulating things become salient in our perceptual field and get our attention. Creatures ranging from fish to hummingbirds are attracted to things like silver spinners on fishing poles or red and yellow bird feeders. Having our senses stimulated isn’t always a positive thing though. Think about the couple that won’t stop talking during the movie or the upstairs neighbor whose subwoofer shakes your ceiling at night. In short, stimuli can be attention-getting in a productive or distracting way. However, we can use this knowledge to our benefit by minimizing distractions when we have something important to say. It’s probably better to have a serious conversation with a significant other in a quiet place rather than a crowded food court.

Needs and Interests

We tend to pay attention to information that we perceive to meet our needs or interests in some way. This type of selective attention can help us meet instrumental needs and get things done. When you need to speak with a financial aid officer about your scholarships and loans, you sit in the waiting room and listen for your name to be called. Paying close attention to whose name is called means you can be ready to start your meeting and hopefully get your business handled. When we don’t think certain messages meet our needs, stimuli that would normally get our attention may be completely lost. Imagine you are in the grocery store and you hear someone say your name. You turn around, only to hear that person say, “Finally! I said your name three times. I thought you forgot who I was!” A few seconds before, when you were focused on figuring out which kind of orange juice to get, you were attending to the various pulp options to the point that you tuned other stimuli out, even something as familiar as the sound of someone calling your name. We select and attend to information that meets our needs.

We also find information salient that interests us. Of course, many times, stimuli that meet our needs are also interesting, but it’s worth discussing these two items separately because sometimes we find things interesting that don’t necessarily meet our needs (Figure 6.3). I’m sure we’ve all gotten sucked into a television show, video game, or random project and paid attention to that at the expense of something that actually meets our needs like cleaning or spending time with a significant other. Paying attention to things that interest us but don’t meet specific needs seems like the basic formula for procrastination that we are all familiar with.

Teenager holding a controller, playing a video game.

In many cases we know what interests us and we automatically gravitate toward stimuli that match up with that. For example, as you filter through radio stations, you likely already have an idea of what kind of music interests you and will stop on a station playing something in that genre while skipping right past stations playing something you aren’t interested in. Because of this tendency, we often have to end up being forced into or accidentally experiencing something new in order to create or discover new interests. For example, you may not realize you are interested in Asian history until you are required to take such a course and have an engaging professor who sparks that interest in you. Or you may accidentally stumble on a new area of interest when you take a class you wouldn’t otherwise because it fits into your schedule. As communicators, you can take advantage of this perceptual tendency by adapting your topic and content to the interests of your audience.

Expectations

The relationship between salience and expectations is a little more complex. Basically, we can find both expected and unexpected things salient. While this may sound confusing, a couple of examples should illustrate this point. If you are expecting a package to be delivered, you might pick up on the slightest noise of a truck engine or someone’s footsteps approaching your front door. Since we expect something to happen, we may be extra tuned in to clues that it is coming. In terms of the unexpected, if you have a shy and soft-spoken friend who you overhear raising the volume and pitch of their voice while talking to another friend, you may pick up on that and assume that something out of the ordinary is going on. For something unexpected to become salient, it has to reach a certain threshold of difference. If you walked into your regular class and there were one or two more students there than normal, you may not even notice. If you walked into your class and there was someone dressed up as a wizard, you would probably notice. So, if we expect to experience something out of the routine, like a package delivery, we will find stimuli related to that expectation salient. If we experience something that we weren’t expecting and that is significantly different from our routine experiences, then we will likely find it salient.

There is a middle area where slight deviations from routine experiences may go unnoticed because we aren’t expecting them. To go back to the earlier example, if you aren’t expecting a package, and you regularly hear vehicle engines and sidewalk foot traffic outside your house, those pretty routine sounds wouldn’t be as likely to catch your attention, even if it were slightly more or less traffic than expected. This is because our expectations are often based on previous experience and patterns we have observed and internalized, which allows our brains to go on “autopilot” sometimes and fill in things that are missing or overlook extra things. Look at the following sentence and read it aloud:

Percpetoin is bsaed on pateetrns, maening we otfen raech a cocnlsuion witouht cosnidreing ecah indviidaul elmenet.

This example illustrates a test of our expectation and an annoyance to every college student. We have all had the experience of getting a paper back with typos and spelling errors circled. This can be frustrating, especially if we actually took the time to proofread. When we first learned to read and write, we learned letter by letter. A teacher or parent would show us a card with A-P-P-L-E written on it, and we would sound it out. Over time, we learned the patterns of letters and sounds and could see combinations of letters and pronounce the word quickly. Since we know what to expect when we see a certain pattern of letters, and know what comes next in a sentence since we wrote the paper, we don’t take the time to look at each letter as we proofread. This can lead us to overlook common typos and spelling errors, even if we proofread something multiple times. Now that we know how we select stimuli, let’s turn our attention to how we organize the information we receive.

Organizing Information

Organizing is the second part of the perception process, in which we sort and categorize information that we perceive based on innate and learned cognitive patterns. Three ways we sort things into patterns are by using proximity, similarity, and difference (Coren, 1980).

In terms of proximity, we tend to think that things that are close together go together (Figure 6.4). For example, have you ever been waiting to be helped in a business and the clerk assumes that you and the person standing near you are together? The moment usually ends when you and the other person in line look at each other, then back at the clerk, and one of you explains that you are not together. Even though you may have never met that other person in your life, the clerk used a basic perceptual organizing cue to group you together because you were standing in proximity to one another.

Figure 6.4. Chart of coffee beans grouped by different varieties.

We also group things together based on similarity. We tend to think similar-looking or similar-acting things belong together. For example, suppose a group of friends who spend time together are all male, around the same age, of the same race, and have short hair. People might assume that they are brothers. Even though many of their features are different, the salient features are organized based on similarity, and the friends are assumed to be related (Figure 6.5).

Figure 6.5. Group of friends taking a selfie in a field.

We also organize information that we take in based on difference. In this case, we assume that the item that looks or acts different from the rest doesn’t belong with the group (Figure 6.6). For example, if you ordered ten burgers and nine of them are wrapped in paper and the last is in a cardboard container, you may assume that the burger in the container is different in some way. Perceptual errors involving people and assumptions of difference can be especially awkward, if not offensive. Have you ever attended an event, only to be mistaken as an employee working at the event, rather than a guest at the event?

Figure 6.6. Jelly beans sorted into different containers based on flavor.

These strategies for organizing information are so common that they are built into how we teach our children basic skills and how we function in our daily lives. Most of us had to look at pictures in grade school and determine which things went together and which thing didn’t belong. When you literally organize something, like your desk at home or work, you follow these same strategies. If you have a bunch of papers and mail on the top of your desk, you will likely sort papers into separate piles for separate classes or put bills in a different place from personal mail. You may have one drawer for pens, pencils, and other supplies and another drawer for files. In this case you are grouping items based on similarities and differences. You may also group things based on proximity, for example, by putting financial items like your checkbook, a calculator, and your pay stubs in one area so you can update your budget efficiently. In summary, we simplify information and look for patterns to help us communicate and get through life more efficiently.

Simplification and categorizing based on patterns aren’t necessarily a bad thing. In fact, without this capability we would likely not have the ability to speak, read, or engage in other complex cognitive/behavioral functions. Our brain innately categorizes and files information and experiences away for later retrieval, and different parts of the brain are responsible for different sensory experiences. In short, it is natural for things to group together in some ways. There are differences among people, and looking for patterns helps us in many practical ways. However, the judgments we place on various patterns and categories are not natural; they are learned and culturally and contextually relative. Our perceptual patterns do become unproductive and even unethical when the judgments we associate with certain patterns are based on stereotypical or prejudicial thinking.

We also organize interactions and interpersonal experiences based on our firsthand experiences. Misunderstandings and conflict may result when two people experience the same encounter differently. Punctuation refers to the structuring of information into a timeline to determine the cause (stimulus) and effect (response) of our communication interactions (Sillars, 1980). Applying this concept to interpersonal conflict can help us see how the process of perception extends beyond the individual to the interpersonal level. This concept also helps illustrate how organization and interpretation can happen together and how interpretation can influence how we organize information and vice versa.

Where does a conflict begin and end? The answer to this question depends on how the people involved in the conflict punctuate, or structure, their conflict experience. Punctuation differences can often escalate conflict, which can lead to a variety of relationship problems (Watzlawick, Bavelas, & Jackson, 1967). For example, Linda and Joe are on a project team at work and have a deadline approaching. Linda has been working on the project over the weekend in anticipation of her meeting with Joe first thing Monday morning. She has had some questions along the way and has e-mailed Joe for clarification and input, but he hasn’t responded. On Monday morning, Linda walks into the meeting room, sees Joe, and says, “I’ve been working on this project all weekend and needed your help. I e-mailed you three times! What were you doing?” Joe responds, “I had no idea you e-mailed me. I was gone all weekend on a camping trip.” In this instance, the conflict started for Linda two days ago and has just started for Joe. So, for the two of them to manage this conflict most effectively, they need to communicate so that their punctuation, or where the conflict started for each one, is clear and matches up. In this example, Linda formed an impression of Joe’s level of commitment to the project based on an interpretation she made after selecting and organizing incoming information. Being aware of punctuation is an important part of perception checking, which we will discuss later. Let’s now take a closer look at how interpretation plays into the perception process.

Interpreting Information

Although selecting and organizing incoming stimuli happens very quickly, and sometimes without much conscious thought, interpretation can be a much more deliberate and conscious step in the perception process. Interpretation is the third part of the perception process, in which we assign meaning to an experience using a mental structure known as a schema. A schema is a cognitive tool for organizing related concepts or information. Schemata are like databases of stored, related information that we use to interpret new experiences. Over time, we incorporate more and more small units of information together to develop more complex understandings of new information.

We have an overall schema about education and how to interpret experiences with teachers and classmates (Figure 6.7). This schema started developing before we even went to preschool, based on things that parents, peers, and the media told us about school. For example, you learned that certain symbols and objects like an apple, a ruler, a calculator, and a notebook are associated with being a student or teacher. You learned new concepts like grades and recess, and you engaged in new practices like doing homework, studying, and taking tests. You also formed new relationships with classmates, teachers, and administrators. As you progressed through your education, your schema adapted to the changing environment. The ease or difficulty of reevaluating and revising a schema varies from situation to situation and person to person. For example, some students adapt their schema relatively easily as they move from elementary, to middle, to high school, and on to college and are faced with new expectations for behavior and academic engagement. Other students don’t adapt as easily, and holding onto their old schema creates problems as they try to interpret new information through an old, incompatible schema.

Figure 6.7. An empty college classroom with individual desks.

It’s also important to be aware of schemata because our interpretations affect our behavior. For example, if you are doing a group project for class and you perceive a group member to be shy based on your schema of how shy people communicate, you may avoid giving them presentation responsibilities in your group project because you do not think shy people make good public speakers.

As we have seen, schemata are used to interpret others’ behavior and form impressions about who they are as a person. To help this process along, we often solicit information from people to help us place them into a preexisting schema. In the United States and many other Western cultures, people’s identities are often closely tied to what they do for a living. When we introduce others, or ourselves, occupation is usually one of the first things we mention. Think about how your communication with someone might differ if he or she were introduced to you as an artist versus a doctor. We make similar interpretations based on where people are from, their age, their race, and other social and cultural factors.

In summary, we have schemata about individuals, groups, places, and things, and these schemata filter our perceptions before, during, and after interactions. As schemata are retrieved from memory, they are executed, like computer programs or apps on your smartphone, to help us interpret the world around us. Just as computer programs and apps must be regularly updated to improve their functioning, we update and adapt our schemata as we have new experiences.

  • Perception is the process of selecting, organizing, and interpreting information. This process affects our communication because we respond to stimuli differently, whether they are objects or persons, based on how we perceive them.
  • Given the massive amounts of stimuli taken in by our senses, we only select a portion of the incoming information to organize and interpret. We select information based on salience. We tend to find salient things that are visually or aurally stimulating and things that meet our needs and interests. Expectations also influence what information we select.
  • We organize information that we select into patterns based on proximity, similarity, and difference.
  • We interpret information using schemata, which allow us to assign meaning to information based on accumulated knowledge and previous experience.

Discussion Questions

  • Take a moment to look around wherever you are right now. Take in the perceptual field around you. What is salient for you in this moment and why? Explain the degree of salience using the three reasons for salience discussed in this section.
  • As we organize information (sensory information, objects, and people) we simplify and categorize information into patterns. Identify some cases in which this aspect of the perception process is beneficial. Identify some cases in which it could be harmful or negative.
  • Think about some of the schemata you have that help you make sense of the world around you. For each of the following contexts—academic, professional, personal, and civic—identify a schema that you commonly rely on or think you will rely on. For each schema you identified note a few ways that it has already been challenged or may be challenged in the future.

Remix/Revisions featured in this section

  • Small editing revisions to tailor the content to the Psychology of Human Relations course.
  • Added and changed some images as well as changed formatting for photos to provide links to locations of images and CC licenses.
  • Added DOI links to references to comply with APA 7th edition reference formatting.

Attributions

CC Licensed Content, Original Modification, adaptation, and original content. Provided by: Stevy Scarbrough. License: CC-BY-NC-SA

CC Licensed Content Shared Previously Communication in the Real World. Authored by: University of Minnesota. Located at: https://open.lib.umn.edu/communication/chapter/2-1-perception-process/ License: CC-BY-NC-SA 4.0

Coren, S. (1980). Principles of perceptual organization and spatial distortion: The Gestalt illusions. Journal of Experimental Psychology: Human Perception and Performance, 6(3), 404–412. https://doi.org/10.1037/0096-1523.6.3.404

Fiske, S. T., & Taylor, S. E. (1991). Social cognition (2nd ed.). New York, NY: McGraw Hill.

Payne, B. K. (2001). Prejudice and perception: The role of automatic and controlled processes in misperceiving a weapon. Journal of Personality and Social Psychology, 81(2), 181–192. https://doi.org/10.1037/0022-3514.81.2.181

Rozelle, R. M., & Baxter, J. C. (1975). Impression formation and danger recognition in experienced police officers. Journal of Social Psychology, 96(1), 53–63. https://doi.org/10.1080/00224545.1975.9923262

Sillars, A. L. (1980). Attributions and communication in roommate conflicts. Communication Monographs, 47(3), 180–200. https://doi.org/10.1080/03637758009376031

Watzlawick, P., Bavelas, J. B., & Jackson, D. D. (1967). Pragmatics of human communication: A study of interactional patterns, pathologies, and paradoxes. New York, NY: W. W. Norton.

Psychology of Human Relations Copyright © by Stevy Scarbrough is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.

  • African Studies
  • Asian Studies
  • East Asian Studies
  • Japanese Studies
  • Latin American Studies
  • Middle Eastern Studies
  • Native American Studies
  • Scottish Studies
  • Browse content in Research and Information
  • Research Methods
  • Browse content in Social Work
  • Addictions and Substance Misuse
  • Adoption and Fostering
  • Care of the Elderly
  • Child and Adolescent Social Work
  • Couple and Family Social Work
  • Developmental and Physical Disabilities Social Work
  • Direct Practice and Clinical Social Work
  • Emergency Services
  • Human Behaviour and the Social Environment
  • International and Global Issues in Social Work
  • Mental and Behavioural Health
  • Social Justice and Human Rights
  • Social Policy and Advocacy
  • Social Work and Crime and Justice
  • Social Work Macro Practice
  • Social Work Practice Settings
  • Social Work Research and Evidence-based Practice
  • Welfare and Benefit Systems
  • Browse content in Sociology
  • Childhood Studies
  • Community Development
  • Comparative and Historical Sociology
  • Economic Sociology
  • Gender and Sexuality
  • Gerontology and Ageing
  • Health, Illness, and Medicine
  • Marriage and the Family
  • Migration Studies
  • Occupations, Professions, and Work
  • Organizations
  • Population and Demography
  • Race and Ethnicity
  • Social Theory
  • Social Movements and Social Change
  • Social Research and Statistics
  • Social Stratification, Inequality, and Mobility
  • Sociology of Religion
  • Sociology of Education
  • Sport and Leisure
  • Urban and Rural Studies
  • Browse content in Warfare and Defence
  • Defence Strategy, Planning, and Research
  • Land Forces and Warfare
  • Military Administration
  • Military Life and Institutions
  • Naval Forces and Warfare
  • Other Warfare and Defence Issues
  • Peace Studies and Conflict Resolution
  • Weapons and Equipment

The Oxford Handbook of Philosophy of Perception

37 Bayesian Perceptual Psychology

Michael Rescorla, University of California, Santa Barbara

  • Published: 05 September 2013

Contemporary perceptual psychology uses Bayesian decision theory to develop Helmholtz’s view that perception involves ‘unconscious inference’. The science provides mathematically rigorous, empirically well-confirmed explanations for diverse perceptual constancies and illusions. The explanations assign a central role to mental representation. This article highlights the explanatory centrality of representation within current Bayesian perceptual models. The article also discusses how Bayesian perceptual psychology bears upon several prominent philosophical topics, including: eliminativism about representation (defended by Churchland, Field, Quine, and Stich); relationalism about perception (endorsed by Brewer, Campbell, Martin, and Travis); phenomenal content (postulated by Chalmers, Horgan and Tienson, and Thompson); and the computational theory of mind (espoused by Fodor and many other philosophers).

Bayesian decision theory is a mathematical framework that models reasoning and decision-making under uncertainty. Around 1990, perceptual psychologists began constructing detailed Bayesian models of perception. 1 This research program has proved enormously fruitful. As two leading perceptual psychologists put it, ‘Bayesian concepts are transforming perception research by providing a rigorous mathematical framework for representing the physical and statistical properties of the environment, describing the tasks that perceptual systems are trying to perform, and deriving appropriate computational theories of how to perform those tasks’ ( Geisler and Kersten, 2002 , 508). To understand perception, one must acquire detailed knowledge of Bayesian perceptual psychology. Or so I hope to convince you.

1 Perception as Unconscious Inference

Perception solves an underdetermination problem . The perceptual system estimates environmental conditions, such as the shapes, sizes, colours, and locations of distal objects. It does so based upon proximal stimulations of sensory organs. The proximal stimulations underdetermine their environmental causes. For instance, a convex object under normal lighting generates retinal stimulations ambiguous between at least two possibilities: that the object is convex and that light comes from overhead; or that the object is concave and that light comes from below. Similarly, light reflected from a surface generates retinal stimulations consistent with various colours (e.g. the surface may be red and bathed in daylight, or the surface may be white and bathed in red light). In general, then, retinal input underdetermines possible states of the distal environment. We cannot yet programme a computer that solves this underdetermination problem. The perceptual system solves it quickly, effortlessly, automatically, and reliably. How?

Helmholtz (1867) proposed that the perceptual system executes an ‘unconscious inference’ from sensory stimulations to hypotheses about the environment. The inference reflects ‘implicit assumptions’ concerning the environment or the interaction between environment and perceiver. For instance, the visual system deploys an ‘implicit assumption’ that light comes from overhead. Helmholtz’s approach, now called constructivism, helps explain two notable phenomena: perceptual constancies and illusions .

Perceptual constancies are capacities to represent properties or entities as the same despite large variation in proximal stimulation. To varying degrees, human vision displays constancies for numerous properties, including size, shape, location, colour, depth, and motion. How does the perceptual system achieve constancies? By using ‘implicit assumptions’ to discount variations in proximal stimulation. Colour constancy provides a good illustration. This is the capacity to perceive surface colour as constant despite large variation in viewing conditions, including background illumination. To estimate surface colour, the perceptual system first deploys various ‘implicit assumptions’ (such as that the light source is fairly uniform, or that certain surface colours are likelier than others) to estimate background illumination based upon overall retinal stimulation. The perceptual system then deploys this background illumination estimate so as to estimate a surface’s colour based upon retinal stimulation caused by that surface. As Helmholtz famously put it, the perceptual system ‘discounts the illuminant’.

Perceptual constancies are reliable but fallible, as demonstrated by illusions . Consider again the assumption that light comes from overhead. The assumption is correct in normal cases, so it usually supports an inference to an accurate percept. When the assumption fails, the resulting percept is inaccurate. For instance, lighting a concave object from below generates an illusory percept as of a convex object. Constructivists explain the mistaken shape-estimate by isolating its source: the implicit assumption that light comes from overhead. Similarly, a red spotlight directed upon a single object violates the implicit assumption of a fairly uniform illuminant, thereby inducing an illusory colour percept. These examples illustrate constructivism’s template for explaining illusions: isolate an implicit assumption deployed during perceptual inference; show how failure of the assumption can induce an inaccurate percept.

Perceptual processes are subpersonal and inaccessible to the thinker. There is no good sense in which the thinker herself, as opposed to her perceptual system, executes perceptual inferences. For instance, a normal perceiver simply sees a surface as having a certain colour. Even if she notices the light spectrum reaching her eye, as a painter might, she cannot access the perceptual system’s inference from retinal stimulations to surface colour. 2

The twentieth century produced various rivals to constructivism, including Gibson’s direct perception framework. Gibson (1979) denied that perception involves complex psychological activity, inferential or otherwise. He held that the perceptual system directly ‘picks up’ certain distal properties by ‘resonating’ to them. Gibson’s work yielded many invaluable insights, such as the importance of optic flow, which can be incorporated into constructivism. Viewed as an alternative to constructivism, Gibson’s direct perception framework has difficulty explaining the vast bulk of constancies and illusions ( Fodor and Pylyshyn, 1981 ). That is why the direct perception framework remains marginal within perceptual psychology.

A satisfactory development of constructivism must answer three questions:

(a) In what sense does the perceptual system execute ‘inferences’?

(b) In what sense do the inferences ‘reflect’ various ‘implicit assumptions’?

(c) In what sense does perceptual inference yield the ‘best’ hypothesis?

Different versions of constructivism answer these questions differently. For instance, some constructivists regard ‘implicit assumptions’ as stored premises fit to participate in unconscious deductive, inductive, or abductive inferences ( Rock, 1983 , 272–282). Bayesian perceptual psychology develops constructivism in a different direction, as I will now explain.

2 Perception as Unconscious Statistical Inference

The perceptual system operates under conditions of uncertainty, stemming from at least three sources:

Ambiguity of sensory input, as described above.

Noisiness of perceptual organs and neural mechanisms: that is, their vulnerability to corruption by random errors.

Potential conflict between sensory modalities (e.g. visual versus auditory cues to an object’s location) or between cues within a modality (e.g. binocular disparity cues to depth versus monocular linear perspective cues to depth).

It therefore seems natural to formalize constructivism through Bayesian decision theory, which models decision-making under uncertainty.

The core notion underlying Bayesian decision theory is subjective probability . Subjective probabilities reflect psychological facets of the individual or her subsystems, rather than ‘objective’ features of reality. To formalize probabilities, we introduce a hypothesis space H containing various hypotheses h . Each hypothesis h reflects a possible state of the world (e.g. a possible shape of some distal object; or a possible colour of some distal surface; or a possible assignment of distal objects to spatial locations). A probability function p maps each hypothesis h to a real number p ( h ), reflecting the agent’s subjective probabilities. 3

Bayesian decision theory dictates how to update subjective probabilities based on new evidence. Bayes’s Theorem states that:

$$p(h \mid e) \propto p(h)\, p(e \mid h),$$

meaning that the left-hand side is proportional to the right-hand side. p ( h | e ) and p ( e | h ) are conditional probabilities . For instance, p ( e | h ) is the probability of e, conditional on h . Bayes’s Rule states that, when one receives evidence e, one should update p ( h ) by replacing it with p ( h | e ). To execute Bayes’s Rule, one multiplies the prior probability p ( h ) by the prior likelihood p ( e | h ). One then normalizes so that all probabilities sum to 1. Finally, one adopts the resulting posterior probability p ( h | e ) as a revised probability assignment for h . Thus, the new probability of h is proportional to its original probability, multiplied by the likelihood of evidence e given h . 4
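The multiply-then-normalize procedure just described can be sketched for a finite hypothesis space. This is only an illustration: the function name `bayes_update`, the two hypothesis labels, and all numerical values are assumptions introduced here, not part of any published model.

```python
def bayes_update(prior, likelihood):
    """Return the posterior p(h | e) from a prior p(h) and a likelihood p(e | h).

    Both arguments are dicts keyed by hypothesis label.
    """
    # Step 1: multiply each prior probability by the likelihood of the evidence.
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    # Step 2: normalize so that the posterior probabilities sum to 1.
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence has zero probability under every hypothesis")
    return {h: value / total for h, value in unnormalized.items()}


# Hypothetical numbers, chosen only for illustration.
prior = {"convex": 0.7, "concave": 0.3}
likelihood = {"convex": 0.2, "concave": 0.6}  # p(e | h) for one fixed input e
posterior = bayes_update(prior, likelihood)
print(posterior)
```

With these assumed values the evidence favours concavity strongly enough to outweigh the prior, so the posterior for "concave" exceeds the posterior for "convex".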

Bayesian perceptual psychologists use this framework to model perceptual inference ( Knill and Richards, 1996 ). On a Bayesian approach, the perceptual system entertains hypotheses drawn from a hypothesis space H . The perceptual system assigns prior probabilities to hypotheses h and prior likelihoods to ( e, h ) pairs, where each e corresponds to some possible sensory input. After receiving input e, the perceptual system reallocates probabilities across the hypothesis space, in rough accord with Bayes’s Rule.

To illustrate, consider the extraction of shape from shading ( Mamassian, Landy, and Maloney, 2002 ). Let s reflect possible shapes, θ reflect possible lighting directions, and e reflect possible patterns of retinal illumination. The visual system encodes:

A prior probability p ( s ), which assigns higher probability to certain distal shapes than others (e.g. it may assign higher probability to convex shapes).

A prior probability p ( θ ), which assigns higher probability to an overhead lighting direction than to alternative lighting directions.

A prior likelihood p ( e | s, θ ), which assigns a higher probability to an ( e, s, θ ) triplet if distal shape s and lighting direction θ are likely to cause retinal illumination e .

Upon receiving retinal illumination e, the perceptual system redistributes probabilities over shape-estimates, yielding a posterior p ( s | e ). Depending on the case, the posterior might assign a much higher probability to convexity than concavity. For details, see Stone (2011) .
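A toy version of the shape-from-shading inference may help fix ideas. The sketch below assumes just two shapes and two lighting directions, treats the shape prior and the lighting prior as independent, and uses invented probabilities; none of these choices is taken from Mamassian, Landy, and Maloney or from Stone. The posterior over shapes is obtained by summing over the lighting direction.

```python
# Hypothetical priors over distal shape s and lighting direction theta.
shape_prior = {"convex": 0.6, "concave": 0.4}
light_prior = {"above": 0.8, "below": 0.2}

# Hypothetical likelihood p(e | s, theta) for one fixed retinal pattern e
# ("bright on top, dark underneath"). A convex shape lit from above and a
# concave shape lit from below are assumed to produce this pattern most often.
likelihood = {
    ("convex", "above"): 0.9,
    ("convex", "below"): 0.1,
    ("concave", "above"): 0.1,
    ("concave", "below"): 0.9,
}


def shape_posterior(shape_prior, light_prior, likelihood):
    """Compute p(s | e) by marginalizing over the lighting direction theta."""
    unnormalized = {}
    for s, p_s in shape_prior.items():
        unnormalized[s] = sum(
            p_s * p_theta * likelihood[(s, theta)]
            for theta, p_theta in light_prior.items()
        )
    total = sum(unnormalized.values())
    return {s: value / total for s, value in unnormalized.items()}


posterior = shape_posterior(shape_prior, light_prior, likelihood)
print(posterior)
```

Under these assumptions the posterior strongly favours convexity. The same retinal pattern would yield the same posterior even if the object were in fact concave and lit from below, which is how the sketch reproduces the illusion discussed in section 1.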

Perception normally yields a determinate percept. For instance, one sees an object as having a determinate shape, not a spectrum of more or less probable shapes. Accordingly, Bayesian models explain how the perceptual system selects a single hypothesis h based on the posterior p ( h | e ). Typical models invoke expected utility maximization. The ‘action’ is selection of h . The utility function, which is task-dependent, reflects the penalty for an incorrect answer. If the utility function has a suitable shape, then expected utility maximization reduces to a much simpler decision rule, such as selecting the mean or the mode of the posterior probability.
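The reduction from expected utility maximization to a simpler rule can be seen in a small discrete case. In the sketch below, the posterior, the candidate estimates, and the two utility functions are illustrative assumptions: a utility that rewards only exactly correct answers selects the posterior mode, whereas a utility that penalizes squared error selects the posterior mean (here the mean happens to coincide with one of the candidates).

```python
def best_estimate(posterior, utility, candidates):
    """Return the candidate estimate with the highest expected utility."""
    def expected_utility(a):
        return sum(p * utility(a, h) for h, p in posterior.items())
    return max(candidates, key=expected_utility)


# Hypothetical posterior over a numerical quantity (e.g. slant in degrees).
posterior = {0: 0.1, 10: 0.2, 20: 0.3, 30: 0.4}
candidates = list(posterior)


def zero_one_utility(a, h):
    """Reward 1 for an exactly correct estimate, 0 otherwise."""
    return 1.0 if a == h else 0.0


def squared_error_utility(a, h):
    """Penalize an estimate by its squared distance from the true value."""
    return -((a - h) ** 2)


mode_estimate = best_estimate(posterior, zero_one_utility, candidates)
mean_estimate = best_estimate(posterior, squared_error_utility, candidates)
print(mode_estimate, mean_estimate)
```

The two utility functions pick different estimates from the same posterior, which illustrates why the utility function counts as a separate, task-dependent ingredient of the model.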

As another example, Bayesian models of surface colour perception proceed roughly as follows. A surface has reflectance R (λ), specifying the fraction of incident light that the surface reflects at each wavelength λ. 5 The illuminant has spectral power distribution I (λ): the light power at each wavelength. The retina receives light spectrum C (λ) = I (λ) R (λ) from the surface. The visual system seeks to estimate surface reflectance R (λ). This estimation problem is underdetermined, since C (λ) is consistent with numerous I (λ)- R (λ) pairs. Typical Bayesian models posit that two surfaces have the same colour appearance for a perceiver when her perceptual system estimates the same reflectance for each surface. To estimate R (λ), the visual system estimates I (λ). It does so through a Bayesian inference, based upon overall retinal stimulation, that deploys a prior probability over possible illuminants and possible surface reflectances. To a first approximation, the illuminant prior assigns higher probability to illuminants that resemble natural daylight, while the surface prior assigns higher probability to surface reflectances that occur more commonly in the natural environment. This framework can explain both the success and the failure of human colour constancy under various conditions. For details, see Brainard (2009) . 6
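The underdetermination expressed by C (λ) = I (λ) R (λ) can be made concrete with a coarse three-band spectrum. The band values below are invented for illustration and are not drawn from Brainard; the point is only that distinct illuminant-reflectance pairs can deliver the same light spectrum to the retina.

```python
def retinal_spectrum(illuminant, reflectance):
    """Compute C = I * R band by band (short, medium, long wavelengths)."""
    return [i * r for i, r in zip(illuminant, reflectance)]


# Hypothetical three-band spectra, ordered short, medium, long wavelength.
white_light = [1.0, 1.0, 1.0]
red_light = [0.1, 0.1, 0.9]
red_surface = [0.1, 0.1, 0.9]
white_surface = [1.0, 1.0, 1.0]

# A red surface in white light and a white surface in red light ...
c_red_in_white = retinal_spectrum(white_light, red_surface)
c_white_in_red = retinal_spectrum(red_light, white_surface)

# ... send the same spectrum to the eye.
print(c_red_in_white, c_white_in_red)
```

Since the product alone cannot separate the two factors, any estimate of R (λ) has to draw on further information, which in the Bayesian models is supplied by the priors over illuminants and reflectances.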

We can schematize a typical Bayesian model through the template depicted in Figure 37.1 . Note that this template does not require perception to represent Bayesian norms. There is no evidence that the perceptual system explicitly represents Bayes’s Rule or expected utility maximization. The perceptual system simply proceeds in rough accord with Bayesian norms.

A typical Bayesian model dictates a unique outcome given four factors: prior probabilities, prior likelihoods, sensory input, and the utility function. 7 In that sense, the model is deterministic. Of course, the model’s generalizations are ceteris paribus . Perceptual malfunction, external interference, or corruption by internal noise can induce exceptions.

Most Bayesian models conform roughly to the foregoing template. But some models vary the template. For instance, some models augment the template by incorporating motor efference copy . 8 Other models replace expected utility maximization with probability matching, a non-deterministic process whereby the probability that the perceptual system selects some hypothesis matches the posterior probability assigned to that hypothesis ( Mamassian, Landy, and Maloney, 2002 ). One phenomenon sometimes analyzed through non-deterministic Bayesian modelling is multistable perception (such as the Necker cube). During multistable perception, experience fluctuates between distinct percepts, rather than yielding a unique percept.
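The contrast between a deterministic selection rule and probability matching can be sketched as follows. The hypothesis labels and posterior values are assumptions made for illustration, and the deterministic rule shown is simply selection of the posterior mode.

```python
import random
from collections import Counter


def select_by_maximization(posterior):
    """Deterministic rule: always select the most probable hypothesis."""
    return max(posterior, key=posterior.get)


def select_by_probability_matching(posterior, rng):
    """Non-deterministic rule: select h with probability p(h | e)."""
    hypotheses = list(posterior)
    weights = [posterior[h] for h in hypotheses]
    return rng.choices(hypotheses, weights=weights, k=1)[0]


# Hypothetical posterior for an ambiguous figure such as the Necker cube.
posterior = {"cube_from_above": 0.7, "cube_from_below": 0.3}

rng = random.Random(0)  # seeded only so that the run is repeatable
trials = 10_000
counts = Counter(
    select_by_probability_matching(posterior, rng) for _ in range(trials)
)
frequency_above = counts["cube_from_above"] / trials
print(select_by_maximization(posterior), frequency_above)
```

Over many trials the matching rule selects each hypothesis about as often as its posterior probability, whereas the maximizing rule never selects the less probable one.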

One can construe Bayesian models of perception in two different ways ( Kersten and Mamassian, 2009 ). On the first construal, a Bayesian model describes how an ‘ideal observer’ would estimate environmental conditions based upon sensory input. We construct the model only so as to compare human performance with an ideal benchmark. On the second construal, a Bayesian model approximately describes actual mental processes. The model seeks to describe, perhaps in an idealized way, how the perceptual system actually transits from sensory input to perceptual estimates. Both construals figure in the scientific literature. I emphasize the second construal. I am discussing Bayesian models as empirical theories of actual human perception.

Figure 37.1 A template for Bayesian models of perception.

Many Bayesian models are fairly unrealistic. For example, the hypothesis space is often uncountable. In general, Bayesian inference over an uncountable hypothesis space is computationally intractable. So I think that we should regard most Bayesian perceptual models as idealizations, akin to models from physics that postulate frictionless surfaces or infinite wires. Of course, we eventually want less idealized descriptions. However, I see no principled problem here. Artificial Intelligence (AI) offers numerous tools for constructing computationally tractable approximations to idealized Bayesian computation. No doubt we will eventually supplement or replace current perceptual models with computationally tractable approximations.

Bayesian perceptual psychology provides detailed answers to the three questions (a), (b), and (c) posed at the end of the previous section:

(a) Transitions among perceptual states approximately conform to norms of Bayesian inference. In that sense, the transitions are statistical inferences.

(b) Bayesian models replace talk about ‘implicit assumptions’ with talk about prior probabilities and likelihoods. The models thereby depart substantially from many earlier versions of constructivism. On Rock’s approach, for example, an ‘implicit assumption’ that light comes from overhead corresponds to a single stored premise whose content is that light comes from overhead. Bayesians instead posit a prior assignment of probabilities to possible lighting directions. This prior figures not as a premise but rather as input to Bayesian reallocation of subjective probabilities over shape-estimates.

(c) The perceptual system produces an estimate that is ‘best’ or ‘optimal’ insofar as it conforms to rational norms of Bayesian decision theory. In this manner, Bayesian models depict numerous perceptual illusions as natural by-products of a near-optimal process that infers environmental conditions from ambiguous sensory stimulations.

Hence, the Bayesian framework converts talk about ‘implicit assumptions’ and ‘unconscious inferences’ into mathematically rigorous, quantitatively precise psychological models.

Where do the prior probabilities and prior likelihoods come from? The human visual system evolved over millennia in a fairly stable environment. Accordingly, one might expect certain lawlike or statistical environmental regularities to be ‘encoded in the genes’. Nevertheless, Bayesian perceptual priors do not simply reflect innate programming. For instance, even the ‘light-from-overhead’ prior reflects a complex interplay between nature and nurture. It gathers considerable strength during early childhood ( Stone, 2011 ), and it changes rapidly upon adult exposure to deviant environments ( Adams, Graf, and Ernst, 2004 ). At present, we do not know how genetic endowment and individual experience jointly determine Bayesian priors. Current research mainly tries to identify the priors, not to explain the aetiology of the priors. 9

Ultimately, we want detailed theories explaining how Bayesian priors originate and develop. Even lacking such theories, we can cite the priors to explain constancies and illusions. In this connection, I stress that the priors postulated by Bayesian perceptual psychology are not ad hoc . Admittedly, a precise quantitative match usually requires some ‘curve fitting’. Qualitatively, though, the priors typically reflect antecedently motivated claims about lawlike or statistical properties of our environment. It is plausible that the perceptual system acquires these priors through some combination of nature and nurture, even if we do not yet know how. 10

How can we legitimately postulate Bayesian priors, lacking a developed theory of their aetiology? Because Bayesian priors generate the unifying predictive power characteristic of good explanation. To illustrate, consider motion perception . 11 The visual system can directly measure local retinal image velocities, which underdetermine the distal motions that cause them. The visual system must estimate distal motion based upon local retinal image velocities. It does so fairly well but not perfectly, as illustrated by the fact that low-contrast stimuli appear to move more slowly than high-contrast stimuli. (This may explain why drivers accelerate in the fog—they underestimate relative velocities.) Weiss, Simoncelli, and Adelson (2002) offer a Bayesian motion perception model with two features:

The prior probability favours slow distal motions.

The visual system treats low-contrast retinal images as less reliable. 12

This model explains why vision underestimates velocity under low-contrast conditions: namely, because the slow-motion prior exerts more influence over the velocity-estimate. The model also explains other motion illusions, including the following: a fat rhombus moving horizontally appears to move horizontally, but a thin rhombus seems to move diagonally at low contrasts and horizontally at high contrasts. (Readers can experience this effect at < www.cs.huji.ac.il/~yweiss/Rhombus/rhombus.html >.) Thus, a single Bayesian model explains diverse illusions that otherwise resist unified treatment. Subsequent models have elaborated the Bayesian approach to motion perception in increasingly sophisticated ways ( Ernst, 2010 ).
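The interaction between a slow-motion prior and input reliability can be illustrated with a one-dimensional Gaussian sketch. This is a deliberate simplification and not the Weiss, Simoncelli, and Adelson model itself: it assumes a Gaussian prior centred on zero velocity, a Gaussian likelihood centred on the measured velocity, and a noise level that is simply stipulated to be larger at low contrast. Under those assumptions the posterior mean is the measurement shrunk towards zero by a factor that depends on the two variances.

```python
def posterior_mean_velocity(measured, prior_sd, noise_sd):
    """Posterior mean for a zero-centred Gaussian prior and Gaussian noise.

    The estimate is the measurement scaled by prior_var / (prior_var + noise_var),
    so noisier measurements are pulled further towards zero velocity.
    """
    prior_var = prior_sd ** 2
    noise_var = noise_sd ** 2
    weight_on_measurement = prior_var / (prior_var + noise_var)
    return weight_on_measurement * measured


# Hypothetical values in arbitrary units of velocity.
measured_velocity = 10.0
prior_sd = 5.0             # spread of the prior favouring slow motion
high_contrast_noise = 1.0  # reliable measurement
low_contrast_noise = 5.0   # unreliable measurement

high_contrast_estimate = posterior_mean_velocity(
    measured_velocity, prior_sd, high_contrast_noise
)
low_contrast_estimate = posterior_mean_velocity(
    measured_velocity, prior_sd, low_contrast_noise
)
print(high_contrast_estimate, low_contrast_estimate)
```

With these assumed numbers both estimates fall below the measured velocity, and the low-contrast estimate falls furthest, which is the qualitative pattern the slow-motion prior is meant to explain.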

Bayesian perceptual psychology offers illuminating, rigorous explanations for numerous constancies and illusions. It is our best current science of perception. We should carefully consider how it bears upon contemporary philosophy of mind—a task to which I now turn.

3 Estimation and Representation

A natural view holds that perceptual states are evaluable as accurate or inaccurate . For instance, suppose I perceive a concave object that appears convex due to misleading lighting. It seems natural to say that my percept is inaccurate. To say this, we must ascribe truth, accuracy, or veridicality conditions to the percept. Some philosophers distinguish among ‘truth’, ‘accuracy’, and ‘veridicality’ ( Burge, 2010 ), but I remain neutral on this issue. Call the view that perceptual states have veridicality-conditions representationalism . Burge (2005 , 2010 , 2011 ) argues that current perceptual psychology supports representationalism. I will now defend the same conclusion by examining Bayesian models of perception. 13

On the Bayesian approach, perceptual inference reallocates probabilities over a hypothesis space and then selects a favoured hypothesis. This favoured hypothesis is incorporated into the final percept, whose accuracy depends upon whether the hypothesis is accurate. To illustrate, consider Bayesian models of shape perception. The perceptual system assigns prior probabilities to estimates of specific distal shapes . After receiving sensory input, perceptual inference revises the probability assignment and selects a favoured estimate of a specific distal shape . The resulting percept incorporates this favoured shape-estimate. The percept may also incorporate various size-estimates, motion-estimates, and so on. Accuracy of the percept depends upon accuracy of the individual estimates. By describing perceptual inference in this way, we type-identify perceptual states representationally. We individuate perceptual states partly through environmental conditions that must obtain for the states to be accurate.

What exactly are the accuracy-conditions of percepts? According to Davies (1992) , a percept involves something like existential quantification . The percept is accurate when there exist objects with properties represented by the percept. An opposing view, espoused by Burge (2005) , holds that perceptual accuracy-conditions are object-dependent . A percept represents environmental particulars, such as physical bodies or events. The percept attributes properties to those particulars. It is accurate only if those particulars have the represented properties. I remain neutral between these two views. I emphasize a shared presupposition underlying both views: that perceptual states have accuracy-conditions. This presupposition is integral to perceptual psychology. The science seeks to explain how the perceptual system generates a percept that estimates specific environmental conditions. Estimates can be either accurate or inaccurate.

Following standard philosophical usage, I say that a mental state has representational content when it has a veridicality-condition. On this usage, perceptual states have representational content. I do not assume a specific theory of representational content. One might gloss perceptual contents as sets of possible worlds, or Russellian propositions, or Fregean senses. There are many other options. 14 The key point for us is that the science routinely individuates perceptual states through their representational import.

Bayesian models individuate both explananda and explanantia in representational terms. The science explains perceptual states under representational descriptions, and it does so by citing other mental states under representational descriptions . For instance, Bayesian models of shape from shading assume prior probabilities over hypotheses about specific distal shapes and about specific lighting directions . The models articulate generalizations describing how retinal input, combined with these priors, causes a revised probability assignment to hypotheses about specific distal shapes, subsequently inducing a unique estimate of a specific distal shape . The generalizations type-identify perceptual states as estimates of specific distal shapes. Similarly, Bayesian models of surface colour perception type-identify perceptual states as estimates of specific surface reflectances. Thus, the science assigns representation a central role within its explanatory generalizations. The generalizations describe how mental states that bear certain representational relations to the environment combine with sensory input to cause mental states that bear certain representational relations to the environment .

In what follows, I develop my analysis by examining various philosophical theories that either reject representationalism or else downplay the importance of representational content.

4 The Relational View of Perception

Brewer (2007) , Campbell (2010) , Martin (2004) , and Travis (2004) espouse a relational view of perception. Relationalists eschew all talk about perceptual representation. They treat perceptual states as relations not to representational contents but rather to objects or properties in the perceived environment. For instance, Campbell (2010 , 202) holds that ‘the content of visual experience is constituted by the objects and properties in the scene perceived’, rather than by anything resembling an accuracy-condition. He cautions that we should not ‘think of experience itself as already a representational state’ (ibid.). The relational approach is sometimes allied with Gibsonian direct perception, sometimes not.

To illustrate, consider two counterfactual situations A and B in which I perceive the same object O, yielding qualitatively indistinguishable percepts P A and P B :

In situation A, O is convex and looks convex.

In situation B, O is concave but looks convex through misleading lighting.

Representational taxonomization type-identifies P A and P B by correlating them with the same accuracy-condition. In particular, both percepts estimate the same distal shape: convexity. In situation A , the estimate is correct. In situation B , the estimate is incorrect. By contrast, Campbell’s relational taxonomization treats P A and P B as type-distinct. Campbell type-identifies the first percept through its relation to a distal property (convexity) to which the second percept is not appropriately related.

Bayesian perceptual psychology supports representationalism over relationalism.

A core postulate underlying the science is that perception produces an estimate of environmental conditions, where the estimate may be either accurate or inaccurate. Consider Figure 37.1 . If we neglect noise, malfunction, and external interference, then a unique percept-type is determined by four factors: the prior probability, the prior likelihood, proximal sensory input, and the utility function. We may stipulate that all four factors are the same in situations A and B . It follows that percepts P A and P B are type-identical from the perspective of the Bayesian model . In both cases, the final percept incorporates a convexity-estimate. The perceptual system produces a convexity-estimate whether or not the perceived object is convex . (Cf. Burge, 2005 , 22–25; 2010, 362–364.) An appropriately modified diagnosis applies to non-deterministic Bayesian models, such as models that replace expected utility maximization with probability matching. For such models, the probability that situation A yields a convexity-estimate equals the probability that situation B yields a convexity-estimate. Thus, explanatory generalizations of Bayesian perceptual psychology enshrine a representational, non-relational taxonomic scheme. The generalizations type-identify percepts by specifying environmental conditions that must obtain for a given percept to be accurate.

Campbell (2010) suggests that we can interpret perceptual science in relational terms. This suggestion seems unpromising, because the Bayesian explanation of illusion relies essentially upon non-relational taxonomization. The central idea is that the perceptual system estimates some environmental state, which may or may not obtain . Bayesian modelling seeks to explain the environmental state estimate, regardless of whether the estimate is veridical. Contrary to Campbell’s relationalist strictures, the science routinely type-identifies veridical and non-veridical percepts. Of course, there is a difference between the veridical and the non-veridical percept. Perceptual psychologists acknowledge this difference. Yet they also emphasize fundamental representational commonalities between the two percepts. Those commonalities play a key individuative role within Bayesian explanatory generalizations. So a relational, non-representational taxonomic scheme flouts explanatory practice within perceptual psychology.

Brewer (2007, 173) seeks to accommodate illusions inside a relational framework. He concedes that there can be a ‘visually relevant similarity’ between a veridical and a non-veridical percept. He compares: (i) a red surface in daylight; and (ii) a white surface surreptitiously bathed in red light. He acknowledges that the surface in scenario (ii) looks red. He says that ‘this consists in the fact that [the surface] has visually relevant similarities with paradigm red objects: the light reflected from it is like that reflected from such paradigms in normal viewing conditions’ (ibid.).

Naturally, I agree that (i) and (ii) emit similar light spectra. However, merely noting this commonality does not capture the fact that both surfaces look red. A surface that emits the same light spectrum under different viewing conditions may not look red. A surface that emits a radically different light spectrum under different viewing conditions can still look red. Thus, we must reject Brewer’s proposed analysis of looks red. In contrast, representationalists can say that a surface looks red when one’s percept represents the surface as red.

Brewer’s account omits crucial scientifically relevant commonalities between the two percepts. A key scientifically relevant commonality is that both percepts result from perceptual estimation of a single surface reflectance R(λ). The estimate is correct in (i), incorrect in (ii). We do not capture this key commonality between the percepts simply by noting that (i) and (ii) emit similar light. The perceptual system can estimate reflectance R(λ) despite large variation in the light spectrum C(λ) emitted by a surface. Moreover, depending on the perceptual system’s estimate of illumination I(λ), it may not estimate R(λ) even when the surface emits the same light spectrum C(λ). Capturing the scientifically relevant commonalities between (i) and (ii) requires us to cite perceptual estimation (and hence perceptual representation) of surface reflectance. Yet relationalists eschew all talk about perceptual representation.
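A small numerical sketch may make the point concrete. It assumes a simplified multiplicative model on which the light leaving a surface is the product of illuminant and reflectance, C(λ) = I(λ)R(λ), sampled at three coarse wavelength bands. The spectra are made-up values, and the 'estimate' merely divides C(λ) by an assumed illuminant, which is far cruder than any actual Bayesian model of colour constancy.

```python
# Coarse three-band spectra (long, medium, short wavelengths). All values are made up.
DAYLIGHT = (1.0, 1.0, 1.0)
RED_LIGHT = (1.0, 0.2, 0.2)

RED_SURFACE = (0.9, 0.18, 0.18)   # reflectance of a red surface
WHITE_SURFACE = (0.9, 0.9, 0.9)   # reflectance of a white surface


def emitted(illuminant, reflectance):
    """Assumed model: C(lambda) = I(lambda) * R(lambda), band by band."""
    return tuple(i * r for i, r in zip(illuminant, reflectance))


def estimate_reflectance(light, assumed_illuminant):
    """Crude stand-in for perceptual estimation: R_hat = C / I_hat."""
    return tuple(c / i for c, i in zip(light, assumed_illuminant))


# (i) a red surface in daylight; (ii) a white surface bathed in red light.
c_i = emitted(DAYLIGHT, RED_SURFACE)
c_ii = emitted(RED_LIGHT, WHITE_SURFACE)

# If daylight is assumed in both cases, both percepts carry the same reflectance
# estimate: correct in (i), incorrect in (ii).
r_hat_i = estimate_reflectance(c_i, DAYLIGHT)
r_hat_ii = estimate_reflectance(c_ii, DAYLIGHT)

# Same light C, different illuminant estimate: the reflectance estimate changes.
r_hat_ii_alt = estimate_reflectance(c_ii, RED_LIGHT)

print(r_hat_i, r_hat_ii, r_hat_ii_alt)
```

The commonality between (i) and (ii) that matters in this sketch is the shared reflectance estimate, not the shared light: holding the light fixed while varying the assumed illuminant yields a different estimate.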

There are delicate issues here surrounding the relation between colours and surface reflectances. According to current science, a percept that represents a surface as red is caused by perceptual activity that represents reflectance. But does the final percept itself represent reflectance? There are at least three salient options:

(a) The percept represents colour but not reflectance.

(b) The percept represents reflectance and separately represents colour.

(c) The percept represents reflectance and thereby represents colour.

The choice between (a), (b), and (c) depends upon other matters, including the metaphysics of colour (cf. note 6). We need not choose among (a)–(c) here. The crucial point is that relationalists must reject all three options. Relationalists do not countenance perceptual representation of colour, reflectance, or any other distal property.

In summary, relationalism cannot accommodate a core postulate underlying contemporary perceptual psychology: that perception produces an estimate of environmental conditions, where the estimate may be either accurate or inaccurate.

5 Eliminativism, Instrumentalism, and Realism

Beginning with Quine (1960), various philosophers have argued that intentionality (or representationality) deserves no place in serious scientific discourse. They have argued that we should replace intentional psychology with some alternative framework, such as Skinnerian behaviourism (Quine, 1960) or neuroscience (Churchland, 1981). This eliminativist position concedes that representational locutions are instrumentally useful in everyday life. It denies that they offer literally true descriptions. Dennett (1987) advocates a broadly instrumentalist position intermediate between intentional realism and eliminativism. He acknowledges that the ‘intentional stance’ is instrumentally useful for scientific psychology, but he questions whether mental states really have representational content.

I assume a broadly scientific realist perspective: explanatory success is a prima facie guide to truth. From a scientific realist perspective, the explanatory success of Bayesian perceptual psychology provides prima facie reason to attribute representational content to perceptual states. The science is empirically successful and mathematically rigorous. It routinely individuates perceptual states through representational relations to the environment. We have no clue how to preserve the resulting explanatory benefits without employing representational locutions. Thus, current perceptual psychology strongly supports intentional realism over eliminativism and instrumentalism. We should no more adopt an eliminativist or instrumentalist posture towards intentionality than we should adopt an eliminativist or instrumentalist posture towards electrons. The famous Quinean criticisms of intentional psychology are notably less rigorous and compelling than the science they purport to undermine. Philosophers who reject intentionality as spooky, obscure, or otherwise unscientific are in fact opposing our current best science of perception.

One might greet my argument by proposing an instrumentalist interpretation of perceptual psychology. In this vein, McDowell insists that appeals to representational content within perceptual psychology are ‘metaphorical’ (2010, 250). On his analysis, perceptual psychologists do not literally claim that perception represents. They claim only that perception proceeds as if it represents. Representational talk is mere heuristic.

McDowell’s proposal misinterprets perceptual psychology. (Cf. Burge, 2011, 67–68.) A fundamental idea underlying how the science treats illusion is that a perceptual estimate can be inaccurate. An estimate is accurate only if the environmental conditions that it estimates actually obtain. Thus, intentional attribution is embedded within the foundations of the science. Representational locutions do not play a metaphorical role within Bayesian perceptual psychology. They are not heuristic chitchat. They reflect the central, explicit goal of the science: to describe how the perceptual system estimates environmental conditions. Instrumentalism is no more justified toward Bayesian perceptual psychology than toward any other science.

Even readers who reject full-blown instrumentalism may contemplate a moderate instrumentalist agenda: construe representational description literally when applied to explananda but metaphorically when applied to explanantia . Consider again Figure 37.1 . Moderate instrumentalism adopts a realist stance towards sensory input e and the output hypothesis h but an instrumentalist stance towards the priors, posterior, and utility function. On this approach, the priors, posterior, and utility function are simply useful tools for predicting how certain sensory inputs cause certain perceptual states. The perceptual system transits from retinal input to perceptual estimates as if it encodes Bayesian priors. Moderate instrumentalism concedes that the perceptual system implements a mapping from sensory inputs to perceptual estimates, but it remains neutral regarding how the perceptual system implements that mapping. For defence of moderate instrumentalism regarding Bayesian perceptual psychology, see Colombo and Seriès (2012) .

Moderate instrumentalism does not flout the science as blatantly as full-blown instrumentalism. Nevertheless, it strikes me as unsatisfactory. A key point here is that experience can alter the mapping from proximal input to perceptual estimates. For example, Adams, Graf, and Ernst (2004) manipulated the light-from-overhead prior by exposing subjects to deviant haptic feedback regarding shape. The new prior caused altered shape-estimates. Moreover, the new prior transferred to a different task that required subjects to estimate which side of an oriented bar was lighter than the other. Realists can offer a principled, unified explanation for the altered shape-estimates and lightness-estimates: namely, that there is a change in the prior over lighting directions. Moderate instrumentalists seem unable to offer a comparably satisfying explanation. Moderate instrumentalists must simply say that the mapping from retinal input to shape-estimates changes and that the mapping from retinal input to lightness-estimates changes, without offering any underlying explanation for why the mappings change as they do. In this case, at least, realism seems more explanatorily fruitful than moderate instrumentalism. 15
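The structure of the realist explanation can be sketched in code. What follows is a deliberately crude toy, not a reconstruction of the model or stimuli used by Adams, Graf, and Ernst: the prior is collapsed to a single assumed lighting direction, both decision rules are hypothetical, and the 30-degree shift and stimulus angles are arbitrary illustrative values. The only point is that one shared parameter feeds both tasks, so that a single change alters both mappings.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two directions, in degrees."""
    d = abs(a_deg - b_deg) % 360
    return min(d, 360 - d)


def shape_estimate(bright_side_deg, assumed_light_deg):
    """Toy rule: a shaded disc is judged convex if its bright side faces the assumed light."""
    facing_light = angular_difference(bright_side_deg, assumed_light_deg) < 90
    return "convex" if facing_light else "concave"


def lighter_side(side_a_deg, side_b_deg, assumed_light_deg):
    """Toy rule: given equal luminance, the side facing away from the assumed
    light is judged to be the lighter surface."""
    away_a = angular_difference(side_a_deg, assumed_light_deg)
    away_b = angular_difference(side_b_deg, assumed_light_deg)
    return "a" if away_a > away_b else "b"


# One shared parameter: the assumed lighting direction (0 = directly overhead).
before_training = 0
after_training = 30   # hypothetical shift

# The same two stimuli are judged before and after the shift.
for light in (before_training, after_training):
    print(light, shape_estimate(100, light), lighter_side(105, 285, light))
```

A description that merely tabulated the two input-output mappings would record both changes but would contain nothing corresponding to the single parameter from which, in this sketch, both changes flow.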

We must exercise care in stating the realist position. As already noted, current Bayesian models are highly idealized. When the hypothesis space is large enough, the perceptual system may only approximately encode the priors and the posterior. What does it mean to ‘approximately encode’ a probability assignment? What is the difference between saying that the mind approximately implements Bayesian inference and saying that the mind merely behaves as if it implements Bayesian inference? 16 These questions—which lie at the intersection of philosophy, AI, and empirical psychology—merit extensive further study.

6 Phenomenal Content

Relatively few philosophers reject representationalism. However, many popular philosophical theories downplay perceptual representation of the distal environment. Most of these theories are consistent with but unsupported by contemporary science. I will now illustrate by considering phenomenal content , as postulated by Chalmers (2006) , Horgan and Tienson (2002) , Thompson (2010) , and various other philosophers.

A distinguishing feature of phenomenal content is that it supervenes upon phenomenal aspects of experience. For example, suppose that a normal perceiver Nonvert observes a red object and experiences a perceptual state with a certain phenomenological character. Suppose that a spectrally inverted perceiver Invert observes a green object and experiences a phenomenally indistinguishable perceptual state. Chalmers and Thompson hold that, in both cases, the resulting percept is veridical. Nonvert’s percept correctly attributes redness, while Invert’s percept correctly attributes greenness. Chalmers and Thompson also hold that the two percepts share a uniform phenomenal content. The content represents red as used by Nonvert and green as used by Invert . Similarly, Chalmers and Thompson hold that a single phenomenal content might represent circularity as used by one perceiver and non-circular ellipticality as used by a phenomenological twin suitably embedded in a sufficiently different environment .

There may be many good reasons for positing phenomenal contents. However, Bayesian perceptual psychology makes no use of such contents. The science delineates explanatory generalizations dictating how mental states that represent certain environmental properties induce other mental states that represent certain environmental properties . Bayesian models describe how the perceiver, exercising standing capacities to represent specific environmental properties , executes perceptual inferences yielding estimates of specific environmental properties . To illustrate, let us follow Thompson (2010) by considering phenomenological twins embedded in such different environments that one twin’s percept P represents circularity while the other twin’s qualitatively indistinguishable percept P* represents non-circular ellipticality. There may be many worthy explanatory projects that type-identify P and P* . But Bayesian perceptual psychology does not type-identify the two percepts. The science studies perceptual estimation of environmental conditions. P and P* estimate radically different environmental conditions: P estimates circularity, while P* estimates non-circular ellipticality. The science features no explanatory generalizations that assimilate these two percepts, because the relevant generalizations are tailored to specific shapes. Phenomenological overlap per se is irrelevant to the current science. What matters is representational overlap.

Similarly, suppose that Nonvert observes a red object, while spectrally inverted Invert observes a green object. Chalmers and Thompson associate the resulting qualitatively indistinguishable percepts with a shared phenomenal content. In contrast, Bayesian perceptual psychology does not type-identify the percepts. Bayesian models treat surface colour perception as involving estimation of reflectance. Explanatory generalizations cite representational relations to specific reflectances. Current Bayesian models of Nonvert describe how retinal illumination C(λ) induces an estimate of illuminant I(λ), subsequently inducing an estimate of reflectance R(λ). Current Bayesian models of Invert describe how different retinal illumination C*(λ) induces an estimate of a different reflectance R*(λ). Reflectance-estimate R(λ) as used by Nonvert and reflectance-estimate R*(λ) as used by Invert may be associated with the same phenomenology. But this phenomenological overlap is irrelevant to the science. No explanatory generalizations type-identify the relevant perceptual processes. At no level of description does current science assimilate Nonvert’s colour perception and Invert’s colour perception. 17

Current perceptual psychology individuates perceptual states by citing representational relations to specific environmental properties. 18 Taxonomization through phenomenal content ignores these representational relations. I conclude that phenomenal content is an armchair construct with no grounding inside contemporary science. Readers must judge for themselves whether philosophical energy is better expended studying this armchair construct or analyzing our current best science of perception.

7 The Computational Theory of Mind

I now want to consider the relation between Bayesian perceptual psychology and the popular philosophical view that mental activity involves computation over formal syntactic types in a language of thought ( Field, 2001 ), ( Fodor, 2008 ), ( Stich, 1983 ). The paradigm here is a Turing machine manipulating formal syntactic items, such as stroke marks, inscribed in memory locations. A formal syntactic type may have a meaning. But it could have had a different meaning, just as the English word ‘cat’ could have denoted dogs. Depending on the perceiver’s causal or evolutionary history, a formal syntactic type that represents some distal property could just as easily have represented some other distal property. Formal syntactic manipulation is not sensitive to such changes in meaning. Transition rules governing mental computation allude solely to ‘local’ syntactic properties of mental states, without citing representational relations to the external environment.

Field (2001) and Stich (1983) combine the formal syntactic picture with eliminativism. They urge scientific psychology to eschew any talk about representational content. Fodor (2008) combines the formal syntactic picture with intentional realism. In particular, he urges scientific psychology to delineate causal laws that cite representational content. He holds that intentional laws are implemented by syntactic mechanisms. So Fodor assigns a central role to representational content in addition to formal syntactic manipulation.

Egan (1992) argues that perceptual psychology postulates formal syntactic manipulation. She defends her conclusion by analyzing the writings of Marr (1982 ). I set aside whether Egan correctly describes Marr’s work, which was historically important but is now outdated. 19 I claim that the formal syntactic picture finds no support within current perceptual psychology, as epitomized by Bayesian modelling. Current perceptual psychology individuates mental computations in representational rather than formal syntactic terms ( Burge, 2010 , 95–101). For instance, Bayesian models of shape perception describe a computation whereby the visual system reallocates probabilities over hypotheses about distal shape . Each hypothesis is individuated partly by its representational relation to a specific distal shape. Transition rules governing the computation derive from Bayesian norms. Of course, the transition rules characterize initial sensory inputs (such as retinal inputs) physiologically rather than representationally. Crucially, though, the rules use representational vocabulary to characterize the perceptual states caused by initial sensory inputs. The rules do not cite formal syntax when characterizing sensory inputs (which are described physiologically) or ensuing perceptual states (which are described representationally). Bayesian models do not cite formal syntactic items divested of representational import. 20

A complete science of perception must illuminate the neural mechanisms that implement Bayesian computation. 21 Thus, a complete theory should include non-representational neural descriptions. But should it include non-representational syntactic descriptions? Syntax is supposed to be multiply realizable , in the sense that systems with wildly different intrinsic physical constitutions can satisfy the same syntactic description ( Fodor, 2008 , 91). Systems may be homogeneous under syntactic description but heterogeneous under neural description. Should a good theory posit formal syntactic types that are multiply realizable and that underdetermine representational content? There may be many good reasons for positing formal syntactic types with these features. Yet no such types figure in current perceptual psychology. The science does not employ computational descriptions that prescind from both representational and neural details. Eliminativist versions of the formal syntactic picture conflict with current perceptual psychology. Intentional realist versions of the formal syntactic picture are consistent with but unsupported by current perceptual psychology.

A common rejoinder is that we can reinterpret intentional explanations in formal syntactic terms, without explanatory loss. In this vein, Field (2001, 72–82, 153–156) proposes a version of Bayesian modelling on which subjective probabilities attach to formal syntactic items individuated without regard to meaning or content. He claims that this framework can preserve any alleged explanatory benefits offered by intentional explanation.

Field’s proposal is revisionary regarding contemporary psychology. Current science individuates perceptual states representationally. Field proposes an alternative scientific framework that individuates perceptual states in formal syntactic terms. Whether an alternative hypothesis subserves equally good explanations is not a question to be settled a priori . Proponents must first develop the alternative hypothesis in rigorous mathematical and empirical detail. Field must reconstruct current science, expunging any apparent reference to representation. Yet he does not indicate how to execute the needed reconstruction for a single real case study. He does not demonstrate through a single real example that his approach can replicate the explanatory benefits offered by intentional explanation within Bayesian psychology. Thus, Field’s proposal amounts to an unsupported conjecture that we can gut perceptual psychology of a central theoretical construct without explanatory loss. We have no reason to believe this conjecture, absent detailed confirmation. 22 Generally speaking, we cannot radically alter how a science individuates its subject matter while preserving the science’s explanatory shape. We should not expect that we can transfigure the taxonomic scheme employed by current Bayesian models while retaining the explanatory benefits provided by those models.

In her later writings, Egan (2010) avoids talk about formal syntactic manipulation. Instead, she claims that computational models of perception offer ‘abstract mathematical descriptions’ that ignore representational properties of perceptual states. This new account shares a crucial feature with the formal syntactic picture. Both accounts prioritize non-intentional, non-neural computational descriptions. As I have argued, no such descriptions figure in Bayesian perceptual psychology.

Philosophers motivate non-intentional computational modelling through various arguments. One popular argument emphasizes explanatory generality ( Egan, 2010 ; Stich, 1983 , 160–170). Following Egan (2010) , consider a creature Visua whose perceptual states represent some environmental property (such as depth). Imagine a neurophysiological duplicate Twin Visua embedded in such a radically different environment that its corresponding perceptual states do not represent the same property. 23 A non-intentional computational description can type-identify the doppelgangers. We cannot type-identify the doppelgangers if we classify perceptual states through representational relations to the environment. Shouldn’t we prefer the more general theory?

Assessing the merits of this argument is a large task that lies beyond our main focus. The key point for present purposes is that Bayesian perceptual psychology does not type-identify Egan’s putative neurophysiological twins. The science explains how perceptual systems of terrestrial animals transit from sensory input to hypotheses that represent specific environmental properties. It studies terrestrial animals endowed with standing capacities to represent specific environmental properties. Its scope is not intergalactic. It does not seek to accommodate chimerical creatures imagined by philosophers. Whatever the putative explanatory benefits of non-intentional computational modelling, our actual best science of perception individuates perceptual states partly through representational relations to specific environmental properties.

8 An Abstract Mathematical Description?

To bolster my assessment, I will now examine more carefully the role that probability theory plays within Bayesian modelling. Interested readers can consult any standard probability-theory textbook for the technical background to my discussion.

Probability theory, as axiomatized by Kolmogorov, posits a sample space Ω whose elements are possible ‘outcomes’. Kolmogorov’s axioms place no restrictions on elements of Ω. If Ω is discrete, then we can assign probabilities directly to its elements. If Ω is continuous, then we instead assign probabilities to privileged subsets of Ω. We introduce a σ-algebra over Ω (i.e. a set of subsets of Ω that contains Ω and is closed under countable union and complementation in Ω). A probability measure assigns a probability (a real number) to each element of the σ-algebra.
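For a finite sample space, these definitions can be checked mechanically. The sketch below uses a hypothetical three-element Ω with made-up point masses, takes the power set as the σ-algebra, and verifies the closure conditions just stated. Because the space is finite, closure under pairwise union stands in for closure under countable union.

```python
from itertools import chain, combinations

# Hypothetical discrete sample space with made-up point masses.
OMEGA = frozenset({"near", "middle", "far"})
POINT_MASS = {"near": 0.5, "middle": 0.3, "far": 0.2}


def power_set(s):
    """All subsets of s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))
    return {frozenset(sub) for sub in subsets}


SIGMA_ALGEBRA = power_set(OMEGA)


def prob(event):
    """Probability measure: sum the point masses of the outcomes in the event."""
    return sum(POINT_MASS[w] for w in event)


def is_sigma_algebra(family, omega):
    """Contains omega; closed under complementation and (finite) union."""
    return (
        omega in family
        and all(omega - a in family for a in family)
        and all(a | b in family for a in family for b in family)
    )


print(is_sigma_algebra(SIGMA_ALGEBRA, OMEGA))          # power set passes the check
too_small = {frozenset(), OMEGA, frozenset({"near"})}  # lacks the complement of {"near"}
print(is_sigma_algebra(too_small, OMEGA))
```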

Figure 37.2 The probability density function for a Normal distribution.

A random variable is a measurable function from Ω to the real numbers ℝ. 24 A probability measure and a random variable jointly induce a probability distribution: an assignment of probabilities to privileged subsets of ℝ. Intuitively, the random variable lets us transform a probability assignment involving Ω into a probability assignment involving ℝ. 25 The probability distribution exists entirely within the realm of abstract mathematical entities. By citing the random variable and the probability distribution, we vastly increase the elegance and utility of our mathematical formalism. In particular, we can now apply real analysis to probabilistic modelling.
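In symbols, the standard construction can be written as follows. The labels are introduced here only for illustration: F for the σ-algebra, P for the probability measure, X for the random variable, and P_X for the induced distribution.

```latex
% Probability space (\Omega, \mathcal{F}, P); random variable X : \Omega \to \mathbb{R}.
P_X(B) \;=\; P\bigl(X^{-1}(B)\bigr)
       \;=\; P\bigl(\{\omega \in \Omega : X(\omega) \in B\}\bigr)
\qquad \text{for each Borel set } B \subseteq \mathbb{R}.
```

Measurability of X is what guarantees that each preimage belongs to the σ-algebra, so that the right-hand side is defined.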

When Ω is continuous, we can often introduce a probability density function (pdf), which carries each element of ℝ to a probability density (also drawn from ℝ). A famous example is the Normal (or Gaussian) distribution, whose associated probability density function is depicted in Figure 37.2. The probability that a random variable attains a value within some region is found by integrating the pdf over that region. In other words, the probability assigned by the probability distribution to a region equals the integral of the pdf over that region. 26 A pdf is a purely mathematical entity, just like a probability distribution.
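The relation between a pdf and the probabilities it determines can be checked numerically. The sketch below uses the standard formula for the Normal density and compares a trapezoidal-rule integral over the interval from -1 to 1 with the value obtained from the error function. The choice of interval, step count, and integration rule are illustrative assumptions.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the Normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function, expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def integrate(f, a, b, steps=10_000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, steps))
    return total * h


# Probability that a standard Normal variable falls within one standard deviation.
p_numeric = integrate(normal_pdf, -1.0, 1.0)
p_exact = normal_cdf(1.0) - normal_cdf(-1.0)
print(p_numeric, p_exact)

# A density is not itself a probability: it can exceed 1.
print(normal_pdf(0.0, sigma=0.1))
```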

To apply probability theory to psychological modelling, we must specify the nature of the underlying sample space Ω. When we seek to model perception, we should construe Ω’s elements as perceptual estimates or hypotheses. For instance, if we are modelling depth perception, then we should construe each element of Ω as a perceptual estimate of some particular depth. One might gloss ‘perceptual estimates’ as mental representations, or Russellian propositions, or Fregean senses, or sets of possible worlds, and so on. The key point is that we individuate perceptual estimates at least partly through the environmental properties that the estimates represent. As I have argued, this is how the science typically individuates perceptual estimates. Once we have introduced an underlying sample space, we can also introduce appropriate random variables. To illustrate, suppose that Ω contains depth-estimates. Then we can introduce a random variable D that maps each depth-estimate h to a real number D ( h ). Depending on our choice of D , the real number D ( h ) might be the depth estimated by h as measured in metres, or as measured in feet, and so on.

In practice, Bayesian perceptual psychologists rarely highlight the underlying sample space Ω. Typical models, including all the models described in this chapter, instead emphasize probability distributions or pdfs. For instance, Jacobs (1999) posits a pdf for a random variable corresponding to depth. A pdf is a purely mathematical entity. By specifying it, we do not specify a unique sample space Ω. The pdf is consistent with numerous sample spaces.

At first blush, the scientific emphasis on probability distributions and pdfs may seem to undermine my representationalist interpretation of Bayesian perceptual psychology. Consider once again Visua, whose perceptual states represent depth, and doppelganger Twin Visua, whose corresponding states do not represent depth. According to Egan, explanatory generalizations of perceptual psychology should and do apply uniformly to Visua and Twin Visua. We can supplement the generalizations by specifying the environmental properties represented by Visua or Twin Visua. But the generalizations themselves ignore environmental representata . The generalizations constitute an ‘abstract mathematical description’ equally consistent with diverse distal interpretations ( Egan, 2010 , 256). Initially, Bayesian models may seem to offer precisely what Egan demands: ‘abstract mathematical descriptions’ that prescind from environmental representata . After all, Bayesian models emphasize pdfs, and a pdf is a purely mathematical entity: a function from real numbers to real numbers. Shouldn’t we conclude that Bayesian models of depth perception describe Twin Visua just as well as Visua?

Any such conclusion would be mistaken. I concede that a Bayesian perceptual model has an abstract mathematical form. I concede that, in principle, this abstract form encompasses diverse chimerical creatures. Nevertheless, the model describes statistical inferences over perceptual hypotheses, which it individuates partly through representational relations to specific environmental properties. Bayesian perceptual psychology does not pursue explanatory generalizations framed at an abstract mathematical level. Just as physics uses abstract mathematics to articulate generalizations over physical state-types, perceptual psychology uses abstract mathematics to articulate generalizations over representational mental state-types.

The central issue here is the notion of random variable. A random variable is a function from a sample space Ω to the real numbers ℝ. Thus, a random variable is defined only given a sample space. Ultimately, any Bayesian perceptual model featuring a random variable presupposes an appropriate sample space Ω. Perceptual models cite random variables only so as to illuminate probability assignments to environmental state estimates. The goal is to describe a statistical inference over estimates about the perceiver’s environment. The random variable is a valuable device for describing this statistical inference. But it is simply a tool for formulating rigorous, elegant explanatory generalizations concerning perceptual estimates.

As evidence for my position, I cite alternative measurement units. Our mapping from depth-estimates to real numbers depends upon our choice of units. The metric system yields one random variable. The British imperial system yields another. Our choice of random variable reflects our measurement units. Thus, the specific mathematical parameters enshrined by a random variable are mere artefacts of our measurement system. The parameters lack any explanatory significance for scientific psychology. We may use metric units to measure depth, but the perceptual system almost certainly does not. Psychological significance resides in the state estimate, not the mathematical entities through which we parameterize state estimates. Our ultimate concern is the probability measure over environmental state estimates, not the probability distribution over mathematical parameters. To privilege the latter over the former is to read our own idiosyncratic measurement system into the psychological phenomena. We must not conflate our measurement units with the environmental states that we use the units to measure.
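The dependence on units can be exhibited with a small discrete example. The sample space, the labels for the depth-estimates, the probabilities, and the distances are all hypothetical. One probability measure over estimates is pushed forward through two random variables, one in metres and one in feet.

```python
# Hypothetical sample space: each element is a depth-estimate, labelled by the
# depth it represents rather than by a number. The probabilities are made up.
MEASURE = {"arm's length": 0.2, "across the room": 0.5, "across the street": 0.3}

# Two random variables defined on the same sample space (made-up distances).
DEPTH_IN_METRES = {"arm's length": 0.7, "across the room": 4.0, "across the street": 15.0}
METRES_TO_FEET = 1 / 0.3048   # the international foot is defined as 0.3048 m


def depth_in_metres(h):
    return DEPTH_IN_METRES[h]


def depth_in_feet(h):
    return DEPTH_IN_METRES[h] * METRES_TO_FEET


def induced_distribution(random_variable):
    """Push the measure over estimates forward to a distribution over real numbers."""
    dist = {}
    for h, p in MEASURE.items():
        x = random_variable(h)
        dist[x] = dist.get(x, 0.0) + p
    return dist


dist_m = induced_distribution(depth_in_metres)
dist_ft = induced_distribution(depth_in_feet)

# The two distributions live on different real numbers ...
print(sorted(dist_m))
print(sorted(dist_ft))
# ... yet both merely re-describe the single measure over depth-estimates.
print(sorted(dist_m.values()) == sorted(dist_ft.values()))
```

In this sketch the dictionary `MEASURE` is the invariant item. The numerical supports of `dist_m` and `dist_ft` differ solely because of the conversion factor.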

I conclude that Bayesian perceptual psychology offers intentional generalizations governing probability assignments to environmental state estimates. We articulate the generalizations by citing probability distributions and pdfs over mathematical entities. But these purely mathematical functions are artefacts of our measurement units. They reflect our idiosyncratic measurement conventions, not the underlying psychological reality. They do not yield any explanatorily significant level of non-representational psychological description. They are tools for describing how the perceptual system allocates probabilities over a hypothesis space whose elements are individuated representationally. A Bayesian perceptual model has an abstract mathematical form, but this form does not secure explanatorily significant non-representational descriptions of perceptual states.

What if we identify the privileged measurement units used by the perceptual system? Can’t we assign explanatory priority to a pdf defined over those units? And won’t the resulting theory be non-representational?

One problem with this suggestion is that the perceptual system may not employ measurement units. In Peacocke’s (1992) terminology, perceptual representation may be ‘unit-free’. As far as we know, for example, the visual system may form a depth-estimate without denominating that estimate in feet, metres, or any other measurement units (although we use units to describe the estimate’s accuracy-condition). Admittedly, we may eventually discover that the perceptual system employs measurement units. It is difficult to anticipate how such a discovery might impact perceptual psychology. At present, the matter is speculative. All we can say for sure is that current Bayesian models do not attribute measurement units to the perceptual system. Current science posits probabilistic updating over perceptual hypotheses. It individuates the hypotheses partly through the specific environmental properties they represent.

9 Open Questions

Bayesian perceptual psychology raises numerous further questions, many on the border between philosophy and science. A few examples:

What neural mechanisms implement, or approximately implement, the computations posited by Bayesian models?

Does the Bayesian paradigm generalize from perception to cognition?

Can Bayesian models illuminate the relation between normativity and intentionality?

Can Bayesian models illuminate what it is to represent the external world?

Philosophers who pursue these questions will discover an imposing scientific literature that rewards intensive foundational analysis. 27

Adams, W., Graf, E., and Ernst, M. (2004). ‘Experience Can Change the “Light-From-Above” Prior’. Nature Neuroscience, 7, 1,057–1,058.

Beierholm, U., Quartz, S., and Shams, L. (2009). ‘Bayesian Priors Are Encoded Independently from Likelihoods in Human Multisensory Perception’. Journal of Vision, 9, 1–9.

Bradley, P. (2008). ‘Constancy, Categories, and Bayes’. Philosophical Psychology, 21, 601–627.

Brainard, D. (2009). ‘Bayesian Approaches to Color Vision’. In M. Gazzaniga (ed.), The Visual Neurosciences, 4th edn (pp. 395–408). Cambridge, MA: MIT Press.

Brewer, B. (2007). ‘How to Account for Illusion’. In F. Macpherson and A. Haddock (eds), Disjunctivism: Perception, Action, Knowledge (pp. 168–180). Oxford: Oxford University Press.

Burge, T. (2011). ‘Disjunctivism Again’. Philosophical Explorations, 14, 43–80.

Burge, T. (2005). ‘Disjunctivism and Perceptual Psychology’. Philosophical Topics, 33, 1–78.

Burge, T. (2010). Origins of Objectivity. Oxford: Oxford University Press.

Campbell, J. (2010). ‘Demonstrative Reference, the Relational View of Experience, and the Proximality Principle’. In R. Jeshion (ed.), New Essays on Singular Thought (pp. 193–212). Oxford: Oxford University Press.

Chalmers, D. (2006). ‘Perception and the Fall from Eden’. In T. Gendler and J. Hawthorne (eds), Perceptual Experience (pp. 49–125). Oxford: Oxford University Press.

Churchland, P. (1981). ‘Eliminative Materialism and the Propositional Attitudes’. Journal of Philosophy, 78, 67–90.

Clark, A. (2013). ‘Whatever Next? Predictive Brains, Situated Agents, and the Future of Cognitive Science’. Behavioral and Brain Sciences, 36, 181–204.

Colombo, M. and Seriès, P. (2012). ‘Bayes in the Brain—On Bayesian Modeling in Neuroscience’. British Journal for the Philosophy of Science, 63, 697–723.

Davies, M. (1992). ‘Perceptual Content and Local Supervenience’. Proceedings of the Aristotelian Society, 92, 21–45.

Dennett, D. (1987). The Intentional Stance. Cambridge, MA: MIT Press.

Egan, F. (2010). ‘Computational Models: A Modest Role for Content’. Studies in History and Philosophy of Science, 41, 253–259.

Egan, F. (1992). ‘Individualism, Computation, and Perceptual Content’. Mind, 101, 443–459.

Egan, F. (2009). ‘Vision’. In E. Craig (ed.), Routledge Encyclopedia of Philosophy Online. <www.rep.routledge.com>.

Ernst, M. (2010). ‘Eye Movements: Illusions in Slow Motion’. Current Biology, 20, R357–R359.

Field, H. (2001). Truth and the Absence of Fact. Oxford: Oxford University Press.

Fodor, J. (2008). LOT 2. Oxford: Oxford University Press.

Fodor, J. and Pylyshyn, Z. (1981). ‘How Direct is Visual Perception? Some Reflections on Gibson’s “Ecological Approach”’. Cognition, 9, 139–196.

Geisler, W. (2008). ‘Visual Perception and the Statistical Properties of Natural Scenes’. Annual Review of Psychology, 59, 167–192.

Geisler, W. and Kersten, D. (2002). ‘Illusions, Perception, and Bayes’. Nature Neuroscience, 5, 508–510.

Gibson, J. J. (1979). The Ecological Approach to Visual Perception. Boston, MA: Houghton Mifflin.

Helmholtz, H. von. (1867). Handbuch der Physiologischen Optik. Leipzig: Voss.

Horgan, T. and Tienson, J. (2002). ‘The Intentionality of Phenomenology and the Phenomenology of Intentionality’. In D. Chalmers (ed.), Philosophy of Mind: Classical and Contemporary Readings (pp. 520–533). Oxford: Oxford University Press.

Jacobs, R. (1999). ‘Optimal Integration of Texture and Motion Cues to Depth’. Vision Research, 39, 3,621–3,629.

Kersten, D. and Mamassian, P. (2009). ‘Ideal Observer Theory’. In L. Squire (ed.), Encyclopedia of Neuroscience, vol. 5 (pp. 89–95). Oxford: Oxford University Press.

Kersten, D. and Schrater, P. (2002). ‘Pattern Inference Theory: A Probabilistic Approach to Vision’. In D. Heyer and R. Mausfeld (eds), Perception and the Physical World (pp. 191–228). New York: Wiley and Sons.

Knill, D. (2007). ‘Learning Bayesian Priors for Depth Perception’. Journal of Vision, 7, 1–20.

Knill, D. and Pouget, A. (2004). ‘The Bayesian Brain: The Role of Uncertainty in Neural Coding and Computation’. Trends in Neurosciences, 27, 712–719.

Knill, D. and Richards, W. (eds) (1996). Perception as Bayesian Inference. Cambridge: Cambridge University Press.

Maloney, L. and Mamassian, P. (2009). ‘Bayesian Decision Theory as a Model of Human Visual Perception: Testing Bayesian Transfer’. Visual Neuroscience, 26, 147–155.

Mamassian, P., Landy, M., and Maloney, L. (2002). ‘Bayesian Modeling of Visual Perception’. In R. Rao, B. Olshausen, and M. Lewicki (eds), Probabilistic Models of the Brain (pp. 13–36). Cambridge, MA: MIT Press.

Marr, D. (1982). Vision. New York: Freeman.

Martin, M. (2004). ‘The Limits of Self-Awareness’. Philosophical Studies, 120, 37–89.

Matthen, M. (2005). Seeing, Doing, and Knowing. Oxford: Clarendon Press.

McDowell, J. (2010). ‘Tyler Burge on Disjunctivism’. Philosophical Explorations, 13, 243–255.

Peacocke, C. (1992). ‘Scenarios, Concepts and Perception’. In T. Crane (ed.), The Contents of Experience (pp. 105–135). Cambridge: Cambridge University Press.

Quine, W. V. (1960). Word and Object. Cambridge, MA: MIT Press.

Rescorla, M. (2012). ‘How to Integrate Representation into Computational Modeling, and Why We Should’. Journal of Cognitive Science, 13, 1–38.

Rock, I. (1983). The Logic of Perception. Cambridge, MA: MIT Press.

Segal, G. (1991). ‘In Defense of a Reasonable Individualism’. Mind, 100, 485–494.

Seydell, A., Knill, D., and Trommershäuser, J. (2011). ‘Prior and Learning in Cue Integration’. In J. Trommershäuser, K. Körding, and M. Landy (eds), Sensory Cue Integration (pp. 155–172). Oxford: Oxford University Press.

Siegel, S. (2011). ‘The Contents of Perception’. In E. Zalta (ed.), The Stanford Encyclopedia of Philosophy. <plato.stanford.edu/archives/spr2011/entries/perception-contents>.

Silverberg, A. (2006). ‘Chomsky and Egan on Computational Theories of Vision’. Minds and Machines, 16, 495–524.

Stich, S. (1983). From Folk Psychology to Cognitive Science. Cambridge, MA: MIT Press.

Stone, J. (2011). ‘Footprints Sticking Out of the Sand, Part 2: Children’s Bayesian Priors for Shape and Lighting Direction’. Perception, 40, 175–190.

Thompson, B. (2010). ‘The Spatial Content of Experience’. Philosophy and Phenomenological Research, 81, 146–184.

Travis, C. (2004). ‘The Silence of the Senses’. Mind, 113, 57–94.

Trommershäuser, J., Körding, K., and Landy, M. (eds) (2011). Sensory Cue Integration. Oxford: Oxford University Press.

Weiss, Y., Simoncelli, E., and Adelson, E. (2002). ‘Motion Illusions as Optimal Percepts’. Nature Neuroscience, 5, 598–604.

Wolpert, D. (2007). ‘Probabilistic Models in Human Sensorimotor Control’. Human Movement Science, 26, 511–524.

Bayesian perceptual psychology generalizes signal detection theory, which was developed in the 1950s. For comparison of the two frameworks, see Kersten and Schrater (2002, 193–199).

On the distinction between the perceiver and her perceptual system, see Burge (2010, 23–24; 2011, 68–69).

When the hypothesis space is continuous, p(h) is a probability density function. See below for details. For ease of exposition, I often blur the distinction between probability and probability density.

There is an unfortunate tendency among scientists and even some philosophers to conflate Bayes’s Theorem and Bayes’s Rule. The former is an easily provable mathematical theorem. The latter is a prescriptive norm that dictates how to reallocate probabilities in light of new evidence.

The models described in this paragraph assume diffusely illuminated flat matte surfaces. To handle other viewing conditions, we must replace R(λ) with a more complicated surface reflectance property, such as a bidirectional reflectance distribution.

Current models describe perception of surface colour. As Matthen (2005 , 176) emphasizes, colour perception also responds to transmitted colour (e.g. stained-glass windows) and coloured light sources. Thus, we should not identify colours with surface reflectance properties. Should we identify colours with other, possibly disjunctive, physical properties? Maybe. But the Bayesian models I am describing do not presuppose a physicalist reduction of colour. One might combine those models with various metaphysical views of colour, such as that colours are dispositions to cause sensations in normal human perceivers, or such as Matthen’s (2005)   pluralistic realism . Current Bayesian models assume no particular metaphysics of colour. They simply assume that human surface colour perception involves estimation of surface reflectance, as informed by an estimate of background illumination.

Cf. Burge’s ‘Proximality Principle’ (2005).

Motor efference copy figures most prominently in Bayesian models of sensorimotor control ( Wolpert, 2007 ).

There are exceptions, such as Knill (2007) .

In some cases, the priors reflect non-obvious statistical regularities about the environment ( Geisler, 2008 ). In other cases, a satisfying explanation awaits discovery. An example: somewhat mysteriously, the perceptual system assumes that the light source is located overhead and slightly to the left ( Mamassian, Landy, and Maloney, 2002 ). One question in this area concerns informational encapsulation : to what extent can cognition influence the priors?

Cue combination provides another good illustration. The perceptual system typically receives multiple cues, often through different sensory modalities, regarding a single environmental variable. Bayesian perceptual psychology offers a unified framework for explaining diverse cases of intermodal and intramodal sensory fusion: visual and auditory cues to location; visual and proprioceptive cues to limb position; conflicting visual cues to depth; and so on. See Trommershäuser, Körding, and Landy (2011) for an overview.

More technically: the likelihood p(e | h), considered as a function of h for fixed e, has higher variance when the retinal image e has lower contrast.

Burge discusses several Bayesian perceptual models, but he does not discuss their specifically Bayesian features. Bradley (2008) defends representationalism by citing Bayesian models of colour perception.

For a survey of philosophical approaches to perceptual content, see Siegel (2011) .

There are additional phenomena in a similar vein that favour realism towards prior probabilities and likelihoods ( Seydell, Knill, and Trommershäuser, 2011 ; Beierholm, Quartz, and Shams, 2009 ). Realism towards the utility function seems well-supported for Bayesian models of bodily motion ( Maloney and Mamassian, 2009 ). I am less sure about the utility functions that figure in Bayesian models of perception. Moderate instrumentalism may be more promising for that case.

Clark (2013) raises the same worry.

As noted above, one might hold that the final percept represents colour but not reflectance. However, this suggestion provides no support for phenomenal content. If one perceives a surface as a specific colour, then one’s percept is veridical only if the surface has that colour. Since Invert’s percept is veridical, and since the perceived surface is green, Invert does not perceive the surface as red. So Nonvert perceives a surface as red, while Invert does not perceive a surface as red. There is no basis here for type-identifying the relevant percepts.

One can individuate perceptual states through the environmental properties they represent without individuating them through the environmental particulars they represent. Burge (2010) introduces an individuative scheme for perceptual content along these lines. To illustrate, suppose that a percept attributes convexity to object O . According to Burge, any percept expressing the same content must also represent convexity. But a percept might express that same content while attributing convexity to a distinct object O* . Or a percept expressing that same content may involve a referential illusion, in which case it does not successfully attribute convexity to any object.

Silverberg (2006) argues that Egan misinterprets Marr. Egan (2009) discusses Bayesian models of perception but does not discuss how they bear upon her views regarding non-intentional computational modelling.

Rescorla (2012) relates these points to the computational models employed within CS and AI.

For discussion of possible neural mechanisms, see Clark (2013) and Knill and Pouget (2004) .

The details of Field’s discussion raise further doubts about the conjecture. He claims that there is no viable interpersonal notion of type-identity for mental representation tokens (2001, 75, fn. 3). In other words, Field’s favoured taxonomic scheme cannot type-identify the mental states of distinct creatures. This result is incompatible with current perceptual psychology, which routinely type-identifies the perceptual states of distinct creatures. How could any serious science of perception do otherwise?

Not everyone accepts that there exist creatures Visua and Twin-Visua satisfying these assumptions. In particular, Segal (1991) denies that perceptual states of neurophysiological twins can represent different environmental properties. For the sake of argument, I grant Egan’s description of the thought experiment.

A function $X : \Omega \to \mathbb{R}$ is measurable just in case, for every Borel set $B \subseteq \mathbb{R}$, $X^{-1}(B)$ belongs to the σ-algebra. One can generalize the definition of random variable to include functions from $\Omega$ to mathematical structures besides the real numbers. For ease of exposition, I focus on real-valued random variables. Consideration of generalized random variables would not alter my main conclusions.

Let $P$ be a probability measure, let $X : \Omega \to \mathbb{R}$ be a random variable, and let $B \subseteq \mathbb{R}$ be a Borel set. Then we define a probability distribution $P_X$ by $P_X(B) = P(X^{-1}(B))$.

If $P$ is a probability distribution, and if $\rho(x)$ is an associated pdf, then $P([a, b]) = \int_a^b \rho(x)\,dx$.

I am indebted to Mohan Matthen and Susanna Siegel for comments that vastly improved this entry. I have also benefited from discussion of these issues with Tyler Burge, John Campbell, Kevin Falvey, Ian Nance, Christopher Peacocke, and Tamar Weber.

  • Front Psychol

Attention and Conscious Perception in the Hypothesis Testing Brain

Jakob Hohwy

1 Department of Philosophy, Monash University, Melbourne, VIC, Australia

Conscious perception and attention are difficult to study, partly because their relation to each other is not fully understood. Rather than conceiving and studying them in isolation from each other it may be useful to locate them in an independently motivated, general framework, from which a principled account of how they relate can then emerge. Accordingly, these mental phenomena are here reviewed through the prism of the increasingly influential predictive coding framework. On this framework, conscious perception can be seen as the upshot of prediction error minimization and attention as the optimization of precision expectations during such perceptual inference. This approach maps on well to a range of standard characteristics of conscious perception and attention, and can be used to interpret a range of empirical findings on their relation to each other.

Introduction

The nature of attention is still unresolved, the nature of conscious perception is still a mystery – and their relation to each other is not clearly understood. Here, the relation between attention and conscious perception is reviewed through the prism of predictive coding. This is the idea that the brain is essentially a sophisticated hypothesis tester (Helmholtz, 1860 ; Gregory, 1980 ), which continually and at multiple spatiotemporal scales seeks to minimize the error between its predictions of sensory input and the actual incoming input (see Mumford, 1992 ; Friston, 2010 ). On this framework, attention and perception are two distinct, yet related aspects of the same fundamental prediction error minimization mechanism. The upshot of the review here is that together they determine which contents are selected for conscious presentation and which are not. This unifies a number of experimental findings and philosophical issues on attention and conscious perception, and puts them in a different light. The prediction error minimization framework transpires as an attractive, if yet still speculative, approach to attention and consciousness, and their relation to each other.

Attention is difficult to study because it is multifaceted and intertwined with conscious perception. Thus, attention can be endogenous (more indirect, top-down, or motivationally driven) or exogenous (bottom-up, attention grabbing); it can be focal or global; it can be directed at objects, properties, or spatial or temporal regions, and so on (Watzl, 2011a , b ). Attentional change often seems accompanied by a change in conscious perception such that what grabs attention is a new stimulus, and such that whatever is attended to also populates consciousness. It can therefore be difficult to ascertain whether an experimental manipulation intervenes cleanly on attention or whether it intervenes on consciousness too (Van Boxtel et al., 2010 ).

Consciousness is difficult to study, partly because of the intertwinement with attention and partly because it is multifaceted too. Consciousness can apply to an overall state (e.g., awake vs. dreamless sleep) or a particular representation (e.g., conscious vs. unconscious processing of a face) all somehow tied together in the unity of the conscious stream (Bayne, 2010 ); it can pertain to the notion of a self (self-awareness) or just to being conscious (experience), and so on (Hohwy and Fox, 2012 ) 1 . There are widely accepted tools for identifying the neural correlates of conscious experience, though there is also some controversy about how cleanly they manipulate conscious states rather than a wide range of other cognitive processes (Hohwy, 2009 ). In the background is the perennial, metaphysical mind–body problem (Chalmers, 1996 ), which casts doubt on the possibility of ever achieving a fundamentally naturalist understanding of consciousness; (we will not discuss any metaphysics in this paper, however).

Functionally, attention is sometimes said to be an “analyzer,” dissecting and selecting among the many possible and often competing percepts one has at any given time. Consciousness in contrast seems to be a “synthesizer,” bringing together and organizing our multitudinous sensory input at any given time (Van Boxtel et al., 2010 ). On the other hand, attention may bring unity too, via binding (Treisman and Gelade, 1980 ), and consciousness also has a selective role when ambiguities in the sensory input are resolved in favor of one rather than the other interpretation, as seems to happen in binocular rivalry.

Attention and consciousness, then, are both difficult to define, to operationalize in functional terms, and to manipulate experimentally. Part of the trouble here has to do with the phenomena themselves, and possibly even their metaphysical underpinnings. But a large part of the trouble seems due to their intertwined relations. It is difficult to resolve these issues by appeal to commonsense or empirically informed conceptual analyses of each phenomenon in isolation from the other. For this reason it may be fruitful to appeal to a very general theoretical framework for overall brain function, such as the increasingly influential prediction error minimization approach, and review whether it implies coherently related phenomena with a reasonable fit to attention and conscious perception.

Section “ Aspects of Prediction Error Minimization ” describes heuristically the prediction error minimization approach. Section “ Prediction Error and Precision ” focuses on two aspects of this approach, here labeled accuracy and precision, and maps these onto perceptual inference and attention. Section “ Conscious Perception and Attention as Determined by Precise Prediction Error Minimization ” outlines why this mapping might be useful for understanding conscious perception and its relation to attention. In Section “ Interpreting Empirical Findings in the Light of Attention as Precision Optimization ,” the statistical dimensions of precision and accuracy are used to offer interpretations of empirical studies of the relation between attention and consciousness. The final section briefly offers some broader perspectives.

Aspects of Prediction Error Minimization

Two things motivate the idea of the hypothesis testing brain: casting a core task for the brain in terms of causal inference, and then appealing to the problem of induction.

The brain needs to represent the world so we can act meaningfully on it, that is, it has to figure out what in the world causes its sensory input. Representation is thereby a matter of causal inference. Causal inference however is problematic since a many–many relation holds between cause and effect: one cause can have many different effects, and one effect can have many different causes. This is the kernel of Hume’s problem of induction (Hume, 1739–1740 , Book I, Part III, Section vi): cause and effect are distinct existences and there are no necessary connections between distinct existences. Only with the precarious help of experience can the contingent links between them be revealed.

For the special case of the brain’s attempt to represent the world, the problem of induction concerns how causal inference can be made “backwards” from the effects given in the sensory input to the causes in the world. This is the inverse problem, and it has a deep philosophical sting in the case of the brain. The brain never has independent access to both cause and effect because to have that it would already have had to solve the problem of representation. So it cannot learn from experience by just correlating occurrences of the two. It only has the effects to go by so must somehow begin the representational task de novo .

The prediction error minimization approach resolves this problem in time. The basic idea, described heuristically here, is simple whereas the computational details are complex (Friston, 2010 ). Sensory input is not just noise but has repeatable patterns. These patterns can give rise to expectations about subsequent input. The expectations can be compared to that subsequent input and the difference between them be measured. If there is a tight fit, then the pattern generating the expectation has captured a pattern in the real world reasonably well (i.e., the difference was close to expected levels of irreducible noise). If the fit is less good, that is, if there is a sizeable prediction error, then the states and parameters of the hypothesis or model of the world generating the expectation should be revised so that subsequent expectations will, over time, get closer to the actual input.
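The heuristic loop just described can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions that are not part of the framework itself: a single scalar hidden cause, an identity mapping from cause to input, Gaussian noise, and a fixed learning rate. The function name and all numerical values are hypothetical.

```python
import random


def minimize_prediction_error(inputs, initial_estimate=0.0, learning_rate=0.1):
    """Revise a scalar estimate of the hidden cause after each input.

    The current estimate serves as the prediction of the next input;
    the estimate is then nudged in the direction of the prediction error.
    """
    estimate = initial_estimate
    errors = []
    for observed in inputs:
        predicted = estimate              # expectation about the next input
        error = observed - predicted      # prediction error
        errors.append(error)
        estimate += learning_rate * error  # revise the model
    return estimate, errors


# Hypothetical world: a fixed cause of 5.0 observed through noisy input.
rng = random.Random(0)
true_cause = 5.0
inputs = [true_cause + rng.gauss(0.0, 0.5) for _ in range(500)]

final_estimate, errors = minimize_prediction_error(inputs)
early_error = sum(abs(e) for e in errors[:20]) / 20
late_error = sum(abs(e) for e in errors[-20:]) / 20

print(final_estimate)
print(early_error, late_error)
```

Over time the average error falls from the size of the initial mismatch toward the level of the irreducible noise, at which point the estimate has captured the pattern in the input about as well as it can.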

This idea can be summed up in the simple dictum that to resolve the inverse problem all that is needed is prediction error minimization. Expected statistical patterns are furnished by generative models of the world and instead of attempting the intractable task of inverting these models to extract causes from generated effects, prediction error minimization ensures that the model recapitulates the causal structure of the world and is implicitly inverted; providing a sufficient explanation for sensory input.

This is consistent with a Bayesian scheme for belief revision in the light of new evidence, and indeed both Bayes and Laplace (before he founded classical frequentist statistics) developed their theories in response to the Humean-inspired inverse problem (McGrayne, 2011). The idea is to weight credence in an existing model of the world by how tightly it fits the evidence (i.e., the likelihood, or how well it predicts the input) as well as how likely the model is in the first place (i.e., the prior probability, or what the credence for the model was before the evidence came in).
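As a minimal sketch of this weighting, the following Python fragment computes posterior credences over two candidate models by multiplying each prior by its likelihood and normalizing. The model names and all probabilities are hypothetical values chosen for illustration.

```python
def posterior(priors, likelihoods):
    """Posterior credence for each model: prior times likelihood, normalized."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


# Hypothetical credences before the evidence came in.
priors = {"model_a": 0.7, "model_b": 0.3}
# Hypothetical fit: how well each model predicts the observed input.
likelihoods = {"model_a": 0.2, "model_b": 0.6}

post = posterior(priors, likelihoods)
print(post)
```

Here the initially favored model loses ground because it predicts the input poorly: its credence drops from 0.7 to about 0.44, while the better-fitting model rises from 0.3 to about 0.56.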

The inverse problem is then resolved because, even though there is a many–many relation between causes in the world and sensory effects, some of the relations are weighted more than others in an optimally Bayesian way. The problem is solved de novo , without presupposing prior representational capability, because the system is supervised not by another agent, nor by itself, but by the very statistical regularities in the world it is trying to represent.

This key idea is then embellished in a number of different ways, all of which have bearing on attention and conscious perception.

The prediction error minimization mechanism sketched above is a general type of statistical building block that is repeated throughout levels of the cortical hierarchy such that there is recurrent message passing between levels (Mumford, 1992 ). The input to the system from the senses is conceived as prediction error and what cannot be predicted at one level is passed on to the next. In general, low levels of the hierarchy predict basic sensory attributes and causal regularities at very fast, millisecond, time scales, and more complex regularities, at increasingly slower time scales, are dealt with at higher levels (Friston, 2008 ; Kiebel et al., 2008 , 2010 ; Harrison et al., 2011 ). Prediction error is concurrently minimized across all levels of the hierarchy, and this unearths the states and parameters that represent the causal structure and depth of the world.

Contextual probabilities

Predictions at any level are subject to contextual modulation. This can be via lateral connectivity, that is, by predictions or hypotheses at the same hierarchical level, or it can be through higher level control parameters shaping low level predictions by taking slower time scale regularities into consideration. For example, the low level dynamics of birdsong is controlled by parameters from higher up pertaining to slower regularities about the size and strength of the bird doing the singing (Kiebel et al., 2010 ). Similarly, it may be that the role of gist perception is to provide contextual clues for fast classification of objects in a scene (Kveraga et al., 2007 ). The entire cortical hierarchy thus recapitulates the causal structure of the world, and the bigger the hierarchy the deeper the represented causal structure.

Empirical Bayes

For any appeal to Bayes, the question arises of where the priors come from (Kersten et al., 2004). One scheme for answering this, and evading charges of excessive subjectivity, is empirical Bayes, where priors are extracted from hierarchical statistical learning (see, e.g., Casella, 1992). In the predictive coding scheme this does not mean going beyond Bayes to frequentism. (Empirical) priors are sourced from higher levels in the hierarchy, assuming they are learned in an optimally Bayesian fashion (Friston, 2005). The notion of hierarchical inference is crucial here, and enables the brain to optimize its prior beliefs on a moment-to-moment basis. Many of these priors would be formed through long-term exposure to sensory contingencies through a creature’s existence, but it is also likely that some priors are more hard-wired and instantiated over an evolutionary time-scale; different priors should therefore be malleable to different extents by the creature’s sensation.
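The statistical idea behind empirical Bayes can be illustrated with a textbook-style Python sketch, which is not a model of the neural scheme. It assumes a set of noisy observations of related quantities, a known noise variance, and normal distributions throughout; the function name and the numbers are hypothetical. The prior is not stipulated in advance but estimated from the pooled data, and each observation is then pulled toward it.

```python
from statistics import mean, pvariance


def empirical_bayes_shrinkage(observations, noise_variance):
    """Estimate a normal prior from the data, then shrink each observation.

    Assumes noise_variance > 0 and that each observation is a true value
    drawn from a common normal prior plus independent normal noise.
    """
    prior_mean = mean(observations)
    # Method of moments: observed variance = prior variance + noise variance.
    prior_variance = max(pvariance(observations) - noise_variance, 0.0)
    # Weight given to the data relative to the estimated prior.
    weight = prior_variance / (prior_variance + noise_variance)
    return [prior_mean + weight * (x - prior_mean) for x in observations]


observations = [1.0, 2.0, 3.0, 4.0, 5.0]
shrunk = empirical_bayes_shrinkage(observations, noise_variance=1.0)
print(shrunk)
```

With these illustrative numbers the estimated prior has mean 3 and variance 1, so each observation is moved halfway toward the pooled mean. The noisier the input is assumed to be, the more the learned prior dominates.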

Free energy

In its most general formulation, prediction error minimization is a special case of free energy minimization, where free energy (the sum of squared prediction error) is a bound on information theoretical surprise (Friston and Stephan, 2007 ). The free energy formulation is important because it enables expansion of the ideas discussed above to a number of different areas (Friston, 2010 ). Here, it is mainly the relation to prediction error minimization that will be of concern. Minimizing free energy minimizes prediction error and implicitly surprise. The idea here is that the organism cannot directly minimize surprise. This is because there is an infinite number of ways in which the organism could seek to minimize surprise and it would be impossibly expensive to try them out. Instead, the organism can test predictions against the input from the world and adjust its predictions until errors are suppressed. Even if the organism does not know what will surprise it, it can minimize the divergence between its expectations and the actual inputs encountered. A frequent objection to the framework is that prediction error and free energy more generally can be minimized by committing suicide since nothing surprises a dead organism. The response is that the moment an organism dies it experiences a massive increase in free energy, as it decomposes and is unable to predict anything (there is more to say on this issue, see Friston et al., in press ; there is also a substantial issue surrounding how these types of ideas can be reconciled with evolutionary ideas of survival and reproduction, for discussion see, Badcock, 2012 ).
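The claim that free energy bounds surprise can be checked numerically in a small discrete case. The sketch below uses the standard variational definition of free energy, which is an assumption about the intended formulation (the sum-of-squared-error form mentioned above arises under additional Gaussian assumptions). The two candidate causes and their joint probabilities with a fixed input are hypothetical.

```python
import math


def free_energy(q, joint):
    """Variational free energy: sum over h of q(h) * (ln q(h) - ln p(h, s))."""
    return sum(
        q[h] * (math.log(q[h]) - math.log(joint[h])) for h in q if q[h] > 0
    )


def surprise(joint):
    """Surprise -ln p(s), where p(s) is the sum over h of p(h, s)."""
    return -math.log(sum(joint.values()))


# Hypothetical joint probabilities p(h, s) for one fixed input s.
joint = {"cause_a": 0.30, "cause_b": 0.10}
p_s = sum(joint.values())

# Two candidate beliefs about the causes: a poor guess and the exact posterior.
poor_guess = {"cause_a": 0.5, "cause_b": 0.5}
exact_posterior = {h: value / p_s for h, value in joint.items()}

f_poor = free_energy(poor_guess, joint)
f_exact = free_energy(exact_posterior, joint)
s = surprise(joint)

print(f_poor, f_exact, s)
```

Free energy exceeds surprise for the poor guess and equals it when the belief matches the exact posterior. This is the sense in which revising the model tightens the bound without the organism ever evaluating surprise directly.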

Active inference

A system without agency cannot minimize surprise but only optimize its models of the world by revising those models to create a tight free energy bound on surprise. To minimize the surprise it needs to predict how the system’s own intervention in the world (e.g., movement) could change the actual input such as to minimize free energy. Agency, in this framework, is a matter of selectively sampling the world to ensure prediction error minimization across all levels of the cortical prediction hierarchy (Friston et al., 2009 , 2011 ). To take a toy example: an agent sees a new object such as a bicycle, the bound on this new sensory surprise is minimized, and the ensuing favored model of the world lets the agent predict how the prediction error landscape will change given his or her intervention (e.g., when walking around the bike). This prediction gives rise to a prediction error that is not minimized until the agent finds him or herself walking around the bike, hence the label “active inference.” If the initial model was wrong, then active inference fails to be this kind of self-fulfilling prophecy (e.g., it was a cardboard poster of a bike). Depending on the depth of the represented causal hierarchy this can give rise to very structured behavior (e.g., not eating all your food now even though you are hungry and instead keeping some for winter, based on the prediction this will better minimize free energy).

There is an intuitive seesaw dynamic here between minimizing the bound and actively sampling the world. It would be difficult to predict efficiently what kind of sampling would minimize surprise if the starting point was a very poor, inaccurate, bound on surprise. Similarly, insofar as selective sampling never perfectly minimizes surprise, new aspects of the world are revealed, which should lead to revisiting the bound on surprise. It thus pays for the system to maintain both perceptual and active inference.

Top-down and bottom-up

This framework comes with a re-conceptualization of the functional roles of the bottom-up driving signal from the senses, and the top-down or backward modulatory signal from higher levels. The bottom-up signal is not sensory information per se but instead just prediction error. The backward signal embodies the causal model of the world and the bottom-up prediction error is then essentially the supervisory feedback on the model (Friston, 2005 ). It is in this way the sensory input ensures the system is supervised, not by someone else nor by itself, but by the statistical regularities of the world.

The upshot is an elegant framework, which is primarily motivated by principled, philosophical and computational concerns about representation and causal inference. It is embellished in a number of ways that capture many aspects of sensory processing such as context-dependence, the role of prior expectations, the way perceptual states comprise sensory attributes at different spatiotemporal resolutions, and even agency. We shall appeal to all these elements as predictive coding is applied to attention and conscious perception.

Prediction Error and Precision

As discussed above, there are two related ways in which prediction error can be minimized: either by changing the states and parameters of the internal, generative model in the light of prediction error, or by keeping the model constant and selectively sampling the world, thereby changing the input. Both ways enable the model to have what we shall here call accuracy: the more prediction error is minimized, the more the causal structure of the world is represented.²
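The two routes can be contrasted in a deliberately simple sketch. It assumes a single scalar estimate `mu`, a single scalar input `s`, and a fixed update rate; the function names and the linear update rule are hypothetical choices made for illustration, not part of the framework as published.

```python
def perceptual_inference(mu, s, rate=0.5, steps=20):
    """Route 1: the input s stays fixed and the model's estimate mu is revised."""
    for _ in range(steps):
        error = s - mu          # prediction error
        mu += rate * error      # revise the model toward the input
    return mu, s


def active_inference(mu, s, rate=0.5, steps=20):
    """Route 2: the estimate mu stays fixed and action changes the sampled input s."""
    for _ in range(steps):
        error = s - mu          # prediction error
        s -= rate * error       # act so that the input moves toward the prediction
    return mu, s


mu1, s1 = perceptual_inference(mu=0.0, s=4.0)
mu2, s2 = active_inference(mu=0.0, s=4.0)
print("perceptual inference: mu =", mu1, " s =", s1)
print("active inference:     mu =", mu2, " s =", s2)
```

Both routes drive the same error term toward zero. They differ only in which side of the comparison is allowed to change.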

So far, this story leaves out a crucial aspect of perceptual inference concerning variability of the prediction error. Prediction error minimization of the two types just mentioned assumes noise to be constant, and the variability of all prediction errors therefore the same. This assumption does not actually hold as noise or uncertainty is state dependent. Prediction error that is unreliable due to varying levels of noise in the states of the world is not a learning signal that will facilitate confident veridical revision of generative models or make it likely that selective sampling of the world is efficient. Prediction error minimization must therefore take variability in prediction error messaging into consideration – it needs to assess the precision of the prediction error.

Predictions are tested in sensory sampling: given the generative model a certain input is predicted where this input can be conceived as a distribution of sensory samples. If the actual distribution is different from the expected distribution, then a prediction error is generated. One way to assess a difference in distributions is to assess central tendency such as the mean. However, as is standard in statistical hypothesis testing, even if the means seem different (or not) the variability may preclude a confident conclusion that the two distributions are different (or not). Hence, any judgment of difference must be weighed by the magnitude of the variability – this is a requirement for trusting prediction error minimization.

The inverse of variability is the precision (inverse dispersion or variance) of the distribution. In terms of the framework used here, when the system “decides” whether to revise internal models in the light of prediction errors and to sample the world accordingly, those errors are weighted by their precisions. For example, a very imprecise (i.e., noisy, variable) prediction error should not lead to revision, since it is more likely to be a random upshot of noise for a given sensory attribute.

However, the rule cannot be simply that the more the precision the stronger the weight of the prediction error. Our expectations of precision are context dependent. For example, precisions in different sensory modalities differ (for an example, see Bays and Wolpert, 2007 ), and differ within the same modality in different contexts and for different sensory attributes. Sometimes it may be that one relatively broad, imprecise distribution should be weighed more than another narrower, precise distribution. Similarly, an unusually precise prediction error may be highly inaccurate as a result of under-sampling, for example, and should not lead to revision. In general, the precision weighting should depend on prior learning of regularities in the actual levels of noise in the states of the world and the system itself (e.g., learning leading to internal representations of the regularity that sensory precision tends to decline at dusk).

There is then a (second order) perceptual inference problem because the magnitude of precision cannot be measured absolutely. It must be assessed in the light of precision expectations. The consequence is that generative models must somehow embody expectations for the precision of prediction error, in a context dependent fashion. Crucially, the precision afforded a prediction has to be represented; in other words, one has to represent the known unknowns.

If precision expectations are optimized then prediction error is weighted accurately and replicates the precisions in the world. In terms of perceptual inference, the learning signal from the world will have more weight from units expecting precision, whereas top-down expectations will have more influence on perception when processing concerns units expecting a lot of imprecision; one’s preconceptions play a bigger role in making sense of the world when the signal is deemed imprecise (Hesselmann et al., 2010 ). This precision processing is thought to occur in synaptic error processing such that units that expect precision will have more weight (synaptic gain) than units expecting imprecision (Friston, 2009 ).
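The trade-off between bottom-up signal and top-down expectation can be illustrated with the textbook Gaussian case. This is a minimal sketch, assuming a single scalar quantity with a Gaussian prior and a Gaussian sensory sample; the function name and the numbers are hypothetical, and `sensory_precision` stands for the precision the system expects, not a directly measured quantity.

```python
def precision_weighted_update(prior_mean, prior_precision, sample, sensory_precision):
    """Combine a prior expectation with one sensory sample (conjugate Gaussian update)."""
    error = sample - prior_mean                                   # prediction error
    gain = sensory_precision / (prior_precision + sensory_precision)
    posterior_mean = prior_mean + gain * error                    # precision-weighted revision
    posterior_precision = prior_precision + sensory_precision
    return posterior_mean, posterior_precision


prior_mean, prior_precision, sample = 0.0, 1.0, 10.0

# Signal expected to be precise: the estimate moves most of the way to the sample.
precise, _ = precision_weighted_update(prior_mean, prior_precision, sample,
                                       sensory_precision=9.0)

# Signal expected to be imprecise: the estimate stays close to the prior.
noisy, _ = precision_weighted_update(prior_mean, prior_precision, sample,
                                     sensory_precision=1.0 / 9.0)

print("expected-precise signal:  ", precise)
print("expected-imprecise signal:", noisy)
```

The same prediction error of 10 produces a large revision in the first case and a small one in the second, because the gain on the error depends on the expected precision.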

Given a noisy world and non-linear interactions in sensory input, first order statistics (prediction errors) and second order statistics (the precision of prediction errors) are then necessary and jointly sufficient for resolving the inverse problem. In what follows, the optimization of representations is considered in terms of both precision and accuracy. Precision refers to the inverse amplitude of random fluctuations around, or uncertainty about, predictions, while accuracy (with a slight abuse of terminology) will refer to the inverse amplitude of prediction errors per se. Minimizing free energy or surprise implies the minimization of precise prediction errors; in other words, the minimization of the sum of squared prediction error and an optimal estimate of precision.
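Under Gaussian assumptions, the closing statement can be made explicit. The notation below is an assumption introduced for illustration: ε is the vector of prediction errors, s the sensory input, g(μ) the input predicted from the model's current estimate μ, and Π the precision (inverse covariance) matrix. Prior terms and constants are ignored.

```latex
\varepsilon = s - g(\mu), \qquad
F \;\approx\; \tfrac{1}{2}\,\varepsilon^{\top} \Pi\, \varepsilon
\;-\; \tfrac{1}{2}\,\ln \lvert \Pi \rvert \;+\; \text{const}
```

The first term is the precision-weighted sum of squared prediction error. The second term penalizes low precision, so the system cannot make the first term vanish simply by treating every signal as unreliable. For a single error channel with scalar precision π, setting the derivative of F with respect to π to zero gives π = 1/⟨ε²⟩: the optimal precision estimate is the inverse of the average squared prediction error.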

Using the terminology of accuracy and precision is useful because it suggests how the phenomena can come apart in a way that will help in the interpretation of the relation between consciousness and attention. It is a trivial point that precision and accuracy can come apart: a measurement can be accurate but imprecise, as in feeling the child’s fever with a hand on the forehead, or it can be very precise but inaccurate, as when using an ill-calibrated thermometer. This yields two broad dimensions for perceptual inference in terms of predictive coding: accuracy (via expectation of sensory input) and precision (via expectation of variability of sensory input). These can also come apart. Some of the states and parameters of an internal model can be inaccurate and yet precise (being confident that the sound comes from in front of you when it really comes from behind; Jack and Thurlow, 1973). Or they can be accurate and yet imprecise (correctly detecting a faint sound but being uncertain about what to conclude given a noisy background).
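The measurement example can be simulated directly. This is a sketch under stated assumptions: the true temperature, the noise levels, and the calibration offset are invented numbers, and accuracy is read here in the everyday measurement sense (small average error), with precision as the inverse variance of the readings.

```python
import random
import statistics


def summarize(readings, true_value):
    """Return (average error, precision) for a set of repeated readings."""
    bias = statistics.fmean(readings) - true_value      # accuracy: small bias is accurate
    precision = 1.0 / statistics.pvariance(readings)    # precision: inverse variance
    return bias, precision


random.seed(0)
true_temp = 39.0  # hypothetical true temperature in degrees Celsius

# Hand on the forehead: right on average, but very variable.
hand = [random.gauss(true_temp, 1.0) for _ in range(2000)]

# Ill-calibrated thermometer: tightly clustered, but offset by 1.5 degrees.
thermometer = [random.gauss(true_temp + 1.5, 0.05) for _ in range(2000)]

hand_bias, hand_precision = summarize(hand, true_temp)
therm_bias, therm_precision = summarize(thermometer, true_temp)

print("hand:        bias =", round(hand_bias, 3), " precision =", round(hand_precision, 1))
print("thermometer: bias =", round(therm_bias, 3), " precision =", round(therm_precision, 1))
```

The hand is accurate but imprecise, and the thermometer is precise but inaccurate, so neither quantity can be read off from the other.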

With this in mind, assume now that conscious perception is determined by the prediction or hypothesis with the highest overall posterior probability – which is overall best at minimizing prediction error (this assumption is given support in the next section). That is, conscious perception is determined by the strongest “attractor” in the free energy landscape; where, generally speaking, greater precision leads to higher conditional confidence about the estimate and a deeper, more pronounced minimum in the free energy landscape.

On this assumption, precision expectations play a key role for conscious perception. We next note the proposal, which will occupy us in much of the following, that optimization of precision expectations maps on to attention (Friston, 2009 ). It is this mapping that will give substance to our understanding of the relation between attention and consciousness. It is a promising approach because precision processing, in virtue of its relation to accuracy, has the kind of complex relation to prediction error minimization that seems appropriate for capturing both the commonsense notion that conscious perception and attention are intertwined and also the notion that they are separate mechanisms (Koch and Tsuchiya, 2007 ; Van Boxtel et al., 2010 ).

We can usefully think of this in terms of a system such that, depending on context (including experimental paradigms in the lab), sensory estimates may be relatively accurate and precise, inaccurate and imprecise, accurate and imprecise, or inaccurate and precise. With various simplifications and assumptions, this framework can then be sketched as in Figure 1.

Figure 1. Schematic of statistical dimensions of conscious perception. The accuracy afforded by first order statistics refers to the inverse amplitude of prediction errors per se, while the precision afforded by second order statistics refers to the inverse amplitude of random fluctuations around, or uncertainty about, predictions. This allows for a variety of different types of states such that in general, and depending on context, inattentive but conscious states would cluster towards the lower right corner and attentive but unconscious states would cluster towards the upper left; see main text for further discussion.

By and large, conscious perception will be found for states that are both accurate and precise but may also be found for states that are relatively accurate and yet imprecise, and vice versa. Two or more competing internal models or hypotheses about the world can have different constellations of precision and accuracy: a relatively inaccurate but precise model might determine conscious perception over a competing accurate but imprecise model, and vice versa. Similarly, a state can evolve in different ways: it can for example begin by being very inaccurate and imprecise, and thus not determining conscious perception, but attention can raise its conditional confidence and ensure it does get to determine conscious content.

On this framework, it should then also be possible to speak to some of the empirical findings of dissociations between attention and consciousness. A case of attention without consciousness would be where precision expectations are high for a state but prediction error for it is not well minimized (expecting a precise signal, or, expecting inference to be relatively bottom-up driven). A case of consciousness without attention would be where prediction error is well minimized but where precision is relatively low (expecting signals to be variable, or, expecting inference to be relatively top-down driven). It is difficult to say precisely what such states would be like. For example, a conscious, inattentive state might have a noisy, fuzzy profile, such as gist perception may have (Bar, 2007). It is also possible that increased reliance on top-down, prior beliefs could in fact paradoxically sharpen the representational profile (Ross and Burr, 2008).³ In general, in both types of cases, the outcome would be highly sensitive to the context of the overall free energy landscape, that is, to competing hypotheses and their precision expectations.

Section “ Interpreting Empirical Findings in the Light of Attention as Precision Optimization ” will begin the task of interpreting some studies in the field according to these accuracy and precision dimensions. The next section, however, will provide some prima facie motivation for this overall framework.

Conscious Perception and Attention as Determined by Precise Prediction Error Minimization

In this section, conscious perception and attention are dealt with through the prism of predictive coding. Though the evidence in favor of this approach is growing (see the excellent discussion in Summerfield and Egner, 2009), much of this is still speculative.⁴ The core idea is that conscious perception correlates with activity, spanning multiple levels of the cortical hierarchy, which best suppresses precise prediction error: what gets selected for conscious perception is the hypothesis or model that, given the widest context, is currently most closely guided by the current (precise) prediction errors.⁵

Conscious perception can then be thought to be at the service of representing the world, and the currently best internal, generative model is the one that most probably represents the causal structure of the world. Predictions by other models may also be able to suppress prediction error, but less well, so they are not selected. Conversely, some other possible models could often be even better at suppressing prediction error, but if the system has not learnt them yet, or cannot learn them, it must make do with the best model it has.

It follows that the predictions of the currently best model can actually be rather inaccurate. However, if it has no better competitor then it will win and get selected for consciousness. Conscious perception can then be far from veridical, in spite of its representational nature. This makes room for an account of illusory and hallucinatory perceptual content, which is an important desideratum on accounts of conscious perception. These would be cases where, for different reasons, poor models are best at precisely explaining away incoming data only because their competitors are even poorer.

The job of the predictive coding system is to attenuate sensory input by treating it as information theoretical surprise and predicting it as perfectly as possible. As the surprise is attenuated, models should stop being revised and predictive activity progressively cease throughout the hierarchy. This seems consistent with repetition suppression (Grill-Spector et al., 2006 ) where neural activity ceases in response to expected input in a manner consistent with prediction error minimization (Summerfield et al., 2008 ; Todorovic et al., 2011 ). At the limit it should have consequences for conscious perception too. When all the surprise is dealt with, prediction and model revision should cease. If it is also impossible to do further selective sampling then conscious perception of the object in question should cease. This follows from the idea that what we are aware of is the “fantasy” generated by the way current predictions attenuate prediction error; if there is no prediction error to explain away, then there is nothing to be aware of. Presumably there is almost always some input to some consumer systems in the brain (including during dreaming) but conceivably something close to this happens when stabilized retinal images fade from consciousness (Ditchburn and Ginsborg, 1952 ). Because such stimuli move with eye and head movement predictive exploration of them is quickly exhausted.

Conscious perception is often rich in sensory attributes, which are neatly bound together even though they are processed in a distributed manner throughout the brain. The predictive coding account offers a novel approach to this “binding” aspect of conscious perception. Distributed sensory attributes are bound together by the causal inference embodied in the parameters of the generative model. The model assumes, for example, that there is a red ball out there so will predict that the redness and the bouncing object co-occur spatiotemporally. The binding problem (Treisman, 1996 ) is then dealt with by default: the system does not have to operate in a bottom-up fashion and first process individual attributes and then bind them. Instead, it assumes bound attributes and then predicts them down through the cortical hierarchy. If they are actually bound in the states of the world, then this will minimize prediction error, and they will be experienced as such.

It is a nice question here what it means for the model with the highest posterior probability to be “selected for consciousness.” We can only speculate about an answer but it appears that on the predictive coding framework there does not have to be a specific selection mechanism (no “threshold” module, cf. Dennett, 1991 ). When a specific model is the one determining the consciously perceived content it is just because it best minimizes prediction error across most levels of the cortical hierarchy – it best represents the world given all the evidence and the widest possible context. This is the model that should be used to selectively sample the world to minimize surprise in active inference. Competing but less probable models cannot simultaneously determine the target of active inference: the models would be at cross-purposes such that the system would predict more surprise than if it relies on one model alone (for more on the relation between attention and action, see Wu, 2011 ).

Though there remain aspects of consciousness that seem difficult to explain, such as the conscious content of imagery and dreaming, this overall approach to conscious perception does then promise to account for a number of key aspects of consciousness. The case being built here is mainly theoretical. There is not yet much empirical evidence for this link to conscious perception, though a recent dynamical causal modeling study from research in disorders of consciousness (vegetative states and minimally conscious states) suggests that what is required for an individual to be in an overall conscious state is for them to have intact connectivity consistent with predictive coding (Boly et al., 2011 ).

As we saw earlier, in the normal course of events, the system is helped in this prediction error minimization task by precision processing, which (following Feldman and Friston, 2010 ) was claimed to map on to attention such that attention is precision optimization in hierarchical perceptual inference. A prediction error signal will have a certain absolute dispersion but whether the system treats this as precise or not depends on its precision expectations, which may differ depending on context and beliefs about prior precision. Precise prediction errors are reliable signals and therefore, as described earlier, enable a more efficient revision of the model in question (i.e., a tighter bound and better active inference). If that model then, partly resulting from precision optimization, achieves the highest posterior probability, it will determine the content of conscious perception. This begins to capture the functional role often ascribed to attention of being a gating or gain mechanism that somehow optimizes sensory processing (Hillyard and Mangun, 1987 ; Martinez-Trujillo and Treue, 2004 ). As shall be argued now, it can reasonably account for a wider range of characteristics of attention.

Exogenous attention

Stimuli with large spatial contrast and/or temporal contrast (abrupt onset) tend to “grab” attention bottom-up, or exogenously. These are situations where there is a relatively high level of sensory input, that is, a stronger signal. Given an expectation that stronger signals have a better signal to noise ratio (better precision) than weaker signals (Feldman and Friston, 2010, p. 9; Appendix), error units exposed to such signals should thus expect high precision and be given larger gain. As a result, more precise prediction error can be suppressed by the model predicting this new input, which is then more likely to be the overall winner populating conscious experience. Notice that this account does not mention expectations about what the signal stems from, only about the signal’s reliability. Also notice that this account does not guarantee that what has the highest signal to noise ratio will end up populating consciousness; it may well be that other models have higher overall confidence or posterior probability.

Endogenous attention

Endogenous attention is driven more indirectly by probabilistic context. Beginning with endogenous cueing, a central cue pointing left is itself represented with high precision prediction error (it grabs attention) and in the parameters of the generative model this cue representation is related to the representation of a stimulus to the left, via a learned causal link. This reduces uncertainty about what to predict there (increases prior probability for a left target) and it induces an expectation of high precision for that region. When the stimulus arrives, the resulting gain on the error units together with the higher prior help drive a higher conditional confidence for it, making it likely it is quickly selected for conscious perception.

The idea behind endogenous attention is then that it works as an increase in baseline activity of neuronal units encoding beliefs about precision. There is evidence that such increase in activity prior to stimulus onset is specific to precision expectations. The narrow distributions associated with precise processing tell us that in detection tasks the precision-weighted system should tend to respond when and only when the target appears. And indeed such baseline increases do bias performance in favor of hits and correct rejections (Hesselmann et al., 2010 ). In contrast, if increased baseline activity had instead been a matter of mere accumulation of evidence for a specific stimulus (if it had been about accuracy and not precision), then the baseline increase should instead have biased toward hits and false alarms.
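The contrast between the two response patterns can be reproduced in a textbook equal-variance signal detection sketch. This is not the analysis used by Hesselmann et al.; it is an illustration under stated assumptions, in which higher precision is modeled as a smaller noise standard deviation and mere accumulation of evidence is modeled as an additive baseline offset. All parameter names and values are hypothetical.

```python
from statistics import NormalDist


def detection_rates(signal=1.0, noise_sd=1.0, criterion=0.5, baseline=0.0):
    """Response rates for an observer who says 'yes' when evidence + baseline > criterion.

    Evidence is Gaussian with mean `signal` on target trials and mean 0 on
    non-target trials, with standard deviation `noise_sd` in both cases.
    """
    phi = NormalDist().cdf
    hit = 1.0 - phi((criterion - baseline - signal) / noise_sd)
    false_alarm = 1.0 - phi((criterion - baseline) / noise_sd)
    return {"hit": hit, "false_alarm": false_alarm,
            "correct_rejection": 1.0 - false_alarm}


reference = detection_rates()
precise = detection_rates(noise_sd=0.5)    # precision reading: narrower distributions
biased = detection_rates(baseline=0.5)     # accumulation reading: evidence offset

for label, rates in [("reference", reference), ("precise", precise), ("biased", biased)]:
    print(label, {name: round(value, 3) for name, value in rates.items()})
```

Narrowing the distributions raises both hits and correct rejections, whereas an evidence offset raises hits and false alarms together, which is the qualitative difference the argument relies on.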

A recent paper directly supports the role of endogenous attention as precision weighting (Kok et al., 2011). As we have seen, without attention, the better a stimulus is predicted the more attenuated its associated signal should be. Attention should reverse this attenuation because it strengthens the prediction error. However, attention depends on the predictability of the stimulus: there should be no strong expectation that an unpredicted stimulus is going to be precise. So there should be less attention-induced enhancement of the prediction error for unpredicted stimuli than for better predicted stimuli. Using fMRI, Kok et al. very elegantly provide evidence for this interaction in early visual cortex (V1).

In more traditional cases of endogenous attention (e.g., the individual deciding herself to attend left) the cue can be conceived as a desired state, for example, that something valuable will be spotted to the left. This would then generate an expectation of precision for that region such that stimuli located there are more likely to be detected. Endogenous attention of this sort has a volitional aspect: the individual decides to attend and acts on this decision. Such agency can range from sensorimotor interaction and experimentation to a simple decision to fixate on something. This agential aspect suggests that part of attention should belong with active inference (selective sampling to minimize surprise). The idea here would be that the sampling is itself subject to precision weighting. This makes sense since the system will not know if its sampling satisfies expectations unless it can assess the variability in the sampling. Without such an assessment, the system will not know whether to keep sampling on the basis of a given model or whether the bound on the model itself needs to be re-assessed. In support of this, there is emerging evidence that precision expectations are also involved in motor behavior (Brown et al., 2011 ).

Biased competition

An elegant approach to attention begins with the observation that neurons respond optimally to one object or property in their receptive field so that if more than one object is present, activity decreases unless competition between them is resolved. The thought is that attention can do this job, by biasing one interpretation over another (Desimone and Duncan, 1995 ). Attention is thus required to resolve ambiguities of causal inference incurred by the spatial architecture of the system. Accordingly, electrophysiological studies show decreased activity when two different objects are present in a neuron’s receptive field, and return to normal levels of activity when attention is directed toward one of them (Desimone, 1998 ).

The predictive coding framework augmented with precision expectations should be able to encompass biased competition. This is because, as mentioned, precision can modulate perceptual inference when there are two or more competing, and perhaps equally accurate, models. Indeed, computational simulation shows precision-weighted predictive coding can play such a biasing role in a competitive version of the Posner paradigm where attention is directed to a cued peripheral stimulus rather than a competing non-cued stimulus. A central cue thus provides a context for the model containing the cued stimulus as a hidden cause. This drives a high precision expectation for that location, which ensures relatively large gain, and quicker response times, when those error units are stimulated. This computational model nicely replicates psychophysics and electrophysiological findings (Feldman and Friston, 2010 , pp. 14–15).

Attentional competition is then not a matter somehow of intrinsically limited processing resources or of explicit competition. It is a matter of optimal Bayesian inference where only one model of the causal regularities in the world can best explain away the incoming signal, given prior learning, and expectations of state-dependent levels of noise.

Binding of sensory attributes by a cognitive system was mooted above as a natural element of predictive coding. Attention is also thought to play a role for binding (Treisman and Gelade, 1980 ; Treisman, 1998 ) perhaps via gamma activity (Treisman, 1999 ) such that synchronized neurons are given greater gain. Again, this can be cast in terms of precision expectations: sensory attributes bound to the same object are mutually predictive and so if the precision-weighted gain for one is increased it should increase for the other too. Though this is speculative, the predictive coding framework could here elucidate the functional role of increased gamma activity and help us understand how playing this role connects to attention and conscious perception.

Perhaps we should pause briefly and ask why we should adopt this framework for attention in particular: what does it add to our understanding of attention to cast it in terms of precision expectations? A worry could be that it is more or less a trivial reformulation of notions of gain, gating, and bias, which have long been used to explicate attention in a more or less aprioristic manner. The immediate answer is that this account of attention goes beyond mere reformulations of known theories, not just because its basic element is precision, but also because it turns on learning precision regularities in the world so different contexts will elicit different precision expectations. This is crucial because optimization of precision is context dependent and thus requires appeal to just the kind of predictive framework used here.

There is also a more philosophical motivation for adopting this approach. Normally, an account of attention would begin with some kind of operational, conceptual analysis of the phenomenon: attention has to do with salience, with some kind of selection of sensory channels, resource limitations, and so on. Then the evidence is consulted and theories formulated about neural mechanisms that could underpin salience and selection etc. This is a standard and fruitful approach in science. But sometimes taking a much broader approach gives a better understanding of the nature of the phenomenon of interest and its relation to other phenomena (cf. explanation by unification, Kitcher, 1989). In our case, a very general conception of the fundamental computational task for the brain defines two functional roles that must be played: estimation of states and parameters, and estimation of precisions. Without beginning from a conceptual analysis of attention, we then discover that the element of precision processing maps on well to the functional role we associate with attention. This discovery tells us something new about the nature of attention: the reason why salience and selection of sensory channels matter, and the reason why there appear to be resource limitations on attention, is that the system as such must assess precisions of sensory estimates and weight them against each other.

Viewing attention from the independent vantage point of the requirements of predictive coding also allows us to revise the concept of attention somewhat, which can often be fruitful. For example, there is no special reason why attention should always have to do with conscious perception, given the ways precision and accuracy can come apart; that is, there may well be precision processing – attention – outside consciousness. The approach suggests a new way for us to understand how attention and perception can rely on separate but related mechanisms. This is the kind of issue to which we now turn.

Interpreting Empirical Findings in the Light of Attention as Precision Optimization

The framework for conscious perception sketched in Section “Prediction Error and Precision” (see Figure 1) implied that studies of the relation between consciousness and attention can be located according to the dimensions of accuracy and precision. We now explore if this implication can reasonably be said to hold for a set of key findings concerning: inattentional blindness, change blindness, the effects of short term and sustained covert attention on conscious perception, and attention to unconscious stimuli.

The tools for interpreting the relevant studies must be guided by the properties of the predictive coding framework we have set out above, so here we briefly recapitulate: (1) even though accuracy and precision are both necessary for conscious perception, it does not follow that the single most precise or the most accurate estimate in a competing field of estimates will populate consciousness: that is determined by the overall free energy landscape. For example, it is possible for the highest overall posterior probability to be determined by an estimate having high accuracy and relatively low precision even if there is another model available that has relatively low accuracy yet high precision, and so on. (2) Attention in the shape of precision expectation modulates prediction error minimization subject to precisions predicted by the context, including cues and competing stimuli; it can do this for prediction errors of different accuracies. (3) Precision weighting only makes sense if weights sum to one so that as one goes up the others must go down. Similarly, as the probability of one model goes up the probability of other models should go down: the other models are explained away if one model is able to account for and suppress the sensory input. This gives rise to model competition. (4) Conscious experience of unchanging, very stable stimuli will tend to be suppressed over time, as prediction error is explained away and no new error arises. (5) Agency is active inference: a model of the agent’s interaction with the world is used to selectively sample the world such as to minimize surprise. This also holds for volitional aspects of attention, such as the agency involved in endogenous attention to a spatial location.
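Property (3) is a simple consequence of normalization, as the following sketch shows. The channel names and precision values are hypothetical, and dividing by the total is only one way of making weights sum to one.

```python
def normalize(values):
    """Rescale non-negative values so that they sum to one."""
    total = sum(values.values())
    return {name: value / total for name, value in values.items()}


# Hypothetical expected precisions for three sources of prediction error.
expected_precisions = {"central_task": 1.0, "periphery": 1.0, "background": 1.0}
before = normalize(expected_precisions)

# A demanding central task raises the precision expected for that source only.
expected_precisions["central_task"] = 8.0
after = normalize(expected_precisions)

print("before:", before)
print("after: ", after)
```

Nothing was done to the peripheral or background channels, yet their weights fall once the central task's expected precision rises. On this reading, the apparent resource limitation is a by-product of relative weighting.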

The aim now is to use these properties of predictive coding to provide a coherent interpretation of the set of very different findings on attention and consciousness.

Types of inattentional blindness

The context for a stimulus can be a cue or an instruction or other sensory information, or perhaps a decision to attend. Various elements of this context can give a specific generative model two advantages: it can increase priors for its states and parameters (for this part of the view, see also Rao, 2005 ) and it can bias selection of that model via precision weighting. When the target stimulus comes, attention has thus already given the model for that stimulus a probabilistic advantage. If in contrast the context is invalid (non-predictive) and a different target stimulus occurs, the starting point for the model predicting it can be much lower both in terms of prior probability and in terms of precision expectation. If this lower starting point is sufficiently low, and if the invalidly contextualized stimulus is not itself strongly attention grabbing (is not abrupt in some feature space such as having sharp contrast or temporal onset), then “the invalid target may never actually be perceived” (Feldman and Friston, 2010 , pp. 9–10).

This is then what could describe forms of inattentional blindness where an otherwise visible stimulus is made invisible by attending to something at a different location: an attentional task helps bias one generative model over models for unexpected background or peripheral stimuli. A very demanding attentional task would have very strong bias from precision weighting, and correspondingly the weight given to other models must be weakened. This could drive overall posterior probability below selection for consciousness, such that not even the gist of, for example, briefly presented natural scenes is perceived.
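The trade-off between task demand and the weight left over for unattended stimuli can be sketched numerically. This is a minimal illustration that assumes expected precisions are simply normalized to sum to one; the function name `precision_weights` and all numbers are hypothetical.

```python
def precision_weights(precisions):
    """Normalize expected precisions into weights that sum to one."""
    total = sum(precisions.values())
    if total <= 0:
        raise ValueError("expected precisions must sum to a positive value")
    return {name: value / total for name, value in precisions.items()}


# Hypothetical expected precisions for two competing sources of prediction error.
easy_task = precision_weights({"central_task": 1.0, "background_scene": 1.0})
hard_task = precision_weights({"central_task": 9.0, "background_scene": 1.0})

print(easy_task["background_scene"])  # 0.5
print(hard_task["background_scene"])  # 0.1
```

Nothing about the background scene itself changes between the two cases; its weight drops only because the demanding task claims a larger share of a fixed total.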

It is natural to conclude in such experiments that attention is a necessary condition for conscious perception since unattended stimuli are not seen, and as soon as they are seen performance on the central task decreases (Cohen et al., 2011 ). This is right in the sense that any weighting of precision to the peripheral or background stimulus must go with decreased weight to the central task. However, the more fundamental truth here is that in a noisy world precision weighting is necessary for conscious perception so that at the limit, where noise expectations are uniform, there could be conscious perception even though attention plays very little actual role.

When inattentional blindness is less complete, the gist of briefly presented natural scenes can be perceived (see, Van Boxtel et al., 2010 ). This is consistent with relatively low precision expectation since gist is by definition imprecise. So in this case some, but relatively little prediction error is allowed through for the natural scene, leaving only little prediction error to explain away. It seems likely that this could give rise to gist rather than full perception. However, the distinction between gist and full perception is not well understood and there are more specific views on gist perception, also within the broad predictive coding framework (Bar, 2003 ).

In some cases of inattentional blindness, large and otherwise very salient stimuli can go unnoticed. Famously, when counting basketball passes a gorilla can be unseen, and when chasing someone a nearby fistfight can be unseen (Simons and Chabris, 1999 ; Chabris et al., 2011 ). This is somewhat difficult to explain because endogenous attention as described so far should raise the baseline for precision expectation for a specific location such that any stimulus there, whether it is a basketball pass or a gorilla, should be more likely to be perceived. A smaller proportion of participants experience this effect, so it does in fact seem harder to induce blindness in this kind of paradigm than paradigms using central–peripheral or foreground-background tasks. For those who do have inattentional blindness under these conditions, the explanation could be high precision expectations for the basketball passes specifically, given the context of the passes that have occurred before the gorilla enters. This combines with the way this precision error has driven up the conditional confidence of the basketball model, explaining away the gorilla model, even if the latter is fed some prediction error. This more speculative account predicts that inattentional blindness should diminish if the gorilla, for example, occurs at the beginning of the counting task.

This is then a way to begin conceptualizing feature- and object-based attention instead of purely spatial attention. Van Boxtel et al. (2010) suggest that in gorilla type cases the context provided by the overall scene delivers a strong gist that overrides changes that fit poorly with it: “subjects do perceive the gist of the image correctly, interfering with detection of a less meaningful change in the scene as if it was filled in by the gist.” The predictive coding approach can offer an explanation of this kind of interference in probabilistic terms.

A further aspect can be added to this account of inattentional blindness. Attending, especially endogenous attending, is an activity. As such, performing an attention demanding task is a matter of active inference where a model of the world is used to selectively sample sensory input to minimize surprise. This means that high precision inputs are expected and sampled on the basis of one initial (e.g., “basketball”) model, leaving unexpected input such as the occurrence of a gorilla with low weighting. Since the active inference required to comply with an attentional task must favor one model in a sustained way, blindness to unexpected stimuli follows.

The benefit of sustained attention viewed as active inference is then that surprise can be minimized with great precision, given an initial model’s states and parameters. On the other hand, the cost of sustained attention is that the prediction error landscape may change during the task, increasing the free energy and making things evade consciousness.

It can thus be disadvantageous for a system to be stuck in active inference and to neglect revisiting the bound on surprise by updating the model (e.g., if the gorilla is real and angry). Perhaps the reason attention can be hard to maintain is that, to avoid such disadvantage, the system continually seeks, perhaps via spontaneous fluctuations, to alternate between perceptual and active inference. Minor lapses of attention (e.g., missing a pass) could thus lead to some model revision and conscious perception; if the model revision has relatively low precision it may just give rise to gist perception (e.g., “some black creature was there”).

It is interesting here to speculate further that the functional role of exogenous attention can be to not only facilitate processing of salient stimuli but in particular to make the system snap out of active inference, which is often associated with endogenous attention, and back into revision of its generative model. Exogenous and endogenous attention seem to have opposing functional roles in precision optimization.

There remains the rather important and difficult question whether or not the unseen stimulus is in fact consciously perceived but not accessible for introspective report, or whether it is not consciously perceived at all; this question relates to the influential distinction between access consciousness and phenomenal consciousness (Block, 1995, 2008). To some, this question borders on the incomprehensible or at least untestable (Cohen and Dennett, 2011), and there is agreement it cannot be answered directly (e.g., by asking participants to report). Instead some indirect, abductive answer must be sought. We cannot answer this question here but we can speculate that the common intuition that there is both access and phenomenal consciousness is fueled by the moments of predictive coding such that (i) access consciousness goes with active inference (i.e., minimizing surprise through agency, which requires making model parameters and states available to control systems), and (ii) phenomenal consciousness goes with perceptual inference (i.e., minimizing the bound on surprise by more passively updating model parameters and states).

If this is right, then a prediction is that in passive viewing, where attention and active inference are kept as minimal as possible, there should be more possibility of having incompatible conscious percepts at the same time, since without active inference there is less imperative to favor just one initial model. There is some evidence for this in binocular rivalry where the absence of attention seems to favor fusion (Zhang et al., 2011).

Overall, some inroads on inattentional blindness can be made by an appeal to precision expectations giving the attended stimulus a probabilistic advantage. A fuller, and speculative, explanation conceives attention in agential terms and appeals to the way active inference can lead to very precise but eventually overall inaccurate perceptual states.

Change blindness

These are cases where abrupt and scene-incongruent changes like sudden mudsplashes attract attention and make invisible other abrupt but scene-congruent changes like a rock turning into a log or an aircraft engine going missing (Rensink et al., 1997). Only with attention directed at (or on repeated exposures grabbed by) the scene-congruent change will it be detected. This makes sense if the distractor (e.g., mudsplashes) has higher signal strength than the masked stimuli because, as we saw, there is a higher precision expectation for stronger signals. This weights prediction error for a mudsplash model rather than for a natural scenery model with logs or aircraft. Even if both models are updated in the light of their respective prediction errors from the mudsplashes and the rock changing to the log, the mudsplash model will have higher conditional confidence because it can explain away precisely a larger part of the bottom-up error signal.

More subtly, change blindness through attention grabbing seems to require that the abrupt stimuli activate a competing model of the causes in the world. This means that the prediction error can be relevant to the states and parameters of one of these models. Thus, the mudsplashes mostly appear to be superimposed on the original image, which activates a model with parameters for causal interaction between mudsplashes and something like a static photo. In other words, the best explanation for the visual input is the transient occlusion or change to a photo, where, crucially, we have strong prior beliefs that photographs do not change over short periods of time. This contrasts with the situation prior to the mudsplashes occurring where the model would be tuned more to the causal relations inherent in the scene itself (that is, the entire scene is not treated as a unitary object that can be mudsplashed). With two models, one can begin to be probabilistically explained away by the other: as the posterior probability of the model that treats the scene as a unitary object increases, the probability of the model that treats it as composite scene will go down. Once change blindness is abolished, such that both mudsplashes and scene changes are seen, a third (“Photoshop”) model will have evolved on which individual components can change but not necessarily in a scene-congruent manner. All this predicts that there should be less change blindness for mudsplashes on dynamic stimuli such as movies because the causal model for such stimuli has higher accuracy; it also predicts less blindness if the mudsplashes are meaningful in the original scene such that competition between models is not engendered.

For some scene changes it is harder to induce change blindness. Mudsplashes can blind us when a rock in the way of a kayak changes into a log, but blind us less when the rock changes into another kayak (Sampanes et al., 2008). This type of situation is often dealt with in terms of gist changes but it is also consistent with the interpretation given above. The difference between a log and another kayak in the way of the kayak is in the change in parameters of the model explaining away the prediction error. The change from an unmoving object (rock) to another unmoving object (log) incurs much less model revision than the change to a moving, intentional object (other kayak): the scope for causal interaction between two kayaks is much bigger than for one kayak and a log. The prediction error is thus much bigger for the latter, and updating the model to reflect this will increase its probability more, and make blindness less likely.

A different type of change blindness occurs when there is no distractor but the change is very slow and incremental (e.g., Simons et al., 2000 ), such as a painting where one part changes color over a relatively long period of time. Without attention directed at the changing property, the change is not noted. In this case it seems likely that each incremental change is within the expected variability for the model of the entire scene. When attention is directed at the slowly changing component of the scene, the precision expectation and thus the weighting goes up, and it is more likely that the incremental change will generate a prediction error. This is then an example of change blindness due to imprecise prediction error minimization. If this is right, a prediction is that change of precision expectation through learning, or individual differences in such expectations, should affect this kind of change blindness.
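The idea that each increment falls within expected variability until attention narrows that variability can be sketched as a simple criterion test. This is an illustrative assumption, not a model from the literature: the function name `registers_as_error`, the criterion of two standard deviations, and all numeric values are hypothetical.

```python
def registers_as_error(change, expected_sd, criterion=2.0):
    """True when a change exceeds `criterion` expected standard deviations."""
    return abs(change) / expected_sd > criterion


step = 0.5            # hypothetical size of one incremental color change
unattended_sd = 1.0   # broad expected variability for the scene as a whole
attended_sd = 0.2     # narrower expected variability under attention

print(registers_as_error(step, unattended_sd))  # False
print(registers_as_error(step, attended_sd))    # True
```

The same physical increment is treated as noise in one case and as prediction error in the other; only the expected precision differs, which is also why learning or individual differences in that expectation would be predicted to matter.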

Short term covert attention enhances conscious perception

If a peripheral cue attracts covert attention to a grating away from fixation, then conscious experience of its contrast is enhanced (Carrasco et al., 2004 ). Similar effects are found for spatial frequency and gap size (Gobell and Carrasco, 2005 ). In terms of precision, the peripheral cue induces a high precision expectation for the cued region, which increases the weighting for prediction error from the low contrast grating placed there. Specifically, the expectation will be for a stimulus with an improved signal to noise ratio, that is, a stronger signal. This then seems to be a kind of self-fulfilling prophecy: an expectation for a strong bottom-up signal causing a stronger error signal. The result is that the world is being represented as having a stronger, more precise signal than it really has, and this is then reflected in conscious perception.

From this perspective, the attentional effect is parasitic on a causal regularity in the world. Normally, when attention is attracted to a region there will indeed be a high signal to noise event in that region. This is part of the prediction error minimization role for attention described above. If this regularity did not hold, then exogenous attention would be costly in free energy. In this way the effect from Carrasco’s lab is a kind of attentional visual illusion. A further study provides evidence for just this notion of an invariant relation between cue strength and expectation for subsequent signal strength: the effect is weakened as the cue contrast decreases (Fuller et al., 2009 ). The cue sets up an expectation for high signal strength (i.e., high precision) in the region and so it makes sense that the cue strength and the expectation are tied together. It is thus an illusion because a causal regularity about precision is applied to a case where it does not in fact hold. If it is correct that this effect relies on learned causal regularities, then it can be predicted that the effect should be reversible through learning, such that strong cues come to be associated with expectations for imprecise target stimuli and vice versa 6 .

At the limit, this paradigm provides an example of attention directed at subthreshold stimuli, thereby enabling their selection into conscious perception (e.g., a 3.5% contrast subthreshold grating is perceived as a 6% contrast threshold grating; Carrasco et al., 2004). This shows nicely the modulation by precision weighting of the overall free energy landscape: prediction error, which initially is so imprecise that it is indistinguishable from expected noise, can be up-weighted through precision expectations such that the internal model is eventually revised to represent it. Paradoxically, however, here what we have deemed an attentional illusion of stimulus precision facilitates veridical perception of stimulus occurrence.
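The up-weighting of an initially negligible prediction error can be illustrated with the standard precision-weighted update for a Gaussian estimate. The sketch assumes a prior belief of zero contrast and treats attention as nothing more than a higher sensory precision; the function name `revised_estimate` and the precision values are hypothetical. It does not model the illusory enhancement itself, which on the account above depends on an expectation for a stronger signal.

```python
def revised_estimate(prior_mean, observation, prior_precision, sensory_precision):
    """Precision-weighted update of a Gaussian estimate from one observation."""
    gain = sensory_precision / (sensory_precision + prior_precision)
    prediction_error = observation - prior_mean
    return prior_mean + gain * prediction_error


# Hypothetical values: the prior says "no grating" (contrast 0), the input is weak.
weak = revised_estimate(0.0, 3.5, prior_precision=1.0, sensory_precision=0.1)
strong = revised_estimate(0.0, 3.5, prior_precision=1.0, sensory_precision=4.0)
print(weak, strong)
```

With low expected sensory precision the estimate barely moves from the prior, so the input is effectively treated as noise; with high expected precision the same input revises the estimate substantially.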

It is an interesting question if the self-fulfilling prophecy suggested to be in play here is always present under attention, such that attention perpetually enhances phenomenology. In predictive coding terms, the answer is probably “no.” The paradigm is unusual in the sense that it is a case of covert attention, which stifles normal active inference in the form of fixation shifts. If central fixation is abolished and the low contrast grating is fixated, the bound on free energy is again minimized, and this time the error between the model and the actual input from the grating is likely to override the expectation for a strong signal.

This attentional illusion works for exogenous cueing but also for endogenous cueing (Liu et al., 2009), where covert endogenous attention is first directed at a peripheral letter cue, is sustained there, and then enhances the contrast of the subsequent target grating at that location. There do not seem to be any studies of the effect of endogenous attention that is entirely volitional and not accompanied by high contrast cues in the target region (even Ling and Carrasco, 2006, has high contrast static indicators at the target locations).

From the point of view of predictive coding, the prediction is then that there will be less enhancing effect of such pure endogenous attention since the high precision expectation (increased baseline) in this case is not induced via a learned causal regularity linking strong signal cues to strong signal targets.

A more general prediction follows from the idea that attention is driven by the (hyper-) prior that cues with high signal strength have high signal to noise ratio. It may be possible to revert this prior through learning such that attention eventually is attracted by low strength cues and stronger cues are ignored. In support of this prediction, there is evidence that some hyperpriors can be altered, such as the light from above prior (Morgenstern et al., 2011 ).

This attentional effect is then explained by precision optimization leading to an illusory perceptual inference. It is a case of misrepresented high precision combined with relatively low accuracy.

Sustained covert attention diminishes conscious perception and enhances filling-in

In Troxler fading (Troxler, 1804 ) peripheral targets fade out of conscious perception during sustained central fixation. If attention but not fixation is endogenously directed at one type of sensory attribute, such as the color of some of the peripheral stimuli, then those stimuli fade faster than the unattended stimuli (Lou, 1999 ).

It is interesting that here attention seems to diminish conscious perception whereas in the cases discussed in the previous section it enhances it. A key factor here is the duration of trials: fading occurs after several seconds and enhancement is seen in trials lasting only 1–2 s. This temporal signature is consistent with predictive coding insofar as when the prediction error from a stimulus is comprehensively suppressed and no further exploration is happening (since active inference is subdued due to central fixation during covert attention) probability should begin to drop. This follows from the idea that what drives conscious perception is the actual process of suppressing prediction error. It translates to the notion that the system expects that the world cannot be unchanging for very long periods of time (Hohwy et al., 2008 ).
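The temporal signature can be pictured with a toy loop in which a prediction repeatedly absorbs part of the error from an unchanging stimulus. This is only a sketch under simple assumptions: the fixed learning rate, the starting prediction of zero, and the function name `prediction_errors` are hypothetical, and the link from residual error to conscious perception is the proposal under discussion, not something the code establishes.

```python
def prediction_errors(stimulus, learning_rate=0.5, steps=10):
    """Prediction error at each step while a prediction tracks a constant stimulus."""
    prediction = 0.0
    errors = []
    for _ in range(steps):
        error = stimulus - prediction
        errors.append(error)
        prediction += learning_rate * error  # explain away part of the error
    return errors


errors = prediction_errors(stimulus=1.0)
print(errors[0], errors[-1])
```

The error shrinks geometrically and nothing replenishes it while the stimulus and the sampling of it stay fixed, which matches the idea that enhancement appears early and fading only after several seconds.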

In Troxler fading there is an element of filling-in as the fading peripheral stimuli are substituted by the usually gray background. This filling-in aspect is seen more dramatically if the background is dynamic (De Weerd et al., 2006 ): as sustained attention diminishes perception of the peripheral target stimuli, it also amplifies conscious perception by illusory filling-in. A similar effect is seen in motion induced blindness (MIB). Here peripheral targets fade when there is also a stimulus of coherently moving dots, and the fading of the peripheral dots happens faster when they are covertly attended (Geng et al., 2007 ; Schölvinck and Rees, 2009 ).

The question is then why attention conceived as precision weighting should facilitate the fading of target stimuli together with enhancing filling-in in these cases. In Troxler fading with filling-in of dynamic background as well as in MIB there is an element of model competition. In MIB, there is competition between a model representing the coherently moving dots as a solid rotating disk, which if real would occlude the stationary target dots, and a model representing isolated moving dots, which would not occlude the target dots. The first model wins due to the coherence of the motion. An alternative explanation is that there is competition between a model on which there is an error (a “perceptual scotoma”) in the visual system, and a model where there is not; in the former case, it would make sense for the system to fill-in (New and Scholl, 2008 ). In the Troxler case with a dynamic background, there is competition between models representing the world as having vs. not having gaps at the periphery, with the latter tending to win. Sustained attention increases the precision weighting for all prediction error from the attended region, that is, for both the target stimuli and the context in which they are shown (i.e., the dynamic background or, as in MIB, the coherently moving foreground). This context is processed not only at that region but also globally in the stimulus array and this would boost the confidence that it fills the locations of the target stimuli. This means that as the prediction error for the peripheral target stimuli is explained away, the probabilistic balance might tip in favor of the model that represents the array as having an unbroken background, or a solid moving foreground (or a perceptual scotoma).

It is thus possible to accommodate these quite complex effects of covert attention within the notion of attention as precision expectation. On the one hand, exogenous cues can engender high precision expectations that can facilitate target perception, and, on the other hand these expectations can facilitate filling-in of the target location. At the same time, covert attention stifles active inference and engenders a degree of inaccuracy.

Exogenous attention to invisible stimuli

During continuous flash suppression, perceptually suppressed images of nudes can attract attention in the sense that they function as exogenous cues in a version of the Posner paradigm (Jiang et al., 2006 ). This shows that a key attentional mechanism works in the absence of conscious perception. When there are competing models, conscious perception is determined by the model with the highest posterior probability. It is conceivable that though the nude image is a state in a losing model it may still induce precision-related gain for a particular region. In general, in the processing of emotional stimuli, there is clear empirical evidence to suggest that fast salient processing (that could mediate optimization of precision expectations) can be separated from slower perceptual classification (Vuilleumier et al., 2003 ). Evidence for this separation rests on the differences in visual pathways, in terms of processing speeds and spatial frequencies that may enable the salience of stimuli to be processed before their content. Even though a high precision expectation could thus be present for the region of the suppressed stimulus, it is possible for the overall prediction error landscape to not favor the generative model for that stimulus over the model for the abruptly flashing Mondrian pattern in the other eye. The result is that the nude image is not selected for conscious perception but that there nevertheless is an expectation of high precision for its region of the visual field, explaining the effect.

Concluding Remarks

The relation between conscious perception and attention is poorly understood. It has proven difficult to connect the two bodies of empirical findings, based as they are on separate conceptual analyses of each of these core phenomena, and fit them into one unified picture of our mental lives. In this kind of situation, it can be useful to instead begin with a unified theoretical perspective, apply it to the phenomena at hand and then explore if it is possible to reasonably interpret the bodies of evidence in the light of the theory.

This is the strategy pursued here. The idea that the brain is a precision-weighted hypothesis tester provides an attractive vision of the relationship. Because the states of the world have varying levels of noise or uncertainty, perceptual inference must be modulated by expectations about the precisions of the sensory signal (i.e., of the prediction error). Optimization of precision expectations, it turns out (Feldman and Friston, 2010 ), fits remarkably well the functional role often associated with attention. And the perceptual inference which, thus modulated by attention, achieves the highest posterior probability fits nicely with being what determines the contents of conscious perception.

In this perspective, attention and conscious perception are distinct but naturally connected in a way that allows for what appear to be reasonable and fruitful interpretations of some key empirical studies of them and their relationship. Crudely, perception and attention stand to each other as accuracy and precision, statistically speaking, stand to each other. We have seen that this gives rise to reasonably coherent interpretations of specific types of experimental paradigms. Further mathematical modeling and empirical evidence are needed to fully bring out this conjecture, and a number of the interpretations were shown to lead to testable predictions.

To end, I briefly suggest this unifying approach also sits reasonably well with some very general approaches to attention and perception.

From a commonsense perspective, endogenous and exogenous attention have different functional roles. Endogenous attention can only be directed at contents that are already conscious (how can I direct attention to something I am not conscious of?) and when states of affairs grab exogenous attention they thereby become conscious (if I fail to become aware of something then how could my attention have been grabbed?). This is an oversimplification, as can be seen from the studies reviewed above. The mapping of conscious perception and attention onto the elements of predictive coding can explain the commonsense understanding of their relationship but also why it breaks down. Normally endogenous attention is directed at things we already perceive so that no change is missed, i.e., more precision is expected and the gain is turned up. But precision gain itself is neutral on the actual state of affairs, it just makes the system more sensitive to prediction error, so if we direct attention at a location that seems empty but that has a subthreshold stimulus we are still more likely to spot it in the end. Conversely, even if precision expectations are driven up by an increase in signal strength somewhere, and attention in this sense is grabbed, it does not follow that this signal must drive conscious perception. A competing model may as a matter of fact have higher probability.

It is sometimes said that a good way to conceive of conscious perception and attention is in terms of the former as a synthesizer that allows us to make sense of our otherwise chaotic sensory input, and the latter as an analyzer that allows us to descend from the overall synthesized picture and focus on a few more salient things (Van Boxtel et al., 2010 ). The predictive coding account allows this sentiment: prediction error minimization is indeed a way of solving the inverse problem of figuring out what in the world caused the sensory input, and attention does allow us to weight the least uncertain parts of this signal. The key insight from this perspective is however that though these are distinct neural processes they are both needed to allow the brain to solve its inverse problem. But when there are competing models, they can work against each other, and conscious perception can shift between models as precisions and bounds are optimized and the world selectively sampled.

Perhaps the most famous thing said about attention is from James:

Everyone knows what attention is. It is the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought. Focalization, concentration, of consciousness are of its essence. It implies withdrawal from some things in order to deal effectively with others, and is a condition which has a real opposite in the confused, dazed, scatter-brained state which in French is called distraction, and Zerstreutheit in German (James, 1890, Vol. I, pp. 403–404).

The current proposal is that “attention is simply the process of optimizing precision during hierarchical inference” (Friston, 2009 , p. 7). This does not mean the predictive coding account of attention stands in direct opposition to the Jamesian description. It is a more accurate, reductive and unifying account of the mechanism underlying parts of the phenomenon James is trying to capture: James’ description captures many of the aspects of endogenous attention and model competition that are discussed in terms of precision in this paper.

The sentiment that attention is intimately connected with perception in a hypothesis testing framework was captured very early on by Helmholtz. He argued, for example, that binocular rivalry is an attentional effect but he explicated attention in terms of activity, novelty, and surprise, which is highly reminiscent of the contemporary predictive coding framework:

The natural unforced state of our attention is to wander around to ever new things, so that when the interest of an object is exhausted, when we cannot perceive anything new, then attention against our will goes to something else. […] If we want attention to stick to an object we have to keep finding something new in it, especially if other strong sensations seek to decouple it (Helmholtz, 1860 , p. 770; translated by JH).

Helmholtz does not here mention precision expectations but they find a natural place in his description of attention’s role in determining conscious content: precision expectations enable attention to stick, where sticking helps, and to wander more fruitfully too.

Conflict of Interest Statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

Thanks to Karl Friston, Tim Bayne, and the reviewers for very helpful comments and suggestions. This research is supported by the Australian Research Council.

1 In addition to perceptual forms of consciousness there is also a live debate, set aside here, about non-perceptual forms of consciousness, such as conceptual thought (Bayne and Montague, 2011 ).

2 There is a simplification here: surprise has both accuracy and complexity components, such that minimizing surprise or free energy increases accuracy while minimizing complexity. This ensures the explanations for sensory input are parsimonious and will generalize to new situations; cf. Occam’s razor.

3 There is also a very good question here about how this kind of confidence assessment fits with the psychological confidence of the organism, which appears to be a defining feature of consciousness, and which is often assessed in confidence ratings. (Thanks to a reviewer for raising this issue.)

4 A further disclaimer: the speculation that conscious perception is a product of accuracy and precision in predictive coding is a limited speculation about an information processing mechanism. It is not a speculation about why experience is conscious rather than not conscious – predictive coding can after all be implemented in unconscious machines. The mystery of consciousness will remain untouched.

5 This claim depends on optimal Bayesian inference actually being able to recapitulate the causal structure of the world. Here we bracket for philosophical debate the fact that this assumption breaks down for perfect skeptical scenarios, such as Cartesian deceiving demons or evil scientists manipulating brains in vats, where minimizing free energy does not reveal the true nature of the world. We also bracket deeper versions of the problem of induction, such as the new riddle of induction (Goodman, 1955), though we note that when two hypotheses are equally good at predicting new input the free energy principle prefers the one with the smallest complexity cost.

6 It is a tricky question whether or not this attentional effect is then explained without appealing to “mental paint” (Block, 2010 ), and whether it is therefore a challenge to representationalism about conscious perception. Precision optimization is an integral part of perceptual inference, which is all about representing the causal structure of the world. As such the explanation is representational. But it concerns precision, which is an often neglected aspect of representation: the representationalism assumed here allows that a relatively accurate representation can fail to optimize precision. What attention itself affords is improved precision, not accuracy (see Prinzmetal et al., 1997 ).

  • Badcock P. B. (2012). Evolutionary systems theory: a unifying meta-theory of psychological science. Rev. Gen. Psychol. 16, 10–23. doi: 10.1037/a0026381
  • Bar M. (2003). A cortical mechanism for triggering top-down facilitation in visual object recognition. J. Cogn. Neurosci. 15, 600–609. doi: 10.1162/089892903321662976
  • Bar M. (2007). The proactive brain: using analogies and associations to generate predictions. Trends Cogn. Sci. (Regul. Ed.) 11, 280–289. doi: 10.1016/j.tics.2007.05.005
  • Bayne T. (2010). The Unity of Consciousness. Oxford: Oxford University Press.
  • Bayne T., Montague M. (2011). Cognitive Phenomenology. Oxford: Oxford University Press.
  • Bays P. M., Wolpert D. M. (2007). Computational principles of sensorimotor control that minimize uncertainty and variability. J. Physiol. (Lond.) 578, 387–396. doi: 10.1113/jphysiol.2006.120121
  • Block N. (1995). On a confusion about a function of consciousness. Behav. Brain Sci. 18, 227–287. doi: 10.1017/S0140525X00038188
  • Block N. (2008). Consciousness, accessibility, and the mesh between psychology and neuroscience. Behav. Brain Sci. 30, 481–499.
  • Block N. (2010). Attention and mental paint. Philos. Issues 20, 23–63. doi: 10.1111/j.1533-6077.2010.00177.x
  • Boly M., Garrido M. I., Gosseries O., Bruno M.-A., Boveroux P., Schnakers C., Massimini M., Litvak V., Laureys S., Friston K. (2011). Preserved feedforward but impaired top-down processes in the vegetative state. Science 332, 858–862. doi: 10.1126/science.1202043
  • Brown H., Friston K. J., Bestmann S. (2011). Active inference, attention and motor preparation. Front. Psychol. 2:218. doi: 10.3389/fpsyg.2011.00218
  • Carrasco M., Ling S., Read S. (2004). Attention alters appearance. Nat. Neurosci. 7, 308–313. doi: 10.1038/nn1194
  • Casella G. (1992). Illustrating empirical Bayes methods. Chemometr. Intell. Lab. Syst. 16, 107–125. doi: 10.1016/0169-7439(92)80050-E
  • Chabris C. F., Weinberger A., Fontaine M., Simons D. J. (2011). You do not talk about fight club if you do not notice fight club: inattentional blindness for a simulated real-world assault. i-Perception 2, 150–153. doi: 10.1068/i0436
  • Chalmers D. (1996). The Conscious Mind. New York: Oxford University Press.
  • Cohen M. A., Alvarez G. A., Nakayama K. (2011). Natural-scene perception requires attention. Psychol. Sci. 22, 1165–1172. doi: 10.1177/0956797611419168
  • Cohen M. A., Dennett D. C. (2011). Consciousness cannot be separated from function. Trends Cogn. Sci. (Regul. Ed.) 15, 358–364. doi: 10.1016/j.tics.2011.10.004
  • De Weerd P., Smith E., Greenberg P. (2006). Effects of selective attention on perceptual filling-in. J. Cogn. Neurosci. 18, 335–347. doi: 10.1162/jocn.2006.18.3.335
  • Dennett D. C. (1991). Consciousness Explained. Boston: Little, Brown & Co.
  • Desimone R. (1998). Visual attention mediated by biased competition in extrastriate visual cortex. Philos. Trans. R. Soc. Lond. B Biol. Sci. 353, 1245. doi: 10.1098/rstb.1998.0280
  • Desimone R., Duncan J. (1995). Neural mechanisms of selective visual attention. Annu. Rev. Neurosci. 18, 193. doi: 10.1146/annurev.ne.18.030195.001205
  • Ditchburn R. W., Ginsborg B. L. (1952). Vision with a stabilized retinal image. Nature 170, 36–37. doi: 10.1038/170036a0
  • Feldman H., Friston K. (2010). Attention, uncertainty and free-energy. Front. Hum. Neurosci. 4:215. doi: 10.3389/fnhum.2010.00215
  • Friston K. (2005). A theory of cortical responses. Philos. Trans. R. Soc. Lond. B Biol. Sci. 360, 815–836. doi: 10.1098/rstb.2005.1622
  • Friston K. (2008). Hierarchical models in the brain. PLoS Comput. Biol. 4, e1000211. doi: 10.1371/journal.pcbi.1000211
  • Friston K. (2009). The free-energy principle: a rough guide to the brain? Trends Cogn. Sci. (Regul. Ed.) 13, 293–301. doi: 10.1016/j.tics.2009.04.005
  • Friston K. (2010). The free-energy principle: a unified brain theory? Nat. Rev. Neurosci. 11, 127–138. doi: 10.1038/nrn2787
  • Friston K., Mattout J., Kilner J. (2011). Action understanding and active inference. Biol. Cybern. 104, 137–160. doi: 10.1007/s00422-011-0424-z
  • Friston K., Stephan K. (2007). Free energy and the brain. Synthese 159, 417–458. doi: 10.1007/s11229-007-9237-y
  • Friston K. J., Daunizeau J., Kiebel S. J. (2009). Reinforcement learning or active inference? PLoS ONE 4, e6421. doi: 10.1371/journal.pone.0006421
  • Friston K. J., Thornton C., Clark A. (in press). Free-energy minimization and the dark room problem. Front. Psychol.
  • Fuller S., Park Y., Carrasco M. (2009). Cue contrast modulates the effects of exogenous attention on appearance. Vision Res. 49, 1825–1837. doi: 10.1016/j.visres.2009.04.019
  • Geng H., Song Q., Li Y., Xu S., Zhu Y. (2007). Attentional modulation of motion-induced blindness. Chin. Sci. Bull. 52, 1063–1070. doi: 10.1007/s11434-007-0309-7
  • Gobell J., Carrasco M. (2005). Attention alters the appearance of spatial frequency and gap size. Psychol. Sci. 16, 644–651. doi: 10.1111/j.1467-9280.2005.01588.x
  • Goodman N. (1955). Fact, Fiction and Forecast. Cambridge, MA: Harvard University Press.
  • Gregory R. L. (1980). Perceptions as hypotheses. Philos. Trans. R. Soc. Lond. B Biol. Sci. 290, 181–197. doi: 10.1098/rstb.1980.0090
  • Grill-Spector K., Henson R., Martin A. (2006). Repetition and the brain: neural models of stimulus-specific effects. Trends Cogn. Sci. (Regul. Ed.) 10, 14–23. doi: 10.1016/j.tics.2005.11.006
  • Harrison L., Bestmann S., Rosa M. J., Penny W., Green G. G. R. (2011). Time scales of representation in the human brain: weighing past information to predict future events. Front. Hum. Neurosci. 5:37. doi: 10.3389/fnhum.2011.00037
  • Helmholtz H. V. (1860). Treatise on Physiological Optics. New York: Dover.
  • Hesselmann G., Sadaghiani S., Friston K. J., Kleinschmidt A. (2010). Predictive coding or evidence accumulation? False inference and neuronal fluctuations. PLoS ONE 5, e9926. doi: 10.1371/journal.pone.0009926
  • Hillyard S. A., Mangun G. R. (1987). Sensory gating as a physiological mechanism for visual selective attention. Electroencephalogr. Clin. Neurophysiol. Suppl. 40, 61–67.
  • Hohwy J. (2009). The neural correlates of consciousness: new experimental approaches needed? Conscious. Cogn. 18, 428–438. doi: 10.1016/j.concog.2009.02.006
  • Hohwy J., Fox E. (2012). Preserved aspects of consciousness in disorders of consciousness: a review and conceptual analysis. J. Conscious. Stud. 19, 87–120.
  • Hohwy J., Roepstorff A., Friston K. (2008). Predictive coding explains binocular rivalry: an epistemological review. Cognition 108, 687–701. doi: 10.1016/j.cognition.2008.05.010
  • Hume D. (1739–1740). A Treatise of Human Nature. Oxford: Clarendon Press.
  • Jack C. E., Thurlow W. R. (1973). Effects of degree of visual association and angle of displacement on the “ventriloquism” effect. Percept. Mot. Skills 37, 967–979. doi: 10.2466/pms.1973.37.3.967
  • James W. (1890). The Principles of Psychology. New York: Holt.
  • Jiang Y., Costello P., Fang F., Huang M., He S. (2006). A gender- and sexual orientation-dependent spatial attentional effect of invisible images. Proc. Natl. Acad. Sci. U.S.A. 103, 17048–17052. doi: 10.1073/pnas.0605678103
  • Kersten D., Mamassian P., Yuille A. (2004). Object perception as Bayesian inference. Annu. Rev. Psychol. 55, 271–304. doi: 10.1146/annurev.psych.55.090902.142005
  • Kiebel S. J., Daunizeau J., Friston K. J. (2008). A hierarchy of time-scales and the brain. PLoS Comput. Biol. 4, e1000209. doi: 10.1371/journal.pcbi.1000209
  • Kiebel S. J., Daunizeau J., Friston K. J. (2010). Perception and hierarchical dynamics. Front. Neuroinformatics 4, 12.
  • Kitcher P. (1989). “Explanatory unification and the causal structure of the world,” in Scientific Explanation, eds Kitcher P., Salmon W. (Minneapolis: University of Minnesota Press), 410–505.
  • Koch C., Tsuchiya N. (2007). Attention and consciousness: two distinct brain processes. Trends Cogn. Sci. (Regul. Ed.) 11, 16–22. doi: 10.1016/j.tics.2006.10.012
  • Kok P., Rahnev D., Jehee J. F. M., Lau H. C., De Lange F. P. (2011). Attention reverses the effect of prediction in silencing sensory signals. Cereb. Cortex. doi: 10.1093/cercor/bhr310
  • Kveraga K., Ghuman A. S., Bar M. (2007). Top-down predictions in the cognitive brain. Brain Cogn. 65, 145–168. doi: 10.1016/j.bandc.2007.06.007
  • Ling S., Carrasco M. (2006). When sustained attention impairs perception. Nat. Neurosci. 9, 1243–1245. doi: 10.1038/nn1761
  • Liu T., Abrams J., Carrasco M. (2009). Voluntary attention enhances contrast appearance. Psychol. Sci. 20, 354–362. doi: 10.1111/j.1467-9280.2009.02300.x
  • Lou L. (1999). Selective peripheral fading: evidence for inhibitory sensory effect of attention. Perception 28, 519–526. doi: 10.1068/p2816
  • Martinez-Trujillo J. C., Treue S. (2004). Feature-based attention increases the selectivity of population responses in primate visual cortex. Curr. Biol. 14, 744–751. doi: 10.1016/j.cub.2004.04.028
  • McGrayne S. B. (2011). The Theory That Would Not Die: How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy. New Haven: Yale University Press.
  • Morgenstern Y., Murray R. F., Harris L. R. (2011). The human visual system’s assumption that light comes from above is weak. Proc. Natl. Acad. Sci. U.S.A. 108, 12551–12553. doi: 10.1073/pnas.1100794108
  • Mumford D. (1992). On the computational architecture of the neocortex. II. The role of cortico-cortical loops. Biol. Cybern. 66, 241–251. doi: 10.1007/BF00198477
  • New J. J., Scholl B. J. (2008). Perceptual scotomas. Psychol. Sci. 19, 653–659. doi: 10.1111/j.1467-9280.2008.02228.x
  • Prinzmetal W., Nwachuku I., Bodanski L., Blumenfeld L., Shimizu N. (1997). The phenomenology of attention. Conscious. Cogn. 6, 372–412. doi: 10.1006/ccog.1997.0313
  • Rao R. P. N. (2005). Bayesian inference and attentional modulation in the visual cortex. Neuroreport 16, 1843–1848. doi: 10.1097/01.wnr.0000183900.92901.fc
  • Rensink R., O’Regan J., Clark J. (1997). To see or not to see: the need for attention to perceive changes in scenes. Psychol. Sci. 8, 368. doi: 10.1111/j.1467-9280.1997.tb00427.x
  • Ross J., Burr D. (2008). The knowing visual self. Trends Cogn. Sci. (Regul. Ed.) 12, 363–364. doi: 10.1016/j.tics.2008.06.007
  • Sampanes A. C., Tseng P., Bridgeman B. (2008). The role of gist in scene recognition. Vision Res. 48, 2275–2283. doi: 10.1016/j.visres.2008.07.011
  • Schölvinck M. L., Rees G. (2009). Attentional influences on the dynamics of motion-induced blindness. J. Vis. 9 (Article 38), 1–8. doi: 10.1167/9.6.1
  • Simons D. J., Chabris C. F. (1999). Gorillas in our midst: sustained inattentional blindness for dynamic events. Perception 28, 1059–1074. doi: 10.1068/p2952
  • Simons D. J., Franconeri S. L., Reimer R. L. (2000). Change blindness in the absence of a visual disruption. Perception 29, 1143–1154. doi: 10.1068/p3104
  • Summerfield C., Egner T. (2009). Expectation (and attention) in visual cognition. Trends Cogn. Sci. (Regul. Ed.) 13, 403–409. doi: 10.1016/j.tics.2009.06.003
  • Summerfield C., Trittschuh E. H., Monti J. M., Mesulam M. M., Egner T. (2008). Neural repetition suppression reflects fulfilled perceptual expectations. Nat. Neurosci. 11, 1004–1006. doi: 10.1038/nn.2163
  • Todorovic A., Van Ede F., Maris E., De Lange F. P. (2011). Prior expectation mediates neural adaptation to repeated sounds in the auditory cortex: an MEG study. J. Neurosci. 31, 9118–9123. doi: 10.1523/JNEUROSCI.1425-11.2011
  • Treisman A. (1996). The binding problem. Curr. Opin. Neurobiol. 6, 171–178. doi: 10.1016/S0959-4388(96)80070-5
  • Treisman A. (1998). Feature binding, attention and object perception. Philos. Trans. R. Soc. Lond. B Biol. Sci. 353, 1295–1306. doi: 10.1098/rstb.1998.0284
  • Treisman A. (1999). Solutions to the binding problem: review progress through controversy summary and convergence. Neuron 24, 105–110. doi: 10.1016/S0896-6273(00)80826-0
  • Treisman A. M., Gelade G. (1980). A feature-integration theory of attention. Cogn. Psychol. 12, 97–136. doi: 10.1016/0010-0285(80)90005-5
  • Troxler D. (1804). “Über das Verschwindern gegebener Gegenstande innerhalb unsers Gesichtskreises,” in Ophthalmologisches Bibliothek, eds Himly K., Schmidt J. A. (Jena: Fromman), 1–119.
  • Van Boxtel J. J. A., Tsuchiya N., Koch C. (2010). Consciousness and attention: on sufficiency and necessity. Front. Psychol. 1:217. doi: 10.3389/fpsyg.2010.00217
  • Vuilleumier P., Armony J. L., Driver J., Dolan R. J. (2003). Distinct spatial frequency sensitivities for processing faces and emotional expressions. Nat. Neurosci. 6, 624–631. doi: 10.1038/nn1057
  • Watzl S. (2011a). The nature of attention. Philos. Compass 6, 842–853. doi: 10.1111/j.1747-9991.2011.00433.x
  • Watzl S. (2011b). The philosophical significance of attention. Philos. Compass 6, 722–733. doi: 10.1111/j.1747-9991.2011.00432.x
  • Wu W. (2011). Confronting many-many problems: attention and agentive control. Noûs 45, 50–76. doi: 10.1111/j.1468-0068.2010.00804.x
  • Zhang P., Jamison K., Engel S., He B., He S. (2011). Binocular rivalry requires visual attention. Neuron 71, 362–369. doi: 10.1016/j.neuron.2011.05.035

Psychology Dictionary

PERCEPTUAL CYCLE HYPOTHESIS

The theory that cognition guides perceptual exploration but is in turn changed by real-world encounters, producing a cycle of attention, cognition, comprehension, and the real world in which each influences the others.


ORIGINAL RESEARCH article

Investigating conversational dynamics in triads: effects of noise, hearing impairment, and hearing aids.

Eline Borch Petersen

  • WS Audiology, Lynge, Denmark

Communication is an important part of everyday life and requires a rapid and coordinated interplay between interlocutors to ensure a successful conversation. Here, we investigate whether increased communication difficulty caused by additional background noise, hearing impairment, and not providing adequate hearing-aid (HA) processing affected the dynamics of a group conversation between one hearing-impaired (HI) and two normal-hearing (NH) interlocutors. Free conversations were recorded from 25 triads communicating at low (50 dBC SPL) or high (75 dBC SPL) levels of canteen noise. In conversations at low noise levels, the HI interlocutor was either unaided or aided. In conversations at high noise levels, the HI interlocutor either experienced omnidirectional or directional sound processing. Results showed that HI interlocutors generally spoke more and initiated their turn faster, but with more variability, than the NH interlocutors. Increasing the noise level resulted in generally higher speech levels, but more so for the NH than for the HI interlocutors. Higher background noise also affected the HI interlocutors’ ability to speak in longer turns. When the HI interlocutors were unaided at low noise levels, both HI and NH interlocutors spoke louder, while receiving directional sound processing at high levels of noise only reduced the speech level of the HI interlocutor. In conclusion, noise, hearing impairment, and hearing-aid processing mainly affected speech levels, while the remaining measures of conversational dynamics (FTO median, FTO IQR, turn duration, and speaking time) were unaffected. Hence, although experiencing large changes in communication difficulty, the conversational dynamics of the free triadic conversations remain relatively stable.

1 Introduction

Living with hearing loss affects not only the ability to hear but also the way a person interacts with others in communication situations. Communication is a core activity in our everyday lives and relies on the ability to switch rapidly and continuously between listening and talking. Conversations are complex interactions consisting of linguistic, auditory, and visual components that can be adapted to overcome communication challenges. For example, it has been known for more than 100 years that humans adapt their speech when communicating in noise (Lombard speech) by increasing the intensity, pitch, and duration of words (Lombard, 1911; Junqua, 1996). Similarly, it has been observed that when communicating with an older hearing-impaired (HI) interlocutor, younger normal-hearing (NH) interlocutors speak louder in quiet and noisy situations, reduce their articulation rate, and alter the spectral content of their speech (Hazan and Tuomaine, 2019; Sørensen et al., 2019; Beechey et al., 2020b; Petersen et al., 2022). These changes suggest that the NH interlocutors adapt their speech to alleviate the communication difficulty experienced by their HI communication partner. Furthermore, it has been observed that when HI interlocutors are provided with hearing aids (HAs), they reduce the duration of their utterances (inter-pausal units), speak faster (higher articulation rate), and decrease their speech level (Beechey et al., 2020a; Petersen et al., 2022). Additionally, when the HI interlocutor is aided, the NH interlocutor also decreases their speech level despite not directly experiencing any alteration in the communication difficulty (Beechey et al., 2020a; Petersen et al., 2022).

Another aspect of a conversation is the interactive turn-taking between interlocutors and the timing of the turn-starts, denoted as floor-transfer offsets (FTOs). Despite taking at least 600 ms to physically produce a verbal response (Indefrey and Levelt, 2004; Magyari et al., 2014), turns are generally initiated after a short pause of around 200 ms (Stivers et al., 2009), indicating that turn-ends must be predicted to initiate a fast response (Bögels et al., 2015; Gisladottir et al., 2015; Levinson and Torreira, 2015; Barthel et al., 2016; Corps et al., 2018). Impaired hearing causes the talker to initiate their turns in a less well-timed manner, evident from a larger variability in their FTOs compared to NH interlocutors (Sørensen, 2021; Petersen et al., 2022). When receiving HA amplification, the FTOs of HI interlocutors become less variable, indicating that some of their communication difficulty is relieved. This allows them to provide more well-timed verbal responses (Petersen et al., 2022).
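As a rough illustration of how the median and variability of FTOs could be extracted from a list of turns, here is a minimal Python sketch. The turn representation, the function names, the toy data, and the sign convention (next talker's start minus previous talker's end, so gaps are positive and overlaps negative) are assumptions made for illustration; they are not the study's processing pipeline.

```python
import statistics

def floor_transfer_offsets(turns):
    """Return FTOs in seconds for every change of talker.

    `turns` is a list of (talker, start_s, end_s) tuples sorted by start time.
    Assumed convention: FTO = start of next turn - end of previous turn,
    so a gap is positive and an overlap is negative.
    """
    ftos = []
    for (prev_talker, _, prev_end), (next_talker, next_start, _) in zip(turns, turns[1:]):
        if next_talker != prev_talker:  # same-talker continuations are not floor transfers
            ftos.append(next_start - prev_end)
    return ftos

def median_and_iqr(values):
    """Median and interquartile range (Q3 - Q1) of a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return statistics.median(values), q3 - q1

# Hypothetical toy conversation between the three interlocutor types.
example_turns = [
    ("HI", 0.0, 2.0),
    ("YNH", 2.2, 4.0),   # starts 0.2 s after the HI turn ends (gap)
    ("ONH", 3.9, 6.0),   # starts 0.1 s before the YNH turn ends (overlap)
    ("HI", 6.3, 8.0),
    ("HI", 8.5, 9.0),    # same talker resumes: not counted as a floor transfer
]

ftos = floor_transfer_offsets(example_turns)
fto_median, fto_iqr = median_and_iqr(ftos)
print(ftos, fto_median, fto_iqr)
```

In a real analysis the FTOs would presumably be grouped by the talker taking the floor, so that HI and NH interlocutors can be compared on both measures.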

The studies referred to above showed that the added communication difficulty experienced by HI interlocutors affected not only the dynamics of their own speech but also that of their NH conversational partner (Hazan and Tuomaine, 2019; Beechey et al., 2020b; Sørensen, 2021; Petersen et al., 2022). However, in those studies, conversations were initiated using communication tasks (Diapix or puzzle), which required active participation and interactive exchange of information between two interlocutors (Baker and Hazan, 2011; Beechey et al., 2018). In the current study, we investigated whether the conversational dynamics are affected in a similar manner if the conversation is less task-bound and occurs between three interlocutors. Conversations between two NH interlocutors and one HI interlocutor were conducted at two different noise levels. At the low level of noise, the HI interlocutor was either unaided or aided with an HA, while at the high level of noise, they received omnidirectional or directional sound processing. This unbalanced study design was chosen because previous studies suggested that HA amplification affected the conversational dynamics, specifically the speech levels, when communicating in quiet (Petersen et al., 2022). At high levels of background noise, HA amplification ensures audibility, but not intelligibility, for the HI interlocutor. Hence, the effect of reducing background noise through directional sound processing is investigated at high levels of background noise.

In the current study, increased communication difficulty, caused by hearing impairment, higher noise levels, or suboptimal HA signal processing, is expected to result in 1) longer and 2) more variable FTO values (median and interquartile range), 3) longer turn durations, 4) higher speech levels, and 5) increased speaking time for the HI interlocutor specifically. By asking the interlocutors to subjectively evaluate their active participation in the conversation and their perceived use of listening/talking strategies, it was investigated whether any alterations in the conversational dynamics were perceived or deliberately used by the interlocutors.

Focusing on the conversational dynamics of a group, rather than a two-person conversation, posed some methodological considerations on how to determine the communication states of the conversation and how to account for pauses made within a talker’s own turn. Some of these considerations and post-processing steps applied before extracting the five features of the conversational dynamics listed above are described in the section Quantifying Turn-taking in Group Conversations.

2.1 Participants

Conversations were recorded from 25 groups of three interlocutors fluent in Danish: one older hearing-impaired (HI) interlocutor, one older normal-hearing (ONH) interlocutor, and one younger normal-hearing (YNH) interlocutor. The HI participants were recruited from an internal database of HI test subjects, while all the NH interlocutors were recruited internally among employees at WS Audiology, Lynge, Denmark. All normal-hearing interlocutors passed a hearing screening at 20 dB HL at 500, 1,000, 2,000, and 4,000 Hz, except two ONH participants with a threshold of 30 dB HL in one ear at 4,000 Hz. All YNH participants were below 35 years of age (mean = 27.2, sd = 5.2, 14 female participants). All older participants (ONH and HI) were required to be older than 50 years of age, but the HI participants (mean = 75.8, sd = 6.5, 9 female participants) were significantly older than the ONH participants (mean = 54.8, sd = 3.7, 15 female participants, t(48) = −14.1, p < 0.001). The YNH participants were significantly younger than the ONH and HI participants (p < 0.001).
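The reported age comparison can be checked approximately from the summary statistics alone. The sketch below assumes 25 participants per group (one HI and one ONH interlocutor in each of the 25 triads) and a pooled-variance independent-samples t-test, which is consistent with the reported 48 degrees of freedom; the function name is hypothetical, and the test actually used in the study is not stated in this section.

```python
import math

def pooled_t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Independent-samples t statistic (pooled variance) from summary statistics."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df

# ONH vs. HI ages as reported in the text; n = 25 per group is an assumption.
t_stat, df = pooled_t_from_summary(54.8, 3.7, 25, 75.8, 6.5, 25)
print(f"t({df}) = {t_stat:.2f}")  # roughly -14.0 with df = 48
```

The result lands close to the reported t(48) = −14.1; a small difference is expected because the published means and standard deviations are rounded.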

The HI participants had mild-to-moderate symmetrical hearing loss (Figure 1A, pure-tone average across 500, 1,000, 2,000, and 4,000 Hz of 48.9 dB HL, sd = 6.1 dB HL) and were experienced hearing-aid users (>1 year of hearing-aid usage).


Figure 1. Participants’ audiogram and experimental setup. (A) Individual pure-tone hearing thresholds for all HI participants averaged across ears (thin gray lines), participants (bold purple line), and the standard deviation (shaded purple area). (B) Experimental setup with the three participants seated equally spaced around a table with a diameter of 1.2 m. A loudspeaker is placed 2.2 m directly in front of each participant. (C) Attenuation of white noise when applying the directional sound processing experienced by the HI interlocutor at a high level of background noise (75 dBC, dir condition). The attenuation, in dB, indicated by concentric circles for different frequencies (line types and shading of gray), is shown for different azimuth angles. Note that the attenuations depicted for the negative azimuth angles were recorded from the left HA, while the attenuations at positive angles were recorded from the right HA.

The triads were grouped at random, ensuring that the YNH and ONH did not work closely together at WS Audiology. Across the 25 triads, 19 had interlocutors of mixed genders, while 2 had only male and 4 only female participants.

All participants gave their written informed consent, and the study was approved by the regional ethics committee (Board of Copenhagen, Denmark, reference H-20068621).

2.2 Experimental setup

The experiment was conducted in a meeting room at WS Audiology, with the participants seated at a round table (Figure 1B). The positions of the YNH, ONH, and HI at the table were balanced across triads. Three loudspeakers were used, one placed 2.2 m directly in front of each participant (Figure 1B). The background noise presented by the loudspeakers was spatially recorded noise from the canteen of WS Audiology, which was presented at either 50 or 75 dBC SPL.

The speech of each interlocutor was recorded individually using a directional headset microphone (DPA 4088, Allerød, Denmark). All sounds were presented and recorded via customized Matlab scripts (2018a) at a sampling frequency of 44.1 kHz. As the headsets were not easily calibrated, individual 5-s speech signals were recorded from the headset, as well as from a calibrated omnidirectional reference microphone (B-5, Behringer, Willich, Germany) placed at the center of the table. By combining the attenuation of the speech signal from the headset to the reference microphone with a calibrated reference signal recorded from the reference microphone, it was possible to compute the conversational speech levels recorded from the headset in dB SPL.
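One way to read this calibration procedure is as two level offsets applied to the headset recording. The Python sketch below is an interpretation under stated assumptions, not the study's Matlab code: the function names, the 94 dB SPL calibration level, and the toy signals are all hypothetical, and the result is the level as it would be measured at the reference-microphone position.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def to_db(amplitude):
    """Amplitude ratio expressed in decibels."""
    return 20.0 * math.log10(amplitude)

def headset_level_db_spl(conv_headset, speech_headset, speech_ref, cal_ref, cal_level_db_spl):
    """Estimate the level of a headset recording in dB SPL (illustrative only).

    conv_headset:     conversational speech recorded by the headset
    speech_headset:   short speech sample recorded by the headset
    speech_ref:       the same speech sample recorded by the reference microphone
    cal_ref:          calibration signal recorded by the reference microphone
    cal_level_db_spl: known level of the calibration signal (assumed value)
    """
    # Offset that maps reference-microphone digital levels to dB SPL.
    ref_offset = cal_level_db_spl - to_db(rms(cal_ref))
    # Level difference between headset and reference microphone for the same speech.
    headset_to_ref = to_db(rms(speech_headset)) - to_db(rms(speech_ref))
    # Remove the headset-to-reference difference, then apply the calibration offset.
    return to_db(rms(conv_headset)) - headset_to_ref + ref_offset

# Hypothetical square-wave signals with known RMS values.
cal_ref = [0.1, -0.1] * 100          # calibration tone at an assumed 94 dB SPL
speech_headset = [0.5, -0.5] * 100   # headset is 20 dB hotter than the reference
speech_ref = [0.05, -0.05] * 100
conv_headset = [0.25, -0.25] * 100

level = headset_level_db_spl(conv_headset, speech_headset, speech_ref, cal_ref, 94.0)
print(f"{level:.1f} dB SPL")
```

Using a per-participant speech sample for the headset-to-reference difference accounts for differences in headset placement between talkers, which may be why individual recordings were made.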

2.3 Conversational task

To ensure a natural and free conversation between the three previously unacquainted participants in each triad, two conversational types were used: Consensus questions (e.g., Can you come up with a three-course dinner consisting only of dishes none of you likes) and picture cards with three keywords (e.g., a picture of a crowd at a festival with the keywords festivals, music, and summer). These two ways of initiating a conversation have previously been tested and found to spark natural and balanced conversations between interlocutors ( Petersen et al., 2022 ). The test leader showed the picture or read the consensus question aloud before each 5-min conversation. The pictures and questions could be used to guide the upcoming conversation, but the triads were instructed that deviation from the topic/question was allowed. The participants were not instructed to behave or speak in a particular manner but to act as naturally as possible.

2.4 Hearing-aid fitting

The HI participants were equipped with Signia Pure 312 7X receiver-in-the-canal HAs with M-receivers and closed-sleeve instant domes fitted with the NAL-NL2 prescription rule ( Keidser et al., 2011 ). No further fine-tuning, feedback tests, or real-ear measurements were performed. The frequency-based noise reduction system was disabled in the fitting software (Connexx version 9.6.6.488, WS Audiology), and two programs were made: One with omnidirectional and another with directional sound processing.

During conversations with a low level of background noise (50 dBC), the HI interlocutor was either not wearing HAs (denoted the unaided condition) or wearing HAs with omnidirectional sound processing (denoted the aided condition). Before the unaided conversations, the HAs were removed by the test leader in a discreet fashion to avoid notifying the NH conversational partners. Furthermore, the HI participants had been instructed not to notify the NH conversational partners that they were unaided.

During conversations with a high level of background noise (75 dBC), the HAs worn by the HI interlocutors were either providing omnidirectional sound processing (denoted omni, settings identical to the aided condition) or directional sound processing (denoted dir) designed to suppress noise sources based on their spatial position. The directional attenuation pattern was fixed using the Signia App (provided by Sivantos Pte. Ltd), controlled by the test leader, in which the pattern was set to the narrowest beam possible, providing 10–15 dB attenuation of white noise presented from directions beyond +/−45 degrees azimuth ( Figure 1C ).

2.5 Experimental procedure

Before the actual experiment, the triads did two 5-min training conversations. The training served to introduce the two conversational types, to acquaint the participants with each other, and to introduce the background noise used during the experiment. During the first training round, participants were given a consensus question to discuss in quiet; during the second training round, they discussed a picture card in canteen noise presented at 60 dBC, a noise level between the low and high levels of noise used during the actual experiment.

A total of 12 experimental conversations were recorded from each triad in the four different experimental conditions (unaided and aided in low noise and omnidirectional and directional processing in high noise), each repeated three times. The order of the four conditions was balanced within three blocks, while the conversational types were balanced across conditions within each triad. The participants had a mandatory break after six experimental conversations.

After each conversation, all participants provided individual subjective ratings of their active participation and the perceived usage of listening/talking strategies. All participants answered the question, ‘ If the conversation would have taken place in quiet, I would have participated: Put a cross on the scale’ , with the scale ranging from 0 (a lot less active) to 10 (a lot more active), with 5 indicating the same perceived activity level as if the conversation was being held in quiet. The formulation of the second question differed depending on hearing status: Both questions started with ‘ In comparison to a conversation in quiet, to which degree do you feel the noise made you … ’, with the ONH and YNH being asked ‘ change the way you communicated , e.g. , by changing the way you expressed yourself, used your voice, or body language?’ , while the formulation to the HI was ‘use listening tactics, such as asking for repeats, asking to speak up or turning your better ear to the speaker?’ For both questions, the scale ranged from 0 (no change) to 10 (a lot of change).

2.6 Statistical analysis

The effects of the experimental contrasts on the measures of conversational dynamics were investigated through linear mixed-effects models (LMERs) using the lme4 package for R ( Bates et al., 2015 ). The experimental design of the current study cannot be treated as a 2×2 design because the HA conditions differ between the noise conditions (low noise levels: unaided and aided/omni; high noise levels: omni/aided and dir). For this reason, each of the three experimental contrasts (background noise level, providing HA amplification at low noise levels, and providing directional processing at high noise levels) was tested in a separate LMER model.

All models included the fixed effects hearing status (HI, YNH, and ONH), experimental contrasts (two conditions for each contrast, see details below), and their interaction effect, with a random intercept of triad and person varying within the triad, i.e., x ~ hearing + conditions + hearing:conditions + (1 | triad/person) . When testing the effect of the experimental contrast background noise (low vs. high levels of noise), the two conditions included were aided and omni. When testing the effect of providing HA amplification during low levels of noise, the conditions were unaided and aided, and finally, the effect of the experimental contrast directional sound processing was investigated by comparing the conditions omni and dir during high levels of background noise. The predicted variable x in the statistical model will be the five measures of the conversational dynamics and two subjective ratings of the conversation. The extraction of the five measures of conversational dynamics is described in detail in the following section.
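Written out, the model formula above corresponds to a linear mixed-effects model of the following form. The symbols are introduced here for illustration only and are not the authors' notation: $t$ indexes triads, $p$ persons within a triad, $c$ the two conditions of the contrast, $r$ repetitions, and $h(p)$ the hearing status of person $p$.

```latex
x_{tpcr} = \beta_0 + \beta^{H}_{h(p)} + \beta^{C}_{c} + \beta^{HC}_{h(p),c}
           + u_{t} + v_{p(t)} + \varepsilon_{tpcr},
\qquad
u_{t} \sim \mathcal{N}(0, \sigma_{u}^{2}), \quad
v_{p(t)} \sim \mathcal{N}(0, \sigma_{v}^{2}), \quad
\varepsilon_{tpcr} \sim \mathcal{N}(0, \sigma^{2})
```

Here $u_t$ is the random intercept of the triad and $v_{p(t)}$ the random intercept of a person nested within that triad, which is how the term (1 | triad/person) expands.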

3 Quantifying turn-taking in group conversations

The focus of the following section is on the methodological considerations of how to perform voice activity detection, determine the communication states when three interlocutors, instead of two, are interacting, and how to deal with pauses made within one talker’s own turn. The final part of this chapter will provide a detailed description of the features of the conversational dynamics used in the current study.

3.1 Voice activity detection of individual interlocutors

Quantifying the conversational dynamics requires knowing when each interlocutor is speaking, e.g., by performing individual voice activity detection (VAD). VAD can be done automatically, either using simple methods based on short-term energy changes and thresholding or using more advanced neural network implementations ( Sharma et al., 2022 ). Accurate VADs are important when computing the features characterizing conversational dynamics to reliably identify the beginning and end of all utterances.

One major issue in the application of automatic VADs is crosstalk, i.e., speech from the conversational partners is audible in the recording of the targeted interlocutor. Due to the distance between talkers and the directionality of the headsets worn by the interlocutors, the amplitude level of the crosstalk is generally lower than speech from the targeted talker. However, natural speech has a large dynamic range. At low noise levels (50 dBC), the speech volume of single utterances ranged from 25.9 to 84.1 dB SPL (across all talkers); however, an average of 12.4% of all intervals without speech (background noise, crosstalk, and artifacts) exceeded the minimum speech level. At the high level of background noise (75 dBC), the speech volume ranged from 33.5 to 88.0 dB SPL, but a significantly lower percentage of the background noise exceeded the minimum speech level (7.8%, F(1,877) = 29.3, p < 0.001). When testing the performance of various energy-based VAD approaches on data from the current study, this ~10% overlap between targeted interlocutor speech and non-speech caused unreliable VAD detections, including false-positive and false-negative detections.

For the current study, no automatic algorithm was identified that could provide a reliable VAD detection without erroneously labeling crosstalk as speech or vice versa. Hence, the VAD was performed manually based on the following rules: 1) All utterances should be labeled, including laughing, but excluding breaths and sighs; 2) Pauses between utterances shorter than 180 ms should be marked as speech to avoid cutting off stop closures ( Heldner and Edlund, 2010 ); 3) Utterances shorter than 90 ms should not be marked, as they are not assumed to be speech ( Heldner and Edlund, 2010 ).
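Although the labeling in the study was done by hand, rules 2 and 3 can be expressed as a small clean-up step on a list of labeled intervals. The sketch below is illustrative only: the function name, the representation of utterances as (start, end) tuples in seconds, and the choice to bridge short pauses before discarding short utterances are assumptions, not part of the described procedure.

```python
def apply_vad_rules(intervals, min_pause=0.180, min_utterance=0.090):
    """Apply the duration rules to one talker's labeled utterances.

    intervals     : list of (start, end) times in seconds
    min_pause     : pauses shorter than this are marked as speech (rule 2)
    min_utterance : utterances shorter than this are discarded (rule 3)
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] < min_pause:
            # Pause is too short to count: extend the previous utterance.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    # Assumed order: short utterances are removed after bridging.
    return [(s, e) for s, e in merged if e - s >= min_utterance]

labels = [(0.0, 0.5), (0.6, 1.0), (2.0, 2.05), (3.0, 3.5)]
print(apply_vad_rules(labels))
```

In the toy input, the 100-ms pause between the first two utterances is bridged, and the 50-ms utterance is dropped.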

3.2 Determining the conversational states

From the binary output of the individual VADs (1 = interlocutor speech, 0 = not interlocutor speech), the conversational states, i.e., the organization of turns between interlocutors, must be determined before extracting the features of the conversational dynamics.

Before determining the conversational states, all instances of laughter were removed from the output-VAD because laughing does not constitute a wish from the interlocutor to “take the floor” ( Heldner and Edlund, 2010 ). Across all interlocutors, between 0 and 16 instances of laughter were removed per conversation (on average 0.60 laughs/min). Note that since laughing often manifests as short bursts separated by unvoiced silence, consecutive bursts of laughter were grouped into one instance of laughing.

Following the procedure proposed by Heldner and Edlund for two-talker conversations ( Heldner and Edlund, 2010 ), it is possible to categorize conversations into the following states (see Figure 2A ): A break in a talker’s utterance without a change of turn is called a pause , while a turn-taking between talkers (a floor-transfer) can either happen after a gap or in an overlap between (overlapB) speech. Finally, an utterance can happen simultaneously with an ongoing turn creating an overlap within (overlapW) other interlocutors’ turns. This general procedure can also be applied to triadic conversations when only two of the three interlocutors are active. However, if all three interlocutors are active at the same time, the resulting multiple overlaps will cause one utterance to be assigned to multiple conversational states (see text below and Figure 2 for more detail). In the current study, we wish to determine the conversational states of the entire conversation, meaning that each utterance should only have a single conversational state. As detailed below, this requires adding a few exceptions to the procedure proposed by Heldner and Edlund.

Figure 2 . Illustration of conversational states of a three-talker conversation (T1–T3). (A) Most states occur between two of the three talkers and are identical to the states observed in two-talker conversations, i.e., the turn-taking happens in an overlap between (overlapB) T1 and T2, in a gap between T2 and T3, or speech (T3) can completely overlap within (overlapW) the turn of another talker (T1). Dotted vertical lines indicate which talker the floor is transferred to and the floor-transfer offset (turn-taking) times of gaps or overlapBs used in the analysis. (B) When three talkers consecutively take turns in overlap (overlapBs), it can create instances where one talker (T3) has an overlapB between both remaining talkers (T1, dotted gray area, and T2, gray area). An utterance (T1) can also overlapW speech of remaining talkers (T2 and T3). (C) An utterance (T3) can overlapB one talker (T1) but overlapW another (T2) at the same time. In the three examples of (B,C) , the final conversational state of an utterance is determined by which talker initiated their utterance first (see details in the text).

Figure 2 illustrates the three examples where overlapping utterances fall into two conversational states. In Figure 2B , Talker 2 and Talker 3 (T2 and T3) start an utterance that overlaps T1 (overlapB). As T3 initiates the utterance later than T2, an additional overlapB between T2 and T3 occurs (indicated with light gray in Figure 2B ). In this case, the turn should be transferred from T1 to T2 and then from T2 to T3. The resulting duration of the overlapB included in the analysis of the floor-transfer offsets is indicated with dotted vertical lines in Figure 2 . Figure 2B also shows the example of an utterance made by T1 in overlapW with the speech of both T2 and T3. This utterance is classified as one overlapW and is always said to overlap within the speech of the talker who first initiated their turn, in this case, T3 ( Figure 2B ). It is also possible for an utterance to be classified as both an overlapB and overlapW, as illustrated for T3 in Figure 2C . As T2 initiates a turn first (in an overlapB T1), the utterance by T3 ends up overlapping within (overlapW) the turn of T2 and between (overlapB) the turn of T1. In this case, the utterance of T3 is classified as an overlapW of the speech of T2, as T2 initiated the speech before T3.

Across all the conversations, an average of 1.28 utterances/conversation was corrected for having two overlapBs (illustrated in Figure 2B , range 0–6/conversation). An average of 0.04 utterance/conversation was corrected for multiple overlapWs (range 0–1/conversation, Figure 2B ), while an average of 1.24 utterance/conversation was corrected for being overlapB and overlapW (range 0–7/conversation, Figure 2C ).
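The basic state definitions used in this section (pause, gap, overlapB, and overlapW) can be illustrated with a simplified classifier. This is a sketch under assumptions: the function name, the (talker, start, end) representation, and the bookkeeping of a single floor holder are hypothetical, and the three-talker exceptions for multiple simultaneous overlaps described above are not implemented.

```python
def classify_states(utterances):
    """Label each utterance after the first as pause, gap, overlapB, or overlapW.

    utterances : list of (talker, start, end) in seconds
    Returns a list of (state, talker, offset), where offset is the
    floor-transfer offset in seconds for gaps (positive) and overlapBs
    (negative), the pause duration for pauses, and None for overlapWs.
    """
    events = []
    floor_holder, floor_end = None, None
    for talker, start, end in sorted(utterances, key=lambda u: (u[1], u[2])):
        if floor_holder is None:
            floor_holder, floor_end = talker, end
        elif talker == floor_holder:
            if start > floor_end:
                # Silence within one talker's turn, no change of floor.
                events.append(("pause", talker, round(start - floor_end, 3)))
            floor_end = max(floor_end, end)
        elif end <= floor_end:
            # Utterance lies entirely within the ongoing turn.
            events.append(("overlapW", talker, None))
        else:
            # Floor transfer: in overlap if it starts before the turn ends.
            state = "overlapB" if start < floor_end else "gap"
            events.append((state, talker, round(start - floor_end, 3)))
            floor_holder, floor_end = talker, end
    return events

example = [("T1", 0.0, 2.0), ("T2", 1.8, 4.0), ("T1", 2.5, 3.0),
           ("T1", 4.3, 6.0), ("T1", 6.5, 7.0)]
for event in classify_states(example):
    print(event)
```

The sign convention follows the floor-transfer offsets used in the analysis: a positive offset is a gap and a negative offset is an overlap between turns.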

3.3 Correcting pauses within turns

Upon inspecting the conversational states and turn-taking resulting from the procedure described in the previous paragraph, it was evident that further processing was needed to capture the dynamics of the conversations. Figure 3 illustrates a typical exchange observed in the triadic conversation: T1 is speaking but receives verbal feedback (denoted backchannels) from both conversational partners (T2 and T3) within natural pauses occurring within the turn of T1. When following the rules for determining the conversational states (see previous section), the example provided in Figure 3 results in six turn-takings (solid orange line). However, considering that the definition of a backchannel is that it does not signal a wish from the talker to take the turn ( Yngve, 1970 ), the timing of backchannels does not have to follow the same social rules as the timing of a turn. Indeed, it has been observed that for utterances made in overlap (overlapW and overlapB), 73% of them are backchannels ( Levinson and Torreira, 2015 ).

Figure 3 . Example of post-processing of turn-taking from a conversation between three talkers (T1–T3). Individual VADs from an excerpt of a conversation (transcription on top) are indicated with fully colored blocks. Based on these, there are six resulting turn-takings (full orange line) between the interlocutors. After post-processing the VADs by bridging pauses within a talker’s own speech shorter than 1 s (dotted blue area), the number of turn-takings is reduced to one, as indicated by the dotted orange line.

To get a better estimate of the true number of turns and their timing in the triadic conversations, post-processing of the output of the VADs was performed to connect utterances constituting a single turn. To this end, any pauses within a talker’s speech shorter than 1 s were bridged such that the pauses were considered speech. This was done under the assumption that if a talker pauses for less than 1 s, the intention was not to end but to continue the turn. In the example provided in Figure 3 , the bridging of pauses reduces the number of turn-takings from six to one.

An average of 7.4 pauses/min were bridged per interlocutor. The conversational states of the post-processed output of the VADs were determined. As expected, bridging the pauses increased the number of utterances overlapping the ongoing turn (overlapW) by an average of 0.60 more overlapW per minute conversation relative to the output of the original VAD.
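The effect of bridging on backchannels can be illustrated with a toy example in the spirit of Figure 3. The sketch below is not the authors' code: the function names, the interval representation, and the example timings are assumptions chosen for illustration.

```python
def bridge_pauses(intervals, max_pause=1.0):
    """Merge one talker's utterances separated by pauses shorter than max_pause."""
    bridged = []
    for start, end in sorted(intervals):
        if bridged and start - bridged[-1][1] < max_pause:
            bridged[-1] = (bridged[-1][0], max(bridged[-1][1], end))
        else:
            bridged.append((start, end))
    return bridged

def count_contained(turns, others):
    """Count utterances in `others` that lie fully within one of `turns`."""
    return sum(
        any(s >= ts and e <= te for ts, te in turns) for s, e in others
    )

# T1 speaks with two short pauses; the partners give feedback in the pauses.
t1 = [(0.0, 2.0), (2.6, 4.0), (4.7, 6.0)]
backchannels = [(2.1, 2.5), (4.1, 4.6)]

print(count_contained(t1, backchannels))                 # before bridging
print(count_contained(bridge_pauses(t1), backchannels))  # after bridging
```

Before bridging, each backchannel falls in a silence and would be treated as a floor transfer; after bridging, both lie within the single turn of T1, which is why the number of overlapW utterances increases.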

3.4 Features of the conversational dynamics

The dynamics of a conversation can be described by different measures extracted from the individual utterances and conversational states. In the current study, a total of five measures were extracted:

From the individual utterances, the 1) speech levels, defined as the RMS of all utterances, were extracted and scaled using a calibration recording to get the level in dB SPL ( Petersen et al., 2022 ). To avoid including periods of pauses within turns, the speech level was extracted by concatenating utterances of the original VADs, i.e., prior to performing the post-processing described above. From the post-processed individual VADs, the 2) median turn duration was extracted, while 3) the percentage speaking time was extracted as the percentage of the 5-min recording where the interlocutor was talking. As such, the percentage speaking time across the three interlocutors of the conversation can exceed 100% due to overlapB and overlapW.

From the conversational states, the FTOs were extracted by combining gaps and overlapBs to generate the FTO distribution. From the FTO distribution, the 4) median and 5) variability, quantified by the interquartile range (IQR), were extracted as measures of the turn-taking timing.
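As a rough illustration of measures 3) to 5), the sketch below computes the percentage speaking time from a list of turns and the median and IQR from a list of FTOs. The function names and example values are hypothetical, and the quartile interpolation method is an assumption, since the text does not specify one.

```python
import statistics

def fto_summary(ftos_ms):
    """Return (median, IQR) of a list of floor-transfer offsets in ms.

    Gaps are positive and overlapBs are negative values.
    Quartiles use the 'inclusive' method, which is an assumption here.
    """
    q1, q2, q3 = statistics.quantiles(ftos_ms, n=4, method="inclusive")
    return q2, q3 - q1

def percent_speaking_time(turns, total_s=300.0):
    """Percentage of the recording (default 5 min) in which a talker speaks."""
    return 100.0 * sum(end - start for start, end in turns) / total_s

ftos = [-400, -100, 100, 200, 300, 500, 900]  # made-up values in ms
median_fto, iqr_fto = fto_summary(ftos)
print(median_fto, iqr_fto)
print(percent_speaking_time([(0.0, 30.0), (60.0, 120.0)]))
```

Because each talker's percentage is computed independently, the values of the three interlocutors can sum to more than 100% when their turns overlap.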

Furthermore, two subjective evaluations were made for each interlocutor after each conversation regarding 6) the level of activity (participation) and 7) the application of listening (for HI) or talking (YNH and ONH) strategies.

4 Results

The fixed effects of hearing status (HI, ONH, and YNH) and experimental contrasts (noise level, HA amplification, and HA directionality) were investigated for the five measures of conversational dynamics and the two subjective ratings made by each interlocutor after each conversation. All statistical results are presented in Table 1 . In the visualizations of results, the main effects of conditions and interactions between hearing status and conditions are shown using lines and asterisks, respectively, indicating the level of significance (*** p < 0.001, ** p < 0.01, * p < 0.05).

Table 1 . Statistically significant effects are highlighted in bold writing. The relevant post-hoc results are presented in italics below the significant fixed effect, indicating the contrasts, estimated difference, and p -values.

4.1 Floor-transfer offsets

Across all interlocutors and conditions, the FTO distribution peaked at 208 ms ( Figure 4A ), i.e., interlocutors tended to start their turn after a short gap. For each interlocutor and conversation, the FTO distribution was formed, and the median and the interquartile range (IQR) were extracted. For all experimental contrasts, a significant effect of hearing status was observed on the median FTO ( Table 1 ; Figure 4B ). The HI interlocutors initiated their turns on average 79 ms faster than the YNH and ONH interlocutors at high noise levels. At low levels of noise, the HI interlocutors initiated their turns faster than the YNH (124 ms), but the 79 ms difference between HI and ONH was not significant ( p  = 0.08).

Figure 4 . Floor-transfer offset (FTO) distribution and measures. (A) FTO distributions for HI (red), ONH (green), and YNH (blue) for all four conditions. Positive FTO values indicate turns initiated after a gap, while a negative value indicates an overlap between turns. The dotted vertical line indicates an FTO of 0 ms, i.e., neither gap nor overlap. (B) Median of the FTO distribution extracted for each interlocutor and conditions (averaged across repetitions). (C) Variability of the FTO distribution extracted as the interquartile range (IQR) for each interlocutor and conditions (averaged across repetitions). Here and in the following, the boxes indicate the 25th to 75th percentile, and the horizontal lines are the median. Whiskers extend the range of the data, and the dots highlight the outliers.

The FTO variability also showed an effect of hearing status ( Table 1 ; Figure 4C ), indicating that the spread of the HI interlocutors’ FTO distribution was ~130 ms larger than that of the YNH and ONH interlocutors at both high and low noise levels, although the difference between HI and ONH in the latter only approached significance ( p = 0.054).

4.2 Turn duration and speaking time

The median overall turn duration was 3.2 s. The main effect of hearing status in the model testing effect of increasing the noise level ( p  = 0.02, Table 1 ; Figure 5A ) suggests that HI interlocutors spoke in longer turns in general; however, this effect is driven by the significant interaction between noise and hearing ( p  = 0.03), revealing that the HI interlocutors only differ from the YNH and ONH at high levels of noise. This is confirmed by the main effect of hearing status in the conditions with high levels of noise where the HI interlocutors spoke 1.3 s longer than the ONH ( p  < 0.01) and 0.9 s longer than the YNH interlocutors ( p  = 0.02).

Figure 5 . Median turn duration, speaking time, and speech levels. (A) Median turn duration resulting from the post-processed VADs across conditions and interlocutor hearing status. (B) Percentage of the speaking time of the total conversation duration of 5 min. (C) Speech levels in dB SPL. Background noise levels (50 and 75 dBC) of the different conditions are indicated with dotted gray. Asterisks colored according to hearing status indicate the statistically significant results of the post-hoc testing of the interaction effect between hearing status and experimental condition.

No effect of hearing status was found on the turn duration at low background noise, although the result was close to significance ( p  = 0.06). In lower noise, the main effect of HA amplification on turn duration ( p  = 0.03) indicated that all interlocutors’ turns were on average 357 ms longer when the HI interlocutors were aided relative to unaided.

The percentage speaking time was affected by hearing status for all experimental contrasts, indicating that HI interlocutors spoke around 5% more than the ONH and around 8% more than the YNH interlocutors across all conditions ( Table 1 ; Figure 5B ). It should be noted that at a low level of background noise, the difference between HI and ONH only approached significance ( p  = 0.06). No difference in the speaking time between the YNH and ONH interlocutors was seen, although there was a non-significant tendency for the ONH to speak more than the YNH in high levels of noise ( p  = 0.06).

4.3 Speech level

The conversations held in 50 dBC noise were conducted at an SNR of +11.1 dB on average, while in 75 dBC noise, the SNR was reduced to −6.3 dB when averaging across interlocutors, repetitions, and HA settings.

The speech levels were affected by hearing status at high noise levels but not at low levels of noise ( Table 1 ; Figure 5C ). When increasing the noise level (aided vs. omni), the HI interlocutors increased their speech by around 1 dB, which is less than the ONH and YNH interlocutors. Consequently, during the high level of background noise, the ONH spoke 2.4 dB louder than the HI interlocutors ( p  < 0.01), while there was a non-significant trend for the YNH to speak 1.3 dB louder than the HI interlocutor in noise ( p  = 0.07). In terms of SNR, the HI interlocutors talked at −8.0 dB SNR on average at the highest level of noise, while the YNH was speaking at −5.8 dB SNR and the ONH at −5.1 dB SNR.

Significant effects of altering the HA processing were observed. At a low background noise level, all interlocutors spoke 0.8 dB louder when the HI interlocutor was unaided ( p  < 0.001, Table 1 ). Similarly, all interlocutors spoke on average 0.58 dB louder when the HI interlocutors were listening to the unprocessed omnidirectional sound input ( p  < 0.01). However, a significant interaction effect between hearing status and directional sound processing revealed that while the NH interlocutors generally spoke louder than the HI interlocutors at high noise levels, providing directional sound processing caused the HI interlocutors to reduce their speech level further by 1.3 dB ( p  < 0.001), while the speech levels of the NH interlocutors were unaffected (both p ’s < 0.09). As a result, the HI interlocutors reduced the SNR experienced by the NH interlocutors from −7.2 dB when listening to omnidirectional sound processing to −8.5 dB when receiving directional sound processing. The HI interlocutors experienced an SNR of −5.5 dB produced by the NH interlocutors in both conditions with high levels of background noise.

4.4 Subjective evaluations

After each conversation, the interlocutors were asked to subjectively rate their level of participation as well as their application of listening (for HI interlocutors) and talking (YNH and ONH interlocutors) strategies.

The subjective ratings of the level of participation showed no significant effect of hearing status ( Table 1 , data not shown, all p’s > 0.2), while increasing the noise level reduced the participation ratings by 0.4 points (p < 0.001).

Similarly, the subjective evaluation of the application of listening/talking strategies increased by 4.2 points when the noise level was increased ( Table 1 ; Figure 6 , p < 0.001). In the model testing the effect of noise level, the HI interlocutors rated their usage of listening strategies higher than the NH interlocutors rated their application of talking strategies ( Table 1 ; Figure 6 , both p ’s < 0.05), but the significant interaction effect between hearing status and background noise indicated that hearing status only affected the ratings at the low level of background noise (both p ’s < 0.001), whereas no differences were observed between HI and NH interlocutors at high noise level (all p ’s < 0.2).

Figure 6 . Subjective ratings. The subjective rating of how much the HI rated applying listening strategies relative to whether the conversation had been held in quiet. The NH rated how much they applied communication strategy relative to whether the conversation had been held in quiet. Ratings were performed on a continuous 11-point visual analog scale.

In low-level noise, the HI interlocutors rated using 3.0 points more strategy on average compared to NH listeners ( Table 1 ; Figure 6 , p < 0.001). Although a significant main effect of HA amplification ( p < 0.001) indicated a general 0.6-point decrease in applying strategies when the HI interlocutor was aided, the significant interaction between hearing status and HA amplification ( p < 0.001) revealed that the effect is driven by the HI interlocutors rating using 1.6 points less strategy when receiving HA amplification ( p < 0.001), whereas the YNH and ONH rated no changes in their application of talking strategies (both p ’s > 0.2).

5 Discussion

The current study investigated the effect of hearing status and three different experimental contrasts (background noise level, HA amplification, and HA directionality) on the dynamics of a group conversation between one HI and two NH interlocutors. We observed that being hearing impaired affected all measures of conversational dynamics, whereas HA processing and noise level primarily affected speech levels. The following discussion will focus on why the experimental contrasts did not affect the conversational dynamics as hypothesized.

5.1 Effect of noise and HA processing on the conversational dynamics

Only a few effects were observed when altering the three experimental contrasts: Increasing the background noise or altering the HI interlocutor’s auditory perception by providing either HA amplification or directional processing.

Beyond increases in speech levels ( Figure 5C ), the 25 dB increase in the level of the canteen noise did not have any effect on the conversational dynamics (no main effects of aided vs. omni, Table 1 ). The increased noise level caused interlocutors to speak on average 8.2 dB louder, resulting in a reduction in the communication SNR across interlocutors, from +11.1 dB in 50 dBC background noise to −6.3 dB SNR in 75 dBC background noise. This communication SNR is much in line with a previous study finding that dialogs between an HI and an NH interlocutor happened at −5 dB SNR in 77.3 dBA café noise ( Beechey et al., 2020b ). For comparison, the standardized Danish speech-in-noise tests find sentence intelligibility (without visual cues) to be lower than 50% for NH listeners at −5 dB SNR ( Nielsen and Dau, 2009 ; Bo Nielsen et al., 2014 ). It should be noted that in realistic everyday listening situations, communication SNRs below +5 dB SNR are rarely observed ( Smeds et al., 2015 ). Nevertheless, the result of the current study suggests that communication at −6.3 dB SNR was possible for both HI and NH interlocutors. This is evident from the fact that the overall percentage speaking time did not change when increasing the noise level, and the subjective participation ratings only decreased by 0.4 points on the 11-point scale. The negligible effect of increasing the noise level on the conversational dynamics could be caused by the access to visual cues, the spatial separation of the noise and interlocutors, and/or predictability of the conversational topic. The interlocutors not being able to increase their vocal intensity more, to improve the SNR beyond −6.3 dB, could be caused by the additional physical strain on the vocal cords associated with speaking at higher levels, causing a reduction in voice quality ( Södersten et al., 2005 ). Hence, the SNR of a conversation is likely a balance between speaking loud enough for communication to be successful while at the same time reducing the vocal effort.

As two of the three experimental contrasts (HA amplification and directional processing) were only experienced by the HI interlocutors, it is noteworthy that providing HA amplification affected turn duration and speech level for both the HI and NH interlocutors ( Table 1 ). All interlocutors shortened their turns by 357 ms on average, when the HI interlocutor was unaided ( Figure 5A ). This observation contradicts the hypothesis that communication difficulty would cause longer turns, as observed with the increased turn duration of the HI interlocutors. The effect of HA amplification on speech level will be discussed in detail in the section Speech Levels are Sensitive to all Experimental Contrasts.

5.2 Effects of hearing impairment on conversational dynamics

HI interlocutors were hypothesized to initiate their turns slower and with more variability because their impairment makes them worse at predicting turn-ends than the NH interlocutors ( Sørensen et al., 2019 ; Petersen et al., 2022 ). Although the HI interlocutors were found to initiate their turns with more variability than the NH interlocutors (higher FTO IQR, Figure 4C ), they were also observed to do that faster, not slower, than the NH interlocutors (lower medina FTO, Figure 4B ). Previous studies have focused on turn-taking in dyadic conversations; however, the presence of an additional interlocutor adds an element of competition to the conversation. Indeed, the many minds problem describes how the complexity and uncertainty of the turn-taking system increase when more than two interlocutors are conversing ( Cooney et al., 2020 ). To ensure getting the turn, interlocutors might be forced to initiate turns earlier in overlapBs. This could explain why the broader FTO distributions in the current study skewed toward negative values ( Figure 4A ) relative to FTO distributions of the dyadic conversations of previous studies ( Figure 2A of Petersen et al., 2022 , Figure 4 left in Sørensen et al., 2019 ). However, it should be noted that although the many minds problem can affect turn-taking, the post-processing of the VADs by bridging pauses also has a substantial effect on the turn-taking timing by occasionally causing utterances classified as overlaps within (overlapW) to be bridged with later utterances, resulting in larger negative FTO values (Section 3.2 Correcting Pauses Within Turns). 
Despite the influence of the post-processing step, it is nevertheless interesting to note that the peak of the overall FTO distribution, at 208 ms, is comparable to that of previous studies (~230 ms in Petersen et al., 2022 , ~275 ms in Sørensen et al., 2019 ), lending further support to the stability of the average turn being taken with a 200-ms gap ( Levinson and Torreira, 2015 ).
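To make the two turn-taking measures concrete, the following is a minimal sketch of how an FTO median and FTO IQR could be computed from a list of turns. It assumes the common definition of the floor-transfer offset as the start of the next talker's turn minus the end of the previous talker's turn, so that gaps are positive and overlaps are negative. The function names, the example turn times, and the choice of quartile method are illustrative assumptions; the sketch does not reproduce the study's VAD post-processing, pause bridging, or classification of overlaps within.

```python
from statistics import median, quantiles


def floor_transfer_offsets(turns):
    """Return FTOs in milliseconds for each change of talker.

    `turns` is a list of (speaker, start_s, end_s) tuples sorted by start time.
    FTO = start of the next turn minus end of the previous turn:
    positive values are gaps, negative values are overlaps.
    """
    ftos = []
    for (prev_spk, _, prev_end), (next_spk, next_start, _) in zip(turns, turns[1:]):
        if next_spk != prev_spk:  # only count actual floor transfers
            ftos.append(round((next_start - prev_end) * 1000))
    return ftos


def fto_summary(ftos):
    """Median and interquartile range of a list of FTOs (ms)."""
    q1, _, q3 = quantiles(ftos, n=4, method="inclusive")
    return {"median_ms": median(ftos), "iqr_ms": q3 - q1}


# Hypothetical triad: one HI talker and two NH talkers (times in seconds).
turns = [
    ("HI", 0.0, 2.0),
    ("NH1", 2.2, 4.0),  # 200 ms gap
    ("NH2", 3.9, 6.0),  # 100 ms overlap
    ("HI", 6.4, 8.0),   # 400 ms gap
    ("NH1", 8.1, 9.0),  # 100 ms gap
]

print(fto_summary(floor_transfer_offsets(turns)))
```

In this toy example, a lower median corresponds to faster turn initiation and a larger IQR to more variable timing, which is how the two measures are read in the discussion above.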

When facing difficult communication situations, it has been reported that HI interlocutors can adopt a face-saving strategy of speaking more to avoid listening ( Stephens and Zhao, 1996 ). The HI interlocutors in the current study generally took up around 5% more speaking time relative to the NH interlocutors. Increasing the noise level did not affect the speaking time of the HI interlocutors. This suggests that although the HI interlocutors took up more speaking time, they did not seem to deliberately use the strategy of dominating the conversation to avoid listening when the background noise level increased.

The HI interlocutors also produced longer turns ( Figure 5A ), although the effect seemed to be largest at higher levels of background noise, as the effect of hearing status was only near-significant at the low noise level ( Table 1 , p  = 0.06). Overall, HI interlocutors spoke for around 1 s longer per turn, which must be considered a substantial increase relative to the overall average turn duration of 3.2 s. The HI interlocutors could have prolonged their turns by, e.g., speaking slower, adding more pauses, or including more filler words such as “ um ” or “ uh ” in their speech. Non-informative filler words play an important role in coordinating turn-taking by helping the interlocutor take the floor fast, or keep the floor, while planning an upcoming utterance ( Clark and Fox Tree, 2002 ). Indeed, it might be speculated that if the longer turn durations observed for the HI interlocutors are caused by uttering filler words, these might cause the faster turn-taking timing (lower FTO median) observed for the HI interlocutor.

It should be noted that the HI interlocutors were significantly older than the two NH groups, which could lead to speculation on whether the observed effect of hearing status was driven by the difference in age between the groups. However, as the ONH interlocutors were also significantly older than the YNH participants, it would be expected that any potential age effects would have resulted in significant differences between the YNH and ONH groups, which was not observed.

5.3 Speech levels are sensitive to all experimental contrasts

Similar to a previous study ( Petersen et al., 2022 ), speech level was the measure most affected by alterations in communication difficulty ( Table 1 and Figure 5C ). At low background noise, hearing status had no differential effect on the speech level; however, when the HI interlocutor did not receive HA amplification (unaided), all interlocutors spoke louder. The observed decrease in speech level of 0.8 dB upon providing HA amplification is comparable to the 1.1 dB decrease in speech level observed when providing amplification to HI interlocutors in dialogs held in quiet ( Petersen et al., 2022 ).

When increasing the level of background noise, all interlocutors increased their speech level. However, the increase was around 2 dB larger for the NH interlocutors than for the HI interlocutors. Again, a similar effect was observed when adding 70 dB background noise to a dialog, in which NH interlocutors increased their speech level by 3.2 dB more than the HI interlocutors ( Petersen et al., 2022 ). Hearing status was found to affect speech level differentially, suggesting that the NH interlocutors made up for the added communication difficulty experienced by the HI interlocutors when communicating in noise by speaking louder. Interestingly, when providing directional sound processing, thereby reducing the noise level experienced by the HI interlocutors, the HI interlocutors reduced their speech level by 1.3 dB, further reducing the SNR experienced by the NH interlocutors. Hence, directional sound processing increased the communication difficulty experienced by the NH interlocutors.
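The last step of this argument is simple decibel arithmetic, sketched below. It assumes that the SNR at a listener's position is the talker's speech level minus the noise level (both in dB) and that the noise level stays fixed. The absolute speech and noise levels are hypothetical placeholders; only the 1.3 dB reduction comes from the text.

```python
def snr_db(speech_level_db, noise_level_db):
    """Signal-to-noise ratio in dB, as the difference of two levels in dB."""
    return speech_level_db - noise_level_db


noise_level = 70.0     # hypothetical noise level at the NH listener (dB SPL)
speech_before = 68.0   # hypothetical HI speech level without directionality (dB SPL)
speech_after = speech_before - 1.3  # HI talker lowers the voice by 1.3 dB

# Change in the SNR experienced by the NH listeners when the HI talker speaks.
snr_change = snr_db(speech_after, noise_level) - snr_db(speech_before, noise_level)
print(f"SNR change for NH listeners: {snr_change:.1f} dB")
```

Because the noise level at the NH listeners is unchanged by the HI interlocutor's hearing-aid setting, any reduction in the HI interlocutor's speech level translates one-to-one into a lower SNR for the NH interlocutors, whatever the absolute levels are.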

The subjective evaluation of the use of talking strategies during the conversations, including speaking louder, revealed that, although they spoke louder, the NH interlocutors did not report using additional talking strategies when the HI interlocutors were unaided ( Table 1 ; Figure 6 ). However, the HI interlocutors reported applying more listening strategies when communicating unaided, despite the small increase in speech level made by all interlocutors relative to when the HI interlocutors were aided. At higher levels of background noise, interlocutors reported using more talking/listening strategies. However, it is interesting to note that the additional application of listening strategies in noise rated by the HI interlocutors seemed to match the increase in applied talking strategies made by the NH interlocutors.

The Lombard effect describes the increase in speech level when talking in the presence of noise; however, the effect has rarely been investigated in interactive communication situations. As the findings of the current study highlight, the speech level of interlocutors depends not only on the noise level but also on the communication difficulty experienced by the (HI) conversational partner. Through requests to repeat utterances, statements of not being able to hear, miscommunications, or subtle alterations in facial expressions, gestures, or body posture/movements, such as leaning in or turning the better ear, an interlocutor can influence the conversational partners to increase their speech level. However, the current study also suggests that HI interlocutors alter their speech level according to their own perceived communication difficulty, as evident from the reduced speech level of the HI interlocutors when receiving directional sound processing in a high level of background noise. In contrast, when receiving HA amplification at the lower noise level, the speech levels decreased not only for the HI interlocutor but for all interlocutors. During the experiment, the test leader physically removed the HAs as discreetly as possible (see section hearing-aid fitting); however, the removed and missing HAs during the unaided condition were visible to the NH interlocutors. It is, therefore, likely that all interlocutors were aware that the HI interlocutor was going to experience communication difficulties in the unaided conditions, potentially causing interlocutors to alter their speech levels going into the conversation. This is contrary to the change in directional sound processing, which was changed through an app, thereby not prompting the interlocutors that the auditory experience of the HI interlocutor was altered.

Altogether, the results of the current study show that the conversational dynamics of free triadic conversation are relatively stable in response to changes in communication difficulties. This is contrary to previous studies of task-bound dyadic conversations, where researchers found changes in many different measures of conversational dynamics ( Hazan et al., 2018 ; Beechey et al., 2020a , b ; Sørensen, 2021 ; Petersen et al., 2022 ). We can only speculate what caused the observed stability of the conversational dynamics in the current study: Perhaps the interpersonal coordination of a triadic conversation, caused by the many minds problem, influences the dynamics of the triadic conversation more than, e.g., altering the background noise. Perhaps the free conversations allowed the interlocutors to utilize and modify their word usage, linguistics, or body language to help overcome the increased communication difficulty. It is also possible that the conversational dynamics are determined by the fact that two out of three interlocutors were NH, who are potentially less affected by changes in the noise level. Unfortunately, we cannot know which, if any, of the reasons listed above cause the insensitivity of the features of conversational dynamics to the changes in communication difficulties.

6 Conclusion

The current study explored whether the dynamics of a free group conversation were affected by the impaired hearing experienced by one of the three interlocutors and whether noise and hearing-aid signal processing would influence it to the same extent as observed in dyadic conversations. It was hypothesized that any alteration of the communication difficulty (noise level, hearing loss, and HA processing) experienced by one or all interlocutors would affect the five measures of the conversational dynamics (FTO median, FTO IQR, turn duration, speech level, and speaking time). This hypothesis could not be uniformly confirmed: Interlocutors with hearing loss showed the expected larger variability in turn-taking timing (FTO IQR), took up more speaking time, had longer turn durations at high noise levels, and caused the NH interlocutors to speak louder, especially at high noise levels. However, contrary to the expectations, it was also observed that the HI interlocutors initiated their turns faster (FTO median), not slower, than the NH interlocutors. An overall increase in the noise level of 25 dB SPL caused an increase in the speech levels but did not affect the turn-taking timing, turn duration, or distribution of speaking time. Furthermore, improving listening for the HI interlocutors by providing HA amplification at low noise levels and directional sound processing at high noise levels had no effect on the conversational dynamics beyond the speech level: At low noise levels, providing HA amplification to the HI interlocutors caused all conversation partners to speak at a lower volume. At high noise levels, providing directional sound processing caused the HI interlocutor to speak at a lower volume.

From the current results, speech level was observed to be the measure of the conversational dynamics most sensitive to alterations in the communication difficulty experienced by the group (background noise), as well as by the HI interlocutor when providing HA amplification and directional sound processing.

Data availability statement

The original contributions presented in the study are included in the article/supplementary materials; further inquiries can be directed to the corresponding author.

Ethics statement

The studies involving humans were approved by Board of Copenhagen, Denmark, reference H-20068621. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

EP: Writing – original draft, Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Acknowledgments

The author would like to thank Clinical Research Audiologist Els Walravens for her detailed and vigorous work in collecting data for the current study.

Conflict of interest

EP is an employee at the hearing-aid manufacturing company WS Audiology.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Baker, R., and Hazan, V. (2011). DiapixUK: task materials for the elicitation of multiple spontaneous speech dialogs. Behav. Res. Methods 43, 761–770. doi: 10.3758/s13428-011-0075-y

Barthel, M., Sauppe, S., Levinson, S. C., and Meyer, A. S. (2016). The timing of utterance planning in task-oriented dialog: evidence from a novel list-completion paradigm. Front. Psychol. 7:1858. doi: 10.3389/fpsyg.2016.01858

Bates, D., Mächler, M., Bolker, B., and Walker, S. (2015). Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48. doi: 10.18637/jss.v067.i01

Beechey, T., Buchholz, J. M., and Keidser, G. (2018). Measuring communication difficulty through effortful speech production during conversation. Speech Comm. 100, 18–29. doi: 10.1016/j.specom.2018.04.007

Beechey, T., Buchholz, J. M., and Keidser, G. (2020a). Hearing aid amplification reduces communication effort of people with hearing impairment and their conversation partners. J. Speech Lang. Hear. Res. 63, 1299–1311. doi: 10.1044/2020_JSLHR-19-00350

Beechey, T., Buchholz, J. M., and Keidser, G. (2020b). Hearing impairment increases communication effort during conversations in noise. J. Speech Lang. Hear. Res. 63, 305–320. doi: 10.1044/2019_JSLHR-19-00201

Bo Nielsen, J., Dau, T., and Neher, T. (2014). A Danish open-set speech corpus for competing-speech studies. J. Acoust. Soc. Am. 135, 407–420. doi: 10.1121/1.4835935

Bögels, S., Magyari, L., and Levinson, S. C. (2015). Neural signatures of response planning occur midway through an incoming question in conversation. Sci. Rep. 5:12881. doi: 10.1038/srep12881

Clark, H. H., and Fox Tree, J. E. (2002). Using uh and um in spontaneous speaking. Cognition 84, 73–111. doi: 10.1016/S0010-0277(02)00017-3

Cooney, G., Mastroianni, A. M., Abi-Esber, N., and Brooks, A. W. (2020). The many minds problem: disclosure in dyadic versus group conversation. Curr. Opin. Psychol. 31, 22–27. doi: 10.1016/j.copsyc.2019.06.032

Corps, R. E., Gambi, C., and Pickering, M. J. (2018). Coordinating utterances during turn-taking: the role of prediction, response preparation, and articulation. Discourse Process. 55, 230–240. doi: 10.1080/0163853X.2017.1330031

Gisladottir, R., Chwilla, D., and Levinson, S. (2015). Conversation electrified: ERP correlates of speech act recognition in underspecified utterances. PLoS One 10, 1–24. doi: 10.1371/journal.pone.0120068

Hazan, V., Tuomainen, O., Tu, L., Kim, J., Davis, C., Brungart, D., et al. (2018). How do aging and age-related hearing loss affect the ability to communicate effectively in challenging communicative conditions? Hear. Res. 369, 33–41. doi: 10.1016/j.heares.2018.06.009

Hazan, V., and Tuomainen, O. (2019). The effect of visual cues on speech characteristics of older and younger adults in an interactive task. 19th International Congress of the Phonetic Sciences.

Heldner, M., and Edlund, J. (2010). Pauses, gaps and overlaps in conversations. J. Phon. 38, 555–568. doi: 10.1016/j.wocn.2010.08.002

Indefrey, P., and Levelt, W. (2004). The spatial and temporal signatures of word production components. Cognition 92, 101–144. doi: 10.1016/j.cognition.2002.06.001

Junqua, J. C. (1996). The influence of acoustics on speech production: a noise-induced stress phenomenon known as the Lombard reflex. Speech Comm. 20, 13–22. doi: 10.1016/S0167-6393(96)00041-6

Keidser, G., Dillon, H., Flax, M., Ching, T., and Brewer, S. (2011). The NAL-NL2 prescription procedure. Audiol. Res. 1, 88–90. doi: 10.4081/audiores.2011.e24

Levinson, S. C., and Torreira, F. (2015). Timing in turn-taking and its implications for processing models of language. Front. Psychol. 6, 1–17. doi: 10.3389/fpsyg.2015.00731

Lombard, E. (1911). Le signe de l'élévation de la voix. Ann. Malad. 27, 101–119.

Magyari, L., Bastiaansen, M. C. M., de Ruiter, J. P., and Levinson, S. C. (2014). Early anticipation lies behind the speed of response in conversation. J. Cogn. Neurosci. 26, 2530–2539. doi: 10.1162/jocn_a_00673

Nielsen, J. B., and Dau, T. (2009). Development of a Danish speech intelligibility test. Int. J. Audiol. 48, 729–741. doi: 10.1080/14992020903019312

Petersen, E. B., MacDonald, E. N., and Sørensen, A. (2022). The effects of hearing aid amplification and noise on conversational dynamics between normal-hearing and hearing-impaired talkers. Trends Hear. 26. doi: 10.1177/23312165221103340

Petersen, E. B., Walravens, E., and Pedersen, A. K. (2022). Real-life listening in the lab: does wearing hearing aids affect the dynamics of a group conversation? Proceedings of the 26th workshop on the semantics and pragmatics of dialog.

Sharma, M., Joshi, S., Chatterjee, T., and Hamid, R. (2022). A comprehensive empirical review of modern voice activity detection approaches for movies and TV shows. Neurocomputing 494, 116–131. doi: 10.1016/j.neucom.2022.04.084

Smeds, K., Wolters, F., and Rung, M. (2015). Estimation of signal-to-noise ratios in realistic sound scenarios. J. Am. Acad. Audiol. 26, 183–196. doi: 10.3766/jaaa.26.2.7

Södersten, M., Ternström, S., and Bohman, M. (2005). Loud speech in realistic environmental noise: Phonetogram data, perceptual voice quality, subjective ratings, and gender differences in healthy speakers. J. Voice 19, 29–46. doi: 10.1016/j.jvoice.2004.05.002

Sørensen, A. (2021) The effects of noise and hearing loss on conversational dynamics. Technical University of Denmark. Available at: https://www.hea.healthtech.dtu.dk/-/media/centre/hea_hearing_systems/hea/english/research/phd-thesis-pdf/00_47_-sorensen.pdf?la=da&hash=BA1FD9620387CA7A7C0F284C1089554CDFD10331

Sørensen, A., MacDonald, E., and Lunner, T. (2019). Timing of turn taking between normal-hearing and hearing-impaired interlocutors. Proceedings of the International Symposium on Auditory and Audiological Research (ISAAR).

Stephens, D., and Zhao, F. (1996). Hearing impairment: special needs of the elderly. Folia Phoniatr. Logop. 48, 137–142. doi: 10.1159/000266400

Stivers, T., Enfield, N. J., Brown, P., Englert, C., Hayashi, M., Heinemann, T., et al. (2009). Universals and cultural variation in turn-taking in conversation. Proc. Natl. Acad. Sci. USA 106, 10587–10592. doi: 10.1073/pnas.0903616106

Yngve, V. (1970) On getting a word in edgewise. Sixth regional meeting Chicago linguistic society.

Keywords: hearing loss, communication, hearing aids, noise, conversational dynamics

Citation: Petersen EB (2024) Investigating conversational dynamics in triads: Effects of noise, hearing impairment, and hearing aids. Front. Psychol. 15:1289637. doi: 10.3389/fpsyg.2024.1289637

Received: 13 September 2023; Accepted: 04 March 2024; Published: 12 April 2024.

Copyright © 2024 Petersen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Eline Borch Petersen, [email protected]
