Types of Speech in Communication
Communication is a fundamental aspect of human interaction, and speech is one of its most powerful tools. Speech allows individuals to convey ideas, emotions, intentions, and information effectively. Different types of speech are used depending on the context, audience, and purpose of communication.
Understanding these types helps in selecting the appropriate mode of expression and achieving the desired impact.
1. Informative Speech
Informative speech educates or informs the audience about a particular topic. The primary goal is to provide knowledge, explain concepts, or clarify issues. This type of speech is often used in educational settings, professional presentations, or public lectures.
Example: A professor giving a lecture on the impacts of climate change is delivering an informative speech. The professor provides data, explains scientific concepts, and discusses potential solutions to the problem. The focus is on sharing factual information to enhance the audience’s understanding.
2. Persuasive Speech
Persuasive speech aims to convince the audience to adopt a certain viewpoint or take a specific action. The speaker uses logical arguments, emotional appeals, and credible evidence to influence the audience’s beliefs, attitudes, or behaviors. Persuasive speeches are common in political campaigns, advertising, and debates.
Example: A politician giving a campaign speech will likely use persuasion to garner support. They might highlight their achievements, present their future plans, and appeal to the emotions of the audience by addressing pressing societal issues. The objective is to persuade the audience to vote for them.
3. Demonstrative Speech
Demonstrative speech involves showing the audience how to do something. It combines explanation with practical demonstration, making it easier for the audience to understand and replicate the process. This type of speech is useful in workshops, training sessions, and instructional videos.
Example: A chef giving a cooking class is engaging in demonstrative speech. They not only explain the recipe but also demonstrate each step, such as chopping vegetables, mixing ingredients, and cooking the dish. The audience learns by watching and can follow along.
4. Entertaining Speech
Entertaining speech is intended to amuse the audience and provide enjoyment. While it may contain informative or persuasive elements, its primary purpose is to entertain. This type of speech is often light-hearted, humorous, and engaging, making it suitable for social events, ceremonies, or entertainment shows.
Example: A stand-up comedian performing a routine uses an entertaining speech to make the audience laugh. The comedian may share funny anecdotes, joke about everyday situations, or use witty observations to entertain the crowd. The focus is on creating an enjoyable experience.
5. Special Occasion Speech
Special occasion speech is delivered during specific events or ceremonies, such as weddings, graduations, funerals, or award ceremonies. The content is often personalized and tailored to the occasion, focusing on the significance of the event and the emotions associated with it.
Example: During a wedding, the best man might give a special occasion speech to honor the couple. The speech might include heartfelt memories, humorous stories, and well-wishes for the future. The purpose is to celebrate the occasion and express support for the couple.
6. Impromptu Speech
An impromptu speech is delivered without preparation, often in response to an unexpected situation or question. It requires quick thinking and the ability to articulate thoughts clearly on the spot. This type of speech is common in casual conversations, interviews, or meetings.
Example: In a team meeting, an employee might be asked to give an impromptu speech about the progress of a project. Without prior notice, the employee summarizes the project’s status, highlights key achievements, and addresses any challenges. The speech is spontaneous and unscripted.
7. Extemporaneous Speech
Extemporaneous speech is prepared in advance but delivered without a script. The speaker has a general outline or notes but speaks more freely, allowing for natural delivery and adaptability. This type of speech is common in business presentations, academic conferences, and public speaking engagements.
Example: A business executive presenting a quarterly report to stakeholders might use extemporaneous speech. They have prepared key points and data but speak conversationally, adjusting their delivery based on the audience’s reactions and questions. This approach allows for a more engaging and dynamic presentation.
8. Manuscript Speech
Manuscript speech is read word-for-word from a prepared text. This type of speech is often used when precise wording is essential, such as in official statements, legal proceedings, or news broadcasts. The speaker focuses on delivering the content accurately without deviation.
Example: A news anchor reading the evening news is using manuscript speech. The anchor reads from a teleprompter, ensuring that the information is conveyed accurately and clearly. The emphasis is on precision and professionalism.
9. Memorized Speech
Memorized speech involves delivering a speech from memory, without notes or a script. This approach is often used in performances, speeches that require exact wording, or competitive speaking events. Memorization allows for a polished and confident delivery but requires extensive practice.
Example: An actor reciting a monologue in a play is giving a memorized speech. The actor has committed the lines to memory and delivers them with emotion and expression, engaging the audience fully. The speech is fluid and rehearsed, showcasing the actor’s skill.
10. Motivational Speech
Motivational speech is designed to inspire and energize the audience, often encouraging them to pursue their goals or overcome challenges. The speaker uses personal stories, powerful messages, and emotional appeals to uplift the audience and provoke action.
Motivational speeches are common in self-help seminars, leadership conferences, and personal development events.
Example: A life coach speaking to a group of entrepreneurs might give a motivational speech about resilience and perseverance. The coach shares personal experiences of overcoming obstacles and encourages the audience to stay focused on their goals, despite setbacks.
11. Pitch Speech
A pitch speech is a brief, persuasive speech used to present an idea, product, or proposal to an audience, usually with the aim of securing funding, approval, or support. The speaker must be concise, clear, and convincing, often focusing on the benefits and potential impact of the proposal.
Example: An entrepreneur pitching their startup idea to potential investors is giving a pitch speech. The entrepreneur outlines the problem their product solves, the market opportunity, and how the investors will benefit, all within a few minutes.
12. Eulogy
A eulogy is a speech delivered at a funeral or memorial service, honoring the life and legacy of a deceased person. The speaker reflects on the person’s character, achievements, and the impact they had on others, often blending personal anecdotes with expressions of gratitude and remembrance.
Example: A family member delivering a eulogy at a funeral might share touching stories about the deceased, highlighting their kindness, generosity, and love for their family. The eulogy serves as a tribute, celebrating the life of the person who has passed away.
Tips for Giving a Great Speech
1. Know Your Audience: Understanding your audience’s interests, values, and expectations helps tailor your message effectively.
2. Structure Your Speech: Organize your content with a clear introduction, body, and conclusion. A well-structured speech is easier to follow and more impactful.
3. Practice: Rehearse your speech multiple times, alone or in front of a few friends, to become familiar with the content, vocabulary, and pronunciation of key phrases. Rehearsal helps you control your pacing and improve your delivery.
4. Use Visual Aids: Visual aids can enhance understanding and retention. Ensure they are relevant and not overly distracting.
5. Engage with the Audience: Make eye contact, use gestures, and involve the audience through questions or interactive elements to keep them engaged.
How to Make Your Speech More Memorable
1. Start with a Strong Opening: Capture attention with a powerful quote, anecdote, or question that relates to your main message.
2. Use Stories: People remember stories better than facts alone. Incorporate personal or relatable stories to illustrate your points.
3. Be Passionate: Express enthusiasm and passion for your topic. A passionate delivery can leave a lasting impression.
4. Repeat Key Points: Repetition helps reinforce important ideas. Summarize key points at the end of your speech to ensure they stick.
5. End with a Call to Action: Encourage your audience to take a specific action or reflect on your message. A clear and compelling conclusion makes your speech memorable.
The 4 types of speeches in public speaking
Informative, demonstrative, persuasive and special occasion.
By: Susan Dugdale
There are four main types of speeches or types of public speaking.
- Informative
- Demonstrative
- Persuasive
- Special occasion or Entertaining
To harness their power a speaker needs to be proficient in all of them: to understand which speech type to use when, and how to use it for maximum effectiveness.
What's on this page:
An overview of each speech type, how it's used, writing guidelines and speech examples:
- informative
- demonstrative
- persuasive
- special occasion/entertaining
- how, and why, speech types overlap
Informative speeches
An informative speech does as its name suggests: informs. It provides information about a topic. The topic could be a place, a person, an animal, a plant, an object, an event, or a process.
The informative speech is primarily explanatory and educational.
Its purpose is not to persuade or influence opinion one way or the other. It is to provide sufficient relevant material (with references to verifiable facts, accounts, studies and/or statistics) for the audience to have learned something.
What they think, feel, or do about the information after they've learned it, is up to them.
This type of speech is frequently used for giving reports, lectures and, sometimes for training purposes.
Examples of informative speech topics:
- the number, price and type of dwellings that have sold in a particular suburb over the last 3 months
- the history of the tooth brush
- how trees improve air quality in urban areas
- a brief biography of Bob Dylan
- the main characteristics of Maine Coon cats
- the 1945 US bombing of Hiroshima and Nagasaki
- the number and work of local philanthropic institutions
- the weather over the summer months
- the history of companion planting
- how to set up a new password
- how to work a washing machine
Click this link if you'd like more informative topic suggestions. You'll find hundreds of them.
And this link to find out more about the 4 types of informative speeches: definition, description, demonstration and explanation. (Each with an example outline and topic suggestions.)
Demonstration, demonstrative or 'how to' speeches
A demonstration speech is an extension of an informative process speech. It's a 'how to' speech, combining informing with demonstrating.
The topic process (what the speech is about) could either be demonstrated live or shown using visual aids.
The goal of a demonstrative speech is to teach a complete process step by step.
It's found everywhere, all over the world: in corporate and vocational training rooms, school classrooms, university lecture theatres, homes, cafes... anywhere where people are either refreshing or updating their skills. Or learning new ones.
Knowing how to give a good demonstration or 'how to' speech is a very valuable skill to have, one appreciated by everybody.
Examples of 'how to' speech topics are:
- how to braid long hair
- how to change a car tire
- how to fold table napkins
- how to use the Heimlich maneuver
- how to apply for a Federal grant
- how to fill out a voting form
- how to deal with customer complaints
- how to close a sale
- how to give medicine to your cat without being scratched to bits!
Resources for demonstration speeches
1. How to write a demonstration speech. Guidelines and suggestions covering:
- choosing the best topic: one aligning with your own interests, the audience's, the setting for the speech and the time available to you
- how to plan, prepare and deliver your speech: step by step guidelines for sequencing and organizing your material, plus a printable blank demonstration speech outline for you to download and complete
- suggestions to help with delivery and rehearsal. Demonstration speeches can so easily lurch sideways into embarrassment. For example: forgetting a step while demonstrating a cake recipe, which means it won't turn out as you want it to. Or not checking you've got everything you need to deliver your speech at the venue and finding out too late, the very public and hard way, that the lead on your laptop will not reach the only available wall socket. The result: you cannot show your images.
2. Demonstration speech sample outline. This is a fully completed outline of a demonstration speech. The topic is 'how to leave an effective voice mail message' and the sample covers the entire step by step sequence needed to do that.
There's a blank printable version of the outline template to download if you wish, and a YouTube link to a recording of the speech.
3. Demonstration speech topics. Four pages of 'how to' speech topic suggestions, all of them suitable for middle school and up.
Persuasive speeches
The goal of a persuasive speech is to convince an audience to accept, or at the very least listen to and consider, the speaker's point of view.
To be successful the speaker must skillfully blend information about the topic, their opinion, reasons to support it and their desired course of action, with an understanding of how best to reach their audience.
Everyday examples of persuasive speeches
Common usages of persuasive speeches are:
- what we say when being interviewed for a job
- presenting a sales pitch to a customer
- political speeches: politicians lobbying for votes
- values or issue driven speeches, e.g., a call to boycott a product on particular grounds, or a call to support varying human rights issues: the right to have an abortion, the right to vote, the right to breathe clean air, the right to have access to affordable housing, and so on.
Models of the persuasive process
The most frequently cited model we have for effective persuasion is thousands of years old. Aristotle, the Greek philosopher (384–322 BC), explained it as being supported by three pillars: ethos, pathos and logos.
Briefly, ethos is the reliability and credibility of the speaker. How qualified or experienced are they to talk on the topic? Are they trustworthy? Should we believe them? Why?
Pathos is the passion, emotion or feeling you, the speaker, bring to the topic. It's the choice of language you use to trigger an emotional connection linking yourself, your topic and the audience together, in a way that supports your speech purpose.
(We see the echo of Pathos in words like empathy: the ability to understand and share the feelings of another, or pathetic: to arouse feelings of pity through being vulnerable and sad.)
Logos is related to logic. Is the information we are being presented with logical and rational? Is it verifiable? How is it supported? By studies, by articles, by endorsement from suitably qualified and recognized people?
To persuade successfully, all three are needed. For more, please see this excellent article: Ethos, Pathos, Logos: 3 Pillars of Public Speaking and Persuasion
Monroe's Motivated Sequence of persuasion
Another much more recent model is Monroe's Motivated Sequence based on the psychology of persuasion.
It consists of five consecutive steps: attention, need, satisfaction, visualization and action. It was developed in the 1930s by Alan H. Monroe, an American lecturer in communications at Purdue University. The pattern is used extensively in advertising, social welfare and health campaigns.
Resources for persuasive speeches
1. How to write a persuasive speech. Step by step guidelines covering:
- speech topic selection
- setting speech goals
- audience analysis
- empathy and evidence
- balance and obstacles
- 4 structural patterns to choose from
2. A persuasive speech sample outline using Monroe's Motivated Sequence
3. An example persuasive speech written using Monroe's Motivated Sequence
4. Persuasive speech topics: 1032+ topic suggestions, including 105 fun persuasive ideas.
Special occasion or entertaining speeches
The range of these speeches is vast: from a call 'to say a few words' to delivering a lengthy formal address.
This is the territory where speeches to mark farewells, thanksgiving, awards, birthdays, Christmas, weddings, engagements and anniversaries dwell, along with welcome, introduction and thank you speeches, tributes, eulogies and commencement addresses.
In short, any speech, either impromptu or painstakingly crafted, given to acknowledge a person, an achievement, or an event belongs here.
You'll find preparation guidelines, as well as examples of many special occasion speeches on my site.
Resources for special occasion speeches
How to prepare:
- an acceptance speech, with an example acceptance speech
- a birthday speech, with ongoing links to example 18th, 40th and 50th birthday speeches
- an office party Christmas speech, a template with an example speech
- an engagement party toast, with 5 examples
- a eulogy or funeral speech, with a printable eulogy planner and access to 70+ eulogy examples
- a farewell speech, with an example (a farewell speech to colleagues)
- a golden (50th) wedding anniversary speech, with an example speech from a husband to his wife
- an impromptu speech, techniques and templates for impromptu speaking, examples of one minute impromptu speeches with a printable outline planner, plus impromptu speech topics for practice
- an introduction speech for a guest speaker, with an example
- an introduction speech for yourself, with an example
- a maid of honor speech for your sister, a template, with an example
- a retirement speech, with an example from a departing teacher to her students and colleagues
- a student council speech, a template, with an example student council president, secretary and treasurer speech
- a Thanksgiving speech, a template, with an example toast
- a thank you speech, a template, with an example speech expressing thanks for an award, also a business thank you speech template
- a tribute (commemorative) speech, with a template and an example speech
- a welcome speech for an event, a template, an example welcome speech for a conference, plus a printable welcome speech planner
- a welcome speech for newcomers to a church, a template with an example speech
- a welcome speech for a new member to the family, a template with an example
Speech types often overlap
Because speakers and their speeches are unique (different content, purposes, and audiences...), the four types often overlap. While a speech is generally based on one principal type, it might also have a few of the features belonging to any of the others.
For example, a speech may be mainly informative, but to add interest the speaker has used elements like a demonstration of some sort, persuasive language, and the brand of familiar humor common in a special occasion speech where everybody knows each other well.
The result is an informative 'plus' type of speech. A hybrid! It's a speech that could easily be given by a long-serving in-house company trainer to introduce and explain a new work process to employees.
Related pages:
- how to write a good speech. This is a thorough step by step walk through, with examples, of the general speech writing process. It's a great place to start if you're new to writing speeches. You'll get an excellent foundation to build on.
- how to plan a speech: an overview of ALL the things that need to be considered before preparing an outline, with examples
- how to outline a speech: an overview, with examples, showing how to structure a speech, with a free printable blank speech outline template to download
- how to make and use cue cards: note cards for extemporaneous speeches
- how to use props (visual aids)
Types of Speeches: A Guide to Different Styles and Formats
Speeches are a powerful way to communicate ideas, inspire people, and create change. There are many different types of speeches, each with its own unique characteristics and formats. In this article, we’ll explore some of the most common types of speeches and how to prepare and deliver them effectively.
1. Informative Speech
An informative speech is designed to educate the audience on a particular topic. The goal is to provide the audience with new information or insights and increase their understanding of the topic. The speech should be well-researched, organized, and delivered in a clear and engaging manner.
2. Persuasive Speech
A persuasive speech is designed to convince the audience to adopt a particular viewpoint or take action. The goal is to persuade the audience to agree with the speaker’s perspective and take action based on that belief. The speech should be well-researched, organized, and delivered in a passionate and compelling manner.
3. Entertaining Speech
An entertaining speech is designed to entertain the audience and create a memorable experience. The goal is to engage the audience and make them laugh, cry, or think deeply about a particular topic. The speech can be humorous, inspirational, or emotional and should be delivered in a lively and engaging manner.
4. Special Occasion Speech
A special occasion speech is designed for a specific event or occasion, such as a wedding, graduation, or retirement party. The goal is to celebrate the occasion and honor the people involved. The speech should be personal, heartfelt, and delivered in a sincere and respectful manner.
5. Impromptu Speech
An impromptu speech is delivered without any preparation or planning. The goal is to respond quickly and effectively to a particular situation or question. The speech should be delivered in a clear and concise manner and address the topic at hand.
In conclusion, speeches are an important way to communicate ideas, inspire people, and create change. By understanding the different types of speeches and their unique characteristics and formats, individuals can prepare and deliver successful speeches that are engaging, informative, and memorable.
Understanding Different Types of Speeches and Their Purposes
Public speaking is an essential skill that can open doors, influence opinions, and convey vital information. Understanding the different types of speeches and how to craft them effectively can significantly enhance one's communication skills. Whether you're presenting an informative lecture, persuading an audience, or celebrating a special occasion, knowing which type of speech to use and how to deliver it is crucial. This guide will walk you through the distinct types of speeches, offering valuable insights and practical tips to enhance your speech delivery.
Informative Speeches: Sharing Knowledge Effectively
Informative speeches aim to educate the audience by providing them with knowledge on a particular subject. The informative speech's purpose is to share one's knowledge clearly and concisely, ensuring that the audience walks away with a better understanding of the topic. These speeches are often used in academic settings, business environments, and local community groups.
- Classroom Lectures: Teachers use informative speeches to introduce new concepts to students. These informative speeches vary widely depending on the subject, from history lessons to scientific theories.
- Business Presentations: Company employees might give informative speeches to share updates on the latest project or explain new policies.
- Informative Presentations at Community Groups: Individuals might speak about topics of interest, such as healthy living or local history, to provide valuable information to group members.
Key Elements:
- Clarity: The information presented should be clear and understandable.
- Credible Sources: Support your speech with credible evidence and sources to build trust with the audience.
- Visual Aids: Use visual aids like slides or charts to help illustrate key points and make the information more engaging.
Tip: When preparing an informative speech, focus on simplifying complex theories. Break down ambiguous ideas into more manageable pieces of information. Use examples and relatable scenarios to make the content more accessible to your audience.
Persuasive Speeches: Influencing Beliefs and Actions
Persuasive speeches are designed to convince the audience to adopt a particular point of view or take a specific action. Unlike informative speeches that merely share information, persuasive speeches actively aim to change the listener's beliefs or behaviors. Persuasive speech writing often involves critical thinking and appeals to emotions, making them powerful tools in public speeches.
- Political Speeches: Politicians often use persuasive speeches to influence public opinion and gain support for their policies.
- Debate Speech: In a debate setting, speakers use persuasive language and evidence to argue a particular issue, focusing on presenting a convincing argument backed by facts and logic.
- Campaign Pitches: Candidates running for office or promoting a cause use pitch speeches to rally support and convince the audience to back their initiatives.
Techniques:
- Factual Evidence: Support your arguments with data, statistics, and credible sources to build a strong case.
- Emotional Appeal: Connect with the audience on an emotional level to make your message more impactful.
- Convincing Tone: Use a confident and assertive tone to convey conviction in your message.
Tip: When delivering a persuasive speech, focus on the audience's beliefs and values. Tailor your message to resonate with their concerns and interests, making it more likely that they'll be persuaded by your argument.
Demonstrative Speeches: Showing How It's Done
Demonstrative speeches are instructional and focus on showing the audience how to do something through a step-by-step process. The primary purpose of a demonstrative speech is to provide a clear understanding of how to perform a specific task, making it a valuable skill in educational and training contexts. A demonstrative speech utilizes visual aids and hands-on examples to enhance learning.
Examples of Demonstrative Speeches:
- Cooking Classes: A chef might give a demonstrative speech on how to prepare a specific dish, such as Mediterranean cooking, showing each step of the process.
- How-To Workshops: Professionals may offer workshops to demonstrate techniques in fields like carpentry, art, or technology.
- Educational Demonstrations: Teachers use demonstrative speeches to explain scientific experiments or procedures.
Key Aspects:
- Physical Demonstration: Showing the steps visually helps the audience follow along and understand better.
- Clear Instructions: Provide detailed explanations for each step to avoid confusion.
- Visual Aids: Use props, tools, or presentation slides to support the demonstration.
Tip: Incorporate visual aids to enhance your demonstrative speech. They can help illustrate the steps more clearly, making it easier for the audience to follow and replicate the task.
Oratorical Speeches: The Art of Powerful Delivery
Oratorical speeches are formal speeches delivered with eloquence and often emphasize powerful rhetoric and grand style. These speeches are common in events that celebrate significant moments, such as inaugurations, memorials, or public celebrations. The main speaker's goal in an oratorical speech is to captivate and inspire the audience through their choice of words and delivery style.
- Inauguration Ceremonies: Leaders deliver oratorical speeches to set the tone for their leadership and outline their vision.
- Commemorative Events: During memorials or national holidays, speakers use oratorical speeches to honor significant historical figures or events.
- Public Celebrations: At large gatherings, speakers deliver oratorical speeches to motivate and unify the community.
Key Elements:
- Eloquent Language: Use of rich, powerful language to engage the audience.
- Rhetorical Devices: Employing techniques like repetition, metaphors, and analogies to emphasize key points.
- Strong Delivery: A commanding presence and vocal variety are crucial to maintaining the audience's attention.
Tip: Practice delivering your speech with emotion and passion. Focus on your body language, voice modulation, and eye contact to make a lasting impression on your audience.
Motivational Speeches: Inspiring the Audience
Motivational speeches are designed to inspire and encourage the audience to take action or improve their lives. These speeches often draw on personal experiences, stories of overcoming obstacles, and positive affirmations to motivate listeners.
- Commencement Addresses: Often delivered to college students, motivational speeches during graduation ceremonies encourage them to pursue their dreams and face challenges head-on.
- Corporate Events: Motivational speakers might inspire employees to embrace change, enhance productivity, or foster teamwork.
- Personal Development Seminars: Individuals attend these seminars to gain insights on self-improvement and personal growth.
Key Elements:
- Inspiration: Focus on sharing stories or messages that ignite passion and drive.
- Connection: Establish a personal connection with the audience to make the message more relatable.
- Actionable Steps: Provide practical advice or steps the audience can take to apply the motivation in their lives.
Tip: When crafting a motivational speech, focus on genuine stories and experiences. Authenticity can have a more profound impact than generalized advice.
Entertaining Speeches: Engaging and Amusing the Audience
Entertaining speeches aim to amuse and engage the audience, often through humor, storytelling, or personal anecdotes. These speeches are typically delivered in informal settings where the goal is to entertain rather than inform or persuade.
- Wedding Toasts: Friends or family members give lighthearted speeches to celebrate the newlyweds.
- Talent Show Introductions: Hosts use entertaining speeches to introduce performers and keep the audience engaged.
- After-Dinner Speeches: Delivered at social gatherings, these speeches provide entertainment and often include humorous observations or stories.
Key Elements:
- Humor: Use jokes or funny anecdotes to keep the audience entertained.
- Personal Stories: Sharing personal experiences can make the speech more relatable and engaging.
- Audience Interaction: Engage with the audience by asking questions or encouraging participation.
Tip: To deliver a successful entertaining speech, keep the tone light and relatable. Consider the audience's mood and interests, and tailor your content to fit the occasion.
Special Occasion Speeches: Marking Important Events
Special occasion speeches are crafted to honor a particular event, person, or milestone. These speeches are often given during significant moments, such as award ceremonies, weddings, or anniversaries, where the main goal is to celebrate or commemorate. Special occasion speeches can vary widely in tone and style, depending on the nature of the event and the relationship between the speaker and the honoree.
- Acceptance Speeches: Recipients of awards or honors express gratitude and acknowledge those who supported them.
- Wedding Toasts: Speeches given by the best man or maid of honor to celebrate the couple's journey.
- Speeches at Award Ceremonies: Honoring the achievements of individuals or groups and highlighting their contributions.
How to Deliver Special Occasion Speeches:
- Emotional Connection: Connect with the audience by expressing genuine emotions and heartfelt sentiments.
- Balance: Find the right balance between personal anecdotes and the significance of the event.
- Focus on the Main Point: Highlight the core purpose of the event, whether it's celebrating a person's achievements or commemorating a milestone.
Tip: When delivering a special occasion speech, focus on conveying emotions that resonate with the event's atmosphere. Share personal experiences or stories that highlight the significance of the occasion.
Impromptu Speeches: Speaking Off-the-Cuff
Impromptu speeches are delivered without prior preparation, often in response to spontaneous situations. They require quick thinking and the ability to communicate ideas clearly on the spot, making them an essential skill in both professional and personal settings.
- Responding to Unexpected Questions: Handling questions during a Q&A session.
- Speaking at Volunteer Activities: Giving a short speech to thank volunteers or recognize their efforts.
- Community Events: Offering remarks at a local gathering or event.
Tips for Delivering Impromptu Speeches:
- Stay Calm: Take a deep breath and collect your thoughts before speaking.
- Organize Your Ideas: Focus on a few key points to keep your speech structured.
- Be Authentic: Speak from the heart and be genuine in your delivery.
Tip: Practice impromptu speaking regularly to build confidence. Use prompts or scenarios to simulate spontaneous speaking situations, helping you become more comfortable with thinking on your feet.
How to Choose the Right Type of Speech for Your Situation
Selecting the appropriate type of speech depends on the audience, purpose, and context. It's essential to analyze these factors before deciding on the speech format:
- Audience Analysis: Consider the audience's interests, knowledge level, and expectations. Tailor your speech to resonate with them.
- Purpose: Determine whether the goal is to inform, persuade, entertain, or commemorate. This will guide the structure and content of your speech.
- Context: Consider the setting and occasion. For example, a formal event might require a more structured informative or persuasive speech, while a casual gathering might call for an entertaining or impromptu speech.
Examples of Blending Speech Types:
- Combining Informative and Persuasive: A speaker at a health seminar might provide informative details about a health issue and persuade the audience to adopt healthier habits.
- Mixing Entertaining and Special Occasion: A keynote speaker at a wedding might blend humor with heartfelt sentiments to engage the audience while celebrating the couple.
Using Teleprompters for Different Types of Speeches
When delivering a speech, especially in a formal setting or a high-stakes event, using a teleprompter can be a great tool for maintaining a smooth and engaging presentation. A teleprompter displays the speech text on a transparent screen, positioned in front of the speaker, ensuring that they maintain eye contact with the audience while following their script. This technique is widely used across various types of speeches, from informative speeches in business presentations to motivational speeches at large events.
Advantages of Using Teleprompters:
- Confidence and Flow: Teleprompters help speakers stay on track, reducing the risk of losing their place or forgetting key points.
- Engagement: By allowing speakers to maintain eye contact with the audience, teleprompters enhance engagement and make the speech more impactful.
- Professional Delivery: Teleprompters ensure that the speech is delivered as planned, without unnecessary pauses or deviations, contributing to a more polished presentation.
Mastering the Types of Speeches
Understanding and mastering the different types of speeches can significantly enhance your public speaking skills. From informative speeches that educate to persuasive speeches that inspire action, each type plays a unique role in effective communication. Practice these types regularly, and don't hesitate to experiment with blending different styles to suit your needs. By honing your skills in various speech types, you can become a more versatile and confident speaker, capable of captivating any audience.
Speech Communication: Types of Speeches
A persuasive speech tries to influence or reinforce the attitudes, beliefs, or behavior of an audience. This type of speech often includes the following elements:
- appeal to the audience
- appeal to the reasoning of the audience
- focus on the relevance of your topic
- aligns the speech with the audience, ensuring they understand the information expressed
An informative speech is one that informs the audience. These types of speeches can be on a variety of topics.
An informative speech will:
- define terms to make the information more precise
- use descriptions to help the audience towards a larger picture
- include a demonstration
- explain the concept the informative speech is conveying
Chapter 1: The Speech Communication Process
- Speaker
- Listener(s)
- Message
- Channel
- Context
- Interference
- Feedback
As you might imagine, the speaker is the crucial first element within the speech communication process. Without a speaker, there is no process. The speaker is simply the person who is delivering, or presenting, the speech. A speaker might be someone who is training employees in your workplace. Your professor is another example of a public speaker as s/he gives a lecture. Even a stand-up comedian can be considered a public speaker. After all, each of these people is presenting an oral message to an audience in a public setting. Most speakers, however, would agree that the listener is one of the primary reasons that they speak.
The listener is just as important as the speaker; neither one is effective without the other. The listener is the person or persons who have assembled to hear the oral message. Some texts might even call several listeners an “audience.” The listener generally forms an opinion as to the effectiveness of the speaker and the validity of the speaker’s message based on what they see and hear during the presentation. The listener’s job sometimes includes critiquing, or evaluating, the speaker’s style and message. You might be asked to critique your classmates as they speak or to complete an evaluation of a public speaker in another setting. That makes the job of the listener extremely important. Providing constructive feedback to speakers often helps the speaker improve her/his speech tremendously.
Another crucial element in the speech process is the message. The message is what the speaker is discussing or the ideas that s/he is presenting to you as s/he covers a particular topic. The important chapter concepts presented by your professor become the message during a lecture. The commands and steps you need to use the new software at work are the message of the trainer as s/he presents the information to your department. The message might be lengthy, such as the President’s State of the Union address, or fairly brief, as in a five-minute presentation given in class.
The channel is the means by which the message is sent or transmitted. Different channels are used to deliver the message, depending on the communication type or context. For instance, in mass communication, the channel utilized might be a television or radio broadcast. The use of a cell phone is an example of a channel that you might use to send a friend a message in interpersonal communication. However, the channel typically used within public speaking is the speaker’s voice, or more specifically, the sound waves used to carry the voice to those listening. You could watch a prerecorded speech or one accessible on YouTube, and you might now say the channel is the television or your computer. This is partially true. However, the speech would still have no value if the speaker’s voice was not present, so in reality, the channel is now a combination of the two: the speaker’s voice broadcast through an electronic source.
The context is a bit more complicated than the other elements we have discussed so far. The context is more than one specific component. For example, when you give a speech in your classroom, the classroom, or the physical location of your speech, is part of the context. That’s probably the easiest part of context to grasp.
But you should also consider that the people in your audience expect you to behave in a certain manner, depending on the physical location or the occasion of the presentation. If you gave a toast at a wedding, the audience wouldn’t be surprised if you told a funny story about the couple or used informal gestures such as a high-five or a slap on the groom’s back. That would be acceptable within the expectations of your audience, given the occasion. However, what if the reason for your speech was the presentation of a eulogy at a loved one’s funeral? Would the audience still find a high-five or humor as acceptable in that setting? Probably not. So the expectations of your audience must be factored into context as well.
The cultural rules, often unwritten and sometimes never formally communicated to us, are also a part of the context. Depending on your culture, you would probably agree that there are some “rules” typically adhered to by those attending a funeral. In some cultures, mourners wear dark colors and are somber and quiet. In other cultures, grieving out loud or beating one’s chest to show extreme grief is traditional. Therefore, the rules from our culture, no matter what they are, play a part in the context as well.
Every speaker hopes that her/his speech is clearly understood by the audience. However, there are times when some obstacle gets in the way of the message and interferes with the listener’s ability to hear what’s being said. This is interference, or you might have heard it referred to as “noise.” Every speaker must prepare and present with the assumption that interference is likely to be present in the speaking environment.
Interference can be mental, physical, or physiological. Mental interference occurs when the listener is not fully focused on what s/he is hearing due to her/his own thoughts. If you’ve ever caught yourself daydreaming in class during a lecture, you’re experiencing mental interference. Your own thoughts are getting in the way of the message.
A second form of interference is physical interference. This is noise in the literal sense: someone coughing behind you during a speech or the sound of a mower outside the classroom window. You may be unable to hear the speaker because of the surrounding environmental noises.
The last form of interference is physiological. This type of interference occurs when your body is responsible for the blocked signals. A deaf person, for example, has the truest form of physiological interference; s/he may have varying degrees of difficulty hearing the message. If you’ve ever been in a room that was too cold or too hot and found yourself not paying attention, you’re experiencing physiological interference. Your bodily discomfort distracts from what is happening around you.
The final component within the speech process is feedback. While some might assume that the speaker is the only one who sends a message during a speech, the reality is that the listeners in the audience are sending a message of their own, called feedback. Often this is how the speaker knows if s/he is sending an effective message. Occasionally the feedback from listeners comes in verbal form: questions from the audience or an angry response from a listener about a key point presented. However, in general, feedback during a presentation is typically non-verbal: a student nodding her/his head in agreement or a confused look from an audience member. An observant speaker will scan the audience for these forms of feedback, but keep in mind that non-verbal feedback is often more difficult to spot and to decipher. For example, is a yawn a sign of boredom, or is it simply a tired audience member?
Generally, all of the above elements are present during a speech. However, you might wonder what the process would look like if we used a diagram to illustrate it. Initially, some students think of public speaking as a linear process: the speaker sending a message to the listener, a simple, straight line. But if you’ll think about the components we’ve just covered, you begin to see that a straight line cannot adequately represent the process when we add listener feedback into the process. The listener is sending her/his own message back to the speaker, so perhaps the process might better be represented as circular. Add in some interference and place the example in context, and you have a more complete idea of the speech process.
Fundamentals of Public Speaking, copyright © Lumen Learning, is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise noted.
10 Types of Speeches Every Speechwriter Should Know
“Speech is power. Speech is to persuade, to convert, to compel.” — Ralph Waldo Emerson
Many events in human history can be traced back to that one well-written, well-presented speech. Speeches hold the power to move nations or touch hearts as long as they’re well thought out. This is why mastering the skill of speech-giving and speech writing is something we should all aim to achieve.
But the word “speech” is often too broad and general. So let’s explore the different types of speeches and explain their general concepts.
Basic Types of Speeches
While the core purpose is to deliver a message to an audience, we can still categorize speeches based on 4 main concepts: entertaining, informing, demonstrating and persuading.
The boundaries between these types aren’t always obvious though, so the descriptions are as clear as possible in order to differentiate between them.
1. Entertaining Speech
If you’ve been to a birthday party before, that awkward toast given by friends or family of the lucky birthday person is considered to fall under the definition of an entertaining speech.
The core purpose of an entertaining speech is to amuse the audience, and obviously, entertain them. They’re usually less formal in nature to help communicate emotions rather than to simply talk about a couple of facts.
Let’s face it, we want to be entertained after a long day. Who wouldn’t enjoy watching their favorite actors giving an acceptance speech, right?
You’ll find that entertaining speeches are the most common type of speeches out there. Some examples include speeches given by maids of honor or best men at weddings, acceptance speeches at the Oscars, or even the one given by a school’s principal before or after a talent show.
2. Informative Speech
When you want to educate your audience about a certain topic, you’ll probably opt to create an informative speech. An informative speech’s purpose is to simplify complex theories into simpler, easier-to-digest and less ambiguous ideas; in other words, conveying information accurately.
The informative speech can be thought of as a polar opposite to persuasive speeches since they don’t relate to the audience’s emotions but depend more on facts, studies, and statistics.
Although you might find a bit of overlap between informative and demonstrative speeches, the two are fairly distinct from one another. Informative speeches don’t use the help of visual aids and demonstrations, unlike demonstrative speeches, which will be described next.
Some examples of informative speeches can be speeches given by staff members in meetings, a paleontology lecture, or just about anything from a teacher (except when they’re telling us stories about their pasts).
3. Demonstrative Speech
From its name, we can imagine that a demonstrative speech is the type of speech you want to give to demonstrate how something works or how to do a certain thing. A demonstrative speech utilizes visual aids and/or physical demonstration along with the information provided.
Some might argue that demonstrative speeches are a subclass of informative speeches, but they’re different enough to be considered two distinct types. It’s like differentiating between “what is” and “how to”; informative speeches deal with the theoretical concept while demonstrative speeches look at the topic with a more practical lens.
Tutors explaining how to solve mathematical equations, chefs describing how to prepare a recipe, and the speeches given by developers demonstrating their products are all examples of demonstrative speeches.
4. Persuasive Speech
Persuasive speeches are where all the magic happens. A speech is said to be persuasive if the speaker is trying to prove why his or her point of view is right, and by extension, persuade the audience to embrace that point of view.
Persuasive speeches differ from other basic types of speeches in the sense that they can either fail or succeed to achieve their purpose. You can craft the most carefully written speech and present it in the most graceful manner, yet the audience might not be convinced.
Persuasive speeches can either be logical by using the help of facts or evidence (like a lawyer’s argument in court), or can make use of emotional triggers to spark specific feelings in the audience.
A great example of persuasive speeches is TED / TEDx Talks, because a big number of these talks deal with spreading awareness about various important topics. Another good example is a business pitch to a potential client, i.e. “Why we’re the best company to provide such and such.”
Other Types of Speeches
Other types of speeches are mixes or variations of the basic types discussed previously but deal with a smaller, more specific number of situations.
5. Motivational Speech
A motivational speech is a special kind of persuasive speech, where the speaker encourages the audience to pursue their own well-being. By injecting confidence into the audience, the speaker is able to guide them toward achieving the goals they set together.
A motivational speech is more dependent on stirring emotions instead of persuasion with logic. For example, a sports team pep talk is considered to be a motivational speech where the coach motivates his players by creating a sense of unity between one another.
One of the most well-known motivational speeches (and of all speeches at that) is I Have a Dream by Martin Luther King Jr.
6. Impromptu Speech
Suppose you’re at work, doing your job, minding your own business. Then your co-worker calls you to inform you that he’s sick, there is a big meeting coming up, and you have to take his place and give an update about that project you’ve been working on.
What an awkward situation, right?
Well, that’s what an impromptu speech is: a speech given on the spot without any prior planning or preparation. Being impromptu is more of a property than a type of its own, since you can spontaneously give speeches of any type (not that it’s a good thing though; always try to be prepared for your speeches in order for them to be successful).
Mark Twain once said, “It usually takes me more than three weeks to prepare a good impromptu speech.”
7. Oratorical Speech
This might sound a bit counterintuitive at first since the word oratorical literally means “relating to the act of speech-giving” but an oratorical speech is actually a very specific type of speech.
Oratorical speeches are usually quite long and formal in nature. Their purpose could be to celebrate a certain event like a graduation, to address serious issues and how to deal with them, or to mourn losses and give comfort like a eulogy at a funeral.
8. Debate Speech
The debate speech has the general structure of a persuasive speech in the sense that you use the same mechanics and figures to support your claim, but it’s distinct from a persuasive speech in that its main purpose is to justify your stance toward something rather than convince the audience to share your views.
Debate speeches are mostly improvised since you can’t anticipate all the arguments the other debaters (or the audience) could throw at you. Debate speeches benefit the speaker since they develop critical thinking, public speaking, and research skills, among other benefits.
You’ll find debate speeches to be common in public forums, legislative sessions, and court trials.
9. Forensic Speech
According to the American Forensic Association (AFA), forensic speech is the study and practice of public speaking and debate. It’s said to be practiced by millions of high school and college students.
It’s called forensic because it’s styled after the competitions held in public forums in ancient Greece.
Before a forensic speech, students are expected to research and practice a speech on a given topic in order to teach it to an audience. Schools, universities, and other organizations hold tournaments where these students present their speeches.
10. Special Occasion Speech
If your speech doesn’t fall under any of the previous types, then it probably falls under the special occasion speech. These speeches are usually short and to the point, whether the point is to celebrate a birthday or introduce the guest of honor at an event.
Special occasion speeches include introductory speeches, ceremonial speeches, and tribute speeches. You may notice that all of these can be categorized as entertaining speeches. You’re right: they’re a subtype of entertaining speeches, because they aim neither to teach nor to persuade.
But this type shouldn’t be viewed as the black sheep of the group; in fact, if you aim to mark a significant event, special occasion speeches are the way to go. They are best suited (no pun intended) to a wedding, a bar mitzvah, or even an office party.
If you’ve read this far, you should now have a general understanding of what a speech is and, hopefully, know which type of speech is needed for each occasion. I hope you’ve enjoyed this article and learned something new. Which type will you use for your next occasion?
4 Main Types of Speeches in Public Speaking (With Examples)
We live in a world where communication is king.
With social media and all the digital stuff, we’re bombarded with information constantly, and everyone is fighting for our attention.
Research shows that our attention spans have declined from 12 seconds to just 8.25 seconds in the past 15 years, even shorter than a goldfish’s attention span.
So, the point is this: being able to get your point across quickly and effectively is a big deal. That’s where the invaluable skill of public speaking comes in handy.
But being a great speaker goes beyond just having confidence. It’s about understanding different kinds of speeches and knowing which one works best for your audience and purpose.
In this blog, we will explore four main types of speeches (or types of public speaking), each with its own purpose and impact. By understanding these types, you can connect with your audience, cater to their needs, and deliver a message that resonates.
So, let’s dive right in:
- What is Speech?
- Importance of Public Speaking (7 Benefits)
- 4 Main Types of Public Speeches (With Examples)
- Other Types of Speeches
- Final Thoughts
What is Speech?
A speech is a formal or informal presentation in which a person communicates their thoughts, ideas, or information to an audience. It is a spoken expression of thoughts, often delivered in a structured and organized manner.
Speeches can be delivered to serve various purposes, such as to persuade, educate, motivate, or entertain the audience.
People usually give speeches in public places, like meetings, conferences, classrooms, or special events, aiming to connect with and influence the listeners through their words.
A public speech may involve the use of supporting materials, such as visual aids, slides, or props, to enhance understanding and engagement.
The delivery of a speech encompasses not only the words spoken but also factors like the tone of voice, body language, and timing, which can greatly impact the overall effectiveness and reception of the message.
Importance of Public Speaking (7 Benefits)
Public speaking is a superpower that transforms your life in more ways than you can imagine.
Here are 7 reasons why public speaking is an invaluable skill:
- Effective Communication: Being a good public speaker helps you express yourself clearly and confidently. It allows you to share your knowledge, opinions, and ideas in a captivating manner.
- Professional Growth: Mastering public speaking gives you a competitive edge in the job market. It allows you to lead meetings, present ideas, negotiate deals, and pitch projects with confidence.
- Building Confidence: Overcoming the fear of public speaking and delivering successful presentations significantly boosts your self-confidence. With experience, you become more self-assured in various situations, both inside and outside of public speaking.
- Influence and Persuasion: A strong public speaker can inspire, motivate, and influence others. By effectively conveying your message, you can sway opinions, change attitudes, and drive positive change in your personal and professional circles.
- Leadership Development: Public speaking is a crucial skill for effective leadership. It enables you to inspire and guide others, lead meetings and presentations, and rally people around a common goal.
- Personal Development: Public speaking encourages personal growth and self-improvement. It pushes you out of your comfort zone, enhances your critical thinking and problem-solving skills, and helps you become a more well-rounded individual.
- Increased Visibility: The ability to speak confidently in public attracts attention and raises your visibility among peers, colleagues, and potential employers. This can lead to new opportunities, collaborations, and recognition for your expertise.
Public speaking is a vital tool for social change. History has shown us how influential speeches have shaped the world we live in. From Martin Luther King Jr.’s “I Have a Dream” speech to Malala Yousafzai’s advocacy for girls’ education, public speaking has been at the forefront of inspiring change. Your words have the power to challenge beliefs, ignite passion, and rally others around a cause. So, if you have a message you want to share or a mission you want to pursue, mastering the art of public speaking is essential.
4 Main Types of Public Speeches (With Examples)
1. Informative Speech
An informative speech is a type of public speaking that aims to educate or provide information to the audience about a specific topic. The main purpose of this speech is to present facts, concepts, or ideas in a clear and understandable manner.
Delivering an Informative Speech
In an informative speech, the speaker’s objective is to provide knowledge, increase awareness, or explain a subject in detail.
To be informative, you need to structure your content in a way that’s clear and easy to follow. The structure of an informative speech typically includes:
- an introduction where you grab the audience’s attention and introduce the topic
- the body where you present the main points and supporting evidence
- a conclusion where you summarize the key information and emphasize your message
- a Q&A session or a brief discussion to further deepen the audience’s understanding
An informative speech can be formal or informal, depending on the context. Either way, it is helpful to maintain a conversational tone.
Use relatable examples, anecdotes, or even a touch of humor to keep your audience engaged and interested. Think of it as having a friendly chat with a group of curious friends.
Examples of Informative Speeches:
- Academic Settings: Students may deliver presentations to educate their classmates. Teachers or instructors may explain a specific subject to students in schools, colleges, and universities.
- Business and Professional Presentations: In the corporate world, professionals may present information about industry trends, new technologies, market research, or company updates to inform and educate their colleagues or clients.
- Public Events and Conferences: Informative speeches are prevalent in public events and conferences where experts and thought leaders share their knowledge and insights with a broader audience.
- TED Talks and Similar Platforms: TED speakers design their speeches to educate, inspire, and spread ideas that have the potential to make a positive impact on society.
- Community Gatherings: Informative speeches can be delivered at community gatherings where speakers may inform the community about local issues, government policies, or initiatives aimed at improving the community’s well-being.
The beauty of informative speeches is their versatility; they can be adapted to different settings and tailored to suit the needs and interests of the audience.
2. Demonstrative Speech
In a demonstrative speech, the main goal is to show how to do something or how something works. It is like giving a step-by-step guide or providing practical instructions.
The purpose of a demonstrative speech is to educate or inform the audience about a specific process, task, or concept.
It can be about anything that requires a demonstration, such as cooking a recipe, performing a science experiment, using a software program, or even tying a tie.
The key to a successful demonstrative speech is to be organized and concise.
When preparing for a demonstrative speech, you need to break down the process or technique into clear and easy-to-follow steps.
You need to make sure that your audience can grasp the concepts and replicate the actions themselves. Visual aids like props, slides, or even live demonstrations are incredibly helpful in illustrating your points.
A great demonstrative speech not only teaches but also inspires.
You need to ignite a sense of enthusiasm and curiosity in your audience. Encourage them to try it out themselves and apply what they’ve learned in their own lives.
Examples of Demonstrative Speeches:
- Educational Settings: Demonstrative speeches are often used in classrooms, workshops, or training sessions to teach students or participants how to perform specific activities. For instance, a teacher might give a demonstrative speech on how to conduct a science experiment, play a musical instrument, or solve a math problem.
- Professional Training: In the workplace, a trainer might give a demonstrative speech on how to use a new software application, operate a piece of machinery, or follow safety protocols.
- DIY and Home Improvement: Demonstrative speeches are commonly seen in DIY (do-it-yourself) videos, TV shows, or workshops where experts demonstrate how to complete tasks like painting a room, fixing plumbing issues, or building furniture.
- Culinary Demonstrations: Demonstrative speeches are prevalent in the culinary world, where chefs or cooking experts showcase recipes and cooking techniques.
Overall, a demonstrative speech is a practical and hands-on type of speech that aims to educate, inform, and empower the audience by teaching them how to perform a particular task or skill.
3. Persuasive Speech
A persuasive speech is when the speaker tries to convince the audience to adopt or support a particular point of view, belief, or action. In a persuasive speech, the speaker aims to influence the audience’s opinions, attitudes, or behaviors.
You may present arguments and evidence to support your viewpoint and try to persuade the listeners to take specific actions or simply agree with you.
You have to use persuasive techniques such as logical reasoning, emotional appeals, and credibility to make your case.
Let me break it down for you.
- First, you need a clear and persuasive message. Identify your objective and what you want to achieve with your speech. Once you have a crystal-clear goal, you can shape your arguments and craft your speech accordingly.
- Secondly, you need to connect with your audience on an emotional level. You may use stories, anecdotes, and powerful examples to evoke emotions that resonate with your audience.
- Thirdly, you need to present compelling evidence, facts, and logical reasoning to support your arguments. Back up your claims with credible sources and statistics.
- Additionally, the delivery of your speech plays a crucial role in persuasion. Your body language, tone of voice, and overall presence should exude confidence and conviction.
- Lastly, end your persuasive speech with a call to action. Whether it’s signing a petition, donating to a cause, or changing a behavior, make it clear what steps you want your audience to take.
Examples of Persuasive Speeches:
- Political speeches: Politicians often deliver persuasive speeches to win support for their policies or convince people to vote for them.
- Sales and marketing presentations: Advertisements use persuasive techniques to persuade consumers to buy their products.
- Social issue speeches: Activists, advocates, or community leaders often give persuasive speeches to raise awareness about social issues and mobilize support for a cause.
Effective persuasion helps you win over clients, close deals, and secure promotions.
However, it’s important to note that persuasion should always be used ethically and with integrity. It’s not about manipulating people but rather about creating win-win situations.
4. Entertaining Speech
An entertaining speech is a type of public presentation that aims to captivate and amuse the audience while providing enjoyment and laughter. Unlike other types of speeches, entertaining speeches prioritize humor, storytelling, and engaging content to entertain and delight the listeners.
In an entertaining speech, the speaker uses various techniques such as jokes, anecdotes, funny stories, witty observations, humorous examples, and clever wordplay to engage the audience and elicit laughter.
The primary objective is to entertain and create a positive, lighthearted atmosphere.
An entertaining speech is a powerful tool for building a connection with the audience. It isn’t just about cracking jokes. It’s about using humor strategically to reinforce the main message.
When we’re entertained, our guards come down, and we become more receptive to the speaker’s message. It’s like a spoonful of sugar that helps the medicine go down.
An entertaining speech can be particularly effective when the topic at hand is traditionally considered dull, serious, or sensitive. By infusing humor, you can bring life to the subject matter and help the audience connect with it on a deeper level.
With entertainment, you can make complex concepts more accessible and break down barriers that might otherwise discourage people from paying attention.
Delivery and timing are crucial elements in entertaining speeches.
The speaker’s tone, facial expressions, gestures, and voice modulation play a significant role in enhancing the comedic effect.
Effective use of pauses, punchlines, and comedic timing can heighten the audience’s anticipation and result in laughter and amusement.
Examples of Entertaining Speeches:
- Social Events: Entertaining speeches are often seen at social gatherings such as weddings, birthday parties, or anniversary celebrations.
- Conferences or Conventions: In professional conferences or conventions, an entertaining speech can be a refreshing break from the more serious and technical presentations. A speaker may use humor to liven up the atmosphere.
- Stand-up Comedy: Stand-up comedians are prime examples of entertaining speeches. They perform in comedy clubs, theaters, or even on television shows, aiming to make the audience laugh and enjoy their performance.
The content and style of an entertaining speech should be tailored to the audience and the occasion. While humor is subjective, the skilled entertaining speaker knows how to adapt their speech to suit the preferences and sensibilities of the specific audience. By carefully selecting appropriate humor, you can transform a dull or serious setting into an enjoyable experience for the audience.
Other Types of Speeches
Beyond the four main types of public speeches we mentioned, there are a few other types worth exploring.
- Special Occasion Speeches: These speeches are delivered during specific events or occasions, such as weddings, graduation ceremonies, or award ceremonies. They are meant to honor or celebrate individuals, express congratulations, or provide inspiration and encouragement.
- Motivational Speeches: Motivational speeches aim to inspire and are commonly delivered by coaches, entrepreneurs, or motivational speakers. They often focus on personal development, goal-setting, overcoming obstacles, and achieving success.
- Commemorative Speeches: These are delivered on anniversaries, at memorial services, or at dedications. They express admiration, highlight achievements, and reflect on the impact of the person or event being commemorated.
- Debate Speeches: Debate speeches involve presenting arguments and evidence to support a particular viewpoint on a topic. They require logical reasoning, persuasive language, and the ability to counter opposing arguments effectively.
- Impromptu Speeches: Impromptu speeches are delivered without prior preparation or planning. You are given a topic or a question on the spot and must quickly organize your thoughts and deliver a coherent speech. These speeches test the speaker’s ability to think on their feet and communicate effectively in spontaneous situations.
- Oratorical Speech: An oratorical speech is a formal and eloquent speech delivered with great emphasis and rhetorical flair. It aims to inspire, persuade, or inform the audience through the skilled use of language and powerful delivery techniques. Oratorical speeches are typically given on significant occasions, such as political rallies, commemorative events, or public ceremonies.
No matter what kind of speech you are giving, pauses play a key role in making it captivating.
Final Thoughts
Public speaking is a powerful skill that holds tremendous value in various aspects of our lives. Whether you’re aiming to inform, demonstrate, persuade, or entertain, mastering the art of public speaking can open doors to new opportunities and personal growth.
Growth happens when you push beyond your comfort zones. Public speaking may seem daunting at first, but remember that every great speaker started somewhere. Embrace the challenge and take small steps forward.
Start with speaking in front of friends or family, join a local speaking club, or seek opportunities to present in a supportive environment. Each time you step out of your comfort zone, you grow stronger and more confident.
Seek resources like TED Talks, workshops, books, and podcasts to learn from experienced speakers and improve your skills.
Just like any skill, public speaking requires practice. The more you practice, the more comfortable and confident you will become.
Seek opportunities to speak in public, such as volunteering for presentations or joining public speaking clubs. Embrace every chance to practice and refine your skills.
If you are looking for a supportive environment to practice and hone your public speaking skills, try out BBR English.
Our 1:1 live sessions with a corporate expert are designed to help you improve your communication skills. You’ll gain the confidence and skills you need to communicate effectively in any situation.
Don’t let fear or insecurity hold you back from achieving your goals.
Book your counseling session now and take the first step towards becoming a more confident and effective communicator.
Your future self will thank you!
To get a peek into our results, check out How A Farmer’s Son Faced His Fear Of Public Speaking To Climb Up The Leadership Roles In An MNC.
Happy Speaking!
Team BBR English
Speech Communications: Types of Speeches (Nashville State Community College Research Guides)
Persuasive Speeches
A persuasive speech attempts to influence or reinforce the attitudes, beliefs, or behavior of an audience. This type of speech often includes the following elements:
- appeal to the needs of the audience
- appeal to the reasoning of the audience
- focus on the relevance of your topic to the audience
- fit the speech to the audience, ensuring they understand the information
- make yourself credible by demonstrating your expertise
Watch out for logical fallacies in developing your argument:
- ad hominem argument = attacking an opponent rather than their argument
- bandwagoning = using popular opinion as evidence
- begging the question = using circular reasoning
- either-or fallacy = the argument is structured as having either one answer or another
- hasty generalization = taking one instance as a general pattern
- non sequitur = “it does not follow”; the conclusions are not connected to the reasoning
- red herring fallacy = using irrelevant info in the argument
- slippery slope = assuming that one event will inevitably trigger a chain of events leading to an extreme outcome
- Persuasive Speech Topic Ideas
Informative Speeches
An informative speech is one that enlightens an audience. These types of speeches can be on a variety of topics.
A good informative speech will:
- define terms to make the information clearer
- use descriptions to help the audience form a mental picture
- incorporate a demonstration
- explain concepts in-depth for greater understanding
Informative speech example from Bill Gates: Mosquitos, malaria and education
- Informative Speech Ideas
The 5 Different Types of Speech Styles
Human beings have different ways of communicating. No two people speak the same (nor should they). In fact, if you’ve paid any attention to the speech of people around you, you might have already noticed that it varies from speaker to speaker according to the context. Those variations aren’t merely coincidental.
The 5 Different Types of Speech Styles (Table)
| Style | Characteristics | Application | Examples |
| --- | --- | --- | --- |
| Frozen/Fixed Style | Formal, rigid and static language, reliant on expertise; particular vocabulary, previously agreed upon, that rejects slang. | Formal settings and important ceremonies; speaker to an audience without response. | Presidential speech; anthem; school creed; the Lord’s Prayer. |
| Formal Style | Formal language; particular, previously agreed-upon vocabulary, yet more allowing of slang, contractions, ellipses and qualifying modal adverbials; writing and speaking. | Speaking and writing in formal and professional settings, to medium to large groups of people; speaking and writing to strangers, figures of authority, professionals and elders. | Formal meetings; corporate meetings; court; speeches and presentations; interviews; classes. |
| Consultative Style | Semi-formal vocabulary; unplanned and reliant on the listener’s responses; may include slang, contractions, ellipses and qualifying modal adverbials. | Two-way communication and dialogue, between two or more people, without intimacy or acquaintanceship. | Group discussions; teacher-student, expert-apprentice, work-colleague and employer-employee communication; talking to a stranger. |
| Casual Style | Casual, flexible and informal vocabulary; unplanned and without a particular order; may include slang, contractions, ellipses and qualifying modal adverbials. | Relaxed and casual environments; two or more people with familiarity and a relatively close relationship. | Chats with friends and family; casual phone calls or text messages. |
| Intimate Style | Casual and relaxed vocabulary; incorporates nonverbal and personal language codes (terms of endearment, new expressions with shared meaning); may include slang, contractions, ellipses and qualifying modal adverbials. | Intimate settings, relaxed and casual environments; two or more people with an intimate bond. | Chats between best friends, boyfriend and girlfriend, siblings and other family members, whether in messages, phone calls, or in person. |
1. Frozen Style (or Fixed speech)
This speech style is characterized by the use of certain grammar and vocabulary particular to a given field, one in which the speaker is an expert. The language in this speech style is very formal and static, making it one of the highest forms of speech styles. It’s usually delivered in a format where the speaker talks and the audience listens without actually being given the space to respond.
2. Formal Style
This style, just like the previous one, is also characterized by a formal (agreed upon and even documented) vocabulary and choice of words, yet it’s more universal as it doesn’t necessarily require expertise in any field and it’s not as rigid as the frozen style.
Application: Although it’s often used in writing, it also applies to speaking, especially to medium to large-sized groups. It’s also the type of speech that should be used when communicating with strangers and others such as older people, elders, professionals, and figures of authority.
3. Consultative Style
The third level of communication is a style characterized by a semi-formal vocabulary, often unplanned and reliant on the listener’s responses and overall participation.
Application: any type of two-way communication or dialogue, between two or more people, without intimacy or acquaintanceship.
4. Casual Style (or Informal Style)
Application: used between people with a sense of familiarity and a relatively close relationship, whether in a group or in a one-on-one scenario.
5. Intimate Style
Examples: chats between best friends, boyfriend and girlfriend, siblings and other family members, whether in messages, phone calls, or personally.
4 Factors That Influence Speech Styles
1. The Setting
The setting is essentially the context in which the speech shall take place. It’s probably the most important factor to be considered when choosing which speech style to use as nothing could be more harmful than applying the wrong speech style to the wrong setting.
Although this factor is broad and diverse, to make things simple for you, I’ve divided settings into three main categories:
In these settings, people are more relaxed and less uptight than in formal settings. Since there’s a degree of familiarity between those speaking, even though people are not necessarily intimate, the speaker can apply either consultative or casual speech styles. Some examples of these settings include weddings, company or team meetings, and school classes.
Misreading the setting can be really embarrassing and have devastating consequences. If, for instance, you make inappropriate jokes in a work meeting or use slang words, you could be perceived as unprofessional and disrespectful, and that could cost you your job.
2. The Participants
Your audience, the people to whom your speech is directed, or the people you interact with are decisive factors when choosing your speech style.
3. The Topic
For example, when making a presentation about a serious topic at a conference, you might want to mix formal speech with a more consultative or casual style by sliding in a joke or two during your presentation, as this helps lighten the mood.
4. The Purpose of The Discourse or Conversation
The purpose of your discourse is your main motivation for speaking. Just as with the topic, choosing a speech style to suit your purpose is mostly intuitive and keeps the other factors in mind.
Speaker Styles
A content-rich speaker is one whose aim is to use the speech to inform. He is factual and very objective, focused on providing all the information the audience or recipient of the message needs.
Stand-up comedians are a great example of this type of speaker.
Most TED talkers or motivational speakers are great examples of this type of speaker.
Usually, the type of speaker is not fixed to a single speech style; one person can be many types of speaker, depending on the speech style in use and the factors that influence that choice.
What’s the Importance of Speech Styles in Communication?
Knowing the speech styles and the rules that apply to each of them saves you from embarrassment and positions you as someone principled and respectful, especially in formal and conservative settings.
Besides that, people tend to gravitate more towards and get influenced by good communicators; therefore, learning something new in that area and improving the quality of your speech and presentations will only benefit you.
Communicative Functions and Linguistic Forms in Speech Interaction (Cambridge Studies in Linguistics)
1 - Speech Communication in Human Interaction
Published online by Cambridge University Press: 13 October 2017
1.1 Human Interaction and the Organon Model
Humans interact for a variety of reasons:
for survival and procreation, and for play, which they share with the animal world
for creating habitats and social bonds, which they basically share with many animals
for making tools and using them in their daily activities
for selling and buying, and for business transactions in general
for establishing, enforcing and observing social and legal codes
for social contact, phatic communion and entertainment
for reporting events and issuing warnings
for instructing and learning
for asking, and finding answers to, questions of religious belief, of philosophical understanding, of scientific explanation, of historical facts and developments
for artistic pursuits for eye, ear and mind.
Central to all these human interactions is speech communication, i.e. communication via an articulatory–acoustic–auditory channel (AAA) between a sender and a receiver, supplemented by a gestural–optical–visual channel (GOV). Speech communication is based on cognitive constructs that order the world and human action in space and time. These constructs are manifested in the AAA channel as words with their paradigmatic and phonotactic sound structures, and as syntagmatic organisations of words in utterances. The words and phrase structures are linked to articulatory processes in speech production by a speaker and to auditory patterns in speech perception and understanding by a listener. Speech communication on the basis of such cognitive constructs, of their formal representation, and articulatory and perceptual substantiation, shared by speakers and listeners, performs three basic functions in sender–receiver interaction:
(a) the transmission of symptoms relating to the sender's feelings and attitudes in the communicative act
(b) the emission of signals by a sender to a receiver to stimulate behaviour
(c) the transmission of symbols mapped to objects and factual relations in space and time, constructing the world in communicative acts.
This is Karl Bühler’s Organon Model (from Classical Greek ὄργανον, ‘instrument, tool, organ’, after Aristotle’s works on logic: Bühler 1934, pp. 24–33; see Figure 1.1), which relates the linguistic sign to the speaker, the listener, and the world of objects and factual relations, in the communicative functions of Expression (a), Appeal (b) and Representation (c). The objects in the symbolic mapping of (c) are not just concrete things, e.g. ‘table’, ‘mountain’, but also include abstract entities, e.g. ‘love’, ‘death’, and attributes, e.g. ‘redness’, ‘beauty’. The symbolic mapping to objects and factual relations constitutes language structure [Sprachgebilde] based on social convention binding individual speakers in their speech actions [Sprechhandlungen]. This is, in Bühler’s terms (1934, pp. 48ff), de Saussure’s langue versus parole (de Saussure 1922). For the Representation function, the human mind devised systems to capture linguistic signs graphically on durable material in order to overcome the time and space binding of fleeting signals through AAA and GOV channels. These writing systems are either logographic, with reference to the symbolic values of linguistic signs, or phonographic, with reference to their sound properties, either syllabic or segmental. The latter, alphabetic writing, was only invented once, in the Semitic language family. It conquered the world and became the basis of linguistic study, which has, for many centuries, focused on the Representation function in written texts, or on speech reduced to alphabetic writing.
Figure 1.1. The Organon Model according to Bühler (1934, p. 28), with the original German labels, and their added English translations, of the three relationships, functions and aspects of the linguistic SIGN Z(eichen).
The three aspects of the linguistic sign – sender symptom, receiver-directed signal, symbol-to-world mapping – are semasiological categories, with primary manifestation through the AAA channel, but accompanied in varying degrees by the GOV channel, more particularly for the functions (a) and (b). Bühler made it quite clear that he regarded the three functions of his Organon Model as being operative in any speech action at any given moment, but with varying strengths of each, depending on the communicative situation. In rational discourse, Representation with symbol-to-world mapping dominates; in highly emotional communication, it is the symptoms of the Expression function; in commands on the drill ground, the signals of the Appeal function; a balance of signals and symptoms occurs in words of endearment or abuse. An aggressive act may be totally devoid of symbolic meaning, as in the reported case of a Bonn student silencing the most powerful market crier in the Bonn fruit and vegetable market, eventually having her in tears, by simply reciting the Greek and Hebrew alphabets loudly with pressed phonation: ‘Sie Alpha! Sie Beta! …’ (Bühler 1934, p. 32).
The linguistic sign is at the centre of the model and has a direct iconic symptom or signal relationship to the sender or the receiver in Expression and Appeal, respectively, and an indirect symbolic relationship to objects and factual relations in Representation. The direct or indirect relationships are indicated by plain or dotted connection lines in Figure 1.1. The linguistic sign is encapsulated in a circle encircling all three functions. Superimposed on, and cutting across, this circle is a triangle connecting with the sign’s three functions and covering a smaller area of the circle, as well as going beyond the area of the circle with its three edges. The triangle represents Bühler’s principle of abstractive relevance applied to the phonetic manifestation of the linguistic sign, which is captured by the circle. The triangle contains only the communicatively relevant features of the total of phonetic properties, and at the same time it adds functional aspects in relation to the three communicative functions that are absent from the phonetic substance.
Abstractive relevance is also the basis for Bühler’s concept of phonology versus phonetics, which he expounded in his seminal article of 1931 (Bühler 1931), and which Trubetzkoy took over in his Grundzüge of 1939. Abstractive relevance means that the total phonetic substance of the instantiation of a linguistic sign is reduced to its functionally relevant phonetic features by an abstractive scaling, not by abstract representation. Thus, phonology comes out of phonetics, phonetics does not go into phonology, contrary to Ladd (2011), who interpreted Prague phonology as an abstraction from phonetics that stopped short of its logical conclusion (cf. Kohler 2013a). The mistake Prague phonology made was not incomplete phonological abstraction from concrete phonetics but the postulation of two disciplines, phonology and phonetics, of which the former was furthermore linked with the humanities, the latter with the natural sciences. The reason this happened lies in the methodology of early experimental phonetics at the turn of the twentieth century, where linguistic concepts disintegrated and objective truth was sought in speech curves of various, mainly articulatory, origins and in the numbers derived from them (Scripture 1935, p. 135). This imbalance was put right again by the Prague linguists, who reintroduced the functional aspect into phonetics, which had always been present in the several thousand years of descriptive studies of speech sounds in languages since the invention of alphabetic writing. Bühler’s concept of abstractive relevance shows how this dichotomy can be overcome: there is only one science of the sound of speech in human language – it determines the functionally relevant features in speech communication in the languages of the world from the broad array of sound in individual speech acts.
Bühler developed the concept of abstractive relevance in connection with the symbolic mapping of sound markers of the linguistic sign to Objects and Factual Relations in the Representation function, especially the sound markers of names (words) assigned to objects. The entire sound impressions of words are not relevant for the differential name-object mappings; only a small number of systematically ordered distinctive sound features are. This is the principal aspect of Prague segmental word phonology incorporated into Bühler's theory of language. Apart from lexical tone and stress, this framework says nothing about prosodic phonology at the level of mapping formal phrasal structures and factual relations in the world. Bühler left a gap in his theory of language, which needs filling in two respects:
He considers ‘musical modulation’ at the utterance level in the Indo-European languages to be irrelevant for Representation, and therefore free to be varied diacritically in the other two functions, for example, adding an urgent Appeal to the German phrase ‘es regnet’ [it is raining] in order to remind a forgetful person to take an umbrella (Bühler 1934, p. 46). Global unstructured utterance prosodies are seen as Expression or Appeal overlays on structured phonematic lexical sound markers in Representation. This is incomplete in two respects: prosody can and does map symbolic relations in Representation, and it is highly structured in all three functions. In Bühler’s time, prosodic research was still in its infancy, so he was not able to draw on as rich a data analysis as we can today.
The function-form perspective is to map the functions of the Organon Model, as well as subfunctions in each, and their formal systems and structures at all linguistic levels, from phonetics/phonology through the lexicon and morphology to syntax. For example, the investigation into Question versus Statement needs to consider word-order syntax, question particles and prosody. In this way, prosody as the acoustic exponent in symbolic phrase-level mapping is treated on a par with other formal means, lexical and structural, and is thus fully integrated into the theory of language and of language comparison.
Bühler saw the gap in his theory and set a goal for further development:
Let me stress the point once again: these are only phenomena of dominance, in which one of the three fundamental relationships of the language sounds is in the foreground. The decisive scientific verification of our constitutional formula, the Organon Model of language, has been given if it turns out that each of the three relationships, each of the three semantic functions of language signs discloses and identifies a specific realm of linguistic phenomena and facts. That is indeed the case. ‘Expression in language’ and ‘appeal in language’ are partial objects for all of language research, and thus display their own specific structures in comparison with representation in language … This is the thesis of the three functions of language in simplest terms. It will be verified as a whole when all three books that the Organon Model requires have been written.
He himself concentrated on the Representation function, which he indicated in the subtitle of the book. It resulted from an extensive study of the extant literature of Indo-European historical linguistics, with its focus on such topics as the Indo-European case system, deixis and pronouns, anaphora, word and sentence, compound, ellipsis and metaphor, generally dealing with Representation, written texts and historical comparison. He was also thoroughly familiar with the Greek philosophers, with modern logic and with the philosophy of language. He especially discussed Husserl’s Logische Untersuchungen and Cartesianische Meditationen in some detail in connection with the concept of Sprechakte [speech acts], in which a speaker confers specific discourse-driven meanings to words of a language, and which are distinguished from Sprechhandlungen [speech actions], the unique hic et nunc utterances by individuals. He also took de Saussure (1922), and especially Gardiner (1932) and Wegener (1885), into account, who added the linguistic expert perspective to what Bühler contributed as a psychologist working with language. The study of language is a study of creative actions, not of a static linguistic object, because language users interact through speech actions in communicative speech acts by means of a Sprachgebilde [language structure] to create Sprachwerke [language works]. This naturally led to the Organon Model and to looking beyond Representation.
1.2 Deictic and Symbolic Fields in Speech Communication
In addition to the Organon Model, Bühler proposed a two-field theory of speech communication: the pointing or deictic field and the naming or symbolic field. The deictic field is one-dimensional, with systems of deictic elements that receive their ordering in contexts of situation. In a pointing field, a speaker sets the sender origo of hic-nunc-ego coordinates, which position the speaker in space and time for the communicative action. Within the set coordinates, the sender transmits gestural and/or acoustic signals to a proximate or distant receiver. These signals point to the sender, or to the receiver, or to the world of objects away, or far away, from (the positions of) the sender and the receiver. Receivers relate the received sender-, receiver- or world-related signals to their own hic-nunc-ego coordinates to interpret them. The understanding of their intended meanings relies on material signal properties that guide the receiver through four different pointing dimensions: here or hic deixis; where-you-are or istic deixis; there or illic deixis; and yonder deixis.
On the other hand, the symbolic field in its most developed synsemantic form is a field where linguistic signs do not occur primarily in situational but in linguistic contexts. It is two-dimensional, comprising systems of signs for objects and factual relations, a lexicon, and structures, a syntax, into which the systemic units are ordered. Another symbolic field is the one-dimensional sympractical field, which contains systems of signs that are situation-related in an action field, rather than being anchored in linguistic context.
1.2.1 Deictic Field Structures
In deictic communication, the sender creates a situational field in space and time by using optical and acoustic signals in relation to the sender’s hic-nunc-ego coordinates, and the receiver decodes these signals with reference to the receiver’s position in the created communal space-time situation. If the signals are optical, they are gestures, including index finger or head pointing, and eye contact. If they are acoustic, they include linguistic signs, deictic particles, demonstrative and personal pronouns, which function as attention signals. These signals structure the deictic field with reference to the four pointing dimensions. For each dimension, the relation may be at, to or from the reference, as in the Latin deictic signs ‘hic’, at the sender, ‘huc’, to the sender, ‘hinc’, from the sender; ‘istic’, at the receiver, ‘istuc’, to the receiver, ‘istinc’, from the receiver; ‘illic’, at a third-person place, ‘illuc’, to that place, ‘illinc’, from that place. Yet pointing in a situational field is always meant for a receiver, even if there are no specific receiver-deictic signals. The linguistic signs receive their referential meaning through the situational dimensions. ‘here’, ‘I’, ‘yours’, ‘that one over there’ are semantically unspecified outside the hic-nunc-ego coordinates of the deictic field. Languages differ a great deal in the way they structure the deictic field with deictic linguistic signs. Latin provides a particularly systematic place-structure deixis. Linguistic deixis signs are not only accompanied by gestures, but also by acoustic signals pointing to the sender or the receiver.
1.2.1.1 here or hic Deixis
Hic deixis signalling points to the sender in two ways, giving (1) the position and (2) the personal identification of the sender.
(1) Position of the Sender
When speaker B answers ‘Here’ from a removed place after speaker A has called out, ‘Anna, where are you?’, the deictic ‘here’ is defined within sender B's coordinates but remains unspecified for receiver A unless the acoustics of the uttered word contain properties pointing to the sender's position in space, or, in the case of potential visual contact between sender and receiver, are accompanied by a gestural signal of a raised hand or index finger. The acoustic properties of signal energy and signal directionality give A a fair idea of the distance and the direction of B's position in relation to A's coordinates, indicating whether the sender is nearby, e.g. in the same room or somewhere close in the open, but outside A's visual field, or whether B is in an adjoining room, or on another floor or outside the house. From their daily experience with speech interaction, both speaker and listener are familiar with the generation and understanding of these sender-related pointing signals.
A different variety of this hic deixis occurs in response to hearing one's name in a roll call, which depends on visual contact for verification. Raising arm and index finger and/or calling out ‘Here’/‘Yes’ transmits the sender's position and personal (see (2) below) coordinates.
(2) Personal Identification of the Sender
A speaker B, waiting outside the door or gate to be let in, may answer ‘It's me’ in response to a speaker A asking ‘Who is there?’ over the intercom. This hic deixis is only understandable if A has a mental trace of B's individual voice characteristics. Their presence in a pointing signal allows the correct interpretation of an otherwise unspecified ‘me’. Again, speakers and listeners are familiar with these material properties of individualisation in speech interaction.
1.2.1.2 where-you-are or istic Deixis
Istic deixis signalling points to the receiver. Bühler thought that, contrary to the other types of deixis, there are no specific systematic pointing signals for istic deixis, although he lists a few subsidiary devices on an articulated sound basis, such as ‘pst’, ‘hey’, ‘hello’, or the reference to the receiver by ‘you’, or by the personal name, accompanied by index finger pointing and/or head turning to the person to establish eye contact.
However, an examination of the various occurrences of istic deixis, and of the pitch patterns associated with them in English and German, shows up a specific melodic device that is characteristic of utterances pointing to the receiver, and that differs from pitch patterns used in other types of deixis and in the synsemantic symbolic field of speech communication. It is level pitch stepping up or down, or staying level, as against continuous movement (see 2.14). Continuous pitch patterns form a system of distinctive differences for coding Representation, Appeal and Expression functions in speech communication in a particular language. They fill speech communication with sender-receiver-world content. On the other hand, stepping patterns function as pointing signals to the receiver to control sender-receiver interaction; they do not primarily fill it with expression of the speaker, attitudes towards the listener and representation of the world. The referential content is predictable from the discourse context, and the sender shares an established and mutually acknowledged routine convention with the receiver. These acoustic patterns of istic Deixis may occur interspersed in speech communication at any moment to initiate, sustain and close speaker–listener interaction, with two different functions, either to control connection with a receiver or to induce specific action in a receiver. In all the varieties of istic Deixis found in English and German, which will be discussed as subcategories of an Appeal function in 4.1, stepping pitch patterns operate as such receiver-directed control signals. When they are replaced by continuous contours in the same verbal contexts they lose the simple pointing control characteristic and become commands, expressive pronouncements and informative statements in acts of speech communication.
1.2.1.3 Proximate and Distant Pointing: there or illic Deixis and yonder Deixis
In order to point away from sender and receiver to objects in a proximate or a distant pointing field, arm and finger gestures are used as the standard signal. Demonstrative pronouns and position adverbs, such as ‘that’ – ‘yon’, ‘there’ – ‘yonder’/‘over there’ in English, or ‘der’ – ‘jener’, ‘da’ – ‘dort’, ‘dort’ – ‘dort drüben’/‘jenseits’ in German are linguistic signs used for pointing in sympractical usage, accompanied by gesture, but they also operate anaphorically in a synsemantic field. The distinction between proximate and distant positions in a speaker's pointing field coordinates is less clearly defined than the one between the positions of sender and receiver. Languages do not always have a stable, formally marked system of proximate and distant position adverbs and demonstrative pronouns. Even English ‘yon’ and ‘yonder’ are literary and archaic outside dialectal, especially Scottish, usage, and German ‘da’ versus ‘dort’ are unstable in their position references. Speakers use phrase constructions instead to define different field positions, for example ‘over there’ in English and ‘dort drüben’ in German, or they define distant positions in relation to landmarks. As regards signalling proximate and distant positions by gesture, stretching out arm and index finger in the direction of an object may be used for the former, an upward-downward arm–index finger movement for the latter.
Signalling in a pointing field may combine a deixis gesture to objects with a deixis gesture to the receiver. I recently observed an instance of this. I had just got some cash out of an ATM but was still close to the machine when another customer approached to use it. He turned his head towards me and, with his far-away arm and index finger, pointed to the machine, asking ‘Fertig?’ [Finished?] (with high-level pitch). He identified the object he wanted to use after me and, by looking at me, identified me as the receiver of his object pointing and his enquiry, which he spoke with the acoustic stylisation of istic Deixis, a high-level pitch signalling ‘May I use this machine?’ This is a Deixis Appeal, different from the Question Appeal ‘Is it true that you are finished with the machine?’ (see 4.1, 4.2). The response may be ‘Ja, bitte’ or just ‘Bitte’ [(Yes,) go ahead], which is impossible in the Question context.
Now let's visit a Scottish pub to illustrate the whole gamut of communicative interaction that is possible with ordering beer, from a synsemantic description to mere gesture. One may give the order ‘A pint of Caledonian 80/- please’, with continuous pitch. In this situation, it is quite clear that it means ‘I want to buy a pint of heavy draught beer sold under the Caledonian Company's trademark’, but nobody would use this synsemantic description. The order may be shortened to ‘A pint of Caley 80, please’, again with continuous pitch. Or one may just point to the label on a draught pump and say ‘Pint, please’, with a high-level pitch pattern, to induce the receiver to act. This is accompanied by turning one's head towards the barman, to establish gestural contact. Or the customer may point towards the pump with one hand and hold up the other hand with fingers raised according to the number of pints wanted. In a pub near the Tynecastle Hearts football stadium in Edinburgh, called ‘The Diggers’, because it used to be frequented by gravediggers from a nearby cemetery, this gesturing can be further reduced to mere finger raising, because the barmen know that regulars drink one of their fourteen types of heavy, of course by the pint, and they also know who drinks Caley 80.
The speech versions of this pub order are self-sufficient sympractical signs in their own right, not ellipses of a synsemantic structure ‘Sell me a pint of Caledonian 80/- heavy draught beer, please.’ In such a sympractical communication field the transmission of symbolic meaning through speech is reduced to the minimum considered necessary by the communicators; the action field and the situation supply the referential meaning, and the Expression and Appeal functions are of secondary relevance. They may come in when speakers do not get what they want. Interaction still works when gestures take over altogether. In this case, the question of an ellipsis simply does not arise, which also casts doubt on any attempt to derive the linguistically reduced forms from fully elaborated ones.
1.2.2 From Sympractical Deixis in Situations to Synsemantic Symbols in Contexts
A sender communicating with a receiver may establish a hic-nunc-ego origo in a deictic field relating to their actual situation. In its simplest form, communication is just by gesture or by gesture accompanying sympractical speech, or only by sympractical speech pointing to the situation that both sender and receiver are connected with (e.g. ‘mind the gap’/‘mind your step’ announcements on the London Underground/at Schiphol Airport, see 4.1.3(2)). The pointing in this sympractical speech may be done by deictic particles and pronouns with or without finger gesture, as in ‘The flowers over there.’ Direct pointing by gesture and/or deictic words may be removed in synsemantic place description in relation to the origo, as in ‘The flowers are on the table at the window in the back room upstairs.’ But there is still some pointing in relation to the position of the sender’s origo in such utterances, because they are only intelligible with reference to the situation both sender and receiver are related to, and they presuppose the receiver’s awareness of, and familiarity with, the locality.
In developing talk, a speaker may move the hic-nunc-ego origo from the actual sender-receiver situation to a place and time in memories and imagination, and relate symbols to this new origo position, thus creating a virtual deictic field, which Bühler calls Deixis am Phantasma [Phantasma Deixis] (Bühler 1934, pp. 121ff). In Indo-European languages the same deictic signs are used as for pointing in actual situations (‘this (one)’, ‘that (one)’, ‘here’, ‘there’; German ‘dieser’, ‘jener’, ‘der(jenige)’), supplemented by position and time adverbs and conjunctions. This pointing in displaced virtual situations is found in narrating fairy tales:
Es war einmal ein kleines süßes Mädchen … Eines Tages sprach seine Mutter zu ihm: … bring das der Großmutter hinaus …
Once upon a time there was a dear little girl … One day her mother said to her: … take this to your grandmother …
or in storytelling of past or future events:
After a five-hour climb we arrived at the top of Ben Nevis. Here we first of all had a rest. Then we dug into our food. And when the fog lifted, we were rewarded by the most spectacular view of the Highland scenery around and below us.
or in giving direction:
You take the road north out of our village. When you get to a junction turn right, then the first left. You continue there for about a mile. Then the castle will come into view.
Communicative action changes completely when symbols are anchored in the context of linguistic structure and are freed from situations. Let's assume that on 3 April 2005, the day after Pope John Paul II died, a passenger on a New York subway train says to the person beside him, ‘The Pope's died’, referring to what he has just read in the paper. This statement is removed from the place of the communicative situation between sender and receiver: it might have been made anywhere around the world (in the respective languages), but it is still linked to the time when the speaker makes it. In a proposition like ‘Two times two is four’ the time link is also severed. This is the self-contained synsemantic use of symbols in a symbolic field to refer to objects and factual relations, valid at all times and places, in statements of mathematics, logic and science.
As regards the intonation of such sentences in oral communicative actions, the occurrence of stepping pitch is all the more likely the stronger the sympractical deictic element. Synsemantic sentences have continuous pitch, rising-falling centred on ‘Pope’, and on the second ‘two’ in the above examples. If ‘↓The ↑Pope's died’ is spoken with upward-stepping pitch, it may, for example, come from a newspaper seller in the street attracting the attention of people passing by (‘Buy the paper, and read more about the news of the Pope’), i.e. a receiver-directed signal puts the synsemantic sentence into a pointing situation. Similarly, when saying the times tables by rote, for instance teacher-directed in class, an upward-stepping pattern will be given to the synsemantic sentence: ‘↓One times two is ↑two. ↓Two times two is ↑four. ↓Three times two is ↑six …’, or shortened to ‘↓One two ↑two. ↓Two twos ↑four. ↓Three twos ↑six …’
There is another, quite different way of introducing pointing into the synsemantic symbolic field: anaphora (Bühler 1934, pp. 121ff, 385ff). It reinforces reference to the symbolic context because pointing occurs with backward or forward reference to the internal structure of developing talk in the symbolic field, not with reference to the external situation: the symbolic (linguistic) context functions as the pointing field. In Indo-European languages, the exponents are again the same deictics as for pointing in situations, supplemented by position and time adverbs, conjunctions, relative and third-person personal pronouns, fully integrated into the case system and syntax of the language. For examples from German illustrating the distinction between external situation and internal anaphoric pointing, see Abraham (2011, pp. xxiiff).
1.3 From Function to Form
1.3.1 Bühler and Functional Linguistics of the Prague School
Bühler places language functions at the centre of his theory of language and then looks at their mapping with linguistic form, for example in the discussion of the Indo-European case system as a formal device for representing objects and factual relations of the world with symbols in a symbolic field (Bühler 1934, pp. 249ff).
The model is thus eminently suited as a theoretical basis for a function-form approach. The notion of function has played a role in many structural theories of language that ask about the acts language users perform with the formal tools. Functional theories of grammar strive to define these functions and subsequently relate them to the structural carriers. The most elementary function is the differentiation of representational meaning, in its simplest form in functional phonetics. The Prague School linguists were the first to develop functional structuralism, starting with phonology, based on the principle of the distinction of lexical meaning, rather than on the principle of complementary distribution, as in American behaviourist structuralism.
Under the influence of Bühler's Organon Model, Trubetzkoy (1939, pp. 17ff) complemented the phonology of the Representation function (‘Darstellungsphonologie’) by phonologies of the Expression and Appeal (conative) functions (‘Ausdrucks- und Kundgabephonologie’), which he did not always find easy to separate, and which, following Prague systematising, he allocated to a new discipline, called ‘sound stylistics’ (‘Lautstilistik’), with two subsections. He subsumed vocalic lengthening, as in ‘It's wonderful!’, and initial consonant lengthening, as in ‘You're a bastard!’, under the Appeal function, because he maintained that the speaker signals to the listener to empathise with his/her feelings. Isačenko (1966) rightly criticised this solution as unacceptable psychologising and allocated such data to the Expression function; I do likewise. Mathesius (1966) extended the functional perspective to lexical and syntactic form (beside accentuation and intonation) for Intensification and for Information Selection and Weighting. Contrary to general usage, he called the latter emphasis. Since this term is used with a wide array of significations I shall avoid it altogether and refer to the two functions by the above pair of terms. An Intensification scale will be incorporated into the Organon Model as the Expressive Low-to-High Key function (see Chapter 5).
Jakobson (1960) took up the three functions of Bühler's model as emotive, conative and referential, oriented towards addresser, addressee and message referent. He derived a magic, incantatory function from the triadic model as a ‘conversion of an absent or inanimate “third person” into an addressee of a conative message’ (p. 355). Prayer comes under this heading. And he added another three functions (pp. 355ff):
phatic, serving to establish, prolong or discontinue communication: ‘Can you hear me?’ ‘Not a bad day, is it?’ – ‘It isn't, is it, could be a lot worse’ (an exchange between two hikers meeting in the Scottish hills on a foggy, drizzly day)
poetic, focusing on the message for its own sake: rhythmic effects make ‘Joan and Margery’ sound smoother than ‘Margery and Joan’; the poetic device of paronomasia selects ‘horrible’ instead of ‘terrible’ in ‘I hate horrible Harry’
metalingual, language turning back on itself: ‘What is a sophomore?’ – ‘A sophomore means a second-year student.’
Jakobson gives the following linguistic criteria for the poetic and metalingual functions:
We must recall the two basic modes of arrangement used in verbal behavior, selection and combination … The poetic function projects the principle of equivalence from the axis of selection into the axis of combination. Equivalence is promoted to the constitutive device of the sequence. In poetry one syllable is equalised with any other syllable of the same sequence; word stress is assumed to equal word stress, as unstress equals unstress … Syllables are converted into units of measure, and so are morae or stresses … in metalanguage the sequence is used to build an equation, whereas in poetry the equation is used to build a sequence.
Jakobson's additional communicative functions are an extension to Bühler's theory of language, but they are not on a par with the three functions of the Organon Model; rather, they are functions within the domains of the sender, the receiver and the referent. The phatic function is clearly receiver-directed and constitutes one type of signalling. The metalingual function belongs to the domain of objects and factual relations, and constitutes the essence of a symbolic speech act. The poetic function is not a function in the sense of the other two, i.e. of communicative action between a sender and a receiver. It is a device characterising a speech act or a language work. As such, it may have an aesthetic function to give sensuous pleasure, or a Guide function to increase intelligibility, or a rhetorical function to persuade, as in advertising, in all cases in the domain of the receiver. In the example of paronomasia given above, the poetic device has a speaker-focused Expression function, which it may also have in reciting lyrical poetry.
Garvin (1994, p. 64), in discussing Charles Morris's three branches of semiotics – syntactics, semantics, pragmatics – notes:
In Bühler's field theory … variants [of structural linguistic units] can be interpreted in terms of the field-derived properties of the units in question. In the Morrisian schema, I do not seem to be able to find a real place for this issue … None of this, of course, means that I object to ‘pragmatics’ as a label of convenience for the discussion of certain of the phenomena that, as I have repeatedly asserted, Bühler's field theory handles more adequately, I only object to giving theoretical significance as a separate ‘level’ or ‘component’ … the foundation of Bühler's theory is the … Gestalt-psychological notion of the figure–ground relation. Morris's foundations, on the other hand, are admittedly behaviorist … There is no doubt about my preference for the Gestaltist position … It is interesting to note that many of the linguists who have arrived at a total rejection of the behaviorist bases of descriptivist linguistics nevertheless have come to use the Morrisian schema, at least to the extent of accepting a pragmatics component for explaining certain phenomena.
In full agreement with Garvin's dictum, I also follow Bühler's theory of language. Building the theory, and the empirical analysis, of language on the Organon Model can immediately dispense with all the subdivisions of the field of speech science into separate disciplines, phonology versus phonetics, phonology versus sound stylistics, linguistics versus paralinguistics, pragmatics versus syntax and semantics, and relate units and structures across all linguistic levels of analysis to axiomatically postulated functions in speech communication – functions in the domains of Sender, Receiver and Referent, such as Question, Command, Request, Information Selection and Weighting, Intensification. In moving from these functions to the linguistic signs in their deictic and symbolic fields, speech science can capture all the formal phonetic, phonological and linguistic aspects related to them.
1.3.2 Halliday's Systemic Functional Linguistics
A few words need to be said about another, more recent functional framework that is also rooted in the European linguistic tradition, more particularly J. R. Firth's enquiry into systems of meaning (Firth 1957): Michael Halliday's Systemic Functional Linguistics (SFL) (Hasan 2009). It is conceived as systemic with reference to paradigmatic choices in language, and also as functional with regard to specific functions that these formal systems are to serve in communication. These functions are called metafunctions, comprising the ideational function (experiential and logical), the interpersonal function and the textual function. There are correspondences between Halliday's and Bühler's functions but also fundamental differences. Standing in the European tradition, Halliday and Hasan know Bühler's Theory of Language, but they do not always represent it correctly. Hasan (2009, p. 19) says:
Bühler thought of functions as operating one at a time; further, his functions were hierarchically ordered, with the referential as the most important. The metafunctions in SFL are not hierarchised; they have equal status, and each is manifested in every act of language use: in fact, an important task for grammatics is to describe how the three metafunctions are woven together into the same linguistic unit.
The concept of ‘function’, when used in SFL with reference to the system of language as a whole, is critically different from the concept of ‘function’, as applied to a speech act such as promising, ordering, etc., or as applied to isolated utterances à la Bühler (1934) for the classification of children's utterances as referential, conative or expressive. SFL uses the term ‘metafunction’, to distinguish functions of langue system from the ‘function’ of an utterance.
As regards the first quotation, the discussion in this chapter will have made it clear that Bühler's three functions in the Organon Model do not operate one at a time, and they are not hierarchically ordered. His linguistic sign has the three functions of Expression, Appeal and Representation at any given moment, but depending on the type of communicative action their relative weighting changes. In the Theory of Language, he puts particular emphasis on the representational function, because this is the area linguistics had been dealing with predominantly during the nineteenth century and up to his time, and he felt a few principles that were generally applied needed to be put right.
The second quotation shows the reason for the misunderstanding. The fundamental difference between the two models is not that Halliday takes a global view of the system of language and Bühler refers to speech actions in isolated utterances. The difference is between a descriptive product model of language in SFL (Bühler's Sprachwerk), and a communicative process model of speech actions, which looks at communicative functions between speakers and listeners in speech interaction (Bühler's Sprechhandlungen). It is the difference between the linguist's versus the psychologist's view of speech and language. Halliday asks ‘How does language work?’, whereas Bühler asks ‘How do speakers and listeners communicate about the world with linguistic signs in deictic and symbolic fields?’ The Organon Model is system-oriented, not restricted to utterances, although the functions surface in utterance signals. Halliday's interpersonal function is part of all three Organon functions: social aspects of the speaker's expression, of attitudes and appeals to the listener, and of representation of the factual world. For Bühler, social relationships determine the communicative interaction between speakers and listeners about referents, i.e. they shape the three functions of the linguistic sign. For Halliday and Hasan, the interpersonal metalevel is a function at a linguistic level, the level of sociolinguistics. The two models are thus complementary perspectives; for a phonetician the process model is particularly attractive because it allows the modelling of speech communication in human interaction.
1.3.3 Discourse Representation Theory
More recent language theories have sprung up from logical semantics incorporating context dependence into the study of meaning. A prominent representative of this dynamic semantics is Discourse Representation Theory (DRT), developed by Kamp and co-workers (Kamp and Reyle 1993) over the past two decades. Utterances are regarded as interpretable only when the interpreter takes account of the contexts in which they are made, and the interaction between context and utterance is considered reciprocal. ‘Each utterance contributes (via the interpretation which it is given) to the context in which it is made. It modifies the context into a new context, in which this contribution is reflected; and it is this new context which then informs the interpretation of whatever utterance comes next’ (p. 4). This has resulted in moving away from the classical conception of formal semantics and replacing its central concept of truth by one of information: ‘the meaning of a sentence is not its truth conditions but its “information change potential” – its capacity for modifying given contexts or information states into new ones’ (Kamp, Genabith and Reyle 2011, p. 4). Anaphoric pronouns referring back to something that was introduced previously in the discourse are the most familiar and certainly the most thoroughly investigated kind of context dependence within this framework.
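To make the notion of ‘information change potential’ concrete, the following minimal Python sketch caricatures a dynamic-semantics update under drastic simplifying assumptions: the information state is reduced to a list of discourse referents plus a list of conditions, and anaphora resolution is reduced to ‘bind the pronoun to the most recent referent’. The names (Context, update, the IT placeholder) are illustrative, not DRT's construction algorithm.

```python
# Toy information-state update in the spirit of dynamic semantics:
# an utterance maps an input context to a new context, rather than
# being evaluated for truth in isolation. Illustrative sketch only.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Context:
    referents: List[str] = field(default_factory=list)   # newest last
    conditions: List[str] = field(default_factory=list)

def update(ctx: Context, introduces: List[str], asserts: List[str]) -> Context:
    """Apply an utterance's 'information change potential' to ctx."""
    new = Context(list(ctx.referents), list(ctx.conditions))
    new.referents.extend(introduces)
    for cond in asserts:
        # Caricature of anaphora resolution: an unresolved pronoun
        # (marked 'IT') is bound to the most recently introduced referent.
        new.conditions.append(cond.replace("IT", new.referents[-1]))
    return new

c0 = Context()
c1 = update(c0, ["x_farmer", "y_donkey"], ["owns(x_farmer, y_donkey)"])
c2 = update(c1, [], ["beats(x_farmer, IT)"])   # 'it' resolves to y_donkey
print(c2.conditions)
# ['owns(x_farmer, y_donkey)', 'beats(x_farmer, y_donkey)']
```

Each call returns a new context informed by the previous one, which is all that the ‘reciprocal interaction between context and utterance’ amounts to in this toy reduction.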
At first sight, this paradigm looks very similar to Bühler's, and, as its proponents and followers would maintain, is far superior because it is formalised, thus testable, and eminently suited to be applied to the automatic analysis of appropriately tagged corpora. But closer inspection reveals that the two are not compatible. DRT talks about utterances in context but means sentences in textual linguistic environments. However, it is speech actions that occur in everyday communication, and they occur not only in synsemantic contexts but, first and foremost, in contexts of situation in sympractical fields. Moreover, not all actions subserve information transmission, because there is phatic communion (Jakobson 1960; Malinowski 1923), and appeal to the receiver as well as expression of the sender, where referential meaning is subordinate to social and emotive interaction. DRT would need new categories and a change of perspective, going beyond information structure in texts, to provide explanations for exchanges by speech and gesture, such as the ones experienced on a Kiel bus or in a Scottish pub (cf. Introduction and 1.2.1.3). Here is another set of possible speech actions that illustrate the great communicative variety beyond information exchange in synsemantic text fields:
I am about to leave the house to go to work, putting on my coat in the hall. My wife is in the adjoining open-plan sitting-room. She briefly looks out of the window and calls to me ‘It's raining’, with a downstepping level pitch pattern on ‘raining’ (see 4.1 ), to draw my attention to the need to take protection against the weather. I thank her for warning me, grab my umbrella, say ‘See you tonight’ and leave.
After I have gone, she calls her sister in Edinburgh, and, following their exchanges of greetings, she goes on to talk about the weather, inevitable in British conversation, and asks, ‘What's your weather like?’, not because she wants to get meteorological information but as an interactional opening. She gets the answer ‘It's raining’, with a continuous, low falling pitch pattern across the utterance, suggesting ‘What else do you expect?’ This is followed by a reference to the Kiel weather and then by an appraisal that the recent terrible flooding in the North of England was much worse, so there is really no reason to complain. After this ritual, the two sisters exchange information about family and friends for another half hour, the goal of the telephone call.
After coming off the phone, she switches the radio on to get the 11 a.m. regional news. At the end, the weather forecast reports ‘In Kiel regnet es heute’ [It is raining in Kiel today]. This is now factual weather information, located in place and time, intended for an anonymous public, therefore removed from interaction between communicators, and since the individual recipient had looked out of the window, the speech action has no informative impact on her.
Each of these communicative interchanges serves a different, but very useful, communicative goal, with different values attributed to the information conveyed. DRT cannot model this diversity because the differently valued types of information are not simply the result of an incremental development of meaning evolving in linguistic contexts but depend on talk in interaction between communicators in contexts of situation. This fact is addressed by Ginzburg (2012) in the Interactive Stance Model (ISM).
1.3.4 Ginzburg's Interactive Stance Model
This is a theory of meaning in interaction that, on the one hand, is based on the DRT notion of dynamic semantics and, on the other, incorporates two concepts from Conversation Analysis (Schegloff, Jefferson and Sacks 1977) and from psycholinguistics (Clark 1996): repair and grounding of content in the communicators’ common ground through interaction in contexts. Ginzburg defines the goal of his semantic theory as ‘to characterize for any utterance type the contextual update that emerges in the aftermath of successful exchange and the range of possible clarification requests otherwise. This is, arguably, the early twenty-first-century analogue of truth conditions’ (Ginzburg 2012, p. 8). This means that an adequate semantic theory must model imperfect communication just as much as successful communication. Besides giving meaning to indexicals ‘I’, ‘you’, ‘here’, ‘there’, ‘now’ through linguistic context in dynamic semantics, non-sentential units, such as ‘yes’, ‘what?’, ‘where?’, ‘why?’ etc., and repeated fragments of preceding utterances must receive their meanings through the interactive stance in contexts of situation. These are ideas that have been proposed, in a non-formalised way, by Bühler (1934) and Gardiner (1932), as well as by Firth (1957) and his followers in Britain to this day, e.g. John Local and Richard Ogden. None of this literature is cited, no doubt because it is considered outdated and surpassed by more testable and scientific models. However, careful study of the ideas of both camps reaches the opposite conclusion.
Ginzburg's theoretical proposition is that ‘grammar and interaction are intrinsically bound’ and that ‘the right way to construe grammar is as a system that characterizes types of talk in interaction’ (Ginzburg 2012, p. 349). The pivotal category in this interaction is gameboards, one for each participant, which make communicators keep track of unresolved issues in questions under discussion and allow for imperfect communication through mismatches. The corollary of the notion of the personal gameboard is that participants may not have equal access to the common ground, and contextual options available to one may be distinct from those available to the other(s). Ginzburg illustrates this with a constructed example of dialogue interaction under what he terms the Turn-Taking Puzzle (p. 23).
a. A: Which members of this audience own a parakeet? Why? (= Why own a parakeet?)
b. A: Which members of this audience own a parakeet?
   B: Why? (= Why are you asking which members of this audience own a parakeet?)
c. A: Which members of this audience own a parakeet? Why am I asking this question?
He explains the different meanings accorded to ‘why’ in the three contexts by referring them to who keeps, or takes over, the turn. ‘The resolution that can be associated with “Why?” if A keeps the turn is unavailable to B were s/he to have taken over, and vice versa. c. shows that these facts cannot be reduced to coherence or plausibility – the resolution unavailable to A in a. yields a coherent follow-up to A's initial query if it is expressed by means of a non-elliptic form.’
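Ginzburg's per-participant gameboards can be caricatured in a few lines of Python. The reduction below keeps only a set of grounded facts and a stack of questions under discussion (QUD) per participant; class and method names are illustrative, not Ginzburg's formalism.

```python
# Minimal sketch of per-participant dialogue gameboards; only grounded
# facts and a QUD stack are modelled. Illustrative reduction only.

class Gameboard:
    def __init__(self, owner: str):
        self.owner = owner
        self.facts = set()    # what this participant takes as grounded
        self.qud = []         # unresolved questions, most recent last

    def raise_question(self, q: str) -> None:
        self.qud.append(q)

    def resolve(self, q: str, answer: str) -> None:
        if q in self.qud:
            self.qud.remove(q)
            self.facts.add(f"{q} -> {answer}")

# Each participant keeps their own board, so their contextual options
# can diverge -- the asymmetry behind the Turn-Taking Puzzle.
a, b = Gameboard("A"), Gameboard("B")
for board in (a, b):
    board.raise_question("who owns a parakeet?")
b.raise_question("why is A asking?")      # live on B's board only
print(a.qud)   # ['who owns a parakeet?']
print(b.qud)   # ['who owns a parakeet?', 'why is A asking?']
```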
These constructed dialogues are problematic, because they lack a sufficiently specified context of situation and violate rules of behavioural interaction beyond speech, and their interpretation by reference to turn-taking is flawed. The reference to ‘members of this audience’ in a book on the Interactive Stance indicates that the speaker must be contextualised as addressing, and interacting with, a group attending a talk, not as establishing contact for interaction with one or several individuals. Thus B, who is an individual that has not been addressed individually, will not call out from among the audience with non-sentential ‘Why?’ to ask why the speaker addressed the group with that question. There are three possible reactions from the audience. (1) There is a show of hands by those members who have a parakeet. (2) There is no gestural or vocal response, because nobody in the audience has a parakeet. (3) There may be a call from an obstreperous young attendee, something like ‘What the heck are you asking that for? Get on with your subject.’ Just as A did not establish interaction with individual members of the audience, speaker B in (3), in turn, does not intend to interact with A, but opposes interaction by refusing to answer A's question.
In response to the reactions, or the lack of a reaction, from the audience, A may continue with ‘Why am I asking this question?’ (in (2) after pausing for a couple of seconds). In all these cases, A starts a new turn, after a gestural turn from the audience in (1), after a speech turn from an individual member in (3) and after registering absence of a response in (2). A produces an interrogative form that is no longer a Question because it lacks the Appeal to somebody else to answer A's question. It actualises the content of a potential question that the members of the audience may have asked in (1), and particularly in (2), and did ask in (3). This is a Question Quote (see 4.2.2.7). Since it is not a Question Appeal it cannot be reduced to the bare lexical interrogative, which presupposes the Appeal function, and it has falling intonation. In German, the Question Quote would be realised by dependent-clause syntax ‘(Sie mögen sich fragen) Warum ich diese Frage stelle?’, instead of the interrogative syntax ‘Warum stelle ich diese Frage?’ The latter (as well as its English syntactic equivalent) has two communicative meanings: (a) A appeals to receivers to give an answer why they think A asks the question; (b) it is the speaker's exclamatory expression ‘Why on earth am I asking this? (It does not get me anywhere!)’ With meaning (b), the interrogative form does not code a question either, since A does not appeal reflexively to A to give an answer to a proposition A is querying. In traditional terminology it would be called a rhetorical question, but in terms of communicative function it is a speaker-centred Expression rather than a listener-directed Appeal. Neither (a) nor (b) seems to have a behavioural likelihood in the interaction with an audience. Ginzburg's sequencing of Information Question and Question Quote in one turn in c. may therefore be considered an ill-formed representation of behavioural interaction. Before giving a Question Quote to the audience, A must have assessed their reaction to the Information Question A put to them.
There is a third possibility (c) of contextualising the German and English interrogative forms ‘Warum stelle ich diese Frage?’ and ‘Why am I asking this question?’ Here is a possible lecture context (let's assume A is male):
d. A: I would like to raise a question at the outset of my talk: ‘How many of this audience keep parakeets at home?’ Why am I asking this question? Well, let me explain. I would like to share experiences of parakeets’ talking behaviour with you in the discussion after my talk. So, could I have a show of hands, please. ‘Which of you have a parakeet?’
This constructed opening of a lecture illustrates the lecturer's ambivalent function of reporter to an audience and communicator with an audience. His main function is to report subject matter. In his role as a reporter, the lecturer may raise questions in connection with the topic of the talk, appealing to virtual recipients to give answers. In this reporting role, the lecturer does not enter into interaction with communicators in a real context of situation. He creates a virtual question–answer field in which he enacts interaction between virtual senders and receivers whom he brings to life through his mouth. He treats the audience as external observers of the reporter's question–answer field. This is question–answer phantasma, in an extension of Bühler's notion of Deixis am Phantasma (see the Introduction and Bühler 1934, pp. 121ff). The lecturer's second function is to enter into an interaction with the audience.
In d., lecturer A is first a reporter, then a communicator. A reports two questions for which virtual receivers are to provide answers in the lecture. The second question is immediately answered by the reporter. These questions differ from the question-in-interaction at the end by being non-interactive. The second question can be a virtual Information Question with falling intonation, where the reporter enacts the sender and, at the same time, the receiver to give the answer. It may also be a virtual Confirmation Question, with high-rising intonation starting on ‘why’ (see 4.2.2.4), where the reporter enacts a virtual sender who reflects on his reasons for having asked, and a virtual receiver who is to confirm the reasons in the answer: ‘Why am I asking this question really?’ Ginzburg's interactive stance excludes both these questions from his context c., but he obviously explains c. in the non-interactive way of d. This problem must have been realised by the reviewer of Ginzburg (2012), Eleni Gregoromichelaki (2013), because she replaced ‘this audience’ by ‘our team’ in her discussion of Ginzburg's ‘parakeet’ example, which is now a question to individual communicators.
Ginzburg's sequencing of a general ‘who?’ and a more specific follow-up ‘why?’ Information Question in one turn of a. is also a behaviourally ill-formed representation. There must be some response to the first Information Question before the second one is asked in a new turn. Moreover, if the first question is put to an audience, A needs to select an individual B, or several individuals in succession, for an answer to the second question, because it can no longer be gestural but must be vocal. There is the possibility of a double question, ‘Do you own a parakeet and why?’, in the opening turn of an interaction with an individual.
Taking all these points together, there is no compelling reason to associate the attribution of different meanings to non-sentential ‘why?’ with turn-holding or turn-taking. Ginzburg's explication of this change of meaning in an interaction, with reference to different options available to communicators in their respective turns, is not convincing. He does not provide a sufficiently specified interactional setting, does not distinguish between interactions with a group and with an individual, and fails to differentiate Question function and interrogative form. Furthermore, he does not acknowledge the occurrence of gestural beside vocal turns, nor of two successive turns by the same speaker, only separated by a pause for the assessment of the interactive point that has been reached. And, last but not least, he discusses questions as if they are removed from interaction in spite of their contextualisations. His concept of interaction does not model speech action in communicative contexts in human behaviour but is derived post festum from formal relations in written text, or spoken discourse that has been reduced to writing, or in constructed dialogues dissociated from interaction.
Now let us give Ginzburg's interaction scenario a more precise definition and develop the meanings of the two non-sentential ‘why?'s in it.
[General context of situation: A famous member of the International Phonetic Association (P) is giving an invited talk to the Royal Zoological Society of Scotland on the subject ‘Talking parakeets’. After the introduction by the host and giving thanks for the invitation, P opens his talk.]
a. P(1): I suppose quite a few, if not all, of you have a parakeet at home.
   P(2): [points to an elderly lady in the front row] What about you, madam? Do you keep one?
   S(1): I do. [may be accompanied, or replaced, by nodding] Why?
   P(3): Why am I asking you this question. Well, let me explain. I am interested in how owners of parakeets communicate with their pets.
b. [same precursor as in a., then:]
   S(1): I do.
   P(3): Why?
   S(2): Why? Well, because it keeps me company.
In a., P(2) asks a Polarity Question (see 4.2.2.2) whether the elderly lady keeps a parakeet in her home, most probably with a falling intonation because the speaker prejudges the answer ‘yes’. S(1) answers in the affirmative and asks an Information Question (see 4.2.2.3), appealing to P to tell her why he asked her. To establish rapport with P, S will use low-rising intonation in both her Statement and her Question. P(3) quotes the content of S's question (see 4.2.2.6), putting it in interrogative form to himself, as a theme for his rheme explanation of his original Polarity Question. Since the utterance is a factual report, lacking an appeal, the intonation falls. (In German it would be ‘Warum ich Ihnen diese Frage stelle’, again with falling intonation.) In b., P's Polarity Question is answered in the affirmative by S, as in a. This is followed by P asking a follow-up Information Question about the lady's reasons for keeping a parakeet. The intonation may fall or rise, depending on whether P simply asks a factual question or, additionally, establishes rapport with S. This is, in turn, followed by S's Confirmation Question ‘Are you asking me why?’, with high-rising intonation on the lexical interrogative (see 4.2.2.4), in turn followed by her answer.
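The functional analysis just given can be tabulated compactly. The following sketch merely restates it as (turn, utterance, question type, intonation) records, with the type labels taken from the section references above; the data layout is illustrative, not part of the model.

```python
# Tabulation of the analysis of dialogues a. and b. above.

analysis = {
    "a": [
        ("P(2)", "Do you keep one?", "Polarity Question", "falling"),
        ("S(1)", "Why?", "Information Question", "low-rising"),
        ("P(3)", "Why am I asking you this question.", "Question Quote", "falling"),
    ],
    "b": [
        ("P(3)", "Why?", "Information Question", "falling or rising"),
        ("S(2)", "Why?", "Confirmation Question", "high-rising"),
    ],
}

for label, turns in analysis.items():
    for speaker, words, qtype, tune in turns:
        print(f"{label}. {speaker:5} {words:38} {qtype:22} {tune}")
```

Laid out this way, the point of the example is visible at a glance: the same non-sentential ‘Why?’ surfaces as functionally distinct Questions, kept apart by position in the interaction and by intonation.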
These examples illustrate communicative steps in a question–answer interaction field, made up of declarative and interrogative syntactic structures with varying intonation patterns as carriers of Statements and different types of Question. Different functionally defined question types are bound to the semantic points reached at each step in the interaction and are not exchangeable without changing the semantic context. The crucial issue is that an interrogative form does not receive different meanings in different contexts of situation in interaction, as Ginzburg maintains. Rather, the transmission of meaning at different points in interaction necessitates functionally different Questions, which may be manifested by identical interrogative structure. This is the function-form approach proposed in this monograph, which also incorporates an important prosodic component to differentiate between lexically and syntactically identical utterances. Ginzburg's semantic modelling takes an infelicitous turn in three steps:
He does not recognise question function beside interrogative form.
He is forced to locate semantic differentiators in interaction contexts when syntactically identical interrogatives (disregarding utterance prosody), such as ‘why’, occur with different meanings, and he then incorporates the contexts into the grammar.
He finally refers the semantic differences of these utterances to their turn-holding or turn-taking positions in dialogue.
The reason Ginzburg tries to resolve the semantic indeterminacy of formal grammar by incorporating context of interaction in it lies in the development of semantics in linguistic theory. The formal component of American Structuralism, as systematised by Zellig Harris (1951, 1960), became the morphosyntactic core of his pupil Noam Chomsky's Generative Grammar (Chomsky 1957, 1965), with a semantic and a phonological interpretive level attached at either end of the generative rule system. Within this generative framework, semantics gradually assumed an independent status, which culminated in DRT. With growing interest in spontaneous speech, the meaning of interaction elements that go beyond linguistic context variables had to be taken into account. This led to the inclusion of situational context in formal grammar, which became Ginzburg's research goal. It continues the preoccupation with form since the days of structuralism, now with an ever-increasing concern for meaning.
1.3.5 Developing a Model of Speech Communication
To really become an advanced semantic theory of the twenty-first century, the relationship between grammar and interaction would need to be reversed, with a form-in-function approach replacing interaction-in-grammar by grammar-in-interaction. Empirical research within a theory of speech communication can offer greater insight into the use of speech and language than systematising linguistic forms in discourse contexts with grammar-based formalisms. It is a task for the social sciences, including linguistics, to develop a comprehensive Theory of Human Interaction, which contains a sub-theory of Speech Communication, which in turn contains a Grammar of Human Language and Grammars of Languages. Herbert Clark has taken a big step towards this goal by advocating that:
We must take … an action approach to language use, which has distinct advantages over the more traditional product approach … Language use arises in joint activities … you take the joint activity to be primary, and the language … used along the way to be secondary, a means to an end. To account for the language used, we need to understand the joint activities [for which a framework of interactional categories is proposed].
Influenced by the Language Philosophers Grice (1957), Austin (1962) and Searle (1969), he expanded their theory of meaning in action, speech acts, to a theory of meaning in joint activities and joint actions, which accords the listener an equally important role, beside the speaker, in establishing communicative meaning: ‘There can be no communication without listeners taking actions too – without them understanding what speakers mean’ (Clark 1996, p. 138). However, Clark is first and foremost concerned with language-u, the ‘language’ of language use, which he contrasts with language-s, the traditional conceptualisation of ‘language’ as language structure (p. 392). What we need is the incorporation of language-s into the theory of speech communication, including the AAA and GOV channels, and a powerful model of fine-graded prosodic systems and structures to signal communicative functions in language-u.
Since speech and language are anchored in the wider field of human interaction, a communicative approach is the basis of a successful interdisciplinary linguistic science. The seminal thoughts that the psychologist Karl Bühler published on this topic eighty years ago are in no way outdated and inferior to more recent attempts at formalising interaction contexts in grammar. On the contrary, the product approaches of SFL, DRT and ISM, in the tradition of structural linguistics, deal with the formal results of interaction and lose sight of the functions controlling interaction processes, a distinction Bühler captured with Sprachwerk [language work] versus Sprechhandlung [speech action]. Since Bühler's theory is little known in the linguistic world, especially among an Anglophone readership, this chapter has given an overview of its main components, to bring them back into the arena of theoretical discussion in formal linguistics and measurement-driven phonetics. I shall pick up Bühler's threads in the following chapters to weave a tapestry of speech communication, and elaborate Bühler's model into a function network in human speech interaction to which communicative form across the AAA and GOV channels will be related. More particularly, I shall provide subcategorisations of the functions of Representation, Appeal and Expression in Chapters 3, 4 and 5, and integrate prosody, the prime formal exponent of Appeal and Expression, into the functional framework of the Organon Model. In adding the prosodic level to the analysis of speech interaction, which is largely missing from the formalised context-in-grammar accounts of DRT and ISM, I shall be relying on insights from extensive research on communicative phonetics carried out at Kiel University over the past thirty years.
The communicative model starts from speech functions and integrates with them the production and perception of paradigmatic systems and syntagmatic structures in morpho-syntax, sounds and prosodies. Thus, the functional categories of Statement or Question or Command/Request are separated conceptually and notationally from the syntactic structures of declarative or interrogative or imperative, with distinctive prosodic patterns coding further functional subcategorisations. In German and English, various syntactic structures can be used, with different connotations, of course, to code a Command or a Request:
with falling intonation for a Command or rising intonation for a Request
Mach (bitte) das Fenster zu! | Shut the window (please)!
with falling intonation and reinforced accents for a Command
Machst du endlich das Fenster zu! | Are you going to shut the window!
with rising intonation and default accents for a Request
Würdest du bitte das Fenster zumachen! | Would you like to shut the window!
with falling intonation and reinforced accentuation for a Command
Du hast die Tür offen gelassen! | You have left the door open!
Du hast vergessen, die Tür zuzumachen! | You forgot to shut the door!
Du machst jetzt das Fenster zu! | You are going to shut the window at once!
Or a Question:
for a Polarity Question
Ist er nach Rom gefahren? | Has he gone to Rome?
with rising intonation or in high register for a Confirmation Question
Er ist nach Rom gefahren? | He's gone to Rome?
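The separation of function from form can be sketched as a lookup keyed by syntactic structure plus intonation, using only the German/English pairs listed above. This is a minimal sketch, assuming the falling default for the Polarity Question mentioned earlier in this chapter; the dictionary layout is illustrative, not a formalism from the monograph.

```python
# Function-form pairs from the examples above; illustrative lookup only.

FUNCTION_OF_FORM = {
    ("imperative",    "falling"):                     "Command",
    ("imperative",    "rising"):                      "Request",
    ("interrogative", "falling, reinforced accents"): "Command",  # Machst du endlich ...!
    ("interrogative", "rising, default accents"):     "Request",  # Würdest du bitte ...!
    ("declarative",   "falling, reinforced accents"): "Command",  # Du machst jetzt ...!
    ("interrogative", "falling"):                     "Polarity Question",   # assumed default
    ("declarative",   "rising or high register"):     "Confirmation Question",
}

# Identical syntax, different functions, kept apart by prosody:
print(FUNCTION_OF_FORM[("interrogative", "falling")])
print(FUNCTION_OF_FORM[("interrogative", "rising, default accents")])
```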
Furthermore, within Statement or Question or Command/Request, functional relations between semantic constituents are manifested by syntactic structures between formal elements. Both are enclosed in < >, the former in small capitals, the latter in italics (for some of the notional terminology, see Lyons 1968, pp. 340ff):
In the active versus passive constructions of Indo-European languages, <Agent> is coded by <subject> and <prepositional phrase>, <Goal> by <object> and <subject>, respectively.
<Agent> | <Action> | <Goal>
<Die Nachbarn> | <verprügelten> | <den Einbrecher>.
<The neighbours> | <beat up> | <the burglar>.
<Goal> | <Action> | <Agent> | <Action>
<Der Einbrecher> | <wurde> | <von den Nachbarn> | <verprügelt>.
<The burglar> | <was beaten up> | <by the neighbours>.
The <Action>, coded by the unitary <verb> ‘verprügelten’ or ‘beat up’, may be divided into the semantic dyad <Action> <Goal>, coded by the <verb> <direct object> phrase ‘verpassten eine gehörige Tracht Prügel’ or ‘gave a good beating’, making ‘Einbrecher’ or ‘burglar’ the <Recipient> <indirect object> of <Action> <Goal>. Active can again be turned into passive.
Finally, the passive patient construction may be lexical:
Der Einbrecher bekam von den Nachbarn eine gehörige Tracht Prügel. | The burglar got a good beating from the neighbours.
Another type of proposition centres on an <Event>, for instance meteorological events:
<Event> | <Time>/<Place> | <Place>/<Time>
<Es regnet/schneit> | <heute> | <in Paris>.
<It's raining/snowing> | <in Paris> | <today>.
In these cases, both the event and its occurrence are coded syntactically by the impersonal verb construction. But, more generally, the two semantic components are separated in syntactic structure, for instance as <subject> and <verb>, and German and English may go different ways, for example in:
Zur Zeit ist über Paris ein Unwetter.
<Event Occurrence>
<Es stürmt, hagelt, blitzt und donnert>
There is a heavy thunderstorm over Paris right now.
<Occurrence> | <Event>
<There are> | <gale-force winds, hail, thunder and lightning>
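The pairing of semantic constituents (small capitals) with syntactic constituents (italics) can likewise be written out as data. Here is a minimal sketch using the active/passive example above; the Mapping class is illustrative only.

```python
# Semantic-role-to-syntactic-constituent mappings from the examples above.

from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Mapping:
    voice: str
    pairs: List[Tuple[str, str, str]]   # (semantic role, syntactic role, form)

active = Mapping("active", [
    ("Agent",  "subject", "Die Nachbarn"),
    ("Action", "verb",    "verprügelten"),
    ("Goal",   "object",  "den Einbrecher"),
])

passive = Mapping("passive", [
    ("Goal",   "subject",              "Der Einbrecher"),
    ("Action", "verb",                 "wurde ... verprügelt"),
    ("Agent",  "prepositional phrase", "von den Nachbarn"),
])

for m in (active, passive):
    print(m.voice + ":",
          "; ".join(f"<{sem}> -> <{syn}> {form!r}" for sem, syn, form in m.pairs))
```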
Before I move on, let me add a word of clarification concerning the difference, and the relationship, between communicative theory and linguistic discovery procedures. It is a well-established, very useful goal in linguistics to work out the systems and structures of distinctive phonetic sound units that are used to distinguish words in a language, including lexical tone, lexical stress and phonation type in tone, lexical stress and lexical voice quality languages. It is mandatory to base this investigation on the word removed from communicative context in interaction. There is an equally established and useful procedure to work out the morpho-syntactic elements and structures, as well as the accent and intonation patterns that carry distinctive sentential meaning. This puts the sentence removed from communicative context in focus. In the initial analysis stages of an unknown, hitherto uninvestigated language, these phonological and syntactic discovery procedures produce context-free word and sentence representations, which will have to be adjusted as the investigation continues and more and more context is introduced in a series of procedural steps. But it will not be possible to base the phonetic or syntactic analysis on talk in interaction for a long time yet. The procedural product approach makes it possible to reduce a language to writing, and to compile grammars, as well as dictionaries, that link graphemic, phonetic and semantic information for speakers and learners of the language to consult for text writing and speaking. The product approach to language forms also provides useful procedural tools for language and dialect comparison, dialect geography, language typology and historical linguistics.
But the situation changes when languages have been investigated for a very long time, such as English, German, French, Spanish, Arabic, Hindi, Japanese and Mandarin Chinese. When sound representations of words and structural representations of words in sentences have been put in systematic descriptive linguistic formats in such languages, linguistic pursuits may proceed in two different ways.
(1) The formal representations may acquire a purpose in themselves and assume the status of the ‘real’ thing they are supposed to map. Then proponents of another linguistic paradigm may recycle the same data in a different format of their own, suggesting that it increases the explanatory power for the ‘real’ thing. So, we experience recycling of the same data from Structural Linguistics to Generative Grammar to Government and Binding, to Head-driven Phrase Structure Grammar to Role and Representation Grammar, and so on. An example from phonology is the treatment of Turkish vowel harmony in the frameworks of structural phonemics, generative phonology and Firthian prosodic analysis (Lees 1961; Voegelin and Ellinghausen 1943; Waterson 1956). The contribution of such l'art pour l'art linguistics to the understanding of speech communication in human interaction is limited.
(2) On the other hand, it may be considered timely to renew theoretical reflection on how speakers and listeners interact with each other, using language beside other communicative means in contexts of situation. The forms obtained through a linguistic product approach will now be studied as manifestations of communicative functions in interactive language use. SFL, DRT and ISM are no longer discovery procedures, but theoretical models. They stop short, however, of reaching the dynamic level of speech interaction because they are still product-oriented and incorporate interaction context statically into structural representation.
Future research will benefit from advancing models of speech communication in interaction for at least some of the well-studied languages of the world. This monograph is an attempt in this direction, focusing primarily on German and English, but additionally including other languages in the discussion of selected communicative aspects. The results of this action approach can, in turn, be fed back into the product approach of language description and comparison. For example, in traditional language descriptions interrogative structures are compared between languages with regard to some vague ‘question’ concept. In the action approach, different types of question are postulated as different Appeal functions in human interaction, and the interrogative forms found in different languages are related to these functions. This will have a great effect on making language teaching and language learning, based on linguistic descriptions of languages, more efficient.
Since, in addition to the syntactic structures, prosody is another central formal device in this functional framework, a prosodic model needs to be selected that guarantees observational and explanatory adequacy for the communicative perspective. This goal can best be achieved when the choice follows from a critical comparative overview of the most influential descriptive paradigms that have been proposed in the past. Therefore, the remaining section of this chapter provides such an historical survey to prepare the exposition, in Chapter 2, of the prosodic model adopted for integration in the Organon Model.
1.4 Descriptive Modelling of Prosody – An Overview of Paradigms
The study of prosody has concentrated on intonation and, with few exceptions, such as Bolinger's work (1978, 1986), has focused on the formal elements and structures of auditory pitch and acoustic F0 patterns. Questions of meaning and the function of these patterns were raised post hoc, above all in relation to syntactic structures, sentence mode, phrasing and focus. Two influential paradigms in the study of prosody, the British and the American approach, are briefly discussed here, as a basis for the exposition of the Kiel Intonation Model (KIM), the former because KIM is an offspring of it, the latter in order to show and explain the divergence of KIM from present-day mainstream prosody research. Examples will, in each case, be presented in original notations, as well as in KIM/PROLAB symbolisations (cf. the list at the end of the Introduction), for cross-reference.
1.4.1 The Study of Intonation in the London School of Phonetics
Descriptions of intonation by the London School of Phonetics (Allen 1954; Armstrong and Ward 1931; Cruttenden 1974, 1986 (2nd edn 1997); Jones 1956; Kingdon 1958; Lee 1956; O'Connor and Arnold 1961; Palmer 1924; Palmer and Blandford 1939; Schubiger 1958; Wells 2006) relied on auditory observation and introspection for practical application in teaching English as a foreign language. Armstrong and Ward (1931) and Jones (1956) set up two basic tunes for English, imposed on stress patterns and represented by dots, dashes and curves: Tune I, falling, associated with statements, commands and wh-questions; Tune II, rising, associated with requests and word-order questions. Modifications of these generate falling-rising and rising-falling, as well as pitch-expanded and compressed, patterns, signalling emphasis for contrast and intensity.
Palmer, Kingdon, and O'Connor and Arnold elaborated this basic two-tune concept by differentiating tunes according to falling, low-rising, high-rising, falling-rising and rising-falling patterns. Palmer introduced tonetic marks in orthographic text to represent the significant points of a tune, rather than marking every syllable. This was a move towards a phonological assessment of prosodic substance. The tune was also divided into syntagmatic constituents. O'Connor and Arnold's practical introduction became the standard textbook of Standard Southern British English intonation, proposing a division of tunes, now called tone groups, first into nucleus and prenucleus, then into nuclear tune and tail, and into head and prehead, respectively. These structural parts, with their paradigmatic elements, are combined into ten Tone Groups, five with falling, five with rising tunes at the nucleus. The intonation patterns are, in turn, related to four grammatical structures – statements, questions, commands and interjections. These are formal syntactic structures: declarative syntax, lexical interrogative (called special questions), word-order interrogative syntax (called general questions), imperative syntax and interjectional ellipsis. High-rising nuclear tunes in declarative syntax (‘You like him?’) are discussed under the formal heading of statements, though referred to as ‘questions’ in a functional sense. Similarly, low-falling nuclear tunes in word-order question syntax (‘Will you be quiet!’, ‘Stand still, will you!’, with a high head, or ‘Aren't you lucky!’, with a low head) are discussed under the formal category of ‘general questions’, though referred to as ‘commands’ or ‘exclamations’ in a functional sense. This highlights the formal point of departure of intonation analysis. However, the formal description is followed by a discussion of fine shades of meaning carried by the ten tone groups in their four syntactic environments. This discussion is couched in descriptive ordinary-language word labels (e.g. ‘Tone Group 2 is used to give a categorical, considered, weighty, judicial, dispassionate character to statements’), not in terms of a semantic theory of speech functions. The result is a mix of the formal elements and structures of intonation and syntax in English with ad hoc semantic interpretations. The descriptive semantic additions include attitudinal and expressive meaning over and above the meaning of syntax-dependent sentence modes, i.e. they are treated inside linguistics, not relegated to paralinguistics.
The phoneticians of the London School were excellent observers, with well-trained analytic ears. Although they did not have the concept of alignment of pitch accents with stressed syllables, and did not separate edge tones from pitch accents, central premisses in AM Phonology, they described the auditory differences in minute, accurate detail. What AM Phonology later categorised as H+L*, H* or L+H*, L*+H pitch accents, combined with L-L% edge tones, are separate unitary pitch contours in the taxonomic system of the London School: low fall, high fall, rise-fall. AM/ToBI H* and L+H*/L*+H, combined with L-H%, are fall-rise and rise-fall-rise. Ladd (1996, pp. 44f, 122f, 291 n.6, 132ff) accepts this contour approach as observationally adequate but does not consider it descriptively adequate, because it does not separate edge tones from pitch accents and does not associate the latter with stresses in various alignments.
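The correspondences just stated between London School nuclear tones and AM/ToBI labels can be arranged as a simple lookup; the ToBI strings are exactly those given in the text, while the dictionary itself is merely an illustrative arrangement.

```python
# London School nuclear tones and their AM/ToBI categorisations,
# as stated in the text above.

LONDON_TO_TOBI = {
    "low fall":       "H+L* L-L%",
    "high fall":      "H* L-L%",
    "rise-fall":      "L+H* L-L%",    # or L*+H L-L%
    "fall-rise":      "H* L-H%",
    "rise-fall-rise": "L+H* L-H%",    # or L*+H L-H%
}

for contour, tobi in LONDON_TO_TOBI.items():
    print(f"{contour:15} ~ {tobi}")
```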
In Ladd's view, a lack of insight into prosodic structures is most obvious in the way the London School phoneticians treat (rise-)fall-rises in British English. He argues that a rise-fall-rise pattern is compressed onto a monosyllabic utterance but is not spread out across the syllables that follow a stressed syllable in a longer utterance. In that case, the fall occurs on the nuclear syllable, the rise at the end of the utterance, with syllables on low pitch in between. To illustrate this he gives the example:
i. A: I hear Sue's taking a course to become a driving instructor.
(a) B: Sue!? [L*+H L-H%]
[impressionistic pitch curve on ‘Sue’]
    A [L*+H] driving instructor [L-H%]!?
[impressionistic pitch curve on ‘A driving instructor’]
The low tone of the combined pitch accent L*+H is associated with the stressed syllable of ‘driving’, the trailing high tone with this stressed and the following unstressed syllable. The low tone of the phrase accent L- is associated with the second syllable of ‘driving’ and the first two syllables of ‘instructor’, creating a long low stretch, and the high tone of the boundary tone H% is linked to the final syllable. This shows, according to Ladd, that the edge tones L-H% must be separated from the pitch accent in both cases, although they form an observable complex pitch contour on the monosyllable. The analysis with AM categories and ToBI symbols leaves out an important aspect of the actual realisation, which can be derived from this phonological representation in combination with the impressionistic pitch curve that Ladd provides. The final-syllable pitch rise after a stretch of low pitch gives the stressed syllable of ‘instructor’ extra prominence, partially accenting the word. The pitch pattern is thus turned into a rise-fall on main-accent ‘driving’, followed by a rise on partially accented ‘instructor’.
i. (b) B: A driving instructor
This is no longer the same pattern as the rise-fall-rise on the monosyllabic utterance, and would not convey the same intended meaning. Therefore, Ladd's line of argument is no proof of a need to separate edge tones from pitch accents in intonational phonology.
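To make the autosegmental association relations in Ladd's analysis concrete, the links between tones and syllables can be spelt out in a small illustrative sketch (Python is used here for exposition only; the syllabification and index choices simply restate the prose above and are not part of the original discussion):

```python
# Illustrative sketch only (not from the source): the tone-syllable
# associations described above for "A driving instructor" with L*+H L-H%.
SYLLABLES = ["a", "dri", "ving", "in", "struc", "tor"]  # "dri" is the stressed syllable

# Tones are linked to syllable indices, restating the prose: L* on the
# stressed syllable, the trailing H on the stressed and the following
# syllable, the phrase accent L- over the long low stretch, and the
# boundary tone H% on the final syllable.
ASSOCIATIONS = {
    "L*": [1],
    "+H": [1, 2],
    "L-": [2, 3, 4],
    "H%": [5],
}

def render(syllables, associations):
    """Print each syllable with the tones associated with it."""
    for i, syl in enumerate(syllables):
        tones = [t for t, idxs in associations.items() if i in idxs]
        print(f"{syl:6s}{' '.join(tones)}")

render(SYLLABLES, ASSOCIATIONS)
```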
The structurally adequate systematisation of rise-fall-rise intonations in English becomes a problem in Ladd's analysis, rather than in that of the English phoneticians, because, in the wake of AM Phonology, Ladd does not distinguish between unitary fall-rise and sequential fall+rise intonation patterns, which were separated as meaningful contrasts by the London School, especially by Sharp (1958). Prosodically the two patterns differ in the pitch end points of the fall and of the following rise, being lower for both in the sequence F+R than for the unitary FR, and they also differ in rhythmic prominence on the rise of F+R, as against FR, resulting in a partial accent on the word containing the rise. If the partial accent is put on a function word it naturally has a strong form, whereas in FR a weaker form occurs. This is an additional manifestation of greater prominence in the rise of F+R. Sharp provides an extensive list of examples for both patterns, predominantly in statements and requests, a few in information and polarity questions, and some miscellaneous cases. He is less sure about the occurrence of FR in questions, but maintains, against Lee (1956, p. 70) and Palmer (1924, p. 82), who mention its absence from this sentence mode, that it does occur, but less frequently than in the other modes. It seems to be perfectly clear, however, ‘that in both “yes-no” questions and “special” questions at least one focus for the patterns is quite common: the first word [of the question]. FR, in these circumstances, asks for confirmation or repetition, F+R pleads for an answer (or for action)’ (Sharp 1958, p. 143). Sharp does not give any examples ‘for these circumstances’, but from the general functional description he has given for FR and F+R in questions, the following typical instances may be constructed:
ii. (a) [FR] What did you say?
‘I did not catch that, please repeat.’
The fall before the rise adds insistence to the request for repetition, which is absent in a simple rise starting on ‘what’.
(b) [F] What did you [R] say?
‘Give me the content of what you said (when he asked you).’
(c) But a full accent on the rise is more likely:
‘Tell me what you said (when he asked you).’
The fall before the rise in (b) and (c) adds insistence to the request for information, which is absent when the rise on ‘say’ is preceded by a high, instead of a falling, prenucleus.
iii. (a) [FR] Are you going to tell him?
‘He needs to be told, please confirm.’
The fall before the rise adds insistence to the request for confirmation, which is absent in a simple rise starting on ‘are’.
(b) [F] Are you going to [R] tell him?
‘Inform me whether you will tell him.’
(c) But a full accent on the rise is more likely.
The fall before the rise in (b) and (c) adds insistence to the request for information, which is absent when the rise on ‘tell’ is preceded by a high, instead of a falling, prenucleus.
In examples (ii.a) and (iii.a), the peak of FR has medial alignment with the accented syllable, AM H*, PROLAB &2^. In (ii.b) and (iii.b), a partial accent is possible for the rise of F+R on ‘say’ or ‘tell him’, but the full accent in (c) conveys the given meaning more clearly. The increased prominence that signals it is produced by the F0 onset of the rise in the accented syllable being critically below the end point of the preceding fall. This difference between a partially and a fully accented rise in F+R cannot be represented in the London School framework because accent is not a separate category from intonation and rhythmic structure. The examples in (ii.) and (iii.) have been constructed on the basis of Sharp's description. There are one or two examples in his list of the FR and F+R distinction in initial focus position of questions, but they are different from the ones in (ii.) and (iii.); they represent his standard patterns of medial-to-late FR alignment and F+R accentuation.
iv. (a) [FR] What's his name?
‘I have forgotten.’ ‘I am incredulous.’
(b) [F] What shall I [R] tell him?
‘I really cannot think of anything.’
Accent is possible as well when ‘tell’ is given a second major information point.
v. [F] Are you [R] coming?
‘Do tell me whether you are coming.’ ‘Must I wait here for ever?’ (Despair)
Sharp did not distinguish clearly between two different alignments of FR. Except for the cases illustrated in (ii.) and (iii.), his examples refer to medial-to-late alignment of FR with the accented syllable. His FR data also appear to be all of the non-intensified type of (rise-)fall-rise, and therefore do not correspond to the AM category L*+HL-H% in Ladd's emphatic example, but to (L+)H*L-H% (PROLAB: &2^…&., versus &2^-(…&.,). The general meanings of F+R and FR may be given as ‘associative’ versus ‘dissociative’ reference to alternatives in preceding speech actions. Here are two sets of examples:
vi. A: Look, there's Peter.
B: I've seen him.
(a) [aɪv FR siːn ɪm] ‘I saw him before you even pointed him out.’
[aɪv & (siːn ɪm ., &PG]
(b) [aɪv F siːn R hɪm] ‘I have spotted the person you are pointing to.’
[aɪv & si˸n hɪm , &PG]
(c) [aɪv FR siːn R hɪm] ‘I saw the person you are pointing to without you mentioning it.’
[aɪv (siːn hɪm , &PG]
In FR of (a) ‘him’ has its weak form, in F+R of (b) its strong form. (c) shows that an FR on ‘seen’ may be followed by a simple rise on ‘him’ [hɪm] (again in its strong form, as in (b)), giving it more prominence, and partially accenting and foregrounding it. This rules out an association of the rise of the (rise-)fall-rise with an edge tone and is therefore outside the scope of AM Phonology.
There are further possibilities:
vi. (d) [aɪv F siːn ɪm] ‘reporting the fact that I have seen him’
(e) [aɪv F siːn hɪm]
with partial accent on ‘him’, like (d) but foregrounding ‘him’.
(d) and (e) differ from (a) and (c) by only reporting speaker-oriented facts, whereas the latter involve the dialogue partner.
vii. A: You chaired the appointment committee for the chair of phonetics. The committee decided to take the applicant from down-under. Was it a good choice?
B: I [F] thought [R] so. ‘That was my opinion and it still is.’
B: I [FR] thought so. ‘That was my opinion at the time, but I have changed my mind.’
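Sharp's prosodic cues for the FR versus F+R contrast, lower end points of fall and rise in F+R, plus extra, partial-accent prominence on its rise with a strong form of a function word, can be gathered into a purely illustrative decision sketch; the Hz thresholds and the cue weighting are invented for the illustration and make no empirical claim:

```python
# Speculative sketch (not from the source): Sharp's FR vs F+R cues as a
# toy decision rule. All thresholds and the cue weighting are invented;
# the text only states that fall and rise end points are lower in F+R and
# that the rise of F+R carries extra, partial-accent prominence.
from dataclasses import dataclass

@dataclass
class FallRiseToken:
    fall_end_hz: float         # pitch at the bottom of the fall
    rise_end_hz: float         # pitch at the end of the rise
    rise_prominence: float     # relative prominence of the rise (0..1)
    strong_form_on_rise: bool  # e.g. [hɪm] rather than weak-form [ɪm]

def classify(tok: FallRiseToken,
             floor_hz: float = 120.0,   # hypothetical speaker-specific floor
             prominent: float = 0.5) -> str:
    """Label a fall-plus-rise stretch as unitary FR or sequential F+R."""
    cues_for_f_plus_r = sum([
        tok.fall_end_hz <= floor_hz,        # fall reaches a lower end point
        tok.rise_end_hz <= floor_hz * 1.3,  # rise also ends lower
        tok.rise_prominence >= prominent,   # partial accent on the rise
        tok.strong_form_on_rise,            # strong form signals that accent
    ])
    return "F+R" if cues_for_f_plus_r >= 3 else "FR"

print(classify(FallRiseToken(110.0, 150.0, 0.7, True)))   # -> F+R
print(classify(FallRiseToken(140.0, 190.0, 0.2, False)))  # -> FR
```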
These data, analysed with observational as well as descriptive adequacy in the London School of Phonetics, cannot be handled in the AM Phonology framework, precisely because it links the rise to edge tones. Intermediate phrase boundaries cannot be introduced to solve the problem because there are no phonetic grounds for them. This had already been pointed out with reference to German data in Kohler (2006b, pp. 127ff), cf. 2.7. In addition to pitch accent L*+H, followed by the edge tones L-H%, Ladd (1996, p. 122) discusses some examples in British English for which he postulates pitch accent H*:
viii. (a1) Could I [H*] have the [H*] bill please [L-H%]?
(b1) Is your [H*] mother there [L-H%]?
They sound ‘condescending or peremptory’ to speakers of North American English, where a high-rising nucleus + edge tones, H*H-H%, would be used instead:
viii. (a2) Could I [H*] have the [H*] bill please [H-H%]?
(b2) Is your [H*] mother there [H-H%]?
The reference to Halliday's broken Tone 2 in viii. (p. 291 n.6) makes it clear that Ladd is referring to a fall (not a rise-fall) on the accent of ‘bill’ or ‘mother’, followed by a rise on unaccented ‘please’ or ‘there’ in word-order questions. The pattern is a unitary fall-rise, making an associative reference to preceding actions of the type ‘I've been served, I've eaten, I want to pay now’ in (a), and ‘I would like to speak to your mother. Is she in?’ in (b). In both cases the rise establishes contact with the person spoken to; a simple fall would lack this and sound abrupt.
These examples could, of course, also be spoken with a unitary rise-fall-rise, and would then make dissociative references: (a) ‘Waiter, I've been trying to catch your attention but you are constantly dealing with other customers, I am in a hurry’; (b) ‘Sorry, it's not you I have come to see, but your mother.’ And in (a), ‘please’ may get extra prominence, giving it a secondary accent, in a separate rise after a fall or a fall-rise, creating F+R or FR+R and adding insistence to the request.
viii. (a3) Could I have the bill please
(a4) Could I have the bill please
(a5) Could I have the ^bill please
(b3) Is your mother there
(The distinguishing pitch contours of (a3)–(b3) are shown graphically in the original.)
Parallel to the British English example (viii.b1) ‘Is your mother there?’, Ladd (1996, p. 122) discusses the German equivalent in the AM Phonology framework:
ix. (a1) Ist deine [H*] Mutter da [L-H%]?
probably based on an exponency classifiable as
Ist deine [FR] Mutter da?
But there are other possible realisations.
(b1) Ist deine [F] Mutter [R] da?
partially foregrounding ‘da’ (‘being there’) as a minor information point beside the main information point ‘deine Mutter’ (‘your mother’)
(a2) Ist deine Mutter da
(b2) Ist deine Mutter da
(The pitch contours of (a2) and (b2) are shown graphically in the original.)
The functional interpretations of these patterns are the same as in the English equivalents.
1.4.2 Halliday's Intonational Phonology
Halliday followed the tradition of the London School of Phonetics, but he incorporated the phonetic analysis of intonation in a phonological framework within his categories of a theory of grammar (Halliday 1961). In two complementary papers (1963a, b), which were republished in adapted and more widely distributed book form in 1967, he described intonation as a complex of three phonological systemic variables, tonality, tonicity and tone, interrelated with a fourth variable, rhythm. Tonality refers to the division of speech events into melodic units, tone groups. The tone group enters into a hierarchy of four phonological units together with, in descending order, the rhythmic foot, the syllable and the phoneme, each element of a higher-order unit consisting of one or more elements of the unit immediately below, without residue. The rhythmic feet in a tone group form a syntagmatic structure of an obligatory tonic preceded by an optional pretonic, each consisting of one or more feet. This structure is determined by the tonicity variable, which marks one foot in the foot sequence of a tone group as the tonic foot, by selecting one of a system of five tonal contrasts, the tones 1 fall, 2 high rise, 3 low rise, 4 fall-rise, 5 rise-fall. Feet following the tonic foot in the tonic of a tone group generally follow the pitch course set by the tone of the tonic. Besides these single tonics there are the double tonics 13 and 53, uniting tone 1 or 5 with tone 3 in two successive tonic feet of the tonic section of one tone group. They form major and minor information points and correspond to F+R versus FR in tone 4.
Tied to the tone selection at the tonic there are further tone selections at the pretonic. At both elements of tone group structure, a principle of delicacy determines finer specifications, such as different extensions of the fall in tone 1 (1+ high, 1 mid, 1- low), different high-rising patterns for tone 2 (2 simple rise, 2 rise preceded by high fall: broken tone 2), and different extensions of the fall in tone 4 (4 mid fall-rise, 4 low fall-rise). Each rhythmic foot has a syntagmatic structure of an obligatory ictus, followed by an optional remiss; the former is filled by a strong syllable, the latter by one or more weak syllables. Halliday follows Abercrombie (1964) in assuming stress-timed isochronicity for English, and that the ictus may be silent (‘silent stress’) ‘if the foot follows a pause or has initial position in the tone group’ (Halliday 1963a, p. 6).
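As an illustration of the taxonomy just outlined, the following sketch (not from the source) encodes the rank-scale composition of a tone group from feet and syllables, with tonicity and tone selection, using Halliday's ‘don't stay out too long’ example discussed below; the phonemic spellings are merely indicative:

```python
# Illustrative sketch (not from the source) of Halliday's rank scale:
# each unit consists exhaustively of units of the rank below ("without
# residue"); a tone group selects one tonic foot (tonicity) and a tone;
# each foot has an obligatory ictus (possibly silent) plus optional remiss.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Syllable:
    phonemes: List[str]
    strong: bool = False

@dataclass
class Foot:
    ictus: Syllable                      # obligatory; may be a 'silent stress'
    remiss: List[Syllable] = field(default_factory=list)

@dataclass
class ToneGroup:
    feet: List[Foot]
    tonic_index: int                     # which foot is the tonic foot
    tone: int                            # 1 fall, 2 high rise, 3 low rise, 4 fall-rise, 5 rise-fall

    @property
    def pretonic(self) -> List[Foot]:
        return self.feet[:self.tonic_index]

    @property
    def tonic(self) -> List[Foot]:
        return self.feet[self.tonic_index:]

# // 1 don't stay / out too */ long //  (tone 1, tonic on 'long')
tg = ToneGroup(
    feet=[
        Foot(Syllable(["d", "əʊ", "n", "t"], strong=True), [Syllable(["s", "t", "eɪ"])]),
        Foot(Syllable(["aʊ", "t"], strong=True), [Syllable(["t", "uː"])]),
        Foot(Syllable(["l", "ɒ", "ŋ"], strong=True)),
    ],
    tonic_index=2,
    tone=1,
)
print(len(tg.pretonic), "pretonic feet; tone", tg.tone)  # -> 2 pretonic feet; tone 1
```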
Halliday integrates his intonational phonology into the grammar of spoken English, where the intonational systems operate side by side with non-intonational ones in morphology and syntax, at many different places in the coding of meaningful grammatical contrasts. In the 1963a paper, he looked from phonological contrasts to distinctive grammatical sets, asking ‘What are the resources of intonation that expound grammatical meaning?’, whereas in the 1963b paper, he looked at the phonological contrasts from the grammatical end, asking ‘What are the grammatical systems that are expounded by intonation?’ With this approach, Halliday took a step towards a functional view of phonological and grammatical form, which he has been concerned with ever since in the development of a coherent framework of Systemic Functional Linguistics (SFL).
Pheby (1975) and Kohler (1977, 1st edn) applied Halliday's framework to German. They were an advance on von Essen (1964), who delimited three basic pitch patterns with reference to vaguely defined functional terms – terminal, continuative, interrogative intonation – and was then forced to state that yes-no questions have rising intonation, question-word questions and statements terminal intonation, and syntactically unfinished sentences continuation rises. This analysis, quite apart from being superficial and incomplete, mixed up the formal and functional levels of intonation right from the start, which the British colleagues and Kohler (1977, 1995, 2004, 2013b) did not; they knew, and said so, that both question forms can have either terminal or rising pitch with finer shades of meaning.
The more recent publication by Halliday and Greaves (2008) expounds the Hallidayan intonation framework in greater detail and reflects its integration with grammar in the very title. Whereas the earlier publications described the intonation of Standard Southern British English (RP), the later one includes Australian and Canadian English, thus taking ‘English’ in a more global sense, and it illustrates the descriptions with Praat graphics in the text and with sound files of isolated but grammatically contextualised utterances, as well as of dialogues, on an accompanying CD-ROM. Meaning as carried by intonation is now related to three of Halliday's four metafunctions: the interpersonal, the textual and the logical. The systems of tonality and tonicity are linked to textual meanings, the systems of tone to interpersonal meanings. The phonological rank scale is paralleled by a grammatical rank scale of sentence, clause, group/phrase, word, morpheme, linking to experiential, interpersonal and textual meanings. Setting up separate systems for intonation and grammatical structure is a good principle because it avoids the conflation of falling or rising tonal movement with declarative and two types of interrogative structure, as has been quite common. But cutting across this grammatical rank scale is the information unit, which is not independently defined, and seems to be in a circular-argument relationship with the phonological unit of the tone group, since by default one tone group is mapped onto one information unit: ‘Thus the two units, the phonological “tone unit” and the grammatical “information unit” correspond one to one; but since they are located on different strata, their boundaries do not correspond exactly. In fact, both are fuzzy: the boundaries are not clearly defined in either case’ (Halliday and Greaves 2008, p. 99). This means that adding yet another unit to the extremely complex taxonomic intonation-grammar system does not seem to serve a useful purpose, and Crystal (1969b) had already criticised the concept in his review of Halliday (1967).
Another weak point of Halliday's intonational phonology concerns the division of the stream of sound into tone groups and of these into rhythmic feet. Although Halliday and Greaves gave up the doubtful isochrony principle and no longer quote Abercrombie (1964), rhythmic regularity is still the building principle of the tone group: ‘When you listen carefully to continuously flowing English speech, you find there is a tendency for salient syllables to occur at fairly regular intervals, and this affects the syllables in between: the more of them there are, the more they will be squashed together to maintain the tempo’ (Halliday and Greaves 2008, p. 55). This can be a useful heuristic when dealing with isolated sentences in foreign language teaching, even more so for learners whose native languages have totally different rhythmic structures from English, such as French. Teaching English as a Foreign Language was a prominent field of application of a large part of intonation analysis in the London School of Phonetics. Halliday, likewise, worked out his system of intonational phonology for the Edinburgh Course in Spoken English (Halliday 1961) by R. Mackin, M. A. K. Halliday, K. Albrow and J. McH. Sinclair, later published by Oxford University Press (see Halliday 1970). The Intonation Exercises of this course were reproduced as teaching materials at the Edinburgh Phonetics Department Summer Vacation Course on the Phonetics of English for foreign students. In 1965 and 1966, I was asked to give these intonation tutorials.
But the rhythmic foot analysis of the tone group does not really provide a good basis for analysing continuous speech. Moreover, Halliday's intonational phonology lacks the category of a phrase boundary. Such a prosodic phrase marker encapsulates a bundle of pitch, duration, energy and phonation features to signal a break, which may, but need not, coincide with grammatical boundaries and with the boundaries Halliday sets up for his tone groups. In sequences of rhythmic feet, Halliday earmarks those that contain one of his five tones, the tonic feet, constituting the tonics of tone groups. Since by arbitrary definition any one tone group can only have one tonic (except for the major+minor tonic compounds 13 and 53), there must be a tone group boundary between two succeeding tonics. Where this boundary is put is again arbitrary in view of the fuzziness Halliday and Greaves refer to in the quotation above, i.e. due to the lack of a phonetic criterion that determines a phrase boundary. This was again pointed out by Crystal (1969b). In many cases, Halliday no doubt takes the grammatical structure into account when deciding on the positions of tone group boundaries. But this is against his principle of setting up separate phonological and grammatical rank scales and relating them afterwards, and the violation of this principle borders on circularity.
And, finally, giving tone groups a rhythmic foot structure conflates rhythmic grouping into ictus and remiss with meaning-related phrasal accentuation. Halliday's framework does not provide a separate accent category outside the tonic, and in the latter it is the pitch-related tone category that determines the tonic foot and the tonic syllable, and thus constitutes a phrasal accent. The syllable string preceding the tonic may contain meaning-related phrasal accents, but not all ictus syllables of a postulated rhythmic foot structure are accented. A tonic foot may be preceded by a multisyllable prehead, which contains no accent, but may be perceived as a sequence of strong and weak syllables due to timing and vowel quality, for example before a tonic containing tone 3 in:
// 3 don't stay / out too */ long // (Halliday and Greaves 2008, p. 119; see Figure 1.2a)
In Hallidayan notation ‘don't’ and ‘out’ are treated as ictus syllables in two rhythmic feet of the pretonic and a tone-3 tonic. But, when listening to the .wav file (supplied on the CD-ROM), no accent can be detected in the pretonic syllable sequence, and the perception of rhythmic structure fluctuates between the one noted and /don't stay out too/. The vocalic elements in all four syllables have durations between 120 and 130 ms. Duration would be considerably longer in an accented syllable containing a diphthongal element.
Figure 1.2 Spectrograms and F0 traces (log scale) of (a) // 3 don't stay / out too */ long // – audio file 5_2_2_4a3.wav, and (b) // 1 don't stay / out too */ long // – audio file 5_2_2_4a4.wav, from Halliday and Greaves (2008, p. 119). Standard Southern British English, male speaker.
What the (male) speaker realises here is a high prehead before the (only) sentence accent, in a high register at a pitch level around 180 Hz, which at the same time increases the pitch range down to the following low rise. The speaker could, of course, have used a high prehead without going into a high register and thus without increasing the pitch range. In the high prehead, F0 fluctuation is largely conditioned by vowel-intrinsic and consonant-vowel coarticulatory microprosody: only the initial ‘don't’ has a more extensive rise, which, just like vowel duration, is not large enough to signal a phrasal accent. The listener may then structure the prehead rhythmically in variable ways. Halliday differs from the London School of Phonetics, e.g. O'Connor and Arnold (1961), by not having the category of prehead. The composition of tone groups by rhythmic feet with an obligatory ictus syllable that may be silent precludes it.
How serious this omission is in a systemic functional approach to intonation is shown by the example:
// 1 don't stay / out too */ long // (Halliday and Greaves 2008, p. 119; see Figure 1.2b)
The notation given for this tone group differs from the previous one only by having tone 1 instead of tone 3. But listening to the .wav file reveals two differences: (1) ‘don't’ is accented because its prominence is greater, due to longer duration of its sonorous part, and to more extensive F0 movement, well above the pitch level of the following syllables, so the pretonic sequence is not a prehead; (2) the pretonic sequence is at a much lower pitch level of 150 Hz – even the peak in ‘don't’ only reaches 170 Hz. The accent on ‘don't’, combined with the lower pitch level preceding the final fall, intensifies the meaning of a command, whereas the unaccented, but high prehead preceding the final low rise intensifies the meaning of a request, and the high register adds a note of entreaty. These are important aspects of the transmitted meanings, which are not reflected by different tonal categorisation in Halliday's notation: the two pretonics are identical because they are given the same rhythmic structure. But this rhythmic structure is an additional overlay on accentuation, register and range, and may surface perceptually in variable ways in both utterances. In PROLAB, the two utterances are differentiated as:
&HP &HR don't stay out too &2[ long &, &PG
&2^ don't stay out too &0. &2^ long &2. &PG
The additional rhythmic structure is captured at the level of segmental spectrum and timing.
The following postulates of Halliday's intonational phonology can be taken as essential for any prosodic framework:
English intonation is based on a system of contour-defined contrastive tones.
Parallel to the phonological tone system there are lexicogrammatical systems.
Phonological form is part of the grammar as another exponent of meaning in language functions.
But to be applicable to the analysis and description of prosodic systems in connected speech, more particularly spontaneous speech, and in text-to-speech synthesis, several weak points of Halliday's systemic functional approach need adjusting.
The nesting rank scale of phonological units, as well as the immediate-constituents division of tone groups into tonic and pretonic, does not provide an adequate representation of prosodic structures – especially, the composition of the unit of the tone group by elements of the unit of the rhythmic foot cannot cope with the dynamic flow of speech and with rhythmic disturbances such as hesitations, false starts and repetitions. Instead we need an accent category with several levels, based on degrees of prominence, to which tones are linked. In between successive accents, pitch is organised into distinctive concatenation patterns.
Speech is organised into prosodic phrases, so prosodic phrase boundaries need to be determined by bundles of phonetic features.
In prosodic phrases, the first accent may be preceded by unaccented preheads, and they form a system of mean, low and high pitch.
Register needs to be introduced to set the pitch level of prosodic phrases, or of the part up to the final accent-linked pitch turn (thus also determining pitch range), or of sequences of prosodic phrases.
When these weaknesses of Halliday's intonational phonology became relevant in the Kiel TTS development (Kohler 1991a, b) and in spontaneous speech annotation for the Verbmobil project (Kohler, Pätzold and Simpson 1995), the description of German intonation, given in Hallidayan terms in the first edition of Kohler (1977), was put on a new basis developed for the tasks: the Kiel Intonation Model. It was presented in Kohler (1991a, b), then in the second edition of Kohler (1995) and in Kohler (1997a, b), and will be set out in Chapter 2. Subsequent chapters will take Halliday's form and function perspective one step further. Whereas Halliday looked from phonology to grammar and from grammar to phonology in the early papers, and later related phonological form in grammar to metalinguistic functions, I shall reverse the relationship, set up a few basic communicative functions within Bühler's model and then investigate language-specific prosodic, syntactic and lexical carriers for them.
1.4.3 Pike's Level Analysis
Pike laid the foundation for the analysis of American English intonation on a different descriptive basis, auditorily referring significant points of pitch contours – starting and ending points, and points of direction changes, in relation to stressed syllables – to four pitch levels, 1–4 from highest to lowest. Not every unstressed syllable gets a significant pitch point but may have its pitch interpolated between neighbouring pitch points. On the other hand, a syllable may get more than one significant point to represent the stress-related pitch contour, or even more than two, when a contour changes direction and is compressed into a single stressed syllable. Pike gives a detailed formal account of the resulting pitch-level contours of American English and relates them to syntactic structures. He points out that the contours found in statements can also occur in questions and vice versa, and he provides a wealth of ad hoc references to attitudinal and expressive shades of meaning added to utterances by pitch contours. His analysis thus parallels the one by O'Connor and Arnold with a different paradigm for a different variety of English.
1.4.4 Intonation in AM Phonology and ToBI
As Halliday provided a phonological framework within structuralist grammar for the intonation analysis of the London School of Phonetics , Pierrehumbert put the Pikean level analysis of intonation into a framework of Autosegmental Metrical (AM) Phonology. The distinctive pitch levels were reduced to two, H and L, which, on their own and in the sequence H+L and L+H, form systems of pitch accents, phrase accents and boundary tones. In pitch accents, H and L are associated with stressed syllables indicated by *, but they may have leading or trailing H or L, yielding H*, L*, H+L*, H*+L, L+H*, L*+H. The separation of H* and L+H* was a problematic alignment category in AM Phonology and ToBI because a dip between two H* accents requires an L tone attached to an H* tone, given the principle of linear phonetic interpolation between pitch accents.
Falling, rising or (rising-)falling-rising nuclear pitch contours (of the London School), which in the extreme case are compressed into a one-syllable utterance, such as ‘yes’, are decomposed into three elements: a pitch accent, followed by a phrase accent and, then, by a boundary tone, in each case with selection of H or L. All three must always be represented, e.g. H*L-L%, H*H-H%, L*H-H%, H*L-H%, L+H*L-H%, L*+HL-H%. Since falling-rising contours are defined by three pitch points, three types of syntagmatic element are needed to represent them. AM Phonology selects them from the three accent and boundary categories and extrapolates them to all contours, including monotonic falls and rises. These phonological elements are associated with syllables and phrase boundaries, linked to F0 traces and aligned with segmental syllable structure in spectrograms. The confounding of pitch accents with edge tones has already been reviewed in the discussion of AM solutions for FR and F+R patterns of the London School in 1.4.1.
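The phonetic realisation assumed by this framework, point targets for the H and L tones with linear F0 interpolation in between, can be illustrated by a minimal sketch (not from the source); the target times and Hz values are invented:

```python
# Illustrative sketch (not from the source) of the AM/ToBI realisation
# principle: tones are point targets on a time axis, and F0 between
# targets is obtained by linear interpolation. Times and Hz values are
# invented for the illustration.
def interpolate(targets, times):
    """Linearly interpolate F0 (Hz) between (time_s, hz) target points."""
    out = []
    for t in times:
        for (t0, f0), (t1, f1) in zip(targets, targets[1:]):
            if t0 <= t <= t1:                      # bracketing targets found
                out.append(f0 + (f1 - f0) * (t - t0) / (t1 - t0))
                break
    return out

# H* L-L% on a short phrase: a high accentual target, then low edge tones.
targets = [(0.10, 220.0),   # H* aligned with the stressed vowel
           (0.40, 120.0),   # L- phrase accent
           (0.55, 110.0)]   # L% boundary tone at the phrase edge
print([round(f) for f in interpolate(targets, [0.10, 0.20, 0.30, 0.40, 0.55])])
# -> [220, 187, 153, 120, 110]
```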
AM Phonology is a highly sophisticated formal framework, which, beyond the basic premisses sketched here, has been undergoing continual change over the years, and right from the outset the focus has been on form, not on function and meaning. When the AM phonological framework became the basis for a transcription system, ToBI, the original strict language-dependent systemic approach began to get lost, phonetic measurement was squeezed into the preset categories, which were transposed to other languages, and the transcription tool was elevated to the status of a model.
Questions of meaning of the formal intonation structures have been raised, but post festum, for example by Pierrehumbert and Hirschberg (1990), who propose a compositional theory of intonational meaning related to pitch accents, phrase accents and boundary tones. Another, very influential example of linking intonational form to meaning is Ward and Hirschberg (1985), where the rise-fall-rise contour, based on the AM representation L*+HL-H%, is analysed as a context-independent contribution to conveying speaker uncertainty. It appears, however, that most of the examples discussed by Ward and Hirschberg are not instances of L*+HL-H%, but of L+H*L-H%, which they explicitly exclude as the phonological representation of their rise-fall-rise. With the L*+HL-H% pattern, a speaker is said to relate an utterance element to a scale of alternative values and to indicate not being certain whether the hearer can accept the allocation as valid. For example, in:
B: I'm so excited. My girlfriend is coming to visit tonight.
A: From far afield?
a. B: From suburban Phila\del/phia.
b. B: *From next \door/. (p. 766)
‘[T]he speaker, a West Philadelphia resident, conveys uncertainty about whether, on a distance scale, suburban Philadelphia is far away from the speaker's location. … b. is distinctly odd, given the implausibility of B's uncertainty whether next door is far away’ (p. 766).
The authors provide an analysis in terms of logical semantics at the Representation level, which considerably narrows the field of speech communication, and may thus make it difficult to capture the full range of the communicative function of the fall-rise pattern in English. If, in the above example, B were to give a facetious answer, with a smile on his face, b. would not be odd at all, but would be understood as an ironic reply to A's enquiry about distance. It would still be an instance of what Sharp (1958) called the dissociative reference to alternatives in his fall-rise FR. The semantic-prosodic distinction between this pattern and Sharp's F+R is nicely illustrated by the two versions of the sentence ‘I thought so’ discussed in 1.4.1. The speaker expresses association with, or dissociation from, the earlier belief, using either F+R or FR, and is certain about that in both cases. With FR, the speaker is, on the one hand, definite about having changed his mind, by using a peak pattern, but on the other hand, plays it down in social interaction conforming to a behavioural code, by adding a rise to alleviate the categoricalness in an appeal to the listener to accept the change of mind. If the speaker makes a statement about the present opinion without associative or dissociative reference to the past, it may be ‘I [F/R] think so’, with either a fall or a low rise for a definite or a non-committal response.
Whereas all the Ward and Hirschberg examples of American English have their fall-rise equivalents in Standard Southern British English, this may not hold for transposing Sharp's British English examples to American English. If the pattern distinctions do apply to both varieties, the conflation of pitch accents with edge tones and the lack of an accent category, separate from pitch, preclude the distinctive representations of the semantic-prosodic subtleties related to fall-rise pitch patterns. This may be illustrated by the following contextualisations:
To provide sufficient seating at a family get-together, father A says to his two boys B and C:
A: We need more chairs in the sitting-room. Go and get two from the kitchen and a couple more from the dining-room.
B: [Goes to the kitchen, comes back with two chairs, says to A]
(a) There's [FR] another one in the kitchen.
(b) There's [F+R] another one in the kitchen.
(c) There's [FR] another one in the [R] kitchen.
C: [Goes to the dining-room, gets two chairs, comes back via the kitchen, says to A]
(d) There's [F] another one in the [R] kitchen.
(e) There's [FR] another one in the [FR] kitchen.
In (a), B uses a rise-fall-rise that falls sharply to a low level on ‘another’, and then immediately rises again to mid-level at the end of ‘kitchen’, which is unaccented because it is integrated in a monotonic rise from ‘one’ onwards. This is Sharp's unitary FR, Halliday's tone 4, and L+H*L-H% in AM Phonology. B transmits the meaning ‘There's an additional chair in the kitchen, besides the two I have just brought from there, although Dad thought there were only two’, a dissociative reference to alternatives.
In (b), the rise after the low-level fall on ‘another’ is delayed until ‘kitchen’, which is partially foregrounded with a partial accent. This is Sharp's compound F+R, and Halliday's double-tonic with tone 13. However, the pattern cannot be represented in AM Phonology because the categorisation L+H* L*L-H% for a fall followed by a rise, with two pitch accents and final edge tones, allocates two full accents to the phrase, and therefore does not distinguish (b) from (d). The F+R pattern makes an associative reference to alternatives; it does not have the contrastive reference to A's mention of ‘two chairs from the kitchen’.
In (c), B makes the same dissociative reference to alternatives as in (a) but partially foregrounds ‘kitchen’, giving it a partial accent by breaking the rising contour of the fall-rise and by starting another rise from a lower level within the same intonation phrase. In Sharp's analysis, ‘another’ would receive a fall-rise FR, ‘kitchen’ a simple rise. Similarly, Halliday would have tone 4 followed by tone 3 in two tone groups. AM Phonology cannot represent this pattern because an intermediate intonation phrase would have to be postulated even in the absence of any phonetic boundary marker. If the pitch break were to be taken as the indication of such a phrase boundary, from which the presence of edge tones would in turn be deduced, the argument becomes circular. In all three descriptive frames, the different accent level of ‘kitchen’ versus that of ‘another’ would not be marked, and therefore the different meaning from (d) and (e) could not be captured.
Since C has brought chairs from the dining-room he refers contrastively to an additional chair in the kitchen, and gives ‘kitchen’ a full accent. In (d), Sharp's F+R is separated into F and R linked to the two accents, with associative reference to alternatives. Halliday would have to have two tone groups //1 There's aNOther one //3 in the KITCHen.// This analysis is independent of the presence or absence of phonetic boundary markers. In AM Phonology, the pattern may be represented by two pitch accents in one intonation phrase, L+H* L*L-H%, because the L of the second pitch accent provides the right-hand pitch point for linear interpolation of the fall from the H of the first pitch accent. In (e), there are dissociative references to an alternative number of chairs and to an alternative locality, by two rise-fall-rises linked to the two accents. As in (c), there may again be a single prosodic phrase. Sharp's analysis would simply have FR in both positions; Halliday would again need two tone groups, each with tone 4. In AM Phonology, two intonation phrases with L+H*L-H% would be necessary to generate the four-point rise-fall-rise contours, each with two intonation-phrase edge tones in addition to two pitch-accent tones, irrespective of the potential absence of phonetic boundary markers between them.
Thus, the AM phonological representations in (d) and (e) of C differ in the relative allocation of prosodic information to the theoretical categories of paradigmatic pitch accent and syntagmatic intonation phrasing. This different allocation is conditioned by constraints in the canonical AM definitions of prosodic categories:
Pitch accents are defined by up to two sequential H or L tones.
In a sequence of pitch accents, the pitch contour between abutting tones is the result of linear phonetic interpolation between the phonological pitch-accent tones. Therefore, for example, in two successive peak patterns, a distinctive pitch dip between two H* necessitates postulating a bitonal pitch accent, either a trailing L tone in the first, or a leading L tone in the second.
The pitch contour between the last pitch-accent tone and the end of the intonation phrase is represented by two sequential H or L edge tones, a phrase accent and a boundary tone.
A rise-fall-rise intonation contour around an accented syllable, with four distinctive pitch points, must be represented by a bitonal pitch accent followed by two edge tones.
If a rise-fall-rise contour occurs utterance-internally, it must be followed by an intonation phrase boundary.
If there are no phonetic boundary markers indicating such a boundary, such as segmental lengthening, with or without a following pause, there are no pitch-independent reasons for postulating such a boundary, or the argumentation becomes circular by using pitch as the defining feature for the postulated boundary, which in turn determines the edge tones before it.
These constraints on the phonological representation of intonation contours in AM Phonology reduce descriptive and explanatory adequacy in prosodic data interpretation, compared with the accounts provided by the London School and Halliday.
1.4.4.1 Alignment of Rise-Fall-Rises in English
AM Phonology conceptualises English L*+HL-H% and L+H*L-H% rise-fall-rise patterns as different alignments of the L and H tones of the rise-fall pitch accent with the stressed syllable: either L or H is aligned with it, H trailing L* and L leading H*, producing later (delayed) or earlier association of the pitch accent with the stressed syllable. An even earlier alignment is given as H*L-H%, and there is a fourth possibility – H+L*L-H%, where alignment occurs with the syllable preceding the stressed one, which appears not to be discussed in the AM literature. KIM treats these pitch patterns as distinctive points on a scale of synchronisation, from early to late, of the F0 peak maximum with vocal-tract timing, and uses the PROLAB notations <&2) &.,>, <&2^ &.,>, <&2^-(&.,>, <&2(&.,> (see 2.7).
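KIM's synchronisation scale can be illustrated schematically: the F0 peak is located relative to the accented vowel, with a categorical early/in-accent boundary and a gradient medial-to-late continuum (see 2.8 and the discussion below). The following sketch is not from the source and all times are invented:

```python
# Illustrative sketch (not from the source): KIM's peak-synchronisation
# scale, locating the F0 peak maximum relative to the accented vowel.
# The early/in-accent boundary is treated as categorical, the
# medial-to-late region as a continuum (see 2.8); all times are invented.
def peak_category(peak_s: float, vowel_on_s: float, vowel_off_s: float):
    """Classify F0 peak synchronisation relative to the accented vowel."""
    if peak_s < vowel_on_s:
        return ("early", 0.0)                   # pre-accent: categorically distinct
    span = max(vowel_off_s - vowel_on_s, 1e-9)  # gradient medial-to-late continuum
    lateness = min(max((peak_s - vowel_on_s) / span, 0.0), 1.0)
    return ("medial-to-late", round(lateness, 2))

print(peak_category(0.08, 0.10, 0.25))  # -> ('early', 0.0)
print(peak_category(0.15, 0.10, 0.25))  # -> ('medial-to-late', 0.33)
print(peak_category(0.24, 0.10, 0.25))  # -> ('medial-to-late', 0.93)
```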
Pierrehumbert and Steele (1987, 1989) raised the question as to whether the L+H*L-H% versus L*+HL-H% distinction is discrete or scalar. They based their investigation on the utterance ‘Only a millionaire’, with initial stress on the noun and F0 peaking earlier or later in relation to the offset of /m/. They contextualised the two versions in a scenario of a fund-raising campaign targeting the richest. A potential donor, when approached as a billionaire in a telephone call, replies, ‘Oh, no. Only a millionaire’, with L+H*L-H%, whereupon the charity representative expresses his incredulity and uncertainty with the later peak alignment L*+HL-H%. To decide on the discrete versus scalar issue, the authors performed a perception-production experiment. They took a natural production of an L+H*L-H% utterance as the point of departure for LPC synthesis, shifting the stylised rise-fall pattern in 20 ms steps through the utterance, with peak positions ranging from 35 ms to 315 ms after /m/ offset.
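As a quick check of the stimulus arithmetic, shifting the peak in 20 ms steps from 35 ms to 315 ms after /m/ offset yields exactly the fifteen stimuli mentioned below:

```python
# Quick check (not from the source) of the stimulus arithmetic described
# above: peak positions from 35 ms to 315 ms after /m/ offset in 20 ms steps.
peak_positions_ms = list(range(35, 316, 20))
print(peak_positions_ms)        # 35, 55, ..., 315
print(len(peak_positions_ms))   # -> 15, matching the fifteen stimuli below
```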
Five subjects were asked to listen to each of the fifteen stimuli in fifteen randomised blocks, and to imitate what they had heard. These imitations were recorded and analysed with the hypothesis that, if the categories are discrete, the ideal speaker/listener will allocate the percepts to two different categories and then reproduce them in such a way that the realisations will show a bimodal clustering. The statistical basis of this experiment is weak, not only because of the insufficient number of subjects, but more particularly since one hearer-speaker was the junior author, who, of course, knew what the test categories were and sounded like, and who produced the clearest bimodal pattern. Furthermore, one subject failed to produce even a vague resemblance of bimodality.
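The bimodality hypothesis can be made concrete with a small sketch (not from the source): if imitated peak delays cluster into two groups whose separation is large relative to their internal spread, the distribution is bimodal. The data below are invented for illustration:

```python
# Speculative sketch (not from the source) of the bimodality test implied
# by the imitation paradigm: two-means clustering of imitated peak delays,
# comparing cluster separation with within-cluster spread. Data invented.
from statistics import mean, pstdev

def bimodality_score(values, iters=20):
    """Separation / pooled spread after two-means clustering (higher = more bimodal)."""
    c1, c2 = min(values), max(values)               # crude initialisation
    for _ in range(iters):
        g1 = [v for v in values if abs(v - c1) <= abs(v - c2)]
        g2 = [v for v in values if abs(v - c1) > abs(v - c2)]
        if not g1 or not g2:
            return 0.0
        c1, c2 = mean(g1), mean(g2)
    pooled = (pstdev(g1) + pstdev(g2)) / 2 or 1e-9
    return abs(c2 - c1) / pooled

bimodal_ms  = [60, 70, 65, 80, 75, 240, 250, 255, 245, 260]  # two clear clusters
unimodal_ms = [60, 90, 120, 150, 180, 210, 240, 150, 160, 140]
print(round(bimodality_score(bimodal_ms), 1))   # large ratio: bimodal
print(round(bimodality_score(unimodal_ms), 1))  # small ratio: not bimodal
```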
The authors’ conclusion that the two phonological categorisations of rise-fall-rise patterns in AM Phonology represent a discrete contrast can therefore not be accepted as having been proved. It is to be assumed that peak shifts in rise-fall-rise patterns are perceptually processed in similar ways to peak shifts in rise-fall patterns, as obtained for English and German (see 2.8). These data show that the perception of peak synchronisation only changes categorically from early (pre-accent) to medial (in-accent) position, but not for peak shift inside the accented vowel, from medial to late, where changes are perceived along a continuum. Since the Pierrehumbert and Steele experiment only dealt with the in-accent shift, the potential categorical change from pre-accent to in-accent could not become a research issue, and in view of the weakness of the experimental paradigm, the results do not support discrete patterning. The perceptual and cognitive processing of rise-fall-rise peak shifts may be considered parallel to that observed for rise-falls, with the addition of an interactional rapport feature carried by final rising pitch. Whereas in a shift from early to medial peak there is a discrete semantic change from Finality to Openness, coupled with a categorical perceptual change (see 2.8), the shift from medial to late peak successively adds degrees of Contrast and of the expression of Unexpectedness along a continuum of peak synchronisation. Furthermore, this expression includes other prosodic variables besides F0 alignment, i.e. F0 peak height, timing, energy and more breathy phonation.
This issue was investigated by Hirschberg and Ward (1992). They report recording the pattern L+H*L-H% with eight utterances in an ‘uncertainty’ as well as in an ‘incredulity’ context, where the latter was hypothesised to generate an expanded pitch range, different timing, amplitude and spectral characteristics. The utterances differed widely in the stretch of speech over which the rise-fall-rise was spread, with ‘ELEVEN in the morning’ at one end of the scale and ‘Nine MILLION’ at the other. For the former, the two contexts, as well as the F0 displays of the two data samples produced, are provided:
‘uncertainty’ A: So, do you tend to come in pretty late then?
B: \ELEVEN in the morning/.
‘incredulity’ A: I'd like you here tomorrow morning at eleven.
B: !ELEVEN in the morning!
‘! !’ is to symbolise the incredulity version of the utterance with the same pitch-accent and edge-tone pattern L*+HL-H% as in the uncertainty version ‘\ /’. The two figures provided show that F0 sets in low and starts rising at the end of the stressed vowel of ‘eleven’, peaks at the end of the accented word, stays high during the following vowel and then descends to a low level in ‘the’. There follows a further small F0 drop in the stressed vowel of ‘morning’, before F0 rises again in the final syllable. The two displays differ only in the F0 range, which is wider in the ‘incredulity’ version, with a slightly higher precursor and considerably higher peak and end points. These F0 patterns suggest that ‘morning’ received extra prominence and was accented in both cases. This would also be a more plausible realisation of the utterance in the two contexts than the one with a single accent on ‘eleven’ and a much earlier rise, starting somewhere around ‘the’. So, this pattern looks different from a single-accent rise-fall-rise in ‘million’ and does not seem to be L*+HL-H%, but L*+HL*L-H%, a fall followed by a rise, as in (b) or (d) of ‘There's another one in the kitchen’ in 1.4.4. This would mean that ‘incredulity’ is signalled by the expanded pitch ranges of the late peak, which signals expressively evaluated Contrast, and of the final rise, probably supported by non-modal phonation. A double-accent fall-rise in the ‘uncertainty’ context does not make a dissociative reference to other alternatives, as the single-accent rise-fall-rise would. But the late peak contrasts, and expressively evaluates, B's time reference with A's question about coming in ‘pretty late’, and the final rise establishes contact with the dialogue partner and alleviates the categoricalness of a late peak.
Hirschberg and Ward used the recordings of the eight contextualised utterances to generate two sets of stimuli, categorised as conveying ‘uncertainty’ and ‘incredulity’ for a listening experiment, where subjects had to allocate each stimulus to one of the two categories. Since the pitch patterns were most probably not homogeneous, and since such context-free semantic allocations are difficult, especially in view of the somewhat opaque meaning of ‘uncertainty’, the conclusions about the physical properties that cue ‘uncertainty’ or ‘incredulity’ are not so clear as they are made out to be.
1.4.5 A New Paradigm
The critical historical survey in 1.4 has prepared the ground, and provided the rationale, for presenting a new paradigm. The following chapters model prosody in relation to communicative functions of speech interaction, on the basis of the Kiel Intonation Model (KIM) in a broad linguistic-paralinguistic setting. The concern for function in prosody research at Kiel University goes back to Bill Barry's paper ‘Prosodic functions revisited again!’ (Barry 1981), following Brazil (1975, 1978). The function perspective guided the analysis, in production and perception, of prosody in general, and of intonation in particular, from the early 1980s onwards, converging on the development of a prosodic model (Kohler 1991a, b, 1997b, 2006b, 2009b).
The idea behind KIM is that modelling prosody should mirror its use by speakers and listeners in communicative action, i.e. prosodic categories must be an integral part of communication processes rather than just static elements in a linguistic description. Speakers use prosody to structure the flow of sound for the transmission of meaning to listeners. In a synsemantic field, prosody operates on linguistic signs in parallel to morphological and syntactic patterning for propositional representation, and in a sympractical deictic field it signals Speaker-Listener-Situation relations. Finally, speakers use prosody to express their emotions and attitudes, and to signal their appeals to listeners. The prosodic model is to be structured in such a way that it can capture and adequately represent all these communicative functions in speaker–listener interaction. This also implies that the model needs to be integrated into a theory of speech and language together with all the other formal means – segmental-phonetic, lexical, morphological, syntactic – contributing in varying proportions as carriers of these functions. The model must be oriented towards basic communicative functions of homo loquens, and at the same time it must take into account psycho-physical components of the human mechanisms of speech production, perception and understanding, irrespective of any particular language form that organises the general psycho-physical prerequisites for communicative purposes in language-specific ways.
KIM follows the European tradition of postulating a system of distinctive global pitch contours – peak, valley, combined peak-valley and level patterns. The model sets out how these patterns are synchronised with vocal-tract articulation, how they are concatenated into a hierarchy of larger units from phrase to utterance to paragraph in reading or to turn in dialogue, and how they are embedded in other prosodic patterns – vocal-tract dynamics, prominence and phonation – paying attention to both the production and the perception of prosody in communicative function. The model was developed over many years, starting with a project in the German Research Council programme ‘Forms and Functions of Intonation’ in the 1980s (Kohler 1991c), continuing with its implementation in the INFOVOX TTS system (Kohler 1997a) and with the development of a data acquisition and annotation platform in the PHONDAT and VERBMOBIL projects of the German Ministry of Research and Technology (Kohler, Pätzold and Simpson 1995; Scheffers and Rettstadt 1997). In this research environment, large databases of read and spontaneous German speech were collected (IPDS 1994–2006; Kohler, Peters and Scheffers 2017a, b) and annotated segmentally and prosodically with the help of the PRO[sodic]LAB[elling] tool (Kohler 1997b; Kohler, Peters and Scheffers 2017a, b), which was devised to symbolise the prosodic systems and structures of KIM for computer processing of the German corpora. In a subsequent German Research Council project, ‘Sound Patterns of German Spontaneous Speech’, various prosodic aspects of the corpus data were analysed in the KIM-PROLAB frame (Kohler, Kleber and Peters 2005). PhD theses by Benno Peters (2006) and Oliver Niebuhr (2007b) followed, and there has been a continuous flow of prosodic research within this paradigm in Kiel.
Figure 1.1 The Organon Model according to Bühler (1934, p. 28), with the original German labels, and their added English translations, of the three relationships, functions and aspects of the linguistic SIGN Z(eichen).
Source: Klaus J. Kohler, ‘Speech Communication in Human Interaction’, in Communicative Functions and Linguistic Forms in Speech Interaction, Cambridge University Press, online publication 13 October 2017, DOI: https://doi.org/10.1017/9781316756782.003
What Is Speech? What Is Language?
Jorge is 4 years old. It is hard to understand him when he talks. He is quiet when he speaks, and his sounds are not clear.
Vicki is in high school. She has had learning problems since she was young. She has trouble reading and writing and needs extra time to take tests.
Maryam had a stroke. She can only say one or two words at a time. She cannot tell her son what she wants and needs. She also has trouble following simple directions.
Louis also had a stroke. He is able to understand everything he hears and speaks in full sentences. The problem is that he has slurred speech and is hard to understand.
All of these people have trouble communicating. But their problems are different.
What Is Speech?
Speech is how we say sounds and words. Speech includes:
Articulation: How we make speech sounds using the mouth, lips, and tongue. For example, we need to be able to say the “r” sound to say “rabbit” instead of “wabbit.”
Voice: How we use our vocal folds and breath to make sounds. Our voice can be loud or soft or high- or low-pitched. We can hurt our voice by talking too much, yelling, or coughing a lot.
Fluency: This is the rhythm of our speech. We sometimes repeat sounds or pause while talking. People who do this a lot may stutter.
What Is Language?
Language refers to the words we use and how we use them to share ideas and get what we want. Language includes:
- What words mean. Some words have more than one meaning. For example, “star” can be a bright object in the sky or someone famous.
- How to make new words. For example, we can say “friend,” “friendly,” or “unfriendly” and mean something different.
- How to put words together. For example, in English we say, “Peg walked to the new store” instead of “Peg walk store new.”
- What we should say at different times. For example, we might be polite and say, “Would you mind moving your foot?” But, if the person does not move, we may say, “Get off my foot!”
Language and Speech Disorders
We can have trouble with speech, language, or both. Having trouble understanding what others say is a receptive language disorder. Having problems sharing our thoughts, ideas, and feelings is an expressive language disorder. It is possible to have both a receptive and an expressive language problem.
When we have trouble saying sounds, stutter when we speak, or have voice problems, we have a speech disorder.
Jorge has a speech disorder that makes him hard to understand. So does Louis. The reason Jorge has trouble is different from the reason Louis does.
Maryam has a receptive and expressive language disorder. She does not understand what words mean and has trouble using words to talk to others.
Vicki also has a language disorder. Reading and writing are language skills. She could also have problems understanding others and using words well because of her learning disability.
Where to Get Help
Speech-language pathologists (SLPs) work with people who have speech and language disorders. SLPs work in schools, hospitals, and clinics, and may be able to come to your home.
To find a speech-language pathologist near you, visit ProFind.
speech, human communication through spoken language. Although many animals possess voices of various types and inflectional capabilities, humans have learned to modulate their voices by articulating the laryngeal tones into audible oral speech.
The regulators
Human speech is served by a bellows-like respiratory activator, which furnishes the driving energy in the form of an airstream; a phonating sound generator in the larynx (low in the throat) to transform the energy; a sound-molding resonator in the pharynx (higher in the throat), where the individual voice pattern is shaped; and a speech-forming articulator in the oral cavity (mouth). Normally, but not necessarily, the four structures function in close coordination. Audible speech without any voice is possible during toneless whisper, and there can be phonation without oral articulation, as in some aspects of yodeling that depend on pharyngeal and laryngeal changes. Silent articulation without breath and voice may be used for lipreading.
An early achievement in experimental phonetics at about the end of the 19th century was a description of the differences between quiet breathing and phonic (speaking) respiration. An individual typically breathes approximately 18 to 20 times per minute during rest and much more frequently during periods of strenuous effort. Quiet respiration at rest as well as deep respiration during physical exertion are characterized by symmetry and synchrony of inhalation (inspiration) and exhalation (expiration). Inspiration and expiration are equally long, equally deep, and transport the same amount of air during the same period of time, approximately half a litre (one pint) of air per breath at rest in most adults. Recordings (made with a device called a pneumograph) of respiratory movements during rest depict a curve in which peaks are followed by valleys in fairly regular alternation.
Phonic respiration is different; inhalation is much deeper than it is during rest and much more rapid. After one takes this deep breath (one or two litres of air), phonic exhalation proceeds slowly and fairly regularly for as long as the spoken utterance lasts. Trained speakers and singers are able to phonate on one breath for at least 30 seconds, often for as much as 45 seconds, and exceptionally up to one minute. The period during which one can hold a tone on one breath with moderate effort is called the maximum phonation time; this potential depends on such factors as body physiology, state of health, age, body size, physical training, and the competence of the laryngeal voice generator—that is, the ability of the glottis (the vocal cords and the opening between them) to convert the moving energy of the breath stream into audible sound. A marked reduction in phonation time is characteristic of all the laryngeal diseases and disorders that weaken the precision of glottal closure, in which the cords (vocal folds) come close together, for phonation.
Respiratory movements when one is awake and asleep, at rest and at work, silent and speaking are under constant regulation by the nervous system. Specific respiratory centres within the brain stem regulate the details of respiratory mechanics according to the body's needs of the moment. The impact of emotions is heard immediately in the manner in which respiration drives the phonic generator; the timid voice of fear, the barking voice of fury, the feeble monotony of melancholy, and the raucous vehemence of agitation are examples. Conversely, many organic diseases of the nervous system or of the breathing mechanism are projected in the sound of the sufferer’s voice. Some forms of nervous system disease make the voice sound tremulous; the voice of the asthmatic sounds laboured and short winded; certain types of disease affecting a part of the brain called the cerebellum cause respiration to be forced and strained so that the voice becomes extremely low and grunting. Such observations have led to the traditional practice of prescribing that vocal education begin with exercises in proper breathing.
The mechanism of phonic breathing involves three types of respiration: (1) predominantly pectoral breathing (chiefly by elevation of the chest), (2) predominantly abdominal breathing (through marked movements of the abdominal wall), and (3) an optimal combination of both (with widening of the lower chest). The female uses upper chest respiration predominantly, while the male relies primarily on abdominal breathing. Many voice coaches stress the ideal of a mixture of pectoral (chest) and abdominal breathing for economy of movement. Any exaggeration of one particular breathing habit is impractical and may damage the voice.
The question of what the brain does to make the mouth speak or the hand write is still incompletely understood despite a rapidly growing number of studies by specialists in many sciences, including neurology, psychology, psycholinguistics, neurophysiology, aphasiology, speech pathology, cybernetics, and others. A basic understanding, however, has emerged from such study. In evolution, one of the oldest structures in the brain is the so-called limbic system, which evolved as part of the olfactory (smell) sense. It traverses both hemispheres in a front-to-back direction, connecting many vitally important brain centres as if it were a basic mainline for the distribution of energy and information. The limbic system involves the so-called reticular activating system (structures in the brain stem), which represents the chief brain mechanism of arousal, such as from sleep or from rest to activity. In humans, all activities of thinking and moving (as expressed by speaking or writing) require the guidance of the brain cortex. Moreover, in humans the functional organization of the cortical regions of the brain is fundamentally distinct from that of other species, resulting in high sensitivity and responsiveness toward harmonic frequencies and sounds with pitch, which characterize human speech and music.
In contrast to animals, humans possess several language centres in the dominant brain hemisphere (on the left side in a clearly right-handed person). It was previously thought that left-handers had their dominant hemisphere on the right side, but recent findings tend to show that many left-handed persons have the language centres more equally developed in both hemispheres or that the left side of the brain is indeed dominant. The foot of the third frontal convolution of the brain cortex, called Broca’s area, is involved with motor elaboration of all movements for expressive language. Its destruction through disease or injury causes expressive aphasia, the inability to speak or write. The posterior third of the upper temporal convolution represents Wernicke’s area of receptive speech comprehension. Damage to this area produces receptive aphasia, the inability to understand what is spoken or written as if the patient had never known that language.
Broca’s area surrounds and serves to regulate the function of other brain parts that initiate the complex patterns of bodily movement (somatomotor function) necessary for the performance of a given motor act. Swallowing is an inborn reflex (present at birth) in the somatomotor area for mouth, throat, and larynx. From these cells in the motor cortex of the brain emerge fibres that connect eventually with the cranial and spinal nerves that control the muscles of oral speech.
In the opposite direction, fibres from the inner ear have a first relay station in the so-called acoustic nuclei of the brain stem. From here the impulses from the ear ascend, via various regulating relay stations for the acoustic reflexes and directional hearing, to the cortical projection of the auditory fibres on the upper surface of the superior temporal convolution (on each side of the brain cortex). This is the cortical hearing centre where the effects of sound stimuli seem to become conscious and understandable. Surrounding this audito-sensory area of initial crude recognition, the inner and outer auditopsychic regions spread over the remainder of the temporal lobe of the brain, where sound signals of all kinds appear to be remembered, comprehended, and fully appreciated. Wernicke’s area (the posterior part of the outer auditopsychic region) appears to be uniquely important for the comprehension of speech sounds.
The integrity of these cortical language areas alone seems insufficient for the smooth production and reception of language. The cortical centres are interconnected with various subcortical areas (deeper within the brain) such as those for emotional integration in the thalamus and for the coordination of movements in the cerebellum (hindbrain).
All creatures regulate their performance instantaneously by comparing it with what it was intended to be, through so-called feedback mechanisms involving the nervous system. Auditory feedback through the ear, for example, informs the speaker about the pitch, volume, and inflection of his voice, the accuracy of articulation, the selection of the appropriate words, and other audible features of his utterance. Another feedback system, through the proprioceptive sense (represented by sensory structures within muscles, tendons, joints, and other moving parts), provides continual information on the position of these parts. Limitations of these systems curtail the quality of speech, as observed in pathologic examples (deafness, paralysis, underdevelopment).
What Are the Types of Speech Communication?
John Huddle
Speech, or oral communication, is a process of sending and receiving spoken messages between people. Speech conveys and sways through the presentation of ideas, opinions, information, directions, and commands, usually with responsive communication from the listener. Effective speech is tailored to our needs and those of the receiver.
1 Intrapersonal
Some would say we listen to ourselves more than we do others. Intrapersonal communication happens inside us as inner speech, self-talk or a range of other self-interactions. The foundation for all other communication, it allows us to develop an awareness and understanding about ourselves and our personal world. We process what we say to others by first holding parts, or sometimes all, of the conversation with ourselves. For instance, politicians rehearse their 30-second introduction speech in front of a mirror at home, while job candidates practice saying why they're the best for the job. Not limited to planned interpersonal communication, intrapersonal speech also includes our daydreams and goals, where we place ourselves in different settings and situations for pleasure or goal setting.
2 Interpersonal
Interpersonal speech is communication with one another through our words, tone of voice, gestures and other body language. Once we say something, it's said and can't be taken back, adding weight to the adage to “watch your tongue.” Even though we might think this communication is simple, interpersonal communication is very complex, encompassing the impressions we have of each other, the message as we think we said it and as it was actually heard, and the willingness of the listener to listen. What we say to others is never said in a vacuum; we bring our needs and values to the conversation. In addition, communication includes the listener's reception, the location and our cultural influences.
3 Small Groups
Successful group communication requires the development of good listening skills, to hear and understand what the members of the group are actually saying as the group moves toward its goals. Within our dominant culture, that means making eye contact and showing agreement and attention with body language, such as leaning forward attentively. Group communication often requires that we clarify what someone else said, usually with a clarifying statement. Groups require a more democratic approach than simply advancing one position: members engage one another by agreeing with what was said or by disagreeing in a way that encourages others to stay engaged. Groups also need someone to keep the group on task, ensure that all are heard, encourage feedback and mediate when conflicts arise.
4 To the Masses
Speaking to the masses, whether lecturing to a small group or worldwide, often involves an unseen audience, with the goal of informing or persuading. Unlike other types of communication, mass communication, or public speaking, is very dependent on the message. Still, the charisma of the speaker's tone, her inflection and her body language, if visible, also influence the message. Successful public speaking depends on the speaker's ability to organize and present the material in a manner that the listener receives and internalizes. The speaker provides a reason for listening, lends credibility to the topic and motivates the audience to respond through words calculated to produce the desired response.
4 Types of Communication Styles for Success at the Workplace
- By Judhajit Sen
- September 24, 2024
Communication is an important skill in any setting, particularly in the workplace, where it influences teamwork, productivity, and relationships. At its core, communication is the act of sharing information between individuals or groups. This process can take many forms, and understanding the four different types of communication skills is key to becoming an effective communicator.
What are the four important communication skills? The four major types of communication are verbal, nonverbal, written, and visual. Each of these plays a crucial role in our daily interactions, both personal and professional.
To communicate effectively, it’s not just about mastering one form, but using a combination of all four methods of communication depending on the context. Each type has its unique strengths, and when used together, they enhance the overall message, ensuring clear and concise communication. Developing these skills will not only help you succeed in your career but also improve your communication skills to connect with others in engaging ways.
Key Takeaways
- Four Communication Styles: Effective workplace communication relies on four main styles: verbal, nonverbal, written, and visual. Each style plays a unique role in sharing information and ideas.
- Importance of Clarity: Clear spoken communication fosters immediate feedback and emotional connections, while well-structured written messages provide a permanent record that aids understanding.
- Power of Nonverbal Cues: Unspoken communication, such as body language and eye contact, can convey deeper feelings and intentions, enhancing the overall message.
- Active Listening Matters: Listening actively is crucial for true engagement. It helps build understanding, resolve conflicts, and strengthen workplace relationships, making it a key component of effective communication.
Types of Communication Styles
Verbal Communication
Verbal communication comprises the use of words to share information, ideas, and emotions. It plays a central role in daily interactions, whether it’s face-to-face, over the phone, or through digital platforms like video conferencing. This form of communication is essential because it allows for immediate feedback, clarifications, and emotional connections.
There are different forms of spoken communication, such as interpersonal, group, public, and mass communication. Interpersonal communication occurs one-on-one, such as a conversation with a friend or coworker. In group communication, a small team discusses ideas, like during a work meeting. Public communication involves speaking to a larger audience, such as during presentations or speeches, while mass communication reaches a broad audience through media like television or social media.
To communicate effectively, clarity is key. Whether you’re speaking to colleagues, customers, or family members, clear communication reduces the risk of misunderstandings. This includes using appropriate words, adjusting your tone, and pacing your speech to ensure your message is easily understood. Furthermore, listening actively is just as important as speaking. When you listen attentively, you can respond appropriately, creating a more productive dialogue.
Another aspect to consider is the use of filler words like “um” or “like,” which can distract from the message. Practicing clear, concise speech can help eliminate these habits, making your communication more professional and effective. It’s also important to match your style of communication to your audience. Speaking to a child requires a different language and tone compared to addressing a group of executives.
Verbal communication is not just about what you say, but how you say it. Pitch, tone, and body language play crucial roles in conveying the right message. Effective communicators know how to use these elements to inspire, persuade, and build relationships.
In the workplace, spoken communication is vital for teamwork, decision-making, and conflict resolution. It influences how teams collaborate and how leaders motivate their employees. By improving your verbal communication, you can hone your ability to express ideas clearly, foster relationships, and avoid misunderstandings.
Nonverbal Communication
Nonverbal communication plays an indispensable role in how we convey and interpret messages beyond words. It involves body language, facial expressions, gestures, posture, and even the tone and pitch of our voice. These elements can either reinforce or contradict spoken communication, often giving deeper insight into a person’s true feelings and intentions.
Body Language and Gestures
Body language is a potent way to express emotions and attitudes. For example, standing tall with an open posture can signal confidence and attentiveness, while crossed arms or slumped shoulders might indicate defensiveness or discomfort. Gestures, such as nodding or using hand movements to emphasize points, are also key parts of unspoken communication. However, it’s important to note that some gestures, like the “OK” sign, can have different meanings across cultures.
Facial Expressions
Our faces are incredibly expressive and can communicate a wide array of emotions without saying a word. Smiling typically shows happiness or approval, while a frown may signal discontent or confusion. Facial expressions are often universal; a smile usually means the same thing no matter where you are. This universality makes them a reliable form of unspoken communication in most situations.
Eye Contact
Eye contact is another critical element of unspoken communication. Maintaining eye contact shows interest and engagement, while avoiding it can signal discomfort, dishonesty, or disinterest. In many cultures, the way you look at someone speaks volumes, helping to build trust or, conversely, create tension.
Tone and Pitch
The tone and pitch of our voice can drastically alter the meaning of what we’re saying. A friendly, upbeat tone can convey excitement or positivity, while a flat or monotone voice might suggest boredom or indifference. In a professional setting, controlling your tone can help ensure that your message is received as intended.
Physical Space
Personal space, or proxemics, is another important aspect of unspoken communication. Standing too close to someone can make them feel uncomfortable, while maintaining a respectful distance fosters ease. This varies depending on the cultural context and the nature of the relationship, so it’s essential to be mindful of how physical space impacts communication.
Unspoken communication is a formidable tool that often speaks louder than words. By paying attention to tone, facial expressions, and other nonverbal cues, you can improve your ability to communicate effectively.
Written Communication
Written communication involves conveying messages through the written word, such as emails, reports, social media posts, and letters. It serves as a vital tool in both personal and professional settings, allowing information to be shared clearly, recorded for reference, and distributed to large audiences.
One of the key strengths of written communication is its ability to maintain a permanent record. Whether it’s a business report, memo, or contract, written communication ensures that the information is documented and can be referred to later. This permanence is especially valuable in legal, academic, and professional contexts, where precise and clear records are essential.
Effective written communication should be simple, clear, and well-structured. It is important to avoid unnecessary complexity, as complicated language can lead to confusion or misunderstandings. Start with a clear introduction, elaborate on your points in the body, and summarize key takeaways at the end. This structure helps readers follow the information flow and increases comprehension.
One potential challenge with written communication is the lack of immediate feedback. Unlike face-to-face conversations, written messages don’t allow for instant clarification, which can sometimes lead to misinterpretation. Tone, emotion, and humor can also be difficult to convey accurately in writing. Therefore, it’s crucial to carefully choose words and avoid relying on tone or sarcasm, which may be misread. If necessary, follow up with verbal communication to add more context.
To improve your written communication, take time to review and edit your messages. Proofreading can help catch errors and ensure your message is clear and professional. For important documents, it might be helpful to have a colleague review them as well.
Written communication is an indispensable tool in today’s world. By following best practices—such as keeping messages concise, structuring them clearly, and proofreading—you can communicate effectively and avoid potential misunderstandings.
Visual Communication
Visual communication is a potent tool for conveying information, messages, and ideas through images, symbols, charts, and other graphical representations. In our highly visual society, this form of communication surrounds us daily—whether it’s through advertisements, social media, or even simple road signs. From Instagram posts to infographics, visuals allow for quick and efficient communication of complex information.
One of the key strengths of visual communication is its ability to transcend language barriers, making it a universal way to convey messages. Well-designed visuals like infographics, charts, and graphs can simplify intricate data, helping audiences quickly grasp key points. For example, in a business setting, a graph comparing sales figures or a pie chart breaking down a budget report makes it easier for team members to understand trends and make informed decisions. Visuals also enhance retention, as people tend to remember images more effectively than text alone.
In the workplace, visual communication plays a critical role. PowerPoints, performance reports, and infographics are frequently used to aid presentations, making data easier to digest. Additionally, promotional materials like videos, social media graphics, and TV ads are excellent examples of how visuals capture attention and keep audiences engaged.
However, visual communication has its challenges. While it can simplify information, it can also be misinterpreted if not designed with clarity. Different people may interpret images in varied ways, leading to misunderstandings. Furthermore, creating effective visuals can be time-consuming and may require specialized skills in design and software.
When using visuals, it’s important to consider the audience. A simple, clean design is sometimes more effective than a cluttered or overly complex one. Aligning visuals with the message is critical; for instance, a pie chart should clearly support the data it represents without overwhelming the viewer with too much information. Consistency in branding, including colors and fonts, is also essential for business presentations and promotional content.
Visual communication complements different forms of communication by making information easier to understand, memorable, and engaging. When used effectively, it can transform a message into something both visually appealing and impactful.
Bonus: Listening
Listening is often overlooked as a form of communication, but it plays an essential role in how we connect with others. In fact, listening actively may be the most important of the five types of communication because, without it, true engagement is impossible. For instance, in a negotiation, understanding what the other person needs is key to finding a win/win outcome, and that understanding starts with listening.
But listening is more than just hearing words. It involves an active process of receiving, interpreting, and reacting to a message. It’s about grasping not only what is said but also the intent and emotions behind the words. Effective listening means paying attention to both verbal cues, like tone, and unspoken cues, such as facial expressions and body language. This depth of listening helps foster mutual understanding, resolve conflicts, and strengthen relationships.
To become a great communicator, mastering the art of listening is essential. Active listening means engaging your mind fully while someone speaks, rather than simply waiting for your turn to talk. It’s not enough to hear someone; you need to make an effort to truly understand what they’re trying to say. Without this, the entire communication process can break down, especially in a work environment.
To improve your listening skills, here are a few practical tips:
Focus on the speaker: Maintain eye contact and minimize distractions. If your mind wanders, refocus on the present moment.
Seek clarity: If something isn’t clear, ask follow-up questions to ensure you understand the message.
Wait your turn: Avoid interrupting. If a thought pops into your head, jot it down so you can return your full attention to the speaker.
Show interest: Engaged body language signals that you’re paying attention, which helps the speaker feel heard.
Paraphrase: Repeating what was said in your own words can clarify the message and prevent misunderstandings.
By practicing these habits, you’ll not only become a better listener but also a more effective communicator.
Wrap-up: Types of Communication
Understanding different types of communication strategies is essential for success in the workplace. The four styles of communication—verbal, nonverbal, written, and visual—each serve unique roles in how we share and interpret information. Spoken communication facilitates immediate feedback and emotional connections, while unspoken cues provide deeper insights into feelings. Written communication offers a permanent record, essential for clarity and reference, and effective visual communication simplifies complex ideas through graphics and images.
Mastering these types of effective communication allows for more effective exchanges and fosters better teamwork. Additionally, listening actively enhances communication by ensuring understanding and engagement. By developing skills across these areas, you can improve not only your professional interactions but also your ability to connect meaningfully with others.
Frequently Asked Questions (FAQs)
1. What are the different types of communication styles?
The four major types of professional communication are verbal, nonverbal, written, and visual. Each style plays a vital role in how we share information and connect with others.
2. Why is verbal communication important?
Verbal communication allows for immediate feedback and emotional connections. It’s essential for teamwork, decision-making, and conflict resolution in the workplace.
3. How does nonverbal communication impact messages?
Nonverbal communication, such as facial expressions, tone, and body language, often conveys deeper meanings than words alone. It can reinforce or contradict what is being said.
4. What is the role of active listening in communication?
Active listening is crucial for understanding and engagement. It involves fully concentrating on the speaker and interpreting both verbal and unspoken cues to foster better relationships.
Master Different Communication Styles with Prezentium
Effective communication is important in any workplace, and understanding various types of messages in communication—verbal, nonverbal, written, and visual—can greatly enhance your interactions. At Prezentium, we prioritize a customer-first approach, offering tailored services that help you master these essential skills.
With our Overnight Presentations, we transform your ideas into polished presentations by the next morning, ensuring clear and impactful messaging. Our Accelerators team collaborates with you to refine your concepts and create engaging designs that resonate with your audience. Lastly, our Zenith Learning workshops equip you with the tools to harness structured problem-solving and visual storytelling, empowering you to communicate effectively in any setting.
Let Prezentium help you elevate your different communication skills and gain success in your professional journey. Reach out today to discover how we can assist you in mastering your unique style!
Why wait? Book a complimentary 1-on-1 session with our presentation expert and see how other enterprise leaders are creating impactful presentations with us.
Research Article
From unimodal to multimodal dynamics of verbal and nonverbal cues during unstructured conversation
- Tifenn Fauviaux,
- Ludovic Marin,
- Mathilde Parisi,
- Richard Schmidt,
- Ghilès Mostafaoui
- Published: September 25, 2024
- https://doi.org/10.1371/journal.pone.0309831
Conversations encompass continuous exchanges of verbal and nonverbal information. Previous research has demonstrated that gestures dynamically entrain each other and that speakers tend to align their vocal properties. While gesture and speech are known to synchronize at the intrapersonal level, few studies have investigated the multimodal dynamics of gesture/speech between individuals. The present study aims to extend our comprehension of unimodal dynamics of speech and gesture to multimodal speech/gesture dynamics. We used an online dataset of 14 dyads engaged in unstructured conversation. Speech and gesture synchronization was measured with cross-wavelets at different timescales. Results supported previous research on intrapersonal speech/gesture coordination, finding synchronization at all timescales of the conversation. Extending the literature, we also found interpersonal synchronization between speech and gesture. Given that the unimodal and multimodal synchronization occurred at similar timescales, we suggest that synchronization likely depends on the vocal channel, particularly on the turn-taking dynamics of the conversation.
Citation: Fauviaux T, Marin L, Parisi M, Schmidt R, Mostafaoui G (2024) From unimodal to multimodal dynamics of verbal and nonverbal cues during unstructured conversation. PLoS ONE 19(9): e0309831. https://doi.org/10.1371/journal.pone.0309831
Editor: Laura Morett, University of Missouri Columbia, UNITED STATES OF AMERICA
Received: May 21, 2024; Accepted: August 19, 2024; Published: September 25, 2024
Copyright: © 2024 Fauviaux et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data can be found online on the Open Science Framework (OSF) repository at DOI 10.17605/OSF.IO/DCJ95 ( https://osf.io/dcj95 ).
Funding: This study has received funding from the Agence Nationale de la Recherche (ANR) for the project ENHANCER under the Grant agreement number ANR-22-CE17-0036 ( https://anr.fr/Projet-ANR-22-CE17-0036 ). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Social interaction, such as face-to-face communication, can be pictured as a complex choreography wherein each speaker mutually exchanges perceptual information through the vocal (auditory) and visual channels [ 1 – 4 ].
Through the vocal channel, speech encompasses a combination of verbal and nonverbal expressions. Verbal cues convey the linguistic and semantic meaning of words, whereas nonverbal (i.e., suprasegmental) speech features are used to communicate structural information and emotion [ 5 , 6 ]. Nonverbal speech features refer to prosody. Prosody involves fluctuations in fundamental frequency, amplitude, duration of segments and syllables, and intervals of pauses [ 7 ]. The visual channel comprises all elements of nonverbal behavior, including body gestures [ 8 ]. Often, they accompany the speech’s content (i.e., co-speech gestures) to perform some specific communicative function such as emotional expression, agreement, or disagreement [ 9 ]. Co-speech gestures consist most often of manual and head movements and are the most studied gestures in the literature [ 10 , 11 ].
Both prosody and co-speech channels are beneficial for the conversation, conveying respectively 38% and 55% of the information between the speaker and listener [ 5 ]. Their respective functions complement or supplement each other. Prosody is leveraged by the speaker as a means to express his/her beliefs, attitudes, and emotions concerning the content of the message [ 12 ]. The speaker also uses co-speech head and hand gestures as a visual aid to clarify complex concepts that may be challenging to convey solely through vocal communication (i.e., pointing at an object while saying, “What is that?”) [ 13 ]. These co-speech gestures also signal the speaker’s desire to keep talking. For the listener, nonverbal gestures (i.e., head nods) and brief vocalizations (i.e., “mm hm,” “uh-huh,” “yeah”) can be used as backchannels to convey information about his/her continued involvement in the conversation or to signal his/her desire to speak next [ 14 ]. Most of the time, the speaker and the listener combine modalities (i.e., a pointing gesture while speaking; a head nod combined with a “yeah”) [ 15 ].
Multimodality plays an essential role in orchestrating the turn-taking of the conversation [ 16 – 18 ]. Turn-taking is a fundamental structural feature of conversation [ 19 ]. In most cases, only one individual speaks at a time, taking on the role of the speaker, while an interacting partner listens, taking on the role of the listener [ 20 ]. Conversation is then associated with rapid, natural alternations of turn-taking with minimal overlap and speech interruptions [ 21 – 23 ]. This spontaneous way in which individuals naturally take their turn to speak stands out as one of the conversation’s outstanding features. The management of turn-taking relies on a variety of complex signals, including prosodic cues and nonverbal cues [ 24 ]. Therefore, to guarantee the fluidity of the turn-taking, a coordinated dance of vocal and nonverbal signals must be established [ 1 , 25 , 26 ].
Within this coordinated dance, each individual engages in a “back and forth” exchange of communicative signals, operating with their distinctive frequency and rhythm [ 27 ]. During interactions, the communicative signals will coordinate with each other. Coordination, in this context of verbal and nonverbal information exchange, refers to the concept of individuals mutually influencing each other’s behavior over time [ 28 ]. Over the past few decades, research has explored how coordination functions in communication and language interaction.
Early theories, such as the Communication Accommodation Theory, postulated that speakers shift their communicative behavior toward or away from that of each other to indicate their attitude, either through convergence (i.e., individuals accommodate their linguistic, paralinguistic, and nonverbal features to become more similar) or divergence (i.e., individuals differentiate their speech patterns and nonverbal cues compared to those of others) [ 29 ].
Another perspective is the interactive alignment theory by Pickering and Garrod [ 30 ], which explains how individuals share mental representations by automatically aligning their linguistic (lexical and syntactic) behaviors and situational models [ 31 ]. Relying on a priming mechanism, alignment at one level, such as lexical, influences and extends to other levels, such as prosodic and syntactic [ 32 ].
Fusaroli et al. [ 32 ] argued that alignment is only one of several mechanisms used to manage linguistic processing. Building on this, they proposed an alternative approach to better characterize the complexity of multimodal conversation, viewing dialogue as an interpersonal synergy. Dialogue interactions are proposed to be dynamic and context-sensitive, with interlocutors assuming complementary roles and complementing each other’s linguistic behaviors rather than copying them. Therefore, dialogues cannot be understood at the level of the individual alone but rather at the functional system level of the dyad.
A proposed extension to these theories relates to the concept of complexity matching found in dyadic conversation [ 33 , 34 ]. This alternative framework views individuals in conversation as two complex networks, where each of their behaviors, such as speech and movement, exhibit multiscale dynamics [ 26 , 35 ]. It investigates the matching of behaviors in terms of statistical, global dynamics rather than focusing on specific actions [ 33 ]. When two individuals match their level of complexity, they maximize their exchange of information [ 35 ]. These matching behaviors follow power law distributions, indicative of the multiscale variations characteristic of complex systems [ 36 ].
The idea of defining the behaviors of individuals within the interaction as a global dyadic system that represents an interpersonal synergy structure is part of the behavioral dynamics perspective, in which coordinated signals can be seen as an entrainment process of biological and behavioral rhythms [ 37 , 38 ]. Researchers from this perspective have shown that coordination in social interaction is governed by the same dynamical processes of self-organization that constrain interacting physical oscillators [ 39 , 40 ]. These principles represent a universal self-organizing law that occurs at multiple scales (from neural to behavioral or social scales of nature) [ 41 ]. Synchronization emerges as the outcome of both space and time coordination, resulting in a rhythmic convergence of behavior, whether in-phase (behaviors flowing in the same direction) or anti-phase (behaviors flowing in opposite directions) [ 42 ]. For example, during a conversation, while in-phase coordination could be illustrated by the two interlocutors nodding their heads simultaneously in agreement, the anti-phase relationship would imply each speaker nodding their head in an alternating fashion. These patterns of synchronization, alongside the degree of similarity in individuals’ behavior, serve as fundamental measurements of synchronization [ 42 ]. Studies have shown that synchronization can exhibit itself through two interconnected phenomena: self-synchrony within individuals (i.e., intrapersonal synchronization) and interpersonal synchronization between individuals ( Fig 1 ) [ 43 ]. Moreover, synchronization between signals of different modalities has received significant interest in past years [ 44 ]. In this study, we define modality as referring to the distinct nature of signals, such as movement-based signals and voice-based signals. Unimodal studies focus on a single modality, examining either speech synchronization or nonverbal gesture synchronization separately. In contrast, multimodal studies investigate the synchronization between two different modalities, specifically the synchronization between speech and gestures ( Fig 1 ). These studies on synchronization are crucial, as they shed light on the fundamental role of synchronization in facilitating social bonding [ 45 ]. In the following, we review studies on the unimodal analysis of speech, the unimodal analysis of movements, and the multimodal analysis of gesture and speech.
Fig 1. A) and B) represent intrapersonal synchronization among the modalities of a single speaker. A) Full blue arrows highlight the unimodal relationship between the gestures generated by a single individual (i.e., head vs. head, head vs. wrist). B) Dashed blue arrows highlight the multimodal relationship between the voice and the gesture produced by a single individual (i.e., head vs. voice; wrist vs. voice). C) and D) represent the interpersonal synchronization between the modalities of speaker A and speaker B. C) Full red arrows highlight the unimodal relationships between the movements of speaker A and speaker B, and between their voices (i.e., head vs. head, head vs. wrist, voice vs. voice). D) Dashed red arrows highlight the multimodal relationships between the voice of speaker A and the gesture of speaker B and inversely (i.e., head vs. voice; wrist vs. voice).
https://doi.org/10.1371/journal.pone.0309831.g001
Unimodal analyses of speech focus on the interpersonal relationship between the speech of two interlocutors. Studies emphasized that when two individuals are engaged in a conversation, they tend to become similar in how they speak. Many studies have documented this phenomenon of similarity through different terms such as prosodic accommodation, convergence, or acoustic-prosodic entrainment, e.g., [ 46 ]. However, one must be careful with the generic term entrainment, as it differs from the one used in the behavioral dynamics field. Here, entrainment refers to the similarity or alignment of prosodic speech features (pitch, loudness, intensity, speech rate, etc.) between speakers at the conversation or the turn level [ 47 ]. In this field of speech communication, synchrony is considered a phenomenon underlying similarity [ 46 ]. From a behavioral dynamics perspective, synchrony refers to the amount of similarity in prosodic behavior between speakers across time, such as when a speaker changes his/her voice intensity and his/her interlocutor reacts in a parallel way [ 47 ]. To confirm the dynamic display of similarity, synchrony can be assessed using Pearson correlation, either at the level of individual turns or within a defined moving time window [ 46 ]. At the turn level, past studies analyzed the synchronization of spontaneous dyadic conversation, looking at the dimensions of intensity, pitch, voice quality, and speaking rate across time. They found that all these features exhibit significant synchrony [ 47 , 48 ]. De Looze et al. [ 46 ] investigated the evolution of prosodic synchrony with a moving time window and found that prosodic synchrony varies several times throughout the conversation. They found that the fundamental frequency (F0) showed a greater amount of similarity compared to intensity, which displayed a lower occurrence of similarity. They suggested that high synchrony could be indicative of overlapped speech, while lower synchrony might reflect the conversational dynamic where one individual is speaking while the other remains mostly silent.
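As a concrete illustration of the moving-window approach just described, the following minimal sketch computes windowed Pearson correlations between two speakers' pitch contours. It is not the procedure of De Looze et al.: the sampling rate, window length, step size, and all variable names are illustrative assumptions.

```python
# Minimal sketch (assumed parameters): windowed Pearson correlation between
# two speakers' F0 contours as a rough index of prosodic synchrony.
import numpy as np

def windowed_synchrony(f0_a, f0_b, fs=100, win_s=10.0, step_s=1.0):
    """Per-window Pearson r between two equal-length F0 series.

    f0_a, f0_b: 1-D arrays sampled at fs Hz, with unvoiced frames as np.nan.
    """
    win, step = int(win_s * fs), int(step_s * fs)
    r_values = []
    for start in range(0, len(f0_a) - win + 1, step):
        a, b = f0_a[start:start + win], f0_b[start:start + win]
        mask = ~(np.isnan(a) | np.isnan(b))  # keep frames voiced in both
        if mask.sum() > 2 and a[mask].std() > 0 and b[mask].std() > 0:
            r_values.append(np.corrcoef(a[mask], b[mask])[0, 1])
        else:
            r_values.append(np.nan)  # window too sparse to correlate
    return np.array(r_values)
```

Plotting the returned r values over time would show the fluctuations in synchrony over the course of a conversation that De Looze et al. report.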
The same process is found in the unimodal synchronization of nonverbal signals, both at the intrapersonal and interpersonal levels. Indeed, in the framework of motor synchronization, a set of studies has shown that individuals tend to coordinate their behavior. At the intrapersonal level, self-synchrony was first studied by Kelso [ 49 ] and Turvey et al. [ 50 ] through bimanual coordination, and the focus was then extended to coordination during walking [ 51 ] or between the movements of the hand and foot [ 52 ]. At the interpersonal level, Schmidt et al. [ 53 ] characterized the emergence of intentional coordination when they asked two participants to visually coordinate their legs in-phase or in anti-phase with the rhythm of a metronome. This phenomenon was also found to happen unintentionally, with individuals spontaneously coordinating their behaviors as soon as a perceptual coupling was established between them [ 54 – 56 ]. According to Bernieri and Rosenthal [ 57 ], interpersonal coordination can be described as either mimicry (or behavioral matching) or interactional synchrony. In this context, mimicry refers to the imitation of behaviors such as facial expressions, mannerisms, posture, and gestures, which do not require temporal coordination. For example, during a conversation, people might unintentionally mimic each other’s body language, like crossing their legs or touching their hair, within a short window of time (and not necessarily at the same time). Synchronization, however, involves coordination in both space and time [ 58 ]. An illustration of this would be two people walking side by side, spontaneously synchronizing their step patterns at the same time. More recently, studies have focused on nonverbal behaviors in more naturalistic settings. In a series of knock-knock jokes, Schmidt et al. [ 41 ] uncovered a higher-than-expected level of synchronization between the bodily movements of the joke teller and the joke responder. The same synchronization results were likewise identified during unstructured conversations [ 59 ]. Moreover, Hale et al. [ 60 ] highlighted that coordination can also occur between more specific body parts, such as the head movements of two interlocutors.
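To make the in-phase/anti-phase distinction concrete, here is a small, hedged sketch of one common way to quantify it: continuous relative phase estimated from the analytic (Hilbert) signal. This is a generic technique from the coordination dynamics literature, not the specific analysis pipeline of the studies cited above; the function name and inputs are invented for the example.

```python
# Hedged illustration: continuous relative phase between two movement signals.
# Values near 0 rad indicate in-phase coordination; values near +/- pi rad
# indicate anti-phase coordination.
import numpy as np
from scipy.signal import hilbert

def relative_phase(x, y):
    """Continuous relative phase (radians) between two 1-D signals."""
    phase_x = np.angle(hilbert(x - x.mean()))
    phase_y = np.angle(hilbert(y - y.mean()))
    dphi = phase_x - phase_y
    # wrap into (-pi, pi] so in-phase and anti-phase are easy to read off
    return (dphi + np.pi) % (2 * np.pi) - np.pi
```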
In addition to the investigation of unimodal synchronization, research has also explored the field of multimodal coordination between voice and gesture at the intrapersonal level. Condon [ 61 ] was the first to delve into this area, discussing self-synchrony as the representation of a coordinated system between the speaker’s speech and gesture [ 43 ]. From these observations, several researchers have noted that speech and gesture are temporally synchronous: the rhythmic pulse of prosodic events such as stressed syllables and the temporal patterns of nonverbal gestures influence each other [ 10 , 62 ]. For example, Pouw et al. [ 63 ] reported that a sudden increase in speech intensity will entrain spontaneous co-speech hand gestures. However, while intrapersonal synchronization appears to be finely tuned at the prosodic level, with gesture-speech coupling occurring on relatively short timescales [ 64 ], less is known about the interpersonal coupling between the gesture and the voice.
To our knowledge, only a few studies have worked on the synchronization of speech and gesture at the dyad level. Paxton and Dale [ 65 ] investigated the multimodal coordination between speech and bodily movements during an affiliative and an argumentative interaction. They found that coordination indeed occurs between the speaking events of one participant and the movements of the listener but drops during argumentative conversation. However, nonverbal gestures were assessed globally, considering the overall body motion, whereas speech was only considered as binary on/off events. Paxton et al. [ 36 ] created networks of speech and movement to highlight the interconnectivity of these modalities during a cooperative task where participants had to build the tallest tower structure possible. The authors analyzed patterns of influence between multimodal behaviors through analyses of behavior matching (i.e., synchronization) and complexity matching. They found high cross-correlation values emphasizing synchrony in both speech and movement modalities, and observed a lower network strength, indicating efficient communication (i.e., a lower degree of connectivity between modalities). While this study investigated multimodal coordination during an interpersonal interaction, the type of task used is not representative of daily social interaction, and the structure imposed by such a task could have influenced the pattern of synchronization. More recently, Trujillo et al. [ 66 ] specifically focused on analyzing the correlation between linguistic alignment and movement alignment during a task-oriented conversation and an affiliative conversation. They found that movement entrainment was positively correlated with lexical entrainment but negatively with semantic entrainment. However, the authors concentrated their analyses on correlations between voice-to-voice and movement-to-movement interactions, rather than exploring all possible combinations of modalities.
While most of the research on synchronization focuses either on motor behavior or speech dynamics independently, studies exploring the interpersonal dynamics of speech in relation to nonverbal gestures remain limited [ 41 , 59 , 67 ]. A more thorough account of these interpersonal multimodal dynamics is needed, as their synchronization is a foundation for effective social rapport [ 45 ]. Moreover, as pointed out by Alviar et al. [ 34 ], the precise timescale of the different patterns of coordination between modalities should be studied both at the level of the individual and at the level of the dyad interaction. Consequently, the current study aims to fill this gap by analyzing the multimodal synchronization between prosodic features of speech and nonverbal head/hand gestures, both intrapersonal and interpersonal, during a dyadic conversation. For this purpose, we examined synchronization in an online dataset where dyads engaged in an unstructured conversation (i.e., where turn-taking between the partners was not controlled) [ 68 ]. This dataset recorded the global motion of the dyads as well as their specific hand and head movements and their speech. Synchronization of speech and gesture was assessed using the methodology of the cross-wavelet transform [ 41 , 67 ]. In particular, the cross-wavelet transform enabled us to extract two fundamental measurements of synchronization, namely, the pattern of synchronization and the degree of coherence between the two individuals’ dynamics, as emphasized previously. While the pattern of synchronization refers to how the signals, such as speech and gestures, align in time, the degree of coherence highlights how closely correlated or similar these signals are in terms of their frequency components [ 42 ]. These metrics were extracted at the intrapersonal and interpersonal levels, encompassing both unimodal and multimodal synchronization analyses of speech and gesture. Consequently, we separated our work into two distinct questions:
- 1) What are the observed degrees and patterns of synchronization at the intrapersonal level, encompassing both unimodal and multimodal synchronization? Based on previous studies, we hypothesize higher-than-chance coherence for both unimodal and multimodal combinations [67]. While we expect to find the wrist leading the voice [69], we lack empirical evidence to formulate hypotheses for unimodal synchronization patterns.
- 2) Is this intrapersonal synchrony also related to the interpersonal synchrony between individuals’ movements and voices, both in terms of unimodal and multimodal synchronization?
We expect to find higher-than-chance synchronization both between the participants' movements and between their voices [41, 46, 59]. However, we lack the theoretical foundations to formulate hypotheses about multimodal coordination. Similarly, we lack empirical evidence to formulate hypotheses for unimodal and multimodal synchronization patterns.
Materials and methods
Task and recording.
The task and recording files come from the original Meta Research dataset available at https://github.com/facebookresearch/TalkingWithHands32M/ [68, 70]. The dataset consisted of 50 sessions of two people engaged in unstructured face-to-face conversations, each session containing 1 to 4 discussions. These conversations lasted from 7 to 15 minutes. Participants were free to talk about suggested topics, drawn from talking points intended for informal conversations in English, such as “Where are you planning to go for your next vacation?” or “What good restaurants do you know of around here?”. Speaker and listener roles were not assigned, and participants were free to engage with a topic or drift to another [68]. The duration of each discussion varied, resulting in an average conversation time of about 11 minutes.
Motions were recorded in a room fitted with an OptiTrack motion capture system. The system's cameras were positioned on each side of the two participants, as well as below and above them, to capture each participant's whole body, resulting in 83 body coordinates being acquired. The motion recording of each participant was then built into Biovision Hierarchy (BVH) files. The audio was recorded using an OctaMic XTC, a versatile preamplifier, and was synchronized with the motion data by a BrightEye 56 pulse generator [68].
Of the thirty-two sessions at our disposal, fourteen were selected for our analyses; because multiple discussions occurred within each session, this resulted in a total of thirty-two discussions. These fourteen sessions were chosen because they included both audio and motion data for the presented tasks.
Data processing
Motion analyses.
We used a script available at https://github.com/wspr/bvh-matlab together with a MATLAB script to extract specific body coordinates (X, Y, Z). All body data were processed at 90 fps. We applied a second-order Butterworth low-pass filter to the position data and then calculated velocity time series from the filtered positions. Of the 83 body-part velocity series, the final analyses focused on a head marker and on the sum of the left and right wrist marker velocities, as both the head and the hands are commonly used to accompany speech.
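As an illustration of this processing step, a minimal MATLAB sketch is given below. The 10 Hz cutoff frequency, the variable names, and the stand-in trajectory are assumptions made for illustration; the exact filter settings of the original scripts are not reported in the text.

% Minimal sketch of the motion pre-processing (assumed parameters).
pos = cumsum(randn(1000, 3)) / 100;       % stand-in random-walk [X Y Z] trajectory
fs = 90;                                  % motion capture frame rate (fps)
fc = 10;                                  % assumed low-pass cutoff (Hz), for illustration only
[b, a] = butter(2, fc/(fs/2), 'low');     % second-order Butterworth low-pass filter
posFilt = filtfilt(b, a, pos);            % zero-phase filtering avoids temporal lag
vel = vecnorm(diff(posFilt), 2, 2) * fs;  % speed time series (units per second)

Zero-phase filtering (filtfilt) is used here so that the filter does not shift the signal in time, which matters when the goal is to study temporal alignment between modalities.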
Audio analyses
Every session in the dataset included two distinct audio channel files (in WAV format), intended to capture the voices of the participants wearing the microphones. When analyzing each audio file, however, it became clear that the two voices could not be cleanly separated: attributing a single audio signal exclusively to one participant was challenging, since the partner's voice was also captured in the recording. To overcome this issue, we used the ELAN software to manually annotate each participant's speaking time [71].
To quantitatively assess the shared temporal structure of speech and gesture, we calculated the amplitude envelope of the speech. The amplitude envelope is a continuous measure for tracking the rhythmicity of speech (i.e., its prosody) and is known to correlate highly with articulatory movements [72]. The amplitude envelope was extracted with the code of Pouw and Trujillo [73], which computes the analytic signal and temporal fine structure using the Hilbert transform method.
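For illustration, a simplified MATLAB sketch of this extraction is shown below. It keeps only the core Hilbert-transform step; the actual code of Pouw and Trujillo [73] includes further processing choices not reproduced here, and the file name, smoothing window, and resampling target are assumptions.

% Simplified sketch of amplitude-envelope extraction via the Hilbert transform.
[x, fsAudio] = audioread('speaker1.wav');      % hypothetical file name
x = x(:, 1);                                   % keep a single channel
env = abs(hilbert(x));                         % analytic signal -> amplitude envelope
env = smoothdata(env, 'movmean', round(0.012 * fsAudio)); % light smoothing (assumed 12 ms window)
env90 = resample(env, 90, fsAudio);            % downsample to the 90 Hz motion rate (assumed)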
We also used our manual speaking-behavior annotations, together with a MATLAB script, to describe the turn-taking behavior of the conversation. A turn is defined as a sequence of speech units from a single speaker that can be separated by pauses but is not interrupted by speech units of the other speaker [74]. These speech units are characterized as speech segments from one speaker without any silence exceeding 200 ms [24]. To create these speech units, we therefore merged speech segments that were separated by silences shorter than 200 milliseconds. The remaining silences (longer than 200 milliseconds) were identified as pauses if they occurred between consecutive speech units from the same speaker, or as gaps if they occurred between consecutive speech units from different speakers. If pauses occurred without any detected voice from the other speaker, indicating no interruption, the speech units from the same speaker were combined to form a turn. For example, when speaker "A" paused for more than 200 milliseconds, we checked whether speaker "B" stayed silent; if "B" remained quiet and "A" continued speaking afterward, "A"'s speech segments were combined into a single turn. Moreover, speech units from speakers A and B can occur at the same time (i.e., they overlap). Overlaps could represent an interruption or a backchannel; however, such a distinction was not made here [24]. Both onsets of overlaps and onsets of turns constituted a turn switch. Turn durations were calculated and revealed that 72% of the turns lasted up to 10 seconds, 22% extended from 10 to 20 seconds, and a minority, comprising only 6%, continued up to 30 seconds. 70% of the turn switches were attributed to overlaps and the remaining 30% to the onsets of turns.
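To make the first step of this procedure concrete, the MATLAB sketch below merges one speaker's annotated segments into speech units; the [onset offset] matrix format and toy values are assumptions made for illustration.

% Merge one speaker's speech segments into speech units whenever the
% silence separating consecutive segments is shorter than 200 ms.
seg = [0.0 1.2; 1.3 2.0; 2.6 3.1];         % toy [onset offset] annotations (s)
minGap = 0.200;                            % merge threshold (s)
units = seg(1, :);
for k = 2:size(seg, 1)
    if seg(k, 1) - units(end, 2) < minGap
        units(end, 2) = seg(k, 2);         % silence < 200 ms: extend the current unit
    else
        units = [units; seg(k, :)];        %#ok<AGROW> silence >= 200 ms: start a new unit
    end
end

The remaining silences between the resulting units can then be classified as pauses or gaps depending on whether the surrounding units belong to the same speaker.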
Wavelet analysis
Studying social interaction involves analyzing different types of signals (i.e., different modalities), such as movement and speech signals. These signals are known to oscillate on multiple timescales, from faster ones (syllable rate) to longer ones (movement duration, turn duration) [22, 41, 67]. Indeed, studies suggest that co-speech gestures and speech are organized in a complex, hierarchical manner, where shorter-timescale patterns are nested within longer ones [67]. This means that synchronization can change over the time course of the interaction, making these time series non-stationary [75, 76]. While cross-correlation analyses are commonly used to study interpersonal coordination [77], these methods assume stationarity in the signals [76]. This assumption makes them inadequate for capturing the dynamic phenomena that occur during conversations [2]. To overcome this, wavelet-analysis methods have recently been used in many studies on synchronization [41, 59, 60, 67]. This validated method allows one to investigate complex and non-stationary time series with multiple frequencies occurring at the same time [41, 42]. The wavelet transform is then mapped onto a time-frequency plane and illustrated in a wavelet power plot [42]. Fig 2 provides an example of frequency modifications observed in a speech envelope signal.
The wavelet transform represents one person's voice saying “OK” every 2 seconds (0.5 Hz). The x-axis shows the time in seconds. The y-axis represents the different timescales expressed as frequency (Hz) (i.e., 1 Hz = 1/period = 1/1 s). The main frequency of the voice is represented by the continuous yellow line at 0.5 Hz, while the consecutive bursts reflect the moments when frequency modifications occur (i.e., when the person said “OK”).
https://doi.org/10.1371/journal.pone.0309831.g002
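A signal of this kind can be explored with MATLAB's continuous wavelet transform. The toy sketch below generates a burst train at 0.5 Hz and plots its scalogram; it is an illustration in the spirit of Fig 2, not the paper's exact procedure.

% Toy example: a short burst every 2 s (0.5 Hz) analyzed with the CWT.
fs = 90;
t = (0:1/fs:30)';
x = double(mod(t, 2) < 0.2);   % a 200 ms "OK"-like burst every 2 seconds
cwt(x, 'amor', fs);            % scalogram using the analytic Morlet wavelet

In the resulting plot, energy concentrates around 0.5 Hz, mirroring the continuous yellow line described in the caption above.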
A cross-wavelet transform is then employed to evaluate the dynamic interaction between two different time series. Cross-wavelet analyses provide information about the coherence and relative phase of two signals. Coherence measures the degree of similarity between two time series at each timescale, on a range from 0 to 1: a coherence of 1 reflects perfect synchronization between the two time series, while 0 indicates no synchronization. The relative phase captures and quantifies the pattern of synchronization between the two signals and can also reveal transitions from one behavior to another [42]. Coordination in which the two signals move together at the same time has a relative phase of 0°, meaning both are in-phase. Conversely, if people move in alternation, their activity time series will be in anti-phase, with a relative phase angle of 180° [26, 42, 78].
In the current study, cross-wavelet transforms were calculated for each dyad using the MATLAB wavelet toolbox, with Morlet as the mother wavelet. For all selected sessions, we extracted the coherence and relative phase values for 23 timescale ranges spanning 0.125 s to 30 s (0.125–0.25, 0.25–0.375, 0.375–0.5, 0.5–1, 1–2, 2–3, 3–4, 4–5, 5–6, 6–7, 7–8, 8–9, 9–10, 10–12, 12–14, 14–16, 16–18, 18–20, 20–22, 22–24, 24–26, 26–28, 28–30). These intervals were selected because they represent relevant fine-grained timescales for both speech and movement [59, 67].
The mean coherence and the circular mean of the relative phase at these subsidiary timescales were extracted to evaluate the degree and the pattern of coordination, respectively, and submitted to statistical analyses.
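As a hedged sketch of how these quantities can be obtained, the MATLAB fragment below uses the Wavelet Toolbox function wcoherence and the circular mean from the circular statistics toolbox [82]; the exact function choices, variable names, and band boundaries are assumptions, since the text does not specify them.

% Coherence and relative phase for one pair of 90 Hz time series
% (e.g., head velocity and speech envelope); stand-in signals here.
fs = 90;
x1 = randn(fs * 60, 1);                     % stand-in signal, 1 min at 90 Hz
x2 = randn(fs * 60, 1);
[wcoh, wcs, f] = wcoherence(x1, x2, fs);    % wavelet coherence and cross-spectrum
phi = angle(wcs);                           % relative phase (radians) per time-frequency point
% Average within one of the 23 timescale bands, e.g., periods of 1-2 s (0.5-1 Hz):
band = f >= 0.5 & f <= 1;
meanCoh = mean(wcoh(band, :), 'all', 'omitnan');        % degree of coherence
bandPhi = phi(band, :);
meanPhi = rad2deg(circ_mean(bandPhi(~isnan(bandPhi)))); % circular mean of relative phase

Note that wcoherence uses the analytic Morlet wavelet by default, consistent with the mother wavelet reported above.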
Surrogate data.
Surrogate data were generated as control-condition estimates to evaluate whether the degree and the pattern of coordination at the different timescales differed significantly from what would be expected by chance [41].
For interpersonal analyses, between-subject shuffling was applied to create these surrogate data. This consists of permuting participants' whole-session data, irrespective of session order and participant, creating artificial interactions or “pseudo-dyads”. In this study, we combined Participant 1 from a specific dyad with Participant 2 from all remaining dyads and takes. All possible combinations were extracted, resulting in 942 random pseudo-dyads. When time series were of unequal length, the longer time series was truncated to the length of the shorter one [79].
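The pseudo-dyad construction can be sketched as follows in MATLAB; the cell-array data layout and variable names are hypothetical.

% Pair participant 1 of each dyad with participant 2 of every other dyad.
p1 = {randn(900, 1), randn(850, 1)};              % toy data: two dyads' participant 1
p2 = {randn(900, 1), randn(850, 1)};              % toy data: two dyads' participant 2
pseudo = {};
for i = 1:numel(p1)
    for j = 1:numel(p2)
        if i ~= j                                     % never re-pair true partners
            n = min(numel(p1{i}), numel(p2{j}));      % truncate to the shorter series
            pseudo{end+1} = [p1{i}(1:n), p2{j}(1:n)]; %#ok<AGROW>
        end
    end
end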
For intrapersonal analyses, within-subject shuffling (i.e., segment shuffling) was applied to create the surrogate data. This process consists of splitting individual time series into small segments that are then permuted in time, keeping the subject structure intact. In the current study, we chose a segment length of 200 ms, as intrapersonal synchrony can be found at higher frequencies [80, 81].
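A minimal MATLAB sketch of this segment shuffling is given below; the variable names are assumptions, and any samples beyond a whole number of segments are simply dropped here for brevity.

% Within-subject segment shuffling: cut a 90 Hz series into 200 ms segments
% (18 samples each) and permute their order.
fs = 90;
x = randn(fs * 60, 1);                                % stand-in series, 1 min at 90 Hz
segLen = round(0.200 * fs);                           % 200 ms = 18 samples
nSeg = floor(numel(x) / segLen);
segs = reshape(x(1:nSeg*segLen), segLen, nSeg);       % one segment per column
surrogate = reshape(segs(:, randperm(nSeg)), [], 1);  % shuffled surrogate series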
Overall, the same wavelet coherence and phase analyses were conducted over all possible combinations of pseudo-dyads and then averaged at the dyad level for further statistical analyses.
Results
The results section is divided into two major parts: the first focuses on intrapersonal analyses and the second on interpersonal analyses, each examining both degrees and patterns of synchronization between speech and gesture (unimodal and multimodal). The term “unimodal” refers to “movement vs. movement” analyses (e.g., head vs. head, head vs. wrist) or “voice vs. voice” analyses. In contrast, the term “multimodal” involves the joint analysis of movement and voice modalities (e.g., head vs. voice; wrist vs. voice). To analyze degrees and patterns of synchronization, separate analyses of variance (ANOVAs) were conducted on the mean coherence and the circular mean relative phase, respectively. Condition (Experimental, Virtual) and Timescales (ranging from 0.125 s to 30 s) were chosen as variables. Moreover, depending on whether the analyses were unimodal or multimodal, Modalities (such as Head vs. Head, Head vs. Wrist, Head vs. Voice, etc.) were included as a third variable. For all statistical analyses, Greenhouse-Geisser adjustments for violations of sphericity were made as necessary. For post hoc analyses and pairwise comparisons, Bonferroni correction was implemented to determine significant differences between individual means.
What degrees and patterns of synchrony are observed within participants (intrapersonal analyses)?
Unimodal movement vs. movement analyses.
The unimodal analyses focused exclusively on one modality pair, “Wrist vs. Head”. Therefore, a two-way repeated-measures ANOVA was conducted on both mean coherence and mean relative phase, with Condition and Timescales as within-subject variables.
Degree of coherence.
The ANOVA revealed a significant main effect of Condition (F(1, 13) = 154.95, p < .0001, η² = 0.73) on the mean coherence. Post-hoc analysis highlighted a significantly higher mean coherence for the experimental condition (M = 0.34) compared to the virtual condition (M = 0.25) (Fig 3A). The subsequent analysis therefore focused on the experimental condition and found a main effect of Timescales (F(22, 286) = 25.89, p < .0001, η² = 0.54). The highest mean coherence was found in the 12–16 s range (M = 0.38), while the lowest was found in the 0.25–0.5 s range (M = 0.26).
(A) Differences in mean coherence between the experimental condition (red) and the virtual condition (blue) across different timescales. Means and standard deviations are represented with dots and colored ribbons, respectively. Grey areas depict significant differences between conditions; here, all timescales are significant. (B) A circular histogram depicting the probability distribution for the experimental condition at timescale ranges where wavelet coherence significantly differed from chance. The blue line represents the mean phase angle. The unimodal wrist vs. head relative phase angle oscillates around -6°, the head leading the wrist on average.
https://doi.org/10.1371/journal.pone.0309831.g003
Pattern of synchronization.
The ANOVA was performed across all timescales. It revealed a significant interaction effect between Timescales and Condition (F(22, 286) = 1.99, p < .05, η² = 0.062). Post-hoc analysis highlighted an effect of Condition in the time intervals 0.375–0.5, 0.5–1, 1–2, and 4–5 s. Further descriptive analysis focused on these specific timescales and involved wrapping the data around 360° to acknowledge the inherent cyclic nature of the data and to enhance visualization. Circular plots were handled using the circular statistics toolbox for MATLAB [82]. This analysis showed that the angle where oscillations were most concentrated (the peak of the probability distribution) was 342° (i.e., -18° in the range [-180°, 180°]) and that the average angle direction was -6°. On average, intrapersonally, the head tended to precede the wrist, indicating an in-phase lead (Fig 3B).
Multimodal voice vs. movement analyses.
The multimodal analyses between voice and movement focused on two modality pairs, “Wrist vs. Voice” and “Head vs. Voice”. Therefore, a three-way repeated-measures ANOVA was conducted on both mean coherence and mean relative phase, with Condition, Timescales, and Modalities (added here as a third variable) as within-subject variables.
Degree of coherence.
The ANOVA revealed a significant interaction effect between Condition and Timescales (F(22, 286) = 16.66, p < .0001, η² = 0.15). Follow-up analysis of the interaction highlighted a statistically significant effect of Condition at all levels of Timescales, with a higher mean coherence for the experimental condition (M = 0.29) compared to the virtual condition (M = 0.25) (Fig 4A). The subsequent analysis therefore focused on the experimental condition and found a main effect of Timescales (F(22, 286) = 22.54, p < .0001, η² = 0.32). The highest mean coherence was found in the 26–28 s range (M = 0.34), while the lowest was found in the 0.5–2 s range (M = 0.25). No effects of Modality were found to be significant.
(A) Differences in mean coherence between the experimental condition (red) and the virtual condition (blue) across different timescales. Means and standard deviations are represented with dots and colored ribbons, respectively. Grey areas depict significant differences between conditions; here, all timescales are significant. (B) and (C) Multimodal head vs. voice and wrist vs. voice relative phase angles, respectively. The circular histograms depict the probability distribution for the experimental condition at timescales where wavelet coherence significantly differed from chance. The blue line represents the mean phase angle. The head vs. voice relative phase angle oscillates around -18°, the voice leading the head on average. The wrist vs. voice relative phase angle oscillates around -34°, the voice leading the wrist on average.
https://doi.org/10.1371/journal.pone.0309831.g004
Pattern of synchronization.
The ANOVA was performed across all timescales. It revealed a significant interaction effect between Condition and Timescales (F(22, 286) = 3.79, p < .0001, η² = 0.06). The interaction effect underscored a difference in the mean circular relative phase between the Experimental and Virtual conditions, especially within the 3–18 s range (M experimental = -23.54, M virtual = 0.229), the 20–26 s range (M experimental = -15.34, M virtual = 2.54), and the 28–30 s range (M experimental = -16.6, M virtual = -0.79). Further descriptive analysis indicated that the angle where oscillations were most concentrated (the peak of the probability distribution) was 342° (i.e., -18° in the range [-180°, 180°]) for the Head vs. Voice angles, and 324° (i.e., -36° in the range [-180°, 180°]) for the Wrist vs. Voice angles. The average angle direction was -18° for Head vs. Voice, highlighting that, intrapersonally, the voice tended to precede the head, indicating an in-phase lead (Fig 4B). The average angle direction was -34° for Wrist vs. Voice, highlighting that, intrapersonally, the voice tended to lead the wrist, indicating an in-phase lead (Fig 4C).
What degrees and patterns of synchrony are observed between participants (interpersonal analyses)?
Unimodal voice vs. voice analyses.
The unimodal analyses focused exclusively on one modality pair, “Voice vs. Voice”. Therefore, a two-way ANOVA was conducted on both mean coherence and mean relative phase, with Condition as the between-subject variable and Timescales as the within-subject variable.
The ANOVA revealed a significant interaction between Condition and Timescales (F(22, 572) = 6.75, p < .0001, η² = 0.14). The interaction effect indicated that the Experimental group exhibited a higher mean coherence than the virtual dyads within all ranges except 0.375–1 s (Fig 5A). The subsequent analysis focused on the Experimental condition for the ranges where wavelet coherence was significantly higher than chance. An ANOVA revealed a significant effect of Timescales (F(20, 260) = 9.01, p < .0001, η² = 0.28). The highest mean coherence was found in the 22–26 and 0.125–0.25 s ranges (M = 0.34), while the lowest was found in the 0.25–0.375 and 1–3 s ranges (M = 0.25).
(A) and (B) Differences in mean coherence between the experimental condition (red) and the virtual condition (blue) across different timescales. Means and standard deviations are represented with dots and colored ribbons, respectively. Grey areas depict significant differences between conditions; here, the significant timescales are 0.125–0.375 and 1–30 s for (A), and 5–7 and 9–26 s for (B). (C) A circular histogram depicting the probability distribution for the experimental condition at timescale ranges where wavelet coherence significantly differed from chance. The blue line represents the mean phase angle. The unimodal wrist vs. head relative phase angle oscillates around 14°, the wrist leading the head on average.
https://doi.org/10.1371/journal.pone.0309831.g005
One may note that since the voice vs. voice comparison is between the same modality (voice), further analysis would not clarify which modality is leading; it would only address which participant's voice was leading the other. Since this is likely random within and across dyads, no pattern-of-synchronization analysis was conducted for unimodal voice vs. voice coordination.
Unimodal movement vs. movement analyses.
The unimodal movement analyses focused on three modality pairs: “Head vs. Head”, “Wrist vs. Wrist”, and “Wrist vs. Head”. Therefore, a three-way ANOVA was conducted on both mean coherence and mean relative phase, with Condition as the between-subject variable and Timescales and Modalities as within-subject variables.
The ANOVA revealed a significant interaction effect between Condition and Timescales (F(22, 572) = 2.76, p < .0001, η² = 0.03). The interaction effect indicated that the Experimental group exhibited a higher mean coherence than the virtual dyads within specific ranges: 5–7 s (M experimental = 0.24, M virtual = 0.24) and 9–26 s (M experimental = 0.26, M virtual = 0.23). Within these specific timescales, the coherence of the experimental dyads was significantly above that obtained from chance (virtual condition), irrespective of the modality (Fig 5B). A subsequent analysis focused on the Experimental condition for the ranges where wavelet coherence was significantly higher than chance. An ANOVA revealed a significant effect of Timescales (F(9, 117) = 4.85, p < .0001, η² = 0.06). The highest mean coherence was found in the 18–22 s range (M = 0.27), while the lowest was found in the 5–7 s range (M = 0.24). No effects of Modality were found to be significant.
The three-way ANOVA was performed on ranges where wavelet coherence was significantly different from chance.
The ANOVA revealed a significant main effect of Condition (F(1, 26) = 5.16, p = .032, η² = 0.08). The angle where oscillations were most concentrated (the peak of the probability distribution) was 324° (i.e., -36° in the range [-180°, 180°]), and the average angle direction was 14°. On average, interpersonally, the wrist tended to lead the head, indicating an in-phase lead (Fig 5C).
Multimodal voice vs. movement analyses.
The multimodal voice/movement analyses focused on two modality pairs, “Head vs. Voice” and “Wrist vs. Voice”. Therefore, a three-way ANOVA was conducted on both mean coherence and mean relative phase, with Condition as the between-subject variable and Timescales and Modalities as within-subject variables.
The ANOVA revealed a significant interaction between Condition and Timescales (F(22, 572) = 9.45, p < .0001, η² = 0.12). Irrespective of the modalities, the interaction effect indicated that the Experimental group exhibited a higher mean coherence than the virtual dyads on timescales ranging from 2 to 30 s (Fig 6A).
(A) Differences in mean coherence between the experimental condition (red) and the virtual condition (blue) across different timescales. Means and standard deviations are represented with dots and colored ribbons, respectively. Grey areas depict significant differences between conditions; here, the significant timescales are 2–30 s. (B) and (C) Multimodal head vs. voice and wrist vs. voice relative phase angles, respectively. The circular histograms depict the probability distribution for the experimental condition at timescales where wavelet coherence significantly differed from chance. The blue line represents the mean phase angle. The head vs. voice relative phase angle oscillates around 5°, the head leading the voice on average. The wrist vs. voice relative phase angle oscillates around 100°, the voice leading the wrist on average.
https://doi.org/10.1371/journal.pone.0309831.g006
The subsequent analysis focused on the Experimental condition for the ranges where wavelet coherence was significantly higher than chance. An ANOVA revealed a significant effect of Timescales (F(14, 182) = 11.01, p < .0001, η² = 0.20) and Modalities (F(1, 13) = 4.85, p < .05, η² = 0.052). The highest mean coherence was found in the 26–30 s range (M = 0.29), while the lowest was found in the 2–5 s range (M = 0.25). In addition, the coherence between Head vs. Voice (M = 0.28) was significantly higher than the coherence between Wrist vs. Voice (M = 0.27).
The ANOVA revealed a significant main effect of Condition (F(1, 26) = 13.47, p = .001, η² = 0.13). Further descriptive analysis indicated that the angle where oscillations were most concentrated (the peak of the probability distribution) was 342° (i.e., -18° in the range [-180°, 180°]) for the Head vs. Voice angles, and 108° for the Wrist vs. Voice angles. The average angle direction was -0.9° for Head vs. Voice, highlighting that, interpersonally, the voice tended to lead the head, indicating an in-phase lead (Fig 6B). The average angle direction was 97° for Wrist vs. Voice, highlighting that, interpersonally, the voice tended to lead the wrist, indicating an anti-phase lead (Fig 6C).
Discussion
The current study was designed to enrich our picture of the multimodal behavioral dynamics observed during social interaction, at both the intrapersonal and interpersonal levels. We concentrated on specific modalities, namely the voice and the head/wrist nonverbal gestures performed during an unstructured conversational task. We used cross-wavelet coherence and relative phase to investigate the degree and pattern of synchronization of these modalities across different timescales.
For our first focus, intrapersonal synchronization, we observed, as expected, higher-than-chance self-coherence both between the head and wrist movements and between each participant's voice and movements; importantly, we also found that this self-synchronization occurred at all timescales of the interaction. In exploring the pattern of these synchronizations, we found that intrapersonal synchronization exhibited an in-phase relationship in both unimodal movement synchronization and multimodal gesture-speech synchronization.
For our second focus, interpersonal synchronization, we found higher-than-chance unimodal movement-movement and voice-voice coordination, as we hypothesized, and also extended previous findings by highlighting higher-than-chance multimodal synchronization between the voice and the movements of interlocutors at specific conversation timescales. The analyses of the relative phase highlighted an in-phase relationship for unimodal movement synchronization as well as for multimodal head vs. voice synchronization, whereas an anti-phase pattern was found between the wrist and the voice interpersonally.
Intrapersonal degree of synchrony
A primary focus of this study was to analyze the degree and pattern of synchronization observed at the intrapersonal level, in terms of both unimodal and multimodal synchronization. Our results demonstrated higher-than-chance unimodal synchronization between each participant's wrist and head movements and similar synchronization in the multimodal coordination between each participant's head/wrist movements and voice. These findings are consistent with what might be expected from the way one gesticulates during speech and are supported by past research. For example, Hadar et al. [83] recorded participants' head movements during conversations and discovered that the head remains in nearly constant movement during speech. It is not surprising that head movements match closely with speech, as speakers commonly use head movements to emphasize key points or intensify words [14]. Our results also relate to Tuite's [84] Rhythmical Pulse Hypothesis, which postulates that the gestural stroke of either manual or non-manual gestures coincides with the intonation peak of spoken language [10]. This theory is illustrated in the literature on gesture-speech synchronization, in which it has been repeatedly demonstrated that manual gestures co-occur in time with the suprasegmental properties of speech, such as intonation and rhythm [10]. This process was depicted in the study of Pouw et al. [63], which demonstrated how rhythmic arm movements influence vocalization acoustics by amplifying the amplitude envelope of speech.
Moreover, our results revealed that synchronization, whether unimodal or multimodal, occurs throughout all the conversational timescales, from faster time intervals of 0.125 seconds to longer ones of 30 seconds. Underlying this observation is the view that naturalistic settings such as conversations are made up of successive vocal exchanges between individuals, referred to as turns [85].
Considering that speakers take the floor one at a time, one speaker's turn will consist of utterances, which are sequences of words, further broken down into sequences of syllables. Hence, at the intrapersonal level, these hierarchical linguistic events involve multiple embedded timescales [67, 86]. For instance, in an experiment where participants had to retell a cartoon, Pouw and Dixon [67] found that speech and hand gestures converged at periodicities relevant to syllable completion times (0.125 s to 0.5 s), gesture completion times (0.5 s to 2 s), and sentence completion times (2–6 s). These shared temporal structures between speech and gesture are consistent with a subset of our results, notably the coherence found at the fastest timescales (0.125–0.5 s), likely reflecting syllable duration. Indeed, this vocal feature is known for its consistency and stability across languages, with a characteristic oscillation around 200 ms [87].
However, our results revealed the highest coherence at the longest timescales (26–28 s), a finding not accounted for in previous speech/gesture synchronization studies. One potential explanation lies in the fact that participants' turns are highly dependent on the interpersonal dynamics of the interaction, which are in turn influenced by the type of conversational task employed. Introducing a simple modeling framework simulating the dynamics of speakers' behaviors across different task contexts, Miao et al. [88] highlighted that small changes in the task configuration (i.e., a topic or goal change) may indeed modify the structure of turns. These views match the interpersonal synergy theory, which suggests that a conversation, and therefore the turns that constitute it, cannot be fully understood at the individual-component level but must be integrated within the whole conversational organization shaped by the task constraints. Supporting this theory, a study by Dideriksen et al. [89] showed how task demands (whether requiring a high or a low level of precision) influence the rate and level of conversational entrainment to foster mutual comprehension.
In the current study, the participants were engaged in long unstructured conversations that did not necessarily involve rapid exchanges such as question-answer sessions. We noted that around 70% of the turns lasted up to 10 seconds, with 20% extending to 20 seconds. We believe this relatively slow turn rhythm could encompass discussions that delve into deeper and more open-ended subjects, such as the sharing of personal experiences or more nuanced explorations of ideas. Moreover, Yuan et al. [90] found that the topic and the relationship between speakers can affect turn length and speaking rate, with longer turns observed between strangers. The authors attributed these longer turns to the potential formality of interactions and the absence of shared knowledge between strangers. This explanation could account for our results, as participants in our study were unfamiliar with each other.
Intrapersonal pattern of synchrony
Our unimodal relative phase results indicate that the head and the wrist tend to be in-phase, with the head leading the wrist at specific timescales of 0.375–2 s and 4–5 s, whereas the multimodal coordination of voice and body reveals a higher-than-chance in-phase relative phase in the 3–18, 20–26 and 28–30 s intervals. This in-phase synchronization underscores the intimate coupling between an individual's gestures and speech. Moreover, on average the voice leads the wrist and head movements, as indicated by the mean relative phases of -34° and -18°, respectively.
On the face of it, these results seem at odds with a previous study by Pouw, Harrison et al. [69], which highlighted that repetitive arm beat movements tend to entrain phonation. The authors explained this finding by the physical coupling between arm movement and speech: when a gesture is made, the force produced increases alveolar lung pressure, which in turn modulates laryngeal pressure, leading to changes in the amplitude and intensity of vocalization. This process holds for emphasized gestures, such as simple and fast arm movements (i.e., beat gestures), as they demand greater force production [11, 69]. However, differences in the types of gestures being analyzed could account for the disparities with our results. Indeed, we analyzed the entire movement time series without systematic annotation of specific movements, which resulted in a wider range of gestures, some of them likely less physically effortful, such as simple wrist movements or natural oscillations. Our results on temporal alignment are therefore less precise, and a phase shift could have occurred between any kind of postural and vocal oscillation. This caveat applies to all our results on the relative phase.
Interpersonal degree of synchrony
The second focus of this study aimed to analyze the degree and pattern of synchronization observed at the interpersonal level, in terms of both unimodal and multimodal synchronization: between the voices and the bodies of the two speakers, as well as between the voice of one speaker and the body of the other. Unimodal voice analyses revealed higher-than-chance coherence between participants' voices at all temporal ranges except 0.375–1 s. These observations are consistent with Wilson and Wilson's [22] dynamic model of turn-taking, which proposes that speakers' oscillatory cycles are established by their syllable rate, which rhythmically entrains the listeners' oscillators. Moreover, Manson and collaborators [91] examined dyadic synchronization in vocal characteristics and found that mean syllable duration (i.e., speech rate) converged through the interaction. When two speakers talk one at a time, syllable duration convergence might not be indicative of synchronization. However, daily conversations are not that simple and often involve overlaps, whether when speakers change turns or when they use backchannels and interruptions [92]. In our study, 70% of the turn switches were found to be overlaps. Therefore, it is reasonable to believe that synchronization can occur at the syllable duration for the overlapping parts of the conversation. In addition, unimodal voice synchronization is likewise observed at larger temporal intervals, with a tendency among speakers to coordinate their voices at the turn level (i.e., speakers are more similar to each other at turn exchanges) [47]. These observations potentially explain our findings of high coherence at shorter timescales, which are associated with syllable durations, as well as high coherence at longer timescales, which are representative of turn durations.
Regarding unimodal movement-movement analyses, results revealed higher-than-chance coherence between participants' movements, with specific temporal ranges showing greater synchronization. More precisely, the participants synchronized their movements at a middle timescale of 5–7 s and at a slower timescale between 9 s and 26 s. While consistent with past literature on bodily synchronization, our findings diverge on the associated periodicity. In their knock-knock jokes task, Schmidt et al. [41] found a higher-than-expected level of synchronization between the global quantity of movement of the two interactants. They identified high synchronization every 1.5 s, when the speaking turn occurred, as well as every 6 s, at the end of the joke. However, knock-knock jokes are a highly structured conversation, and a degree of synchronization could have emerged from this inherent rhythmic organization. In a recent paper, Schmidt et al. [93] overcame this by analyzing synchronization within an interview, a less structured task. They observed synchronization occurring over longer timescales, ranging from 10 to 16 s, consistent with the timing of the interview questions. Moreover, they found synchronization even in the absence of visual cues. They discussed this outcome, stating that interpersonal coordination of speech rhythms provides the basis for interpersonal synchronization. For this reason, we posit that distinct turn rhythms significantly impact interpersonal synchronization, similar to our explanation for intrapersonal coordination. This point of view is supported by a study conducted by Fujiwara et al. [59], wherein dyads participated in unstructured conversation. They found coherence to be highest at the longer timescales (2–40 s) compared to the faster ones (0.25–2 s). Although the authors did not extract any information about the turn-taking space, they suggested that these slow rhythms are representative of everyday conversation. These findings corroborate our results, with the highest synchronization observed over longer timescales. Moreover, the task used in our study also resembles daily conversation, which might explain this similarity.
Concerning the multimodal coherence of the voice of one speaker and the body of the other, we interestingly found results similar to those of the unimodal coherence analyses. The voice of one participant and the movements of the other are synchronized above chance level at the middle and slower timescales, between 2 s and 30 s. These results are in line with our explanation for the unimodal coherence, supporting the idea that the vocal features involved in sentence and turn duration are of major importance in multimodal coherence. Another possible explanation for these specific timescales may be backchanneling, a fundamental feature of multimodal voice/gesture synchronization. Backchannels are described as feedback produced simultaneously with speech to provide speakers with real-time information about how their turn is being received [94]. According to previous studies, vocal and nonverbal backchannels occur in close temporal proximity, with vocal cues occurring every 9 s and nonverbal cues every 6 s [89, 95, 96]. Based on our results, it is possible that the high synchronization found between the voice and movements around 6 and 9 s reflects the use of backchannels.
Interpersonal pattern of synchrony
Our results on the unimodal pattern of synchrony indicate a relative phase significantly different from chance in the time intervals of 3–18, 20–26 and 28–30 s. While the wrist leads in-phase on average, there are notable instances where the head leads (as indicated by the probability peak at -36°).
For the multimodal analyses of the voice of one speaker and the body of the other, a relative phase significantly different from chance was found in the time intervals between 2 s and 30 s. On average, the voice leads wrist movement in anti-phase and leads the head in-phase. We believe the anti-phase relationship found between the wrist and the voice of the participants to be representative of the turn-taking nature of the task. Indeed, while taking the floor, the speaker actively uses their hands to emphasize their speech, whereas the listener tends to remain comparatively still [97]. In the same way, the in-phase relationship between the voice and the head could also reflect the dynamics of the conversation, notably highlighting the feedback nature of the listener's head movements and vocalizations. This explanation is consistent with previous studies observing that the feedback nods produced by listeners were close in time to their corresponding speech, preceding it by ~175–400 ms [98, 99]. The authors proposed that vocal responses and nonverbal head movements serve an interpersonal function, where the listener addresses feedback to the speaker. In other words, while a participant is speaking, the listener will likely produce backchannels in the form of head nods or soft vocalizations to provide information about their involvement without disturbing the interlocutor's speech. Low-amplitude single nods were indeed found to happen in phase with speakers' stressed syllables [100]. Moreover, this would also account for the in-phase relationship we observed between the wrist and the head of the participants, as wrist movement typically occurs during the speaking turn.
Table 1 summarizes the major findings of this study in relation to previous results. Overall, for the intrapersonal analyses, our findings support the idea of self-coordination between a speaker's voice and their movements at all timescales of the conversation, including the syllable rate. At the interpersonal level, synchronization was found at specific timescales, which we believe are relevant to the turn-taking dynamics of the conversation. Notably, coordination seems to match the speaker's turn duration. Moreover, because we found unimodal and multimodal coordination to occur at approximately the same timescales, we assume these synchronizations rely mostly on the auditory channel, specifically on speech rhythms [93]. This corroborates previous studies emphasizing that verbal information alone is sufficient for creating spontaneously coordinated movements between two people talking to each other [41, 101]. In addition, we highlighted in-phase relationships for intrapersonal synchronization as well as some anti-phase relationships for interpersonal synchronization, which likely reflect the turn-taking dynamics of the interaction. Moreover, our results suggest that some intrapersonal patterns of relative phase synchronization coincide with interpersonal ones: intrapersonal coupling between the wrist and the voice peaked at the same oscillation angle as interpersonal coupling between the head and the wrist (i.e., -36°), and intrapersonal coupling between the head and the voice, and between the head and the wrist, shared the same peak of oscillation as interpersonal coupling between the head and the voice (i.e., -18°). This similarity seems to indicate that the same mechanism underlies both intrapersonal and interpersonal communication and hence provides support for the behavioral dynamics perspective that the same dynamical processes of self-organization that constrain physical oscillators govern coordination at the behavioral and social scales of nature [93].
https://doi.org/10.1371/journal.pone.0309831.t001
Limitations and directions for future research
While our results provide insight into the multimodal dynamics of social interaction, there are also limitations to consider. First, the dataset allowed us to use only a small number of dyads, which could have lowered the statistical power of our results and complicated the generalization of our findings. Second, while all the dyads differed in their composition, one participant engaged in all the interactions. This could have influenced our findings, as this recurring participant could have induced a uniform pattern of synchronization across timescales and interactions. Third, the generation of intrapersonal surrogate time series involved shuffling within subjects using a specific segment length of 200 ms, which may not have captured all synchronization patterns comprehensively. While this segment size was chosen based on prior knowledge of the tendency of speakers to synchronize at the syllable duration (around 200 ms), it could have overlooked longer synchronization patterns [80]. We suggest that future research explore surrogate methods capable of representing both short- and long-term synchrony more effectively. Fourth, while we extracted turn duration from our manual annotations combined with a MATLAB script, automatic methods could have been employed to obtain more detailed information about participants' sentences and syllable lengths. Fifth, we believe that further vocal and gestural annotation could be beneficial in identifying and understanding the causal relationship between specific vocal features (such as turns and backchannels) and corresponding gestures. This deeper classification could provide more comprehensive insights into how the voice and movements interact and influence each other during social interaction. Sixth, while cross-wavelet coherence measures coherence in time, it may not fully capture dynamics (whether of speech or movements) that are not closely aligned in time. In support of this point is the interpersonal synergy view, which highlights that behaviors may not always synchronize but may instead complement each other (e.g., one speaks while the other listens) in a manner that ensures the coherence of the global dyadic system [32]. Finally, as described by Mogan et al. [45], the ability of individuals to synchronize their vocalizations and movements helps increase perceived social connection, positive affect, and prosocial behaviors. Extending our findings on multimodal synchronization with a more thorough classification could shed light on which behaviors lead to different prosocial outcomes, especially among individuals encountering deficits in social connection, such as individuals diagnosed with schizophrenia.
In conclusion, the current study provided evidence of unimodal and multimodal synchronization in unstructured conversation, at both the intrapersonal and interpersonal levels. While intrapersonal coordination relates to specific vocal properties such as syllable duration or backchannel timing, interpersonal coordination also seems mediated by vocal features. Notably, we found the turn-taking dynamics of the interaction to be of particular importance in the observed synchrony, likely enabling the conversation to proceed efficiently. These findings highlight the major contribution of vocal rhythm features to the specific time intervals in which synchronization occurs. Overall, this study extends previous research on interpersonal gesture synchronization and strengthens our knowledge of specific speech-gesture synchrony.
- 1. Ashenfelter KT. Simultaneous analysis of verbal and nonverbal data during conversation: symmetry and turn-taking. University of Notre Dame; 2007.
- 6. Laukka P. Vocal Communication of Emotion. In: Zeigler-Hill V, Shackelford TK, editors. Encyclopedia of Personality and Individual Differences [Internet]. Cham: Springer International Publishing; 2017 [cited 2023 Sep 12]. p. 1–6. Available from: https://doi.org/10.1007/978-3-319-28099-8_562-1
- 7. Ball MJ, Perkins MR, Müller N, Howard S. The Handbook of Clinical Linguistics. Wiley; 2008. 712 p.
- 14. Heylen D. Challenges Ahead: Head movements and other social acts in conversations. In: Proceedings of the Joint Symposium on Virtual Social Agents [Internet]. The Society for the Study of AI and the Simulation of Behav.; 2005 [cited 2023 Dec 7]. p. 45–52. Available from: https://research.utwente.nl/en/publications/challenges-ahead-head-movements-and-other-social-acts-in-conversa
- 15. Allwood J, Cerrato L. A study of gestural feedback expressions. In: First nordic symposium on multimodal communication. Copenhagen; 2003. p. 7–22.
- 23. Yang L, Achard C, Pelachaud C. Multimodal Analysis of Interruptions. In: Duffy VG, editor. Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management Anthropometry, Human Behavior, and Communication. Cham: Springer International Publishing; 2022. p. 306–25. (Lecture Notes in Computer Science).
- 24. Nguyen TA, Kharitonov E, Copet J, Adi Y, Hsu WN, Elkahky A, et al. Generative Spoken Dialogue Language Modeling [Internet]. arXiv; 2022 [cited 2023 Dec 5]. Available from: http://arxiv.org/abs/2203.16502
- 37. Kelso JS. Dynamic patterns: The self-organization of brain and behavior [Internet]. MIT press; 1995 [cited 2024 May 2]. Available from: https://books.google.com/books?hl=en&lr=&id=zpjejjytkiIC&oi=fnd&pg=PR9&dq=info:w4CGyXE1wHAJ:scholar.google.com&ots=-eeB-Ac_aw&sig=HMyxpalcWyntl9XnGcDvNhe3N5c
- 38. Kugler PN, Turvey MT. Information, natural law, and the self-assembly of rhythmic movement. Hillsdale, NJ, US: Lawrence Erlbaum Associates, Inc; 1987. xxxi, 481 p. (Information, natural law, and the self-assembly of rhythmic movement).
- 48. Lubold N, Pon-Barry H. Acoustic-Prosodic Entrainment and Rapport in Collaborative Learning Dialogues. In: Proceedings of the 2014 ACM workshop on Multimodal Learning Analytics Workshop and Grand Challenge [Internet]. Istanbul Turkey: ACM; 2014 [cited 2024 Jan 3]. p. 5–12. Available from: https://dl.acm.org/doi/10.1145/2666633.2666635
- 57. Bernieri FJ, Rosenthal R. Interpersonal coordination: Behavior matching and interactional synchrony. In: Fundamentals of nonverbal behavior. New York, NY, US: Cambridge University Press; 1991. p. 401–32. (Studies in emotion & social interaction).
- 71. Brugman H, Russel A. Annotating Multi-media/Multi-modal Resources with ELAN. In: Lino MT, Xavier MF, Ferreira F, Costa R, Silva R, editors. Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04) [Internet]. Lisbon, Portugal: European Language Resources Association (ELRA); 2004 [cited 2024 Jan 18]. Available from: http://www.lrec-conf.org/proceedings/lrec2004/pdf/480.pdf
- 78. de Jonge-Hoekstra L. How hand movements and speech tip the balance in cognitive development: A story about children, complexity, coordination, and affordances. [Groningen]: University of Groningen; 2021.
- 81. Ramseyer F, Tschacher W. Nonverbal Synchrony or Random Coincidence? How to Tell the Difference. In: Esposito A, Campbell N, Vogel C, Hussain A, Nijholt A, editors. Development of Multimodal Interfaces: Active Listening and Synchrony: Second COST 2102 International Training School, Dublin, Ireland, March 23–27, 2009, Revised Selected Papers [Internet]. Berlin, Heidelberg: Springer; 2010 [cited 2023 Jul 20]. p. 182–96. (Lecture Notes in Computer Science). Available from: https://doi.org/10.1007/978-3-642-12397-9_15