Computational Linguistics: Theses & Dissertations

Thesis collections.

Theses and dissertations are a key source for finding the latest scholarship, additional material such as data sets, and detailed research. They can also help you find out what has been written on a topic, uncover other sources through citations, and get inspiration for your own research project. Theses and dissertations are typically held in print and/or electronically by the institution where they were written, and many newer theses can be accessed online. If you cannot find a thesis or dissertation online, check with the library or academic department where it was written.

Access for all on-campus; login required from off-campus

  • ResearchWorks Archive: Linguistics Digital archive of full-text UW linguistics papers, including all UW linguistics theses and dissertations from 2011 to the present. Log in with your UW NetID for full access.
  • Older UW linguistics theses (UW Libraries Search) Hard copies of over 400 UW linguistics dissertations and theses submitted to the University Libraries, 1963–2011
  • Last Updated: Feb 25, 2024 5:38 PM
  • URL: https://guides.lib.uw.edu/research/compling

Computational Linguistics


The computational linguistics program at Stanford is one of the oldest in the country, and offers a wide range of courses and research opportunities.

We take a very broad view of computational linguistics, from theoretical investigations to practical natural language processing applications, ranging across linguistic areas like computational semantics and pragmatics, discourse and dialogue, sociolinguistics, historical linguistics, syntax and morphology, phonology, psycholinguistics, and phonetics and speech, and applications including machine translation, question answering, and sentiment analysis.

Uniting this wide variety of research is the shared ambitious goal of dealing with the complexity and the uncertainty of human language by integrating rich models of linguistic structure with sophisticated modern neural and statistical techniques.

Together with the Computer Science Department, our department houses a wide variety of research labs, reading groups, and informal workshops on computational linguistics, and we also maintain close ties with industrial natural language processing work in Silicon Valley. For more information, see the Stanford Natural Language Processing Group and the CSLI Pragmatics Lab.


University of Rochester


Master of Science in Computational Linguistics

The computational linguistics master's program at Rochester trains students to be conversant both in language analysis and in computational techniques applied to natural language.

Graduates of the computational linguistics program will be prepared both for further training at the PhD level in computer science or linguistics and for industry positions. A number of companies, such as Google, Amazon, Nuance, LexisNexis, and Oracle, are searching for employees with advanced degrees in computational linguistics for positions ranging from speech recognition technology to improving translation systems to developing better models of language understanding.

The curriculum consists of courses in linguistics and computer science, in roughly a 50/50 mix, for a total of 32 credit hours. Four courses (16 credits) are required in linguistics and four courses (16 credits) in computer science. The degree also requires a culminating written project on a topic relevant to the student's interests, developed in consultation with individual advisors.

This program's coursework can typically be completed in three full-time semesters, with a fourth semester devoted to preparing the program's final assignment, project, or thesis.

Linguistics Courses


Students are required to have completed the following prerequisite course, or its equivalent.

  • LING 110: Introduction to Linguistic Analysis

Track Courses

Within linguistics, students will work with an advisor to create a “track” for their coursework in one of three areas:

  • Sound structure (LING 410, 427, 510)
  • Grammatical structure (LING 420, 460, 461, 462, 520)
  • Meaning (LING 425, 465, 466, 468, 525, 535)

Students are encouraged to take LING 450 and LING 501 as they suit their programs.

At least one of the following:

  • LING 410: Introduction to Language Sound Systems
  • LING 420: Introduction to Grammatical Systems
  • LING 425: Introduction to Semantic Analysis

Plus at least two from the following:

  • LING 427: Topics in Phonetics and Phonology
  • LING 450: Data Science for Linguistics
  • LING 460: Syntactic Theory
  • LING 461: Phrase Structure Grammar
  • LING 462: Topics in Experimental Syntax
  • LING 465: Formal Semantics
  • LING 466: Pragmatics
  • LING 468: Computational Semantics
  • LING 481: Statistical Methods in Computational Linguistics
  • LING 482: Deep Learning Methods in Computational Linguistics
  • LING 501: Linguistics Graduate Proseminar
  • LING 520: Syntax
  • LING 525: Graduate Semantics
  • LING 527: Topics in Phonetics and Phonology
  • LING 535: Formal Pragmatics

Computer Science Courses


  • CSC 171: The Science of Programming
  • CSC 172: The Science of Data Structures
  • CSC 173: Computation and Formal Systems
  • MATH 150: Discrete Math
  • MATH 165: Linear Algebra with Differential Equations 
  • LING 424: Introduction to Computational Linguistics
  • CSC 447: Natural Language Processing
  • CSC 448: Statistical Speech and Language Processing

Plus at least two of the following:

  • CSC 440: Data Mining
  • CSC 442: Artificial Intelligence
  • CSC 444: Logical Foundations of Artificial Intelligence
  • CSC 446: Machine Learning

Program Faculty


Linguistics:

  • Ash Asudeh, Professor and Director of the Center for Language Science
  • Scott Grimm, Department Chair and Associate Professor
  • Aaron White, Associate Professor and Director of Graduate Studies

Computer science:

  • James Allen
  • Ehsan Hoque
  • Len Schubert

Theses at the Department of Theoretical Computational Linguistics

This page explains how to find a thesis topic and how to prepare for it. Everything from finding a topic to registration and submission is discussed.

This document applies to theses for which Roman Klinger or Sebastian Padó act as reviewers. The procedure is probably similar for other supervisors, but might differ slightly.

Topic Selection

The first question when you prepare to write a Master's or Bachelor's thesis is topic selection. To come up with ideas for your thesis, we propose the following sources:

  • Seminars and lectures you attended in specialized areas. Perhaps you can develop research questions that were raised in a paper discussed in class, or a topic your teacher mentioned as promising. Do you see an interesting application of a method, or a way to improve one?
  • We have a list of open topics at Ilias. Some entries may be outdated, but they give a sense of what current topics of interest are and what typical topics in our research group look like.
  • Look at our research interests and check out our recent publications. Is there something that sounds interesting? Could you come up with an idea related to one of our approaches? (See also the personal webpages of Sebastian Padó and Roman Klinger.)
  • In general, have a look at our group website and our lists of projects and publications there.

Theses in collaboration with companies or external institutes

If you work at a company or an institute outside the University of Stuttgart and there is a possibility to write a thesis in collaboration with them, we are in principle willing to support this. However, we ask that you follow this procedure:

  • At IMS, we find it very important that BSc/MSc theses have a clear scientific perspective: they typically investigate a substantial, relevant, current research question and present empirical experiments on that question using a non-trivial dataset. Implementation plays only the role of a tool in this process. This may not be easy to square with companies' interests, which are typically focused more on practical developments or improvements.
  • As a result, the process of defining a thesis typically takes substantially longer than for an internal thesis. Please take concrete steps at least two months, better three months, before the thesis should start.
  • The first step should be to ask a person at the institute or company to contact us regarding the thesis topic and its potential suitability as a thesis.

Finalization of Thesis Topic Definition: Writing a proposal

After you have found an interesting topic (or an external partner has agreed with one of us to collaborate), get in touch with your supervisor in our group. Make an appointment and discuss the thesis topic.

The next step is now to write a research proposal. Do not start to work on this before a first meeting with your supervisor!

The proposal is a short document (typically 3–5 pages) which helps in several ways:

  • You familiarize yourself with the topic.
  • The supervisor and you make sure that you both understand the topic in the same way.
  • The goals are well-defined.

Such a proposal typically consists of the following subsections:

  • Introduction and Motivation Explain the general background of your topic. What is the application, what is the task, and what is the challenge you focus on? Why is this a relevant research topic and area?
  • Previous and Related Work Introduce relevant papers regarding the task and application, as well as previous attempts to solve the task or similar tasks. From this section, it needs to become clear what the current state of research is. This then leads to...
  • Goals and Objective Based on the introduction and motivation, in which you explained the challenge you work on and the previous work, it should now be clear what the research gap is: the small area you want to fill with novel knowledge. This section is comparably short and clearly states what you plan to achieve. It is typically formulated as a research hypothesis, or as a goal describing what should be possible after you have completed your thesis that was not possible before. This section can also list the artefacts you will generate in your thesis (programs, resources, corpora, etc.).
  • Material and Methods, Approach Now the task is clear, the current state of research has been introduced, and the goals are set. In this section (often the longest in a proposal), you explain how you build on previous work to reach the goal you motivated at the beginning of the text. This can include resource generation, annotation, software design and implementation, problem analysis, implementation of a baseline system, a first prototype, an extended version, performing experiments, and analyzing the results.
  • Time Plan The approach section explains how you will do your work. This section consists of a table in which the left column gives a date and the right column names a milestone to reach. Explanations are not needed here; the milestone titles should be clear from the approach section.

Typically, a couple of iterations of writing the proposal, getting feedback from your supervisor, and refining it are needed. When you are done and your supervisor agrees, you register your thesis. Do not wait long: after the thesis is well-defined, you need to register as soon as possible. You cannot start to work and register later.

For that, you need to get the form for Bachelor's or Master's theses and obtain confirmation from the examination office that you have enough credit points to get started. You can also print this directly from C@MPUS.

After that, you get the signatures from your supervisor and then go to the responsible person in the "Sekretariat" to perform the registration. They will stamp the document and take a copy, and then you go to the examination office again.

During the thesis and when you are done

While you work on your thesis, you will meet your supervisor several times; how often is necessary and sensible depends on the topic and on what you need. When you experience delays or unexpected circumstances, talk to your supervisor; they will help you solve the problem. If you become ill and this delays your work, tell them as well.

You can extend your thesis submission deadline once, but only for a good reason. Illness is a good reason; programming taking longer than expected is not. If you need an extension, talk to your supervisor.

Are you really in trouble?

Writing a thesis is typically an enjoyable experience: you work intensively on an interesting topic, and you actually do something new. However, it can be stressful. If you experience serious issues, talk to your supervisor. In cases where this does not seem to help, we also recommend considering the psychological support at the Studentenwerk. You can also talk to your program manager or the head of the examination committee.


Roman Klinger

Adjunct Professor

  • +49 711 685 81406


Sebastian Padó

Chair of Theoretical Computational Linguistics, Managing Director of the IMS

  • +49 711 685 81400



Computer Science > Computation and Language

Title: Category-Theoretic Quantitative Compositional Distributional Models of Natural Language Semantics

Abstract: This thesis is about the problem of compositionality in distributional semantics. Distributional semantics presupposes that the meanings of words are a function of their occurrences in textual contexts. It models words as distributions over these contexts and represents them as vectors in high dimensional spaces. The problem of compositionality for such models concerns itself with how to produce representations for larger units of text by composing the representations of smaller units of text. This thesis focuses on a particular approach to this compositionality problem, namely using the categorical framework developed by Coecke, Sadrzadeh, and Clark, which combines syntactic analysis formalisms with distributional semantic representations of meaning to produce syntactically motivated composition operations. This thesis shows how this approach can be theoretically extended and practically implemented to produce concrete compositional distributional models of natural language semantics. It furthermore demonstrates that such models can perform on par with, or better than, other competing approaches in the field of natural language processing. There are three principal contributions to computational linguistics in this thesis. The first is to extend the DisCoCat framework on the syntactic front and semantic front, incorporating a number of syntactic analysis formalisms and providing learning procedures allowing for the generation of concrete compositional distributional models. The second contribution is to evaluate the models developed from the procedures presented here, showing that they outperform other compositional distributional models present in the literature. The third contribution is to show how using category theory to solve linguistic problems forms a sound basis for research, illustrated by examples of work on this topic, that also suggest directions for future research.
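To make the abstract's notion of "syntactically motivated composition operations" concrete, here is a minimal sketch in the spirit of the DisCoCat framework — not the thesis's actual implementation, and with made-up toy vectors: an adjective is modelled as a matrix that acts on noun vectors, so the phrase vector is produced by a syntax-driven tensor contraction.

```python
import numpy as np

# Toy distributional vectors over three context features
# ("furry", "barks", "meows"); the values are illustrative, not learned.
dog = np.array([0.9, 0.8, 0.1])
cat = np.array([0.9, 0.1, 0.8])

# In the categorical framework, a relational word such as an adjective
# lives in a tensor space; in the simplest instantiation it is a matrix
# acting on noun vectors.  This matrix is hypothetical.
old = np.array([[1.0, 0.0, 0.0],
                [0.0, 0.5, 0.0],
                [0.0, 0.0, 0.5]])

def compose(adj, noun):
    """Syntax-driven composition: apply the adjective matrix to the noun."""
    return adj @ noun

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

old_dog = compose(old, dog)   # a vector for the phrase "old dog"
# Phrase vectors can then be compared just like word vectors:
print(cosine(old_dog, dog) > cosine(old_dog, cat))  # prints True
```

Verbs and other higher-arity relational words are handled analogously with higher-order tensors, with the syntactic analysis (e.g., a pregroup or CCG derivation) determining which contractions to perform.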


Master of Science in Computational Linguistics

Academic Experience


Program Format

The master's in computational linguistics is a nine-course, 43-credit program that culminates with a master's project. The format is flexible — you can study part time or full time and take classes online, on campus or both.

Full-time students take three courses per quarter for three quarters and then complete the master's project over the summer, finishing the program in 12 months. Part-time students take one or two courses per quarter and finish in two to three years.

The program is offered in a hybrid format, giving you the choice to either attend class on the UW campus or watch online via a live webcast. Classes meet during the day but are recorded and made available for later viewing. Find out more about online learning .

The curriculum consists of a sequence of four core computational linguistics courses, two linguistics courses and a choice of three electives. You'll also complete a master's project, which can take the form of a thesis or an internship.

With its mix of courses in computational linguistics, linguistics and related fields, the curriculum is designed to immerse you in the latest natural language processing technologies while providing essential theoretical knowledge that will be relevant for years to come. You’ll gain a strong foundation in the methodologies of language technology, including expertise in state-of-the-art techniques for evaluating results and experience creating solutions for end-to-end systems. Learn more about program courses . 

Extensive hands-on work provides opportunities for in-depth exploration and research, and collaborative projects allow you to tackle challenging problems in a manner closely modeled on industry practices.

Master’s Project (Thesis or Internship)

To conclude the program, you have the choice of completing a thesis or a six- to 10-week internship. The thesis project gives you experience with independent research and academic writing, and typically involves the implementation or evaluation of working systems. 

With the internship option, you'll gain practical experience working in computational linguistics at a local company. Our students have had considerable success in securing positions at companies such as Google, Microsoft, PARC, Adapx, VoiceBox Technologies, InQuira and Amazon. 

Single Course Enrollment and Certificate Option

Students who are interested in studying computational linguistics but not yet ready to take on a full master's degree program may want to consider taking individual program courses. See the Single Course Enrollment page for more details.

The UW also offers an embedded certificate program, the Certificate in Natural Language Technology , which consists of a summer fundamentals refresher course and the first two courses in the core sequence. You can earn up to eight graduate credits toward degree requirements should you later be accepted into the master’s program. To use those credits, you need to obtain graduate nonmatriculated status before you register for autumn quarter.


Modeling Thesis Clarity in Student Essays

Isaac Persing , Vincent Ng

Isaac Persing and Vincent Ng. 2013. Modeling Thesis Clarity in Student Essays. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 260–269, Sofia, Bulgaria. Association for Computational Linguistics. https://aclanthology.org/P13-1026

Graduate Studies in Computational Linguistics

Our Master of Science (MS) in Computational Linguistics curriculum does an excellent job of preparing students for careers in speech recognition, artificial intelligence, machine translation, big data, automated text analysis and web search. The highlights of our curriculum include:

  • In-depth programming training, so that students know how to build algorithms from scratch.
  • Best practices for programming and cutting-edge technology from the field.
  • Skill-building courses that complement on-campus computational linguistics research with faculty and off-campus internships.
  • A required internship, capstone project and/or thesis (our “exit requirement”), through which students gain invaluable work and independent research experience to add to their portfolios and to strengthen their job or PhD applications.
  • Tailored, one-on-one faculty advising, starting at orientation and continuing throughout the program, to build a course plan that meets your needs.


The program requires students to complete at least 12 courses, a combination of core, background and elective courses. All students must take our five core computational linguistics courses. During orientation, students will meet with a faculty advisor to determine which of the six background courses (in linguistics and/or computer programming) will be required and to discuss elective options.

Students will satisfy the remainder of their 12-course requirement by taking electives and one "exit requirement" course. Representing a culmination of their learning in the program, the exit requirement allows students to use the skills they’ve developed in an internship, capstone project and/or thesis. Students enroll in a course in order to receive credit for the work in their internship, capstone project and/or thesis.

The First Year

The goal is for all students to emerge from the first year with:

  • a strong foundation in the basics of both computer science and formal linguistics
  • facility and comfort with the fundamental techniques, goals, and methodology of computational linguistics, natural language processing, and corpus linguistics.

Core Courses

  • COSI 114a Fundamentals in Natural Language Processing I
  • COSI 115b Fundamentals in Natural Language Processing II
  • COSI 230b Natural Language Annotation for Machine Learning (formerly COSI 140b)

Background Courses

  • COSI 10a Introduction to Problem Solving in Python
  • COSI 12b Advanced Programming Techniques in Java
  • COSI 21a Data Structures and the Fundamentals of Computing
  • LING 120b Syntax I
  • LING 130a Introduction to Formal Semantics
  • LING 160b Mathematical Methods for Computational Linguistics

Any additional room in the first-year schedule is devoted to developing and strengthening the student’s computer programming abilities, along with taking other computer science or linguistics electives of interest to the particular student. Although not satisfying any requirements toward the MS degree, students can also opt to add courses of interest from other disciplines, such as foreign language study.

The Second Year

The goal in the second year is to build more advanced programming skills in preparation for the job market or PhD application.

  • COSI 231a Advanced Machine Learning Methods for Natural Language Processing  (formerly COSI 134a)
  • COSI 232b Information Extraction (formerly COSI 137b)
  • COSI 293b Internship or COSI 295a Capstone Project or COSI 299a Thesis

Additional advanced courses on applied or theoretically oriented topics within computational linguistics and natural language processing can include:

  • COSI 112a Modal, Temporal, and Spatial Logic for Language
  • COSI 132a Information Retrieval
  • COSI 135b Computational Semantics
  • COSI 136a Automated Speech Recognition
  • COSI 138a Computational Linguistics Second Year Seminar
  • COSI 216a Topics in Natural Language Processing
  • COSI 217b Natural Language Processing Systems
  • COSI 233a Discourse and Dialog

For more detailed program information, please review our Student Handbook and the University Bulletin. The most relevant sections of the Student Handbook for prospective students include: Degree Requirements, Course Selection, and the Exit Requirement.

The minimum residence requirement for full-time students is two years, i.e., four semesters of full-time enrollment.


For questions, please contact:

Division of Science, Graduate Affairs Office
[email protected]
781-736-3148


Linguistics with a Specialisation in Computational Linguistics MA

London, Bloomsbury

The Linguistics MA with a specialisation in Computational Linguistics aims to give students a thorough grounding in both theoretical and computational linguistics. Students gain a basic understanding of the three core areas of linguistics: phonetics and phonology; syntax; and semantics and pragmatics, plus they will be introduced to algorithms and models that implement linguistic theories and form the basis of modern natural language processing systems. Through option modules, students are also able to tailor their programme to meet their personal research interests.


Entry requirements

Normally a minimum of a second-class Bachelor's degree from a UK university or an overseas qualification of an equivalent standard is required. Applicants must be able to demonstrate that they have foundational knowledge in at least one procedural programming language (e.g., Python or Java).

The English language level for this programme is: Level 2

UCL Pre-Master's and Pre-sessional English courses are for international students who are aiming to study for a postgraduate degree at UCL. The courses will develop your academic English and academic skills required to succeed at postgraduate level.

Further information can be found on our English language requirements page.

Equivalent qualifications

Country-specific information, including details of when UCL representatives are visiting your part of the world, can be obtained from the International Students website .

International applicants can find out the equivalent qualification for their country by selecting from the list below. Please note that the equivalency will correspond to the broad UK degree classification stated on this page (e.g. upper second-class). Where a specific overall percentage is required in the UK qualification, the international equivalency will be higher than that stated below. Please contact Graduate Admissions should you require further advice.

About this degree

Students gain knowledge and understanding of current research in computational linguistics and are prepared for independent research. On completion of the programme, they will be able to formulate appropriate research questions, find and evaluate relevant literature, develop and test new hypotheses, and produce cogent, structured and professionally presented reports.

Students will use symbolic models, deep learning models, or crowdsourcing-based data collection and experimental methods, and will receive extensive training in research methods and the scholarly presentation of ideas.

Who this course is for

The programme is designed for students with a background either in linguistics, cognitive science or computer science who wish to pursue an interest in computational linguistics.

What this course will give you

The UCL Division of Psychology and Language Sciences undertakes world-leading research and teaching in linguistics, language, mind, and behaviour. More specifically, UCL Linguistics is one of the leading departments for research in theoretical linguistics in the UK and its staff includes world leaders in theoretical syntax, semantics, pragmatics, phonology, and experimental linguistics.

Our work attracts staff and students from around the world. Together they create an outstanding and vibrant environment, taking advantage of cutting-edge resources such as a behavioural neuroscience laboratory, a centre for brain imaging, and extensive laboratories for research in speech and language, perception, and cognition.

Our world-class research is characterised by a tight integration of theoretical and experimental work spanning the full range of the linguistic enterprise and forms the bedrock of the department's eminent reputation, which is also reflected in other markers of excellence, such as its editorial involvement with top journals in the field.

You can find further information at ucl.ac.uk/pals/study/masters.

The foundation of your career

This Linguistics MA equips graduates with the skills needed to carry out advanced research in linguistics, with a particular focus on computational linguistics. It also provides transferable skills that prepare students for a wide range of careers within and outside academia, including analytical thinking, problem solving, project management, and written and oral presentation.

These skills open up opportunities in many different sectors, including language and speech technology, but also in language teaching, translating and interpreting, marketing, communication, journalism, management, and law.


Although the degree can be an end in itself, this advanced programme is an excellent preparation for independent doctoral research in computational linguistics. Graduates from our specialisation MA programmes in linguistics have a very strong track record of securing funded doctoral studentships and have in recent years gone on to research at UCL, MIT, Cambridge, the University of Massachusetts Amherst, and the Universitat Autònoma de Barcelona. Recent graduates have also gone on to work at prominent technology companies such as Google.

Students have ample opportunity to interact with world-renowned researchers in linguistics and other areas within the Division of Psychology and Language Sciences; they will serve as your teachers, mentors, and research supervisors throughout the programme.

The department also runs a number of research seminars and invited lectures throughout the year, allowing students to engage with prominent researchers from other universities.

Finally, students have the opportunity to engage with UCL’s Career support service and to connect to UCL’s extensive alumni network.

Teaching and learning

The teaching and assessment of this programme is strongly research-orientated. It is delivered through a combination of lectures, small-group teaching and a virtual learning environment. Some modules also involve workshops or practical classes.

Assessment is through take-home and unseen examination, essays, presentations, assignments and a research dissertation.

Each 15-credit taught module has approximately 30 hours of contact time with staff (including lectures, seminars, and tutorials). Students are expected to supplement these contact hours with additional time spent reading, studying, and preparing for assessments, for a total of 150 hours of work per 15-credit module. Additional contact time will be offered during the year in the form of staff office hours, optional workshops, and research seminars.

The Linguistics MA programme with specialisation in Computational Linguistics consists of five compulsory core modules, two option modules chosen from a group of three, one elective module, and the dissertation.

In the first term, you will take four compulsory modules covering the core areas of linguistics: phonetics/phonology, syntax, semantics/pragmatics, and computational linguistics. In the second term, you will take another compulsory module in computational linguistics and two further option modules from a choice of three: phonology, syntax, or semantics/pragmatics.

In addition, you will choose an elective module from a range of choices in consultation with your programme director. These can be taken in term 1 or term 2, but they are mostly taken in term 2. You will begin working on the dissertation in May and continue to work on it over the summer months. The due date for the dissertation is in late August or early September.

Part-time students take the same modules as full-time students but spread over two years. They take the dissertation in Year 2 of their studies.

Compulsory modules

Optional modules

Please note that the list of modules given here is indicative. This information is published a long time in advance of enrolment and module content and availability are subject to change. Modules that are in use for the current academic year are linked for further information. Where no link is present, further information is not yet available.

Students undertake modules to the value of 180 credits. Upon successful completion of 180 credits, you will be awarded an MA in Linguistics with a Specialisation in Computational Linguistics.


Details of the accessibility of UCL buildings can be obtained from AccessAble (accessable.co.uk). Further information can also be obtained from the UCL Student Support and Wellbeing team.

Fees and funding

Fees for this course.

The tuition fees shown are for the year indicated above. Fees for subsequent years may increase or otherwise vary. Where the programme is offered on a flexible/modular basis, fees are charged pro-rata to the appropriate full-time Master's fee taken in an academic session. Further information on fee status, fee increases and the fee schedule can be viewed on the UCL Students website: ucl.ac.uk/students/fees .

Additional costs

This programme has no additional costs.

For more information on additional costs for prospective students please go to our estimated cost of essential expenditure at Accommodation and living costs .

Funding your studies

For a comprehensive list of the funding opportunities available at UCL, including funding relevant to your nationality, please visit the Scholarships and Funding website .

There is an application processing fee for this programme of £90 for online applications and £115 for paper applications. Further information can be found at Application fees .

When we assess your application we would like to learn:

  • why you want to study Linguistics with a specialisation in Computational Linguistics at graduate level
  • why you want to study Linguistics with a specialisation in Computational Linguistics at UCL
  • what particularly attracts you to the chosen programme
  • how your academic and professional background meets the demands of this rigorous programme

Together with essential academic requirements, the personal statement is your opportunity to illustrate whether your reasons for applying to this programme match what the programme will deliver.

Please note that you may submit applications for a maximum of two graduate programmes (or one application for the Law LLM) in any application cycle.

Choose your programme

Please read the Application Guidance before proceeding with your application.

Year of entry: 2024-2025

Got questions? Get in touch.

Division of Psychology and Language Sciences

[email protected]

UCL is regulated by the Office for Students .



Systemic Functional Insights on Language and Linguistics, pp. 125–145

Computational Linguistics

  • Christian M. I. M. Matthiessen 7 ,
  • Bo Wang 8 ,
  • Yuanyi Ma 9 &
  • Isaac N. Mwinlaaru 10  
  • First Online: 06 April 2022


Part of the book series: The M.A.K. Halliday Library Functional Linguistics Series ((TMAKHLFLS))

This chapter first summarizes the contributions of Systemic Functional Linguistics (SFL) to computational linguistics. It elaborates on Martin Kay’s Functional Unification Grammar, highlights the achievements of the Penman Project on text generation directed by William C. Mann and comments on the influences from computational linguistics on SFL. The connections between Cardiff Grammar and Nigel Grammar are also discussed.


See: https://en.wikipedia.org/wiki/Frederick_Parker-Rhodes .

For some indications of the correspondences between Roget’s lexical taxonomy and systemic functional descriptions of lexicogrammar, see Halliday ( 1976 ) and Matthiessen ( 1995a ).

See: http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/kpml-description.htm .

See: https://nlp.stanford.edu/software/nndep.html .

Alves, Fabio, Adriana Pagano, Stella Neumann, Erich Steiner & Silvia Hansen-Schirra. 2010. “Translation units and grammatical shifts: Towards an integration of product- and process-based translation research.” In Gregory M. Shreve & Erik Angelone (eds.), Translation and cognition. Amsterdam & Philadelphia: John Benjamins. 109–142.


Bateman, John A. 1989. “Dynamic systemic-functional grammar: A new frontier.” WORD 40(1–2): 263–287.

Bateman, John A. 1996. KPML development environment — multilingual linguistic resource development and sentence generation . Manual for release 0.9, March 1996. IPSI/GMD, Darmstadt, Germany.

Bateman, John A. 1997. “Enabling technology for multilingual natural language generation: The KPML development environment.” Journal of Natural Language Engineering 3(1): 15–55.

Bateman, John A. 2008a. “Systemic Functional Linguistics and the notion of linguistic structure: Unanswered questions, new possibilities.” In Jonathan J. Webster (ed.), Meaning in context: Implementing intelligent applications of language studies . London & New York: Continuum. 24–58.

Bateman, John A. 2008b. Multimodality and genre: A foundation for the systematic analysis of multimodal documents. London & New York: Palgrave Macmillan.

Bateman, John A. & Christian M.I.M. Matthiessen. 1993. “The text base in generation.” In Keqi Hao, Hermann Bluhme & Renzhi Li (eds.), Proceedings of the International Conference on Texts and Language Research , Xi’an, 29–31 March 1989. Xi’an: Xi’an Jiaotong University Press. 3–45.

Bateman, John A. & Stefan Momma. 1991. The nondirectional representation of systemic functional grammars and semantics as typed feature structures . Technical report, GMD, Institute für Integrierte Publikations- und Informationssysteme, Darmstadt & Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart.

Bateman, John & Mick O’Donnell. 2015. “Computational linguistics: The Halliday connection.” In Jonathan J. Webster (ed.), The Bloomsbury companion to M.A.K. Halliday . London & New York: Bloomsbury. 453–467.

Bateman, John A., Renate Henschel & Judy Delin. 2002. “A brief introduction to the GEM annotation schema for complex document layout.” In Graham Wilcock, Nancy Ide & Laurent Romary (eds.), Proceedings of the 2nd Workshop on NLP and XML (NLPXML-2002) — Post-Conference Workshop of the 19th International Conference on Computational Linguistics (COLING-2002) . Taipei: Association of Computational Linguistics and Chinese Language Processing, Academia Sinica, Taiwan. 13–20.

Bateman, John A., Christian M.I.M. Matthiessen & Zeng Licheng. 1999. “Multilingual language generation for multilingual software: A functional linguistic approach.” Applied Artificial Intelligence: An International Journal 13(6): 607–639.

Bateman, John A., Joana Hois, Robert Ross & Thora Tenbrink. 2010. “A linguistic ontology of space for natural language processing.” Artificial Intelligence 174: 1027–1071.

Bateman, John, Christian Matthiessen, Keizo Nanri & Licheng Zeng. 1991. “The rapid prototyping of natural language generation components: An application of functional typology.” Proceedings of the 12th International Conference on Artificial Intelligence, Sydney, 24–30 August 1991. Sydney. San Mateo, CA: Morgan Kaufman. 966–971.

Bateman, John, Daniel McDonald, Tuomo Hiippala, Daniel Couto-Vale & Eugeniu Costetchi. 2019. “Systemic Functional Linguistics and computation: New directions, new challenges.” In Geoff Thompson, Wendy L. Bowcher, Lise Fontaine & David Schöntal (eds.), The Cambridge handbook of Systemic Functional Linguistics. Cambridge: Cambridge University Press. 561–586.

Boas, Hans C. & Ivan A. Sag (eds.). 2010. Sign-based construction grammar. Stanford: Center for the Study of Language and Information.

Bohm, David. 1979. Wholeness and the implicate order. London: Routledge & Kegan Paul.

Brachman, Ronald J. 1978. A structural paradigm for representing knowledge . BBN Report No. 3605, Bolt Beranek and Newman, Inc. Cambridge, MA.

Brachman, Ronald J. & Hector J. Levesque. (eds.). 1985. Readings in knowledge representation . Los Altos, CA: Morgan Kaufman.

Bresnan, Joan, Ash Asudeh, Ida Toivonen & Stephen Wechsler. 2016. Lexical-functional syntax. 2nd edition. Hoboken: Wiley Blackwell.

Carpenter, Bob. 1992. The logic of typed feature structures: With applications to unification grammars, logic programs and constraint resolution. Cambridge: Cambridge University Press.

Cross, Marilyn. 1992. “Choice in lexis: Computer generation of lexis as most delicate grammar.” Language Sciences 14(4): 579–607.

Davey, Anthony. 1978. Discourse production: A computer model of some aspects of a speaker . Edinburgh: Edinburgh University Press.

Elhadad, Michael & Jacques Robin. 1999. “SURGE: A comprehensive plug-in syntactic realization component for text generation.” Computational Linguistics 99(4): 1–44.

Fawcett, Robin P. 1981. “Generating a sentence in systemic-functional grammar.” In M.A.K. Halliday & J.R. Martin (eds.), Readings in systemic linguistics . London: Batsford. 146–183.

Fawcett, Robin P. 1988. “Language generation as choice in social interaction.” In Michael Zock & Gerard Sabah (eds.), Advances in natural language generation . London: Pinter. 27–49.

Fawcett, Robin P. 1994. “A generationist approach to grammar reversibility in natural language processing.” In Tomek Strzalkowski (ed.), Reversible grammar in natural language processing . Dordrecht: Kluwer. 365–414.

Francez, Nissim & Shuly Wintner. 2012. Unification grammars. Cambridge: Cambridge University Press.

Gross, Maurice. 1979. “On the failure of generative grammar.” Language 55(4): 859–885.

Halliday, M.A.K. 1956a. “The linguistic basis of a mechanical thesaurus, and its application to English preposition classification.” Mechanical Translation 3: 81–88. Reprinted in M.A.K. Halliday. 2005. Computational and quantitative Studies. Volume 6 in the Collected works of M.A.K. Halliday . Edited by Jonathan J. Webster. London & New York: Continuum. 6–19.

Halliday, M.A.K. 1956b. “Grammatical categories in modern Chinese.” Transactions of the Philosophical Society 1956: 177–224. Reprinted in M.A.K. Halliday. 2005. Studies in Chinese language. Volume 8 in the Collected works of M.A.K. Halliday . Edited by Jonathan J. Webster. London & New York: Continuum. 209–248.

Halliday, M.A.K. 1962. “Linguistics and machine translation.” Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung 15: 145–158. Reprinted in M.A.K. Halliday. 2005. Computational and quantitative studies. Volume 6 in the Collected works of M.A.K. Halliday . Edited by Jonathan J. Webster. London & New York: Continuum. 20–36.

Halliday, M.A.K. 1976. System and function in language. Edited by Gunther Kress. London: Oxford University Press.

Halliday, M.A.K. 2007. “Applied linguistics as an evolving theme.” In Jonathan J. Webster (ed.), Language and education. Volume 9 in the Collected works of M.A.K. Halliday . London & New York: Continuum. 1–19.

Halliday, M.A.K. 2008. “Working with meaning: Towards an appliable linguistics.” In Jonathan J. Webster (ed.), Meaning in context: Implementing intelligent applications of language studies . London & New York: Continuum. 7–23.

Halliday, M.A.K. & Christian M.I.M. Matthiessen. 1999/2006. Construing experience through meaning: A language-based approach to cognition . London: Continuum.

Hasan, Ruqaiya. 1987. “The grammarian’s dream: Lexis as most delicate grammar.” In M.A.K. Halliday & Robin P. Fawcett (eds.), New developments in systemic linguistics: Theory and description (volume 1). London: Pinter. 184–211. Reprinted in Ruqaiya Hasan. 2019. Jonathan J. Webster & Carmel Cloran (eds.), Describing language: Form and function. Volume 5 in the Collected works of Ruqaiya Hasan . London & New York: Continuum. 143–173.

Henrici, Alick. 1965. “Notes on the systemic generation of a paradigm of the English clause.” In M.A.K. Halliday & J.R. Martin (eds.), 1981, Readings in systemic linguistics. London: Batsford. 74–98.

Honnibal, Matthew. 2004. Adapting the Penn Treebank to Systemic Functional Grammar: Design, creation and use of a metafunctionally annotated corpus. BA Honours thesis, Macquarie University, Sydney.

Joshi, Aravind K. & Yves Schabes. 1997. “Tree-Adjoining Grammars.” In Grzegorz Rozenberg & Arto Salomaa (eds.), Handbook of formal languages (volume 3) . Berlin & Heidelberg: Springer. 69–123.

Kaplan, Ronald & Joan Bresnan. 1982. “Lexical-functional grammar: A formal system for grammatical representation.” In Joan Bresnan (ed.), The mental representation of grammatical relations . Cambridge, MA: The MIT Press. 173–281.

Kasper, Robert. 1988a. “Systemic Grammar and Functional Unification Grammar.” In James D. Benson & William S. Greaves (eds.), Systemic functional approaches to discourse. Norwood: Ablex. 176–199.

Kasper, Robert. 1988b. “An experimental parser for systemic grammars.” The 12th International Conference on Computational Linguistics. Budapest, Hungary. COLING. 309–312.

Kay, Martin. 1979. “Functional grammar.” In Proceedings of the Fifth Annual Meeting of the Berkeley Linguistic Society . Berkeley, UC. 142–158.

Kay, Martin. 1985. “Parsing in functional unification grammar.” In Barbara J. Grosz, Karen S. Jones & Bonnie Lynn Webber (eds.), Readings in natural language processing . Los Altos, CA: Morgan Kaufmann. 125–138.

Mann, William C. 1982. An overview of the Penman text generation system . Information Sciences Institute, University of Southern California: ISI/RR-83–114.

Mann, William C. & Christian M.I.M. Matthiessen. 1985. “Demonstration of the Nigel text generation grammar.” In James D. Benson & William S. Greaves (eds.), Systemic perspectives on discourse (volume 1) . Norwood: Ablex. 50–83.

Matthiessen, Christian M.I.M. 1985. “The systemic framework in text generation: Nigel.” In James D. Benson & William S. Greaves (eds.), Systemic perspectives on discourse (volume 1) . Norwood: Ablex. 96–118.

Matthiessen, Christian M.I.M. 1988a. “Representational issues in Systemic Functional Grammar.” In James D. Benson & William S. Greaves (eds.), Systemic perspectives on discourse (volume 1) . Norwood: Ablex. 136–175.

Matthiessen, Christian M.I.M. 1988b. “Semantics for a systemic grammar: The chooser and inquiry framework.” In James D. Benson, Michael J. Cummings & William S. Greaves (eds.), Linguistics in a systemic perspective. Amsterdam & Philadelphia: John Benjamins. 221–242.

Matthiessen, Christian M.I.M. 1991. “Lexico(grammatical) choice in text-generation.” In Cécile Paris, William Swartout & William C. Mann (eds.), Natural language generation in artificial intelligence and computational linguistics . Dordrecht: Kluwer. 249–292.

Matthiessen, Christian M.I.M. 1993. “Instantial systems and logogenesis.” Written version of Paper presented at the Third Chinese Systemic-functional symposium, Hangzhou University, Hangzhou, June 17–20, 1993.

Matthiessen, Christian M.I.M. 1995a. Lexicogrammatical cartography: English systems. Tokyo: International Language Sciences Publishers.

Matthiessen, Christian M.I.M. 1995b. “THEME as an enabling resource in ideational ‘knowledge’ construction.” In Mohsen Ghadessy (ed.), Thematic developments in English texts. London & New York: Pinter. 20–55.

Matthiessen, Christian M.I.M. 1995c. “Fuzziness construed in language: A linguistic perspective.” Proceedings of FUZZ/IEEE, Yokohama, March 1995. Yokohama. 1871–1878.

Matthiessen, Christian M.I.M. 2002. “Lexicogrammar in discourse development: Logogenetic patterns of wording.” In Guowen Huang & Zongyan Wang (eds.), Discourse and language functions. Beijing: Foreign Language Teaching and Research Press. 91–127.

Matthiessen, Christian M.I.M. 2018. “The notion of a multilingual meaning potential: A systemic exploration.” In Akila Sellami-Baklouti & Lise Fontaine (eds.), Perspectives from Systemic Functional Linguistics. Abingdon & New York: Routledge. 90–120.

Matthiessen, Christian M.I.M. & John A. Bateman. 1991. Text generation and systemic-functional linguistics: Experiences from English and Japanese . London: Frances Pinter.

Matthiessen, Christian M.I.M. & Christopher Nesbitt. 1996. “On the idea of theory — neutral descriptions.” In Ruqaiya Hasan, Carmel Cloran & David Butt (eds.), Functional descriptions: Theory in practice. Amsterdam: Benjamins. 39–85.

Matthiessen, Christian M.I.M., Licheng Zeng, Marilyn Cross, Ichiro Kobayashi, Kazuhiro Teruya & Canzhong Wu. 1998. “The Multex generator and its environment: Application and development.” Proceedings of the International Generation Workshop ’98, August ’98, Niagara-on-the-Lake. 228–237.

McKeown, Kathleen. 1982. Generating natural language text in response to questions about database structure . PhD thesis, University of Pennsylvania, Philadelphia.

McKeown, Kathleen. 1985. Text generation: Using discourse strategies and focus constraints to generate natural language text . Cambridge: Cambridge University Press.

Mohan, Bernhard A. 1986. Language and content. Reading, Mass.: Addison-Wesley.

Neale, Amy. 2002. More delicate transitivity : Extending the process type system networks for English to include full semantic classifications. PhD thesis, Cardiff University, Cardiff.

O’Donnell, Michael. 1990. “A dynamic model of exchange.” WORD 41(3): 293–328.

O’Donnell, Michael. 1994. Sentence analysis and generation: A systemic perspective. PhD thesis, University of Sydney, Sydney.

O’Donnell, Michael. 2012. UAM CorpusTool: Version 2.8 User Manual .

O’Donnell, Michael & Peter Sefton. 1995. “Modelling telephonic interaction: A dynamic approach.” Journal of Applied Linguistics 10(1): 63–78.

O’Donnell, Mick & John Bateman. 2005. “SFL in computational contexts: A contemporary history.” In Ruqaiya Hasan, Christian M.I.M. Matthiessen & Jonathan J. Webster (eds.), Continuing discourse on language: A functional perspective (volume 1) . London: Equinox. 343–382.

O’Halloran, Kay. 2003. “Systemics 1.0: Software for research and teaching Systemic Functional Linguistics.” RELC Journal 34(2): 157–158.

O’Halloran, Kay L., Sabine Tan, Peter Wignell, John A. Bateman, Duc-Son Pham, Michele Grossman & Andrew Vande Moere. 2016. “Interpreting text and image relations in violent extremist discourse: A mixed methods approach for big data analytics.” Terrorism and Political Violence 31(3): 454–474.

Parker-Rhodes, A.F. 1978. Inferential semantics. Atlantic Highlands, NJ: Humanities Press.

Patten, Terry. 1988. Systemic text generation as problem solving. Cambridge: Cambridge University Press.

Peters, P. Stanley & R.W. Ritchie. 1973. “On the generative power of transformational grammars.” Information Sciences 6: 48–83.

Pollard, Carl & Ivan A. Sag. 1993. Head-driven phrase structure grammar . Chicago & London: University of Chicago Press.

Roget, Peter Mark. 1852. Roget’s thesaurus of English words and phrases . Essex: Longman.

Shapiro, Stuart C. 1982. “Generalized augmented transition network grammars for generation from semantic networks.” American Journal of Computational Linguistics 8(1): 12–25.

Shieber, Stuart M. 1986. An introduction to unification-based approaches to grammar. Stanford: CSLI Publications.

Simmons, Robert & Jonathan Slocum. 1972. “Generating English discourse from semantic networks.” Communications of the ACM 15(10): 891–905.

Stockwell, Robert P., Paul Schachter & Barbara Partee. 1973. Major syntactic structures of English. New York: Holt, Rinehart and Winston.

Teich, Elke. 1995. A proposal for dependency in Systemic Functional Grammar: Metasemiosis in computational Systemic Functional Linguistics . PhD thesis, Universität des Saarlandes, Saarbrücken.

Teich, Elke. 1999. Systemic Functional Grammar in natural language generation: Linguistic description and computational representation . London & New York: Cassell.

Teich, Elke. 2009. “Computational linguistics.” In M.A.K. Halliday & Jonathan Webster (eds.), A companion to Systemic Functional Linguistics. London & New York: Continuum. 113–127.

Tucker, Gordon H. 1998. The lexicogrammar of adjectives: A systemic functional approach to lexis. London: Cassell.

Wanner, Leo. 1997. Exploring lexical resources for text generation in a systemic functional language model. PhD thesis, Universität des Saarlandes, Saarbrücken.

Webster, Jonathan J. 1993. “Text processing using the Functional Grammar Processor (FGP).” In Mohsen Ghadessy (ed.), Register analysis: Theory and practice. London & New York: Pinter. 181–195.

Weerasinghe, A. Ruvan. 1994. Probabilistic parsing in Systemic Functional Grammar. PhD thesis, University of Wales College of Cardiff, Cardiff.

Weerasinghe, A. Ruvan & Robin P. Fawcett. 1993. “Probabilistic incremental parsing in Systemic Functional Grammar.” In Harry Bunt & Masaru Tomita (eds.), Proceedings of the Third Workshop on Parsing Technologies. Tilburg: Institute for Language Technology and Artificial Intelligence. 349–367.

Winograd, Terry. 1972. Understanding natural language . Edinburgh: Edinburgh University Press.

Winograd, Terry. 1983. Language as a cognitive process: Syntax . Reading, Mass.: Addison Wesley Pub.

Woods, William. 1975. “What’s in a link: Foundations for semantic networks.” In Daniel G. Bobrow & Allan Collins (eds.), Representation and understanding: Studies in cognitive science. New York: Academic Press. 35–82.

Wu, Canzhong. 2000. Modelling linguistic resources: A systemic functional approach. PhD thesis, Macquarie University, Sydney.

Zappavigna, Michele. 2011. “Visualizing logogenesis: Preserving the dynamics of meaning.” In Shoshana Dreyfus, Susan Hood & Maree Stenglin (eds.), Semiotic margins: Meaning in multimodalities. London & New York: Continuum. 211–229.

Download references

Author information

Authors and affiliations.

School of International Studies, University of International Business and Economics, Beijing, China

Christian M. I. M. Matthiessen

School of Translation Studies, Jinan University, Zhuhai, China

Bo Wang

School of International Cooperation, Guangdong Polytechnic of Science and Technology, Zhuhai, China

Yuanyi Ma

Department of English, University of Cape Coast, Cape Coast, Ghana

Isaac N. Mwinlaaru



Copyright information

© 2022 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter.

Matthiessen, C.M., Wang, B., Ma, Y., Mwinlaaru, I.N. (2022). Computational Linguistics. In: Systemic Functional Insights on Language and Linguistics. The M.A.K. Halliday Library Functional Linguistics Series. Springer, Singapore. https://doi.org/10.1007/978-981-16-8713-6_5

Download citation

DOI: https://doi.org/10.1007/978-981-16-8713-6_5

Published: 06 April 2022

Publisher Name: Springer, Singapore

Print ISBN: 978-981-16-8712-9

Online ISBN: 978-981-16-8713-6

eBook Packages: Education, Education (R0)





Computational Linguistics and Deep Learning

1. The Deep Learning Tsunami
2. The Success of Deep Learning
3. Why Computational Linguists Need Not Worry
4. Deep Learning of Language
5. Scientific Questions That Connect Computational Linguistics and Deep Learning
Acknowledgments

Departments of Computer Science and Linguistics, Stanford University, Stanford CA 94305-9020, U.S.A. E-mail: [email protected] .


Christopher D. Manning; Computational Linguistics and Deep Learning. Computational Linguistics 2015; 41 (4): 701–707. doi: https://doi.org/10.1162/COLI_a_00239


Deep Learning waves have lapped at the shores of computational linguistics for several years now, but 2015 seems like the year when the full force of the tsunami hit the major Natural Language Processing (NLP) conferences. However, some pundits are predicting that the final damage will be even worse. Accompanying ICML 2015 in Lille, France, there was another, almost as big, event: the 2015 Deep Learning Workshop. The workshop ended with a panel discussion, and at it, Neil Lawrence said, “NLP is kind of like a rabbit in the headlights of the Deep Learning machine, waiting to be flattened.” Now that is a remark that the computational linguistics community has to take seriously! Is it the end of the road for us? Where are these predictions of steam-rollering coming from?

At the June 2015 opening of the Facebook AI Research Lab in Paris, its director Yann LeCun said: “The next big step for Deep Learning is natural language understanding, which aims to give machines the power to understand not just individual words but entire sentences and paragraphs.” 1 In a November 2014 Reddit AMA (Ask Me Anything), Geoff Hinton said, “I think that the most exciting areas over the next five years will be really understanding text and videos. I will be disappointed if in five years' time we do not have something that can watch a YouTube video and tell a story about what happened. In a few years time we will put [Deep Learning] on a chip that fits into someone's ear and have an English-decoding chip that's just like a real Babel fish.” 2 And Yoshua Bengio, the third giant of modern Deep Learning, has also increasingly oriented his group's research toward language, including recent exciting new developments in neural machine translation systems. It's not just Deep Learning researchers. When leading machine learning researcher Michael Jordan was asked at a September 2014 AMA, “If you got a billion dollars to spend on a huge research project that you get to lead, what would you like to do?”, he answered: “I'd use the billion dollars to build a NASA-size program focusing on natural language processing, in all of its glory (semantics, pragmatics, etc.).” He went on: “Intellectually I think that NLP is fascinating, allowing us to focus on highly structured inference problems, on issues that go to the core of ‘what is thought’ but remain eminently practical, and on a technology that surely would make the world a better place.” Well, that sounds very nice! So, should computational linguistics researchers be afraid? I'd argue, no. To return to the Hitchhiker's Guide to the Galaxy theme that Geoff Hinton introduced, we need to turn the book over and look at the back cover, which says in large, friendly letters: “Don't panic.”

There is no doubt that Deep Learning has ushered in amazing technological advances in the last few years. I won't give an extensive rundown of successes, but here is one example. A recent Google blog post described Neon, the new transcription system for Google Voice. 3 After admitting that in the past Google Voice voicemail transcriptions often weren't fully intelligible, the post explained the development of Neon, an improved voicemail system that delivers more accurate transcriptions, like this: “Using a (deep breath) long short-term memory deep recurrent neural network (whew!), we cut our transcription errors by 49%.” Do we not all dream of developing a new approach to a problem which halves the error rate of the previous state-of-the-art system?
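A 49% cut is a relative reduction, not an absolute one. A quick sanity check of the arithmetic, with illustrative numbers only (not Google's actual error rates):

```python
def relative_error_reduction(old_err: float, new_err: float) -> float:
    """Fraction of the old error rate that was eliminated."""
    return (old_err - new_err) / old_err

# Illustrative numbers: a system whose error rate drops
# from 20% to 10.2% has cut its errors by 49%.
print(relative_error_reduction(0.20, 0.102))  # → 0.49
```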

Michael Jordan, in his AMA, gave two reasons why he wasn't convinced that Deep Learning would solve NLP: “Although current deep learning research tends to claim to encompass NLP, I'm (1) much less convinced about the strength of the results, compared to the results in, say, vision; (2) much less convinced in the case of NLP than, say, vision, the way to go is to couple huge amounts of data with black-box learning architectures.” 4

Jordan is certainly right about his first point: So far, problems in higher-level language processing have not seen the dramatic error rate reductions from deep learning that have been seen in speech recognition and in object recognition in vision. Although there have been gains from deep learning approaches, they have been more modest than sudden 25% or 50% error reductions. It could easily turn out that this remains the case. The really dramatic gains may only have been possible on true signal processing tasks. On the other hand, I'm much less convinced by his second argument.

However, I do have my own two reasons why NLP need not worry about deep learning: (1) It just has to be wonderful for our field for the smartest and most influential people in machine learning to be saying that NLP is the problem area to focus on; and (2) Our field is the domain science of language technology; it's not about the best method of machine learning—the central issue remains the domain problems. The domain problems will not go away. Joseph Reisinger wrote on his blog: “I get pitched regularly by startups doing ‘generic machine learning’ which is, in all honesty, a pretty ridiculous idea. Machine learning is not undifferentiated heavy lifting, it's not commoditizable like EC2, and closer to design than coding.” 5 From this perspective, it is people in linguistics, people in NLP, who are the designers.

Recently at ACL conferences, there has been an over-focus on numbers, on beating the state of the art. Call it playing the Kaggle game. More of the field's effort should go into problems, approaches, and architectures. Recently, one thing that I've been devoting a lot of time to—together with many other collaborators—is the development of Universal Dependencies. 6 The goal is to develop a common syntactic dependency representation and POS and feature label sets that can be used with reasonable linguistic fidelity and human usability across all human languages.
That's just one example; there are many other design efforts underway in our field. One other current example is the idea of Abstract Meaning Representation. 7

Where has Deep Learning helped NLP? The gains so far have not so much been from true Deep Learning (use of a hierarchy of more abstract representations to promote generalization) as from the use of distributed word representations—through the use of real-valued vector representations of words and concepts. Having a dense, multi-dimensional representation of similarity between all words is incredibly useful in NLP, but not only in NLP. Indeed, the importance of distributed representations evokes the “Parallel Distributed Processing” mantra of the earlier surge of neural network methods, which had a much more cognitive-science directed focus (Rumelhart and McClelland 1986). It can better explain human-like generalization, but also, from an engineering perspective, the use of small dimensionality and dense vectors for words allows us to model large contexts, leading to greatly improved language models. Especially seen from this new perspective, the exponentially greater sparsity that comes from increasing the order of traditional word n-gram models seems conceptually bankrupt.
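To make the contrast concrete, here is a minimal sketch (with toy, invented 4-dimensional vectors; real embeddings have hundreds of dimensions) of how dense vectors give graded similarity, where one-hot, n-gram-style features treat every pair of distinct words as equally unrelated:

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy dense embeddings (invented values).
hotel = np.array([0.9, 0.1, 0.4, 0.0])
motel = np.array([0.8, 0.2, 0.5, 0.1])
cat   = np.array([0.0, 0.9, 0.0, 0.8])

# Dense vectors: related words come out highly similar.
print(cosine(hotel, motel))   # high (~0.98)
print(cosine(hotel, cat))     # low

# One-hot features: all distinct words are equally unrelated.
hotel_1hot, motel_1hot = np.eye(3)[0], np.eye(3)[1]
print(cosine(hotel_1hot, motel_1hot))  # exactly 0.0
```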

I do believe that the idea of deep models will also prove useful. The sharing that occurs within deep representations can theoretically give an exponential representational advantage, and, in practice, offers improved learning systems. The general approach to building Deep Learning systems is compelling and powerful: The researcher defines a model architecture and a top-level loss function and then both the parameters and the representations of the model self-organize so as to minimize this loss, in an end-to-end learning framework. We are starting to see the power of such deep systems in recent work in neural machine translation (Sutskever, Vinyals, and Le 2014; Luong et al. 2015).
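The recipe described here can be illustrated on a deliberately tiny, non-neural stand-in: fix an architecture (a single linear layer), fix a top-level loss (mean squared error), and let gradient descent drive the parameters to minimize it end to end. This is only a sketch of the training loop with synthetic data, not of an actual translation system:

```python
import numpy as np

# Synthetic regression data: inputs X, targets y from a known weight vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w

w = np.zeros(3)  # parameters start uninformative
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(X)  # gradient of the squared-error loss
    w -= 0.1 * grad                        # step downhill on the loss

print(np.round(w, 2))  # ≈ [ 2.  -1.   0.5]: parameters self-organized to fit
```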

Finally, I have been an advocate for focusing more on compositionality in models, for language in particular, and for artificial intelligence in general. Intelligence requires being able to understand bigger things from knowing about smaller parts. In particular for language, understanding novel and complex sentences crucially depends on being able to construct their meaning compositionally from smaller parts—words and multi-word expressions—of which they are constituted. Recently, there have been many, many papers showing how systems can be improved by using distributed word representations from “deep learning” approaches, such as word2vec (Mikolov et al. 2013) or GloVe (Pennington, Socher, and Manning 2014). However, this is not actually building Deep Learning models, and I hope in the future that more people focus on the strongly linguistic question of whether we can build meaning composition functions in Deep Learning systems.
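The simplest possible composition function is averaging the word vectors. The sketch below (toy vectors, invented values) shows both the idea and its well-known weakness: averaging is blind to word order, which is exactly why richer composition functions are worth pursuing:

```python
import numpy as np

# Toy word vectors (invented values).
vec = {
    "dog":   np.array([0.9, 0.1]),
    "bites": np.array([0.2, 0.8]),
    "man":   np.array([0.7, 0.3]),
}

def compose_average(words):
    """Baseline composition: the sentence vector is the mean of its word vectors."""
    return np.mean([vec[w] for w in words], axis=0)

s1 = compose_average(["dog", "bites", "man"])
s2 = compose_average(["man", "bites", "dog"])
print(np.allclose(s1, s2))  # True: averaging cannot tell who bit whom
```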

I encourage people to not get into the rut of doing no more than using word vectors to make performance go up a couple of percent. Even more strongly, I would like to suggest that we might return instead to some of the interesting linguistic and cognitive issues that motivated noncategorical representations and neural network approaches.

The not observing this rule is that which the world has blamed in our satirist. (Dryden, Essay Dramatick Poesy , 1684, page 310)

The only mental provision she was making for the evening of life, was the collecting and transcribing all the riddles of every sort that she could meet with. (Jane Austen, Emma , 1816)

The difficulty is in the getting the gold into Erewhon. (Sam Butler, Erewhon Revisited , 1902)

Tom's winning the election was a big upset.

?This teasing John all the time has got to stop.

?There is no marking exams on Fridays.

*The cessation hostilities was unexpected.

[That kind [of knife]] isn't used much.

We are [kind of] hungry.

[a [kind [of dense rock]]]

[a [[kind of] dense] rock]

A nette sent in to the see, and of alle kind of fishis gedrynge ( Wyclif , 1382)

Their finest and best, is a kind of course red cloth ( True Report , 1570)

I was kind of provoked at the way you came up ( Mass. Spy , 1830)

NLP is kind of like a rabbit in the headlights of the deep learning machine (Neil Lawrence, DL workshop panel, 2015)

Just recently, there has started to be some new work harnessing the power of distributed representations for modeling and explaining linguistic variation and change. Sagi, Kaufmann, and Clark (2011)—actually using the more traditional method of Latent Semantic Analysis to generate distributed word representations—show how distributed representations can capture a semantic change: the broadening and narrowing of reference over time. They look at examples such as how in Old English deer was any animal, whereas in Middle and Modern English it applies to one clear animal family. The words dog and hound have swapped: In Middle English, hound was used for any kind of canine, while now it is used for a particular sub-kind, whereas the reverse is true for dog .

Kulkarni et al. (2015) use neural word embeddings to model the shift in meaning of words such as gay over the last century (exploiting the online Google Books Ngrams corpus). At a recent ACL workshop, Kim et al. (2014) use a similar approach—using word2vec—to look at recent changes in the meaning of words. For example, in Figure 1, they show how around 2000, the meaning of the word cell changed rapidly from being close in meaning to closet and dungeon to being close in meaning to phone and cordless . The meaning of a word in this context is the average over the meanings of all senses of a word, weighted by their frequency of use.

Trend in the meaning of cell, represented by showing its cosine similarity to four other words over time (where 1.0 represents maximal similarity, and 0.0 represents no similarity).
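The quantity plotted in the figure is just a cosine similarity computed in each time slice. A minimal sketch, assuming hypothetical, precomputed per-period embeddings (real studies train one embedding space per time slice and align the spaces):

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical 2-d embeddings for "cell" and two reference words,
# one embedding space per period (all values invented for illustration).
emb = {
    1990: {"cell": np.array([0.9, 0.1]),
           "dungeon": np.array([0.8, 0.2]),
           "phone": np.array([0.1, 0.9])},
    2010: {"cell": np.array([0.2, 0.9]),
           "dungeon": np.array([0.8, 0.2]),
           "phone": np.array([0.1, 0.9])},
}

# Track how "cell" drifts away from "dungeon" and toward "phone".
for year, space in sorted(emb.items()):
    print(year,
          round(cosine(space["cell"], space["dungeon"]), 2),
          round(cosine(space["cell"], space["phone"]), 2))
```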

These more scientific uses of distributed representations and Deep Learning for modeling phenomena characterize the previous boom in neural networks. There has been a bit of a kerfuffle online lately about citing and crediting work in Deep Learning, and from that perspective, it seems to me that the two people who scarcely get mentioned any more are Dave Rumelhart and Jay McClelland. Starting from the Parallel Distributed Processing Research Group in San Diego, their research program was aimed at a clearly more scientific and cognitive study of neural networks.

Now, there are indeed some good questions about the adequacy of neural network approaches for rule-governed linguistic behavior. Old-timers in our community should remember that arguing against the adequacy of neural networks for rule-governed linguistic behavior was the foundation for the rise to fame of Steve Pinker—and the foundation of the career of about six of his graduate students. It would take too much space to go through the issues here, but in the end, I think it was a productive debate. It led to a vast amount of work by Paul Smolensky on how basically categorical systems can emerge and be represented in a neural substrate (Smolensky and Legendre 2006). Indeed, Paul Smolensky arguably went too far down the rabbit hole, devoting a large part of his career to developing a new categorical model of phonology, Optimality Theory (Prince and Smolensky 2004). There is a rich body of earlier scientific work that has been neglected. It would be good to return some emphasis within NLP to cognitive and scientific investigation of language rather than almost exclusively using an engineering model of research.

Overall, I think we should feel excited and glad to live in a time when Natural Language Processing is seen as so central to both the further development of machine learning and industry application problems. The future is bright. However, I would encourage everyone to think about problems, architectures, cognitive science, and the details of human language, how it is learned, processed, and how it changes, rather than just chasing state-of-the-art numbers on a benchmark task.

This Last Words contribution covers part of my 2015 ACL Presidential Address. Thanks to Paola Merlo for suggesting writing it up for publication.

1. http://www.wired.com/2014/12/fb/
2. https://www.reddit.com/r/MachineLearning/comments/2lmo0l/ama_geoffrey_hinton
3. http://googleblog.blogspot.com/2015/07/neon-prescription-or-rather-new.html
4. http://www.reddit.com/r/MachineLearning/comments/2fxi6v/ama_michael_i_jordan
5. http://thedatamines.com/post/13177389506/why-generic-machine-learning-fails
6. http://universaldependencies.github.io/docs/
7. http://amr.isi.edu


  • Online ISSN 1530-9312
  • Print ISSN 0891-2017

A product of The MIT Press


ScholarWorks@UMass Amherst



Linguistics Department Dissertations Collection


Dissertations from 2023

Long(er) Object Movement in Turkish, Duygu Göksu, Linguistics

'You' Will Always Have 'Me': A Compositional Theory of Person, Kaden T. Holladay, Linguistics

Associative Plurals, Sherry Hucklebridge, Linguistics

Counterdirectionality in the Grammar: Reversals and Restitutions, Jyoti Iyer, Linguistics

The Online Processing of Even's Likelihood Presupposition, Erika Mayer, Linguistics

Dissertations from 2022

On the Semantics of Verbal and Nominal Tense in Mvskoke (Creek), Kimberly C. Johnson, Linguistics

Restrictive Tier Induction, Seoyoung Kim, Linguistics

TENSE IN CONDITIONALS: INS AND OUTS, Zahra Mirrazi, Linguistics

Phonotactic Learning with Distributional Representations, Max A. Nelson, Linguistics

The Linearization of V(P)-doubling Constructions, Rong Yin, Linguistics

Dissertations from 2021

Shifting the Perspectival Landscape: Methods for Encoding, Identifying, and Selecting Perspectives, Carolyn Jane Anderson, Linguistics

There and Gone Again: Syntactic Structure In Memory, Caroline Andrews, Linguistics

The Event Structure of Attitudes, Deniz Özyıldız, Linguistics

The Syntactic and Semantic Atoms of the Spray/load Alternation, Michael A. Wilson, Linguistics

Dissertations from 2020

Representing Context: Presupposition Triggers and Focus-sensitivity, Alexander Goebel, Linguistics

Person-based Prominence in Ojibwe, Christopher Hammerly, Linguistics

Emergent Typological Effects of Agent-Based Learning Models in Maximum Entropy Grammar, Coral Hughto, Linguistics

TALKING ABOUT HER(SELF): AMBIGUITY AVOIDANCE AND PRINCIPLE B. A Theoretical and Psycholinguistic Investigation of Romanian Pronouns, Rudmila-Rodica Ivan, Linguistics

Optimal Linearization: Prosodic displacement in Khoekhoegowab and Beyond, Leland Kusmer, Linguistics

Dissertations from 2019

Computing Agreement in a Mixed System, Sakshi Bhatia, Linguistics

Binding and Coreference in Vietnamese, Thuy Bui, Linguistics

Divorce Licensing: Separate Criteria for Predicate and Clausal Ellipsis, Tracy Conner, Linguistics

Effects of Phonological Contrast on Within-Category Phonetic Variation, Ivy Hauser, Linguistics

Phrasal and Clausal Exceptive-Additive Constructions Crosslinguistically, Ekaterina Vostrikova, Linguistics

Dissertations from 2018

Typology of bizarre ellipsis varieties, David Erschler, Linguistics

The Head-Quarters of Mandarin Arguments, Hsin-Lun Huang, Linguistics

Responding to questions and assertions: embedded Polar Response Particles, ellipsis, and contrast, Jeremy Pasquereau, Linguistics

Dissertations from 2017

The Form and Acquisition of Free Relatives, Michael Clauss, Linguistics

Amount Relatives Redux, Jon Ander Mendia, Linguistics

Movement and the Semantic Type of Traces, Ethan Poole, Linguistics

Preferential early attribution in segmental parsing, Amanda Rysling, Linguistics

When errors aren't: How comprehenders selectively violate Binding Theory, Shayne Sloggett, Linguistics

Dissertations from 2016

Building Meaning in Navajo, Elizabeth A. Bogal-Allbritten, Linguistics

Probes and their Horizons, Stefan Keine, Linguistics

Anaphora, Inversion, and Focus, Nicholas J. LaCara, Linguistics

The Representation of Probabilistic Phonological Patterns: Neurological, Behavioral, and Computational Evidence from the English Stress System, Claire Moore-Cantwell, Linguistics

Extending Hidden Structure Learning: Features, Opacity, and Exceptions, Aleksei I. Nazarov, Linguistics

Dissertations from 2015

Experiencing in Japanese: The Experiencer Restriction across Clausal Types, Masashi Hashimoto, Linguistics

Rightward Movement: A Study in Locality, Jason Overfelt, Linguistics

Investigating Properties of Phonotactic Knowledge Through Web-Based Experimentation, Presley Pizzo, Linguistics

Phonologically Conditioned Allomorphy and UR Constraints, Brian W. Smith, Linguistics

Dissertations from 2014

Contrastive Topic: Meanings and Realizations, Noah Constant, Linguistics

The Grammar of Individuation and Counting, Suzi Lima, Linguistics

Comprehending Each Other: Weak Reciprocity and Processing, Helen Majewski, Linguistics

Computational Modeling of Learning Biases in Stress Typology, Robert D. Staubs, Linguistics

Fragments and Clausal Ellipsis, Andrew Weir, Linguistics

Dissertations from 2013

Gapping in Farsi: A Crosslinguistic Investigation, Annahita Farudi, Linguistics

The Parsing and Interpretation of Comparatives: More than Meets the Eye, Margaret Ann Grant, Linguistics

Dissertations from 2012

Syntax-Prosody Interactions in Irish, Emily Elfner, Linguistics

Processing Perspectives, Jesse Aron Harris, Linguistics

Exhaustivity In Questions & Clefts; And The Quantifier Connection: A Study In German And English, Tanja Heizmann, Linguistics

Phonological And Phonetic Biases In Speech Perception, Michael Parrish Key, Linguistics

The Role of Contextual Restriction in Reference-Tracking, Andrew Robert McKenzie, Linguistics

Stress in Harmonic Serialism, Kathryn Ringler Pruitt, Linguistics

Roots of Modality, Aynat Rubinstein, Linguistics

Goals, Big and Small, Martin Walkow, Linguistics

Dissertations from 2011

Quantification, misc., Jan Anderssen, Linguistics

Anchoring Pragmatics In Syntax And Semantics, Maria Biezma, Linguistics

Constraining Interpretation: Sentence Final Particles in Japanese, Christopher M. Davis, Linguistics

Cumulative Constraint Interaction In Phonological Acquisition And Typology, Karen Christine Jesney, Linguistics

Competing Triggers: Transparency And Opacity In Vowel Harmony, Wendell A. Kimper, Linguistics

Dissertations from 2010

Topics In The Nez Perce Verb, Amy Rose Deal, Linguistics

Concealed Questions. In Search Of Answers, Ilaria Frana, Linguistics

Dissertations from 2009

Phonological Trends In The Lexicon: The Role Of Constraints, Michael Becker, Linguistics

Natural Selection and the Syntax of Clausal Complementation, Keir Moulton, Linguistics

Two Types of Definites in Natural Language, Florian Schwarz, Linguistics

The Role Of Lexical Contrast In The Perception Of Intonational Prominence In Japanese, Takahito Shinya, Linguistics

The Emergence of DP in the Partitive Structure, Helen Stickney, Linguistics

Optionality and Variability: Syntactic Licensing Meets Morphological Spell-Out, Cherlon Ussery, Linguistics

Word, Phrase, And Clitic Prosody In Bosnian, Serbian, And Croatian, Adam Werle, Linguistics

Dissertations from 2008

Optimal interleaving: Serial phonology-morphology interaction in a constraint-based model, Matthew Adam Wolf

Dissertations from 2007

The sources of phonological markedness, Kathryn Gilbert Flack

The emergence of phonetic naturalness, Shigeto Kawahara

Biases and stages in phonological acquisition, Anne-Michelle Tessier

Acquisition of scalar implicatures, Anna VerBuk

Dissertations from 2006

Disjunction in alternative semantics, Luis Alonso-Ovalle

Acquisition of a natural versus an unnatural stress system, Angela C. Carpenter

Asymmetries in the acquisition of consonant clusters, Della Chambless

Telicity and the syntax-semantics of the object and subject, Miren J. Hodgson

Variables in Natural Language, Meredith Landman, Linguistics

Dissertations from 2005

On the Accessibility of Possible Worlds: The Role of Tense and Aspect, Ana Cristina Arregui

Perception of foreignness, Ben Gelbart

Prosody and LF interpretation: Processing Japanese wh-questions, Masako Hirotani

The grammar of choice, Paula Menendez-Benito

Mediated modification: Functional structure and the interpretation of modifier position, Marcin Morzycki

Dissertations from 2004

What it means to be a loser: Non-optimal candidates in optimality theory, Andries W. Coetzee

Scope: The View from Indefinites, Ji-Yung Kim

Event-structure and the internally headed relative clause construction in Korean and Japanese, Min-Joo Kim

Spain or bust? Assessment and student perceptions of out-of-class contact and oral proficiency in a study abroad context, Vija Glazer Mendelson

On the articulation of aspectual meaning in African-American English, Jules Michael Eugene Terry

Dissertations from 2003

Deriving Economy: Syncope in Optimality Theory, Maria Gouskova

Gestures and segments: Vowel intrusion as overlap, Nancy Elizabeth Hall

The development of phonological categories in children's perception of final voicing in dialects of English, Caroline Jones

Argument structure and the lexicon/syntax interface, Eva Juarros

Contrast preservation in phonological mappings, Anna Lubowicz

© 2009 University of Massachusetts Amherst

The Computational Linguistics & Text Mining Lab

Thesis topics

We offer a wide variety of research topics for Master's theses in Text Mining and Human Language Technology, with or without internships, all centered around language and technology. Below we provide an overview of suggested thesis topics.

You can find examples of thesis projects by recent and soon-to-be graduates from the Research Master's in Human Language Technology and the Master's in Text Mining here.

Feel free to contact us for more information on these and other possible topics. Theses can be written and supervised in English or Dutch (depending on the topic and preference). Primary contact: Dr. Hennie van der Vliet.

More information:

  • General Information

Topics focusing on Natural Language Processing

Topics focusing on Linguistics and Language Resources

Topics focusing on Knowledge Representation and Reasoning

  • Topics focusing on Digital Humanities (and Social Science)

CLTL is the Computational Lexicology and Terminology Lab, headed by Piek Vossen. We study computational linguistics, or natural language processing (NLP). We are interested in how language works and how we can analyse it using computers. We work on automatically extracting knowledge from text. This is becoming more and more popular, as all the large technology companies (e.g. Google, IBM, Microsoft, and Facebook) are investing in big data and language technology. At the same time, natural language processing is one of the core aspects of digital humanities research. We are collaborating with literature, history, and social science researchers to explore the potential of NLP tools in their line of work, automatically analysing thousands of documents. Just imagine what you can do with all that data!

Computational linguistics operates at the interface between computer science and linguistics. We have topics that require different levels of technical skill as well as different levels of linguistic knowledge. Feel free to come and have a chat if any of the topics below appeals to you.

How does automatic text analysis work? Which tools are available and what can they do? Do they deliver what they promise on new text? Can the results of the state-of-the-art be replicated? How can existing technology be improved?

We work on several technologies that can be adapted to a new domain or to Dutch, or simply tested and improved. Topics with an NLP focus are mainly of interest to people with a strong technical background and programming skills, but it is also possible to study the output of tools and analyze what mistakes they make and why.

  • Provide Dutch language support to TermSuite ( http://termsuite.github.io/ ). TermSuite is an open-source term extraction tool that is very useful if you want to extract keywords from a text. It supports multiple languages, but so far there is no support for Dutch. We have all the resources necessary to add Dutch language support, but they need a little tweaking.
  • Improve the state-of-the-art for Dutch language technology. At the 25th edition of the Computational Linguistics In the Netherlands (CLIN) conference, we ran the first shared task for Dutch, where several teams of computational linguists tried to see which tools are the best at annotating texts. Next year there will be another shared task. Could you win the competition?
  • Search for stories in a large structured database of events extracted from text.
  • In what ways do different sources refer to the same event? What variations can be observed?
  • Event extraction from large data repositories: being able to identify what happened (and who the participants were) is not an easy task. Supervised methods have provided good results but also show limits. This project investigates unsupervised and semi-supervised methods for event extraction and event classification (identifying the types of events).
  • Temporal Relation Processing: being able to anchor and order events (and their participants) in time is the first step toward more robust NLP systems for information extraction, question answering, and summarisation, among others. The goal of this project is to develop systems that can anchor and order events in time, thus providing users with what is called a timeline of events. Different datasets are available for both single-document and cross-document temporal processing, and in different domains (news and clinical data). Extending the existing annotated data is encouraged as a strategy to overcome the limits of current state-of-the-art systems.
  • Storyline Extraction: this project aims at extracting stories from large collections of news clustered per topic and spanning a time period. The main research questions are: 1) are there patterns in how news stories about an event are reported? (e.g. is there a narrative pattern for reporting on natural disasters? Is reporting on natural disasters different from reporting on man-made disasters?); 2) in which ways are events connected so as to form a coherent story?; 3) given a collection of documents on a certain topic spanning a period of time, how can we identify the most important events, or rank events by their salience?
  • Content Types Extraction: different types of information are expressed in a document. For instance, in a news article you can find both portions of the document reporting on things that happened (i.e. narration) and opinions and comments (i.e. argumentation). The goal of this project is to develop systems that can detect the content types expressed in documents, such as novels (fictional and non-fictional), news articles, and other text genres, and then use this information to improve the performance of NLP on high-level semantic tasks (e.g. temporal relation extraction, sentiment analysis, and entity typing, among others).
  • Sentiment analysis: What opinions do people have and how do they express them? How does this change from one domain (e.g. hotel reviews) to another (e.g. news articles)? See also the more detailed description under topics focusing on linguistics and language resources.
  • (Domain-specific) Entity Linking: Which entities (people, organisations, locations, etc.) are mentioned in a text? Are they popular enough to be described in Wikipedia? If not, can we build a profile based on the information from the text? What knowledge is needed to link these entities correctly to their representation in Wikipedia (or another knowledge base)? Does the type of knowledge vary per topic and time? How can knowledge be acquired in a given domain, e.g. historical texts?
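As an illustration of what the most basic term extraction machinery looks like, here is a generic TF-IDF sketch with toy documents (this is not how TermSuite actually works; it only conveys the core ranking idea):

```python
import math
from collections import Counter

# Toy corpus (invented sentences).
docs = [
    "the wordnet links word meanings through semantic relations",
    "the term extraction tool ranks candidate domain terms",
    "the cat sat on the mat",
]
tokenized = [d.split() for d in docs]

# Document frequency: in how many documents does each word occur?
df = Counter(w for doc in tokenized for w in set(doc))

def tfidf(word, doc):
    """Score a word by its frequency in this document, discounted by ubiquity."""
    tf = doc.count(word) / len(doc)
    idf = math.log(len(tokenized) / df[word])
    return tf * idf

doc = tokenized[1]
# A domain-specific word outranks a function word that occurs everywhere.
print(tfidf("term", doc) > tfidf("the", doc))  # True
```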

How does language work and how can we model it in such a way that a computer can work with it? But also: what does computational linguistics have to offer to linguists (e.g. verifying theories through implementation or corpus studies)?

Topics in this area are of interest both to people with a strong linguistics background and to people who like to build interfaces and resources.

  • Open Source Wordnet: We are building a wordnet database for Dutch that is open source and can be downloaded for free. A wordnet is a semantic network in which all the words of a language are connected through semantic relations. This database is derived from various sources. Each wordnet groups word meanings in different ways, and the open source wordnet combines structures of both the English and the original Dutch wordnet. We need help from students to study the existing wordnet structures in English and Dutch and to evaluate the fit of the open source wordnet to both.
  • A Dutch FrameNet: Our group built the Referentie Bestand Nederlands (RBN), which has rich information about the combinatorics of words in particular meanings. For example, “behandelen” can refer to social interaction or medical treatment; only in the latter meaning do we say “behandelen aan iets”. The combinatorics in RBN are represented in various ways. At this very moment, our group is building a Dutch FrameNet, in cooperation with Groningen University. We can think of many possibilities for theses on this project. One of them is finding out how the RBN entries match the FrameNet structure that was developed for English at Berkeley, and how well FrameNet can be mapped to Dutch words and meanings.
  • Annotation of opinionated text: In how much detail can opinions be described consistently by human annotators, and how well do computers learn from this annotation?
  • Can sentiment and opinion analysis be used to obtain information on overall opinions and positions of groups in society or can it be used to track changes over time?
  • Are sentiments expressed differently across genres and what is the impact of genre on sentiment analysis systems?
  • What is the quality of current sentiment lexica? What can they do and where do they fail?

We have various projects where we mine text, extract information, and represent it formally using RDF. This allows us to link information extracted from text to other resources, and it allows end-users to query the data we extract. Research related to these topics involves ontology design and evaluation, as well as evaluating and improving the results of our NLP analyses.

These topics are mainly of interest to students with some background in data representation, though topics with a higher or lower technical component can be found.

Research topics in this area include:

  • Ontology design: what definitions are needed to represent relevant information? How well do the ontologies we currently use work? What can they do and what not?
  • What patterns do you observe in the data? E.g. which events occur with the same entities? What stories can be found in the data?
  • What is the quality of the extracted data? What are common errors? How can you track them?
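A minimal sketch of what representing extracted information as RDF means in practice, emitting one fact in N-Triples syntax (all URIs below are hypothetical examples, not the project's actual ontology):

```python
# Serialize one extracted fact as an RDF triple in N-Triples syntax.
def triple(subj: str, pred: str, obj: str) -> str:
    """Format a subject/predicate/object of URIs as one N-Triples line."""
    return f"<{subj}> <{pred}> <{obj}> ."

# Hypothetical URIs: an event, a participant relation, and a person.
fact = triple(
    "http://example.org/event/e1",
    "http://example.org/ontology#participant",
    "http://example.org/person/p42",
)
print(fact)
```

Once facts are serialized this way, they can be loaded into any standard triple store and queried with SPARQL alongside other linked data.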

Topics focusing on Digital Humanities (and Social Sciences)

There are many digitized resources that are relevant to researchers in the humanities. We have various projects where we apply NLP technologies to automatically analyze text. The output of these analyses can be used by historians, specialists in language and literature, philosophers, communication scientists, sociologists and many others.

Topics in this area can be of interest to students of various backgrounds. People with a strong background in computer science or linguistics who are interested in other domains of the humanities or social sciences can use their expertise to support researchers in those fields. Students with a background in other fields of the humanities or social sciences who are interested in text analysis can investigate what NLP has to offer them.

Here are a few examples of possible projects in this domain:

  • Analyzing politics using N-grams . Build an N-gram viewer, similar to https://projects.fivethirtyeight.com/reddit-ngram/ , for Dutch political debates. What are the trends in the language used by politicians?
  • Mining historical figures . In the BiographyNet project, we analyze biographical descriptions using NLP tools. We have approximately 80,000 biographies from various sources. Research questions range from basic (what properties do the people who are included share?) to highly complex (how does the perspective on specific people change over time?). Research can be conducted on the output of the tools, on improving the tools for the domain or for specific sources, and on overall methodological questions.
  • Identifying perspectives . What opinions are expressed in a text? How are specific events, people and organizations depicted? Can we identify biases in particular sources (e.g., can we spot differences between left-wing and right-wing papers)? We have several projects that look into perspectives in text, notably the Spinoza project ULM3 World views as a key to understanding language , the AAA data science project QuPiD , and the project Reading between the lines .
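The counting behind an N-gram viewer like the one described above is straightforward: tally each phrase's relative frequency per year. A minimal sketch over a small invented set of "debate transcripts"; a real project would use the actual parliamentary proceedings:

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def relative_frequency(texts, phrase):
    """Occurrences of the phrase per n-gram, by year."""
    n = len(phrase)
    out = {}
    for year, text in texts.items():
        counts = Counter(ngrams(text.lower().split(), n))
        total = sum(counts.values())
        out[year] = counts[phrase] / total if total else 0.0
    return out

# Invented toy corpus: year -> concatenated debate text.
debates = {
    2022: "the climate crisis demands action on the climate crisis",
    2023: "the budget debate ignored the climate crisis entirely",
}
print(relative_frequency(debates, ("climate", "crisis")))
```

Plotting such per-year frequencies for a phrase of interest gives exactly the trend lines an N-gram viewer displays.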

Applied Computational Linguistics Master's Thesis

Free Google Slides theme and PowerPoint template.

Applied computational linguistics consists of creating linguistic applications using computational means. Undoubtedly, it is a complex topic that requires research, so if your thesis is about it and you need a template that will make you shine in your defense, today we bring you a proposal that fits it. Its gray background brings seriousness to the topic, and its sans serif typefaces give it a modern look. Customize it to your liking and get ready to succeed.

Features of this template

  • 100% editable and easy to modify
  • 30 different slides to impress your audience
  • Contains easy-to-edit graphics such as graphs, maps, tables, timelines and mockups
  • Includes 500+ icons and Flaticon’s extension for customizing your slides
  • Designed to be used in Google Slides and Microsoft PowerPoint
  • 16:9 widescreen format suitable for all types of screens
  • Includes information about fonts, colors, and credits of the resources used


Spring 2024 Colloquium - Alexis Wellwood - Graded Plurals and Indeterminacy

  • by Nicholas Bamshad Aoki
  • April 08, 2024

Abstract: The compositional semantics of a sentence like (1a) is relatively uncontroversial, but no consensus about that of a sentence like (1b) has yet been achieved.

  (1) a. The red dot is bigger than the blue dot.

    b. The red dots are bigger than the blue dots.

  Early hypotheses have been claimed to be too strong (e.g. (1b) is true iff every red dot is bigger than every blue dot), others too weak (e.g., (1b) is true iff the biggest dot is red and the smallest is blue), and more recent approaches raise issues of their own. Such disagreement is puzzling in light of the apparently minimal grammatical differences between sentences like (1a) and (1b). Reporting the results of a series of experiments conducted in the USC Meaning Lab, I contrast and test extant proposals that assign distinct, determinate truth conditions to sentences like (1b) against the suggestion that their meanings ultimately fail to determine any. The empirical evidence—canvassed using sentences about different kinds of objects, expressing comparisons along different dimensions, in the positive and negative declarative forms, and evaluated under time pressure or not—appears to provide strong evidence for an indeterminacy thesis. If so, these results challenge the assumption that linguistic meanings functionally deliver truth conditions, and raise new questions about the life of linguistic meanings in the mind.

Event Logistics: 3pm in Kerr 273 on April 8th

Speaker Biography: Dr. Alexis Wellwood is an Associate Professor of Philosophy and Linguistics at USC.


  1. PPT

    thesis computational linguistics

  2. What Is Computational Linguistics? Definition and Career Info

    thesis computational linguistics

  3. PPT

    thesis computational linguistics

  4. The 20 Best Computational Linguistics Graduate Programs in the U.S

    thesis computational linguistics

  5. From Linguistics to Computational Linguistics

    thesis computational linguistics

  6. PPT

    thesis computational linguistics


  1. Closure Properties of Context Free Languages1

  2. How to Defend Your MS/MPhil/PhD Research Thesis

  3. How to Start your Writing

  4. Session 2

  5. Computational linguistics- An Introduction --- Dr G.Praveen

  6. Introduction to Thesis Proposal Seminar Presentation


  1. PDF A Guide to Writing a Senior Thesis in Linguistics

    A linguistics thesis is an original research project undertaken during your senior year at Harvard College . You will conduct research into past literature on your topic, con- ... computational linguistics) you've studied during your time as a concentrator at the department During the course of your project, you'll work closely with an ...

  2. Theses & Dissertations

    Thesis Collections. Theses and dissertations are a key source for finding the latest scholarship, additional material such as data sets, and detailed research. ... including computational linguistics and related disciplines. Includes most UW dissertations and theses published since 2012. ResearchWorks Archive: Linguistics.

  3. Masters Theses

    Grammar, Computational Linguistics: Prescott Klassen. "Calculating LLR Topic Signatures with Dependency Relations for Automatic Text Summarization." MS Thesis. U of Washington, 2012. Graduate, Masters Theses: Computational Linguistics: Joshua Crowgey. "The Syntactic Exponence of Sentential Negation: a model for the LinGO Grammar Matrix." MA Thesis.

  4. PDF Linguistic Knowledge in Data-Driven Natural Language Processing

    Thesis Committee: Chris Dyer (chair), Carnegie Mellon University Alan Black, Carnegie Mellon University ... The first few decades of research in computational linguistics were dedicated to the devel-opment of computational (and mathematical) models of various levels of linguistic represen-

  5. Computational Linguistics

    The computational linguistics program at Stanford is one of the oldest in the country, and offers a wide range of courses and research opportunities. Research. We take a very broad view of computational linguistics, from theoretical investigations to practical natural language processing applications, ranging across linguistic areas like ...

  6. Master of Science in Computational Linguistics : Graduate Program

    The computational linguistics master's program at Rochester trains students to be conversant both in language analysis and computational techniques applied to natural language. ... A fourth semester is for students to prepare their program's final assignment, project, or thesis. Linguistics Courses Prerequisite. Students are required to have ...

  7. Theses at Department Theoretical Computational Linguistics

    Thesis Theoretical Computational Linguistics; Theses at Department Theoretical Computational Linguistics. How do you find a thesis topic and how do you prepare for it. From finding a topic to registration and submission, everything is discussed. Introduction.

  8. [1311.1539] Category-Theoretic Quantitative Compositional

    There are three principal contributions to computational linguistics in this thesis. The first is to extend the DisCoCat framework on the syntactic front and semantic front, incorporating a number of syntactic analysis formalisms and providing learning procedures allowing for the generation of concrete compositional distributional models. The ...

  9. Academic Experience

    The master's in computational linguistics is a nine-course, 43-credit program that culminates with a master's project. The format is flexible — you can study part time or full time and take classes online, on campus or both. Full-time students take three courses per quarter for three quarters and then complete the master's project over the ...

  10. Master's Thesis

    The topic of the Master's thesis must be agreed upon and approved by a representative of the department (a professor) of Computational Linguistics. The agreed and approved topic will be recorded on the topic sheet (see below). The representative of the department can delegate this task to a member of middle management.

  11. Complexity in Linguistics

    Cobham-Edmonds Thesis The class of practically computable problems is identical to the PTIME ... computational complexity of parsing and recognition has become a major topic along with the development of computational linguistics. Footnote 6 In general, the results show that even for relatively simple grammatical frameworks some problems might ...

  12. Graduate Studies in Computational Linguistics

    To complete a thesis, students must enroll in the 'independent instruction' course COSI 299 - Computational Linguistics Master's Thesis. This involves (analogously to COSI 293b - Exit Requirement Internship Course) registering for an individual section taught by one of the CL faculty—in this case, whatever faculty member is supervising the ...

  13. Dissertations

    Serial verb constructions and the linker in Nuuchahnulth. PhD thesis, University of Washington. Graduate, Dissertations: American Indian/Native American, Computational Linguistics, Morphology, Syntax: Chak-Lam Colum Yip. "Evidence for DP in Chinese from Reduplicative Classifiers and DP-Internal Structural Phenomena." Diss. U of Washington, 2018.

  14. Computational Linguistics

    Computational Linguistics is the longest-running publication devoted exclusively to the computational and mathematical properties of language and the design and analysis of natural language processing systems. This highly regarded quarterly offers university and industry linguists, computational linguists, artificial intelligence and machine learning investigators, cognitive scientists, speech ...

  15. Modeling Thesis Clarity in Student Essays

    persing-ng-2013-modeling. Cite (ACL): Isaac Persing and Vincent Ng. 2013. Modeling Thesis Clarity in Student Essays. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 260-269, Sofia, Bulgaria. Association for Computational Linguistics.

  16. M.A. with Computational Linguistics Concentration

    The M.A. in Computational Linguistics requires thirty-two (32) credits of approved graduate course work, ... The student will seek out a tenured or tenure-tracked faculty member in the Linguistics Program to serve as their thesis research supervisor. The research topic must be approved by student's chosen faculty supervisor.

  17. Graduate Studies in Computational Linguistics

    Through skill-building, our courses complement the learning process during on campus computational linguistics research work with faculty and off campus internships. Through the required internship, capstone project and/or thesis (our "exit requirement"), students gain invaluable work and independent research experience to add to their ...

  18. Linguistics with a Specialisation in Computational Linguistics MA

    The Linguistics MA with a specialisation in Computational Linguistics aims to give students a thorough grounding in both theoretical and computational linguistics. Students gain a basic understanding of the three core areas of linguistics: phonetics and phonology; syntax; and semantics and pragmatics, plus they will be introduced to algorithms and models that implement

  19. Computational Linguistics

    Computational linguistics, to simplify greatly, is the application of linguistic theory and description to the interpretation, analysis and generation of linguistic units such as sentences and even whole texts in digital systems. Systemic Functional Linguistics (SFL), as an appliable theory of language, has been applied in computational ...

  20. Computational Linguistics and Deep Learning

    Deep Learning waves have lapped at the shores of computational linguistics for several years now, but 2015 seems like the year when the full force of the tsunami hit the major Natural Language Processing (NLP) conferences. However, some pundits are predicting that the final damage will be even worse. Accompanying ICML 2015 in Lille, France, there was another, almost as big, event: the 2015 ...

  21. Linguistics Department Dissertations Collection

    Dissertations from 2021. Shifting the Perspectival Landscape: Methods for Encoding, Identifying, and Selecting Perspectives, Carolyn Jane Anderson, Linguistics. There and Gone Again: Syntactic Structure In Memory, Caroline Andrews, Linguistics. The Event Structure of Attitudes, Deniz Özyıldız, Linguistics.

  22. Thesis topics

    Computational Linguistics operates on the interface between computer science and linguistics. We have topics that require different levels of technical skill as well as different levels of linguistic knowledge. Feel free to come and have a chat if any of the topics below seem appealing to you. Topics focusing on Natural Language Processing

  23. Applied Computational Linguistics Master's Thesis

    Applied computational linguistics consists of creating linguistic applications using computational means. Undoubtedly, it is a complex topic that requires research, so if your thesis is about it and you need a template that will make you shine in your defense, today we bring you a proposal that fits it. Its gray background brings seriousness to ...

  24. Spring 2024 Colloquium

    Abstract: The compositional semantics of a sentence like (1a) is relatively uncontroversial, but no consensus about that of a sentence like (1b) has yet been achieved. (1) a. The red dot is bigger than the blue dot. b. The red dots are bigger than the blue dots. Early hypotheses have been claimed to be too strong (e.g. (1b) is true iff every red dot is bigger than every blue dot), others too ...