Machine Learning - CMU

PhD Dissertations

[all are .pdf files].

Learning Models that Match Jacob Tyo, 2024

Improving Human Integration across the Machine Learning Pipeline Charvi Rastogi, 2024

Reliable and Practical Machine Learning for Dynamic Healthcare Settings Helen Zhou, 2023

Automatic customization of large-scale spiking network models to neuronal population activity (unavailable) Shenghao Wu, 2023

Estimation of BV^k functions from scattered data (unavailable) Addison J. Hu, 2023

Rethinking object categorization in computer vision (unavailable) Jayanth Koushik, 2023

Advances in Statistical Gene Networks Jinjin Tian, 2023

Post-hoc calibration without distributional assumptions Chirag Gupta, 2023

The Role of Noise, Proxies, and Dynamics in Algorithmic Fairness Nil-Jana Akpinar, 2023

Collaborative learning by leveraging siloed data Sebastian Caldas, 2023

Modeling Epidemiological Time Series Aaron Rumack, 2023

Human-Centered Machine Learning: A Statistical and Algorithmic Perspective Leqi Liu, 2023

Uncertainty Quantification under Distribution Shifts Aleksandr Podkopaev, 2023

Probabilistic Reinforcement Learning: Using Data to Define Desired Outcomes, and Inferring How to Get There Benjamin Eysenbach, 2023

Comparing Forecasters and Abstaining Classifiers Yo Joong Choe, 2023

Using Task Driven Methods to Uncover Representations of Human Vision and Semantics Aria Yuan Wang, 2023

Data-driven Decisions - An Anomaly Detection Perspective Shubhranshu Shekhar, 2023

Applied Mathematics of the Future Kin G. Olivares, 2023

Methods and Applications of Explainable Machine Learning Joon Sik Kim, 2023

Neural Reasoning for Question Answering Haitian Sun, 2023

Principled Machine Learning for Societally Consequential Decision Making Amanda Coston, 2023

Long term brain dynamics extend cognitive neuroscience to timescales relevant for health and physiology Maxwell B. Wang, 2023

Long term brain dynamics extend cognitive neuroscience to timescales relevant for health and physiology Darby M. Losey, 2023

Calibrated Conditional Density Models and Predictive Inference via Local Diagnostics David Zhao, 2023

Towards an Application-based Pipeline for Explainability Gregory Plumb, 2022

Objective Criteria for Explainable Machine Learning Chih-Kuan Yeh, 2022

Making Scientific Peer Review Scientific Ivan Stelmakh, 2022

Facets of regularization in high-dimensional learning: Cross-validation, risk monotonization, and model complexity Pratik Patil, 2022

Active Robot Perception using Programmable Light Curtains Siddharth Ancha, 2022

Strategies for Black-Box and Multi-Objective Optimization Biswajit Paria, 2022

Unifying State and Policy-Level Explanations for Reinforcement Learning Nicholay Topin, 2022

Sensor Fusion Frameworks for Nowcasting Maria Jahja, 2022

Equilibrium Approaches to Modern Deep Learning Shaojie Bai, 2022

Towards General Natural Language Understanding with Probabilistic Worldbuilding Abulhair Saparov, 2022

Applications of Point Process Modeling to Spiking Neurons (Unavailable) Yu Chen, 2021

Neural variability: structure, sources, control, and data augmentation Akash Umakantha, 2021

Structure and time course of neural population activity during learning Jay Hennig, 2021

Cross-view Learning with Limited Supervision Yao-Hung Hubert Tsai, 2021

Meta Reinforcement Learning through Memory Emilio Parisotto, 2021

Learning Embodied Agents with Scalably-Supervised Reinforcement Learning Lisa Lee, 2021

Learning to Predict and Make Decisions under Distribution Shift Yifan Wu, 2021

Statistical Game Theory Arun Sai Suggala, 2021

Towards Knowledge-capable AI: Agents that See, Speak, Act and Know Kenneth Marino, 2021

Learning and Reasoning with Fast Semidefinite Programming and Mixing Methods Po-Wei Wang, 2021

Bridging Language in Machines with Language in the Brain Mariya Toneva, 2021

Curriculum Learning Otilia Stretcu, 2021

Principles of Learning in Multitask Settings: A Probabilistic Perspective Maruan Al-Shedivat, 2021

Towards Robust and Resilient Machine Learning Adarsh Prasad, 2021

Towards Training AI Agents with All Types of Experiences: A Unified ML Formalism Zhiting Hu, 2021

Building Intelligent Autonomous Navigation Agents Devendra Chaplot, 2021

Learning to See by Moving: Self-supervising 3D Scene Representations for Perception, Control, and Visual Reasoning Hsiao-Yu Fish Tung, 2021

Statistical Astrophysics: From Extrasolar Planets to the Large-scale Structure of the Universe Collin Politsch, 2020

Causal Inference with Complex Data Structures and Non-Standard Effects Kwhangho Kim, 2020

Networks, Point Processes, and Networks of Point Processes Neil Spencer, 2020

Dissecting neural variability using population recordings, network models, and neurofeedback (Unavailable) Ryan Williamson, 2020

Predicting Health and Safety: Essays in Machine Learning for Decision Support in the Public Sector Dylan Fitzpatrick, 2020

Towards a Unified Framework for Learning and Reasoning Han Zhao, 2020

Learning DAGs with Continuous Optimization Xun Zheng, 2020

Machine Learning and Multiagent Preferences Ritesh Noothigattu, 2020

Learning and Decision Making from Diverse Forms of Information Yichong Xu, 2020

Towards Data-Efficient Machine Learning Qizhe Xie, 2020

Change modeling for understanding our world and the counterfactual one(s) William Herlands, 2020

Machine Learning in High-Stakes Settings: Risks and Opportunities Maria De-Arteaga, 2020

Data Decomposition for Constrained Visual Learning Calvin Murdock, 2020

Structured Sparse Regression Methods for Learning from High-Dimensional Genomic Data Micol Marchetti-Bowick, 2020

Towards Efficient Automated Machine Learning Liam Li, 2020

Learning Collections of Functions Emmanouil Antonios Platanios, 2020

Provable, structured, and efficient methods for robustness of deep networks to adversarial examples Eric Wong, 2020

Reconstructing and Mining Signals: Algorithms and Applications Hyun Ah Song, 2020

Probabilistic Single Cell Lineage Tracing Chieh Lin, 2020

Graphical network modeling of phase coupling in brain activity (unavailable) Josue Orellana, 2019

Strategic Exploration in Reinforcement Learning - New Algorithms and Learning Guarantees Christoph Dann, 2019

Learning Generative Models using Transformations Chun-Liang Li, 2019

Estimating Probability Distributions and their Properties Shashank Singh, 2019

Post-Inference Methods for Scalable Probabilistic Modeling and Sequential Decision Making Willie Neiswanger, 2019

Accelerating Text-as-Data Research in Computational Social Science Dallas Card, 2019

Multi-view Relationships for Analytics and Inference Eric Lei, 2019

Information flow in networks based on nonstationary multivariate neural recordings Natalie Klein, 2019

Competitive Analysis for Machine Learning & Data Science Michael Spece, 2019

The When, Where and Why of Human Memory Retrieval Qiong Zhang, 2019

Towards Effective and Efficient Learning at Scale Adams Wei Yu, 2019

Towards Literate Artificial Intelligence Mrinmaya Sachan, 2019

Learning Gene Networks Underlying Clinical Phenotypes Under SNP Perturbations From Genome-Wide Data Calvin McCarter, 2019

Unified Models for Dynamical Systems Carlton Downey, 2019

Anytime Prediction and Learning for the Balance between Computation and Accuracy Hanzhang Hu, 2019

Statistical and Computational Properties of Some "User-Friendly" Methods for High-Dimensional Estimation Alnur Ali, 2019

Nonparametric Methods with Total Variation Type Regularization Veeranjaneyulu Sadhanala, 2019

New Advances in Sparse Learning, Deep Networks, and Adversarial Learning: Theory and Applications Hongyang Zhang, 2019

Gradient Descent for Non-convex Problems in Modern Machine Learning Simon Shaolei Du, 2019

Selective Data Acquisition in Learning and Decision Making Problems Yining Wang, 2019

Anomaly Detection in Graphs and Time Series: Algorithms and Applications Bryan Hooi, 2019

Neural dynamics and interactions in the human ventral visual pathway Yuanning Li, 2018

Tuning Hyperparameters without Grad Students: Scaling up Bandit Optimisation Kirthevasan Kandasamy, 2018

Teaching Machines to Classify from Natural Language Interactions Shashank Srivastava, 2018

Statistical Inference for Geometric Data Jisu Kim, 2018

Representation Learning @ Scale Manzil Zaheer, 2018

Diversity-promoting and Large-scale Machine Learning for Healthcare Pengtao Xie, 2018

Distribution and Histogram (DIsH) Learning Junier Oliva, 2018

Stress Detection for Keystroke Dynamics Shing-Hon Lau, 2018

Sublinear-Time Learning and Inference for High-Dimensional Models Enxu Yan, 2018

Neural population activity in the visual cortex: Statistical methods and application Benjamin Cowley, 2018

Efficient Methods for Prediction and Control in Partially Observable Environments Ahmed Hefny, 2018

Learning with Staleness Wei Dai, 2018

Statistical Approach for Functionally Validating Transcription Factor Bindings Using Population SNP and Gene Expression Data Jing Xiang, 2017

New Paradigms and Optimality Guarantees in Statistical Learning and Estimation Yu-Xiang Wang, 2017

Dynamic Question Ordering: Obtaining Useful Information While Reducing User Burden Kirstin Early, 2017

New Optimization Methods for Modern Machine Learning Sashank J. Reddi, 2017

Active Search with Complex Actions and Rewards Yifei Ma, 2017

Why Machine Learning Works George D. Montañez, 2017

Source-Space Analyses in MEG/EEG and Applications to Explore Spatio-temporal Neural Dynamics in Human Vision Ying Yang, 2017

Computational Tools for Identification and Analysis of Neuronal Population Activity Pengcheng Zhou, 2016

Expressive Collaborative Music Performance via Machine Learning Gus (Guangyu) Xia, 2016

Supervision Beyond Manual Annotations for Learning Visual Representations Carl Doersch, 2016

Exploring Weakly Labeled Data Across the Noise-Bias Spectrum Robert W. H. Fisher, 2016

Optimizing Optimization: Scalable Convex Programming with Proximal Operators Matt Wytock, 2016

Combining Neural Population Recordings: Theory and Application William Bishop, 2015

Discovering Compact and Informative Structures through Data Partitioning Madalina Fiterau-Brostean, 2015

Machine Learning in Space and Time Seth R. Flaxman, 2015

The Time and Location of Natural Reading Processes in the Brain Leila Wehbe, 2015

Shape-Constrained Estimation in High Dimensions Min Xu, 2015

Spectral Probabilistic Modeling and Applications to Natural Language Processing Ankur Parikh, 2015

Computational and Statistical Advances in Testing and Learning Aaditya Kumar Ramdas, 2015

Corpora and Cognition: The Semantic Composition of Adjectives and Nouns in the Human Brain Alona Fyshe, 2015

Learning Statistical Features of Scene Images Wooyoung Lee, 2014

Towards Scalable Analysis of Images and Videos Bin Zhao, 2014

Statistical Text Analysis for Social Science Brendan T. O'Connor, 2014

Modeling Large Social Networks in Context Qirong Ho, 2014

Semi-Cooperative Learning in Smart Grid Agents Prashant P. Reddy, 2013

On Learning from Collective Data Liang Xiong, 2013

Exploiting Non-sequence Data in Dynamic Model Learning Tzu-Kuo Huang, 2013

Mathematical Theories of Interaction with Oracles Liu Yang, 2013

Short-Sighted Probabilistic Planning Felipe W. Trevizan, 2013

Statistical Models and Algorithms for Studying Hand and Finger Kinematics and their Neural Mechanisms Lucia Castellanos, 2013

Approximation Algorithms and New Models for Clustering and Learning Pranjal Awasthi, 2013

Uncovering Structure in High-Dimensions: Networks and Multi-task Learning Problems Mladen Kolar, 2013

Learning with Sparsity: Structures, Optimization and Applications Xi Chen, 2013

GraphLab: A Distributed Abstraction for Large Scale Machine Learning Yucheng Low, 2013

Graph Structured Normal Means Inference James Sharpnack, 2013 (Joint Statistics & ML PhD)

Probabilistic Models for Collecting, Analyzing, and Modeling Expression Data Hai-Son Phuoc Le, 2013

Learning Large-Scale Conditional Random Fields Joseph K. Bradley, 2013

New Statistical Applications for Differential Privacy Rob Hall, 2013 (Joint Statistics & ML PhD)

Parallel and Distributed Systems for Probabilistic Reasoning Joseph Gonzalez, 2012

Spectral Approaches to Learning Predictive Representations Byron Boots, 2012

Attribute Learning using Joint Human and Machine Computation Edith L. M. Law, 2012

Statistical Methods for Studying Genetic Variation in Populations Suyash Shringarpure, 2012

Data Mining Meets HCI: Making Sense of Large Graphs Duen Horng (Polo) Chau, 2012

Learning with Limited Supervision by Input and Output Coding Yi Zhang, 2012

Target Sequence Clustering Benjamin Shih, 2011

Nonparametric Learning in High Dimensions Han Liu, 2010 (Joint Statistics & ML PhD)

Structural Analysis of Large Networks: Observations and Applications Mary McGlohon, 2010

Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy Brian D. Ziebart, 2010

Tractable Algorithms for Proximity Search on Large Graphs Purnamrita Sarkar, 2010

Rare Category Analysis Jingrui He, 2010

Coupled Semi-Supervised Learning Andrew Carlson, 2010

Fast Algorithms for Querying and Mining Large Graphs Hanghang Tong, 2009

Efficient Matrix Models for Relational Learning Ajit Paul Singh, 2009

Exploiting Domain and Task Regularities for Robust Named Entity Recognition Andrew O. Arnold, 2009

Theoretical Foundations of Active Learning Steve Hanneke, 2009

Generalized Learning Factors Analysis: Improving Cognitive Models with Machine Learning Hao Cen, 2009

Detecting Patterns of Anomalies Kaustav Das, 2009

Dynamics of Large Networks Jurij Leskovec, 2008

Computational Methods for Analyzing and Modeling Gene Regulation Dynamics Jason Ernst, 2008

Stacked Graphical Learning Zhenzhen Kou, 2007

Actively Learning Specific Function Properties with Applications to Statistical Inference Brent Bryan, 2007

Approximate Inference, Structure Learning and Feature Estimation in Markov Random Fields Pradeep Ravikumar, 2007

Scalable Graphical Models for Social Networks Anna Goldenberg, 2007

Measure Concentration of Strongly Mixing Processes with Applications Leonid Kontorovich, 2007

Tools for Graph Mining Deepayan Chakrabarti, 2005

Automatic Discovery of Latent Variable Models Ricardo Silva, 2005

10 Compelling Machine Learning Ph.D. Dissertations for 2020

Machine Learning Modeling Research posted by Daniel Gutierrez, ODSC, August 19, 2020

As a data scientist, an integral part of my work revolves around keeping current with research coming out of academia. I frequently scour arXiv.org for late-breaking papers that show trends and reveal fertile areas of research. Another valuable source of research developments is the Ph.D. dissertation, the culmination of a doctoral candidate's work toward their degree. Ph.D. candidates are highly motivated to choose research topics that establish new and creative paths toward discovery in their field of study, and their dissertations are tightly focused on a specific problem. If you can find a dissertation that aligns with your areas of interest, consuming the research is an excellent way to do a deep dive into the technology. After reviewing hundreds of recent theses from universities all over the country, I present 10 machine learning dissertations that I found compelling in terms of my own areas of interest.

[Related article: Introduction to Bayesian Deep Learning]

I hope you’ll find several that match your own fields of inquiry. Each thesis may take a while to consume but will result in hours of satisfying summer reading. Enjoy!

1. Bayesian Modeling and Variable Selection for Complex Data

As we routinely encounter high-throughput data sets in complex biological and environmental research, developing novel models and methods for variable selection has received widespread attention. This dissertation addresses a few key challenges in Bayesian modeling and variable selection for high-dimensional data with complex spatial structures. 

2. Topics in Statistical Learning with a Focus on Large Scale Data

Big data vary in shape and call for different approaches. One type of big data is the tall data, i.e., a very large number of samples but not too many features. This dissertation describes a general communication-efficient algorithm for distributed statistical learning on this type of big data. The algorithm distributes the samples uniformly to multiple machines, and uses a common reference data to improve the performance of local estimates. The algorithm enables potentially much faster analysis, at a small cost to statistical performance.
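The core idea can be illustrated with a deliberately simplified sketch: a one-shot "divide and average" estimator for linear regression, where each machine computes a local least-squares fit and only the coefficient vectors are communicated. This is an illustration of the general communication-efficient recipe, not the dissertation's exact algorithm, which additionally uses common reference data to sharpen the local estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_per, K = 5, 200, 10              # dimension, samples per machine, machines
beta = rng.normal(size=d)             # true coefficients

def local_ols(X, y):
    # Each machine fits ordinary least squares on its own shard.
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Simulate K machines, each holding an i.i.d. shard of the data.
local_estimates = []
for _ in range(K):
    X = rng.normal(size=(n_per, d))
    y = X @ beta + 0.1 * rng.normal(size=n_per)
    local_estimates.append(local_ols(X, y))

# One round of communication: ship K coefficient vectors, then average.
beta_avg = np.mean(local_estimates, axis=0)
```

Only K vectors of length d ever cross the network, instead of the raw data, which is where the communication savings come from, at a small cost to statistical efficiency.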

Another type of big data is the wide data, i.e., too many features but a limited number of samples. It is also called high-dimensional data, to which many classical statistical methods are not applicable. 

This dissertation discusses a method of dimensionality reduction for high-dimensional classification. The method partitions features into independent communities and splits the original classification problem into separate smaller ones. It enables parallel computing and produces more interpretable results.
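A toy version of this split-then-combine idea, assuming two hypothetical feature communities and a simple per-block Gaussian classifier (the dissertation's actual partitioning method and classifiers may differ): each block is fit independently, so the blocks could live on separate workers, and the per-block scores simply add.

```python
import numpy as np

rng = np.random.default_rng(3)
n, blocks = 400, [(0, 3), (3, 6)]     # two assumed feature communities

# Synthetic two-class data: the class label shifts every feature's mean.
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 6)) + y[:, None] * 1.5

def fit_block(Xb, y):
    # Per-block Gaussian classifier: class means plus per-feature variance.
    return Xb[y == 0].mean(axis=0), Xb[y == 1].mean(axis=0), Xb.var(axis=0)

def block_score(Xb, params):
    # Log-likelihood ratio (class 1 vs class 0) contributed by this block.
    mu0, mu1, var = params
    ll0 = -((Xb - mu0) ** 2 / (2 * var)).sum(axis=1)
    ll1 = -((Xb - mu1) ** 2 / (2 * var)).sum(axis=1)
    return ll1 - ll0

# Fit each community independently (parallelizable), then sum the scores.
params = [fit_block(X[:, a:b], y) for a, b in blocks]
score = sum(block_score(X[:, a:b], p) for (a, b), p in zip(blocks, params))
pred = (score > 0).astype(int)
```

Because each sub-classifier sees only its own community of features, the result is both parallel-friendly and easier to inspect block by block.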

3. Sets as Measures: Optimization and Machine Learning

The purpose of this machine learning dissertation is to address the following simple question:

How do we design efficient algorithms to solve optimization or machine learning problems where the decision variable (or target label) is a set of unknown cardinality?

Optimization and machine learning have proved remarkably successful in applications requiring the choice of single vectors. Some tasks, in particular many inverse problems, call for the design, or estimation, of sets of objects. When the size of these sets is a priori unknown, directly applying optimization or machine learning techniques designed for single vectors appears difficult. The work in this dissertation shows that a very old idea for transforming sets into elements of a vector space (namely, a space of measures), a common trick in theoretical analysis, generates effective practical algorithms.

4. A Geometric Perspective on Some Topics in Statistical Learning

Modern science and engineering often generate data sets with a large sample size and a comparably large dimension which puts classic asymptotic theory into question in many ways. Therefore, the main focus of this dissertation is to develop a fundamental understanding of statistical procedures for estimation and hypothesis testing from a non-asymptotic point of view, where both the sample size and problem dimension grow hand in hand. A range of different problems are explored in this thesis, including work on the geometry of hypothesis testing, adaptivity to local structure in estimation, effective methods for shape-constrained problems, and early stopping with boosting algorithms. The treatment of these different problems shares the common theme of emphasizing the underlying geometric structure.

5. Essays on Random Forest Ensembles

A random forest is a popular machine learning ensemble method that has proven successful in solving a wide range of classification problems. While other successful classifiers, such as boosting algorithms or neural networks, admit natural interpretations as maximum likelihood, a suitable statistical interpretation is much more elusive for a random forest. The first part of this dissertation demonstrates that a random forest is a fruitful framework in which to study AdaBoost and deep neural networks. The work explores the concept and utility of interpolation, the ability of a classifier to perfectly fit its training data. The second part of this dissertation places a random forest on more sound statistical footing by framing it as kernel regression with the proximity kernel. The work then analyzes the parameters that control the bandwidth of this kernel and discuss useful generalizations.
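The "random forest as kernel regression" view can be sketched directly from leaf memberships. In the snippet below the leaf-index matrix is randomly generated as a stand-in for the output of a fitted forest (in practice these indices would come from actual trees, e.g. scikit-learn's `forest.apply(X)`); the point is only to show the proximity-kernel arithmetic.

```python
import numpy as np

rng = np.random.default_rng(1)
n_train, n_trees = 50, 25

# Stand-in for a fitted forest: the leaf index each training point
# falls into, per tree.
train_leaves = rng.integers(0, 8, size=(n_train, n_trees))
y_train = rng.normal(size=n_train)

def proximity(leaves_a, leaves_b):
    # Proximity kernel: fraction of trees in which two points share a leaf.
    return (leaves_a[:, None, :] == leaves_b[None, :, :]).mean(axis=2)

def rf_kernel_predict(test_leaves, train_leaves, y_train):
    # A forest's regression prediction re-expressed as kernel regression:
    # a proximity-weighted average of the training responses.
    K = proximity(test_leaves, train_leaves)        # (n_test, n_train)
    return (K @ y_train) / K.sum(axis=1)

preds = rf_kernel_predict(train_leaves[:5], train_leaves, y_train)
```

Deeper trees shrink the leaves and hence narrow the kernel's bandwidth, which is exactly the knob the dissertation analyzes.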

6. Marginally Interpretable Generalized Linear Mixed Models

A popular approach for relating correlated measurements of a non-Gaussian response variable to a set of predictors is to introduce latent random variables and fit a generalized linear mixed model. The conventional strategy for specifying such a model leads to parameter estimates that must be interpreted conditional on the latent variables. In many cases, interest lies not in these conditional parameters, but rather in marginal parameters that summarize the average effect of the predictors across the entire population. Due to the structure of the generalized linear mixed model, the average effect across all individuals in a population is generally not the same as the effect for an average individual. Further complicating matters, obtaining marginal summaries from a generalized linear mixed model often requires evaluation of an analytically intractable integral or use of an approximation. Another popular approach in this setting is to fit a marginal model using generalized estimating equations. This strategy is effective for estimating marginal parameters, but leaves one without a formal model for the data with which to assess quality of fit or make predictions for future observations. Thus, there exists a need for a better approach.

This dissertation defines a class of marginally interpretable generalized linear mixed models that leads to parameter estimates with a marginal interpretation while maintaining the desirable statistical properties of a conditionally specified model. The distinguishing feature of these models is an additive adjustment that accounts for the curvature of the link function and thereby preserves a specific form for the marginal mean after integrating out the latent random variables. 
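To make the conditional-versus-marginal distinction concrete, consider a logistic mixed model (a standard textbook formulation, not necessarily the dissertation's notation):

```latex
% Conditional specification: \beta is interpreted holding the latent
% effect b_i fixed.
\operatorname{logit} E[Y_{ij} \mid b_i] = x_{ij}^\top \beta + b_i,
\qquad b_i \sim N(0, \sigma^2)

% The implied marginal mean integrates out b_i; because logit^{-1} is
% nonlinear, this is NOT logit^{-1}(x_{ij}^\top \beta):
E[Y_{ij}] = \int \operatorname{logit}^{-1}\!\bigl(x_{ij}^\top \beta + b\bigr)
            \,\phi(b; 0, \sigma^2)\, db

% A marginally interpretable specification inserts an adjustment a(x) so
% that the population-averaged coefficients \beta_M hold on the marginal scale:
\operatorname{logit} E[Y_{ij} \mid b_i] = a(x_{ij}) + x_{ij}^\top \beta_M + b_i,
\quad \text{with } a(\cdot) \text{ chosen so that }
\operatorname{logit} E[Y_{ij}] = x_{ij}^\top \beta_M
```

The additive adjustment $a(x)$ is what the dissertation's abstract refers to as accounting for the curvature of the link function.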

7. On the Detection of Hate Speech, Hate Speakers and Polarized Groups in Online Social Media

The objective of this dissertation is to explore the use of machine learning algorithms in understanding and detecting hate speech, hate speakers and polarized groups in online social media. Beginning with a unique typology for detecting abusive language, the work outlines the distinctions and similarities of different abusive language subtasks (offensive language, hate speech, cyberbullying and trolling) and how we might benefit from the progress made in each area. Specifically, the work suggests that each subtask can be categorized based on whether or not the abusive language being studied 1) is directed at a specific individual, or targets a generalized “Other” and 2) the extent to which the language is explicit versus implicit. The work then uses knowledge gained from this typology to tackle the “problem of offensive language” in hate speech detection. 

8. Lasso Guarantees for Dependent Data

Serially correlated high dimensional data are prevalent in the big data era. In order to predict and learn the complex relationship among the multiple time series, high dimensional modeling has gained importance in various fields such as control theory, statistics, economics, finance, genetics and neuroscience. This dissertation studies a number of high dimensional statistical problems involving different classes of mixing processes. 

9. Random forest robustness, variable importance, and tree aggregation

Random forest methodology is a nonparametric, machine learning approach capable of strong performance in regression and classification problems involving complex data sets. In addition to making predictions, random forests can be used to assess the relative importance of feature variables. This dissertation explores three topics related to random forests: tree aggregation, variable importance, and robustness. 

10. Climate Data Computing: Optimal Interpolation, Averaging, Visualization and Delivery

This dissertation solves two important problems in the modern analysis of big climate data. The first is the efficient visualization and fast delivery of big climate data, and the second is a computationally intensive principal component analysis (PCA) using spherical harmonics on the Earth's surface. The second problem creates a way to supply the data for the technology developed in the first. Both problems are computationally demanding; for example, they require representing higher order spherical harmonics such as Y400, which is critical for upscaling weather data to almost infinitely fine spatial resolution.

I hope you enjoyed learning about these compelling machine learning dissertations.

Editor’s note: Interested in more data science research? Check out the Research Frontiers track at ODSC Europe this September 17-19 or the ODSC West Research Frontiers track this October 27-30.

Daniel Gutierrez, ODSC

Daniel D. Gutierrez is a practicing data scientist who has been working with data since long before the field came into vogue. As a technology journalist, he enjoys keeping a pulse on this fast-paced industry. Daniel is also an educator, having taught data science, machine learning and R classes at the university level. He has authored four computer industry books on database and data science technology, including his most recent title, "Machine Learning and Data Science: An Introduction to Statistical Learning Methods with R." Daniel holds a BS in Mathematics and Computer Science from UCLA.

The Future of AI Research: 20 Thesis Ideas for Undergraduate Students in Machine Learning and Deep Learning for 2023!

A comprehensive guide for crafting an original and innovative thesis in the field of AI.

By Aarafat Islam on 2023-01-11

“The beauty of machine learning is that it can be applied to any problem you want to solve, as long as you can provide the computer with enough examples.” — Andrew Ng

This article provides a list of 20 potential thesis ideas for an undergraduate program in machine learning and deep learning in 2023. Each thesis idea includes an introduction, which presents a brief overview of the topic and the research objectives. The ideas span different areas of machine learning and deep learning, such as computer vision, natural language processing, robotics, finance, drug discovery, and more. The article also includes explanations, examples, and conclusions for each thesis idea, which can help guide the research and provide a clear understanding of the potential contributions and outcomes of the proposed work. Finally, it emphasizes the importance of originality and of proper citation to avoid plagiarism.

1. Investigating the use of Generative Adversarial Networks (GANs) in medical imaging:  A deep learning approach to improve the accuracy of medical diagnoses.

Introduction:  Medical imaging is an important tool in the diagnosis and treatment of various medical conditions. However, accurately interpreting medical images can be challenging, especially for less experienced doctors. This thesis aims to explore the use of GANs in medical imaging, in order to improve the accuracy of medical diagnoses.

2. Exploring the use of deep learning in natural language generation (NLG): An analysis of the current state-of-the-art and future potential.

Introduction:  Natural language generation is an important field in natural language processing (NLP) that deals with creating human-like text automatically. Deep learning has shown promising results in NLP tasks such as machine translation, sentiment analysis, and question-answering. This thesis aims to explore the use of deep learning in NLG and analyze the current state-of-the-art models, as well as potential future developments.

3. Development and evaluation of deep reinforcement learning (RL) for robotic navigation and control.

Introduction:  Robotic navigation and control are challenging tasks, which require a high degree of intelligence and adaptability. Deep RL has shown promising results in various robotics tasks, such as robotic arm control, autonomous navigation, and manipulation. This thesis aims to develop and evaluate a deep RL-based approach for robotic navigation and control and evaluate its performance in various environments and tasks.

4. Investigating the use of deep learning for drug discovery and development.

Introduction:  Drug discovery and development is a time-consuming and expensive process, which often involves high failure rates. Deep learning has been used to improve various tasks in bioinformatics and biotechnology, such as protein structure prediction and gene expression analysis. This thesis aims to investigate the use of deep learning for drug discovery and development and examine its potential to improve the efficiency and accuracy of the drug development process.

5. Comparison of deep learning and traditional machine learning methods for anomaly detection in time series data.

Introduction:  Anomaly detection in time series data is a challenging task, which is important in various fields such as finance, healthcare, and manufacturing. Deep learning methods have been used to improve anomaly detection in time series data, while traditional machine learning methods have been widely used as well. This thesis aims to compare deep learning and traditional machine learning methods for anomaly detection in time series data and examine their respective strengths and weaknesses.
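As a concrete baseline on the "traditional" side of such a comparison, here is a minimal rolling z-score detector on a synthetic series. This is an illustrative choice of this author, not a method prescribed by the thesis idea; a real comparison would pit richer classical models against deep architectures.

```python
import numpy as np

def rolling_zscore_anomalies(x, window=30, thresh=4.0):
    # Flag points that deviate from the trailing-window mean by more
    # than `thresh` trailing standard deviations.
    flags = np.zeros(len(x), dtype=bool)
    for t in range(window, len(x)):
        past = x[t - window:t]
        mu, sd = past.mean(), past.std()
        if sd > 0 and abs(x[t] - mu) > thresh * sd:
            flags[t] = True
    return flags

rng = np.random.default_rng(2)
x = rng.normal(size=500)
x[400] += 10.0                        # inject one obvious spike
flags = rolling_zscore_anomalies(x)
```

A deep learning competitor (say, a forecasting model whose residuals are thresholded) would be evaluated on the same injected anomalies, making strengths and weaknesses directly comparable.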


6. Use of deep transfer learning in speech recognition and synthesis.

Introduction:  Speech recognition and synthesis are areas of natural language processing that focus on converting spoken language to text and vice versa. Transfer learning has been widely used in deep learning-based speech recognition and synthesis systems to improve their performance by reusing the features learned from other tasks. This thesis aims to investigate the use of transfer learning in speech recognition and synthesis and how it improves the performance of the system in comparison to traditional methods.

7. The use of deep learning for financial prediction.

Introduction:  Financial prediction is a challenging task that requires a high degree of intelligence and adaptability, especially in the field of stock market prediction. Deep learning has shown promising results in various financial prediction tasks, such as stock price prediction and credit risk analysis. This thesis aims to investigate the use of deep learning for financial prediction and examine its potential to improve the accuracy of financial forecasting.

8. Investigating the use of deep learning for computer vision in agriculture.

Introduction:  Computer vision has the potential to revolutionize the field of agriculture by improving crop monitoring, precision farming, and yield prediction. Deep learning has been used to improve various computer vision tasks, such as object detection, semantic segmentation, and image classification. This thesis aims to investigate the use of deep learning for computer vision in agriculture and examine its potential to improve the efficiency and accuracy of crop monitoring and precision farming.

9. Development and evaluation of deep learning models for generative design in engineering and architecture.

Introduction:  Generative design is a powerful tool in engineering and architecture that can help optimize designs and reduce human error. Deep learning has been used to improve various generative design tasks, such as design optimization and form generation. This thesis aims to develop and evaluate deep learning models for generative design in engineering and architecture and examine their potential to improve the efficiency and accuracy of the design process.

10. Investigating the use of deep learning for natural language understanding.

Introduction:  Natural language understanding is a complex task of natural language processing that involves extracting meaning from text. Deep learning has been used to improve various NLP tasks, such as machine translation, sentiment analysis, and question-answering. This thesis aims to investigate the use of deep learning for natural language understanding and examine its potential to improve the efficiency and accuracy of natural language understanding systems.

11. Comparing deep learning and traditional machine learning methods for image compression.

Introduction:  Image compression is an important task in image processing and computer vision. It enables faster data transmission and storage of image files. Deep learning methods have been used to improve image compression, while traditional machine learning methods have been widely used as well. This thesis aims to compare deep learning and traditional machine learning methods for image compression and examine their respective strengths and weaknesses.
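To ground the comparison, note that both classical codecs and learned compressors rely on a lossy quantization step. A minimal sketch of uniform scalar quantization (illustrative only; the function name is hypothetical):

```python
def quantize(values, levels, lo, hi):
    """Uniform scalar quantization: snap each value to the nearest of `levels`
    evenly spaced reconstruction points in [lo, hi]."""
    step = (hi - lo) / (levels - 1)
    return [lo + round((v - lo) / step) * step for v in values]
```

Learned compression replaces the fixed grid with a trained encoder/decoder, trading computation for rate-distortion performance.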

12. Using deep learning for sentiment analysis in social media.

Introduction:  Sentiment analysis in social media is an important task that can help businesses and organizations understand their customers’ opinions and feedback. Deep learning has been used to improve sentiment analysis in social media, by training models on large datasets of social media text. This thesis aims to use deep learning for sentiment analysis in social media, and evaluate its performance against traditional machine learning methods.
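To give a sense of the traditional baseline such an evaluation would include, here is a minimal lexicon-count polarity scorer (the function name and lexicons are hypothetical):

```python
def lexicon_sentiment(text, positive, negative):
    """Count-based polarity: (#positive tokens - #negative tokens) / #tokens."""
    tokens = text.lower().split()
    score = sum((t in positive) - (t in negative) for t in tokens)
    return score / len(tokens) if tokens else 0.0
```

Deep models learn these polarity cues (and their context, e.g. negation) from data instead of relying on a hand-built lexicon.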

13. Investigating the use of deep learning for image generation.

Introduction:  Image generation is a task in computer vision that involves creating new images from scratch or modifying existing images. Deep learning has been used to improve various image generation tasks, such as super-resolution, style transfer, and face generation. This thesis aims to investigate the use of deep learning for image generation and examine its potential to improve the quality and diversity of generated images.

14. Development and evaluation of deep learning models for anomaly detection in cybersecurity.

Introduction:  Anomaly detection in cybersecurity is an important task that can help detect and prevent cyber-attacks. Deep learning has been used to improve various anomaly detection tasks, such as intrusion detection and malware detection. This thesis aims to develop and evaluate deep learning models for anomaly detection in cybersecurity and examine their potential to improve the efficiency and accuracy of cybersecurity systems.

15. Investigating the use of deep learning for natural language summarization.

Introduction:  Natural language summarization is an important task in natural language processing that involves creating a condensed version of a text that preserves its main meaning. Deep learning has been used to improve various natural language summarization tasks, such as document summarization and headline generation. This thesis aims to investigate the use of deep learning for natural language summarization and examine its potential to improve the efficiency and accuracy of natural language summarization systems.

16. Development and evaluation of deep learning models for facial expression recognition.

Introduction:  Facial expression recognition is an important task in computer vision and has many practical applications, such as human-computer interaction, emotion recognition, and psychological studies. Deep learning has been used to improve facial expression recognition, by training models on large datasets of images. This thesis aims to develop and evaluate deep learning models for facial expression recognition and examine their performance against traditional machine learning methods.

17. Investigating the use of deep learning for generative models in music and audio.

Introduction:  Music and audio synthesis is an important task in audio processing, which has many practical applications, such as music generation and speech synthesis. Deep learning has been used to improve generative models for music and audio, by training models on large datasets of audio data. This thesis aims to investigate the use of deep learning for generative models in music and audio and examine its potential to improve the quality and diversity of generated audio.

18. Comparing deep learning models with traditional algorithms for anomaly detection in network traffic.

Introduction:  Anomaly detection in network traffic is an important task that can help detect and prevent cyber-attacks. Deep learning models have been used for this task, and traditional methods such as clustering and rule-based systems are widely used as well. This thesis aims to compare deep learning models with traditional algorithms for anomaly detection in network traffic and analyze the trade-offs between the models in terms of accuracy and scalability.

19. Investigating the use of deep learning for improving recommender systems.

Introduction:  Recommender systems are widely used in many applications such as online shopping, music streaming, and movie streaming. Deep learning has been used to improve the performance of recommender systems, by training models on large datasets of user-item interactions. This thesis aims to investigate the use of deep learning for improving recommender systems and compare its performance with traditional content-based and collaborative filtering approaches.
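For reference, the collaborative filtering baseline mentioned above can be sketched as a similarity-weighted average of other users' ratings. This is a simplified illustration (`predict_rating` is a hypothetical helper, and 0 encodes "not rated"):

```python
def cosine(u, v):
    """Cosine similarity between two rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def predict_rating(ratings, user, item):
    """User-based collaborative filtering: similarity-weighted average of the
    ratings other users gave this item."""
    num = den = 0.0
    for other, row in enumerate(ratings):
        if other == user or row[item] == 0:
            continue
        s = cosine(ratings[user], row)
        num += s * row[item]
        den += abs(s)
    return num / den if den else 0.0
```

Deep approaches replace the hand-chosen similarity with learned user/item embeddings trained end-to-end on the interaction data.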

20. Development and evaluation of deep learning models for multi-modal data analysis.

Introduction:  Multi-modal data analysis is the task of analyzing and understanding data from multiple sources such as text, images, and audio. Deep learning has been used to improve multi-modal data analysis, by training models on large datasets of multi-modal data. This thesis aims to develop and evaluate deep learning models for multi-modal data analysis and analyze their potential to improve performance in comparison to single-modal models.

I hope that this article has provided you with a useful guide for your thesis research in machine learning and deep learning. Remember to conduct a thorough literature review and to include proper citations in your work, as well as to be original in your research to avoid plagiarism. I wish you all the best of luck with your thesis and your research endeavors!

Technical University of Munich

  • Data Analytics and Machine Learning Group
  • TUM School of Computation, Information and Technology
  • Technical University of Munich

Open Topics

We offer multiple Bachelor's/Master's theses, Guided Research projects and IDPs in the area of data mining/machine learning. A non-exhaustive list of open topics is given below.

If you are interested in a thesis or a guided research project, please send your CV and transcript of records to Prof. Stephan Günnemann via email and we will arrange a meeting to talk about the potential topics.

Robustness of Large Language Models

Type: Master's Thesis

Prerequisites:

  • Strong knowledge in machine learning
  • Very good coding skills
  • Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
  • Knowledge about NLP and LLMs

Description:

The success of Large Language Models (LLMs) has precipitated their deployment across a diverse range of applications. With the integration of plugins enhancing their capabilities, it becomes imperative to ensure that the governing rules of these LLMs are foolproof and immune to circumvention. Recent studies have exposed significant vulnerabilities inherent to these models, underlining an urgent need for more rigorous research to fortify their resilience and reliability. A focus in this work will be the understanding of the working mechanisms of these attacks.

We are currently seeking students for the upcoming Summer Semester of 2024, so we welcome prompt applications. This project is in collaboration with  Google Research .

Contact: Tom Wollschläger

References:

  • Universal and Transferable Adversarial Attacks on Aligned Language Models
  • Attacking Large Language Models with Projected Gradient Descent
  • Representation Engineering: A Top-Down Approach to AI Transparency
  • Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks

Generative Models for Drug Discovery

Type: Master's Thesis / Guided Research

  • Strong machine learning knowledge
  • Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
  • Knowledge of graph neural networks (e.g. GCN, MPNN)
  • No formal education in chemistry, physics or biology needed!

Effectively designing molecular geometries is essential to advancing pharmaceutical innovations, a domain which has experienced great attention through the success of generative models. These models promise a more efficient exploration of the vast chemical space and generation of novel compounds with specific properties by leveraging their learned representations, potentially leading to the discovery of molecules with unique properties that would otherwise go undiscovered. Our topics lie at the intersection of generative models like diffusion/flow matching models and graph representation learning, e.g., graph neural networks. The focus of our projects can be model development with an emphasis on downstream tasks (e.g., diffusion guidance at inference time) and a better understanding of the limitations of existing models.

Contact: Johanna Sommer, Leon Hetzel

Equivariant Diffusion for Molecule Generation in 3D

Equivariant Flow Matching with Hybrid Probability Transport for 3D Molecule Generation

Structure-based Drug Design with Equivariant Diffusion Models

Efficient Machine Learning: Pruning, Quantization, Distillation, and More - DAML x Pruna AI

Type: Master's Thesis / Guided Research / Hiwi

The efficiency of machine learning algorithms is commonly evaluated in terms of target performance, speed and memory footprint. Reducing the costs associated with these metrics is of primary importance for real-world applications with limited resources (e.g. embedded systems, real-time predictions). In this project, you will work in collaboration with the DAML research group and the Pruna AI startup on investigating solutions to improve the efficiency of machine learning models using techniques such as pruning, quantization, distillation, and more.
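As a concrete taste of one of these techniques, magnitude pruning zeroes out the smallest weights. A minimal sketch (illustrative only, not the project's codebase):

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of a weight list.
    (If several weights tie at the threshold, all of them are pruned.)"""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]
```

Real pruning pipelines apply this per layer (or structured over channels) and usually fine-tune afterwards to recover accuracy.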

Contact: Bertrand Charpentier

  • The Efficiency Misnomer
  • A Gradient Flow Framework for Analyzing Network Pruning
  • Distilling the Knowledge in a Neural Network
  • A Survey of Quantization Methods for Efficient Neural Network Inference

Deep Generative Models

Type: Master's Thesis / Guided Research

  • Strong machine learning and probability theory knowledge
  • Knowledge of generative models and their basics (e.g., Normalizing Flows, Diffusion Models, VAE)
  • Optional: Neural ODEs/SDEs, Optimal Transport, Measure Theory

With recent advances, such as Diffusion Models, Transformers, Normalizing Flows, Flow Matching, etc., the field of generative models has gained significant attention in the machine learning and artificial intelligence research community. However, many problems and questions remain open, and the application to complex data domains such as graphs, time series, point processes, and sets is often non-trivial. We are interested in supervising motivated students to explore and extend the capabilities of state-of-the-art generative models for various data domains.
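As an example of the model families involved, the DDPM-style forward process admits a closed-form marginal q(x_t | x_0). A minimal sketch of its mean and standard deviation, assuming the standard variance-preserving parameterization (illustrative only):

```python
import math

def diffusion_marginal(x0, t, betas):
    """Mean and std of q(x_t | x_0) in a DDPM forward process:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise,
    with alpha_bar_t = prod_{s<=t} (1 - beta_s)."""
    alpha_bar = 1.0
    for beta in betas[:t]:
        alpha_bar *= (1.0 - beta)
    return math.sqrt(alpha_bar) * x0, math.sqrt(1.0 - alpha_bar)
```

As t grows the mean shrinks toward 0 and the std toward 1, i.e. the data is gradually replaced by Gaussian noise that the learned reverse process must undo.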

Contact: Marcel Kollovieh, David Lüdke

  • Flow Matching for Generative Modeling
  • Auto-Encoding Variational Bayes
  • Denoising Diffusion Probabilistic Models 
  • Structured Denoising Diffusion Models in Discrete State-Spaces

Graph Structure Learning

Type:  Guided Research / Hiwi

  • Optional: Knowledge of graph theory and mathematical optimization

Graph deep learning is a powerful ML concept that enables the generalisation of successful deep neural architectures to non-Euclidean structured data. Such methods have shown promising results in a vast range of applications spanning the social sciences, biomedicine, particle physics, computer vision, graphics and chemistry. One of the major limitations of most current graph neural network architectures is that they often rely on the assumption that the underlying graph is known and fixed. However, this assumption is not always true, as the graph may be noisy or partially and even completely unknown. In the case of noisy or partially available graphs, it would be useful to jointly learn an optimised graph structure and the corresponding graph representations for the downstream task. On the other hand, when the graph is completely absent, it would be useful to infer it directly from the data. This is particularly interesting in inductive settings where some of the nodes were not present at training time. Furthermore, learning a graph can become an end in itself, as the inferred structure can provide complementary insights with respect to the downstream task. In this project, we aim to investigate solutions and devise new methods to construct an optimal graph structure based on the available (unstructured) data.
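When the graph is completely absent, a common starting point is to infer one from feature similarity. A minimal sketch of k-nearest-neighbor graph construction (illustrative only; learned structure methods make this step differentiable):

```python
def knn_graph(points, k):
    """Connect each point to its k nearest neighbors (squared Euclidean
    distance) and return the resulting undirected edge list."""
    edges = set()
    for i, p in enumerate(points):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points) if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)
```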

Contact: Filippo Guerranti

  • A Survey on Graph Structure Learning: Progress and Opportunities
  • Differentiable Graph Module (DGM) for Graph Convolutional Networks
  • Learning Discrete Structures for Graph Neural Networks

NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification

A Machine Learning Perspective on Corner Cases in Autonomous Driving Perception  

Type: Master's Thesis 

Industrial partner: BMW 

Prerequisites: 

  • Strong knowledge in machine learning 
  • Knowledge of Semantic Segmentation  
  • Good programming skills 
  • Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch) 

Description: 

In autonomous driving, state-of-the-art deep neural networks are used for perception tasks such as semantic segmentation. While the environment in datasets is controlled, in real-world applications novel classes or unknown disturbances can occur. To provide safe autonomous driving, these cases must be identified.

The objective is to explore novel-class segmentation and out-of-distribution approaches for semantic segmentation in the context of corner cases for autonomous driving.
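One simple out-of-distribution baseline in this setting is thresholding the maximum softmax probability of each prediction. A minimal sketch (hypothetical, not tied to the industrial setup):

```python
def ood_flags(probs, threshold=0.5):
    """Flag predictions whose maximum class probability falls below
    `threshold` as potentially out-of-distribution."""
    return [max(p) < threshold for p in probs]
```

More elaborate approaches replace the softmax score with calibrated uncertainty estimates, as in the references below.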

Contact: Sebastian Schmidt

References: 

  • Segmenting Known Objects and Unseen Unknowns without Prior Knowledge 
  • Efficient Uncertainty Estimation for Semantic Segmentation in Videos  
  • Natural Posterior Network: Deep Bayesian Uncertainty for Exponential Family  
  • Description of Corner Cases in Automated Driving: Goals and Challenges 

Active Learning for Multi Agent 3D Object Detection 

Type: Master's Thesis

Industrial partner: BMW

  • Knowledge in Object Detection 
  • Excellent programming skills 

In autonomous driving, state-of-the-art deep neural networks are used for perception tasks such as 3D object detection. To provide promising results, these networks often require large amounts of complex annotated data for training. These annotations are often costly and redundant. Active learning is used to select the most informative samples for annotation and to cover a dataset with as little annotated data as possible.

The objective is to explore active learning approaches for 3D object detection using combined uncertainty- and diversity-based methods.
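The uncertainty side of such a selection strategy can be sketched with least-confidence sampling, which queries the samples whose top class probability is lowest (a minimal illustration, not the project's method):

```python
def least_confident(predictions, budget):
    """Return the indices of the `budget` samples whose maximum predicted
    class probability is lowest (most uncertain first)."""
    ranked = sorted(range(len(predictions)),
                    key=lambda i: max(predictions[i]))
    return ranked[:budget]
```

A diversity-based component would additionally spread the selected samples over feature space instead of only ranking by uncertainty.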

  • Exploring Diversity-based Active Learning for 3D Object Detection in Autonomous Driving   
  • Efficient Uncertainty Estimation for Semantic Segmentation in Videos   
  • KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection
  • Towards Open World Active Learning for 3D Object Detection   

Graph Neural Networks

Type:  Master's thesis / Bachelor's thesis / guided research

  • Knowledge of graph/network theory

Graph neural networks (GNNs) have recently achieved great successes in a wide variety of applications, such as chemistry, reinforcement learning, knowledge graphs, traffic networks, or computer vision. These models leverage graph data by updating node representations based on messages passed between nodes connected by edges, or by transforming node representation using spectral graph properties. These approaches are very effective, but many theoretical aspects of these models remain unclear and there are many possible extensions to improve GNNs and go beyond the nodes' direct neighbors and simple message aggregation.
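The message-passing update at the heart of these models can be sketched, stripped of learned weights and nonlinearities, as a mean aggregation over neighbors (an illustration, not a full GCN layer):

```python
def message_passing_step(adj, features):
    """One round of mean-neighbor aggregation: each node's new feature vector
    is the average of its neighbors' features (isolated nodes keep theirs)."""
    out = []
    for i, neigh in enumerate(adj):
        if not neigh:
            out.append(features[i])
            continue
        dim = len(features[i])
        out.append([sum(features[j][d] for j in neigh) / len(neigh)
                    for d in range(dim)])
    return out
```

A real GNN layer wraps this aggregation in learned linear maps and a nonlinearity, and stacks several such rounds; going "beyond direct neighbors" means rethinking exactly this step.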

Contact: Simon Geisler

  • Semi-supervised classification with graph convolutional networks
  • Relational inductive biases, deep learning, and graph networks
  • Diffusion Improves Graph Learning
  • Weisfeiler and leman go neural: Higher-order graph neural networks
  • Reliable Graph Neural Networks via Robust Aggregation

Physics-aware Graph Neural Networks

Type:  Master's thesis / guided research

  • Proficiency with Python and deep learning frameworks (JAX or PyTorch)
  • Knowledge of graph neural networks (e.g. GCN, MPNN, SchNet)
  • Optional: Knowledge of machine learning on molecules and quantum chemistry

Deep learning models, especially graph neural networks (GNNs), have recently achieved great successes in predicting quantum mechanical properties of molecules. There is a vast amount of applications for these models, such as finding the best method of chemical synthesis or selecting candidates for drugs, construction materials, batteries, or solar cells. However, GNNs have only been proposed in recent years and there remain many open questions about how to best represent and leverage quantum mechanical properties and methods.

Contact: Nicholas Gao

  • Directional Message Passing for Molecular Graphs
  • Neural message passing for quantum chemistry
  • Learning to Simulate Complex Physics with Graph Network
  • Ab initio solution of the many-electron Schrödinger equation with deep neural networks
  • Ab-Initio Potential Energy Surfaces by Pairing GNNs with Neural Wave Functions
  • Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds

Robustness Verification for Deep Classifiers

Type: Master's thesis / Guided research

  • Strong machine learning knowledge (at least equivalent to IN2064 plus an advanced course on deep learning)
  • Strong background in mathematical optimization (preferably combined with Machine Learning setting)
  • Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
  • (Preferred) Knowledge of training techniques to obtain classifiers that are robust against small perturbations in data

Description: Recent work shows that deep classifiers suffer in the presence of adversarial examples: misclassified points that are very close to the training samples or even visually indistinguishable from them. This undesired behaviour constrains the deployment of promising neural-network classifiers in safety-critical scenarios. Therefore, new training methods should be proposed that promote (or, preferably, ensure) robust behaviour of the classifier around training samples.
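A classic way to generate such adversarial examples is the Fast Gradient Sign Method (FGSM) from "Explaining and harnessing adversarial examples" in the references below. A minimal sketch of the perturbation step, assuming the loss gradient with respect to the input is given:

```python
def fgsm_perturb(x, grad, epsilon):
    """FGSM step: shift each input coordinate by epsilon in the direction of
    the sign of the loss gradient, producing a candidate adversarial input."""
    def sign(g):
        return (g > 0) - (g < 0)
    return [xi + epsilon * sign(gi) for xi, gi in zip(x, grad)]
```

Verification methods aim to certify that no such perturbation within an epsilon-ball can change the classifier's decision.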

Contact: Aleksei Kuvshinov

References (Background):

  • Intriguing properties of neural networks
  • Explaining and harnessing adversarial examples
  • SoK: Certified Robustness for Deep Neural Networks
  • Certified Adversarial Robustness via Randomized Smoothing
  • Formal guarantees on the robustness of a classifier against adversarial manipulation
  • Towards deep learning models resistant to adversarial attacks
  • Provable defenses against adversarial examples via the convex outer adversarial polytope
  • Certified defenses against adversarial examples
  • Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks

Uncertainty Estimation in Deep Learning

Type: Master's Thesis / Guided Research

  • Strong knowledge in probability theory

Safe prediction is a key feature of many intelligent systems. Classically, machine learning models compute output predictions regardless of the underlying uncertainty of the encountered situations. In contrast, aleatoric and epistemic uncertainty capture knowledge about noisy and uncommon situations, respectively. The uncertainty view can substantially help to detect and explain unsafe predictions, and therefore make ML systems more robust. The goal of this project is to improve uncertainty estimation in ML models across various types of tasks.
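One simple quantity used in this area is the entropy of the predictive distribution, which grows as the model becomes less decisive. A minimal sketch (illustrative only; the referenced methods estimate richer notions of uncertainty):

```python
import math

def predictive_entropy(probs):
    """Shannon entropy of a predictive class distribution (natural log).
    Higher values indicate a less confident prediction."""
    return -sum(p * math.log(p) for p in probs if p > 0)
```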

Contact: Tom Wollschläger, Dominik Fuchsgruber, Bertrand Charpentier

  • Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
  • Predictive Uncertainty Estimation via Prior Networks
  • Posterior Network: Uncertainty Estimation without OOD samples via Density-based Pseudo-Counts
  • Evidential Deep Learning to Quantify Classification Uncertainty
  • Weight Uncertainty in Neural Networks

Hierarchies in Deep Learning

Type:  Master's Thesis / Guided Research

Multi-scale structures are ubiquitous in real-life datasets. As an example, phylogenetic nomenclature naturally reveals a hierarchical classification of species based on their evolutionary history. Learning multi-scale structures can help to exhibit natural and meaningful organizations in the data and also to obtain compact data representations. The goal of this project is to leverage multi-scale structures to improve the speed, performance, and understanding of deep learning models.

Contact: Marcel Kollovieh , Bertrand Charpentier

  • Tree Sampling Divergence: An Information-Theoretic Metric for Hierarchical Graph Clustering
  • Hierarchical Graph Representation Learning with Differentiable Pooling
  • Gradient-based Hierarchical Clustering
  • Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space


Dissertations / Theses on the topic 'Machine Learning (ML)'

Consult the top 50 dissertations / theses for your research on the topic 'Machine Learning (ML).'

Holmberg, Lars. "Human In Command Machine Learning." Licentiate thesis, Malmö universitet, Malmö högskola, Institutionen för datavetenskap och medieteknik (DVMT), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-42576.

Nangalia, V. "ML-EWS - Machine Learning Early Warning System : the application of machine learning to predict in-hospital patient deterioration." Thesis, University College London (University of London), 2017. http://discovery.ucl.ac.uk/1565193/.

John, Meenu Mary. "Design Methods and Processes for ML/DL models." Licentiate thesis, Malmö universitet, Institutionen för datavetenskap och medieteknik (DVMT), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-45026.

Tabell, Johnsson Marco, and Ala Jafar. "Efficiency Comparison Between Curriculum Reinforcement Learning & Reinforcement Learning Using ML-Agents." Thesis, Blekinge Tekniska Högskola, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20218.

Mattsson, Fredrik, and Anton Gustafsson. "Optimize Ranking System With Machine Learning." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-37431.

Kakadost, Naser, and Charif Ramadan. "Empirisk undersökning av ML strategier vid prediktion av cykelflöden baserad på cykeldata och veckodagar." Thesis, Malmö universitet, Fakulteten för teknik och samhälle (TS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-20168.

Sammaritani, Gloria. "Google BigQuery ML. Analisi comparativa di un nuovo framework per il Machine Learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020.

Gustafsson, Sebastian. "Interpretable serious event forecasting using machine learning and SHAP." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-444363.

Schoenfeld, Brandon J. "Metalearning by Exploiting Granular Machine Learning Pipeline Metadata." BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8730.

Hellborg, Per. "Optimering av datamängder med Machine learning : En studie om Machine learning och Internet of Things." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-13747.

Nämerforslund, Tim. "Machine Learning Adversaries in Video Games : Using reinforcement learning in the Unity Engine to create compelling enemy characters." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-42746.

Mahfouz, Tarek Said. "Construction legal support for differing site conditions (DSC) through statistical modeling and machine learning (ML)." [Ames, Iowa : Iowa State University], 2009.

REPETTO, MARCO. "Black-box supervised learning and empirical assessment: new perspectives in credit risk modeling." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2023. https://hdl.handle.net/10281/402366.

Björkberg, David. "Comparison of cumulative reward with one, two and three layered artificial neural network in a simple environment when using ml-agents." Thesis, Blekinge Tekniska Högskola, Institutionen för datavetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-21188.

Ucci, Graziano. "The Interstellar Medium of Galaxies: a Machine Learning Approach." Doctoral thesis, Scuola Normale Superiore, 2019. http://hdl.handle.net/11384/85928.

Gilmore, Eugene M. "Learning Interpretable Decision Tree Classifiers with Human in the Loop Learning and Parallel Coordinates." Thesis, Griffith University, 2022. http://hdl.handle.net/10072/418633.

Sridhar, Sabarish. "SELECTION OF FEATURES FOR ML BASED COMMANDING OF AUTONOMOUS VEHICLES." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-287450.

Lagerkvist, Love. "Neural Novelty — How Machine Learning Does Interactive Generative Literature." Thesis, Malmö universitet, Fakulteten för kultur och samhälle (KS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-21222.

Bhogi, Keerthana. "Two New Applications of Tensors to Machine Learning for Wireless Communications." Thesis, Virginia Tech, 2021. http://hdl.handle.net/10919/104970.

Garg, Anushka. "Comparing Machine Learning Algorithms and Feature Selection Techniques to Predict Undesired Behavior in Business Processes and Study of Auto ML Frameworks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-285559.

Hanski, Jari, and Kaan Baris Biçak. "An Evaluation of the Unity Machine Learning Agents Toolkit in Dense and Sparse Reward Video Game Environments." Thesis, Uppsala universitet, Institutionen för speldesign, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-444982.

Stellmar, Justin. "Predicting the Deformation of 3D Printed ABS Plastic Using Machine Learning Regressions." Youngstown State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1587462911261523.

Krüger, Franz David, and Mohamad Nabeel. "Hyperparameter Tuning Using Genetic Algorithms : A study of genetic algorithms impact and performance for optimization of ML algorithms." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-42404.

Björkman, Desireé. "Machine Learning Evaluation of Natural Language to Computational Thinking : On the possibilities of coding without syntax." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-424269.

Lundin, Lowe. "Artificial Intelligence for Data Center Power Consumption Optimisation." Thesis, Uppsala universitet, Avdelningen för systemteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-447627.

Wessman, Filip. "Advanced Algorithms for Classification and Anomaly Detection on Log File Data : Comparative study of different Machine Learning Approaches." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-43175.

Narmack, Kirilll. "Dynamic Speed Adaptation for Curves using Machine Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-233545.

Berggren, Mathias, and Daniel Sonesson. "Design Optimization in Gas Turbines using Machine Learning : A study performed for Siemens Energy AB." Thesis, Linköpings universitet, Programvara och system, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-173920.

Giuliani, Luca. "Extending the Moving Targets Method for Injecting Constraints in Machine Learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/23885/.

Tarullo, Viviana. "Artificial Neural Networks for classification of EMG data in hand myoelectric control." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/19195/.

Wallner, Vanja. "Mapping medical expressions to MedDRA using Natural Language Processing." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-426916.

Hellberg, Johan, and Kasper Johansson. "Building Models for Prediction and Forecasting of Service Quality." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-295617.

Nardello, Matteo. "Low-Power Smart Devices for the IoT Revolution." Doctoral thesis, Università degli studi di Trento, 2020. http://hdl.handle.net/11572/274371.

Michelini, Mattia. "Barcode detection by neural networks on Android mobile platforms." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/21080/.

Lundström, Robin. "Machine Learning for Air Flow Characterization : An application of Theory-Guided Data Science for Air Flow characterization in an Industrial Foundry." Thesis, Karlstads universitet, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-72782.

Daneshvar, Saman. "User Modeling in Social Media: Gender and Age Detection." Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39535.

Rosell, Felicia. "Tracking a ball during bounce and roll using recurrent neural networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239733.

Hallberg, Jesper. "Searching for the charged Higgs boson in the tau nu analysis using Boosted Decision Trees." Thesis, Uppsala universitet, Högenergifysik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-301351.

Klingvall, Emelie. "Artificiell intelligens som ett beslutsstöd inom mammografi : En kvalitativ studie om radiologers perspektiv på icke-tekniska utmaningar" [Artificial intelligence as decision support in mammography: a qualitative study of radiologists' perspectives on non-technical challenges]. Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-18768.

Alatram, Ala'a A. M. "A forensic framework for detecting denial-of-service attacks in IoT networks using the MQTT protocol." Thesis, Edith Cowan University, Research Online, Perth, Western Australia, 2022. https://ro.ecu.edu.au/theses/2561.

Forssell, Melker, and Gustav Janér. "Product Matching Using Image Similarity." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-413481.

Bengtsson, Theodor, and Jonas Hägerlöf. "Stora mängder användardata för produktutveckling : Möjligheter och utmaningar vid integrering av stora mängder användardata i produktutvecklingsprocesser" [Large volumes of user data for product development: opportunities and challenges in integrating large volumes of user data into product development processes]. Thesis, KTH, Integrerad produktutveckling, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-297966.

Nordqvist, My. "Classify part of day and snow on the load of timber stacks : A comparative study between partitional clustering and competitive learning." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-42238.

Mele, Matteo. "Convolutional Neural Networks for the Classification of Olive Oil Geographical Origin." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2020.

Hjerpe, Adam. "Computing Random Forests Variable Importance Measures (VIM) on Mixed Numerical and Categorical Data." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-185496.

Ahlm, Kristoffer. "Identifikation av riskindikatorer i finansiell information med hjälp av AI/ML : Ökade möjligheter för myndigheter att förebygga ekonomisk brottslighet" [Identification of risk indicators in financial information using AI/ML: increased opportunities for authorities to prevent financial crime]. Thesis, Umeå universitet, Institutionen för matematik och matematisk statistik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-184818.

Zanghieri, Marcello. "sEMG-based hand gesture recognition with deep learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/18112/.

Benatti, Mattia. "Progettazione e Sviluppo di una Piattaforma Multi-Sorgente per l’Ottimizzazione dei Servizi di Emergenza" [Design and Development of a Multi-Source Platform for the Optimization of Emergency Services]. Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.

Sibelius Parmbäck, Sebastian. "HMMs and LSTMs for On-line Gesture Recognition on the Stylaero Board : Evaluating and Comparing Two Methods." Thesis, Linköpings universitet, Artificiell intelligens och integrerade datorsystem, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162237.

Eindhoven University of Technology research portal

Data Mining

  • Data Science
  • Data and Artificial Intelligence

Student theses

3D face reconstruction using deep learning.

Supervisor: Medeiros de Carvalho, R. (Supervisor 1), Gallucci, A. (Supervisor 2) & Vanschoren, J. (Supervisor 2)

Student thesis : Master

Achieving Long Term Fairness through Curiosity Driven Reinforcement Learning: How intrinsic motivation influences fairness in algorithmic decision making

Supervisor: Pechenizkiy, M. (Supervisor 1), Gajane, P. (Supervisor 2) & Kapodistria, S. (Supervisor 2)

Activity Recognition Using Deep Learning in Videos under Clinical Setting

Supervisor: Duivesteijn, W. (Supervisor 1), Papapetrou, O. (Supervisor 2), Zhang, L. (External person) (External coach) & Vasu, J. D. (External coach)

A Data Cleaning Assistant

Supervisor: Vanschoren, J. (Supervisor 1)

Student thesis : Bachelor

A Data Cleaning Assistant for Machine Learning

A deep learning approach for clustering a multi-class dataset.

Supervisor: Pei, Y. (Supervisor 1), Marczak, M. (External person) (External coach) & Groen, J. (External person) (External coach)

Aerial Imagery Pixel-level Segmentation

A framework for understanding business process remaining time predictions.

Supervisor: Pechenizkiy, M. (Supervisor 1) & Scheepens, R. J. (Supervisor 2)

A Hybrid Model for Pedestrian Motion Prediction

Supervisor: Pechenizkiy, M. (Supervisor 1), Muñoz Sánchez, M. (Supervisor 2), Silvas, E. (External coach) & Smit, R. M. B. (External coach)

Algorithms for center-based trajectory clustering

Supervisor: Buchin, K. (Supervisor 1) & Driemel, A. (Supervisor 2)

Allocation Decision-Making in Service Supply Chain with Deep Reinforcement Learning

Supervisor: Zhang, Y. (Supervisor 1), van Jaarsveld, W. L. (Supervisor 2), Menkovski, V. (Supervisor 2) & Lamghari-Idrissi, D. (Supervisor 2)

Analyzing Policy Gradient approaches towards Rapid Policy Transfer

An empirical study on dynamic curriculum learning in information retrieval.

Supervisor: Fang, M. (Supervisor 1)

An Explainable Approach to Multi-contextual Fake News Detection

Supervisor: Pechenizkiy, M. (Supervisor 1), Pei, Y. (Supervisor 2) & Das, B. (External person) (External coach)

An exploration and evaluation of concept based interpretability methods as a measure of representation quality in neural networks

Supervisor: Menkovski, V. (Supervisor 1) & Stolikj, M. (External coach)

Anomaly detection in image data sets using disentangled representations

Supervisor: Menkovski, V. (Supervisor 1) & Tonnaer, L. M. A. (Supervisor 2)

Anomaly Detection in Polysomnography signals using AI

Supervisor: Pechenizkiy, M. (Supervisor 1), Schwanz Dias, S. (Supervisor 2) & Belur Nagaraj, S. (External person) (External coach)

Anomaly detection in text data using deep generative models

Supervisor: Menkovski, V. (Supervisor 1) & van Ipenburg, W. (External person) (External coach)

Anomaly Detection on Dynamic Graph

Supervisor: Pei, Y. (Supervisor 1), Fang, M. (Supervisor 2) & Monemizadeh, M. (Supervisor 2)

Anomaly Detection on Finite Multivariate Time Series from Semi-Automated Screwing Applications

Supervisor: Pechenizkiy, M. (Supervisor 1) & Schwanz Dias, S. (Supervisor 2)

Anomaly Detection on Multivariate Time Series Using GANs

Supervisor: Pei, Y. (Supervisor 1) & Kruizinga, P. (External person) (External coach)

Anomaly detection on vibration data

Supervisor: Hess, S. (Supervisor 1), Pechenizkiy, M. (Supervisor 2), Yakovets, N. (Supervisor 2) & Uusitalo, J. (External person) (External coach)

Application of P&ID symbol detection and classification for generation of material take-off documents (MTOs)

Supervisor: Pechenizkiy, M. (Supervisor 1), Banotra, R. (External person) (External coach) & Ya-alimadad, M. (External person) (External coach)

Applications of deep generative models to Tokamak Nuclear Fusion

Supervisor: Koelman, J. M. V. A. (Supervisor 1), Menkovski, V. (Supervisor 2), Citrin, J. (Supervisor 2) & van de Plassche, K. L. (External coach)

A Similarity Based Meta-Learning Approach to Building Pipeline Portfolios for Automated Machine Learning

Aspect-based few-shot learning.

Supervisor: Menkovski, V. (Supervisor 1)

Assessing Bias and Fairness in Machine Learning through a Causal Lens

Supervisor: Pechenizkiy, M. (Supervisor 1)

Assessing fairness in anomaly detection: A framework for developing a context-aware fairness tool to assess rule-based models

Supervisor: Pechenizkiy, M. (Supervisor 1), Weerts, H. J. P. (Supervisor 2), van Ipenburg, W. (External person) (External coach) & Veldsink, J. W. (External person) (External coach)

A Study of an Open-Ended Strategy for Learning Complex Locomotion Skills

A systematic determination of metrics for classification tasks in OpenML, a universally applicable EMM framework.

Supervisor: Duivesteijn, W. (Supervisor 1), van Dongen, B. F. (Supervisor 2) & Yakovets, N. (Supervisor 2)

Automated machine learning with gradient boosting and meta-learning

Automated object recognition of solar panels in aerial photographs: a case study in the Liander service area.

Supervisor: Pechenizkiy, M. (Supervisor 1), Medeiros de Carvalho, R. (Supervisor 2) & Weelinck, T. (External person) (External coach)

Automatic data cleaning

Automatic scoring of short open-ended questions.

Supervisor: Pechenizkiy, M. (Supervisor 1) & van Gils, S. (External coach)

Automatic Synthesis of Machine Learning Pipelines consisting of Pre-Trained Models for Multimodal Data

Automating string encoding in AutoML, autoregressive neural networks to model electroencephalography signals.

Supervisor: Vanschoren, J. (Supervisor 1), Pfundtner, S. (External person) (External coach) & Radha, M. (External coach)

Balancing Efficiency and Fairness on Ride-Hailing Platforms via Reinforcement Learning

Supervisor: Tavakol, M. (Supervisor 1), Pechenizkiy, M. (Supervisor 2) & Boon, M. A. A. (Supervisor 2)

Benchmarking Audio DeepFake Detection

Better clustering evaluation for the OpenML evaluation engine.

Supervisor: Vanschoren, J. (Supervisor 1), Gijsbers, P. (Supervisor 2) & Singh, P. (Supervisor 2)

Bi-level pipeline optimization for scalable AutoML

Supervisor: Nobile, M. (Supervisor 1), Vanschoren, J. (Supervisor 1), Medeiros de Carvalho, R. (Supervisor 2) & Bliek, L. (Supervisor 2)

Block-sparse evolutionary training using weight momentum evolution: training methods for hardware efficient sparse neural networks

Supervisor: Mocanu, D. (Supervisor 1), Zhang, Y. (Supervisor 2) & Lowet, D. J. C. (External coach)

Boolean Matrix Factorization and Completion

Supervisor: Peharz, R. (Supervisor 1) & Hess, S. (Supervisor 2)

Bootstrap Hypothesis Tests for Evaluating Subgroup Descriptions in Exceptional Model Mining

Supervisor: Duivesteijn, W. (Supervisor 1) & Schouten, R. M. (Supervisor 2)

Bottom-Up Search: A Distance-Based Search Strategy for Supervised Local Pattern Mining on Multi-Dimensional Target Spaces

Supervisor: Duivesteijn, W. (Supervisor 1), Serebrenik, A. (Supervisor 2) & Kromwijk, T. J. (Supervisor 2)

Bridging the Domain-Gap in Computer Vision Tasks

Supervisor: Mocanu, D. C. (Supervisor 1) & Lowet, D. J. C. (External coach)

CCESO: Auditing AI Fairness By Comparing Counterfactual Explanations of Similar Objects

Supervisor: Pechenizkiy, M. (Supervisor 1) & Hoogland, K. (External person) (External coach)

Clean-Label Poison Attacks on Machine Learning

Supervisor: Michiels, W. P. A. J. (Supervisor 1), Schalij, F. D. (External coach) & Hess, S. (Supervisor 2)

Chapman University Digital Commons

Computational and Data Sciences (PhD) Dissertations

Below is a selection of dissertations from the Doctor of Philosophy in Computational and Data Sciences program in Schmid College that have been included in Chapman University Digital Commons. Additional dissertations from years prior to 2019 are available through the Leatherby Libraries' print collection or in ProQuest's Dissertations and Theses database.

Dissertations from 2023

Computational Analysis of Antibody Binding Mechanisms to the Omicron RBD of SARS-CoV-2 Spike Protein: Identification of Epitopes and Hotspots for Developing Effective Therapeutic Strategies , Mohammed Alshahrani

Integration of Computer Algebra Systems and Machine Learning in the Authoring of the SANYMS Intelligent Tutoring System , Sam Ford

Voluntary Action and Conscious Intention , Jake Gavenas

Random Variable Spaces: Mathematical Properties and an Extension to Programming Computable Functions , Mohammed Kurd-Misto

Computational Modeling of Superconductivity from the Set of Time-Dependent Ginzburg-Landau Equations for Advancements in Theory and Applications , Iris Mowgood

Application of Machine Learning Algorithms for Elucidation of Biological Networks from Time Series Gene Expression Data , Krupa Nagori

Stochastic Processes and Multi-Resolution Analysis: A Trigonometric Moment Problem Approach and an Analysis of the Expenditure Trends for Diabetic Patients , Isaac Nwi-Mozu

Applications of Causal Inference Methods for the Estimation of Effects of Bone Marrow Transplant and Prescription Drugs on Survival of Aplastic Anemia Patients , Yesha M. Patel

Causal Inference and Machine Learning Methods in Parkinson's Disease Data Analysis , Albert Pierce

Causal Inference Methods for Estimation of Survival and General Health Status Measures of Alzheimer’s Disease Patients , Ehsan Yaghmaei

Dissertations from 2022

Computational Approaches to Facilitate Automated Interchange between Music and Art , Rao Hamza Ali

Causal Inference in Psychology and Neuroscience: From Association to Causation , Dehua Liang

Advances in NLP Algorithms on Unstructured Medical Notes Data and Approaches to Handling Class Imbalance Issues , Hanna Lu

Novel Techniques for Quantifying Secondhand Smoke Diffusion into Children's Bedroom , Sunil Ramchandani

Probing the Boundaries of Human Agency , Sook Mun Wong

Dissertations from 2021

Predicting Eye Movement and Fixation Patterns on Scenic Images Using Machine Learning for Children with Autism Spectrum Disorder , Raymond Anden

Forecasting the Prices of Cryptocurrencies using a Novel Parameter Optimization of VARIMA Models , Alexander Barrett

Applications of Machine Learning to Facilitate Software Engineering and Scientific Computing , Natalie Best

Exploring Behaviors of Software Developers and Their Code Through Computational and Statistical Methods , Elia Eiroa Lledo

Assessing the Re-Identification Risk in ECG Datasets and an Application of Privacy Preserving Techniques in ECG Analysis , Arin Ghazarian

Multi-Modal Data Fusion, Image Segmentation, and Object Identification using Unsupervised Machine Learning: Conception, Validation, Applications, and a Basis for Multi-Modal Object Detection and Tracking , Nicholas LaHaye

Machine-Learning-Based Approach to Decoding Physiological and Neural Signals , Elnaz Lashgari

Learning-Based Modeling of Weather and Climate Events Related To El Niño Phenomenon via Differentiable Programming and Empirical Decompositions , Justin Le

Quantum State Estimation and Tracking for Superconducting Processors Using Machine Learning , Shiva Lotfallahzadeh Barzili

Novel Applications of Statistical and Machine Learning Methods to Analyze Trial-Level Data from Cognitive Measures , Chelsea Parlett

Optimal Analytical Methods for High Accuracy Cardiac Disease Classification and Treatment Based on ECG Data , Jianwei Zheng

Dissertations from 2020

Development of Integrated Machine Learning and Data Science Approaches for the Prediction of Cancer Mutation and Autonomous Drug Discovery of Anti-Cancer Therapeutic Agents , Steven Agajanian

Allocation of Public Resources: Bringing Order to Chaos , Lance Clifner

A Novel Correction for the Adjusted Box-Pierce Test — New Risk Factors for Emergency Department Return Visits within 72 hours for Children with Respiratory Conditions — General Pediatric Model for Understanding and Predicting Prolonged Length of Stay , Sidy Danioko

A Computational and Experimental Examination of the FCC Incentive Auction , Logan Gantner

Exploring the Employment Landscape for Individuals with Autism Spectrum Disorders using Supervised and Unsupervised Machine Learning , Kayleigh Hyde

Integrated Machine Learning and Bioinformatics Approaches for Prediction of Cancer-Driving Gene Mutations , Oluyemi Odeyemi

On Quantum Effects of Vector Potentials and Generalizations of Functional Analysis , Ismael L. Paiva

Long Term Ground Based Precipitation Data Analysis: Spatial and Temporal Variability , Luciano Rodriguez

Gaining Computational Insight into Psychological Data: Applications of Machine Learning with Eating Disorders and Autism Spectrum Disorder , Natalia Rosenfield

Connecting the Dots for People with Autism: A Data-driven Approach to Designing and Evaluating a Global Filter , Viseth Sean

Novel Statistical and Machine Learning Methods for the Forecasting and Analysis of Major League Baseball Player Performance , Christopher Watkins

Dissertations from 2019

Contributions to Variable Selection in Complexly Sampled Case-control Models, Epidemiology of 72-hour Emergency Department Readmission, and Out-of-site Migration Rate Estimation Using Pseudo-tagged Longitudinal Data , Kyle Anderson

Bias Reduction in Machine Learning Classifiers for Spatiotemporal Analysis of Coral Reefs using Remote Sensing Images , Justin J. Gapper

Estimating Auction Equilibria using Individual Evolutionary Learning , Kevin James

Employing Earth Observations and Artificial Intelligence to Address Key Global Environmental Challenges in Service of the SDGs , Wenzhao Li

Image Restoration using Automatic Damaged Regions Detection and Machine Learning-Based Inpainting Technique , Chloe Martin-King

Theses from 2017

Optimized Forecasting of Dominant U.S. Stock Market Equities Using Univariate and Multivariate Time Series Analysis Methods , Michael Schwartz

ISSN 2572-1496

Machine Learning & Data Science Foundations

Online Graduate Certificate

Be a Game Changer

Harness the power of big data with skills in machine learning and data science: your pathway to the AI workforce.

Organizations know how important data is, but they don’t always know what to do with the volume of data they have collected. That’s why Carnegie Mellon University designed the online Graduate Certificate in Machine Learning & Data Science Foundations: to teach technically savvy professionals how to leverage AI and machine learning technology to harness the power of large-scale data systems.

Computer-Science Based Data Analytics

When you enroll in this program, you will learn foundational skills in computer programming, machine learning, and data science that will allow you to leverage data science in various industries including business, education, environment, defense, policy and health care. This unique combination of expertise will give you the ability to turn raw data into usable information that you can apply within your organization.  

Throughout the coursework, you will:

  • Practice mathematical and computational concepts used in machine learning, including probability, linear algebra, multivariate differential calculus, algorithm analysis, and dynamic programming.
  • Learn how to approach and solve large-scale data science problems.
  • Acquire foundational skills in solution design, analytic algorithms, interactive analysis, and visualization techniques for data analysis.

An online Graduate Certificate in Machine Learning & Data Science from Carnegie Mellon will expand your possibilities and prepare you for the staggering amount of data generated by today’s rapidly changing world. 

A Powerful Certificate. Conveniently Offered. 

The online Graduate Certificate in Machine Learning & Data Science Foundations is offered 100% online to help computer science professionals conveniently fit the program into their busy day-to-day lives. In addition to a flexible, convenient format, you will experience the same rigorous coursework for which Carnegie Mellon University’s graduate programs are known. 

For Today’s Problem Solvers

This leading certificate program is best suited for:

  • Industry Professionals looking to deliver value to companies by acquiring in-demand data science, AI, and machine learning skills. After completing the program, participants will acquire the technical know-how to build machine learning models as well as the ability to analyze trends.
  • Recent computer science degree graduates seeking to expand their skill set and become even more marketable in a growing field. Over the past few years, data sets have grown tremendously. Today’s top companies need data science professionals who can leverage machine learning technology.   

At a Glance

Start Date May 2024

Application Deadlines Rolling Admissions

We are still accepting applications for a limited number of remaining spots to start in Summer 2024. Apply today to secure your space in the program.

Program Length 12 months

Program Format 100% online

Live-Online Schedule 1x per week for 90 minutes in the evening

Taught By School of Computer Science

Request Info

Questions? There are two ways to contact us: call 412-501-2686 or send an email to [email protected] with your inquiries.

Program Name Change

To better reflect the emphasis on machine learning in the curriculum, the name of this certificate has been updated from Computational Data Science Foundations to Machine Learning & Data Science Foundations.

Although the name has changed, the course content, faculty, online experience, admissions requirements, and everything else has remained the same. Questions about the name change? Please contact us.

Looking for information about CMU's on-campus Master of Computational Data Science degree? Visit the program's website to learn more.  Admissions consultations with our team will only cover the online certificate program.

A National Leader in Computer Science

Carnegie Mellon University is world renowned for its technology and computer science programs. Our courses are taught by leading researchers in the fields of Machine Learning, Language Technologies, and Human-Computer Interaction. 

Number One in the nation for our artificial intelligence programs.

Number One in the nation for our programming language courses.

Number Four in the nation for the caliber of our computer science programs.

Cornell Chronicle


Applied Machine Learning certificate prepares professionals for data science careers

By Justin Heitzman

Effective data analysis can make all the difference for a company’s efficiency. Advances in machine learning (ML) have helped data scientists harness the power of artificial intelligence (AI) and take their analysis to the next level. Brian D’Alessandro, head of data science for Instagram’s Well-Being and Integrity teams and author of Cornell’s  Applied Machine Learning and AI certificate program , has 20 years of experience in ML model construction. D’Alessandro spoke to the eCornell team about the certificate, ML instructional approaches and career paths. 

What can students expect in the program?

“Even when students have learned how to build a model, I would argue that they haven’t learned to build a good model yet. There’s an empirical process required in tuning model parameters such that you get the prediction quality that you’re looking for. So, in this certificate,  we ensure students learn how to build a strong, basic model . . . how to tune it and validate it, and then we start to introduce more complex algorithms that are more often used in industry settings. We cover K-nearest neighbors, decision trees and linear models – the full spectrum. Throughout the program, we emphasize the consistency of the API across different machine learning algorithms.”
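The "consistency of the API" the quote refers to can be sketched with scikit-learn, where K-nearest neighbors, decision trees and linear models all share the same fit/score interface. This is an illustrative example, not material from the certificate itself; the synthetic dataset and hyperparameters are arbitrary:

```python
# Three different algorithm families, one shared estimator interface.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

# Illustrative synthetic dataset (parameters chosen arbitrarily).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "linear model": LogisticRegression(max_iter=1000),
}

for name, model in models.items():
    model.fit(X_train, y_train)        # identical call for every algorithm
    acc = model.score(X_test, y_test)  # identical held-out evaluation call
    print(f"{name}: {acc:.2f}")
```

Swapping one algorithm for a more complex one changes a single line, which is what makes the empirical tune-and-validate loop the quote describes practical.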

Read the full story on the eCornell blog.


Graduate College

Mingying Zheng, "Using Data Preprocessing Techniques and Machine Learning Algorithms to Explore Predictors of Word Difficulty in English Language Assessment"

VC at FirstMark

Full Steam Ahead: The 2024 MAD (Machine Learning, AI & Data) Landscape

This is our tenth annual landscape and “state of the union” of the data, analytics, machine learning and AI ecosystem.

In 10+ years covering the space, things have never been as exciting and promising as they are today. All the trends and subtrends we described over the years are coalescing: data has been digitized, in massive amounts; it can be stored, processed and analyzed fast and cheaply with modern tools; and most importantly, it can be fed to ever more capable ML/AI models which can make sense of it, recognize patterns, make predictions based on it, and now generate text, code, images, sounds and videos.

The MAD (ML, AI & Data) ecosystem has gone from niche and technical, to mainstream .  The paradigm shift seems to be accelerating with implications that go far beyond technical or even business matters, and impact society, geopolitics and perhaps the human condition. 

There are still many chapters to write in the multi-decade megatrend, however.  As every year, this post is an attempt at making sense of where we are currently, across products, companies and industry trends. 

Here are the prior versions: 2012 , 2014 , 2016 , 2017 , 2018 , 2019 ( Part I and Part II ), 2020 , 2021 and 2023 ( Part I , Part II , Part III , Part IV ).

Our team this year was Aman Kabeer and Katie Mills (FirstMark), Jonathan Grana (Go Fractional) and Paolo Campos, major thanks to all.  And a big thank you as well to CB Insights for providing the card data appearing in the interactive version. 

This annual state of the union post is organized in three parts:

  • Part: I: The landscape (PDF, Interactive version)
  • Part II: 24 themes we’re thinking about in 2024
  • Part III: Financings, M&A and IPOs 

PART I:  THE LANDSCAPE

To see a PDF of the 2024 MAD Landscape in full resolution (please zoom!), please CLICK HERE

To access the interactive version of the 2024 MAD landscape, please CLICK HERE

Number of companies

The 2024 MAD landscape features 2,011 logos in total.

That number is up from 1,416 last year, with 578 new entrants to the map. 

For reference, the very first version in 2012 had just 139 logos.

The intensely (insanely?) crowded nature of the landscape primarily results from two back-to-back massive waves of company creation and funding.

The first wave was the 10-ish year long data infrastructure cycle , which started with Big Data and ended with the Modern Data Stack.  The long awaited consolidation in that space has not quite happened yet, and the vast majority of the companies are still around.  

The second wave is the ML/AI cycle , which started in earnest with Generative AI.  As we are in the early innings of this cycle, and most companies are very young, we have been liberal in including young startups (a good number of which are seed stage still) in the landscape. 

Note: those two waves are intimately related. A core idea of the MAD Landscape every year has been to show the symbiotic relationship between data infrastructure (on the left side); analytics/BI and ML/AI (in the middle) and applications (on the right side).  

While it gets harder every year to fit the ever-increasing number of companies on the landscape, ultimately the best way to think of the MAD space is as an assembly line – a full lifecycle of data from collection to storage to processing to delivering value through analytics or applications.

Two big waves + limited consolidation = lots of companies on the landscape.

Main changes in “Infrastructure” and “Analytics”

We’ve made very few changes to the overall structure of the left side of the landscape – as we’ll see below (Is the Modern Data Stack dead?), this part of the MAD landscape has seen a lot less heat lately.

Some noteworthy changes: we renamed “Database Abstraction” to “Multi-Model Databases & Abstractions” to capture the rising wave of all-in-one multi-model databases (SurrealDB*, EdgeDB); killed the “Crypto / Web 3 Analytics” section we experimentally created last year, which felt out of place in this landscape; and removed the “Query Engine” section, which felt more like part of another section than a standalone one (all the companies in that section still appear on the landscape – Dremio, Starburst, PrestoDB, etc.).

Main changes in “Machine Learning & Artificial Intelligence”

With the explosion of AI companies in 2023, this is where we found ourselves making by far the most structural changes.

  • “AI Observability” is a new category this year, with startups that help test, evaluate and monitor LLM applications 
  • “AI Developer Platforms” is close in concept to MLOps but we wanted to  recognize the wave of platforms that are wholly focused on AI application development, in particular around LLM training, deployment and inference
  • “AI Safety & Security” includes companies addressing concerns innate to LLMs, from hallucination to ethics, regulatory compliance, etc
  • If the very public beef between Sam Altman and Elon Musk has told us anything, it’s that the distinction between commercial and nonprofit is a critical one when it comes to foundational model developers. As such, we have split what was previously “Horizontal AI/AGI” into two categories: “Commercial AI Research” and “Nonprofit AI Research”
  • The final change we made was another nomenclature one, where we amended “GPU Cloud” to reflect the addition of core infrastructure feature sets made by many of the GPU Cloud providers: “GPU Cloud / ML Infra”

Main changes in “Applications”

  • The biggest update here is that…to absolutely no one’s surprise…every application-layer company is now a self-proclaimed “AI company” – which, as much as we tried to filter, drove the explosion of new logos you see on the right side of the MAD landscape this year
  • In “Horizontal Applications”, we added a “Presentation & Design” category
  • We renamed “Search” to “Search / Conversational AI” to reflect the rise of LLM-powered chat-based interfaces such as Perplexity.
  • In “Industry”, we rebranded “Gov’t & Intelligence” to “Aerospace, Defense & Gov’t” 

Main changes in “Open Source Infrastructure”

  • We merged categories that have always been close, creating a single “Data Management” category that spans both “Data Access” and “Data Ops”
  • We added an important new category, “Local AI”, as builders sought to provide the infrastructure tooling to run AI & LLMs locally

PART II: 24 THEMES WE’RE THINKING ABOUT IN 2024

Things in AI are moving so fast, and getting so much coverage, that it is almost impossible to provide a fully comprehensive “state of the union” of the MAD space, as we did in prior years.

So here’s a different format: in no particular order, here are 24 themes that are top of mind and/or come up frequently in conversations.  Some are fairly fleshed out thoughts, others largely just questions or thought experiments.

  • Structured vs unstructured data

This is partly a theme, partly something we find ourselves mentioning a lot in conversations to help explain the current trends.  

So, perhaps as an introduction to this 2024 discussion, here’s one important reminder upfront, which explains some of the key industry trends.  Not all data is the same.  At the risk of grossly over-simplifying, there are two main families of data, and around each family, a set of tools and use cases has emerged. 

  • Structured data pipelines: the world of data that fits into rows and columns.  For analytical purposes, data gets extracted from transactional databases and SaaS tools, stored in cloud data warehouses (like Snowflake), transformed, and analyzed and visualized using Business Intelligence (BI) tools, mostly for purposes of understanding the present and the past (what’s known as “descriptive analytics”).  That assembly line is often enabled by the Modern Data Stack discussed below, with analytics as the core use case.
  • In addition, structured data can also get fed into “traditional” ML/AI models for purposes of predicting the future (predictive analytics) – for example, which customers are most likely to churn
  • Unstructured data pipelines: the world of data that typically doesn’t fit into rows and columns, such as text, images, audio and video.  Unstructured data is largely what gets fed into Generative AI models (LLMs, etc), both to train and use (inference) them.
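To make the structured-data assembly line concrete, here is a minimal sketch of descriptive analytics on structured data, using Python's built-in sqlite3 as a stand-in for a cloud data warehouse; the table and figures are made up for illustration:

```python
import sqlite3

# In-memory database stands in for a cloud data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, month TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("acme", "2024-01", 120.0),
        ("acme", "2024-02", 80.0),
        ("globex", "2024-01", 200.0),
    ],
)

# Descriptive analytics: understanding the present and the past.
rows = conn.execute(
    "SELECT month, SUM(revenue) FROM orders GROUP BY month ORDER BY month"
).fetchall()
print(rows)  # → [('2024-01', 320.0), ('2024-02', 80.0)]
```

The whole Modern Data Stack is, in essence, tooling built around this extract-store-transform-query loop at enterprise scale.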

Those two families of data (and the related tools and companies) are experiencing very different fortunes and levels of attention right now. 

Unstructured data (ML/AI) is hot; structured data (Modern Data Stack, etc) is not. 

  • Is the Modern Data Stack dead?

Not that long ago (call it 2019-2021), there wasn’t anything sexier in the software world than the Modern Data Stack (MDS).  Alongside “Big Data”, it was one of the rare infrastructure concepts to have crossed over from data engineers to a broader audience (execs, journalists, bankers).

The Modern Data Stack basically covered the kind of structured data pipeline mentioned above. It gravitated around the fast-growing cloud data warehouses, with vendors positioned upstream from it (like Fivetran and Airbyte), on top of it (DBT) and downstream from it (Looker, Mode).  

As Snowflake emerged as the biggest software IPO ever, interest in the MDS exploded, with rabid, ZIRP-fueled company creation and VC funding.  Entire categories became overcrowded within a year or two – data catalogs, data observability, ETL, reverse ETL, to name a few.   

A real solution to a real problem, the Modern Data Stack was also a marketing concept and a de-facto alliance amongst a number of startups across the value chain of data.  

Fast forward to today, and the situation is very different.  In 2023, we previewed that the MDS was “under pressure”, and that pressure has only continued to intensify in 2024.

The MDS is facing two key issues: 

  • Putting together a Modern Data Stack requires stitching together various best-of-breed solutions from multiple independent vendors.  As a result, it’s costly in terms of money, time and resources.  This is not looked upon favorably by the CFO office in a post-ZIRP era of budget cuts
  • The MDS is no longer the cool kid on the block.  Generative AI has stolen all the attention from execs, VCs and the press – and it requires the kind of unstructured data pipelines we mentioned above. 

Watch: MAD Podcast with Tristan Handy, CEO, dbt Labs ( Apple , Spotify )

  • Consolidation in data infra, and the big getting bigger 

Given the above, what happens next in data infra and analytics in 2024?

It may look something like this:

  • Many startups in and around the Modern Data Stack will aggressively reposition as “AI infra startups” and try to find a spot in the Modern AI Stack (see below). This will work in some cases, but going from structured to unstructured data may require a fundamental product evolution in most cases.
  • The data infra industry will finally see some consolidation.  M&A has been fairly limited to date, but some acquisitions did happen in 2023, whether tuck-ins or medium-size acquisitions – including Stemma (acquired by Teradata), Manta (acquired by IBM), Mode (acquired by Thoughtspot), etc (see PART III below)
  • There will be a lot more startup failure – with VC funding drying up, things have gotten tough. Many startups have cut costs dramatically, but at some point their cash runway will end.  Don’t expect to see flashy headlines, but this will (sadly) happen.
  • The bigger companies in the space, whether scale-ups or public companies, will double down on their platform play and push hard to cover ever more functionality.  Some of it will be through acquisitions (hence the consolidation) but a lot of it will also be through homegrown development. 
  • Checking in on Databricks vs Snowflake

Speaking of big companies in the space, let’s check in on the “titanic shock” (see our MAD 2021 blog post) between the two key data infra players, Snowflake and Databricks. 

Snowflake (which historically comes from the structured data pipeline world) remains an incredible company, and one of the highest valued public tech stocks (14.8x EV/NTM revenue as of the time of writing).  However, much like a lot of the software industry, its growth has dramatically slowed down – it finished fiscal 2024 with 38% year-over-year product revenue growth, totaling $2.67 billion, and is projecting 22% NTM revenue growth as of the time of writing.  Perhaps most importantly, Snowflake gives the impression of a company under pressure on the product front – it’s been slower to embrace AI, and comparatively less acquisitive. The recent, and somewhat abrupt, CEO transition is another interesting data point.

Databricks (which historically comes from the unstructured data pipeline and machine learning world) is experiencing all-around strong momentum, reportedly (as it’s still a private company) closing FY’24 with $1.6B in revenue at 50%+ growth.  Importantly, Databricks is emerging as a key Generative AI player, both through acquisitions (most notably, MosaicML for $1.3B) and homegrown product development – first and foremost as a key repository for the kind of unstructured data that feeds LLMs, but also as a creator of models, from Dolly to DBRX, a new generative AI model the company announced at the time of writing.

The major new evolution in the Snowflake vs Databricks rivalry is the launch of Microsoft Fabric.  Announced in May 2023, it’s an end-to-end, cloud-based SaaS platform for data and analytics.  It integrates a lot of Microsoft products, including OneLake (open lakehouse), PowerBI and Synapse Data Science, and covers basically all data and analytics workflows, from data integration and engineering to data science.  As always for large company product launches, there’s a gap between the announcement and the reality of the product, but combined with Microsoft’s major push in Generative AI, this could become a formidable threat (as an additional twist to the story, Databricks largely sits on top of Azure).

  • BI in 2024, and Is Generative AI about to transform data analytics?

Of all parts of the Modern Data Stack and structured data pipelines world, the category that has felt the most ripe for reinvention is Business Intelligence.  We highlighted in the 2019 MAD how the BI industry had almost entirely consolidated, and talked about the emergence of metrics stores in the 2021 MAD.

The transformation of BI/analytics has been slower than we’d have expected.  The industry remains largely dominated by older products – Microsoft’s PowerBI, Salesforce’s Tableau and Google’s Looker – which sometimes get bundled in for free in broader sales contracts. Some more consolidation happened (Thoughtspot acquired Mode; Sisu was quietly acquired by Snowflake).  Some young companies are taking innovative approaches, whether scale-ups (see dbt and their semantic layer/MetricFlow) or startups (see Trace* and their metrics tree), but they’re generally early in the journey.

In addition to potentially playing a powerful role in data extraction and transformation, Generative AI could have a profound impact in terms of superpowering and democratizing data analytics.

There’s certainly been a lot of activity.  OpenAI launched Code Interpreter, later renamed to Advanced Data Analysis.  Microsoft launched a Copilot AI chatbot for finance workers in Excel.  Across cloud vendors, Databricks, Snowflake, open source and a substantial group of startups, a lot of people are working on or have released “text to SQL” products, to help run queries against databases using natural language.

The promise is both exciting and potentially disruptive.  The holy grail of data analytics has been its democratization.  Natural language, if it were to become the interface to notebooks, databases and BI tools, would enable a much broader group of people to do analysis.  

Many people in the BI industry are skeptical, however.  The precision of SQL and the nuances of understanding the business context behind a query are considered big obstacles to automation.
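As a sketch of what these “text to SQL” products do, here is a toy version in Python; the schema, prompt wording and read-only check are all illustrative assumptions, not any vendor's actual API, and the LLM call itself is deliberately left out:

```python
# A toy "text to SQL" flow. The schema, prompt wording and safety check are
# illustrative assumptions; the LLM call itself is deliberately left out.
SCHEMA = "orders(customer TEXT, month TEXT, revenue REAL)"

def build_prompt(question: str) -> str:
    # Grounding the model in the actual schema is what makes text-to-SQL
    # workable; capturing the business context behind the question is the
    # harder, unsolved part.
    return (
        f"Given the table {SCHEMA}, write a single SQL SELECT statement "
        f"answering: {question}"
    )

def is_safe(sql: str) -> bool:
    # A minimal guardrail: only accept read-only queries from the model.
    return sql.strip().lower().startswith("select")

prompt = build_prompt("Which month had the highest revenue?")
print(prompt)
print(is_safe("SELECT month FROM orders ORDER BY revenue DESC LIMIT 1"))  # True
```

The real difficulty, as the skeptics point out, is not generating syntactically valid SQL but knowing which of several plausible queries actually answers the business question.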

  • The Rise of the Modern AI Stack

A lot of what we’ve discussed so far had to do with the world of structured data pipelines.

As mentioned, the world of unstructured data infrastructure is experiencing a very different moment.  Unstructured data is what feeds LLMs, and there’s rabid demand for it.  Every company that’s experimenting or deploying Generative AI is rediscovering the old cliche: “data is the new oil”.  Everyone wants the power of LLMs, but trained on their (enterprise) data. 

Companies big and small have been rushing into the opportunity to provide the infrastructure of Generative AI. 

Several AI scale-ups have been aggressively evolving their offerings to capitalize on market momentum – everyone from Databricks (see above) to Scale AI (which evolved their labeling infrastructure, originally developed for the self-driving car market, to partner as an enterprise data pipeline with OpenAI and others) to Dataiku* (which launched their LLM Mesh to enable Global 2000 companies to seamlessly work across multiple LLM vendors and models). 

Meanwhile a new generation of AI infra startups is emerging, across a number of domains, including: 

  • Vector databases, which store data in a format (vector embeddings) that Generative AI models can consume.  Specialized vendors (Pinecone, Weaviate, Chroma, Qdrant, etc) have had a banner year, but some incumbent database players (MongoDB) were also quick to react and add vector search capabilities. There’s also an ongoing debate about whether longer context windows will obviate the need for vector databases altogether, with strong opinions on both sides of the argument.
  • Frameworks (LlamaIndex, Langchain etc), which connect and orchestrate all the moving pieces 
  • Guardrails , which sit between an LLM and users and make sure the model provides outputs that follow the organization’s rules.
  • Evaluators, which help test, analyze and monitor Generative AI model performance – a hard problem, as demonstrated by the general distrust in public benchmarks
  • Routers , which help direct user queries across different models in real time, to optimize performance, cost and user experience
  • Cost guards , which help monitor the costs of using LLMs
  • Endpoints , effectively APIs that abstract away the complexities of underlying infrastructure (like models)
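To make the vector database category concrete, here is a minimal sketch of similarity search over embeddings in pure Python; the 3-dimensional vectors are made up for illustration (real embedding models produce hundreds or thousands of dimensions), and this is not any vendor's API:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "vector store": document -> embedding. The 3-d vectors are made up;
# real embedding models output hundreds or thousands of dimensions.
store = {
    "invoice policy": [0.9, 0.1, 0.0],
    "vacation policy": [0.1, 0.9, 0.0],
    "api reference": [0.0, 0.2, 0.9],
}

def nearest(query_vec, k=1):
    # Rank stored documents by similarity to the query embedding.
    ranked = sorted(store, key=lambda doc: cosine(store[doc], query_vec), reverse=True)
    return ranked[:k]

print(nearest([0.85, 0.15, 0.05]))  # → ['invoice policy']
```

What the specialized vendors add on top of this core idea is indexing that makes the search fast at scale of billions of vectors, plus filtering, replication and the rest of a production database.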

We’ve been resisting using the term “Modern AI Stack”, given the history of the Modern Data Stack.

But the expression captures the many parallels: many of those startups are the “hot companies” of the day, just like MDS companies before them, and they tend to travel in packs, forging marketing alliances and product partnerships.

And this new generation of AI infra startups is going to face some of the same challenges as MDS companies before them: are any of those categories big enough to build a multi-billion dollar company? Which parts will big companies (mostly cloud providers, but also Databricks and Snowflake) end up building themselves?

WATCH – we have featured many emerging Modern AI Stack startups on the MAD Podcast:

  • MAD Podcast with Edo Liberty, CEO, Pinecone ( Apple , Spotify )
  • MAD Podcast with Jeff Huber, CEO, Chroma ( Apple , Spotify )
  • MAD Podcast with Bob van Luijt, Weaviate ( Apple , Spotify )
  • MAD Podcast with Shreya Rajpal, CEO, Guardrails AI ( Apple , Spotify )
  • MAD Podcast with Jerry Liu, CEO, Llama Index ( Apple , Spotify )
  • MAD Podcast with Sharon Zhou, CEO, Lamini  ( Apple , Spotify )
  • MAD Podcast with Dylan Fox, CEO, Assembly AI ( Apple , Spotify )
  • Where are we in the AI hype cycle?

AI has a multi-decade history of summers and winters.  Just in the last 10-12 years, this is the third AI hype cycle we’ve experienced: there was one in 2013-2015 after deep learning came to the limelight post ImageNet 2012; another one sometime around 2017-2018 during the chatbot boom and the rise of TensorFlow; and now since November 2022 with Generative AI.

This hype cycle has been particularly intense, to the point of feeling like an AI bubble, for a number of reasons:  the technology is incredibly impressive; it is very visceral and crossed over to a broad audience beyond tech circles; and for VCs sitting on a lot of dry powder, it’s been the only game in town as just about everything else in technology has been depressed.

Hype has brought all the usual benefits (“nothing great has ever been achieved without irrational exuberance”, a “let a thousand flowers bloom” phase with lots of money available for ambitious projects) and noise (everyone is an AI expert overnight, every startup is an AI startup, too many AI conferences/podcasts/newsletters… and dare we say, too many AI market maps???).

The main issue of any hype cycle is the inevitable blowback.

There’s a fair amount of “quirkiness” and risk built into this market phase: the poster-child company for the space has a very unusual legal and governance structure; there are a lot of “compute for equity” deals happening (with potential round-tripping) that are not fully understood or disclosed; a lot of top startups are run by teams of AI researchers; and a lot of VC dealmaking is reminiscent of the ZIRP times: “land grabs”, big rounds and eye-watering valuations for very young companies.

There certainly have been cracks in AI hype (see below), but we’re still in a phase where every week a new thing blows everyone’s minds. And news like the reported $40B Saudi Arabia AI fund seem to indicate that money flows into the space are not going to stop anytime soon. 

  • Experiments vs reality: was 2023 a headfake? 

Related to the above – given the hype, how much has been real so far, vs merely experimental?

2023 was an action packed year: a) every tech vendor rushed to include Generative AI in their product offering, b) every Global 2000 board mandated their teams to “do AI”, and some enterprise deployments happened at record speed, including at companies in regulated industries like Morgan Stanley and Citibank, and c) of course, consumers showed rabid interest in Generative AI apps.

As a result, 2023 was a year of big wins: OpenAI reached $2B in annual run rate; Anthropic grew at a pace that allowed it to forecast $850M in revenues for 2024; Midjourney grew to $200M in revenue with no investment and a team of 40; Perplexity AI went from 0 to 10 million monthly active users, etc.  

Should we be cynical? Some concerns:

  • In the enterprise, a lot of the spend was on proofs of concept, or easy wins, often coming out of innovation budgets.
  • How much was driven by executives wanting to not appear flat-footed, vs solving actual business problems?
  • In consumer, AI apps show high churn. How much was it mere curiosity? 
  • Both in their personal and professional lives, many report not being entirely sure what to do with Generative AI apps and products 
  • Not all Generative AI products, even those built by the best AI minds, are going to be magical: should we view Inflection AI’s decision to fold quickly, after raising $1.3B, as an admission that the world doesn’t need yet another AI chatbot, or even LLM provider?
  • LLM companies: maybe not so commoditized after all?

Billions of venture capital and corporate money are being invested in foundational model companies.

Hence everyone’s favorite question in the last 18 months: are we witnessing a phenomenal incineration of capital into ultimately commoditized products? Or are those LLM providers the new AWS, Azure and GCP?

A troubling fact (for the companies involved) is that no LLM seems to be building a durable performance advantage.  At the time of writing, Claude 3 Sonnet and Gemini Pro 1.5 perform better than GPT-4, which performs better than Gemini 1.0 Ultra, and so on and so forth – but this seems to change every few weeks.  Performance can also fluctuate – ChatGPT at some point “lost its mind” and “got lazy”, temporarily.

In addition, open source models (Llama 3, Mistral and others like DBRX) are quickly catching up in terms of performance.

Separately – there are a lot more LLM providers on the market than it may have appeared at first. A couple of years ago, the prevailing narrative was that there could be only one or two LLM companies, with a winner-take-all dynamic – in part because there was a tiny number of people around the world with the necessary expertise to scale Transformers.

It turns out there are more capable teams than first anticipated.  Beyond OpenAI and Anthropic, there are a number of startups doing foundational AI work – Mistral, Cohere, Adept, AI21, Imbue, 01.AI to name a few – and then of course the teams at Google, Meta, etc.

Having said that – so far the LLM providers seem to be doing just fine. OpenAI and Anthropic revenues are growing at extraordinary rates, thank you very much. Even if the LLM models do get commoditized, the LLM companies still have an immense business opportunity in front of them.  They’ve already become “full stack” companies, offering applications and tooling to multiple audiences (consumer, enterprise, developers), on top of the underlying models.

Perhaps the analogy with cloud vendors is indeed pretty apt.  AWS, Azure and GCP attract and retain customers through an application/tooling layer and monetize through a compute/storage layer that is largely undifferentiated.

Breaking: Anthropic working on its most powerful model yet, Jean-Claude pic.twitter.com/geJFls6yHs — Matt Turck (@mattturck) March 4, 2024
  • MAD Podcast with Ori Goshen, co-founder, AI21 Labs
  • MAD Podcast with Kanjun Qiu, CEO, Imbue
  • LLMs, SLMs and a hybrid future

For all the excitement about Large Language Models, one clear trend of the last few months has been the acceleration of small language models (SLMs), such as Llama-2-13b from Meta, Mistral-7b and Mixtral 8x7b from Mistral, and Phi-2 and Orca-2 from Microsoft.

While the LLMs are getting ever bigger (GPT-3 reportedly having 175 billion parameters, GPT-4 reportedly having 1.7 trillion, and the world waiting for an even more massive GPT-5), SLMs are becoming a strong alternative for many use cases, as they are cheaper to operate, easier to finetune, and often offer strong performance.

Another trend accelerating is the rise of specialized models, focused on specific tasks like coding (Code-Llama, Poolside AI) or industries (e.g. Bloomberg’s finance model, or startups like Orbital Materials building models for material sciences, etc).

As we are already seeing across a number of enterprise deployments, the world is quickly evolving towards hybrid architectures, combining multiple models.

Although prices have been going down (see below), big proprietary LLMs are still very expensive and suffer from latency problems, so users/customers will increasingly be deploying combinations of models, big and small, commercial and open source, general and specialized, to meet their specific needs and cost constraints.

Watch: MAD Podcast with Eiso Kant, CTO, Poolside AI   ( Apple , Spotify )

  • Is traditional AI dead?

A funny thing happened with the launch of ChatGPT: much of the AI that had been deployed up until then got labeled overnight as “Traditional AI”, in contrast to “Generative AI”.

This was a little bit of a shock to many AI practitioners and companies that up until then were considered to be doing leading-edge work, as the term “traditional” clearly suggests an impending wholesale replacement of all forms of AI by the new thing.  

The reality is a lot more nuanced.  Traditional AI and Generative AI are ultimately very complementary as they tackle different types of data and use cases . 

What is now labeled as “traditional AI”, or occasionally as “predictive AI” or “tabular AI”, is also very much part of modern AI (deep learning based).  However, it generally focuses on structured data (see above), and problems such as recommendations, churn prediction, pricing optimization, inventory management.  “Traditional AI” has experienced tremendous adoption in the last decade, and it’s already deployed at scale in production in thousands of companies around the world. 

In contrast, Generative AI largely operates on unstructured data (text, images, videos, etc.). It is exceptionally good at a different class of problems (code generation, image generation, search, etc).

Here as well, the future is hybrid: companies will use LLMs for certain tasks, predictive models for other tasks.  Most importantly, they will often combine them – LLMs may not be great at providing a precise prediction, like a churn forecast, but you could use an LLM that calls on the output of another model which is focused on providing that prediction, and vice versa.
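The hybrid pattern can be sketched as follows; both the churn model and the “LLM” here are trivial stubs with made-up scoring rules, meant only to show the division of labor between a predictive model and a language interface:

```python
def churn_model(customer: dict) -> float:
    # Stand-in for a "traditional AI" predictive model trained on structured
    # data; the scoring rules here are made up for illustration.
    score = 0.2
    if customer["logins_last_30d"] < 3:
        score += 0.5
    if customer["open_tickets"] > 2:
        score += 0.2
    return min(score, 1.0)

def llm_answer(question: str, customer: dict) -> str:
    # Stand-in for an LLM with tool access: it delegates the precise
    # prediction to the churn model, then phrases the result for the user.
    p = churn_model(customer)
    risk = "high" if p > 0.5 else "low"
    return f"Churn risk for this customer is {risk} ({p:.0%})."

print(llm_answer("Will this customer churn?", {"logins_last_30d": 1, "open_tickets": 0}))
# → Churn risk for this customer is high (70%).
```

In a real deployment the stubbed functions would be a production model serving endpoint and an LLM with function-calling, but the division of labor is the same.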

  • Thin wrappers, thick wrappers and the race to be full stack

“Thin wrappers” was the dismissive term everyone loved to use in 2023.  It’s hard to build long-lasting value and differentiation if your core capabilities are provided by someone else’s technology (like OpenAI), the argument goes. And reports a few months ago that startups like Jasper were running into difficulties, after experiencing a meteoric revenue rise, seem to corroborate that line of thinking.

The interesting question is what happens over time, as young startups build more functionality. Do thin wrappers become thick wrappers ?

In 2024, it feels like thick wrappers have a path towards differentiation by:

  • Focusing on a specific problem, often vertical – as anything too horizontal runs the risk of being in the “kill zone” of Big Tech
  • Building workflow, collaboration and deep integrations, that are specific to that problem
  • Doing a lot of work at the AI model level – whether finetuning models with specific datasets or creating hybrid systems (LLMs, SLMs, etc) tailored for their specific business 

In other words, they will need to be both narrow and “full stack” (both applications and infra).

  • Interesting areas to watch in 2024: AI agents, Edge AI  

There’s been plenty of excitement over the last year around the concept of AI agents – basically the last mile of an intelligent system that can execute tasks, often in a collaborative manner.  This could be anything from helping to book a trip (consumer use case) to automatically running full SDR campaigns (productivity use case) to RPA-style automation (enterprise use case).

AI agents are the holy grail of automation – a “text to action” paradigm where AI just gets stuff done for us. 

Every few months, the AI world goes crazy for an agent-like product, from BabyAGI last year to Devin AI (an “AI software engineer”) just recently.  However, in general, much of this excitement has proven premature to date.  There’s a lot of work to be done first to make Generative AI less brittle and more predictable, before complex systems involving several models can work together and take actual actions on our behalf.  There are also missing components – such as the need to build more memory into AI systems. However, expect AI agents to be a particularly exciting area in the next year or two.

Another interesting area is Edge AI.  As much as there is a huge market for LLMs that run at massive scale and are delivered as endpoints, a holy grail in AI has been models that can run locally on a device, without GPUs – in particular phones, but also intelligent, IoT-type devices.  The space is very vibrant: Mixtral, Ollama, Llama.cpp, Llamafile, GPT4ALL (Nomic).  Google and Apple are also likely to be increasingly active.

  • Is Generative AI heading towards AGI, or towards a plateau?

It’s almost a sacrilegious question to ask given all the breathless takes on AI, and the incredible new products that seem to come out every week – but is there a world where progress in Generative AI slows down rather than accelerates all the way to AGI? And what would that mean? 

The argument is twofold: a) foundational models are a brute force exercise, and we’re going to run out of resources (compute, data) to feed them, and b) even if we don’t run out, ultimately the path to AGI is reasoning, which LLMs are not capable of doing.

Interestingly, this is more or less the same discussion as the industry was having 6 years ago, as we described in a 2018 blog post.  Indeed, what seems to have mostly changed since 2018 is the sheer amount of data and compute we’ve thrown at (increasingly capable) models.

How much progress we’ve made in AI reasoning is less clear, overall – although DeepMind’s program AlphaGeometry seems to be an important milestone, as it combines a language model with a symbolic engine, which uses logical rules to make deductions.

How close we are to any kind of “running out” of compute or data is very hard to assess.

The frontier for “running out of compute” seems to be pushed back further every day. NVIDIA recently announced its Blackwell GPU system, which the company says can deploy a 27 trillion parameter model (vs 1.7 trillion for GPT-4).

The data part is complex – there’s a more tactical question around running out of legally licensed data (see all the OpenAI licensing deals), and a broader question around running out of textual data, in general.  There is certainly a lot of work happening around synthetic data.  Yann LeCun discussed how taking models to the next level would probably require them to be able to ingest much richer video input, which is not yet possible.  

There’s a tremendous amount of expectation around GPT-5.  How much better it turns out to be than GPT-4 will be widely viewed as a bellwether of the overall pace of progress in AI.

From the narrow perspective of participants in the startup ecosystem (founders, investors), perhaps the question matters less, in the medium term – if progress in Generative AI reached an asymptote tomorrow, we’d still have years of business opportunity ahead deploying what we currently have across verticals and use cases. 

  • The GPU wars (is NVIDIA overvalued?) 

Are we in the early innings of a massive cycle where compute becomes the most precious commodity in the world, or dramatically over-building GPU production in a way that’s sure to lead to a big crash? 

As pretty much the only game in town when it comes to Generative AI-ready GPUs, NVIDIA certainly has been having quite the moment, with a share price up five-fold to a $2.2 trillion valuation and total sales up three-fold since late 2022, massive excitement around its earnings, and Jensen Huang at GTC rivaling Taylor Swift for the biggest event of 2024.

Love this shot of Jensen Huang from the NVIDIA Q4 results announcement pic.twitter.com/9BrJgv88Yq — Matt Turck (@mattturck) February 22, 2024

Perhaps this was also in part because it was the ultimate beneficiary of all the billions invested by VCs in AI?

Generative AI investing: a process by which venture capital firms transfer large amounts of money to NVIDIA via intermediaries known as “startups” — Matt Turck (@mattturck) June 14, 2023

Regardless, for all its undeniable prowess as a company, NVIDIA’s fortunes will be tied to how sustainable the current gold rush will turn out to be. Hardware is hard, and predicting with accuracy how many GPUs need to be manufactured by TSMC in Taiwan is a difficult art.  

In addition, competition is trying its best to react, from AMD to Intel to Samsung; startups (like Groq or Cerebras) are accelerating, and new ones may be formed, like Sam Altman’s rumored $7 trillion chip company.  A new coalition of tech companies including Google, Intel and Qualcomm is trying to go after NVIDIA’s secret weapon: its CUDA software that keeps developers tied to NVIDIA chips.

Our take: As the GPU shortage subsides, there may be short- to medium-term downward pressure on NVIDIA, but the long term for AI chip manufacturers remains incredibly bright.

  • Open source AI: too much of a good thing?

This one is just to stir the pot a little bit.  We’re huge fans of open source AI, and clearly this has been a big trend of the last year or so.  Meta made a major push with its Llama models, France’s Mistral went from controversy fodder to new shining star of Generative AI, Google released Gemma, and HuggingFace continued its ascension as the ever so vibrant home of open source AI, hosting a plethora of models.  Some of the most innovative work in Generative AI has been done in the open source community.

However, there’s also a general feeling of inflation permeating the open source community.  Hundreds of thousands of open source AI models are now available.  Many are toys or weekend projects. Models go up and down the rankings, some of them experiencing meteoric rises by GitHub star standards (a flawed metric, but still) in just a few days, only to never transform into anything particularly usable.

The market will be self-correcting, with a power law of successful open-source projects that will get disproportionate support from cloud providers and other big tech companies. But in the meantime, the current explosion has been dizzying to many.

  • How much does AI actually cost?  

The economics of Generative AI is a fast-evolving topic.  And not surprisingly, a lot of the future of the space revolves around it – for example, can one seriously challenge Google in search, if the cost of providing AI-driven answers is significantly higher than the cost of providing ten blue links?  And can software companies truly be AI-powered if the inference costs eat up chunks of their gross margin? 
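To illustrate why inference costs can eat into gross margins, here is a back-of-the-envelope sketch; every number below is an illustrative assumption, not any vendor's actual pricing:

```python
# Every number below is an illustrative assumption, not real vendor pricing.
PRICE_PER_1K_TOKENS = 0.002      # assumed blended inference price, USD
TOKENS_PER_REQUEST = 1_500       # prompt + completion tokens
REQUESTS_PER_USER_MONTH = 200
REVENUE_PER_USER_MONTH = 10.0    # assumed subscription price, USD

inference_cost = (
    PRICE_PER_1K_TOKENS * TOKENS_PER_REQUEST / 1_000 * REQUESTS_PER_USER_MONTH
)
margin_hit = inference_cost / REVENUE_PER_USER_MONTH

print(f"inference cost per user: ${inference_cost:.2f}/month")  # $0.60/month
print(f"share of revenue consumed: {margin_hit:.0%}")           # 6%
```

The interesting part is the sensitivity: at ten times the token price, or ten times the usage, the same product is suddenly handing a majority of its revenue to the model provider.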

The good news, if you’re a customer/user of AI models: we seem to be in the early phase of a race to the bottom on the price side, which is happening faster than one may have predicted. One key driver has been the parallel rise of open source AI (Mistral etc.) and commercial inference vendors (Together AI, Anyscale, Replicate) taking those open models and serving them as endpoints.  There are very few switching costs for customers (other than the complexity of working with different models producing different results), and this is putting pressure on OpenAI and Anthropic.  An example of this has been the significant cost drops for embedding models, where multiple vendors (OpenAI, Together AI etc.) dropped prices at the same time.

From a vendor perspective, the costs of building and serving AI remain very high. It was reported in the press that Anthropic spent more than half of the revenue it generated paying cloud providers like AWS and GCP to run its LLMs. There’s also the cost of licensing deals with publishers.

On the plus side, maybe all of us as users of Generative technologies should just enjoy the explosion of VC-subsidized free services:

VCs brought you cheap Ubers VCs brought you cheap Airbnbs VCs are bringing you cheap AI inference YOU'RE WELCOME — Matt Turck (@mattturck) January 26, 2024

Watch: MAD Podcast with Brandon Duderstadt and Zach Nussbaum, Nomic

  • Big companies and the shifting political economy of AI: Has Microsoft won?

This was one of the first questions everyone asked in late 2022, and it’s even more top of mind in 2024: will Big Tech capture most of the value in Generative AI?

AI rewards size – more data, more compute and more AI researchers tend to yield more power.  Big Tech has been keenly aware of this. Unlike incumbents in prior platform shifts, it has also been intensely reactive to the potential disruption ahead.

Among Big Tech companies, it certainly feels like Microsoft has been playing 4-D chess.  There’s obviously the relationship with OpenAI, in which Microsoft first invested in 2019, and which it has now backed to the tune of $13B. But Microsoft also partnered with open source rival Mistral.  It invested in ChatGPT rival Inflection AI (Pi), only to acqui-hire it in spectacular fashion recently.

And ultimately, all those partnerships seem to only create more need for Microsoft’s cloud compute – Azure revenue grew 24% year-over-year to reach $33 billion in Q2 2024, with 6 points of Azure cloud growth attributed to AI services.

In case you’re confused: Microsoft is the biggest investor in OpenAI but also a competitor to OpenAI and an investor in competitor chatbot Inflection AI – meanwhile Microsoft is also a key partner, but also a competitor, to Databricks with Azure AI Hope this clarifies — Matt Turck (@mattturck) November 7, 2023

Meanwhile, Google and Amazon have partnered with and invested in OpenAI rival Anthropic (at the time of writing, Amazon just committed another $2.75B to the company, in the second tranche of its planned $4B investment).  Amazon also partnered with open source platform Hugging Face.  Google and Apple are reportedly discussing an integration of Gemini AI in Apple products.  Meta is possibly undercutting everyone by going whole hog on open source AI.  Then there is everything happening in China.

The obvious question is how much room there is for startups to grow and succeed.  A first tier of startups (OpenAI and Anthropic, mainly, with perhaps Mistral joining them soon) seems to have struck the right partnerships and reached escape velocity.  For a lot of other startups, including very well funded ones, the jury is still very much out.

Should we read, in Inflection AI’s decision to let itself get acquired and in Stability AI’s CEO troubles, an admission that commercial traction has been harder to achieve for a group of “second tier” Generative AI startups?

  • Fanboying OpenAI – or not?

OpenAI continues to fascinate – the $86B valuation, the revenue growth, the palace intrigue, and Sam Altman being the Steve Jobs of this generation:

Sam Altman returning to OpenAI after a day is like Steve Jobs returning to Apple after 12 years, but for the TikTok generation https://t.co/AHqH7WmVfF — Matt Turck (@mattturck) November 18, 2023

A couple of interesting questions:

Is OpenAI trying to do too much? Before all the November drama, there was OpenAI Dev Day, during which OpenAI made it clear that it was going to do *everything* in AI, both vertically (full stack) and horizontally (across use cases): models + infrastructure + consumer search + enterprise + analytics + dev tools + marketplace, etc.  It’s not an unprecedented strategy when a startup is an early leader in a big paradigm shift with de facto unlimited access to capital (Coinbase sort of did it in crypto). But it will be interesting to watch: while it would certainly simplify the MAD Landscape, it’s going to be a formidable execution challenge, particularly in a context where competition has intensified.  Everything from ChatGPT laziness issues to the underwhelming performance of its marketplace effort suggests that OpenAI is not immune to the business law of gravity.

Will OpenAI and Microsoft break up? The relationship with Microsoft has been fascinating – obviously Microsoft’s support has been a huge boost for OpenAI in terms of resources (including compute) and distribution (Azure in the enterprise), and the partnership was widely viewed as a master move by Microsoft in the early days of the Generative AI wave.  At the same time, as just mentioned above, Microsoft has made it clear that it’s not dependent on OpenAI (it has all the code, weights and data), it has partnered with competitors (e.g., Mistral), and through the Inflection AI acqui-hire it has now considerably beefed up its AI research team.

Meanwhile, will OpenAI want to continue being single threaded in a partnership with Microsoft, vs being deployed on other clouds?  

Given OpenAI’s massive ambitions and Microsoft’s aim of global domination, at what point do both companies conclude that they’re more competitors than partners?

  • Will 2024 be the year of AI in the enterprise? 

As mentioned above, 2023 in the enterprise (defined, directionally, as Global 2000 companies) felt like one of those pivotal years where everyone scrambles to embrace a new trend, but nothing much actually happens.

There were some proofs of concept, and adoption of discrete AI products that provide “quick wins” without requiring a company-wide effort (e.g., AI video for training and enterprise knowledge, like Synthesia*).

Beyond those, perhaps the biggest winners of Generative AI in the enterprise so far have been the Accentures of the world (Accenture reportedly generated $2B in fees for AI consulting last year).

Big winners of the AI craze so far: consultants. — Matt Turck (@mattturck) May 12, 2023

Regardless, there’s tremendous hope that 2024 is going to be a big year for AI in the enterprise – or at least for Generative AI, as traditional AI already has a significant footprint there (see above).

But we’re early in answering some of the key questions Global 2000-type companies face:

What are the use cases? The low-hanging-fruit use cases so far have been mostly a) code generation co-pilots for developer teams, b) enterprise knowledge management (search, text summarization, translation, etc.), and c) AI chatbots for customer service (a use case that pre-dates Generative AI).  There are certainly others (marketing, automated SDRs, etc.), but there’s a lot to figure out (co-pilot mode vs. full automation, etc.).

What tools should we pick? As per the above, it feels like the future is hybrid, a combination of commercial vendors and open source, big and small models, horizontal and vertical GenAI tools. But where does one start?

Who will be deploying and maintaining the tools? There is a clear skill shortage in Global 2000 companies. If you thought recruiting software developers was hard, just try to recruit machine learning engineers. 

How do we make sure they don’t hallucinate? Yes, there’s a tremendous amount of work being done around RAG, guardrails, evaluations, etc., but the possibility that a Generative AI tool may be plain wrong, and the fact that we don’t really know how Generative AI models work, are big problems in the enterprise.

What is the ROI? Large tech companies have been early in leveraging Generative AI for their own needs, and they’re showing interesting early data.  In their earnings calls, Palo Alto Networks mentioned roughly halving the cost of its T&E servicing, and ServiceNow mentioned increasing its developer innovation speed by 52%, but we’re early in understanding the cost/return equation for Generative AI in the enterprise.

The good news for Generative AI vendors is that there’s plenty of interest from enterprise customers in allocating budget (importantly, no longer “innovation” budgets but actual OpEx budgets, possibly re-allocated from other places) and resources to figuring it out.  But we’re probably talking about a 3-5 year deployment cycle, rather than a one-year one.

  • MAD Podcast with Florian Douetteau, CEO, Dataiku ( Apple , Spotify )
  • MAD Podcast with Victor Riparbelli, CEO, Synthesia ( Apple , Spotify )
  • MAD Podcast with Mike Murchison, CEO, Ada ( Apple , Spotify )
  • Is AI going to kill SaaS?

This was one of the trendy ideas of the last 12 months.   

One version of the question: AI makes it 10x faster to code, so with just a few average developers, you’ll be able to create a custom-made version of a SaaS product, tailored to your needs.  Why pay a lot of money to a SaaS provider when you can build your own?

Another version of the question: the future is one AI intelligence (possibly made of several models) that runs your whole company with a series of agents.  You no longer buy HR software, finance software or sales software because the AI intelligence does everything, in a fully automated and seamless way.

We seem to be somewhat far away from both of those trends actually happening in any kind of full-fledged manner, but as we all know, things change very fast in AI. 

In the meantime, it feels like a likely version of the future is that SaaS products are going to become more powerful as AI gets built into every one of them. 

  • Is AI going to kill venture capital?

Leaving aside the (ever-amusing) topic of whether AI could automate venture capital itself, both in terms of company selection and post-investment value-add, there’s an interesting series of questions around whether the asset class is correctly sized for the AI platform shift:

Is Venture Capital too small?   The OpenAIs of the world have needed to raise billions of dollars, and may need to raise many more billions.  A lot of those billions have been provided by big corporations like Microsoft – probably in large part in the form of compute-for-equity deals, but not only.  Of course, many VCs have also invested in big foundational model companies, but at a minimum, those investments in highly capital-intensive startups are a clear departure from the traditional VC software investing model.  Perhaps AI investing, at least when it comes to LLM companies, is going to require mega-sized VC funds – at the time of writing, Saudi Arabia seems to be about to launch a $40B AI fund in collaboration with US VC firms. 

Is Venture Capital too big?   If you believe that AI is going to 10x our productivity, including super coders, automated SDR agents and automated marketing creation, then we’re about to witness the birth of a whole generation of fully-automated companies run by skeleton teams (or maybe just one solopreneur) that could theoretically reach hundreds of millions in revenues (and go public). Does a $100M ARR company run by a solopreneur need venture capital?

Reality is always more nuanced, but if one believes real value creation will happen either at the foundation model layer or at the application layer, there’s a world where the venture capital asset class, as it exists today, gets uncomfortably barbelled.

  • Will AI revive consumer?

Consumer has been looking for its second wind since the social media and mobile days.  Generative AI may very well be it.

As a particularly exciting example, Midjourney emerged seemingly out of nowhere to somewhere between $200M and $300M in revenue, and it’s presumably vastly profitable given its small team (40-60 people, depending on who you ask).

Some interesting areas (among many others):

Search : for the first time in decades, Google’s search monopoly has some early, but credible, competitors.  A handful of startups like Perplexity AI and You.com are leading the evolution from search engines to answer engines.

AI companions : beyond the dystopian aspects, what if every human had an infinitely patient and helpful companion attuned to one’s specific needs, whether for knowledge, entertainment or therapy?

Surprisingly controversial take: hyper-personalized companion AI that can be your best friend and/or an always-on therapist is not dystopian, and instead a major net positive for humanity that will lead to less loneliness, violence and perhaps wars. — Matt Turck (@mattturck) December 17, 2023

AI hardware : Humane, Rabbit and Vision Pro are exciting entries in consumer hardware.

Hyper-personalized entertainment : what new forms of entertainment and art will we invent as Generative AI-powered tools keep getting better (and cheaper)?

Movie watching experience: 2005: go to a movie theater 2015: stream Netflix 2025: ask LLM + text-to-video to create a new season of Narcos to watch tonight, but have it take place in Syria with Brad Pitt, Mr Beast and Travis Kelce in the leading roles — Matt Turck (@mattturck) February 15, 2024
  • MAD Podcast with Aravind Srinivas, CEO, Perplexity AI (Apple, Spotify)
  • MAD Podcast with Richard Socher, CEO, You.com (Apple, Spotify)
  • MAD Podcast with Cris Valenzuela, CEO, Runway (Apple, Spotify)
  • AI and blockchain: BS, or exciting?

I know, I know.  The intersection of AI and crypto feels like perfect fodder for X/Twitter jokes.  

However, it is an undeniable concern that AI is getting centralized in a handful of companies that have the most compute, data and AI talent – from Big Tech to the famously-not-open OpenAI.  Meanwhile, the very core of the blockchain proposition is to enable the creation of decentralized networks that allow participants to share resources and assets.  There is fertile ground for exploration there, a topic we started exploring years ago (presentation).

A number of AI-related crypto projects have experienced noticeable acceleration, including Bittensor* (a decentralized machine intelligence platform), Render (a decentralized GPU rendering platform) and Arweave (a decentralized data platform).

While we didn’t include a crypto section in this year’s MAD Landscape, this is an interesting area to watch. 

Now, as always, the question is whether the crypto industry will be able to help itself, and not devolve into hundreds of AI-related memecoins, pump-and-dump schemes and scams. 

BONUS: Other topics we did not discuss here:

  • Will AI kill us all? AI doomers vs AI accelerationists
  • Regulation, privacy, ethics, deep fakes
  • Can AI only be “made” in SF?
"All AI is in San Francisco", as demonstrated by the fact that everyone is obsessed with a Paris startup — Matt Turck (@mattturck) February 27, 2024

PART III:  FINANCINGS, M&A AND IPOS

The current financing environment is one of the “tale of two markets” situations, where there’s AI, and everything else. 

Overall funding continued to falter, declining 42% to $248.4B in 2023.  The first few months of 2024 are showing some possible green shoots, but as of now the trend has remained more or less the same.

Data infrastructure, for all the reasons described above, saw very little funding activity, with Sigma Computing and Databricks being some of the rare exceptions. 

Obviously, AI was a whole different story.

The inescapable characteristics of the AI funding market have been:

  • A large concentration of capital in a handful of startups, in particular OpenAI, Anthropic, Inflection AI, Mistral, etc.
  • A disproportionate level of activity from corporate investors: the three most active AI investors in 2023 were Microsoft, Google and NVIDIA.
  • Some murkiness in the above corporate deals about what amount is actual cash, vs “compute for equity” 

Some noteworthy deals since our 2023 MAD, in rough chronological order (not an exhaustive list!):

  • OpenAI, a (or the?) foundational model developer, raised $10.3B across two rounds, now valued at $86B
  • Adept, another foundational model developer, raised $350M at a $1B valuation
  • AlphaSense, a market research platform for financial services, raised $475M across two rounds, now valued at $2.5B
  • Anthropic, yet another foundational model developer, raised $6.45B over three rounds, at an $18.4B valuation
  • Pinecone, a vector database platform, raised $100M at a $750M valuation
  • Celestial AI, an optical interconnect technology platform for memory and compute, raised $275M across two rounds
  • CoreWeave, a GPU cloud provider, raised $421M at a $2.5B valuation
  • Lightmatter, developer of a light-powered chip for computing, raised $308M across two rounds, now valued at $1.2B
  • Sigma Computing, a cloud-hosted data analytics platform, raised $340M at a $1.1B valuation
  • Inflection, another foundational model developer, raised $1.3B at a $4B valuation
  • Mistral, a foundational model developer, raised $528M across two rounds, now valued at $2B
  • Cohere, (surprise) a foundational model developer, raised $270M at a $2B valuation
  • Runway, a generative video model developer, raised $191M at a $1.5B valuation
  • Synthesia*, a video generation platform for enterprise, raised $90M at a $1B valuation
  • Hugging Face, a machine learning and data science platform for working with open source models, raised $235M at a $4.5B valuation
  • Poolside, a foundational model developer specifically for code generation and software development, raised $126M
  • Modular, an AI development platform, raised $100M at a $600M valuation
  • Imbue, an AI agent developer, raised $212M
  • Databricks, provider of data, analytics and AI solutions, raised $684M at a $43.2B valuation
  • Aleph Alpha, another foundational model developer, raised $486M
  • AI21 Labs, a foundational model developer, raised $208M at a $1.4B valuation
  • Together, a cloud platform for generative AI development, raised $208.5M across two rounds, now valued at $1.25B
  • VAST Data, a data platform for deep learning, raised $118M at a $9.1B valuation
  • Shield AI, an AI pilot developer for the aerospace and defense industry, raised $500M at a $2.8B valuation
  • 01.ai, a foundational model developer, raised $200M at a $1B valuation
  • Hadrian, a manufacturer of precision component factories for aerospace and defense, raised $117M
  • Sierra AI, an AI chatbot developer for customer service / experience, raised $110M across two rounds
  • Glean, an AI-powered enterprise search platform, raised $200M at a $2.2B valuation
  • Lambda Labs, a GPU cloud provider, raised $320M at a $1.5B valuation
  • Magic, a foundational model developer for code generation and software development, raised $117M at a $500M valuation

M&A, Take Privates

The M&A market has been fairly quiet since the 2023 MAD.  

A lot of traditional software acquirers were focused on their own stock price and overall business, rather than actively looking for acquisition opportunities. 

And the particularly strict antitrust environment has made things trickier for potential acquirers.

Adobe blocked from buying Figma JetBlue blocked from acquiring Spirit Airlines Amazon abandons $1.4B deal to buy iRobot as it sees "no path to regulatory approval" Something is very broken in antitrust enforcement — Matt Turck (@mattturck) January 29, 2024

Private equity firms have been reasonably active, seeking lower price opportunities in the tougher market.

Some noteworthy transactions involving companies that have appeared over the years on the MAD landscape (in order of scale):

  • Broadcom, a semiconductor manufacturer, acquired VMware, a cloud computing company, for $69B
  • Cisco, a networking and security infrastructure company, acquired Splunk, a monitoring and observability platform, for $28B
  • Qualtrics, a customer experience management company, was taken private by Silver Lake and CPP Investments for $12.5B
  • Coupa, a spend management platform, was taken private by Thoma Bravo for $8B
  • New Relic, a monitoring and observability platform, was acquired by Francisco Partners and TPG for $6.5B
  • Alteryx, a data analytics platform, was taken private by Clearlake Capital and Insight Partners for $4.4B
  • Salesloft, a revenue orchestration platform, was acquired by Vista Equity for $2.3B; Vista then also acquired Drift, an AI chatbot developer for customer experience
  • Databricks, a provider of data lakehouses, acquired MosaicML, an AI development platform, for $1.3B (as well as several other companies for lower amounts, like Arcion and Okera)
  • ThoughtSpot, a data analytics platform, acquired Mode Analytics, a business intelligence startup, for $200M
  • Snowflake, a provider of data warehouses, acquired Neeva, a consumer AI search engine, for $150M
  • DigitalOcean, a cloud hosting provider, acquired Paperspace, a cloud computing and AI development startup, for $111M
  • NVIDIA, a chip manufacturer for cloud computing, acquired OmniML, an AI/ML optimization platform for the edge

And of course, there was the “non-acquisition acquisition” of Inflection AI by Microsoft.

Is 2024 going to be the year of AI M&A? A lot depends on continued market momentum.

  • At the lower end of the market, a lot of young AI startups with strong teams have been funded in the last 12-18 months. In the AI hype cycles of the past decade, a lot of acqui-hires happened after the initial funding cycle – often at prices that seemed disproportionate to the actual traction those companies had.  AI talent has always been rare, and today is not very different.
  • At the higher end of the market, there is strong business rationale for further convergence between leading data platforms and leading AI platforms. Those deals are likely to be much more expensive, however.

In public markets, AI has been a hot trend.  The “Magnificent Seven” stocks (Nvidia, Meta, Amazon, Microsoft, Alphabet, Apple and Tesla) gained at least 49% in 2023 and powered the overall stock market higher. 

Overall, there is still a severe dearth of pure-play AI stocks in public markets.  The few that are available are richly rewarded – Palantir stock jumped 167% in 2023.

This should bode well for a whole group of AI-related pre-IPO startups.  There are a lot of companies at significant amounts of scale in the MAD space – first and foremost Databricks, but also a number of others including Celonis, Scale AI, Dataiku* or Fivetran.  

Then there’s the intriguing question of how OpenAI and Anthropic will think about public markets.

In the meantime, 2023 was a very poor year in terms of IPOs.  Only a handful of MAD-related companies went public:  Klaviyo, a marketing automation platform, went public at a $9.2B valuation in September 2023 (see our Klaviyo S-1 teardown); Reddit, a forum-style social networking platform (which licenses its content to AI players), went public at a $6.4B valuation in March 2024; and Astera Labs, a semiconductor company providing intelligent connectivity for AI and cloud infrastructure, went public at a $5.5B valuation in March 2024.

We live in very special times. We are early in a paradigm shift. Time to experiment and try new things. We’re just getting started.

Thinking you're late to AI today is like thinking you're late to the Internet in 1996 — Matt Turck (@mattturck) December 23, 2023

4 thoughts on “Full Steam Ahead: The 2024 MAD (Machine Learning, AI & Data) Landscape”

Great post, excellent work on the landscape, and a comprehensive review of the chaos of AI in 2023. Well done!

I felt that two thoughts were missing and I’m curious if you can comment. One, the question of what impact AI regulation (or, for that matter, rulings on copyright lawsuits) might have on the field and the subsequent investment landscape. Two, China. It was only mentioned as an aside… and while AI in 2023 was US-centric, you could say that TikTok has mastered traditional AI with its recommendation algorithm… and Baidu, Tencent and Alibaba are certainly moving fast, probably with a lot of government help.

Maybe these were topics #25 and #26 and you had to make the cutoff somewhere…

Intel OpenVINO can be added in “AI Frameworks, Tools & Libraries”.

Super comprehensive, Matt; kudos to you and the team. The shifts in the overall bucketing and naming were really interesting, and make for a compelling view you uniquely have because you do this every year. More in-depth than anything I’ve read on the entire landscape of MAD in a long time. Cheers! -Bo



Title: Big Earth Data and Machine Learning for Sustainable and Resilient Agriculture

Abstract: Big streams of Earth images from satellites or other platforms (e.g., drones and mobile phones) are becoming increasingly available at low or no cost and with enhanced spatial and temporal resolution. This thesis recognizes the unprecedented opportunities offered by the high-quality, open-access Earth observation data of our times and introduces novel machine learning and big data methods to properly exploit them towards developing applications for sustainable and resilient agriculture. The thesis addresses three distinct thematic areas, i.e., the monitoring of the Common Agricultural Policy (CAP), the monitoring of food security, and applications for smart and resilient agriculture. The methodological innovations related to the three thematic areas address the following issues: i) the processing of big Earth Observation (EO) data, ii) the scarcity of annotated data for machine learning model training, and iii) the gap between machine learning outputs and actionable advice. The thesis demonstrates how big data technologies such as data cubes, distributed learning, linked open data and semantic enrichment can be used to exploit the data deluge and extract knowledge to address real user needs. Furthermore, it argues for the importance of semi-supervised and unsupervised machine learning models that circumvent the ever-present challenge of scarce annotations and thus allow for model generalization in space and time. Specifically, it is shown how only a few ground truth data points are needed to generate high-quality crop type maps and crop phenology estimations. Finally, the thesis argues that there is considerable distance in value between model inferences and decision making in real-world scenarios, and thereby showcases the power of causal and interpretable machine learning in bridging this gap.


A data specialist shares the 2-page résumé that got him a $300,000 job at Google — and explains 3 details he got right on it.

  • Ankit Virmani made a career switch from consulting to tech.
  • After a full day of work at Deloitte, he would spend hours every night teaching himself how to code.
  • The résumé that landed Virmani a job at Google is two pages long — a decision he defends today.


Ankit Virmani had spent five years in consulting when he began eyeing a shift to tech.

"I always thought in my heart that I wanted more technical depth. I wanted to build things rather than sell them too much," said Virmani, who first moved to the US from India to pursue a master's degree.

In the first half of 2020, he dove right in.

After wrapping up a day at his full-time job at Deloitte, Virmani would spend three to four hours practicing coding every night, and another two hours reading up about the industry. He also began spending time with people in the field, asking them about real-time scenarios and what challenges they face in their jobs.

"I didn't want answers from them. I wanted their thought process —how do they navigate through these complex challenges at scale," he told Business Insider.

It didn't pay off right away. He was rejected by Microsoft and Amazon at different stages of their application processes.

Six months after deciding to switch careers, he landed a role as a data and machine learning specialist at Google's Seattle office.


Here's the résumé he used to apply for his job at Google, which pays more than $300,000 a year. BI has verified his employment and compensation.

Sacrificing the 'one-page only' résumé rule

Looking back on his résumé four years later, Virmani said he would make some formatting changes.

"This résumé is giving importance to everything equally, which is what I don't like," he said. "I would have a gradient of importance, like executive summary on top, achievements so far, and then I would go to professional experience, education, and technical skills."

But with more insight into what employers like Google appreciate, Virmani said he would keep several things the same — including the length of the document.

Sacrificing the "one-page only" rule to improve readability: Virmani broke the "one-page only" rule and prioritized having an uncluttered résumé. "It has very neatly structured sections and high-level themes," he said about using subheadings like "data architecture" and "cloud strategy." His manager at Google later told him that style helped them pick up on his responsibilities without having to decipher the lines below.

Highlighting team effort: Virmani said some people overly highlight individual contributions on their résumé: "It's never that way, at least in my experience — it's always teamwork." That's why he focused parts of his résumé on his teams' accomplishments. "In my experience, Google highly, highly appreciates honesty and humility. That's the culture of the company — we know that nothing great gets achieved by an individual," he said.

Saving some details for the interview: Virmani said he was careful not to over-explain his past projects so that he could build curiosity and have a good conversation during the interview: "If you put everything in the résumé, you'll run out of points to talk about in the interview."

Virmani is not alone in choosing to sacrifice "typical" résumé decisions. For Shola West, that came in the shape of breaking the "no résumé gap" idea.

West is part of a growing group of Gen Zs who are trying to destigmatize the résumé gap — a period of unemployment between jobs or between education and work.

West previously told BI she took a yearlong break at the start of her career to understand what she really wanted to pursue. She embraced her résumé gap and now works at an advertising agency and runs a career advice side hustle.

For Mariana Kobayashi, breaking from the résumé norms meant abandoning the written format altogether.

Kobayashi landed a role as an account executive at Google after she created a video making the case for why she should get the role.

She sent her video résumé, which took her 10 hours to create, to the hiring manager directly, Kobayashi previously told BI. A Google recruiter saw the video and reached out to her, and she eventually landed a role at the tech giant.

Do you work in finance or consulting, and have a story to share about your personal résumé journey? Email this reporter at [email protected] .

On February 28, Axel Springer, Business Insider's parent company, joined 31 other media groups and filed a $2.3 billion suit against Google in Dutch court, alleging losses suffered due to the company's advertising practices.


