Analytics Insight
Top 10 Research and Thesis Topics for ML Projects in 2022
This article features the top 10 research and thesis topics for ML projects for students to try in 2022.
Topics include text mining and text classification, image-based applications, machine vision, optimization, voice classification, sentiment analysis, a recommendation framework project, a mall customers’ project, and object detection with deep learning.
The Future of AI Research: 20 Thesis Ideas for Undergraduate Students in Machine Learning and Deep Learning for 2023!
A comprehensive guide for crafting an original and innovative thesis in the field of AI.
By Aarafat Islam
“The beauty of machine learning is that it can be applied to any problem you want to solve, as long as you can provide the computer with enough examples.” — Andrew Ng
This article provides a list of 20 potential thesis ideas for an undergraduate program in machine learning and deep learning in 2023. Each thesis idea includes an introduction, which presents a brief overview of the topic and the research objectives. The ideas cover different areas of machine learning and deep learning, such as computer vision, natural language processing, robotics, finance, drug discovery, and more. The article also includes explanations, examples, and conclusions for each thesis idea, which can help guide the research and provide a clear understanding of the potential contributions and outcomes of the proposed research. The article also emphasizes the importance of originality and the need for proper citation to avoid plagiarism.
1. Investigating the use of Generative Adversarial Networks (GANs) in medical imaging: A deep learning approach to improve the accuracy of medical diagnoses.
Introduction: Medical imaging is an important tool in the diagnosis and treatment of various medical conditions. However, accurately interpreting medical images can be challenging, especially for less experienced doctors. This thesis aims to explore the use of GANs in medical imaging, in order to improve the accuracy of medical diagnoses.
2. Exploring the use of deep learning in natural language generation (NLG): An analysis of the current state-of-the-art and future potential.
Introduction: Natural language generation is an important field in natural language processing (NLP) that deals with creating human-like text automatically. Deep learning has shown promising results in NLP tasks such as machine translation, sentiment analysis, and question-answering. This thesis aims to explore the use of deep learning in NLG and analyze the current state-of-the-art models, as well as potential future developments.
3. Development and evaluation of deep reinforcement learning (RL) for robotic navigation and control.
Introduction: Robotic navigation and control are challenging tasks that require a high degree of intelligence and adaptability. Deep RL has shown promising results in various robotics tasks, such as robotic arm control, autonomous navigation, and manipulation. This thesis aims to develop a deep RL-based approach for robotic navigation and control and evaluate its performance in various environments and tasks.
4. Investigating the use of deep learning for drug discovery and development.
Introduction: Drug discovery and development is a time-consuming and expensive process, which often involves high failure rates. Deep learning has been used to improve various tasks in bioinformatics and biotechnology, such as protein structure prediction and gene expression analysis. This thesis aims to investigate the use of deep learning for drug discovery and development and examine its potential to improve the efficiency and accuracy of the drug development process.
5. Comparison of deep learning and traditional machine learning methods for anomaly detection in time series data.
Introduction: Anomaly detection in time series data is a challenging task, which is important in various fields such as finance, healthcare, and manufacturing. Deep learning methods have been used to improve anomaly detection in time series data, while traditional machine learning methods have been widely used as well. This thesis aims to compare deep learning and traditional machine learning methods for anomaly detection in time series data and examine their respective strengths and weaknesses.
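As a point of reference for such a comparison, a traditional baseline can be as simple as a rolling z-score detector. The sketch below is illustrative only: the function name, the window size, the threshold, and the synthetic sine series with one injected spike are all assumptions made for the example, not part of the thesis idea itself.

```python
import math
from statistics import mean, stdev


def rolling_zscore_anomalies(series, window=20, threshold=3.0):
    """Return indices whose value lies more than `threshold` standard
    deviations away from the mean of the preceding `window` points."""
    flagged = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # Skip flat windows, where the z-score is undefined.
        if sigma > 0 and abs(series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical data: a smooth sine wave with a single injected spike.
series = [math.sin(i / 5.0) for i in range(200)]
series[120] += 5.0

flagged = rolling_zscore_anomalies(series)
print(flagged)
```

A deep learning detector would then be judged against this kind of baseline on the same data, which keeps the comparison of strengths and weaknesses grounded.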
6. Use of deep transfer learning in speech recognition and synthesis.
Introduction: Speech recognition and synthesis are areas of natural language processing that focus on converting spoken language to text and vice versa. Transfer learning has been widely used in deep learning-based speech recognition and synthesis systems to improve their performance by reusing the features learned from other tasks. This thesis aims to investigate the use of transfer learning in speech recognition and synthesis and how it improves the performance of the system in comparison to traditional methods.
7. The use of deep learning for financial prediction.
Introduction: Financial prediction is a challenging task that requires a high degree of intelligence and adaptability, especially in the field of stock market prediction. Deep learning has shown promising results in various financial prediction tasks, such as stock price prediction and credit risk analysis. This thesis aims to investigate the use of deep learning for financial prediction and examine its potential to improve the accuracy of financial forecasting.
8. Investigating the use of deep learning for computer vision in agriculture.
Introduction: Computer vision has the potential to revolutionize the field of agriculture by improving crop monitoring, precision farming, and yield prediction. Deep learning has been used to improve various computer vision tasks, such as object detection, semantic segmentation, and image classification. This thesis aims to investigate the use of deep learning for computer vision in agriculture and examine its potential to improve the efficiency and accuracy of crop monitoring and precision farming.
9. Development and evaluation of deep learning models for generative design in engineering and architecture.
Introduction: Generative design is a powerful tool in engineering and architecture that can help optimize designs and reduce human error. Deep learning has been used to improve various generative design tasks, such as design optimization and form generation. This thesis aims to develop and evaluate deep learning models for generative design in engineering and architecture and examine their potential to improve the efficiency and accuracy of the design process.
10. Investigating the use of deep learning for natural language understanding.
Introduction: Natural language understanding is a complex task of natural language processing that involves extracting meaning from text. Deep learning has been used to improve various NLP tasks, such as machine translation, sentiment analysis, and question-answering. This thesis aims to investigate the use of deep learning for natural language understanding and examine its potential to improve the efficiency and accuracy of natural language understanding systems.
11. Comparing deep learning and traditional machine learning methods for image compression.
Introduction: Image compression is an important task in image processing and computer vision. It enables faster data transmission and storage of image files. Deep learning methods have been used to improve image compression, while traditional machine learning methods have been widely used as well. This thesis aims to compare deep learning and traditional machine learning methods for image compression and examine their respective strengths and weaknesses.
12. Using deep learning for sentiment analysis in social media.
Introduction: Sentiment analysis in social media is an important task that can help businesses and organizations understand their customers’ opinions and feedback. Deep learning has been used to improve sentiment analysis in social media, by training models on large datasets of social media text. This thesis aims to use deep learning for sentiment analysis in social media, and evaluate its performance against traditional machine learning methods.
13. Investigating the use of deep learning for image generation.
Introduction: Image generation is a task in computer vision that involves creating new images from scratch or modifying existing images. Deep learning has been used to improve various image generation tasks, such as super-resolution, style transfer, and face generation. This thesis aims to investigate the use of deep learning for image generation and examine its potential to improve the quality and diversity of generated images.
14. Development and evaluation of deep learning models for anomaly detection in cybersecurity.
Introduction: Anomaly detection in cybersecurity is an important task that can help detect and prevent cyber-attacks. Deep learning has been used to improve various anomaly detection tasks, such as intrusion detection and malware detection. This thesis aims to develop and evaluate deep learning models for anomaly detection in cybersecurity and examine their potential to improve the efficiency and accuracy of cybersecurity systems.
15. Investigating the use of deep learning for natural language summarization.
Introduction: Natural language summarization is an important task in natural language processing that involves creating a condensed version of a text that preserves its main meaning. Deep learning has been used to improve various natural language summarization tasks, such as document summarization and headline generation. This thesis aims to investigate the use of deep learning for natural language summarization and examine its potential to improve the efficiency and accuracy of natural language summarization systems.
16. Development and evaluation of deep learning models for facial expression recognition.
Introduction: Facial expression recognition is an important task in computer vision and has many practical applications, such as human-computer interaction, emotion recognition, and psychological studies. Deep learning has been used to improve facial expression recognition, by training models on large datasets of images. This thesis aims to develop and evaluate deep learning models for facial expression recognition and examine their performance against traditional machine learning methods.
17. Investigating the use of deep learning for generative models in music and audio.
Introduction: Music and audio synthesis is an important task in audio processing, which has many practical applications, such as music generation and speech synthesis. Deep learning has been used to improve generative models for music and audio, by training models on large datasets of audio data. This thesis aims to investigate the use of deep learning for generative models in music and audio and examine its potential to improve the quality and diversity of generated audio.
18. Comparing deep learning models with traditional algorithms for anomaly detection in network traffic.
Introduction: Anomaly detection in network traffic is an important task that can help detect and prevent cyber-attacks. Deep learning models have been used for this task, and traditional methods such as clustering and rule-based systems are widely used as well. This thesis aims to compare deep learning models with traditional algorithms for anomaly detection in network traffic and analyze the trade-offs between the models in terms of accuracy and scalability.
19. Investigating the use of deep learning for improving recommender systems.
Introduction: Recommender systems are widely used in many applications such as online shopping, music streaming, and movie streaming. Deep learning has been used to improve the performance of recommender systems, by training models on large datasets of user-item interactions. This thesis aims to investigate the use of deep learning for improving recommender systems and compare its performance with traditional content-based and collaborative filtering approaches.
20. Development and evaluation of deep learning models for multi-modal data analysis.
Introduction: Multi-modal data analysis is the task of analyzing and understanding data from multiple sources such as text, images, and audio. Deep learning has been used to improve multi-modal data analysis, by training models on large datasets of multi-modal data. This thesis aims to develop and evaluate deep learning models for multi-modal data analysis and analyze their potential to improve performance in comparison to single-modal models.
I hope that this article has provided you with a useful guide for your thesis research in machine learning and deep learning. Remember to conduct a thorough literature review and to include proper citations in your work, as well as to be original in your research to avoid plagiarism. I wish you all the best of luck with your thesis and your research endeavors!
Machine Learning - CMU
PhD Dissertations
Learning Models that Match Jacob Tyo, 2024
Improving Human Integration across the Machine Learning Pipeline Charvi Rastogi, 2024
Reliable and Practical Machine Learning for Dynamic Healthcare Settings Helen Zhou, 2023
Automatic customization of large-scale spiking network models to neuronal population activity (unavailable) Shenghao Wu, 2023
Estimation of BVk functions from scattered data (unavailable) Addison J. Hu, 2023
Rethinking object categorization in computer vision (unavailable) Jayanth Koushik, 2023
Advances in Statistical Gene Networks Jinjin Tian, 2023
Post-hoc calibration without distributional assumptions Chirag Gupta, 2023
The Role of Noise, Proxies, and Dynamics in Algorithmic Fairness Nil-Jana Akpinar, 2023
Collaborative learning by leveraging siloed data Sebastian Caldas, 2023
Modeling Epidemiological Time Series Aaron Rumack, 2023
Human-Centered Machine Learning: A Statistical and Algorithmic Perspective Leqi Liu, 2023
Uncertainty Quantification under Distribution Shifts Aleksandr Podkopaev, 2023
Probabilistic Reinforcement Learning: Using Data to Define Desired Outcomes, and Inferring How to Get There Benjamin Eysenbach, 2023
Comparing Forecasters and Abstaining Classifiers Yo Joong Choe, 2023
Using Task Driven Methods to Uncover Representations of Human Vision and Semantics Aria Yuan Wang, 2023
Data-driven Decisions - An Anomaly Detection Perspective Shubhranshu Shekhar, 2023
Applied Mathematics of the Future Kin G. Olivares, 2023
Methods and Applications of Explainable Machine Learning Joon Sik Kim, 2023
Neural Reasoning for Question Answering Haitian Sun, 2023
Principled Machine Learning for Societally Consequential Decision Making Amanda Coston, 2023
Long term brain dynamics extend cognitive neuroscience to timescales relevant for health and physiology Maxwell B. Wang, 2023
Long term brain dynamics extend cognitive neuroscience to timescales relevant for health and physiology Darby M. Losey, 2023
Calibrated Conditional Density Models and Predictive Inference via Local Diagnostics David Zhao, 2023
Towards an Application-based Pipeline for Explainability Gregory Plumb, 2022
Objective Criteria for Explainable Machine Learning Chih-Kuan Yeh, 2022
Making Scientific Peer Review Scientific Ivan Stelmakh, 2022
Facets of regularization in high-dimensional learning: Cross-validation, risk monotonization, and model complexity Pratik Patil, 2022
Active Robot Perception using Programmable Light Curtains Siddharth Ancha, 2022
Strategies for Black-Box and Multi-Objective Optimization Biswajit Paria, 2022
Unifying State and Policy-Level Explanations for Reinforcement Learning Nicholay Topin, 2022
Sensor Fusion Frameworks for Nowcasting Maria Jahja, 2022
Equilibrium Approaches to Modern Deep Learning Shaojie Bai, 2022
Towards General Natural Language Understanding with Probabilistic Worldbuilding Abulhair Saparov, 2022
Applications of Point Process Modeling to Spiking Neurons (Unavailable) Yu Chen, 2021
Neural variability: structure, sources, control, and data augmentation Akash Umakantha, 2021
Structure and time course of neural population activity during learning Jay Hennig, 2021
Cross-view Learning with Limited Supervision Yao-Hung Hubert Tsai, 2021
Meta Reinforcement Learning through Memory Emilio Parisotto, 2021
Learning Embodied Agents with Scalably-Supervised Reinforcement Learning Lisa Lee, 2021
Learning to Predict and Make Decisions under Distribution Shift Yifan Wu, 2021
Statistical Game Theory Arun Sai Suggala, 2021
Towards Knowledge-capable AI: Agents that See, Speak, Act and Know Kenneth Marino, 2021
Learning and Reasoning with Fast Semidefinite Programming and Mixing Methods Po-Wei Wang, 2021
Bridging Language in Machines with Language in the Brain Mariya Toneva, 2021
Curriculum Learning Otilia Stretcu, 2021
Principles of Learning in Multitask Settings: A Probabilistic Perspective Maruan Al-Shedivat, 2021
Towards Robust and Resilient Machine Learning Adarsh Prasad, 2021
Towards Training AI Agents with All Types of Experiences: A Unified ML Formalism Zhiting Hu, 2021
Building Intelligent Autonomous Navigation Agents Devendra Chaplot, 2021
Learning to See by Moving: Self-supervising 3D Scene Representations for Perception, Control, and Visual Reasoning Hsiao-Yu Fish Tung, 2021
Statistical Astrophysics: From Extrasolar Planets to the Large-scale Structure of the Universe Collin Politsch, 2020
Causal Inference with Complex Data Structures and Non-Standard Effects Kwhangho Kim, 2020
Networks, Point Processes, and Networks of Point Processes Neil Spencer, 2020
Dissecting neural variability using population recordings, network models, and neurofeedback (Unavailable) Ryan Williamson, 2020
Predicting Health and Safety: Essays in Machine Learning for Decision Support in the Public Sector Dylan Fitzpatrick, 2020
Towards a Unified Framework for Learning and Reasoning Han Zhao, 2020
Learning DAGs with Continuous Optimization Xun Zheng, 2020
Machine Learning and Multiagent Preferences Ritesh Noothigattu, 2020
Learning and Decision Making from Diverse Forms of Information Yichong Xu, 2020
Towards Data-Efficient Machine Learning Qizhe Xie, 2020
Change modeling for understanding our world and the counterfactual one(s) William Herlands, 2020
Machine Learning in High-Stakes Settings: Risks and Opportunities Maria De-Arteaga, 2020
Data Decomposition for Constrained Visual Learning Calvin Murdock, 2020
Structured Sparse Regression Methods for Learning from High-Dimensional Genomic Data Micol Marchetti-Bowick, 2020
Towards Efficient Automated Machine Learning Liam Li, 2020
Learning Collections of Functions Emmanouil Antonios Platanios, 2020
Provable, structured, and efficient methods for robustness of deep networks to adversarial examples Eric Wong, 2020
Reconstructing and Mining Signals: Algorithms and Applications Hyun Ah Song, 2020
Probabilistic Single Cell Lineage Tracing Chieh Lin, 2020
Graphical network modeling of phase coupling in brain activity (unavailable) Josue Orellana, 2019
Strategic Exploration in Reinforcement Learning - New Algorithms and Learning Guarantees Christoph Dann, 2019
Learning Generative Models using Transformations Chun-Liang Li, 2019
Estimating Probability Distributions and their Properties Shashank Singh, 2019
Post-Inference Methods for Scalable Probabilistic Modeling and Sequential Decision Making Willie Neiswanger, 2019
Accelerating Text-as-Data Research in Computational Social Science Dallas Card, 2019
Multi-view Relationships for Analytics and Inference Eric Lei, 2019
Information flow in networks based on nonstationary multivariate neural recordings Natalie Klein, 2019
Competitive Analysis for Machine Learning & Data Science Michael Spece, 2019
The When, Where and Why of Human Memory Retrieval Qiong Zhang, 2019
Towards Effective and Efficient Learning at Scale Adams Wei Yu, 2019
Towards Literate Artificial Intelligence Mrinmaya Sachan, 2019
Learning Gene Networks Underlying Clinical Phenotypes Under SNP Perturbations From Genome-Wide Data Calvin McCarter, 2019
Unified Models for Dynamical Systems Carlton Downey, 2019
Anytime Prediction and Learning for the Balance between Computation and Accuracy Hanzhang Hu, 2019
Statistical and Computational Properties of Some "User-Friendly" Methods for High-Dimensional Estimation Alnur Ali, 2019
Nonparametric Methods with Total Variation Type Regularization Veeranjaneyulu Sadhanala, 2019
New Advances in Sparse Learning, Deep Networks, and Adversarial Learning: Theory and Applications Hongyang Zhang, 2019
Gradient Descent for Non-convex Problems in Modern Machine Learning Simon Shaolei Du, 2019
Selective Data Acquisition in Learning and Decision Making Problems Yining Wang, 2019
Anomaly Detection in Graphs and Time Series: Algorithms and Applications Bryan Hooi, 2019
Neural dynamics and interactions in the human ventral visual pathway Yuanning Li, 2018
Tuning Hyperparameters without Grad Students: Scaling up Bandit Optimisation Kirthevasan Kandasamy, 2018
Teaching Machines to Classify from Natural Language Interactions Shashank Srivastava, 2018
Statistical Inference for Geometric Data Jisu Kim, 2018
Representation Learning @ Scale Manzil Zaheer, 2018
Diversity-promoting and Large-scale Machine Learning for Healthcare Pengtao Xie, 2018
Distribution and Histogram (DIsH) Learning Junier Oliva, 2018
Stress Detection for Keystroke Dynamics Shing-Hon Lau, 2018
Sublinear-Time Learning and Inference for High-Dimensional Models Enxu Yan, 2018
Neural population activity in the visual cortex: Statistical methods and application Benjamin Cowley, 2018
Efficient Methods for Prediction and Control in Partially Observable Environments Ahmed Hefny, 2018
Learning with Staleness Wei Dai, 2018
Statistical Approach for Functionally Validating Transcription Factor Bindings Using Population SNP and Gene Expression Data Jing Xiang, 2017
New Paradigms and Optimality Guarantees in Statistical Learning and Estimation Yu-Xiang Wang, 2017
Dynamic Question Ordering: Obtaining Useful Information While Reducing User Burden Kirstin Early, 2017
New Optimization Methods for Modern Machine Learning Sashank J. Reddi, 2017
Active Search with Complex Actions and Rewards Yifei Ma, 2017
Why Machine Learning Works George D. Montañez, 2017
Source-Space Analyses in MEG/EEG and Applications to Explore Spatio-temporal Neural Dynamics in Human Vision Ying Yang, 2017
Computational Tools for Identification and Analysis of Neuronal Population Activity Pengcheng Zhou, 2016
Expressive Collaborative Music Performance via Machine Learning Gus (Guangyu) Xia, 2016
Supervision Beyond Manual Annotations for Learning Visual Representations Carl Doersch, 2016
Exploring Weakly Labeled Data Across the Noise-Bias Spectrum Robert W. H. Fisher, 2016
Optimizing Optimization: Scalable Convex Programming with Proximal Operators Matt Wytock, 2016
Combining Neural Population Recordings: Theory and Application William Bishop, 2015
Discovering Compact and Informative Structures through Data Partitioning Madalina Fiterau-Brostean, 2015
Machine Learning in Space and Time Seth R. Flaxman, 2015
The Time and Location of Natural Reading Processes in the Brain Leila Wehbe, 2015
Shape-Constrained Estimation in High Dimensions Min Xu, 2015
Spectral Probabilistic Modeling and Applications to Natural Language Processing Ankur Parikh, 2015
Computational and Statistical Advances in Testing and Learning Aaditya Kumar Ramdas, 2015
Corpora and Cognition: The Semantic Composition of Adjectives and Nouns in the Human Brain Alona Fyshe, 2015
Learning Statistical Features of Scene Images Wooyoung Lee, 2014
Towards Scalable Analysis of Images and Videos Bin Zhao, 2014
Statistical Text Analysis for Social Science Brendan T. O'Connor, 2014
Modeling Large Social Networks in Context Qirong Ho, 2014
Semi-Cooperative Learning in Smart Grid Agents Prashant P. Reddy, 2013
On Learning from Collective Data Liang Xiong, 2013
Exploiting Non-sequence Data in Dynamic Model Learning Tzu-Kuo Huang, 2013
Mathematical Theories of Interaction with Oracles Liu Yang, 2013
Short-Sighted Probabilistic Planning Felipe W. Trevizan, 2013
Statistical Models and Algorithms for Studying Hand and Finger Kinematics and their Neural Mechanisms Lucia Castellanos, 2013
Approximation Algorithms and New Models for Clustering and Learning Pranjal Awasthi, 2013
Uncovering Structure in High-Dimensions: Networks and Multi-task Learning Problems Mladen Kolar, 2013
Learning with Sparsity: Structures, Optimization and Applications Xi Chen, 2013
GraphLab: A Distributed Abstraction for Large Scale Machine Learning Yucheng Low, 2013
Graph Structured Normal Means Inference James Sharpnack, 2013 (Joint Statistics & ML PhD)
Probabilistic Models for Collecting, Analyzing, and Modeling Expression Data Hai-Son Phuoc Le, 2013
Learning Large-Scale Conditional Random Fields Joseph K. Bradley, 2013
New Statistical Applications for Differential Privacy Rob Hall, 2013 (Joint Statistics & ML PhD)
Parallel and Distributed Systems for Probabilistic Reasoning Joseph Gonzalez, 2012
Spectral Approaches to Learning Predictive Representations Byron Boots, 2012
Attribute Learning using Joint Human and Machine Computation Edith L. M. Law, 2012
Statistical Methods for Studying Genetic Variation in Populations Suyash Shringarpure, 2012
Data Mining Meets HCI: Making Sense of Large Graphs Duen Horng (Polo) Chau, 2012
Learning with Limited Supervision by Input and Output Coding Yi Zhang, 2012
Target Sequence Clustering Benjamin Shih, 2011
Nonparametric Learning in High Dimensions Han Liu, 2010 (Joint Statistics & ML PhD)
Structural Analysis of Large Networks: Observations and Applications Mary McGlohon, 2010
Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy Brian D. Ziebart, 2010
Tractable Algorithms for Proximity Search on Large Graphs Purnamrita Sarkar, 2010
Rare Category Analysis Jingrui He, 2010
Coupled Semi-Supervised Learning Andrew Carlson, 2010
Fast Algorithms for Querying and Mining Large Graphs Hanghang Tong, 2009
Efficient Matrix Models for Relational Learning Ajit Paul Singh, 2009
Exploiting Domain and Task Regularities for Robust Named Entity Recognition Andrew O. Arnold, 2009
Theoretical Foundations of Active Learning Steve Hanneke, 2009
Generalized Learning Factors Analysis: Improving Cognitive Models with Machine Learning Hao Cen, 2009
Detecting Patterns of Anomalies Kaustav Das, 2009
Dynamics of Large Networks Jurij Leskovec, 2008
Computational Methods for Analyzing and Modeling Gene Regulation Dynamics Jason Ernst, 2008
Stacked Graphical Learning Zhenzhen Kou, 2007
Actively Learning Specific Function Properties with Applications to Statistical Inference Brent Bryan, 2007
Approximate Inference, Structure Learning and Feature Estimation in Markov Random Fields Pradeep Ravikumar, 2007
Scalable Graphical Models for Social Networks Anna Goldenberg, 2007
Measure Concentration of Strongly Mixing Processes with Applications Leonid Kontorovich, 2007
Tools for Graph Mining Deepayan Chakrabarti, 2005
Automatic Discovery of Latent Variable Models Ricardo Silva, 2005
10 Compelling Machine Learning Ph.D. Dissertations for 2020
Posted by Daniel Gutierrez, ODSC, August 19, 2020
As a data scientist, an integral part of my work in the field revolves around keeping current with research coming out of academia. I frequently scour arXiv.org for late-breaking papers that show trends and reveal fertile areas of research. Another source of valuable research developments is the Ph.D. dissertation, the culmination of a doctoral candidate’s work toward the degree. Ph.D. candidates are highly motivated to choose research topics that establish new and creative paths toward discovery in their field of study. Their dissertations are highly focused on a specific problem. If you can find a dissertation that aligns with your areas of interest, consuming the research is an excellent way to do a deep dive into the technology. After reviewing hundreds of recent theses from universities all over the country, I present 10 machine learning dissertations that I found compelling in terms of my own areas of interest.
I hope you’ll find several that match your own fields of inquiry. Each thesis may take a while to consume but will result in hours of satisfying summer reading. Enjoy!
1. Bayesian Modeling and Variable Selection for Complex Data
As we routinely encounter high-throughput data sets in complex biological and environmental research, developing novel models and methods for variable selection has received widespread attention. This dissertation addresses a few key challenges in Bayesian modeling and variable selection for high-dimensional data with complex spatial structures.
2. Topics in Statistical Learning with a Focus on Large Scale Data
Big data vary in shape and call for different approaches. One type of big data is the tall data, i.e., a very large number of samples but not too many features. This dissertation describes a general communication-efficient algorithm for distributed statistical learning on this type of big data. The algorithm distributes the samples uniformly to multiple machines, and uses a common reference data to improve the performance of local estimates. The algorithm enables potentially much faster analysis, at a small cost to statistical performance.
Another type of big data is the wide data, i.e., too many features but a limited number of samples. It is also called high-dimensional data, to which many classical statistical methods are not applicable.
This dissertation discusses a method of dimensionality reduction for high-dimensional classification. The method partitions features into independent communities and splits the original classification problem into separate smaller ones. It enables parallel computing and produces more interpretable results.
3. Sets as Measures: Optimization and Machine Learning
The purpose of this machine learning dissertation is to address the following simple question:
How do we design efficient algorithms to solve optimization or machine learning problems where the decision variable (or target label) is a set of unknown cardinality?
Optimization and machine learning have proved remarkably successful in applications requiring the choice of single vectors. Some tasks, in particular many inverse problems, call for the design, or estimation, of sets of objects. When the size of these sets is a priori unknown, directly applying optimization or machine learning techniques designed for single vectors appears difficult. The work in this dissertation shows that a very old idea for transforming sets into elements of a vector space (namely, a space of measures), a common trick in theoretical analysis, generates effective practical algorithms.
4. A Geometric Perspective on Some Topics in Statistical Learning
Modern science and engineering often generate data sets with a large sample size and a comparably large dimension which puts classic asymptotic theory into question in many ways. Therefore, the main focus of this dissertation is to develop a fundamental understanding of statistical procedures for estimation and hypothesis testing from a non-asymptotic point of view, where both the sample size and problem dimension grow hand in hand. A range of different problems are explored in this thesis, including work on the geometry of hypothesis testing, adaptivity to local structure in estimation, effective methods for shape-constrained problems, and early stopping with boosting algorithms. The treatment of these different problems shares the common theme of emphasizing the underlying geometric structure.
5. Essays on Random Forest Ensembles
A random forest is a popular machine learning ensemble method that has proven successful in solving a wide range of classification problems. While other successful classifiers, such as boosting algorithms or neural networks, admit natural interpretations as maximum likelihood, a suitable statistical interpretation is much more elusive for a random forest. The first part of this dissertation demonstrates that a random forest is a fruitful framework in which to study AdaBoost and deep neural networks. The work explores the concept and utility of interpolation, the ability of a classifier to perfectly fit its training data. The second part of this dissertation places a random forest on more sound statistical footing by framing it as kernel regression with the proximity kernel. The work then analyzes the parameters that control the bandwidth of this kernel and discusses useful generalizations.
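To make the "kernel regression with the proximity kernel" idea concrete, here is a minimal sketch. It assumes the usual definition of proximity (the fraction of trees in which two points land in the same leaf) and uses made-up leaf assignments instead of a fitted forest. The function names are hypothetical, and the dissertation's exact formulation may differ.

```python
# Toy illustration: prediction as a proximity-weighted average of training
# responses. The leaf indices below are invented for the example; in
# practice they would come from a fitted random forest.

def proximity(leaves_a, leaves_b):
    """Fraction of trees in which two points land in the same leaf."""
    same = sum(1 for a, b in zip(leaves_a, leaves_b) if a == b)
    return same / len(leaves_a)


def kernel_predict(query_leaves, train_leaves, train_y):
    """Proximity-weighted average of the training responses."""
    weights = [proximity(query_leaves, tl) for tl in train_leaves]
    total = sum(weights)
    if total == 0:
        # Query shares no leaf with any training point: fall back to the mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Hypothetical leaf indices: one row per training point, one column per tree.
train_leaves = [
    [0, 1, 2],
    [0, 1, 0],
    [3, 2, 2],
    [3, 2, 1],
]
train_y = [1.0, 1.2, 3.0, 3.4]

query_leaves = [0, 1, 2]  # leaves reached by a new point in each of 3 trees
prediction = kernel_predict(query_leaves, train_leaves, train_y)
print(round(prediction, 3))
```

Training points that share more leaves with the query receive more weight, which is what gives the forest a kernel-like "bandwidth" controlled by how finely the trees partition the data.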
6. Marginally Interpretable Generalized Linear Mixed Models
A popular approach for relating correlated measurements of a non-Gaussian response variable to a set of predictors is to introduce latent random variables and fit a generalized linear mixed model. The conventional strategy for specifying such a model leads to parameter estimates that must be interpreted conditional on the latent variables. In many cases, interest lies not in these conditional parameters, but rather in marginal parameters that summarize the average effect of the predictors across the entire population. Due to the structure of the generalized linear mixed model, the average effect across all individuals in a population is generally not the same as the effect for an average individual. Further complicating matters, obtaining marginal summaries from a generalized linear mixed model often requires evaluation of an analytically intractable integral or use of an approximation. Another popular approach in this setting is to fit a marginal model using generalized estimating equations. This strategy is effective for estimating marginal parameters, but leaves one without a formal model for the data with which to assess quality of fit or make predictions for future observations. Thus, there exists a need for a better approach.
This dissertation defines a class of marginally interpretable generalized linear mixed models that leads to parameter estimates with a marginal interpretation while maintaining the desirable statistical properties of a conditionally specified model. The distinguishing feature of these models is an additive adjustment that accounts for the curvature of the link function and thereby preserves a specific form for the marginal mean after integrating out the latent random variables.
7. On the Detection of Hate Speech, Hate Speakers and Polarized Groups in Online Social Media
The objective of this dissertation is to explore the use of machine learning algorithms in understanding and detecting hate speech, hate speakers and polarized groups in online social media. Beginning with a unique typology for detecting abusive language, the work outlines the distinctions and similarities of different abusive language subtasks (offensive language, hate speech, cyberbullying and trolling) and how we might benefit from the progress made in each area. Specifically, the work suggests that each subtask can be categorized based on whether or not the abusive language being studied 1) is directed at a specific individual, or targets a generalized “Other” and 2) the extent to which the language is explicit versus implicit. The work then uses knowledge gained from this typology to tackle the “problem of offensive language” in hate speech detection.
8. Lasso Guarantees for Dependent Data
Serially correlated high dimensional data are prevalent in the big data era. In order to predict and learn the complex relationship among the multiple time series, high dimensional modeling has gained importance in various fields such as control theory, statistics, economics, finance, genetics and neuroscience. This dissertation studies a number of high dimensional statistical problems involving different classes of mixing processes.
9. Random forest robustness, variable importance, and tree aggregation
Random forest methodology is a nonparametric, machine learning approach capable of strong performance in regression and classification problems involving complex data sets. In addition to making predictions, random forests can be used to assess the relative importance of feature variables. This dissertation explores three topics related to random forests: tree aggregation, variable importance, and robustness.
10. Climate Data Computing: Optimal Interpolation, Averaging, Visualization and Delivery
This dissertation solves two important problems in the modern analysis of big climate data. The first is the efficient visualization and fast delivery of big climate data, and the second is a computationally extensive principal component analysis (PCA) using spherical harmonics on the Earth’s surface. The second problem creates a way to supply the data for the technology developed in the first. These two problems are computationally difficult, such as the representation of higher order spherical harmonics Y400, which is critical for upscaling weather data to almost infinitely fine spatial resolution.
I hope you enjoyed learning about these compelling machine learning dissertations.
Editor’s note: Interested in more data science research? Check out the Research Frontiers track at ODSC Europe this September 17-19 or the ODSC West Research Frontiers track this October 27-30.
Daniel Gutierrez, ODSC
Daniel D. Gutierrez is a practicing data scientist who’s been working with data long before the field came in vogue. As a technology journalist, he enjoys keeping a pulse on this fast-paced industry. Daniel is also an educator having taught data science, machine learning and R classes at the university level. He has authored four computer industry books on database and data science technology, including his most recent title, “Machine Learning and Data Science: An Introduction to Statistical Learning Methods with R.” Daniel holds a BS in Mathematics and Computer Science from UCLA.
thesis-project
Here are 172 public repositories matching this topic...
dimvasdim / thesis
Dynamic pricing of e-shop products through machine learning algorithms
- Updated Dec 27, 2020
hector6298 / EVOCamCal-vehicleSpeedEstimation
Graduation Project. Vehicle speed estimation using computer vision and evolutionary-based camera calibration
- Updated Jun 30, 2022
- Jupyter Notebook
rickirby / SCANDO_iOS
Apps for translating Braille document captured by iPhone camera, then send translation result to ITS's Braille printer for duplicating purpose (re-printing, copying braille document with no original text)
- Updated Oct 6, 2023
amirashoori7 / sdn_qos
This is my master's thesis project: "QoS implementation in Software Defined Network using Ryu Controller"
- Updated Mar 19, 2023
martysai / artificial-text-detection
Python framework for artificial text detection: NLP approaches to compare natural text against generated by neural networks.
- Updated Sep 5, 2023
georgetz15 / mss-thesis
Pytorch implementation of MDensenet and sparse NMF. Made for my undergraduate thesis "Music Source Separation with Supervised Learning Methods".
- Updated Jan 31, 2021
Miksus / thesis-computational-artificial-market
Artificial stock market (ASM) with Julia language.
- Updated Aug 10, 2021
MarcoParola / DL-quickstart
Deep Learning quick guide to setting up master thesis projects
- Updated Mar 4, 2024
aaaastark / Intrusion-Detection-System
Attack Detection, Parameter Optimization and Performance Analysis in Enterprise Networks (ML Networks) for Intrusion Detection System IDS.
- Updated Oct 31, 2023
thehale / DIY-Smartcube
A proof-of-concept proposal for turning standard Rubik's Cubes into smartcubes by embedding speakers into the cube's centercaps.
- Updated Feb 21, 2024
dofranko / castle-defender-ar
Augmented Reality game for Android. Your Castle needs to be protected from enemies. Powered by ARCore, Unity
- Updated Feb 8, 2022
Ashishkumar-hub / Detection-of-Retinal-Blood-Vessels-with-their-Geometrical-Physical-Characteristics-for-ROP
Master's Thesis Project on Detection of Retinal Blood Vessels and their Geometrical Physical Characteristics
- Updated Jan 8, 2022
raulmabe / tfg_app
Production-ready application developed with Flutter, using BLoC as the state management pattern. Moreover, this application connects with an independent back-end that uses GraphQL and MongoDB.
- Updated Jan 25, 2021
yigitozgumus / Polimi_Thesis
This is the code repository of my master's thesis titled "Adversarially Learned Anomaly Detection using Generative Adversarial Networks"
- Updated Sep 22, 2022
zpi-2023 / SensoBackend
Backend of the application made for the Team Project course taken in the last semester of BSc studies. Built in ASP.NET. 👴📱
- Updated Dec 14, 2023
zpi-2023 / senso-frontend
Frontend of the application made for the Team Project course taken in the last semester of BSc studies. Built in React Native. 👴📱
- Updated Dec 15, 2023
MasterCruelty / Thesis-IAC
An IaC solution for deploying infrastructure in a DevOps setting
- Updated Apr 3, 2024
ManhTin / ba-webcam-scraper
Scraper to obtain training / test images from public parking lot webcams for my bachelor thesis project
- Updated Nov 22, 2022
jennynguyenoberg / about-consulting
This is my Master Thesis where I explore Typescript, Tailwind CSS and a variety of animation techniques, such as Framer Motion, GSAP and Locomotive Scroll.
- Updated Feb 5, 2024
hopeliz / ridges
Working files for my thesis project
- Updated Jan 9, 2022
Bias and Variance in Machine Learning
The bias-variance tradeoff is the classical explanation for the generalisation behaviour of machine learning models across supervised learning tasks. This project explores how well the bias-variance framework explains the observed behaviour of machine learning algorithms. We present the bias-variance decomposition for two commonly used regression and classification losses: squared and cross-entropy. These losses are examples of Bregman divergences. We then derive the generalised bias-variance decomposition for Bregman divergences. In addition, we demonstrate that bias-variance theory correctly explains the model behaviour across various supervised learning regimes and, when considered with recent literature, the behaviour of ensemble methods. The report concludes by examining modern deep learning architectures and other large models, demonstrating how classical bias-variance theory must be adapted to explain their behaviour. We explore the double-descent risk curves produced by deep learning architectures. These can be explained by a monotonically decreasing bias and unimodal variance terms. Bias-variance theory remains a valuable framework for reasoning about the behaviour of machine learning models, but there are still open questions. Further research on the dynamics of bias-variance theory and deep learning could help improve practitioners’ understanding of deep learning models and lay the groundwork for a unified theory of deep learning.
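The squared-loss decomposition mentioned above can be checked numerically. The sketch below is not taken from the report. It assumes a quadratic ground truth, Gaussian noise, and a deliberately misspecified straight-line model, then estimates bias and variance of the prediction at one test point over many resampled training sets.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def true_f(x):
    # Assumed ground truth, chosen so that a straight line is misspecified.
    return x * x


random.seed(0)
x0, noise_sd, n_train, n_trials = 0.9, 0.3, 20, 2000

# Refit the model on many independent training sets and record f_hat(x0).
preds = []
for _ in range(n_trials):
    xs = [random.uniform(-1, 1) for _ in range(n_train)]
    ys = [true_f(x) + random.gauss(0, noise_sd) for x in xs]
    a, b = fit_line(xs, ys)
    preds.append(a + b * x0)

mean_pred = sum(preds) / n_trials
bias_sq = (mean_pred - true_f(x0)) ** 2
variance = sum((p - mean_pred) ** 2 for p in preds) / n_trials
mse_to_truth = sum((p - true_f(x0)) ** 2 for p in preds) / n_trials

print(f"bias^2={bias_sq:.4f} variance={variance:.4f} sum={bias_sq + variance:.4f}")
print(f"mse to noiseless target={mse_to_truth:.4f}")
```

The error against the noiseless target equals squared bias plus variance. Against a fresh noisy observation, the irreducible noise variance (here `noise_sd ** 2`) is added on top.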
Luke Braithwaite
MPhil Advanced Computer Science student.
I am an MPhil ACS student at Peterhouse, University of Cambridge and my research interests are graph representation learning and geometric deep learning. My current research explores adapting sheaf-based methods for heterogeneous graph data.
Undergraduate Fundamentals of Machine Learning
- Data Analytics and Machine Learning Group
- TUM School of Computation, Information and Technology
- Technical University of Munich
Open Topics
We offer multiple Bachelor/Master theses, Guided Research projects and IDPs in the area of data mining/machine learning. A non-exhaustive list of open topics is listed below.
If you are interested in a thesis or a guided research project, please send your CV and transcript of records to Prof. Stephan Günnemann via email and we will arrange a meeting to talk about the potential topics.
Robustness of Large Language Models
Type: Master's Thesis
Prerequisites:
- Strong knowledge in machine learning
- Very good coding skills
- Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
- Knowledge about NLP and LLMs
Description:
The success of Large Language Models (LLMs) has precipitated their deployment across a diverse range of applications. With the integration of plugins enhancing their capabilities, it becomes imperative to ensure that the governing rules of these LLMs are foolproof and immune to circumvention. Recent studies have exposed significant vulnerabilities inherent to these models, underlining an urgent need for more rigorous research to fortify their resilience and reliability. A focus in this work will be the understanding of the working mechanisms of these attacks.
We are currently seeking students for the upcoming Summer Semester of 2024, so we welcome prompt applications.
Contact: Tom Wollschläger
References:
- Universal and Transferable Adversarial Attacks on Aligned Language Models
- Attacking Large Language Models with Projected Gradient Descent
- Representation Engineering: A Top-Down Approach to AI Transparency
- Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks
Generative Models for Drug Discovery
Type: Master's Thesis / Guided Research
Prerequisites:
- Strong machine learning knowledge
- Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
- Knowledge of graph neural networks (e.g. GCN, MPNN)
- No formal education in chemistry, physics or biology needed!
Description:
Effectively designing molecular geometries is essential to advancing pharmaceutical innovations, a domain that has attracted great attention through the success of generative models. These models promise a more efficient exploration of the vast chemical space and generation of novel compounds with specific properties by leveraging their learned representations, potentially leading to the discovery of molecules with unique properties that would otherwise go undiscovered. Our topics lie at the intersection of generative models like diffusion/flow matching models and graph representation learning, e.g., graph neural networks. The focus of our projects can be model development with an emphasis on downstream tasks (e.g., diffusion guidance at inference time) and a better understanding of the limitations of existing models.
Contact: Johanna Sommer, Leon Hetzel
References:
- Equivariant Diffusion for Molecule Generation in 3D
- Equivariant Flow Matching with Hybrid Probability Transport for 3D Molecule Generation
- Structure-based Drug Design with Equivariant Diffusion Models
Data Pruning and Active Learning
Type: Interdisciplinary Project (IDP) / Hiwi / Guided Research / Master's Thesis
Description:
Data pruning and active learning are vital techniques in scaling machine learning applications efficiently. Data pruning involves the removal of redundant or irrelevant data, which enables training models with considerably less data but the same performance. Similarly, active learning describes the process of selecting the most informative data points for labeling, thus reducing annotation costs and accelerating model training. However, current methods are often computationally expensive, which makes them difficult to apply in practice. Our objective is to scale active learning and data pruning methods to large datasets using an extrapolation-based approach.
Contact: Sebastian Schmidt, Tom Wollschläger, Leo Schwinn
References:
- Large-scale Dataset Pruning with Dynamic Uncertainty
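As a rough illustration of "selecting the most informative data points for labeling", the sketch below implements plain entropy-based uncertainty sampling. This is a generic baseline and not the extrapolation-based approach the topic refers to. The pool probabilities and function names are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(pool_probs, budget):
    """Return indices of the `budget` pool items with highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs on an unlabeled pool (3 classes).
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # close to uniform, very uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]
chosen = select_for_labeling(pool_probs, budget=2)
print(chosen)
```

Scoring every pool item with the current model is exactly the step that becomes expensive on large datasets, which is the practical obstacle the description points out.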
Efficient Machine Learning: Pruning, Quantization, Distillation, and More - DAML x Pruna AI
Type: Master's Thesis / Guided Research / Hiwi
Description:
The efficiency of machine learning algorithms is commonly evaluated by looking at target performance, speed and memory footprint metrics. Reducing the costs associated with these metrics is of primary importance for real-world applications with limited resources (e.g. embedded systems, real-time predictions). In this project, you will work in collaboration with the DAML research group and the Pruna AI startup on investigating solutions to improve the efficiency of machine learning models by looking at multiple techniques like pruning, quantization, distillation, and more.
Contact: Bertrand Charpentier
References:
- The Efficiency Misnomer
- A Gradient Flow Framework for Analyzing Network Pruning
- Distilling the Knowledge in a Neural Network
- A Survey of Quantization Methods for Efficient Neural Network Inference
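Two of the techniques named in this topic can be sketched in a few lines. The example below shows unstructured magnitude pruning and symmetric uniform quantization on a flat list of weights. It is a simplified illustration with hypothetical function names, not the tooling used by the group or by Pruna AI.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    pruned, removed = [], 0
    for w in weights:
        # The `removed < k` guard keeps exactly k zeros when magnitudes tie.
        if abs(w) <= threshold and removed < k:
            pruned.append(0.0)
            removed += 1
        else:
            pruned.append(w)
    return pruned


def quantize_uniform(weights, num_bits):
    """Symmetric uniform quantization to signed integers, then dequantize."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax
    if scale == 0:
        return list(weights), [0] * len(weights)
    ints = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return [q * scale for q in ints], ints


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(weights, sparsity=0.5)
dequantized, ints = quantize_uniform(weights, num_bits=8)
print(pruned)
print(ints)
```

Pruning trades accuracy for sparsity, while quantization trades it for a smaller number format. In both cases the reconstruction error on the weights is only a proxy for the change in task performance.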
Deep Generative Models
Type: Master's Thesis / Guided Research
Prerequisites:
- Strong machine learning and probability theory knowledge
- Knowledge of generative models and their basics (e.g., Normalizing Flows, Diffusion Models, VAE)
- Optional: Neural ODEs/SDEs, Optimal Transport, Measure Theory
Description:
With recent advances, such as Diffusion Models, Transformers, Normalizing Flows, Flow Matching, etc., the field of generative models has gained significant attention in the machine learning and artificial intelligence research community. However, many problems and questions remain open, and the application to complex data domains such as graphs, time series, point processes, and sets is often non-trivial. We are interested in supervising motivated students to explore and extend the capabilities of state-of-the-art generative models for various data domains.
Contact: Marcel Kollovieh, David Lüdke
References:
- Flow Matching for Generative Modeling
- Auto-Encoding Variational Bayes
- Denoising Diffusion Probabilistic Models
- Structured Denoising Diffusion Models in Discrete State-Spaces
Graph Structure Learning
Type: Guided Research / Hiwi
Prerequisites:
- Optional: Knowledge of graph theory and mathematical optimization
Description:
Graph deep learning is a powerful ML concept that enables the generalisation of successful deep neural architectures to non-Euclidean structured data. Such methods have shown promising results in a vast range of applications spanning the social sciences, biomedicine, particle physics, computer vision, graphics and chemistry. One of the major limitations of most current graph neural network architectures is that they often rely on the assumption that the underlying graph is known and fixed. However, this assumption is not always true, as the graph may be noisy or partially and even completely unknown. In the case of noisy or partially available graphs, it would be useful to jointly learn an optimised graph structure and the corresponding graph representations for the downstream task. On the other hand, when the graph is completely absent, it would be useful to infer it directly from the data. This is particularly interesting in inductive settings where some of the nodes were not present at training time. Furthermore, learning a graph can become an end in itself, as the inferred structure can provide complementary insights with respect to the downstream task. In this project, we aim to investigate solutions and devise new methods to construct an optimal graph structure based on the available (unstructured) data.
Contact: Filippo Guerranti
References:
- A Survey on Graph Structure Learning: Progress and Opportunities
- Differentiable Graph Module (DGM) for Graph Convolutional Networks
- Learning Discrete Structures for Graph Neural Networks
- NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification
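For the case where the graph is completely absent and has to be inferred from the data, a common non-learned baseline is a k-nearest-neighbour graph over the feature vectors. The sketch below shows that baseline only. It is not one of the learned methods from the references, and the toy points and function name are hypothetical.

```python
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbour edge list from raw feature vectors."""
    edges = set()
    for i, p in enumerate(points):
        # Distances from point i to every other point, nearest first.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))  # store each undirected edge once
    return sorted(edges)


# Two well-separated toy clusters in the plane.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
edges = knn_graph(points, k=1)
print(edges)
```

A fixed rule like this cannot adapt the graph to the downstream task, which is the gap that jointly learning the structure and the node representations is meant to close.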
A Machine Learning Perspective on Corner Cases in Autonomous Driving Perception
Type: Master's Thesis
Industrial partner: BMW
Prerequisites:
- Strong knowledge in machine learning
- Knowledge of Semantic Segmentation
- Good programming skills
- Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
Description:
In autonomous driving, state-of-the-art deep neural networks are used for perception tasks such as semantic segmentation. While the environment in datasets is controlled, novel classes or unknown disturbances can occur in real-world applications. To provide safe autonomous driving, these cases must be identified.
The objective is to explore novel class segmentation and out of distribution approaches for semantic segmentation in the context of corner cases for autonomous driving.
Contact: Sebastian Schmidt
References:
- Segmenting Known Objects and Unseen Unknowns without Prior Knowledge
- Efficient Uncertainty Estimation for Semantic Segmentation in Videos
- Natural Posterior Network: Deep Bayesian Uncertainty for Exponential Family
- Description of Corner Cases in Automated Driving: Goals and Challenges
Active Learning for Multi Agent 3D Object Detection
Type: Master's Thesis
Industrial partner: BMW
Prerequisites:
- Knowledge in Object Detection
- Excellent programming skills
Description:
In autonomous driving, state-of-the-art deep neural networks are used for perception tasks such as 3D object detection. To provide promising results, these networks often require a lot of complex annotation data for training. These annotations are often costly and redundant. Active learning is used to select the most informative samples for annotation and cover a dataset with as little annotated data as possible.
The objective is to explore active learning approaches for 3D object detection using combined uncertainty and diversity based methods.
References:
- Exploring Diversity-based Active Learning for 3D Object Detection in Autonomous Driving
- Efficient Uncertainty Estimation for Semantic Segmentation in Videos
- KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection
- Towards Open World Active Learning for 3D Object Detection
Graph Neural Networks
Type: Master's thesis / Bachelor's thesis / guided research
Prerequisites:
- Knowledge of graph/network theory
Description:
Graph neural networks (GNNs) have recently achieved great successes in a wide variety of applications, such as chemistry, reinforcement learning, knowledge graphs, traffic networks, or computer vision. These models leverage graph data by updating node representations based on messages passed between nodes connected by edges, or by transforming node representation using spectral graph properties. These approaches are very effective, but many theoretical aspects of these models remain unclear and there are many possible extensions to improve GNNs and go beyond the nodes' direct neighbors and simple message aggregation.
Contact: Simon Geisler
References:
- Semi-supervised classification with graph convolutional networks
- Relational inductive biases, deep learning, and graph networks
- Diffusion Improves Graph Learning
- Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks
- Reliable Graph Neural Networks via Robust Aggregation
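The phrase "updating node representations based on messages passed between nodes connected by edges" can be illustrated with a single round of mean aggregation. This sketch deliberately omits learned weights and nonlinearities, so it is only the skeleton of a GNN layer. The toy graph and function name are hypothetical.

```python
def message_passing_step(features, edges):
    """One round of mean-aggregation message passing on an undirected graph.

    Each node's new feature vector is the average of its own vector and
    its neighbours' vectors (no learned weights, for illustration only).
    """
    n = len(features)
    neighbours = {i: [i] for i in range(n)}  # include a self-loop
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    dim = len(features[0])
    updated = []
    for i in range(n):
        msgs = [features[j] for j in neighbours[i]]
        updated.append([sum(m[d] for m in msgs) / len(msgs) for d in range(dim)])
    return updated


# Toy path graph 0 - 1 - 2 with 2-dimensional node features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (1, 2)]
h1 = message_passing_step(features, edges)
print(h1)
```

After one step a node only sees its direct neighbours. Stacking steps widens the receptive field, and going beyond direct neighbours and simple aggregation is one of the extensions the description mentions.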
Physics-aware Graph Neural Networks
Type: Master's thesis / guided research
Prerequisites:
- Proficiency with Python and deep learning frameworks (JAX or PyTorch)
- Knowledge of graph neural networks (e.g. GCN, MPNN, SchNet)
- Optional: Knowledge of machine learning on molecules and quantum chemistry
Description:
Deep learning models, especially graph neural networks (GNNs), have recently achieved great successes in predicting quantum mechanical properties of molecules. There is a vast amount of applications for these models, such as finding the best method of chemical synthesis or selecting candidates for drugs, construction materials, batteries, or solar cells. However, GNNs have only been proposed in recent years and there remain many open questions about how to best represent and leverage quantum mechanical properties and methods.
Contact: Nicholas Gao
References:
- Directional Message Passing for Molecular Graphs
- Neural message passing for quantum chemistry
- Learning to Simulate Complex Physics with Graph Networks
- Ab initio solution of the many-electron Schrödinger equation with deep neural networks
- Ab-Initio Potential Energy Surfaces by Pairing GNNs with Neural Wave Functions
- Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds
Robustness Verification for Deep Classifiers
Type: Master's thesis / Guided research
Prerequisites:
- Strong machine learning knowledge (at least equivalent to IN2064 plus an advanced course on deep learning)
- Strong background in mathematical optimization (preferably combined with Machine Learning setting)
- Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
- (Preferred) Knowledge of training techniques to obtain classifiers that are robust against small perturbations in data
Description:
Recent work shows that deep classifiers suffer in the presence of adversarial examples: misclassified points that are very close to the training samples or even visually indistinguishable from them. This undesired behaviour constrains the possibilities for deploying promising neural-network-based classification methods in safety-critical scenarios. Therefore, new training methods should be proposed that promote (or preferably ensure) robust behaviour of the classifier around training samples.
Contact: Aleksei Kuvshinov
References (Background):
- Intriguing properties of neural networks
- Explaining and harnessing adversarial examples
- SoK: Certified Robustness for Deep Neural Networks
- Certified Adversarial Robustness via Randomized Smoothing
- Formal guarantees on the robustness of a classifier against adversarial manipulation
- Towards deep learning models resistant to adversarial attacks
- Provable defenses against adversarial examples via the convex outer adversarial polytope
- Certified defenses against adversarial examples
- Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks
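To show what an adversarial example looks like in the simplest setting, the sketch below applies a fast-gradient-sign style perturbation to a fixed logistic-regression classifier. The weights, input, and epsilon are hypothetical values chosen so that a small bounded perturbation flips the prediction. Deep classifiers and certified defenses are considerably more involved.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """P(y = 1 | x) for a logistic-regression classifier."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(g):
    return (g > 0) - (g < 0)


def fgsm(w, b, x, y, eps):
    """Fast gradient sign perturbation for the logistic loss.

    For logistic regression the gradient of the loss with respect to the
    input is (p - y) * w, so each coordinate moves by eps in the direction
    sign((p - y) * w_i).
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical fixed classifier and a correctly classified input with label 1.
w, b = [2.0, -1.0], 0.0
x, y = [0.3, 0.2], 1
x_adv = fgsm(w, b, x, y, eps=0.25)

p_clean = predict_proba(w, b, x)
p_adv = predict_proba(w, b, x_adv)
print(p_clean > 0.5, p_adv > 0.5)
```

Robustness verification asks the converse question: for a given input and radius `eps`, can one prove that no perturbation inside the ball changes the predicted class?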
Uncertainty Estimation in Deep Learning
Type: Master's Thesis / Guided Research
Prerequisites:
- Strong knowledge in probability theory
Description:
Safe prediction is a key feature in many intelligent systems. Classically, machine learning models compute output predictions regardless of the underlying uncertainty of the encountered situations. In contrast, aleatoric and epistemic uncertainty bring knowledge about undecidable and uncommon situations. The uncertainty view can be a substantial help to detect and explain unsafe predictions, and therefore make ML systems more robust. The goal of this project is to improve the uncertainty estimation in ML models in various types of tasks.
Contact: Tom Wollschläger, Dominik Fuchsgruber, Bertrand Charpentier
References:
- Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
- Predictive Uncertainty Estimation via Prior Networks
- Posterior Network: Uncertainty Estimation without OOD samples via Density-based Pseudo-Counts
- Evidential Deep Learning to Quantify Classification Uncertainty
- Weight Uncertainty in Neural Networks
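The distinction between aleatoric and epistemic uncertainty can be illustrated with a common entropy-based decomposition over an ensemble of classifiers. This is a minimal sketch under the assumption that several sets of class probabilities are available for the same input (for example from ensemble members or posterior samples); it is one possible estimator, not necessarily the one used in the listed references.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def decompose(member_probs):
    """Split the predictive uncertainty of an ensemble into two parts.

    total     = entropy of the averaged prediction
    aleatoric = average entropy of the individual predictions
    epistemic = total - aleatoric (disagreement between members)
    """
    n = len(member_probs)
    k = len(member_probs[0])
    mean = [sum(p[c] for p in member_probs) / n for c in range(k)]
    total = entropy(mean)
    aleatoric = sum(entropy(p) for p in member_probs) / n
    return total, aleatoric, total - aleatoric

# Members agree that the outcome is a coin flip: purely aleatoric.
agree = decompose([[0.5, 0.5], [0.5, 0.5]])
# Members are confident but contradict each other: mostly epistemic.
disagree = decompose([[0.99, 0.01], [0.01, 0.99]])

print(agree, disagree)
```

Both cases have the same total uncertainty, yet only the second signals that the model family itself is unsure, which is the situation associated with uncommon inputs.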
Hierarchies in Deep Learning
Type: Master's Thesis / Guided Research
Multi-scale structures are ubiquitous in real-life datasets. As an example, phylogenetic nomenclature naturally reveals a hierarchical classification of species based on their evolutionary history. Learning multi-scale structures can help to reveal natural and meaningful organizations in the data and also to obtain compact data representations. The goal of this project is to leverage multi-scale structures to improve the speed, performance and understanding of Deep Learning models.
Contact: Marcel Kollovieh , Bertrand Charpentier
- Tree Sampling Divergence: An Information-Theoretic Metric for Hierarchical Graph Clustering
- Hierarchical Graph Representation Learning with Differentiable Pooling
- Gradient-based Hierarchical Clustering
- Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space
machine learning Recently Published Documents
An explainable machine learning model for identifying geographical origins of sea cucumber Apostichopus japonicus based on multi-element profile
A comparison of machine learning- and regression-based models for predicting ductility ratio of RC beam-column joints
Alexa, is this a historical record?
Digital transformation in government has brought an increase in the scale, variety, and complexity of records and greater levels of disorganised data. Current practices for selecting records for transfer to The National Archives (TNA) were developed to deal with paper records and are struggling to deal with this shift. This article examines the background to the problem and outlines a project that TNA undertook to research the feasibility of using commercially available artificial intelligence tools to aid selection. The project AI for Selection evaluated a range of commercial solutions varying from off-the-shelf products to cloud-hosted machine learning platforms, as well as a benchmarking tool developed in-house. Suitability of tools depended on several factors, including requirements and skills of transferring bodies as well as the tools’ usability and configurability. This article also explores questions around trust and explainability of decisions made when using AI for sensitive tasks such as selection.
Automated Text Classification of Maintenance Data of Higher Education Buildings Using Text Mining and Machine Learning Techniques
Data-driven analysis and machine learning for energy prediction in distributed photovoltaic generation plants: a case study in Queensland, Australia
Modeling nutrient removal by membrane bioreactor at a sewage treatment plant using machine learning models
Big five personality prediction based in Indonesian tweets using machine learning methods
The popularity of social media has drawn the attention of researchers who have conducted cross-disciplinary studies examining the relationship between personality traits and behavior on social media. Most current work focuses on personality prediction analysis of English texts, but Indonesian has received scant attention. Therefore, this research aims to predict users' personalities based on Indonesian text from social media using machine learning techniques. This paper evaluates several machine learning techniques, including naive Bayes (NB), K-nearest neighbors (KNN), and support vector machine (SVM), based on semantic features including emotion, sentiment, and publicly available Twitter profile. We predict the personality based on the Big Five personality model, the most appropriate model for predicting user personality in social media. We examine the relationships between the semantic features and the Big Five personality dimensions. The experimental results indicate that the Big Five personality dimensions exhibit distinct emotional, sentimental, and social characteristics and that SVM outperformed NB and KNN for Indonesian. In addition, we observe several terms in Indonesian that specifically refer to each personality type, each of which has distinct emotional, sentimental, and social features.
Compressive strength of concrete with recycled aggregate; a machine learning-based evaluation
Temperature prediction of flat steel box girders of long-span bridges utilizing in situ environmental parameters and machine learning
Computer-assisted cohort identification in practice
The standard approach to expert-in-the-loop machine learning is active learning, where, repeatedly, an expert is asked to annotate one or more records and the machine finds a classifier that respects all annotations made until that point. We propose an alternative approach, IQRef, in which the expert iteratively designs a classifier and the machine helps him or her to determine how well it is performing and, importantly, when to stop, by reporting statistics on a fixed, hold-out sample of annotated records. We justify our approach based on prior work giving a theoretical model of how to re-use hold-out data. We compare the two approaches in the context of identifying a cohort of EHRs and examine their strengths and weaknesses through a case study arising from an optometric research problem. We conclude that both approaches are complementary, and we recommend that they both be employed in conjunction to address the problem of cohort identification in health research.
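The expert-designs, machine-reports loop described in this abstract can be sketched as follows. The abstract does not say which statistics IQRef reports or how it decides when to stop, so the precision/recall report, the record fields and the two rules below are hypothetical, and the sketch leaves out the safeguards for re-using hold-out data that the authors refer to.

```python
def holdout_report(classifier, holdout):
    """Precision and recall of `classifier` on a fixed annotated sample.

    `holdout` is a list of (record, is_in_cohort) pairs that stays the
    same across all iterations of the expert's rule design.
    """
    tp = fp = fn = 0
    for record, label in holdout:
        pred = classifier(record)
        if pred and label:
            tp += 1
        elif pred and not label:
            fp += 1
        elif label:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical annotated records (fields invented for illustration).
holdout = [
    ({"age": 70, "dx": "glaucoma"}, True),
    ({"age": 66, "dx": "glaucoma"}, True),
    ({"age": 45, "dx": "glaucoma"}, False),
    ({"age": 72, "dx": "cataract"}, False),
]

# Two successive versions of an expert-written rule.
rule_v1 = lambda r: r["dx"] == "glaucoma"
rule_v2 = lambda r: r["dx"] == "glaucoma" and r["age"] >= 60

for rule in (rule_v1, rule_v2):
    print(holdout_report(rule, holdout))
```

The expert would inspect the report after each revision and stop once the figures are acceptable.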
Thesis and Lab Rotation Projects
Please check this page regularly; new topics will be added on a rolling basis. As long as there are no topics on this page yet, feel free to contact our group directly.
Available Topics
Assigned topics
Development of novel techniques to "explain" nonlinear prediction models (MSc)
Target group: MSc Students in Computer Science or related fields
Short description : This project will develop and implement novel techniques to "explain" specific classes of non-linear prediction models.
Background : As machine learning and artificial intelligence methods are increasingly used in sensitive applications, a need for such methods to be interpretable to humans has arisen, leading to the formation of the field of "explainable AI" (XAI). However, most XAI methods do not address a well-defined problem and are hence difficult to benchmark. The UNIML group has started to provide problem definitions, benchmarks and performance metrics for assessing "explanation performance". This project will propose novel techniques to derive explanations and interpretations from nonlinear models. In particular, we will be concerned with kernel methods and/or deep neural networks. Existing non-linear benchmark problems from the group will be used to benchmark the proposed approaches and guide their further refinement.
Required skills: Python, machine learning, statistics
Optional skills: experience with deep learning frameworks
Anticipated duration : 6 months
Contact: Stefan Haufe
Investigating the relationship between power and functional connectivity of brain rhythms (MSc)
Target group: MSc Students in Computational Neuroscience or related fields
Short description : This project will study the relationship between the power of rhythmic brain signals and the functional connectivity (coherence, Granger causality) between such signals through theoretical analyses and simulations.
Background : While it is often observed that the synchronization of brain rhythms correlates with their strengths, the relationship between the two can be much more complex. Similarly, directed and undirected functional connectivity metrics can lead to seemingly inconsistent results. The purpose of this project is to derive simple examples that illustrate the complex ways in which power and connectivity can interact.
Required skills: Programming, signal processing
Optional skills: MATLAB, Python
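One simple example of how power and connectivity can interact is the following worked derivation. It assumes (these are illustrative assumptions, not part of the project description) two sensors that both record the same rhythmic source s(t) with power P_s(f), each with independent additive noise of power P_n(f).

```latex
\begin{align}
x_1(t) &= s(t) + n_1(t), \qquad x_2(t) = s(t) + n_2(t) \\
S_{11}(f) &= S_{22}(f) = P_s(f) + P_n(f), \qquad S_{12}(f) = P_s(f) \\
\mathrm{coh}_{12}(f) &= \frac{|S_{12}(f)|}{\sqrt{S_{11}(f)\,S_{22}(f)}}
  = \frac{P_s(f)}{P_s(f) + P_n(f)}
  = \frac{\mathrm{SNR}(f)}{1 + \mathrm{SNR}(f)},
  \qquad \mathrm{SNR}(f) = \frac{P_s(f)}{P_n(f)}
\end{align}
```

Under these assumptions the coherence rises from 0 towards 1 as the source power grows, even though the way the two signals are related never changes. A measured increase in coherence can therefore reflect a change in power alone.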
Enumeration of modes in generated data
Target group: BSc/MSc Students in Computer Science or related fields
Short description : Find data patterns inside high-dimensional time series data to detect poorly performing generative models
Background :
The rise of generative models, such as MidJourney, has brought about significant advancements in the field of machine learning. These models have shown impressive capabilities in creating synthetic data with a wide range of possible applications, including data augmentation, data privacy, and data sharing. However, as generative models become more prevalent, they also raise important ethical and social issues that need to be carefully considered.
In this bachelor's thesis, we aim to investigate one important aspect of generative modeling: the enumeration of possible synthetic data modes or prototypes. This process involves identifying and describing the different variations of synthetic data that can be generated by a given generative model. By enumerating these modes, we can gain a better understanding of the types of synthetic data that can be produced, which can be useful as a quality criterion for synthetic data.
The thesis will consist of several components: writing an exposé that surveys the existing literature for possible solutions, applying or adapting these solutions to time series data, and developing a new method.
Required skills : strong background in statistics/data analysis/machine learning
Anticipated duration : 3 months (or more, depending on deliverables)
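A minimal sketch of how mode enumeration could serve as a quality criterion: each generated series is assigned to its nearest prototype, and prototypes that attract no samples are flagged as missing modes. The prototypes are assumed to be given (for example, cluster centres of the real data), and the nearest-prototype rule with Euclidean distance is an illustrative choice rather than the method the thesis is expected to develop.

```python
def nearest(point, prototypes):
    """Index of the prototype closest to `point` (squared Euclidean)."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, prototypes[i])),
    )

def mode_counts(samples, prototypes):
    """Number of generated samples assigned to each prototype."""
    counts = [0] * len(prototypes)
    for s in samples:
        counts[nearest(s, prototypes)] += 1
    return counts

# Hypothetical prototypes: three short time series of length 3.
prototypes = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.0, 1.0, 0.0]]
# Hypothetical output of a generative model.
generated = [[0.1, 0.0, 0.1], [0.9, 1.1, 1.0], [0.0, 0.1, 0.0], [1.0, 0.9, 1.1]]

counts = mode_counts(generated, prototypes)
missing = [i for i, c in enumerate(counts) if c == 0]
print(counts, missing)
```

Here the third prototype is never produced, a symptom often described as mode collapse.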
Development and validation of an individual head modeling pipeline for MEG source localization (BSc)
Target group: BSc Students in Computer Science or related fields
Short description : This project will develop an individual head modeling pipeline for magnetoencephalography (MEG), and will apply it for the purpose of localizing the sources of real MEG data.
Background : Electrical volume conductor modeling of the head is an important step when it comes to localizing brain sources from magnetoencephalographic (MEG) measurements. Here it is important to take the individual anatomy of the subject's head and its relative position in the MEG scanner into account. This project will develop an individual head modeling pipeline for the Yokogawa MEG system at the PTB, and will test it using real MEG data.
Required skills: Programming experience
Optional skills: MATLAB, Python, basic linear algebra
Anticipated duration : 3 months
Gesture recognition and classification using wearable sensors (BSc/MSc)
Short description : Using sensors on your wearable device (e.g. Android phone, Arduino with gyro- and accelerometer), fuse the data from gyroscope and accelerometer and create a speller.
Background : Currently there is an increasing interest in healthy lifestyles. An accessible way to improve one's lifestyle is to use mobile apps and the sophisticated sensors of smartphones to gather structured information about wellbeing. In this project students will develop an application that uses wearable sensors for recognition and tracking of human activities. The scope of the human activities will depend on the project duration and preparedness of students. The simplest example would be a gesture speller operated by holding a smartphone. A more complicated instance would be tracking of behavioural patterns and activities (eating, sleeping, working, etc). The project consists of 3 main parts: recording of the dataset, data processing and analysis, and application delivery.
Required skills : basic programming in an OOP language and basic knowledge of operating systems
Optional skills: Android programming, signal analysis
Anticipated duration : 3 months (or more, depending on deliverables)
Contact: Rustam Zhumagambetov
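Fusing gyroscope and accelerometer data, as the short description asks, is often done with a complementary filter. The sketch below is one such filter for a single tilt angle; the weight `alpha`, the sampling interval and the simulated readings are assumptions for illustration, and the project may well call for a different fusion method.

```python
import math

def complementary_filter(gyro_rates, accels, dt, alpha=0.98, angle0=0.0):
    """Fuse gyroscope and accelerometer readings into a tilt angle (rad).

    gyro_rates: angular velocity about one axis, in rad/s
    accels:     (a_y, a_z) pairs; the gravity direction gives an angle
    alpha:      weight of the integrated gyroscope estimate
    """
    angle = angle0
    estimates = []
    for rate, (ay, az) in zip(gyro_rates, accels):
        gyro_angle = angle + rate * dt     # smooth, but drifts with bias
        accel_angle = math.atan2(ay, az)   # noisy in practice, but drift-free
        angle = alpha * gyro_angle + (1 - alpha) * accel_angle
        estimates.append(angle)
    return estimates

# Simulated device held still at 30 degrees; the gyroscope reports a
# constant bias of 0.05 rad/s although nothing is rotating.
true_angle = math.radians(30)
n, dt = 2000, 0.01
gyro = [0.05] * n
acc = [(math.sin(true_angle), math.cos(true_angle))] * n

est = complementary_filter(gyro, acc, dt)
drift_if_integrated = sum(gyro) * dt   # error of gyro-only integration
print(est[-1] - true_angle, drift_if_integrated)
```

In this simulation the fused estimate settles within a few hundredths of a radian of the true angle, whereas integrating the biased gyroscope alone would be off by about one radian after 20 seconds.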
Towards robust metrics of amplitude-amplitude coupling between brain areas (MSc)
Short description : This project will conduct simulations to study the influence of source mixing on estimates of amplitude-amplitude coupling (AAC) between neural time series.
Background : The analysis of electrophysiological recordings of brain activity using electroencephalography (EEG) or similar techniques promises to shed light on the working principles of the brain. In particular, measures of interaction between neural time series may provide insight into how communication between different regions is implemented in the brain. One mechanism that has been proposed is through correlation of the envelopes of distinct brain rhythms (AAC). However, ubiquitous source mixing can induce spurious AAC. While remedies have been proposed, these can be demonstrated to fail in counterexamples. This project aims to characterize the ability of different metrics of AAC to distinguish true from spurious across-site interaction. It will also aim to develop novel metrics based on antisymmetrized higher-order spectra.
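The effect of source mixing on amplitude correlations can be illustrated with a very small simulation. As a deliberately crude simplification, rectified white noise stands in for the envelopes of band-limited rhythms (a real analysis would band-pass filter and take Hilbert envelopes), and the mixing weight `m` is a made-up value.

```python
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

random.seed(0)
n = 20000
# Two statistically independent sources: no true interaction.
s1 = [random.gauss(0, 1) for _ in range(n)]
s2 = [random.gauss(0, 1) for _ in range(n)]

# Each sensor records a linear mixture of both sources.
m = 0.5  # hypothetical mixing weight
x1 = [a + m * b for a, b in zip(s1, s2)]
x2 = [m * a + b for a, b in zip(s1, s2)]

amp = lambda sig: [abs(v) for v in sig]  # crude amplitude proxy
r_sources = pearson(amp(s1), amp(s2))
r_sensors = pearson(amp(x1), amp(x2))
print(r_sources, r_sensors)
```

The source amplitudes are essentially uncorrelated, while the sensor amplitudes show a clearly positive correlation that is produced by the mixing alone.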
Characterizing dementia types using normative models of functional brain connectivity (MSc)
Target group: MSc Students in Neuroscience or related fields
Short description: This project will analyze several large magnetoencephalography datasets comprising data of patients diagnosed with different stages and types of dementia. Robust functional connectivity estimation pipelines will be used to compare patients to previously established normative data from healthy subjects in order to identify clinically relevant clusters of patients.
Background: Several devastating aging-related neurological disorders such as Alzheimer's disease and other dementias are currently incurable and their pathophysiology is not well understood. Brain communication patterns in these disorders are likely disturbed, making functional brain connectivity (FC) analysis a promising tool to derive disease- and disease-stage-specific biomarkers. Ideally, such direct markers of brain functioning could even be of prognostic value and inspire novel interventions. In this project, we will apply validated robust pipelines for directed and undirected FC estimation to large patient MEG datasets. Comparisons to previously established normative data will be used to identify spatially and spectrally resolved FC markers that are specific to diseases and disease stages.
Required skills: Matlab, signal processing, basic statistics, interest in the pathophysiology of neurological disorders
Optional skills: Experience with M/EEG data analysis including source reconstruction and functional connectivity estimation
Anticipated duration : 6+ months
Contact: Stefan Haufe
Design of benchmark data to validate explainable artificial intelligence (MSc)
Target group: MSc Students in Computer Science or related fields
Short description: This project will develop synthetic ground-truth data to benchmark and validate explainable artificial intelligence methods using generative deep learning models.
Background: As machine learning and artificial intelligence methods are increasingly used in sensitive applications, a need for such methods to be interpretable to humans has arisen, leading to the formation of the field of "explainable AI" (XAI). However, most XAI methods do not address a well-defined problem and are hence difficult to benchmark. The UNIML group has started to provide problem definitions and performance metrics for assessing "explanation performance". This project will design and validate realistic yet well-defined ground-truth data to benchmark XAI approaches according to the developed definitions and criteria. To this end, we will use state-of-the-art generative models such as generative adversarial and diffusion models. The focus will be on natural and medical images.
Required skills: Python, machine learning, statistics
Optional skills: experience with deep learning frameworks
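Ground-truth data makes it possible to score an explanation numerically. The sketch below shows one conceivable score, the fraction of the top-ranked features that belong to the known set of informative features. It is an illustrative metric and is not claimed to be one of the UNIML group's definitions; the attribution vectors are invented.

```python
def explanation_precision(attributions, true_features, k=None):
    """Fraction of the k highest-|attribution| features that belong to
    the known ground-truth set (k defaults to the size of that set)."""
    k = k or len(true_features)
    ranked = sorted(
        range(len(attributions)),
        key=lambda i: abs(attributions[i]),
        reverse=True,
    )
    top_k = ranked[:k]
    return sum(i in true_features for i in top_k) / k

# By construction of the synthetic data, only features 0 and 3 matter.
true_features = {0, 3}
good = [0.9, 0.1, -0.05, -0.8, 0.0]   # hypothetical XAI output
poor = [0.1, 0.9, 0.7, 0.05, 0.0]     # hypothetical XAI output

print(explanation_precision(good, true_features))
print(explanation_precision(poor, true_features))
```

For images, the same idea would apply to pixels, with the ground truth given as a mask of the regions that the generative model made class-relevant.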
Investigating the effect of whitening on "AI explanation performance" (MSc)
Short description : This project will study the effects of various whitening and orthogonalization transforms of the input data on the "explanation performance" of so-called "explainable AI" methods.
Background : As machine learning and artificial intelligence methods are increasingly used in sensitive applications, a need for such methods to be interpretable to humans has arisen, leading to the formation of the field of "explainable AI" (XAI). However, most XAI methods do not address a well-defined problem and are hence difficult to benchmark. The UNIML group has started to provide problem definitions, benchmarks and performance metrics for assessing "explanation performance". This project will explore the ability of whitening transforms to improve the performance of popular XAI methods.
Required skills: Python, machine learning
Optional skills: experience with deep learning and XAI frameworks
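For readers unfamiliar with whitening, the following sketch applies Cholesky whitening to two-dimensional data so that the result has zero mean and identity covariance. The project description does not say which transforms will be studied; Cholesky whitening is shown only because it is short to write without a linear-algebra library (PCA and ZCA whitening are common alternatives), and the data points are made up.

```python
import math

def whiten_2d(data):
    """Cholesky whitening of 2-D points.

    With sample covariance C = L L^T (L lower triangular), returns
    z = L^{-1} (x - mean), which has identity sample covariance.
    """
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    centered = [(p[0] - mx, p[1] - my) for p in data]
    a = sum(u * u for u, _ in centered) / n   # var(x)
    b = sum(u * v for u, v in centered) / n   # cov(x, y)
    c = sum(v * v for _, v in centered) / n   # var(y)
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 * l21)
    # Forward substitution solves L z = (u, v) for each centered point.
    return [(u / l11, (v - l21 * u / l11) / l22) for u, v in centered]

def second_moments(points):
    """(var_1, cov_12, var_2) of points that already have zero mean."""
    n = len(points)
    return (
        sum(u * u for u, _ in points) / n,
        sum(u * v for u, v in points) / n,
        sum(v * v for _, v in points) / n,
    )

# Strongly correlated toy features (assumed values).
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2), (5.0, 9.8)]
z = whiten_2d(data)
print(second_moments(z))
```

After the transform the two features are uncorrelated with unit variance, so any attribution method applied downstream no longer sees the original correlation structure.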
Comparison of FEM and BEM models for EEG forward and inverse modeling (BSc)
Short description : This project will integrate an existing finite element (FEM) modeling pipeline (ROAST) into the open source package Brainstorm for electroencephalographic (EEG) data analysis. This will make it possible to create accurate volume conductor models for brain source localization. The project will also quantitatively compare the obtained accuracy with that of standard boundary element method (BEM) modeling implemented in Brainstorm.
Background : Electrical volume conductor modeling of the head is an important step when it comes to modeling the effect of transcranial electric brain stimulation (TES) as well as localizing brain sources from electroencephalographic (EEG) measurements. While TES modeling typically relies on detailed finite element (FEM) solvers, software packages for EEG inverse modeling typically offer only less accurate boundary-element (BEM) solvers. This project will make an existing FEM code (ROAST) accessible for EEG inverse modeling by integrating it into the open source package Brainstorm. This will allow for a direct quantitative comparison of FEM and BEM models in terms of EEG source localization accuracy.
Required skills : Programming experience
Optional skills: MATLAB, basic linear algebra
Dissertations / Theses on the topic 'Machine Learning (ML)'
Consult the top 50 dissertations / theses for your research on the topic 'Machine Learning (ML).'
Holmberg, Lars. "Human In Command Machine Learning." Licentiate thesis, Malmö universitet, Malmö högskola, Institutionen för datavetenskap och medieteknik (DVMT), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-42576.
Nangalia, V. "ML-EWS - Machine Learning Early Warning System : the application of machine learning to predict in-hospital patient deterioration." Thesis, University College London (University of London), 2017. http://discovery.ucl.ac.uk/1565193/.
John, Meenu Mary. "Design Methods and Processes for ML/DL models." Licentiate thesis, Malmö universitet, Institutionen för datavetenskap och medieteknik (DVMT), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-45026.
Tabell, Johnsson Marco, and Ala Jafar. "Efficiency Comparison Between Curriculum Reinforcement Learning & Reinforcement Learning Using ML-Agents." Thesis, Blekinge Tekniska Högskola, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20218.
Mattsson, Fredrik, and Anton Gustafsson. "Optimize Ranking System With Machine Learning." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-37431.
Kakadost, Naser, and Charif Ramadan. "Empirisk undersökning av ML strategier vid prediktion av cykelflöden baserad på cykeldata och veckodagar." Thesis, Malmö universitet, Fakulteten för teknik och samhälle (TS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-20168.
Sammaritani, Gloria. "Google BigQuery ML. Analisi comparativa di un nuovo framework per il Machine Learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020.
Gustafsson, Sebastian. "Interpretable serious event forecasting using machine learning and SHAP." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-444363.
Schoenfeld, Brandon J. "Metalearning by Exploiting Granular Machine Learning Pipeline Metadata." BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8730.
Hellborg, Per. "Optimering av datamängder med Machine learning : En studie om Machine learning och Internet of Things." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-13747.
Nämerforslund, Tim. "Machine Learning Adversaries in Video Games : Using reinforcement learning in the Unity Engine to create compelling enemy characters." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-42746.
Mahfouz, Tarek Said. "Construction legal support for differing site conditions (DSC) through statistical modeling and machine learning (ML)." [Ames, Iowa : Iowa State University], 2009.
REPETTO, MARCO. "Black-box supervised learning and empirical assessment: new perspectives in credit risk modeling." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2023. https://hdl.handle.net/10281/402366.
Björkberg, David. "Comparison of cumulative reward with one, two and three layered artificial neural network in a simple environment when using ml-agents." Thesis, Blekinge Tekniska Högskola, Institutionen för datavetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-21188.
Ucci, Graziano. "The Interstellar Medium of Galaxies: a Machine Learning Approach." Doctoral thesis, Scuola Normale Superiore, 2019. http://hdl.handle.net/11384/85928.
Gilmore, Eugene M. "Learning Interpretable Decision Tree Classifiers with Human in the Loop Learning and Parallel Coordinates." Thesis, Griffith University, 2022. http://hdl.handle.net/10072/418633.
Sridhar, Sabarish. "SELECTION OF FEATURES FOR ML BASED COMMANDING OF AUTONOMOUS VEHICLES." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-287450.
Lagerkvist, Love. "Neural Novelty — How Machine Learning Does Interactive Generative Literature." Thesis, Malmö universitet, Fakulteten för kultur och samhälle (KS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-21222.
Bhogi, Keerthana. "Two New Applications of Tensors to Machine Learning for Wireless Communications." Thesis, Virginia Tech, 2021. http://hdl.handle.net/10919/104970.
Garg, Anushka. "Comparing Machine Learning Algorithms and Feature Selection Techniques to Predict Undesired Behavior in Business Processes and Study of Auto ML Frameworks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-285559.
Hanski, Jari, and Kaan Baris Biçak. "An Evaluation of the Unity Machine Learning Agents Toolkit in Dense and Sparse Reward Video Game Environments." Thesis, Uppsala universitet, Institutionen för speldesign, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-444982.
Stellmar, Justin. "Predicting the Deformation of 3D Printed ABS Plastic Using Machine Learning Regressions." Youngstown State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1587462911261523.
Krüger, Franz David, and Mohamad Nabeel. "Hyperparameter Tuning Using Genetic Algorithms : A study of genetic algorithms impact and performance for optimization of ML algorithms." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-42404.
Björkman, Desireé. "Machine Learning Evaluation of Natural Language to Computational Thinking : On the possibilities of coding without syntax." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-424269.
Lundin, Lowe. "Artificial Intelligence for Data Center Power Consumption Optimisation." Thesis, Uppsala universitet, Avdelningen för systemteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-447627.
Wessman, Filip. "Advanced Algorithms for Classification and Anomaly Detection on Log File Data : Comparative study of different Machine Learning Approaches." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-43175.
Narmack, Kirill. "Dynamic Speed Adaptation for Curves using Machine Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-233545.
Mathias, Berggren, and Sonesson Daniel. "Design Optimization in Gas Turbines using Machine Learning : A study performed for Siemens Energy AB." Thesis, Linköpings universitet, Programvara och system, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-173920.
Giuliani, Luca. "Extending the Moving Targets Method for Injecting Constraints in Machine Learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/23885/.
Tarullo, Viviana. "Artificial Neural Networks for classification of EMG data in hand myoelectric control." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/19195/.
Wallner, Vanja. "Mapping medical expressions to MedDRA using Natural Language Processing." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-426916.
Hellberg, Johan, and Kasper Johansson. "Building Models for Prediction and Forecasting of Service Quality." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-295617.
Nardello, Matteo. "Low-Power Smart Devices for the IoT Revolution." Doctoral thesis, Università degli studi di Trento, 2020. http://hdl.handle.net/11572/274371.
Michelini, Mattia. "Barcode detection by neural networks on Android mobile platforms." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/21080/.
Lundström, Robin. "Machine Learning for Air Flow Characterization : An application of Theory-Guided Data Science for Air Flow characterization in an Industrial Foundry." Thesis, Karlstads universitet, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-72782.
Daneshvar, Saman. "User Modeling in Social Media: Gender and Age Detection." Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39535.
Rosell, Felicia. "Tracking a ball during bounce and roll using recurrent neural networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239733.
Hallberg, Jesper. "Searching for the charged Higgs boson in the tau nu analysis using Boosted Decision Trees." Thesis, Uppsala universitet, Högenergifysik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-301351.
Klingvall, Emelie. "Artificiell intelligens som ett beslutsstöd inom mammografi : En kvalitativ studie om radiologers perspektiv på icke-tekniska utmaningar." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-18768.
Alatram, Ala'a A. M. "A forensic framework for detecting denial-of-service attacks in IoT networks using the MQTT protocol." Thesis, Edith Cowan University, Research Online, Perth, Western Australia, 2022. https://ro.ecu.edu.au/theses/2561.
Forssell, Melker, and Gustav Janér. "Product Matching Using Image Similarity." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-413481.
Bengtsson, Theodor, and Jonas Hägerlöf. "Stora mängder användardata för produktutveckling : Möjligheter och utmaningar vid integrering av stora mängder användardata i produktutvecklingsprocesser." Thesis, KTH, Integrerad produktutveckling, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-297966.
Nordqvist, My. "Classify part of day and snow on the load of timber stacks : A comparative study between partitional clustering and competitive learning." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-42238.
Mele, Matteo. "Convolutional Neural Networks for the Classification of Olive Oil Geographical Origin." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2020.
Hjerpe, Adam. "Computing Random Forests Variable Importance Measures (VIM) on Mixed Numerical and Categorical Data." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-185496.
Ahlm, Kristoffer. "IDENTIFIKATION AV RISKINDIKATORER I FINANSIELL INFORMATION MED HJÄLP AV AI/ML : Ökade möjligheter för myndigheter att förebygga ekonomisk brottslighet." Thesis, Umeå universitet, Institutionen för matematik och matematisk statistik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-184818.
Zanghieri, Marcello. "sEMG-based hand gesture recognition with deep learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/18112/.
Benatti, Mattia. "Progettazione e Sviluppo di una Piattaforma Multi-Sorgente per l’Ottimizzazione dei Servizi di Emergenza." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.
Sibelius, Parmbäck Sebastian. "HMMs and LSTMs for On-line Gesture Recognition on the Stylaero Board : Evaluating and Comparing Two Methods." Thesis, Linköpings universitet, Artificiell intelligens och integrerade datorsystem, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162237.
Marjolein Bolten successfully defended her Master's thesis, "Monitoring training load and identifying fatigue in young elite speed skaters using machine learning methods".
Bolten, Marjolein (2024) Monitoring training load and identifying fatigue in young elite speed skaters using machine learning methods. University of Twente, Enschede.
Computer Science > Cryptography and Security
Title: Learn What You Want to Unlearn: Unlearning Inversion Attacks against Machine Unlearning
Abstract: Machine unlearning has become a promising solution for fulfilling the "right to be forgotten", under which individuals can request the deletion of their data from machine learning models. However, existing studies of machine unlearning mainly focus on the efficacy and efficiency of unlearning methods, while neglecting the investigation of the privacy vulnerability during the unlearning process. With two versions of a model available to an adversary, that is, the original model and the unlearned model, machine unlearning opens up a new attack surface. In this paper, we conduct the first investigation to understand the extent to which machine unlearning can leak the confidential content of the unlearned data. Specifically, under the Machine Learning as a Service setting, we propose unlearning inversion attacks that can reveal the feature and label information of an unlearned sample by only accessing the original and unlearned model. The effectiveness of the proposed unlearning inversion attacks is evaluated through extensive experiments on benchmark datasets across various model architectures and on both exact and approximate representative unlearning approaches. The experimental results indicate that the proposed attack can reveal the sensitive information of the unlearned data. As such, we identify three possible defenses that help to mitigate the proposed attacks, while at the cost of reducing the utility of the unlearned model. The study in this paper uncovers an underexplored gap between machine unlearning and the privacy of unlearned data, highlighting the need for the careful design of mechanisms for implementing unlearning without leaking the information of the unlearned data.
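Why two model versions open an attack surface can be seen in a deliberately simple toy case. Here the "model" is just the per-feature mean of the training rows, unlearning is done by retraining without the removed row, and the adversary is assumed to know the training set size. This is not the attack proposed in the paper, only an illustration of how the difference between an original and an unlearned model can encode the deleted sample.

```python
def fit_mean(rows):
    """Toy 'model': the per-feature mean of the training rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def invert(original, unlearned, n):
    """Recover the removed row from the two model versions.

    The feature sums are n * mean before and (n - 1) * mean after
    unlearning, so their difference is exactly the deleted row.
    """
    return [n * o - (n - 1) * u for o, u in zip(original, unlearned)]

train = [[1.0, 4.0], [2.0, 6.0], [6.0, 2.0]]
secret = train[2]                     # the row whose deletion is requested

original = fit_mean(train)
unlearned = fit_mean(train[:2])       # exact unlearning: retrain without it
recovered = invert(original, unlearned, len(train))
print(recovered)
```

Real models leak far less directly than a mean, which is why the paper's attacks need more elaborate inversion techniques, but the source of the leakage is the same comparison of two model versions.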
In this tech-driven world, research and thesis topics in machine learning are the first choice of master's and doctoral scholars. Selecting and working on a thesis topic in machine learning is not an easy task as machine learning uses statistical algorithms to make computers work in a certain way without being explicitly ...
This article provides a list of 20 potential thesis ideas for an undergraduate program in machine learning and deep learning in 2023. Each thesis idea includes an introduction, which presents a brief overview of the topic and the research objectives. The ideas provided are related to different areas of machine learning and deep learning, such ...
Based on this background, the aim of this thesis is to select and implement a machine learning process that produces an algorithm, which is able to detect whether documents have been translated by humans or computerized systems. This algorithm builds the basic structure for an approach to evaluate these documents. 1.2 Related Work
Data-driven Decisions - An Anomaly Detection Perspective Shubhranshu Shekhar, 2023. METHODS AND APPLICATIONS OF EXPLAINABLE MACHINE LEARNING Joon Sik Kim, 2023. Applied Mathematics of the Future Kin G. Olivares, 2023. NEURAL REASONING FOR QUESTION ANSWERING Haitian Sun, 2023.
This dissertation revisits and makes progress on some old but challenging problems concerning least squares estimation, the work-horse of supervised machine learning. Two major problems are addressed: (i) least squares estimation with heavy-tailed errors, and (ii) least squares estimation in non-Donsker classes.
of the basics of machine learning, it might be better understood as a collection of tools that can be applied to a specific subset of problems. 1.2 What Will This Book Teach Me? The purpose of this book is to provide you the reader with the following: a framework with which to approach problems that machine learning might help solve ...
This dissertation explores three topics related to random forests: tree aggregation, variable importance, and robustness. 10. Climate Data Computing: Optimal Interpolation, Averaging, Visualization and Delivery. This dissertation solves two important problems in the modern analysis of big climate data.
Machine Learning, a natural outgrowth at the intersection of Computer Science and Statistics, has evolved into a broad, highly successful, and extremely dynamic discipline. ... In this thesis, we develop theoretical foundations and new algorithms for several important emerging learning paradigms of significant practical importance, including ...
They will stress the importance of structure, substance and style. They will urge you to write down your methodology and results first, then progress to the literature review, introduction and conclusions and to write the summary or abstract last. To write clearly and directly with the reader's expectations always in mind.
Dynamic pricing of e-shop products through machine learning algorithms. machine-learning neural-network particle-swarm-optimization dynamic-pricing thesis-project Updated Dec 27, 2020; Python ... This is my master's thesis project: "QoS implementation in Software Defined Network using Ryu Controller"
The bachelor thesis is commonly a necessary last step towards the first graduation in higher education and constitutes a central key to both further studies in higher education and employment that ...
3) Machine learning algorithms allowed us to analyze clinical data, draw relationships between diagnostic variables, design the predictive model, and test it against new cases. The predictive model achieved an accuracy of 89.4 percent using the RandomForest classifier's default settings to predict heart disease.
The bias-variance tradeoff is the classical explanation for the generalisation behaviour of machine learning models across supervised learning tasks. This project explores how well the bias-variance framework explains the observed behaviour of machine learning algorithms. We present the bias-variance decomposition for two commonly used ...
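For reference, the standard form of the decomposition under squared-error loss is given below. This is the textbook statement, not necessarily the form used in the project, whose two chosen settings are cut off in the snippet. It assumes data generated as $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\!\left[\big(y - \hat{f}(x)\big)^2\right]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

The tradeoff arises because making $\hat{f}$ more flexible typically lowers the bias term while raising the variance term; the noise term is unaffected by the choice of model.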
This thesis studies advanced probabilistic models, including both their theoretical foundations and practical applications, for different semi-supervised learning (SSL) tasks. The proposed probabilistic methods are able to improve the safety of AI systems in real applications by providing reliable uncertainty estimates quickly, and at the same time, achieve competitive performance compared to ...
The concept of machine learning is something born out of this environment. Computers can analyze digital data to find patterns and laws in ways that are too complex for a human to do. The basic idea of machine learning is that a computer can automatically learn from experience (Mitchell, 1997). Although machine learning applications vary, its
sensitive datasets to develop models that are only locally optimal. Federated learning (FL) facilitates robust machine learning by enabling the development of global models without sharing sensitive data. However, there are two broad challenges associated with deploying FL systems: privacy challenges and training/performance-related challenges.
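The idea of building a global model without sharing raw data can be sketched with federated averaging (FedAvg), a common FL baseline. The snippet does not say which algorithm the thesis uses, so this is an illustrative assumption, as are the one-parameter linear model, the learning rate, and the toy client datasets. Each client runs a few gradient steps on its private data and sends back only its updated weight; the server averages the weights in proportion to client dataset size.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one client


def local_update(w: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient-descent steps for y ~ w * x on one client's data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_averaging(clients: List[Dataset], rounds: int = 50) -> float:
    """Server loop: broadcast the global weight, then average the local results."""
    w_global = 0.0
    total = sum(len(d) for d in clients)
    for _ in range(rounds):
        # Only weights leave the clients; the raw (x, y) pairs never do.
        local_weights = [local_update(w_global, d) for d in clients]
        w_global = sum(len(d) * w for d, w in zip(clients, local_weights)) / total
    return w_global


# Hypothetical clients whose data all follow y = 3x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)],
]
w = federated_averaging(clients)
print(round(w, 4))
```

Sending weights instead of data does not by itself address the privacy challenges the snippet mentions, since model updates can still leak information about the data that produced them.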
Abstract. Drawing on lectures, course materials, existing textbooks, and other resources, we synthesize and consolidate the content necessary to offer a successful first exposure to machine learning for students with an undergraduate-level background in linear algebra and statistics. The final product is a textbook for Harvard's introductory ...
Open Topics We offer multiple Bachelor/Master theses, Guided Research projects and IDPs in the area of data mining/machine learning. A non-exhaustive list of open topics is listed below. If you are interested in a thesis or a guided research project, please send your CV and transcript of records to Prof. Stephan Günnemann via email and we will arrange a meeting to talk about the potential ...
The sample for this study consisted of bachelor students' thesis projects (n=2436) that were started between 2010 and 2017. Data were extracted from two different data systems used to record data about thesis projects. From these systems, thesis project data were collected including variables related to both students and supervisors.
This thesis uses machine learning techniques and statistical analysis in two separate educational experiments. In the first experiment we attempt to find relationships between students' written essay responses to physics questions and their learning of the physics data.
Find the latest published documents for machine learning, Related hot topics, top authors, the most cited documents, and related journals ... The project AI for Selection evaluated a range of commercial solutions varying from off-the-shelf products to cloud-hosted machine learning platforms, as well as a benchmarking tool developed in-house. ...
Target group: MSc Students in Computer Science or related fields Short description: This project will develop and implement novel techniques to "explain" specific classes of non-linear prediction models. Background: As machine learning and artificial intelligence methods are increasingly used in sensitive applications, a need for such methods to be interpretable to humans has arisen, leading ...
This thesis proposes a mode of inquiry that considers the interactive qualities of what machine learning does, as opposed to the technical specifications of what machine learning is. A shift in focus from the technicality of ML to the artifacts it creates allows the interaction designer to situate its existing skill set, affording it to engage ...
Molecular Property Prediction (MPP) is vital for drug discovery, crop protection, and environmental science. Over the last decades, diverse computational techniques have been developed, from using simple physical and chemical properties and molecular fingerprints in statistical models and classical machine learning to advanced deep learning approaches. In this review, we aim to distill ...
Python offers an unmatched ecosystem and flexibility, perfect for those who live and breathe data science. Meanwhile, ML.NET provides a robust, seamless path for .NET developers to integrate ML into their applications, unlocking new potential. The decision between Python and ML.NET depends on specific project requirements, team expertise, and ...
With extensive pre-trained knowledge and high-level general capabilities, large language models (LLMs) emerge as a promising avenue to augment reinforcement learning (RL) in aspects such as multi-task learning, sample efficiency, and task planning. In this survey, we provide a comprehensive review of the existing literature in LLM-enhanced RL and summarize its characteristics ...
Marjolein Bolten successfully defended her Master's thesis, "Monitoring training load and identifying fatigue in young elite speed skaters using machine learning methods".
This emerging field requires project professionals to master the art of crafting effective prompts to communicate with AI tools, enhancing their ability to manage projects successfully. This PMI report delves into the essentials of prompt engineering for project managers, offering insights into how to leverage GenAI for automating, assisting ...