recent research papers on data structures

Algorithms and Data Structures

18th International Symposium, WADS 2023, Montreal, QC, Canada, July 31 – August 2, 2023, Proceedings

Conference proceedings
© 2023
Editors:

Pat Morin (ORCID: https://orcid.org/0000-0003-0471-4118), Carleton University, Ottawa, Canada

Subhash Suri (ORCID: https://orcid.org/0000-0002-5668-7521), University of California, Santa Barbara, USA

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14079)

Included in the following conference series:

WADS: Algorithms and Data Structures Symposium

Conference proceedings info: WADS 2023.



Table of contents (47 papers)

Front Matter

Geometric Spanning Trees Minimizing the Wiener Index

A. Karim Abu-Affash, Paz Carmi, Ori Luwisch, Joseph S. B. Mitchell

The Mutual Visibility Problem for Fat Robots

Rusul J. Alsaedi, Joachim Gudmundsson, André van Renssen

Faster Algorithms for Cycle Hitting Problems on Disk Graphs

Shinwoo An, Kyungjin Cho, Eunjin Oh

Tight Analysis of the Lazy Algorithm for Open Online Dial-a-Ride

Júlia Baligács, Yann Disser, Farehe Soheil, David Weckbecker

Online TSP with Known Locations

Evripidis Bampis, Bruno Escoffier, Niklas Hahn, Michalis Xefteris

Socially Fair Matching: Exact and Approximation Algorithms

Sayan Bandyapadhyay, Fedor Fomin, Tanmay Inamdar, Fahad Panolan, Kirill Simonov

A Parameterized Approximation Scheme for Generalized Partial Vertex Cover

Sayan Bandyapadhyay, Zachary Friggstad, Ramin Mousavi

Dominator Coloring and CD Coloring in Almost Cluster Graphs

Aritra Banik, Prahlad Narasimhan Kasthurirangan, Venkatesh Raman

Tight Approximation Algorithms for Ordered Covering

Jatin Batra, Syamantak Das, Agastya Vibhuti Jha

Online Minimum Spanning Trees with Weight Predictions

Magnus Berg, Joan Boyar, Lene M. Favrholdt, Kim S. Larsen

Compact Distance Oracles with Large Sensitivity and Low Stretch

Davide Bilò, Keerti Choudhary, Sarel Cohen, Tobias Friedrich, Simon Krogmann, Martin Schirneck

Finding Diameter-Reducing Shortcuts in Trees

Davide Bilò, Luciano Gualà, Stefano Leucci, Luca Pepè Sciarria

Approximating the Smallest k-Enclosing Geodesic Disc in a Simple Polygon

Prosenjit Bose, Anthony D’Angelo, Stephane Durocher

Online Interval Scheduling with Predictions

Joan Boyar, Lene M. Favrholdt, Shahin Kamali, Kim S. Larsen

On Length-Sensitive Fréchet Similarity

Kevin Buchin, Brittany Terese Fasy, Erfan Hosseini Sereshgi, Carola Wenk

Hardness of Graph-Structured Algebraic and Symbolic Problems

Jingbang Chen, Yu Gao, Yufan Huang, Richard Peng, Runze Wang

Sublinear-Space Streaming Algorithms for Estimating Graph Parameters on Sparse Graphs

Xiuge Chen, Rajesh Chitnis, Patrick Eades, Anthony Wirth

Efficient k-Center Algorithms for Planar Points in Convex Position

Jongmin Choi, Jaegun Lee, Hee-Kap Ahn

Classification via Two-Way Comparisons (Extended Abstract)

Marek Chrobak, Neal E. Young

Keywords

data structures
adaptive algorithms
artificial intelligence
computer networks
directed graphs
graph theory
graphic methods

About this book

This book constitutes the refereed proceedings of the 18th International Symposium on Algorithms and Data Structures, WADS 2023, held during July 31-August 2, 2023. The 47 regular papers, presented in this book, were carefully reviewed and selected from a total of 92 submissions. They present original research on the theory, design and application of algorithms and data structures.

Editors and Affiliations

Carleton University, Ottawa, Canada
Pat Morin

University of California, Santa Barbara, USA
Subhash Suri

Bibliographic Information

Book Title : Algorithms and Data Structures

Book Subtitle : 18th International Symposium, WADS 2023, Montreal, QC, Canada, July 31 – August 2, 2023, Proceedings

Editors : Pat Morin, Subhash Suri

Series Title : Lecture Notes in Computer Science

DOI : https://doi.org/10.1007/978-3-031-38906-1

Publisher : Springer Cham

eBook Packages : Computer Science, Computer Science (R0)

Copyright Information : The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023

Softcover ISBN : 978-3-031-38905-4 Published: 28 July 2023

eBook ISBN : 978-3-031-38906-1 Published: 27 July 2023

Series ISSN : 0302-9743

Series E-ISSN : 1611-3349

Edition Number : 1

Number of Pages : XII, 721

Number of Illustrations : 42 b/w illustrations, 121 illustrations in colour

Topics : Data Structures and Information Theory, Algorithm Analysis and Problem Complexity, Computer Systems Organization and Communication Networks, Symbolic and Algebraic Manipulation, Discrete Mathematics in Computer Science, Computer Graphics


Must read research papers on Data Structures


Data structures are often seen as less important than algorithms, but in reality they are equally important for solving computational problems efficiently. The must-read research papers on data structures are:

Ordered Hash Table (1973)
Randomized Search Trees (1989)
EERTREE: An Efficient Data Structure for Processing Palindromes in Strings (2015)
Making data structures persistent (1986)
Design and implementation of an efficient priority queue (1976)
Fractional cascading: A data structuring technique (1986)

Note that the above list has been prepared by OpenGenus and is very accurate. We will now go through each paper in detail:

Ordered Hash Table

Basic details on the paper:

Author: O. Amble and D. E. Knuth
Affiliation: University of Oslo and Stanford University
Date published: 1973
Journal/ Conference: The Computer Journal
Read this paper here: Oxford Academic
Read about Hash Tables

The hash table is a fundamental advance, as it shows how a simple data structure like an array can bring common operations such as searching down to expected constant time. Today, the hash map has a central place in algorithms, but many ideas go into making it efficient, such as:

collision avoidance
hash generation

This is a must read for anyone interested in Computer Science and especially in Algorithms and Data Structures.
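As a rough illustration of the ordered hashing idea, the following Python sketch keeps the keys along each probe sequence in decreasing order, so an unsuccessful search can stop as soon as it meets a smaller key. The class name `OrderedHashTable`, the use of linear probing and the fixed capacity are assumptions made for brevity; the paper treats the technique in more generality.

```python
class OrderedHashTable:
    """Open-addressing table whose probe sequences hold keys in decreasing order."""

    def __init__(self, capacity=11):
        self.slots = [None] * capacity
        self.size = 0

    def _home(self, key):
        return hash(key) % len(self.slots)

    def contains(self, key):
        i = self._home(key)
        # Skip only over larger keys; a smaller key or an empty slot ends the search.
        while self.slots[i] is not None and self.slots[i] > key:
            i = (i + 1) % len(self.slots)
        return self.slots[i] == key

    def insert(self, key):
        if self.contains(key):
            return
        # Keep at least one empty slot so that every search terminates.
        if self.size >= len(self.slots) - 1:
            raise OverflowError("table is full")
        i = self._home(key)
        while self.slots[i] is not None:
            if self.slots[i] < key:
                # The larger key takes the slot; the displaced key keeps probing.
                self.slots[i], key = key, self.slots[i]
            i = (i + 1) % len(self.slots)
        self.slots[i] = key
        self.size += 1
```

With integer keys 5, 16 and 27, which all hash to slot 5 in a table of capacity 11, the table stores them as 27, 16, 5 in consecutive slots, and a lookup of 38 stops at the first probe.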

Randomized Search Trees

Author: Raimund Seidel and Cecilia Aragon
Affiliation: UC Berkeley
Date published: November 1989
Journal/ Conference: 30th Annual Symposium on Foundations of Computer Science
Read this paper here: Research Gate (PDF)
Read about Randomized Search Trees

Balanced search trees solve a number of problems, but in practice trees keep changing, and keeping them balanced is difficult. This paper shows how randomness can be used to keep search trees balanced in expectation, and it opens up a whole new category of algorithms.

You will need to understand the Binary Search Tree to get the complete idea of this paper.
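The structure introduced in this paper is commonly known as the treap: a binary search tree on the keys that is also a heap on random priorities. The following Python sketch shows insertion with rotations; the names `TreapNode`, `insert`, `inorder` and `is_heap` are illustrative choices, not taken from the paper.

```python
import random


class TreapNode:
    def __init__(self, key, rng):
        self.key = key
        self.priority = rng.random()  # random priority drives the balancing
        self.left = None
        self.right = None


def rotate_right(node):
    left = node.left
    node.left, left.right = left.right, node
    return left


def rotate_left(node):
    right = node.right
    node.right, right.left = right.left, node
    return right


def insert(node, key, rng):
    """Insert key as in a plain BST, then rotate to restore the heap order."""
    if node is None:
        return TreapNode(key, rng)
    if key < node.key:
        node.left = insert(node.left, key, rng)
        if node.left.priority > node.priority:
            node = rotate_right(node)
    elif key > node.key:
        node.right = insert(node.right, key, rng)
        if node.right.priority > node.priority:
            node = rotate_left(node)
    return node


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


def is_heap(node):
    """Check that every parent has a priority at least as large as its children."""
    if node is None:
        return True
    for child in (node.left, node.right):
        if child is not None and child.priority > node.priority:
            return False
    return is_heap(node.left) and is_heap(node.right)
```

Even when keys arrive in sorted order, which would degrade a plain binary search tree into a list, the random priorities keep the expected depth logarithmic.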

EERTREE: An Efficient Data Structure for Processing Palindromes in Strings

Author: Mikhail Rubinchik and Arseny M. Shur
Affiliation: Ural Federal University, Ekaterinburg, Russia
Date published: June 2015
Journal/ Conference: European Journal of Combinatorics
Read this paper here: ArXiv
Read about EerTree

This is a must read because it shows that some problems, such as string problems, may seem to be purely algorithmic, yet a clever use of a data structure can improve performance significantly.

This paper bridges the gap between algorithms and data structures with respect to strings.
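To make the idea concrete, here is a compact Python sketch of a palindromic tree that is built one character at a time and used to count the distinct palindromic substrings of a string. The class layout and the helper `distinct_palindromes` are assumptions for illustration and are not the authors' code.

```python
class Eertree:
    def __init__(self):
        # Node 0 is the imaginary root (length -1); node 1 is the empty palindrome.
        self.length = [-1, 0]
        self.link = [0, 0]        # suffix links
        self.edges = [{}, {}]     # edges[v][c] is the node for the palindrome c + v + c
        self.text = []
        self.last = 1             # node of the longest palindromic suffix so far

    def _find(self, node, i):
        # Follow suffix links until the palindrome can be extended by text[i].
        while True:
            j = i - self.length[node] - 1
            if j >= 0 and self.text[j] == self.text[i]:
                return node
            node = self.link[node]

    def add(self, ch):
        """Append ch; return True if a new distinct palindrome appeared."""
        self.text.append(ch)
        i = len(self.text) - 1
        cur = self._find(self.last, i)
        if ch in self.edges[cur]:
            self.last = self.edges[cur][ch]
            return False
        new = len(self.length)
        self.length.append(self.length[cur] + 2)
        self.edges.append({})
        if self.length[new] == 1:
            self.link.append(1)   # single characters link to the empty palindrome
        else:
            self.link.append(self.edges[self._find(self.link[cur], i)][ch])
        self.edges[cur][ch] = new
        self.last = new
        return True


def distinct_palindromes(s):
    tree = Eertree()
    for ch in s:
        tree.add(ch)
    return len(tree.length) - 2   # exclude the two root nodes
```

For the string "eertree" this counts seven distinct palindromes: e, ee, r, t, rtr, ertre and eertree.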

Making data structures persistent

Author: Driscoll, J.R., Sarnak, N., Sleator, D.D., Tarjan, R.E
Affiliation: Carnegie-Mellon University, IBM, AT&T Bell Laboratories, Princeton University
Date published: 5th August 1986
Journal/ Conference: Journal of Computer and System Sciences
Read this paper here: ACM
Read about Persistent Segment Tree

Usually, a data structure does not need to remember its previous versions. As we moved on to solving complex problems that involve time, the need arose for persistent data structures, that is, data structures that retain their previous versions. This paper is a must read as it shows what it means for a data structure to be persistent and how persistence can be achieved.

This opens up the path to a whole new domain of persistent data structures.
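A small Python sketch can show what persistence means in practice. It uses path copying, one of the simpler ways to obtain persistence and not the more space-efficient node-copying method developed in the paper; the names `PNode`, `insert` and `keys` are illustrative.

```python
from typing import NamedTuple, Optional


class PNode(NamedTuple):
    """Immutable binary search tree node."""
    key: int
    left: Optional["PNode"] = None
    right: Optional["PNode"] = None


def insert(root, key):
    """Return the root of a new version; the old version stays valid."""
    if root is None:
        return PNode(key)
    if key < root.key:
        # Copy only the nodes on the search path; everything else is shared.
        return root._replace(left=insert(root.left, key))
    if key > root.key:
        return root._replace(right=insert(root.right, key))
    return root


def keys(root):
    return [] if root is None else keys(root.left) + [root.key] + keys(root.right)
```

Each call to `insert` yields a new root, so holding on to an old root is enough to query the structure as it was at that point in time.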

Design and implementation of an efficient priority queue

Author: P. van Emde Boas, R. Kaas, E. Zijlstra
Affiliation: Mathematical Centre, Amsterdam, Netherlands
Date published: December 1976
Journal/ Conference: Mathematical systems theory
Read this paper here: Springer
Read about Priority Queue

This is an old paper but is a good read as it shows how a data structure should be implemented depending on the system architecture and programming language used to get the most performance out of it.

This will open up your mind to look at algorithms and data structures differently.

Fractional cascading: A data structuring technique

Author: Bernard Chazelle, Leonidas J. Guibas
Affiliation: Brown University, Ecole Normale Supérieure, DEC/SRC and Stanford University, USA
Date published: November 1986
Journal/ Conference: Algorithmica

This is an important paper as it shows how a data structure can pay a higher cost for the first search and then deliver superior performance on the searches that follow: when the same value is searched in a sequence of sorted lists, only the first list needs a full binary search. This is a must read to understand the true potential of data structures.
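The following Python sketch illustrates the technique for the simplest case, a chain of sorted lists. It is a simplified version under stated assumptions: preprocessing uses `sorted` and `bisect_left` instead of linear-time merging, and the function names `build_cascade` and `cascade_query` are hypothetical. The paper itself handles more general arrangements of lists.

```python
from bisect import bisect_left


def build_cascade(lists):
    """Augment each sorted list with every second element of the next augmented list."""
    aug = [None] * len(lists)
    next_vals = []
    for i in range(len(lists) - 1, -1, -1):
        vals = sorted(lists[i] + next_vals[::2])
        # own[p]: position of vals[p] in the original list i.
        own = [bisect_left(lists[i], v) for v in vals] + [len(lists[i])]
        # down[p]: position of vals[p] in the next augmented list (the bridge).
        down = [bisect_left(next_vals, v) for v in vals] + [len(next_vals)]
        aug[i] = (vals, own, down)
        next_vals = vals
    return aug


def cascade_query(aug, x):
    """Return bisect_left(L, x) for every list L, with one binary search in total."""
    result = []
    p = bisect_left(aug[0][0], x)       # the only full binary search
    for i, (vals, own, down) in enumerate(aug):
        if i > 0:
            # The bridge lands at most a constant number of steps past the answer.
            while p > 0 and vals[p - 1] >= x:
                p -= 1
        result.append(own[p])
        p = down[p]
    return result
```

The first list is searched with a normal binary search; each later list is then resolved by following a stored pointer and stepping back a constant number of positions.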

Learn more about Data Structures

With this, you will have a good understanding of the wide variety of data structures and what can be done using them.


Data Science Journal


Research Papers

Black Hole Clustering: Gravity-Based Approach with No Predetermined Parameters

Belal K. ELFarra
Mamoun A. A. Salaha
Wesam M. Ashour

Clustering is a fundamental technique in data mining and machine learning, aiming to group data elements into related clusters. However, traditional clustering algorithms, such as K-means, suffer from limitations such as the need for user-defined parameters and sensitivity to initial conditions.

This paper introduces a novel clustering algorithm called Black Hole Clustering (BHC), which leverages the concept of gravity to identify clusters. Inspired by the behavior of masses in the physical world, gravity-based clustering treats data points as mass points that attract each other based on distance. This approach enables the detection of high-density clusters of arbitrary shapes and sizes without the need for predefined parameters. We extensively evaluate BHC on synthetic and real-world datasets, demonstrating its effectiveness in handling complex data structures and varying point densities. Notably, BHC excels in accurate prediction of the number of clusters and achieves competitive clustering accuracy rates. Moreover, its parameter-free nature enhances clustering accuracy, robustness, and scalability. These findings represent a significant contribution to advanced clustering techniques and pave the way for further research and application of gravity-based clustering in diverse fields. BHC offers a promising approach to addressing clustering challenges in complex datasets, opening up new possibilities for improved data analysis and pattern discovery.

gravity-based clustering
density-based clustering
machine learning
data mining

I. Introduction

Clustering is a data analysis technique used to group similar data points together based on certain features or characteristics. It is used for pattern recognition, data compression, anomaly detection, recommendation systems, image segmentation, customer segmentation, and genomics analysis. However, it faces challenges, such as selecting the right number of clusters, handling irregular cluster shapes, scalability issues, sensitivity to outliers, and the need for appropriate evaluation methods. Researchers continually work on improving clustering algorithms to address these challenges and make clustering more effective for various applications.

Commonly employed clustering algorithms such as K-means and expectation maximization (EM) also depend on user-defined parameters and are sensitive to initial conditions. In high-dimensional data, determining the optimal number of clusters (k) for K-means can be particularly challenging (Cai et al. 2023; Ghazal et al. 2021).

To overcome these limitations, density-based clustering techniques (Ester et al. 1996) have emerged, defining clusters as regions of high density separated by regions of low density. Among these techniques, gravity-based clustering stands out as a variant that exploits the concept of gravity to detect clusters. By treating data points as mass points that attract each other based on distance, gravity-based clustering forms high-density clusters where points are closely related (Huang et al. 2019; Kuwil et al. 2020).

The inspiration for gravity-based clustering stems from the role of gravity in the physical world, where it governs the behavior of masses in the universe ( Cadiou et al. 2020 ). Researchers have leveraged this concept to develop innovative clustering algorithms capable of effectively identifying clusters in diverse datasets. Gravity-based clustering finds applications in various fields, including time series analysis, astrophysics, and data mining ( Jankowiak et al. 2017 ), enabling the identification of clusters with arbitrary shapes and sizes, making it particularly valuable for datasets characterized by varying point densities.

This paper introduces the Black Hole Clustering (BHC) algorithm, a novel approach that harnesses the concept of gravity to classify unsupervised datasets and autonomously predict cluster numbers, eliminating the need for predefined parameters. BHC demonstrates robust performance in predicting cluster numbers and consistently achieves competitive accuracy rates. Comparative evaluations against established clustering methods consistently highlight BHC’s superiority across various scenarios. When applied to real-world datasets from diverse domains, BHC consistently proves its effectiveness, emphasizing its reliability and potential for further research and practical applications. This contribution advances the field of clustering algorithms capable of handling complex data structures.

The remaining sections of this paper are organized as follows: The next section is dedicated to the literature review, where we begin with a discussion on the statement of the problem, followed by an exploration of related works. Subsequently, we delve into the methodology, starting with the proposed idea and then presenting the proposed algorithm. The experiments and results sections showcase the outcomes of our research. Finally, we conclude the paper by summarizing the key findings and contributions of our study, along with highlighting potential directions for future research.

II. Literature Review

A. Statement of the Problem

The problem addressed in this research is the development of a clustering approach that does not rely on predetermined parameters and can identify clusters with non-linear boundaries. The proposed solution leverages the concept of black holes to model the clustering of data points. The goal is to determine the number of clusters and to assign each data point to a cluster without using any parameters. Identifying the cluster centers is a significant challenge, and the data points with maximum density are proposed as efficient cluster centroids. The objective function evaluates the contribution of every variable to achieve optimized clustering, and the centroids are relocated to find the optimum grouping, such that the data points within a cluster are closest to their centroid.

B. Related Works

In the field of clustering, Xu and Wunsch ( Xu & Wunsch II 2005 ) provided a comprehensive review of various algorithms that aim to identify clusters in data. These algorithms can be classified into distribution-based, hierarchical-based, density-based, and grid-based approaches, with the choice depending on the characteristics and prior knowledge of the dataset ( Xu et al. 2016 ; Liu, et al., 2007 ; Liu & Hou 2016 ; Louhichi et al. 2017 ). However, the challenge arises when dealing with big data, which is often heterogeneous and challenging to exploit.

Clustering methods offer a promising way to tackle the complexities of big data. Density-based clustering methods, in particular, are widely used because they scale to large databases and cope well with noisy data (Ester et al. 1996; Hai-Feng et al. 2023). One such algorithm is DBSCAN, which has been extended into variants such as OPTICS, ST-DBSCAN, and MR-DBSCAN (Ankerst et al. 1999; Birant & Kut 2007; He et al. 2011). While these methods perform well on spatial data, they have limitations when applied to high-dimensional data. Subspace clustering algorithms, like DENCLUE and CLIQUE, address this issue by detecting clusters within low-dimensional subspaces of high-dimensional data (Hinneburg & Keim 1998; Agrawal et al. 1998). However, DENCLUE suffers from slow execution because its hill-climbing method converges slowly to local maxima.

Several new clustering algorithms have been proposed to address the limitations of existing methods. The Multi-Elitist PSO algorithm combines particle swarm optimization with clustering ( Das et al. 2008 ), while PSO-Km integrates PSO with the K-means method ( Dhawan & Dai 2018 ). An improved method ( Elfarra et al. 2013 ) uses the concept of gravity to discover clusters in data, where each data point is attracted to the closest point with higher gravity. However, this method requires the specification of two predetermined parameters.

Another notable clustering algorithm is Density Peak Clustering (DPC), which identifies cluster centers based on their density and assigns points to clusters accordingly ( Rodriguez & Laio 2014 ). Several improved versions of DPC, such as MDPC, PPC, FDP Cluster, and DPCG, have been proposed ( Cai et al. 2018 ; Ni et al. 2019 ; Yan et al. 2016 ; Xu et al. 2016 ). However, these algorithms tend to select high-density points as initial cluster centers, which may lead to incorrect assignments or treat low-density points as noise.

The Shared Nearest Neighbors (SNN) algorithm addresses the issue of multiple-density clusters by considering the number of shared neighbors between objects ( Jarvis & Patrick 1973 ). However, identifying clusters without significant separation zones may not be accurate, as the k-nearest neighbors based on distance may not be at the same level as the object ( Ertöz et al. 2003 ). Although the SNN and DPC algorithms have been integrated into SNN-DPC, accurately identifying clusters without evident separation zones remains a challenge, and user input regarding the number of clusters or center points is often required ( Liu et al. 2018 ).

In summary, the field of clustering algorithms offers various approaches to address the challenges posed by big data. From density-based methods like DBSCAN and OPTICS to subspace clustering algorithms like DENCLUE and CLIQUE, each approach has its strengths and limitations (Chen et al. 2022). Newer algorithms, such as Multi-Elitist PSO, PSO-Km, and gravity-based methods, strive to improve clustering performance. However, accurately identifying clusters without evident separation zones and the need for user-specified parameters remain ongoing challenges in the field.

III. Methodology

A. Proposed Idea

Our clustering approach utilizes the concept of black holes to model data points, akin to the gravitational force exerted by black holes in space. In our approach, we designate prototypes as black holes that attract nearby data points. Each data point generates gravity for each link between itself and any data point that is identified as its nearest neighbor. By selecting prototypes with the highest gravity, we attract the nearest data points, which, in turn, pull their nearest data points towards the prototypes, resulting in the formation of clusters.

The gravity of a data point X is determined by the number of data points Y that consider X as their nearest neighbor. Our objective is to determine the optimal number of clusters and classify each data point without relying on predefined parameters.
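The definition of gravity can be illustrated with a short Python sketch that counts, for every point, how many other points name it as their nearest neighbor. The function names `nearest_neighbor` and `gravity` are hypothetical, the code assumes at least two points, and it is not the authors' implementation.

```python
import math
from collections import Counter


def nearest_neighbor(points):
    """Index of the nearest other point for every point (Euclidean distance)."""
    nearest = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        nearest.append(min(others, key=lambda j: math.dist(p, points[j])))
    return nearest


def gravity(points):
    """Gravity of X = number of points Y that consider X their nearest neighbor."""
    counts = Counter(nearest_neighbor(points))
    return [counts[i] for i in range(len(points))]
```

For the four collinear points (0, 0), (1, 0), (1.5, 0) and (10, 0), the two middle points each receive a gravity of 2, while the end points receive 0.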

B. Challenges and novel algorithms

Challenges may arise when two data points designate each other as their nearest neighbor, leading to the complete separation of these points from the cluster. This situation causes the cluster to split into two new clusters. Figure 1 provides a clear example of such a case, where datapoint 1 designates datapoint 2 as its nearest neighbor, and vice versa.

Figure 1. Reciprocal nearest neighbor relationship between datapoint 1 and datapoint 2.
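Such reciprocal pairs are easy to detect once the nearest neighbor of every point is known. A minimal Python sketch, assuming a hypothetical list `nearest` in which `nearest[i]` holds the index of the nearest neighbor of point `i`:

```python
def reciprocal_pairs(nearest):
    """Return the pairs (i, j) with i < j that name each other as nearest neighbor."""
    return [(i, j) for i, j in enumerate(nearest) if i < j and nearest[j] == i]
```

For example, with `nearest = [1, 2, 1, 2]` the only reciprocal pair is (1, 2).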

Another challenge arises when a data point, X, has nearby neighbors that are closely grouped together, leading to the splitting of the cluster into two separate clusters. This situation is illustrated in Figure 2 , where data point 1 identifies data point 4 as its nearest neighbor, while data point 2 identifies data point 3 as its nearest neighbor. As a result, data point 3 forms new connections with other data points that are unrelated to the neighbors of data point 4, leading to the formation of two isolated subgroups. Ideally, these isolated subgroups should be classified as a single cluster rather than multiple new subclusters.

Figure 2. Formation of isolated subgroups within a cluster due to neighbor relations.

To effectively tackle these challenges without introducing additional parameters, we introduce two innovative algorithms, namely “Move_data_points” and “Shrink.”

The Move_data_points algorithm is designed to optimize the relationships between data points. It achieves this by relocating the second, third, fourth, and fifth nearest neighboring points (referred to as fifth-level neighbors) either to the given data point or one of its adjacent points. This adjustment serves to refine the connections among neighboring points and enhance the overall clustering structure. For example, in Figure 2 , datapoint 2 will have connections with datapoints 1, 3, and 4, resulting in a single cluster.

However, it’s worth noting that this approach can occasionally lead to the unintended merging of clusters, particularly when noise points act as connectors between distinct clusters. In response to this challenge, we present the “Shrink” algorithm as a complementary solution. The Shrink algorithm operates by transforming data points to positions closer to their nearest neighbors if the distance between them exceeds twice the mean distance between neighboring points. This strategic relocation effectively brings noisy points into closer proximity to their respective nearest clusters while simultaneously disrupting any spurious connections that may exist between clusters.

Together, these two algorithms, Move_data_points and Shrink, work in tandem to refine the clustering results and mitigate potential challenges arising from noisy or poorly connected data points.

As a final step, we evaluate the effectiveness of our black hole clustering approach by comparing it to other widely used clustering algorithms such as K-means, DBSCAN, and OPTICS. The proposed black hole clustering approach offers a promising alternative method for clustering without the need for predefined parameters. This method excels at identifying clusters with non-linear boundaries and can be applied to various data types, including high-dimensional data. Further research can explore the efficiency and effectiveness of this method and its potential for real-world applications.

C. Proposed Algorithm

The BHC-Clustering algorithm starts by loading the dataset into a matrix called Z. It then calculates the Euclidean distance between each pair of data points in Z, resulting in a distance matrix called d_ecu. The distances in each row are sorted in ascending order, and the indices of the sorted distances are stored in the S_indices array. From this array, we can read off the fifth-level neighbors of the data point in each row. The next step is to apply the Shrink function to Z using the distance matrix d_ecu and the S_indices array. As mentioned before, the Shrink algorithm eliminates or reduces the impact of noise data points. This step modifies the positions of the data points in Z and produces a modified matrix called X. The algorithm then iterates repeatedly over the points in X that have not yet been moved. In each iteration it identifies the parent data point (P_dp), the data point that appears most often in the first column of the S_indices array, that is, the one most frequently marked as a nearest neighbor, and moves the associated data points in X using the “Move_data_points” algorithm. This process continues until all data points have been moved. Finally, the algorithm returns the modified X matrix, which represents the clustered data points based on the BHC-Clustering approach.
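The first steps of this description can be sketched in plain Python as follows. This is an interpretation, not the authors' code: the function names are hypothetical, each point is excluded from its own neighbor row (an assumption about how the first column of the sorted index array is meant), and the move step itself is omitted.

```python
import math
from collections import Counter


def neighbor_table(points):
    """Return the distance matrix and, per point, the other points sorted by distance."""
    n = len(points)
    d_ecu = [[math.dist(p, q) for q in points] for p in points]
    s_indices = [
        sorted((j for j in range(n) if j != i), key=lambda j: d_ecu[i][j])
        for i in range(n)
    ]
    return d_ecu, s_indices


def fifth_level_neighbors(s_indices, i):
    """The (up to) five nearest neighbors of point i."""
    return s_indices[i][:5]


def parent_data_point(s_indices, unmoved):
    """Point named most often as nearest neighbor by the points not yet moved."""
    counts = Counter(s_indices[i][0] for i in unmoved)
    return counts.most_common(1)[0][0]
```

For a small star-shaped set with a center at (0, 0) and four surrounding points, the center is the nearest neighbor of all four and is therefore selected as the parent data point.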

The “Move_data_points” algorithm operates on a dataset and performs the following steps without explicitly referring to individual data points:

For each data point in the dataset, designate a specific data point, referred to as P_dp, as its nearest neighbor.

Iterate through the dataset again, and for each data point encountered, set its nearest neighbor to P_dp.

Consider P_dp as the second, third, fourth, and fifth nearest neighbor for each data point in the dataset. Update the data points accordingly by assigning P_dp as their nearest neighbor.

Finally, return the modified dataset after applying these updates.

In summary, the “Move_data_points” algorithm operates on a dataset, establishing P_dp as the nearest neighbor for each data point, and extends this association to the second, third, fourth, and fifth nearest neighbors. The algorithm then updates the data points based on these assignments before returning the modified dataset.

The Shrink algorithm takes a dataset Z, a distance matrix d_ecu, and the indices of the nearest neighbors for each data point as input. It performs the following steps:

First, it calculates the mean distance between each data point and its nearest neighbor and stores these mean distances in a variable called mean_dist. Then, for each data point, it checks if the distance between the current data point and its nearest neighbor is greater than three times the mean distance. If this condition is true, it moves the current data point to the position of its nearest neighbor.

In summary, the Shrink algorithm adjusts the positions of data points based on their distances to their nearest neighbors. It ensures that any data point with a distance significantly larger than the mean distance is moved closer to its nearest neighbor. This process enhances the clustering results by bringing scattered points, which may act as outliers, closer to their neighbors.
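The Shrink step can be sketched in a few lines of Python. The function name `shrink` and its `factor` parameter are illustrative; the text gives the threshold as twice the mean distance in one place and three times in another, so the multiplier is left adjustable here, and the code assumes at least two points.

```python
import math


def shrink(points, factor=3.0):
    """Move every point that lies far from its nearest neighbor onto that neighbor."""
    nn_index, nn_dist = [], []
    for i, p in enumerate(points):
        j = min(
            (k for k in range(len(points)) if k != i),
            key=lambda k: math.dist(p, points[k]),
        )
        nn_index.append(j)
        nn_dist.append(math.dist(p, points[j]))
    # Mean distance between the points and their nearest neighbors.
    mean_dist = sum(nn_dist) / len(nn_dist)
    return [
        points[nn_index[i]] if nn_dist[i] > factor * mean_dist else p
        for i, p in enumerate(points)
    ]
```

With five evenly spaced points from (0, 0) to (4, 0) and an outlier at (30, 0), only the outlier exceeds the threshold and is moved to (4, 0).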

Source code available on Google Colab at https://colab.research.google.com/drive/1gVMiNf4KPyUCdqqFDQk-5xP_fogHkLaC .

D. Complexity Analysis

In Algorithm 1, BHC-Clustering, the primary contributors to time complexity are the nested loops used for distance calculations and data shrinking, which give a time complexity of O(n^2 * d), where n is the number of data points and d is the number of dimensions. The Move_data_points algorithm iterates over the points and potentially reassigns them, resulting in a time complexity of O(n^2). In the Shrink algorithm, the primary time-consuming tasks are computing mean distances and shrinking points, and the overall time complexity is O(n^2 * d).

It is noteworthy that the dominant factor in the time complexity of these algorithms is the nested loops, which iterate over the dataset and evaluate distances between data points. This leads to a combined time complexity of O(n^2 * d) for these algorithms.

In summary, the nested loops involved in dataset traversal and distance computations are the key contributors to the overall time complexity of these algorithms, resulting in a common time complexity of O(n^2 * d).

IV. Experiments and Results

We conducted various experiments using synthetic and real-world datasets to demonstrate the effectiveness of BHC-Clustering and compare it against K-means, DBSCAN, OPTICS, and BIRCH algorithms. Synthetic datasets included both Gaussian and non-Gaussian data, while real-world datasets were also utilized. The experiments were performed using Python version 3.8 on a Windows 10 Education system with an Intel Core i5-10500H CPU running at 2.50GHz and 16 gigabytes of memory. The synthetic datasets were two-dimensional and had varying numbers of true clusters.

In these experiments, we employed BHC-Clustering to determine the optimal number of clusters and then compared the results with the aforementioned clustering methods. The goal was to evaluate the performance of each method in clustering the data and assess the effectiveness of BHC-Clustering in particular.

A. Synthetic Data Sets

We conducted our experiments using two synthetic data sets. The first data set contained two clusters, while the second data set consisted of 15 clusters. Initially, we applied our proposed method to automatically determine the number of clusters, and our approach successfully predicted the correct number of clusters.

To compare the performance of our proposed algorithm, BHC-Clustering, we also utilized several other popular clustering algorithms, namely K-means, DBSCAN, OPTICS, and Birch. We applied these algorithms to the data sets and evaluated their results against our proposed method.

Figures 3 and 4 depict the clustering results obtained from applying the mentioned algorithms to the 2D synthetic data sets. Specifically, when utilizing K-means clustering (Figures 3(a) and 4(a)), the algorithm exhibited unsatisfactory performance across the datasets. It incorrectly merged half of each cluster with half of the others, resulting in inaccurate classifications.

Figure 3. Comparison of clustering algorithms on 2D synthetic data sets with two clusters: (a) K-means clustering results, (b) DBSCAN clustering results, (c) OPTICS clustering results, and (d) Birch clustering results.

Figure 4. Comparison of clustering algorithms on 2D synthetic data sets with 15 clusters: (a) K-means clustering results, (b) DBSCAN clustering results, (c) OPTICS clustering results, and (d) Birch clustering results.

Similarly, the application of DBSCAN (Figures 3(b) and 4(b)) showed unsatisfactory performance across the datasets, leading to an incorrect number of clusters. This resulted in the misclassification of data points and the formation of spurious clusters. Evidence of this can be observed from the presence of misclassified data points and the existence of scattered points that should have been grouped together in coherent clusters.

Likewise, OPTICS (Figures 3(c) and 4(c)) also demonstrated poor performance. It frequently led to an excessive increase in the number of clusters and caused the fragmentation of clusters into multiple smaller clusters in certain cases. As a result, OPTICS consistently produced inaccurate classifications across the datasets. On the other hand, the Birch algorithm yielded significantly better results in clustering, as evident in Figures 3(d) and 4(d). However, it classified the data set into 14 clusters instead of the expected 15 clusters.

To provide a comprehensive comparison, Figure 5 presents the performance of our proposed BHC-Clustering approach against the aforementioned algorithms. The results clearly demonstrate that our proposed method outperformed the other algorithms in terms of clustering accuracy and overall performance.

Figure 5. Performance comparison of BHC-Clustering against other algorithms.

Our contribution lies in the accurate prediction of the number of clusters in the data. Determining the optimal number of clusters is a crucial step in clustering analysis, as it directly affects the quality of the results. Traditional clustering algorithms, such as K-means, often require the number of clusters to be specified in advance, which can be challenging, especially when working with unfamiliar or complex datasets. This advancement in cluster prediction enhances the accuracy and reliability of clustering results. It enables us to uncover the underlying structure of the data more effectively. Moreover, our approach reduces the burden on users by automating the process of selecting the number of clusters, making it more accessible and efficient for various applications in data analysis and machine learning.

Overall, the experiments conducted on synthetic data sets provide valuable insights into the performance and suitability of BHC-Clustering for different clustering tasks.

B. Real-World Data Sets

In addition to the synthetic data sets, we also tested BHC-Clustering on real-world datasets to assess its performance. Table 1 summarizes the characteristics of these real datasets. They serve as practical benchmarks for evaluating BHC-Clustering in complex real-world scenarios.

Table 1. Characteristics of real-world datasets.

The first data set, Iris Plants, consists of 150 instances in four dimensions, each capturing a feature of iris plants. It is one of the earliest data sets used in the literature on classification methods and is widely used in statistics and machine learning. It contains three classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other two; the latter are not linearly separable from each other (Fisher 1988).

The second data set, Wine, contains 178 instances divided into three classes. It has 13 dimensions, representing different chemical properties of wines. These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines. This data set was selected for its complexity, allowing us to assess the proposed BHC algorithm’s performance in handling higher-dimensional data with distinct attributes (Aeberhard et al. 1991).

The third data set, Breast Cancer (BC), is composed of 569 instances and has two classes. It is a particularly challenging data set due to its high dimensionality, with 30 different attributes related to breast cancer diagnosis. These attributes are derived from digitized images of fine needle aspirates (FNA) of breast masses, describing characteristics of cell nuclei within the images (Wolberg et al. 1995). The selection of this dataset was motivated by its high dimensionality and real-world relevance, rendering it a valuable testbed for our clustering algorithm in a healthcare context.

The fourth data set, Seeds-Dataset (SD), contains measurements of the geometrical properties of wheat kernels. The examined group comprised kernels belonging to three different varieties of wheat (Kama, Rosa, and Canadian), 70 elements each, randomly selected for the experiment. The internal kernel structure was visualized in high quality using a soft X-ray technique, which is non-destructive and considerably cheaper than more sophisticated imaging techniques such as scanning microscopy or laser technology. The images were recorded on 13x18 cm X-ray KODAK plates, and the soft X-ray technique together with the GRAINS package was used to construct all seven real-valued attributes. The studies were conducted using combine-harvested wheat grain originating from experimental fields at the Institute of Agrophysics of the Polish Academy of Sciences in Lublin (Charytanowicz et al. 2012).

The fifth data set, Glass, provides a more realistic scenario with 214 instances and 10 attributes. Each instance represents a piece of glass, and the class attribute indicates the type of glass based on the manufacturing process. There are six distinct types of glass (German 1987). The study of glass classification was motivated by criminological investigation: glass left at the scene of a crime can be used as evidence, which makes this data set particularly relevant for forensic and investigative applications.

These real-world data sets serve as practical benchmarks to assess the effectiveness and applicability of BHC-Clustering in diverse and complex real-world scenarios. The subsequent sections will present the experimental results and comparisons for each of these data sets.

The results presented in Table 2 demonstrate the accuracy rates of various clustering algorithms, namely BHC-Clustering, DBSCAN, OPTICS, and K-means, applied to five real-world data sets: Iris, Wine, Breast Cancer, Seeds-Dataset, and Glass.

Table 2. Predicted and actual number of classes and accuracy rates of clustering algorithms on real-world datasets.

C. Evaluation Metrics

To assess the BHC algorithm’s effectiveness, we utilized confusion matrices as our primary evaluation tool. A confusion matrix is a valuable resource in clustering and unsupervised learning. It aids in gauging how effectively data points are grouped into clusters by comparing assigned cluster labels to actual cluster memberships. This matrix tallies the instances that were correctly and incorrectly assigned to clusters, offering insights into the algorithm’s performance.

The diagonal values within the matrix represent correctly clustered instances. To compute the overall accuracy, we divided the sum of these diagonal values by the total number of instances. For visual clarity, Figure 6 illustrates the BHC algorithm’s classification of the Iris dataset. The accompanying confusion matrix reveals that out of a total of 150 instances, 136 were correctly classified, resulting in an accuracy rate of 136/150 ≈ 90.7%.

Figure 6. Confusion matrix for Iris dataset clustering using BHC algorithm.
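The accuracy computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the 3x3 matrix is hypothetical (it is not the matrix from Figure 6 and was only chosen so that its diagonal sums to 136 out of 150), and the function names are our own. Because cluster identifiers are arbitrary, the second function assumes that clusters are matched to classes by trying every assignment and keeping the best one, which is only practical for a handful of clusters.

```python
from itertools import permutations


def accuracy_from_confusion(matrix):
    """Overall accuracy: sum of the diagonal divided by the total count."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def best_matching_accuracy(matrix):
    """Accuracy under the best cluster-to-class assignment.

    Rows are true classes, columns are assigned clusters. Every column
    permutation is tried (brute force), so use only for small matrices.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    best = max(
        sum(matrix[i][perm[i]] for i in range(k))
        for perm in permutations(range(k))
    )
    return best / total


# Hypothetical confusion matrix (illustrative values, not from the paper).
example = [
    [50, 0, 0],
    [0, 45, 5],
    [0, 9, 41],
]
print(round(100 * accuracy_from_confusion(example), 1))  # 90.7
```

If the cluster columns arrive in a different order than the classes, `best_matching_accuracy` recovers the same value by reordering them first.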

D. Experimental Results and Evaluation

Notably, the proposed method accurately predicted the number of clusters, as indicated by the “Pred.” column of Table 2: the predicted number matched the actual number of classes for every data set, which highlights its value in determining the true number of clusters.

For the Iris data set, BHC-Clustering achieved an accuracy rate of 90.7%, outperforming the other algorithms. DBSCAN and OPTICS had relatively lower accuracy rates of 66% and 67%, respectively, while K-means performed poorly with an accuracy rate of only 24%.

In the case of the Wine data set, BHC-Clustering achieved a moderate accuracy rate of 62%, surpassing DBSCAN (33%) and K-means (16%) but falling behind OPTICS (67%). It is worth noting that none of the algorithms achieved high accuracy on this particular data set.

For the Breast Cancer (BC) data set, BHC-Clustering achieved a decent accuracy rate of 70.3%. DBSCAN (63%) and OPTICS (72%) also performed reasonably well, but K-means excelled with an accuracy rate of 85%.

For the Seeds-Dataset (SD), BHC-Clustering achieved a moderate accuracy rate of 63%. However, it outperformed the other algorithms in this case as well. DBSCAN, K-means, and OPTICS had significantly lower accuracy rates of 28%, 26%, and 18%, respectively.

In the Glass Identification (GI) data set, BHC-Clustering achieved an accuracy rate of 76.2%, demonstrating its effectiveness in clustering this particular data set. DBSCAN and OPTICS had lower accuracy rates of 23.8% and 16%, respectively, while K-means performed relatively better with an accuracy rate of 45%.

Overall, the results suggest that BHC-Clustering exhibits competitive performance compared to the other algorithms in terms of clustering accuracy. However, the performance varies depending on the data set, indicating the importance of considering the characteristics and complexity of the data when selecting a suitable clustering algorithm. The proposed method’s success in accurately predicting the number of clusters demonstrates its potential for enhancing the clustering process.
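To make the per-data-set comparison easier to check, the accuracy rates quoted in this section can be collected in a small table and queried. The sketch below only restates the percentages given in the text; the dictionary layout, the shortened names, and the helper functions are our own, and the unweighted mean is a convenience summary that the paper does not report.

```python
# Accuracy rates (%) exactly as quoted in the text of this section.
ACCURACY = {
    "Iris":          {"BHC": 90.7, "DBSCAN": 66.0, "OPTICS": 67.0, "K-means": 24.0},
    "Wine":          {"BHC": 62.0, "DBSCAN": 33.0, "OPTICS": 67.0, "K-means": 16.0},
    "Breast Cancer": {"BHC": 70.3, "DBSCAN": 63.0, "OPTICS": 72.0, "K-means": 85.0},
    "Seeds":         {"BHC": 63.0, "DBSCAN": 28.0, "OPTICS": 18.0, "K-means": 26.0},
    "Glass":         {"BHC": 76.2, "DBSCAN": 23.8, "OPTICS": 16.0, "K-means": 45.0},
}


def best_per_dataset(table):
    """Name of the most accurate algorithm for each data set."""
    return {name: max(scores, key=scores.get) for name, scores in table.items()}


def mean_accuracy(table):
    """Unweighted mean accuracy per algorithm (not a figure from the paper)."""
    algorithms = next(iter(table.values())).keys()
    return {
        algo: sum(row[algo] for row in table.values()) / len(table)
        for algo in algorithms
    }


winners = best_per_dataset(ACCURACY)
for dataset, algo in winners.items():
    print(f"{dataset}: {algo}")
```

On these figures BHC-Clustering is the most accurate method on Iris, Seeds, and Glass, while OPTICS leads on Wine and K-means on Breast Cancer.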

V. Conclusion

The proposed BHC-Clustering method has been extensively investigated and applied to synthetic and real-world datasets. This approach utilizes the concept of black holes to attract nearby data points and form clusters. The method exhibits robust performance in accurately predicting the number of clusters and achieving competitive clustering accuracy rates.

Comparative evaluations against popular clustering algorithms, such as K-means, DBSCAN, OPTICS, and BIRCH, show that BHC-Clustering outperforms K-means on four of the five real-world datasets and achieves comparable or superior results compared to DBSCAN and OPTICS. BIRCH shows promise on the synthetic data, although it identified 14 clusters instead of the expected 15 on the second synthetic data set.

Furthermore, the application of BHC-Clustering on real-world datasets, including Iris, Wine, Breast Cancer, Seeds-Dataset, and Glass, showcases its effectiveness across different domains. It demonstrates varying levels of performance, depending on the characteristics of the dataset. The findings emphasize the reliability and effectiveness of BHC-Clustering as a clustering approach and encourage further research to refine the method, assess its efficiency, and explore its applicability in diverse applications. Overall, BHC-Clustering offers a promising alternative for clustering tasks, providing accurate cluster prediction and competitive clustering accuracy on a variety of datasets, including real-world scenarios.

However, it is important to acknowledge the challenge posed by the algorithm’s complexity, which scales as O(n² · d), particularly when confronted with multiple noise points in real-world scenarios. The need for further research in this direction is evident. Future work in this field should focus on:

Complexity enhancement: Addressing the O(n² · d) complexity of the BHC-Clustering method to improve its efficiency and scalability, especially when dealing with large datasets and intricate cluster structures.

Noise handling: Developing advanced mechanisms to enhance the algorithm’s ability to identify and manage multiple noise points effectively. This will bolster its applicability in noisy, real-world environments and ensure more efficient clustering outcomes.
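For intuition about the O(n² · d) term, the sketch below counts the coordinate-level operations in an all-pairs Euclidean distance computation over n points in d dimensions. This is an assumption made for illustration: the text does not say which step of BHC-Clustering is responsible for the quadratic cost, and all-pairs distances are merely one common source of a term of this form. The function name is hypothetical.

```python
import math


def pairwise_distances(points):
    """All-pairs Euclidean distances.

    Returns (distance_matrix, coordinate_ops), where coordinate_ops counts
    the per-coordinate squared differences, i.e. n * n * d in total.
    """
    n = len(points)
    d = len(points[0]) if n else 0
    dist = [[0.0] * n for _ in range(n)]
    ops = 0
    for i in range(n):
        for j in range(n):
            squared = 0.0
            for k in range(d):
                diff = points[i][k] - points[j][k]
                squared += diff * diff
                ops += 1
            dist[i][j] = math.sqrt(squared)
    return dist, ops


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
dist, ops = pairwise_distances(points)
print(ops)         # 18, i.e. 3 * 3 * 2
print(dist[0][1])  # 5.0
```

Doubling the number of points quadruples the count, which is why the quadratic term dominates on large data sets.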

Competing Interests

The authors have no competing interests to declare.

References

Aeberhard, M, Stefan, M and Forina, M 1991. Wine. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5PC7J

Agrawal, R, Gehrke, J, Gunopulos, D and Raghavan, P 1998. Automatic subspace clustering of high dimensional data for data mining applications. ACM SIGMOD Record, 27(2): 94–105. DOI: https://doi.org/10.1145/276305.276314

Ankerst, M, Breunig, MM, Kriegel, HP and Sander, J 1999. OPTICS: ordering points to identify the clustering structure. ACM SIGMOD Record, 28(2): 49–60. DOI: https://doi.org/10.1145/304181.304187

Birant, D and Kut, A 2007. ST-DBSCAN: An algorithm for clustering spatial–temporal data. Data & Knowledge Engineering, 60(1): 208–221. DOI: https://doi.org/10.1016/j.datak.2006.01.013

Cadiou, E, Sarzi, M and Dubois, Y 2020. Gravitational clustering of stars and gas in galaxy simulations. Monthly Notices of the Royal Astronomical Society, 496(4): 4986–5001.

Cai, B, Huang, G, Yong, X, Jing, H, Huang, GL, Ke, D, et al. 2018. Clustering of multiple density peaks. In: 22nd Pacific–Asia Conference, PAKDD 2018, Melbourne, Australia on 3–6 June 2018, pp. 413–425. DOI: https://doi.org/10.1007/978-3-319-93040-4_33

Cai, J, Hao, J, Yang, H, Zhao, X, Yang, Y, et al. 2023. A review on semi-supervised clustering. Information Sciences, 632: 164–200. DOI: https://doi.org/10.1016/j.ins.2023.02.088

Charytanowicz, M, Jerzy, N, Piotr, K, Piotr, K, Szymon, L, et al. 2012. Seeds. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5H30K

Chen, X, Wu, H, Lichti, D, Han, X, Ban, Y, Li, P, Deng, H, et al. 2022. Extraction of indoor objects based on the exponential function density clustering model. Information Sciences, 607: 1111–1135. DOI: https://doi.org/10.1016/j.ins.2022.06.032

Das, S, Abraham, A and Konar, A 2008. Automatic kernel clustering with a multi-elitist particle swarm optimization algorithm. Pattern Recognition Letters, 29(5): 688–699. DOI: https://doi.org/10.1016/j.patrec.2007.12.002

Dhawan, AP and Dai, S 2018. Clustering and pattern classification. In: Dhawan, AP, Huang, HK, and Kim, DS (eds.), Principles and Advanced Methods in Medical Imaging and Image Analysis. Singapore: World Scientific. pp. 229–265. DOI: https://doi.org/10.1142/9789812814807_0010

Elfarra, BK, El Khateeb, TJ and Ashour, WM 2013. BH-centroids: A new efficient clustering algorithm. Work, 1(1): 15–24. DOI: https://doi.org/10.14257/ijaiasd.2013.1.1.02

Ertöz, L, Steinbach, M and Kumar, V 2003. Finding clusters of different sizes shapes and densities in noisy high dimensional data. In: The 2003 SIAM International Conference on Data Mining, San Francisco, CA on 1–3 May 2003, pp. 47–58. DOI: https://doi.org/10.1137/1.9781611972733.5

Ester, M, Kriegel, HP, Sander, J and Xu, X 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. In: The 2nd International Conference on Knowledge Discovery and Data Mining, Portland, Oregon on 2–4 August 1996, pp. 226–231.

Fisher, RA 1988. Iris. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C56C76

German, B 1987. Glass Identification. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5WW2P

Ghazal, T, Hussain, MZ, Said, RA and Nadeem, A 2021. Performances of K-means clustering algorithm with different distance metrics. Intelligent Automation and Soft Computing, 30(2): 735–742. DOI: https://doi.org/10.32604/iasc.2021.019067

Hai-Feng, Y, Xiao-Na, Y, Jiang-Hui, C, Yu-Qing, Y, et al. 2023. An in-depth exploration of LAMOST unknown spectra based on density clustering. Research in Astronomy and Astrophysics, 23(5). DOI: https://doi.org/10.1088/1674-4527/acc507

He, Y, Tan, H, Luo, W, Mao, H, Ma, D, Feng, S, Fan, J, et al. 2011. MR-DBSCAN: An efficient parallel density-based clustering algorithm using MapReduce. In: 2011 IEEE 17th International Conference on Parallel and Distributed Systems, Tainan, Taiwan on 7–9 December 2011, pp. 473–480. DOI: https://doi.org/10.1109/ICPADS.2011.83

Hinneburg, A and Keim, DA 1998. An efficient approach to clustering in large multimedia databases with noise. In: KDD ’98: Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, New York, NY on 27–31 August 1998, pp. 58–65.

Huang, Y, Yang, H and Zhang, L 2019. A novel clustering algorithm based on gravity. Journal of Ambient Intelligence and Humanized Computing, 10(6): 2461–2470.

Jankowiak, M, Kaczmarek, M, Wozniak, M and Wojciechowski, K 2017. Gravity-based clustering of time series data. Information Sciences, 385–386: 52–64.

Jarvis, RA and Patrick, EA 1973. Clustering using a similarity measure based on shared near neighbors. IEEE Transactions on Computers, C–22(11): 1025–1034. DOI: https://doi.org/10.1109/T-C.1973.223640

Kuwil, FH, Atila, Ü, Abu-Issa, R and Murtagh, F 2020. A novel data clustering algorithm based on gravity center methodology. Expert Systems with Applications, 156: 113435. DOI: https://doi.org/10.1016/j.eswa.2020.113435

Liu, P, Zhou, D and Wu, N 2007. VDBSCAN: varied density based spatial clustering of applications with noise. In: International Conference on Service Systems and Service Management, Chengdu, China on 9–11 June 2007, pp. 1–4. DOI: https://doi.org/10.1109/ICSSSM.2007.4280175

Liu, R, Wang, H and Yu, X 2018. Shared-nearest-neighbor-based clustering by fast search and find of density peaks. Information Sciences, 450: 200–226. DOI: https://doi.org/10.1016/j.ins.2018.03.031

Liu, W and Hou, J 2016. Study on a density peak based clustering algorithm. In: 7th International Conference on Intelligent Control and Information Processing (ICICIP), Siem Reap, Cambodia on 1–4 December 2016, pp. 60–67. DOI: https://doi.org/10.1109/ICICIP.2016.7885877

Louhichi, S, Gzara, M and Ben-Abdallah, H 2017. Unsupervised varied density based clustering algorithm using spline. Pattern Recognition Letters, 93: 48–57. DOI: https://doi.org/10.1016/j.patrec.2016.10.014

Ni, L, Luo, W, Zhu, W and Liu, W 2019. Clustering by finding prominent peaks in density space. Engineering Applications of Artificial Intelligence, 85: 727–739. DOI: https://doi.org/10.1016/j.engappai.2019.07.015

Rodriguez, A and Laio, A 2014. Clustering by fast search and find of density peaks. Science, 344(6191): 1492–1496. DOI: https://doi.org/10.1126/science.1242072

Wolberg, W, Mangasarian, O, Street, N and Street, W 1995. Breast Cancer Wisconsin (Diagnostic). UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5DW2B

Xu, R and Wunsch, D, II 2005. Survey of clustering algorithms. IEEE Transactions on Neural Networks, 16(3): 645–678. DOI: https://doi.org/10.1109/TNN.2005.845141

Xu, X, Ding, S, Du, M and Xue, Y 2016. DPCG: An efficient density peaks clustering algorithm based on grid. International Journal of Machine Learning and Cybernetics, 9: 743–754. DOI: https://doi.org/10.1007/s13042-016-0603-2

Yan, Z, Luo, W, Bu, C and Ni, L 2016. Clustering spatial data by the neighbors intersection and the density difference. In: UCC’16: 9th International Conference on Utility and Cloud Computing, Shanghai, China on 6–9 December 2016, pp. 217–226.

AlphaFold 3 predicts the structure and interactions of all of life’s molecules

May 08, 2024


Introducing AlphaFold 3, a new AI model developed by Google DeepMind and Isomorphic Labs. By accurately predicting the structure of proteins, DNA, RNA, ligands and more, and how they interact, we hope it will transform our understanding of the biological world and drug discovery.


Inside every plant, animal and human cell are billions of molecular machines. They’re made up of proteins, DNA and other molecules, but no single piece works on its own. Only by seeing how they interact together, across millions of types of combinations, can we start to truly understand life’s processes.

In a paper published in Nature , we introduce AlphaFold 3, a revolutionary model that can predict the structure and interactions of all life’s molecules with unprecedented accuracy. For the interactions of proteins with other molecule types we see at least a 50% improvement compared with existing prediction methods, and for some important categories of interaction we have doubled prediction accuracy.

We hope AlphaFold 3 will help transform our understanding of the biological world and drug discovery. Scientists can access the majority of its capabilities, for free, through our newly launched AlphaFold Server , an easy-to-use research tool. To build on AlphaFold 3’s potential for drug design, Isomorphic Labs is already collaborating with pharmaceutical companies to apply it to real-world drug design challenges and, ultimately, develop new life-changing treatments for patients.

Our new model builds on the foundations of AlphaFold 2, which in 2020 made a fundamental breakthrough in protein structure prediction . So far, millions of researchers globally have used AlphaFold 2 to make discoveries in areas including malaria vaccines, cancer treatments and enzyme design. AlphaFold has been cited more than 20,000 times and its scientific impact recognized through many prizes, most recently the Breakthrough Prize in Life Sciences . AlphaFold 3 takes us beyond proteins to a broad spectrum of biomolecules. This leap could unlock more transformative science, from developing biorenewable materials and more resilient crops, to accelerating drug design and genomics research.

7PNM - Spike protein of a common cold virus (Coronavirus OC43): AlphaFold 3’s structural prediction for a spike protein (blue) of a cold virus as it interacts with antibodies (turquoise) and simple sugars (yellow), accurately matches the true structure (gray). The animation shows the protein interacting with an antibody, then a sugar. Advancing our knowledge of such immune-system processes helps better understand coronaviruses, including COVID-19, raising possibilities for improved treatments.

How AlphaFold 3 reveals life’s molecules

Given an input list of molecules, AlphaFold 3 generates their joint 3D structure, revealing how they all fit together. It models large biomolecules such as proteins, DNA and RNA, as well as small molecules, also known as ligands, a category encompassing many drugs. Furthermore, AlphaFold 3 can model chemical modifications to these molecules, which control the healthy functioning of cells and, when disrupted, can lead to disease.

AlphaFold 3’s capabilities come from its next-generation architecture and training that now covers all of life’s molecules. At the core of the model is an improved version of our Evoformer module — a deep learning architecture that underpinned AlphaFold 2’s incredible performance. After processing the inputs, AlphaFold 3 assembles its predictions using a diffusion network, akin to those found in AI image generators. The diffusion process starts with a cloud of atoms, and over many steps converges on its final, most accurate molecular structure.

AlphaFold 3’s predictions of molecular interactions surpass the accuracy of all existing systems. As a single model that computes entire molecular complexes in a holistic way, it’s uniquely able to unify scientific insights.

7R6R - DNA binding protein: AlphaFold 3’s prediction for a molecular complex featuring a protein (blue) bound to a double helix of DNA (pink) is a near-perfect match to the true molecular structure discovered through painstaking experiments (gray).

Leading drug discovery at Isomorphic Labs

AlphaFold 3 creates capabilities for drug design with predictions for molecules commonly used in drugs, such as ligands and antibodies, that bind to proteins to change how they interact in human health and disease.

AlphaFold 3 achieves unprecedented accuracy in predicting drug-like interactions, including the binding of proteins with ligands and antibodies with their target proteins. AlphaFold 3 is 50% more accurate than the best traditional methods on the PoseBusters benchmark without needing the input of any structural information, making AlphaFold 3 the first AI system to surpass physics-based tools for biomolecular structure prediction. The ability to predict antibody-protein binding is critical to understanding aspects of the human immune response and the design of new antibodies — a growing class of therapeutics.

Using AlphaFold 3 in combination with a complementary suite of in-house AI models, Isomorphic Labs is working on drug design for internal projects as well as with pharmaceutical partners. Isomorphic Labs is using AlphaFold 3 to accelerate and improve the success of drug design — by helping understand how to approach new disease targets, and developing novel ways to pursue existing ones that were previously out of reach.

AlphaFold Server: A free and easy-to-use research tool

8AW3 - RNA modifying protein: AlphaFold 3’s prediction for a molecular complex featuring a protein (blue), a strand of RNA (purple), and two ions (yellow) closely matches the true structure (gray). This complex is involved with the creation of other proteins — a cellular process fundamental to life and health.

Google DeepMind’s newly launched AlphaFold Server is the most accurate tool in the world for predicting how proteins interact with other molecules throughout the cell. It is a free platform that scientists around the world can use for non-commercial research. With just a few clicks, biologists can harness the power of AlphaFold 3 to model structures composed of proteins, DNA, RNA and a selection of ligands, ions and chemical modifications.

AlphaFold Server helps scientists make novel hypotheses to test in the lab, speeding up workflows and enabling further innovation. Our platform gives researchers an accessible way to generate predictions, regardless of their access to computational resources or their expertise in machine learning.

Experimental protein-structure prediction can take about the length of a PhD and cost hundreds of thousands of dollars. Our previous model, AlphaFold 2, has been used to predict hundreds of millions of structures, which would have taken hundreds of millions of researcher-years at the current rate of experimental structural biology.


Sharing the power of AlphaFold 3 responsibly

With each AlphaFold release, we’ve sought to understand the broad impact of the technology , working together with the research and safety community. We take a science-led approach and have conducted extensive assessments to mitigate potential risks and share the widespread benefits to biology and humanity.

Building on the external consultations we carried out for AlphaFold 2, we’ve now engaged with more than 50 domain experts, in addition to specialist third parties, across biosecurity, research and industry, to understand the capabilities of successive AlphaFold models and any potential risks. We also participated in community-wide forums and discussions ahead of AlphaFold 3’s launch.

AlphaFold Server reflects our ongoing commitment to share the benefits of AlphaFold, including our free database of 200 million protein structures. We’ll also be expanding our free AlphaFold education online course with EMBL-EBI and partnerships with organizations in the Global South to equip scientists with the tools they need to accelerate adoption and research, including on underfunded areas such as neglected diseases and food security. We’ll continue to work with the scientific community and policy makers to develop and deploy AI technologies responsibly.

Opening up the future of AI-powered cell biology

7BBV - Enzyme: AlphaFold 3’s prediction for a molecular complex featuring an enzyme protein (blue), an ion (yellow sphere) and simple sugars (yellow), along with the true structure (gray). This enzyme is found in a soil-borne fungus (Verticillium dahliae) that damages a wide range of plants. Insights into how this enzyme interacts with plant cells could help researchers develop healthier, more resilient crops.

AlphaFold 3 brings the biological world into high definition. It allows scientists to see cellular systems in all their complexity, across structures, interactions and modifications. This new window on the molecules of life reveals how they’re all connected and helps understand how those connections affect biological functions — such as the actions of drugs, the production of hormones and the health-preserving process of DNA repair.

The impacts of AlphaFold 3 and our free AlphaFold Server will be realized through how they empower scientists to accelerate discovery across open questions in biology and new lines of research. We’re just beginning to tap into AlphaFold 3’s potential and can’t wait to see what the future holds.


linked list Recently Published Documents

A New Top-Down Context-Free Parsing for Syntactic Pattern Recognition

The many mathematical methods used to solve pattern recognition problems can be grouped into two general approaches: the decision-theoretic approach and the syntactic (structural) approach. This paper first describes the syntactic pattern recognition method and formal grammars, and then examines one syntactic pattern recognition technique, the top-down tabular parser known as Earley's algorithm. Earley's tabular parser is one of the methods of context-free grammar parsing for syntactic pattern recognition. Its implementation uses an array data structure, which is its main problem: searching the array during parsing takes a great deal of time and wastes a great deal of memory. To address these problems, and above all the cubic time complexity, this article introduces a new algorithm that uses a linked list data structure and reduces the wasted memory to zero. With changes to the implementation and operation of the algorithm, the cubic time complexity is reduced to O(n*R). Key words: syntactic pattern recognition, tabular parser, context-free grammar, time complexity, linked list data structure.
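For readers unfamiliar with the baseline being improved, the following is a minimal sketch of a textbook Earley recognizer. It is not the linked-list algorithm proposed in the abstract, whose details are not given here. The grammar encoding, the function name, and the arithmetic grammar are our own assumptions, the chart is stored in ordinary Python lists, and the sketch assumes a grammar without empty right-hand sides.

```python
def earley_recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    grammar maps each nonterminal to a list of right-hand sides (tuples of
    symbols). A symbol is a nonterminal if it is a key of grammar.
    Assumes no empty right-hand sides.
    """
    n = len(tokens)
    chart = [[] for _ in range(n + 1)]     # chart[i]: items ending at position i
    seen = [set() for _ in range(n + 1)]   # duplicate filter per position

    def add(i, item):
        if item not in seen[i]:
            seen[i].add(item)
            chart[i].append(item)

    for rhs in grammar[start]:
        add(0, (start, rhs, 0, 0))         # item = (lhs, rhs, dot, origin)

    for i in range(n + 1):
        pos = 0
        while pos < len(chart[i]):         # chart[i] may grow while scanned
            lhs, rhs, dot, origin = chart[i][pos]
            pos += 1
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in grammar:      # predictor: expand a nonterminal
                    for alternative in grammar[symbol]:
                        add(i, (symbol, alternative, 0, i))
                elif i < n and tokens[i] == symbol:   # scanner: match a token
                    add(i + 1, (lhs, rhs, dot + 1, origin))
            else:                          # completer: advance waiting items
                for p_lhs, p_rhs, p_dot, p_origin in chart[origin]:
                    if p_dot < len(p_rhs) and p_rhs[p_dot] == lhs:
                        add(i, (p_lhs, p_rhs, p_dot + 1, p_origin))

    return any(
        lhs == start and dot == len(rhs) and origin == 0
        for lhs, rhs, dot, origin in chart[n]
    )


# Hypothetical left-recursive arithmetic grammar used only for illustration.
GRAMMAR = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("n",)],
}
print(earley_recognize(GRAMMAR, "E", "n + n * n".split()))  # True
print(earley_recognize(GRAMMAR, "E", "n + * n".split()))    # False
```

The completer step, which rescans an earlier chart position for every finished item, is where the cubic worst case of the textbook algorithm comes from.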

On the Use of Model Checking for the Bounded and Unbounded Verification of Nonblocking Concurrent Data Structures

<p>Concurrent data structure algorithms have traditionally been designed using locks to regulate the behaviour of interacting threads, thus restricting access to parts of the shared memory to only one thread at a time. Since locks can lead to issues of performance and scalability, there has been interest in designing so-called nonblocking algorithms that do not use locks. However, designing and reasoning about concurrent systems is difficult, and is even more so for nonblocking systems, as evidenced by the number of incorrect algorithms in the literature. This thesis explores how the technique of model checking can aid the testing and verification of nonblocking data structure algorithms. Model checking is an automated verification method for finite state systems, and is able to produce counterexamples when verification fails. For verification, concurrent data structures are considered to be infinite state systems, as there is no bound on the number of interacting threads, the number of elements in the data structure, nor the number of possible distinct data values. Thus, in order to analyse concurrent data structures with model checking, we must either place finite bounds upon them, or employ an abstraction technique that will construct a finite system with the same properties. First, we discuss how nonblocking data structures can be best represented for model checking, and how to specify the properties we are interested in verifying. These properties are the safety property linearisability, and the progress properties wait-freedom, lock-freedom and obstructionfreedom. Second, we investigate using model checking for exhaustive testing, by verifying bounded (and hence finite state) instances of nonblocking data structures, parameterised by the number of threads, the number of distinct data values, and the size of storage memory (e.g. array length, or maximum number of linked list nodes). 
It is widely held, based on anecdotal evidence, that most bugs occur in small instances. We investigate the smallest bounds needed to falsify a number of incorrect algorithms, which supports this hypothesis. We also investigate verifying a number of correct algorithms for a range of bounds. If an algorithm can be verified for bounds significantly higher than the minimum bounds needed for falsification, then we argue it provides a high degree of confidence in the general correctness of the algorithm. However, with the available hardware we were not able to verify any of the algorithms to high enough bounds to claim such confidence. Third, we investigate using model checking to verify nonblocking data structures by employing the technique of canonical abstraction to construct finite state representations of the unbounded algorithms. Canonical abstraction represents abstract states as 3-valued logical structures, and allows the initial coarse abstraction to be refined as necessary by adding derived predicates. We introduce several novel derived predicates and show how these allow linearisability to be verified for linked list based nonblocking stack and queue algorithms. This is achieved within the standard canonical abstraction framework, in contrast to recent approaches that have added extra abstraction techniques on top to achieve the same goal. The finite state systems we construct using canonical abstraction are still relatively large, being exponential in the number of distinct abstract thread objects. We present an alternative application of canonical abstraction, which more coarsely collapses all threads in a state to be represented by a single abstract thread object. In addition, we define further novel derived predicates, and show that these allow linearisability to be verified for the same stack and queue algorithms far more efficiently.</p>
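The idea of exhaustively checking a small bounded instance can be illustrated with a toy example. The sketch below is not taken from the thesis: it assumes a deliberately broken shared counter (a separate read and write per thread, with no lock or compare-and-swap) and enumerates every interleaving of two threads, reporting the schedules that violate the expected final value. The names `interleavings`, `run` and `find_counterexamples` are hypothetical.

```python
def interleavings(a, b):
    """Yield every merge of step lists a and b that keeps each thread's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute one schedule of (thread_id, op) steps on a shared counter."""
    shared = 0
    local = {}  # per-thread copy of the value read
    for tid, op in schedule:
        if op == "read":
            local[tid] = shared
        else:  # "write": store the stale local value plus one
            shared = local[tid] + 1
    return shared


def find_counterexamples():
    """Check the bounded instance: two threads, one increment each."""
    t0 = [(0, "read"), (0, "write")]
    t1 = [(1, "read"), (1, "write")]
    schedules = list(interleavings(t0, t1))
    bad = [s for s in schedules if run(s) != 2]
    return schedules, bad


schedules, bad = find_counterexamples()
print(len(schedules), "schedules explored,", len(bad), "lose an update")
print("example counterexample:", bad[0])
```

Even with two threads and two steps each, most schedules lose an update, which matches the abstract's observation that bugs tend to show up at small bounds. A real model checker would also explore the data structure's heap state and check linearisability rather than a single final value.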

Exploratory Study on Accuracy of Students' Mental Models of a Singly Linked List

Automated Detection on the Security of the Linked-List Operations

Sequential Linked Data: The State of Affairs

Sequences are among the most important data structures in computer science. In the Semantic Web, however, little attention has been given to Sequential Linked Data. In previous work, we have discussed the data models that Knowledge Graphs commonly use for representing sequences and showed how these models have an impact on query performance and that this impact is invariant to triplestore implementations. However, the specific list operations that the management of Sequential Linked Data requires beyond the simple retrieval of an entire list or a range of its elements – e.g. to add or remove elements from a list –, and their impact in the various list data models, remain unclear. Covering this knowledge gap would be a significant step towards the realization of a Semantic Web list Application Programming Interface (API) that standardizes list manipulation and generalizes beyond specific data models. In order to address these challenges towards the realization of such an API, we build on our previous work in understanding the effects of various sequential data models for Knowledge Graphs, extending our benchmark and proposing a set of read-write Semantic Web list operations in SPARQL, with insert, update and delete support. To do so, we identify five classic list-based computer science sequential data structures (linked list, double linked list, stack, queue, and array), from which we derive nine atomic read-write operations for Semantic Web lists. We propose a SPARQL implementation of these operations with five typical RDF data models and compare their performance by executing them against six increasing dataset sizes and four different triplestores. In light of our results, we discuss the feasibility of our devised API and reflect on the state of affairs of Sequential Linked Data.
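To make the "linked list" data model concrete, here is a minimal sketch in plain Python. It assumes the standard RDF collection vocabulary (`rdf:first`, `rdf:rest`, `rdf:nil`) as one plausible linked-list model; the abstract does not name its five models, so this choice is an assumption. The blank-node labels and the function names are hypothetical, and the code manipulates triples as Python tuples rather than running SPARQL against a triplestore.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"


def to_linked_triples(items, prefix="_:n"):
    """Encode a Python list as (subject, predicate, object) triples."""
    triples = set()
    if not items:
        return RDF_NIL, triples
    nodes = [f"{prefix}{i}" for i in range(len(items))]
    for i, (node, item) in enumerate(zip(nodes, items)):
        nxt = nodes[i + 1] if i + 1 < len(nodes) else RDF_NIL
        triples.add((node, RDF_FIRST, item))
        triples.add((node, RDF_REST, nxt))
    return nodes[0], triples


def read_list(head, triples):
    """Retrieve the whole list by following rdf:rest links from the head."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    out, node = [], head
    while node != RDF_NIL:
        out.append(first[node])
        node = rest[node]
    return out


def append(head, triples, item, new_node):
    """Append at the tail: walk to the last cell, then rewire its rdf:rest."""
    cell = {(new_node, RDF_FIRST, item), (new_node, RDF_REST, RDF_NIL)}
    if head == RDF_NIL:
        triples |= cell
        return new_node
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    node = head
    while rest[node] != RDF_NIL:
        node = rest[node]
    triples.remove((node, RDF_REST, RDF_NIL))
    triples.add((node, RDF_REST, new_node))
    triples |= cell
    return head


head, triples = to_linked_triples(["a", "b"])
head = append(head, triples, "c", "_:n2")
print(read_list(head, triples))
```

In this model an append has to traverse the whole chain before one triple is deleted and three are added. That is the kind of per-model cost difference the proposed read-write operations are meant to expose.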

Using Prefix Reversals To Looplessly Generate O(1) Time Multiset Permutation By Boustrophedon Linked List

A High-Efficiency Smoothed Particle Hydrodynamics Model with Multi-Cell Linked List and Adaptive Particle Refinement for Two-Phase Flows

Parallel Merging and Sorting on Linked List

We study linked list sorting and merging on the PRAM model. In this paper we show that $n$ real numbers can be sorted into a linked list in constant time with $n^{2+\epsilon}$ processors or in ) time with $n^2$ processors. We also show that two sorted linked lists of $n$ integers in $\{0, 1, \ldots, m\}$ can be merged into one sorted linked list in $O(\log^{(c)} n\,(\log\log m)^{1/2})$ time using $n/(\log^{(c)} n\,(\log\log m)^{1/2})$ processors, where $c$ is an arbitrarily large constant.

A more pragmatic implementation of the lock-free, ordered, linked list


Title: Recent Advances in Fully Dynamic Graph Algorithms

Abstract: In recent years, significant advances have been made in the design and analysis of fully dynamic algorithms. However, these theoretical results have received very little attention from the practical perspective. Few of the algorithms are implemented and tested on real datasets, and their practical potential is far from understood. Here, we present a quick reference guide to recent engineering and theory results in the area of fully dynamic graph algorithms.




Brief Research Report: Recent records of thermohaline profiles and water depth in the Taam ja’ Blue Hole (Chetumal Bay, Mexico)

1 Department of Observation and Study of the Land, the Atmosphere and the Ocean, Consejo Nacional de Humanidades, Ciencias y Tecnologías-El Colegio de la Frontera Sur (CONAHCYT-ECOSUR), Chetumal, Mexico
2 Department of Sustainability Sciences, El Colegio de la Frontera Sur, Chetumal, Mexico
3 Department of Observation and Study of the Land, the Atmosphere and the Ocean, El Colegio de la Frontera Sur, Chetumal, Mexico

Coastal karst structures have been recently explored and documented in Chetumal Bay, Mexico, at the southeast of the Yucatan Peninsula. These structures, recognized as blue holes, stand out for their remarkable dimensions within a shallow estuarine environment. In particular, the Taam Ja’ Blue Hole (TJBH) revealed a depth of ~274 mbsl based on echo sounder mapping, momentarily positioning it as the world's second-deepest blue hole. However, echo sounding methods face challenges in complex environments like blue holes or inland sinkholes, arising from frequency-dependent detection and range limitations due to vertical gradients in water density, cross-sectional depth variations, or morphometric deviations in non-strictly vertical caves. Initial exploration could not reach the bottom and confirm its position, prompting ongoing investigation into the geomorphological features of TJBH. Recent CTD profiler records in TJBH surpassed 420 mbsl with no bottom yet reached, establishing the TJBH as the deepest-known blue hole globally. Hydrographic data delineated multiple water layers within TJBH. Comparison with Caribbean water conditions at the Mesoamerican Barrier Reef System, reef lagoons, and estuaries suggests potential subterranean connections. Further research and implementation of underwater navigation technologies are essential to decipher its maximum depth and the possibility that it forms part of an interconnected system of caves and tunnels.

Introduction

Anchialine systems stand out as impressive and exciting environments to be explored across different disciplines. These systems provide a vast research field, from microbiology ( Benítez et al., 2019 ; Little et al., 2021 ; Sha et al., 2021 ), to sea-level dynamics or paleoclimate ( van Hengstum et al., 2011 ; Husson et al., 2018 ; van Hengstum et al., 2020 ; Wallace et al., 2021 ), stratigraphy ( Vimpere, 2017 ), physicochemical water properties ( Perry et al., 2002 , Perry et al., 2009 ), as well as groundwater hydrology ( Gondwe et al., 2010 ; Björnerås et al., 2020 ). However, a common basis across all disciplines is the need to understand the geomorphology and dimensions of the karst structures.

The Yucatan Peninsula, part of Central America's Maya block, lacks Paleozoic folds ( Weber et al., 2012 ). With dynamic diagenesis and gradual Pliocene emergence, it exhibits significant geological structures in vadose ( Perry et al., 2003 , Perry et al., 2009 ) and phreatic settings ( van Hengstum et al., 2010 , van Hengstum et al., 2011 ), as well as in coastal submarine environments ( Bauer-Gottwein et al., 2011a ). Moreover, the Yucatan Peninsula's northern side hosts the Ring of Cenotes Fault, a regional-scale structure formed by sinkholes, related to the Chicxulub meteorite impact 65 million years ago ( Bauer-Gottwein et al., 2011a ). Simultaneously, the world's most extensive subterranean cave system, shaped by glacio-eustatic sea-level changes, is found on the western side ( Supper et al., 2009 ; Kambesis and Coke, 2013 ). Across the eastern margin, parallel to the Caribbean coast, the Yucatan Peninsula features two regional fracture zones—the Holbox Fracture Zone to the north and the Rio Hondo Fault Zone to the south ( Bauer-Gottwein et al., 2011a ) with possible intersections and water exchange ( Gondwe et al., 2011 ). To the southeast, inland sinkholes and lagoons aligned with the Rio Hondo Fault Zone have been extensively studied (e.g. Gischler et al., 2011 ; Perry et al., 2021 ). Also, recent exploration in Chetumal Bay reported large coastal karstic formations recognized as blue holes ( Carrillo et al., 2009b ; Alcérreca-Huerta et al., 2023 ; Flórez-Franco et al., 2023 ). These blue holes represented an outstanding revelation, particularly that of the Taam-ja’ Blue Hole (TJBH), preliminarily recognized as the world's second-deepest, surpassing the depths of the Dean’s Blue Hole in the Bahamas (~202 mbsl) ( Vimpere, 2017 ), the Dahab Blue Hole in Egypt (~130 mbsl) ( Li et al., 2018 ), or the Great Blue Hole in Belize (~125 mbsl) ( Schmitt et al., 2021 ).

The TJBH, first documented by Alcérreca-Huerta et al. (2023) , stands as a noteworthy geological feature. Bathymetric mapping employing echo sounder technology indicated an impressive maximum depth of 274.4 meters below sea level (mbsl). Echo sounding, serving as an indirect method, allowed a comprehensive 3D spatial coverage of the TJBH morphology. However, this method could grapple with constraints arising from frequency-dependent detection and range limitations ( Colbo et al., 2014 ). These challenges are usually accentuated in blue holes and inland sinkholes due to fluctuations in water density ( Cejudo et al., 2022 ) and cross-sectional variations in depth ( Li et al., 2018 ), particularly in non-strictly vertical caves where the blue hole structure deviates from their entrance position. Direct methods for depth measurement employed in TJBH relied on CTD profiling but encountered limitations with measurements being restricted to a maximum depth of 200 mbsl to safeguard against potential instrument damage ( Alcérreca-Huerta et al., 2023 ; Flórez-Franco et al., 2023 ). Notably, the measurements could not reach the bottom and confirm its position, leaving the depths of TJBH and the vertical thermohaline structure partially unresolved.

Recent direct measurements with a SWiFT CTD Profiler reveal water depths within the TJBH that surpass not only the previously reported records, but also the maximum water depth record held by the Sansha Yongle Blue Hole (SYBH) at ~301 mbsl in the South China Sea (Li et al., 2018). This finding establishes the TJBH as the deepest-known blue hole globally. The hydrographic data collected are also described to delineate the water temperature and salinity variations along the newly reached depths, the formation of previously unknown pycnoclines, and a comparison of the thermohaline conditions in TJBH with those reported in the literature for Caribbean waters at the Mesoamerican Barrier Reef System and coastal reef lagoons, as a proxy of possible hydraulic connectivity between them and the blue hole.

Cenotes, underground springs, freshwater inlets, and a complex lagoon and anchialine system develop in the southeastern region of the Yucatán Peninsula (Figure 1A). The system connects with Chetumal Bay, a semi-closed mesohaline tropical estuary developed over carbonate sedimentary deposits of the Miocene, Mio-Pliocene and Holocene (Gondwe et al., 2010; Domínguez-Herrera et al., 2023), whose hydrographic conditions are described in Carrillo et al. (2009a), Carrillo et al. (2009b) and Ruíz-Pineda et al. (2016).

Figure 1 (A) Location of the Taam ja’ Blue Hole (TJBH) in Chetumal Bay, Mexico, is presented alongside the CC and CSW data regions for further comparison of water temperature and salinity conditions. Regional fracture zones and geological faults in the Yucatán Peninsula are indicated ( INEGI, 2002 ), along with the locations of documented blue holes within Chetumal Bay. CB data was measured at sampling stations positioned at cardinal positions ~500 m apart of the TJBH (TJBH N , TJBH S , TJBH E and TJBH W ). Images from scuba explorations of the TJBH at depths (B) 5.0 mbsl, (C) 20 mbsl, and (D) 30 mbsl are also presented.

The TJBH (378823 m E, 2059390 m N, UTM 16Q) is located in the central portion of Chetumal Bay, within the Mexican State Reserve “Chetumal Bay-Manatee Sanctuary” (RESMBCH). It is ~4.5 km from Tamalcab island, and ~19.2 km from Chetumal, the most urbanized area. TJBH, Lool ja’ Blue Hole (LJBH), and Ch’och-ja’ Blue Hole (CJBH) are among the blue holes recently documented in Chetumal Bay ( Carrillo et al., 2009b ; Alcérreca-Huerta et al., 2023 ; Flórez-Franco et al., 2023 ) ( Figure 1A ), for which preliminary insights into their geomorphological features, and temporal variability of physicochemical properties have been provided.

Field work and data analysis

On December 6th, 2023, a scuba diving expedition was conducted to identify the environmental conditions prevailing at the TJBH and related to factors such as visibility, substrate characteristics, and wall coverage within a depth range extending from 0 to 30 mbsl. Additionally, on December 6th and 13th, 2023, measurement of new CTD profiles was conducted within the TJBH aiming to reach its bottom and confirm the echo-sounding results described in Alcérreca-Huerta et al. (2023). Employing a SWiFT CTD Profiler (Valeport UK), single profiles at each campaign with simultaneous measurements of water pressure, temperature, and conductivity were acquired throughout the water column of TJBH. The coordinates for the CTD profiles were 378830.7 m E and 2059383.6 m N (UTM 16Q), selected based on preliminary echo sounding measurements that indicated water depths surpassing 250 mbsl. The vessel was anchored to prevent drifting caused by waves and currents. In this specific location, the CTD instrument was lowered, utilizing ~500 m of cable down to the bottom, adhering to the maximum depth supported by the instrument.

Salinity and density values from CTD casts are computed employing the Chen and Millero/UNESCO international algorithm (Chen and Millero, 1977; Fofonoff and Millard, 1983), leading to an accuracy of ±0.01 PSU and ±0.01 kg/m³, respectively. Temperature data from SWiFT CTD Profiler measurements has an accuracy of ±0.01 °C. Data was resampled to achieve a fixed depth resolution of 0.5 m for the calculation of temperature (∂T/∂z), salinity (∂S/∂z), and density (∂ρ/∂z) vertical gradients, to delineate variations in these parameters with depth. The vertical gradient resulted from the absolute difference in a variable quantity over the vertical distance between their resampled measurement locations. Pycnoclines, indicative of density variations, were estimated by considering the maximum vertical density gradient surpassing a defined threshold of δ₁ = 0.5 kg/m⁴ (Read et al., 2011; Flórez-Franco et al., 2023). Building upon the findings by Flórez-Franco et al. (2023), density transition zones are identified assuming a density gradient of δ₂ ≥ 0.05 kg/m⁴.
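The gradient and threshold procedure described above can be sketched in Python with NumPy. This is an illustrative reading of the text, not the authors' code. It assumes depth increases monotonically, uses linear interpolation for the 0.5 m resampling, and groups contiguous transition-zone samples so that each group yields at most one pycnocline. The function names and the synthetic two-layer profile are hypothetical.

```python
import numpy as np


def resample(depth, values, step=0.5):
    """Linearly interpolate a profile onto a fixed depth grid (depth must increase)."""
    grid = np.arange(depth.min(), depth.max() + 1e-9, step)
    return grid, np.interp(grid, depth, values)


def vertical_gradient(grid, values):
    """Absolute difference between neighbouring samples over their spacing."""
    return np.abs(np.diff(values)) / np.diff(grid)


def find_pycnoclines(grid, rho, delta1=0.5, delta2=0.05):
    """Return pycnocline depths and a boolean transition-zone mask.

    A transition zone is any interval with gradient >= delta2 (kg/m^4).
    Within each contiguous zone, the maximum gradient marks a pycnocline
    if it exceeds delta1 (kg/m^4). The grouping rule is an assumption.
    """
    grad = vertical_gradient(grid, rho)
    mid = 0.5 * (grid[:-1] + grid[1:])  # depth at the centre of each interval
    in_zone = grad >= delta2
    pycnoclines = []
    i, n = 0, len(grad)
    while i < n:
        if not in_zone[i]:
            i += 1
            continue
        j = i
        while j < n and in_zone[j]:
            j += 1
        k = i + int(np.argmax(grad[i:j]))
        if grad[k] > delta1:
            pycnoclines.append(float(mid[k]))
        i = j
    return pycnoclines, in_zone


# Synthetic two-layer profile: density jumps by 5 kg/m^3 at 10 m depth.
depth = np.arange(0.0, 20.01, 0.25)
rho = np.where(depth >= 10.0, 1015.0, 1010.0)
grid, rho_r = resample(depth, rho)
pyc, zone = find_pycnoclines(grid, rho_r)
print(pyc)
```

On the synthetic profile the only interval with a non-zero gradient is the one spanning the jump, so a single pycnocline is reported at its midpoint.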

A temperature-salinity diagram was also devised to identify a potential relationship between the waters of the TJBH and those in coastal and open-sea waters in the Caribbean. For this purpose, existing hydrographic data from the Caribbean Surface Water (CSW data) at the Mesoamerican Barrier Reef (0-150 mbsl) delineated in Carrillo et al. (2016) was employed. Insights derived from data detailed in Tovar et al. (2009) , encompassing coastal reef lagoons within the Mexican Caribbean, were considered (CC data). Additionally, existing quarterly data measurements at stations ~500 m apart from the TJBH (i.e., TJBH N , TJBH S , TJBH E , TJBH W ) between March 2021 to December 2023, were used to describe the observed conditions within Chetumal Bay and in the vicinity of the TJBH (CB data). Location of the different comparative study areas (CB, CSW and CC) is depicted in Figure 1A .
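Density contours on a temperature-salinity diagram are commonly drawn from a one-atmosphere equation of state. The sketch below implements the surface-pressure seawater density polynomial of the UNESCO 1980 equation of state as tabulated in Fofonoff and Millard (1983). It deliberately omits the pressure terms, so it is only an approximation of in-situ density at depth and may differ from the processing applied to the published CTD data. The function name `rho_surface` is hypothetical.

```python
def rho_surface(salinity, temp_c):
    """Seawater density (kg/m^3) at one standard atmosphere, EOS-80 form.

    salinity: practical salinity (PSU); temp_c: temperature in deg C.
    Pressure dependence is intentionally ignored in this sketch.
    """
    t, s = temp_c, salinity
    # Density of pure water as a function of temperature.
    rho_w = (999.842594 + 6.793952e-2 * t - 9.095290e-3 * t**2
             + 1.001685e-4 * t**3 - 1.120083e-6 * t**4 + 6.536332e-9 * t**5)
    # Salinity-dependent terms.
    a = (8.24493e-1 - 4.0899e-3 * t + 7.6438e-5 * t**2
         - 8.2467e-7 * t**3 + 5.3875e-9 * t**4)
    b = -5.72466e-3 + 1.0227e-4 * t - 1.6546e-6 * t**2
    c = 4.8314e-4
    return rho_w + a * s + b * s**1.5 + c * s**2


# Widely used check value for this polynomial: S = 35, T = 5 deg C.
print(round(rho_surface(35.0, 5.0), 3))
```

Evaluating the function over a grid of salinity and temperature gives the field from which isopycnals like those in a T-S diagram can be contoured.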

The boundary of TJBH, clearly defined around 5.0 mbsl, features a soft substrate covered by biofilms, which extends across the upper walls of the blue hole (Figure 1B). The turbidity of Chetumal Bay's waters conceals this border from view at the surface; it only becomes clearly visible at depths >4.0 mbsl. The TJBH wall exhibits speleothem-like formations covered by biofilms, yet they are soft, fragile, and prone to collapse (Figure 1C). Beyond 25-30 mbsl, the wall steepens and develops a firm substrate. This substrate occasionally forms a tilted roof largely free of biofilms (i.e. 0-20% coverage), possibly due to limited natural light penetration (Figure 1D).

Profiles and vertical gradients of water temperature, salinity, and density are depicted in Figure 2. The CTD casts on December 6th and 13th, 2023, reached depths of 416.0 and 423.6 mbsl, respectively. Consequently, these new findings establish the Taam Ja’ Blue Hole (TJBH) as the world's deepest known blue hole, with its bottom still not reached.

Figure 2 Vertical profiles and gradients of (A) water temperature, (B) salinity, (C) density, and (D) sound speed measured on 06.12.2023 and 13.12.2023 in TJBH with a CTD profiler. Pycnoclines are given by the maximum density gradient above a threshold δ₁ = 0.5 kg/m⁴. Regions next to the pycnocline locations with a density gradient δ₂ > 0.05 kg/m⁴ (TZ) are also shown.

The CTD measurements revealed a depth shorter than the cable length (~500 m) employed to lower the CTD profiler, indicating an oblique descent of the instrument at an angle of approximately 32.1-33.7° from the vertical. This deviation in orientation could be ascribed to either the specific geomorphology of the Taam Ja’ Blue Hole (TJBH) or the influence of prevailing underwater currents. Moreover, echo sounding data from prior investigations (Alcérreca-Huerta et al., 2023) had reported a maximum depth of 274.4 mbsl, with the deeper regions of the TJBH concentrated predominantly on the northern side, where depths averaged 250 mbsl. This depth coincides with the location of a pycnocline, positioned at a depth of 246.1 mbsl. Consequently, it can be inferred that the echo sounding results reported by Alcérreca-Huerta et al. (2023) might have been affected by a possibly non-strictly vertical morphology of the TJBH or by acoustic scattering caused by fluctuations in water density (Figure 2C, D).
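The quoted angle range follows from simple trigonometry if the cable is treated as a straight, taut line with the full ~500 m paid out. Both are simplifying assumptions, since a real cable sags and the exact length deployed is only approximate. Under them, the angle from the vertical is arccos(depth / cable length). The helper name below is hypothetical.

```python
import math


def descent_angle_deg(depth_m, cable_m):
    """Angle from the vertical (degrees) for a straight cable of given length."""
    if not 0 < depth_m <= cable_m:
        raise ValueError("depth must be positive and no longer than the cable")
    return math.degrees(math.acos(depth_m / cable_m))


for date, depth in [("06.12.2023", 416.0), ("13.12.2023", 423.6)]:
    print(date, round(descent_angle_deg(depth, 500.0), 1))
```

With the reported cast depths of 416.0 and 423.6 mbsl, this gives roughly 33.7° and 32.1°, consistent with the range stated in the text.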

The development of four primary clines with density gradients exceeding 0.5 kg/m⁴ is also shown in Figure 2A–C. Pycnoclines were delineated on average at 4.6-5.3 mbsl for the 1st pycnocline, 246.1 mbsl for the 2nd pycnocline, 323.3 mbsl for the 3rd pycnocline, and 414.5 mbsl for the 4th pycnocline. Transition zones (TZ) between layers above and below the pycnoclines are defined by gradients ∂ρ/∂z > 0.05 kg/m⁴.

The surface water layer (~0-4 mbsl) above the 1st pycnocline exhibits substantial variability in temperature (24.9 to 27.9 °C) and salinity (13.5-15.0 PSU) across measurements. Temperature and salinity variability decreases in the layers below the 1st pycnocline. The layer between pycnoclines 1 and 2, within a depth range of 8 to 236 mbsl, has an average temperature of 24.9±0.30 °C and salinity of 22.2±1.02 PSU. In the layer encompassing depths of 249-313 mbsl (between pycnoclines 2 and 3), the average temperature decreases while salinity increases, with values of 22.3±0.18 °C and 29.5±0.53 PSU, respectively. The layer below, spanning depths of 332-399 mbsl, registers an average salinity of 35.1±0.01 PSU and the lowest average temperature (19.8±0.01 °C). Beyond 400 mbsl, there is a significant increase in temperature within the transition zone, rising from 19.8 to 23.9 °C, accompanied by a salinity increase of up to 37.5 PSU and an average water density of 1027 kg/m³.

Possible hydrographic relationships across the TJBH, Chetumal Bay (CB), the Caribbean Surface Water (CSW) and Mexican Caribbean reef lagoons (CC) are explored in the temperature-salinity diagram in Figure 3. The CB data presents a wide variability of temperature (>25 °C) and salinity (<17 PSU) with water densities below 1010 kg/m³, similar to those observed in the surface layer above the entrance of TJBH. This reflects the influence of the estuarine Chetumal Bay water atop the TJBH entrance.

Figure 3 Temperature-salinity diagram for the water features corresponding to the TJBH. Water temperature and salinity from measured data in Chetumal Bay (CB) between 2021-2023 is also depicted together with data corresponding to the Caribbean Surface Water (CSW) for water depths 2-150 m ( Carrillo et al., 2016 ) and to reef lagoons in the Caribbean Coast (CC) ( Tovar et al., 2009 ). Curves show density in kilograms per cubic meter. Color bar refers to water depth in meters.

Beyond the depth of 400 mbsl within the TJBH, the water conditions gradually converge with those of the Caribbean Sea (CSW and CC, Figure 3). Salinity levels in the Caribbean Surface Water reach up to 36.9 PSU, particularly at depths ranging from 115 to 150 mbsl, where the water densities are on average 1023±0.1 kg/m³ and reach up to 1026 kg/m³. These marine hydrographic values resemble the results obtained from CTD casts within TJBH at depths exceeding 400 mbsl, with an average salinity of 36.0±0.74 PSU and density of 1027±0.3 kg/m³. Similarly, data from the coastal reef lagoons of the Mexican Caribbean describe an average salinity value of 36.0±0.53 PSU, accompanied by water temperatures surpassing 18.3 °C and averaging approximately 27.9±2.48 °C. Coastal reef hydrographic data represent shallow areas (less than 9.5 mbsl) showing a wider range of density values between 1020 and 1026 kg/m³ with a mean value of 1023±0.8 kg/m³. This alignment of the data suggests a potential subterranean connection between these water bodies and the TJBH.

Discussion and concluding remarks

Hydrogeology and geomorphology of karst systems such as blue holes are highly valuable with implications for water resources, biodiversity, or physicochemical and geological processes. The initial results in Alcérreca-Huerta et al. (2023) yielded preliminary insights into the geomorphology, depths, and water properties of TJBH. Confirmation of the maximum depth was not possible due to instrumental limitations during the scientific expeditions in 2021, prompting the need for further exploration and analysis.

The recent records from CTD profiling in 2023 verify that the TJBH is now the deepest blue hole discovered to date, exhibiting water depths surpassing 420 mbsl, with its bottom yet to be reached. In line with the approach undertaken by Li et al. (2018), further investigations should incorporate advanced underwater navigation technologies in conjunction with CTD profilers. This integrated methodology would allow an accurate three-dimensional spatial representation of the TJBH, leading to a detailed analysis of its geomorphological features and water depths.

CTD measurements provided valuable insight into the temperature–salinity stratification of the TJBH, contributing to a more comprehensive understanding of its hydrographical characteristics. Variations in temperature and salinity within the water layers of the TJBH and the development of pycnoclines offered insights into the TJBH in relation to surrounding marine environments. In this regard, the CTD measurements hint at potential yet undiscovered connections with the seawater of either the coastal reef lagoons or deeper coastal zones of the Mesoamerican Barrier Reef System. The notable increase of temperature (~ΔT>4.0 °C) and salinity (up to 37.5 PSU) at depths beyond 400 mbsl could be related to these connections. The increase in salinity may stem from various mechanisms, as delineated by Fleury et al. (2007). These mechanisms could include salinization processes triggered by the inflow of marine water through a Venturi effect, water density differences (Mijatovic, 1962; Fleury et al., 2007), or the difference in hydraulic head as long as that of the seawater is higher than that of the freshwater (Whitaker and Smart, 1997). Specific thermal features could also be related to geological, volcanic or tectonic processes in relation to water circulation (Šušmelj et al., 2024). The increase in water temperature at depths >400 mbsl in TJBH could be hypothesized to resemble that observed in the Floridian aquifer (Meyer, 1989; Fleury et al., 2007), where geothermal activity warms cold seawater at deep layers, prompting its upward movement through existing sinkholes or fractures at confining units. Subsequent interaction with the aquifer and the presence of further hydraulic connections with seawater could occur at upper layers, resulting in a reduction of the water temperature. 
This geothermal activity and the recharging areas from seawater have been related with fracture and fault zones in Florida ( Whitaker and Smart, 1997 ) and the Northern Adriatic Sea ( Šušmelj et al., 2024 ).

Research on blue holes encompasses a series of ambitious exploration projects, often spanning several years or even decades, as occurred for the SYBH (e.g. Li et al., 2018; He et al., 2019; Xie et al., 2019; He et al., 2020; Jinwei et al., 2022; Chen et al., 2023) or the Bahamian blue holes (e.g. Bottrell et al., 1991; Mylroie, 2008; Gonzalez et al., 2011; Vimpere, 2017; van Hengstum et al., 2020; Sha et al., 2021). Moreover, the exploration and research of inland vertical caves, such as the Krubera–Voronya, the world's deepest known cave with a depth of 2191 meters, has continually set successive new depth records since the 1960s (Klimchouk et al., 2009; Klimchouk, 2019). This shows the need for continuous exploration of these karst geological structures, their intricate geomorphology, and the development of cave branches. In delving into the underwater spatial geomorphology of TJBH, the focus is on deciphering its maximum depth and the possibility that it forms part of an intricate and potentially interconnected underwater system of caves and tunnels.

Therefore, the new findings and the challenging depths discovered in TJBH entail a multifaceted inquiry encompassing various scientific dimensions. Efforts should extend to unravelling the hydrogeology, stratification, and mixing processes within TJBH, delineating their relationship with regional water bodies, hydraulic connections, water quality dynamics, and water residence times. The depths of TJBH could also host a biodiversity to be explored and linked to physicochemical and geomorphological processes, forming a unique biotope. Geological studies should extend to understanding TJBH's relationship with the fault and fracture system of the region (i.e. the Rio Hondo Fault Zone), with implications for its origin. Analyses are needed to describe the stratigraphic sequence within TJBH and potential connections between TJBH, other blue holes and cenotes in or near Chetumal Bay. Thus, uncovering the challenges and mysteries concealed in TJBH calls for further exploration, monitoring, and scientific inquiry.

Data availability statement

The datasets presented in this article are not readily available because the data belong to a project funded by the authors. Once published, data eventually will be shared on the institutional data reservoir. Requests to access the datasets should be directed to Laura Carrillo, [email protected].

Author contributions

JA: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Validation, Software, Visualization, Writing – original draft. OR: Investigation, Methodology, Writing – review & editing, Writing – original draft. JS: Investigation, Methodology, Writing – review & editing. TÁ: Investigation, Methodology, Writing – review & editing, Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Validation. LC: Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Writing – review & editing, Data curation, Formal analysis.

Funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. The first author personally funded fieldwork expenses during the survey. The fifth author funded the acquisition of the SWiFT CTD Profiler for hydrographic measurements. APCs were funded by El Colegio de la Frontera Sur.

Acknowledgments

The support of Mr. Jesús Artemio Poot Villa (COBIA Team), who provided navigation services and support during field surveys, is gratefully acknowledged. The technical support of Johnny Omar Valdez from UNAM-UMDI during the fieldwork and scuba explorations in TJBH is highly appreciated. Permissions from and collaboration with IBANQROO (Institute of Biodiversity and Protected Areas of the State of Quintana Roo) are acknowledged. Recognition is given to the CONAHCYT (Mexican National Council of Humanities, Sciences and Technologies) program ‘Investigadoras e Investigadores por México’ (Project 761).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Alcérreca-Huerta J. C., Álvarez-Legorreta T., Carrillo L., Flórez-Franco L. M., Reyes-Mendoza O. F., Sánchez-Sánchez J. A. (2023). First insights into an exceptionally deep blue hole in the Western Caribbean: The Taam ja’ Blue Hole. Front. Mar. Sci. 10. doi: 10.3389/fmars.2023.1141160

CrossRef Full Text | Google Scholar

Bauer-Gottwein P., Gondwe B. R. N., Charvet G., Marín L. E., Rebolledo-Vieyra M., Merediz-Alonzo G. (2011a). Review : The Yucatán Peninsula karst aquifer , Mexico. Hydrogeol J. 19, 507–524. doi: 10.1007/s10040-010-0699-5

Benítez S., Iliffe T. M., Quiroz-Martínez B., Alvarez F. (2019). How is the anchialine fauna distributed within a cave? A study of the Ox Bel Ha System, Yucatan Peninsula, Mexico. Subterr Biol. 31, 15–28. doi: 10.3897/subtbiol.31.34347

Björnerås C., Škerlep M., Gollnisch R., Herzog S. D., Ekelund Ugge G., Hegg A., et al. (2020). Inland blue holes of The Bahamas – chemistry and biology in a unique aquatic environment. Fundam. Appl. Limnology 194, 95–106. doi: 10.1127/fal/2020/1330

Bottrell S. H., Smart P. L., Whitaker F., Raiswell R. (1991). Geochemistry and isotope systematics of sulphur in the mixing zone of Bahamian blue holes. Appl. Geochemistry 6, 97–103. doi: 10.1016/0883-2927(91)90066-X

Carrillo L., Johns E. M., Smith R. H., Lamkin J. T., Largier J. L. (2016). Pathways and hydrography in the Mesoamerican Barrier Reef System Part 2: Water masses and thermohaline structure. Cont Shelf Res. 120, 41–58. doi: 10.1016/j.csr.2016.03.014

Carrillo L., Palacios-Hernández E., Ramírez A. M., Morales-Vela B. (2009a). “Características hidrometeorológicas y batimétricas,” in El sistema ecológico de la bahía de Chetumal / Corozal: costa occidental del Mar Caribe . Eds. Espinoza-Avalos J., Islebe G., Hernández-Arana H. A. (ECOSUR, Chetumal, Mexico), 12–20.

Google Scholar

Carrillo L., Palacios-Hernández E., Yescas M., Ramírez-Manguilar A. M. (2009b). Spatial and seasonal patterns of salinity in a large and shallow tropical estuary of the western caribbean. Estuaries Coasts 32, 906–916. doi: 10.1007/s12237-009-9196-2

Cejudo E., Ortega-Almazán P. J., Ortega-Camacho D., Acosta-González G. (2022). Hydrochemistry and water isotopes of a deep sinkhole in north Quintana Roo, Mexico. J. South Am. Earth Sci. 116, 103846. doi: 10.1016/j.jsames.2022.103846

Chen C.-T., Millero F. J. (1977). Speed of sound in seawater at high pressures. J. Acoust Soc. Am. 62, 1129–1135. doi: 10.1121/1.381646

Chen L., Yao P., Yang Z., Fu L. (2023). Seasonal and vertical variations of nutrient cycling in the world’s deepest blue hole. Front. Mar. Sci. 10, 1172475. doi: 10.3389/fmars.2023.1172475

Colbo K., Ross T., Brown C., Weber T. (2014). A review of oceanographic applications of water column data from multibeam echosounders. Estuarine, Coastal Shelf Sci. 145, 41–56. doi: 10.1016/j.ecss.2014.04.002

Domínguez-Herrera E., Luna-gonzález L., Velázquez-Torres D. (2023). Mapa de distribución de geodiversidad de Quintana Roo, México, escala 1:800,000. Terra Digitalis 7 (1), 1–17. doi: 10.22201/igg.25940694e.2023.1.99

Fleury P., Bakalowicz M., de Marsily G. (2007). Submarine springs and coastal karst aquifers: a review. J. Hydrol (Amst) 339, 79–92. doi: 10.1016/j.jhydrol.2007.03.009

Flórez-Franco L. M., Alcérreca-Huerta J. C., Reyes-Mendoza O. F., Sánchez-Sánchez J. A., Álvarez-Legorreta T., Carrillo L. (2023). Coastal blue holes in a large and shallow tropical estuary: geomorphometry and temporal variability of the physicochemical properties. Estuaries Coasts . 47, 686–700. doi: 10.1007/s12237-023-01304-9

Fofonoff N. P., Millard J. R.C. (1983). Algorithms for computation of fundamental properties of seawate). UNESCO Tech. papers Mar. science. 44, 53. doi: 10.25607/OBP-1450

Gischler E., Golubic S., Gibson M. A., Oschmann W., Hudson J. H. (2011). “Microbial mats and microbialites in the freshwater laguna bacalar, yucatan peninsula, Mexico,” in Advances in stromatolite geobiology. Lecture notes in earth sciences , vol. 131. (Springer, Berlin, Heidelberg), 187–205. doi: 10.1007/978-3-642-10415-2_13

Gondwe B. R. N., Lerer S., Stisen S., Marín L., Rebolledo-Vieyra M., Merediz-Alonso G., et al. (2010). Hydrogeology of the south-eastern Yucatan Peninsula: New insights from water level measurements, geochemistry, geophysics and remote sensing. J. Hydrol (Amst) 389, 1–17. doi: 10.1016/j.jhydrol.2010.04.044

Gondwe B. R. N., Merediz-Alonso G., Bauer-Gottwein P. (2011). The influence of conceptual model uncertainty on management decisions for a groundwater-dependent ecosystem in karst. J. Hydrol (Amst) 400, 24–40. doi: 10.1016/j.jhydrol.2011.01.023

Gonzalez B. C., Iliffe T. M., Macalady J. L., Schaperdoth I., Kakuk B. (2011). Microbial hotspots in anchialine blue holes: initial discoveries from the Bahamas. Hydrobiologia 677, 149–156. doi: 10.1007/s10750-011-0932-9

He H., Fu L., Liu Q., Fu L., Bi N., Yang Z., et al. (2019). Community structure, abundance and potential functions of bacteria and archaea in the sansha yongle blue hole, xisha, south China sea. Front. Microbiol. 10. doi: 10.3389/fmicb.2019.02404

He P., Xie L., Zhang X., Li J., Lin X., Pu X., et al. (2020). Microbial diversity and metabolic potential in the stratified sansha yongle blue hole in the south China sea. Sci. Rep. 10, 5949. doi: 10.1038/s41598-020-62411-2

PubMed Abstract | CrossRef Full Text | Google Scholar

Husson L., Pastier A., Pedoja K., Elliot M., Paillard D., Authemayou C., et al. (2018). Reef carbonate productivity during quaternary sea level oscillations. Geochemistry, geophysics. Geosystems 19, 1148–1164. doi: 10.1002/2017GC007335

INEGI (2002). Conjunto de datos vectoriales geológicos, continuo nacional. Fallas-fracturas Vol. 1 (Mexico: Instituto Nacional de Estadística y Geografía), 1000000.

Jinwei G., Tengfei F., Minghui Z., Hanyu Z., Liyan T. (2022). Preliminary study on formation process of Sansha Yongle Blue Hole. J. Trop. Oceanography 41, 171–183. doi: 10.11978/2021077

Kambesis P. N., Coke VI, J. G. (2013). “Overview of the controls on eogenetic cave and karst development in quintana roo, Mexico,” in Coastal karst landforms . Eds. Lace M. J., Mylroie J. E. (Springer, New York, London), 347–373. doi: 10.1007/978-94-007-5016-6_16

Klimchouk A. (2019). “Krubera (Voronja) cave,” in Encyclopedia of caves (Elsevier), 627–634. doi: 10.1016/B978-0-12-814124-3.00074-1

Klimchouk A., Samokhin G. V., Kasian Y. M. (2009). “The deepest cave in the world in the Arabika massif (Western Caucasus) and its hydrogeological and paleogeographic significance,” in ICS Proceedings, 15th International Congress of Speleology. 898–905 (USA: Kerrville).

Li T., Feng A., Liu Y., Li Z., Guo K., Jiang W., et al. (2018). Three-dimensional (3D) morphology of Sansha Yongle Blue Hole in the South China Sea revealed by underwater remotely operated vehicle. Sci. Rep. 8, 17122. doi: 10.1038/s41598-018-35220-x

Little S. N., van Hengstum P. J., Beddows P. A., Donnelly J. P., Winkler T. S., Albury N. A. (2021). Unique habitat for benthic foraminifera in subtidal blue holes on carbonate platforms. Front. Ecol. Evol. 9. doi: 10.3389/fevo.2021.794728

Meyer F. W. (1989). Hydrogeology, ground-water movement, and subsurface storage in the Floridian aquifer system in Southern Florida. US Geological Survey Prof. Paper 1403-G, 64.

Mijatovic B. F. (1962). “Contribution a la solution qualitative du problème de l’èquilibre hydraulique de l’eau douce et salèe dans les collecteurs du karst littoral,” in Association Internationnale des Hydrogèologues Publ., Rèunion d’Athènes (Greek Institute for Geology and Subsurface Research, Athènes), 184–193.

Mylroie J. E. (2008). Late Quaternary sea-level position : Evidence from Bahamian carbonate deposition and dissolution cycles. Quaternary Int. 183, 61–75. doi: 10.1016/j.quaint.2007.06.030

Perry E., Leal-Bautista R. M., Velázquez-Olimán G., Sánchez-Sánchez J. A., Wagner N. (2021). Aspects of the hydrogeology of southern campeche and quintana roo, Mexico. Boletín la Sociedad Geológica Mexicana 73, A011020. doi: 10.18268/BSGM2021v73n1a011020

Perry E., Paytan A., Pedersen B., Velazquez-Oliman G. (2009). Groundwater geochemistry of the Yucatan Peninsula, Mexico: Constraints on stratigraphy and hydrogeology. J. Hydrol (Amst) 367, 27–40. doi: 10.1016/j.jhydrol.2008.12.026

Perry E., Velazquez-Oliman G., Marin L. (2002). ). The hydrogeochemistry of the karst aquifer system of the Northern Yucatan Peninsula, Mexico. Int. Geol Rev. 44, 191–221. doi: 10.2747/0020-6814.44.3.191

Perry E., Velazquez-Oliman G., Socki R. (2003). “Hydrogeology of the yucatán peninsula,” in The lowland Maya: three millennia at the human–wildland interface . Eds. Gomez-Pompa A. ,. M., Allen S., Fedick, Jimenez-Osornio J. (Food Products Press, London), 115–138.

Read J. S., Hamilton D. P., Jones I. D., Muraoka K., Winslow L. A., Kroiss R., et al. (2011). Derivation of lake mixing and stratification indices from high-resolution lake buoy data. Environ. Model. Software 26, 1325–1336. doi: 10.1016/j.envsoft.2011.05.006

Ruíz-Pineda C., Suárez-Morales E., Gasca R. (2016). Copépodos planctónicos de la Bahía de Chetumal, Caribe Mexicano: variaciones estacionales durante un ciclo anual. Rev. Biol. Mar. Oceanogr 51, 301–316. doi: 10.4067/S0718-19572016000200008

Schmitt D., Gischler E., Walkenfort D. (2021). Holocene sediments of an inundated sinkhole: facies analysis of the “Great Blue Hole”, Lighthouse Reef, Belize. Facies 67, 10. doi: 10.1007/s10347-020-00615-8

Sha Y., Zhang H., Lee M., Björnerås C., Škerlep M., Gollnisch R., et al. (2021). Diel vertical migration of copepods and its environmental drivers in subtropical Bahamian blue holes. Aquat Ecol. 55, 1157–1169. doi: 10.1007/s10452-020-09807-4

Supper R., Motschka K., Ahl A., Bauer-Gottwein P., Gondwe B., Alonso G. M., et al. (2009). Spatial mapping of submerged cave systems by means of airborne electromagnetics: an emerging technology to support protection of endangered karst aquifers. Near Surface Geophysics 7, 613–627. doi: 10.3997/1873-0604.2009008

Šušmelj K., Čenčur Curk B., Kanduč T., Rožič B., Verbovšek T., Vreča P., et al. (2024). Hydrogeochemical conditions of submarine and terrestrial karst sulfur springs in the Northern Adriatic. Environ. Earth Sci. 83, 214. doi: 10.1007/s12665-024-11476-7

Tovar E., Suárez-Morales E., Carrillo L. (2009). Multiscale variability of the Chaetognatha along a Caribbean reef lagoon system. Mar. Ecol. Prog. Ser. 375, 151–160. doi: 10.3354/meps07770

van Hengstum P. J., Reinhardt E. G., Beddows P. A., Gabriel J. J. (2010). Linkages between Holocene paleoclimate and paleohydrogeology preserved in a Yucatan underwater cave. Quaternary Sci. Reviews2 29, 2788–2798. doi: 10.1016/j.quascirev.2010.06.034

van Hengstum P. J., Scott D. B., Gröcke D. R., Charette M. A. (2011). Sea level controls sedimentation and environments in coastal caves and sinkholes. Mar. Geol 286, 35–50. doi: 10.1016/j.margeo.2011.05.004

van Hengstum P. J., Winkler T. S., Tamalavage A. E., Sullivan R. M., Little S. N., MacDonald D., et al. (2020). Holocene sedimentation in a blue hole surrounded by carbonate tidal flats in The Bahamas: Autogenic versus allogenic processes. Mar. Geol 419. doi: 10.1016/j.margeo.2019.106051

Vimpere L. (2017). Stratigraphy and sedimentology of Quaternary carbonate units around and whitin Deans’s Blue Hole, Long Island, Bahamas (Switzerland: University of Geneva, Faculty of Sciences).

Wallace E., Donnelly J., van Hengstum P., Winkler T., Dizon C., LaBella A., et al. (2021). Regional shifts in paleohurricane activity over the last 1500 years derived from blue hole sediments offshore of Middle Caicos Island. Quat Sci. Rev. 268, 107126. doi: 10.1016/j.quascirev.2021.107126

Weber B., Scherer E. E., Martens U. K., Mezger K. (2012). Where did the lower Paleozoic rocks of Yucatan come from? A U-Pb, Lu-Hf, and Sm-Nd isotope study. Chem. Geol 312–313, 1–17. doi: 10.1016/j.chemgeo.2012.04.010

Whitaker F. F., Smart P. L. (1997). Groundwater circulation and geochemistry of a karstified bank–marginal fracture system, South Andros Island, Bahamas. J. Hydrol (Amst) 197, 293–315. doi: 10.1016/S0022-1694(96)03274-X

Xie L., Wang B., Pu X., Xin M., He P., Li C., et al. (2019). Hydrochemical properties and chemocline of the Sansha Yongle Blue Hole in the South China Sea. Sci. Total Environ. 649, 1281–1292. doi: 10.1016/j.scitotenv.2018.08.333

Keywords: coastal karst structures, underwater geomorphology, blue holes, Yucatán Peninsula, Mexican Caribbean, cave system, anchialine system

Citation: Alcérreca-Huerta JC, Reyes-Mendoza OF, Sánchez-Sánchez JA, Álvarez-Legorreta T and Carrillo L (2024) Recent records of thermohaline profiles and water depth in the Taam ja’ Blue Hole (Chetumal Bay, Mexico). Front. Mar. Sci. 11:1387235. doi: 10.3389/fmars.2024.1387235

Received: 17 February 2024; Accepted: 15 April 2024; Published: 29 April 2024.

Copyright © 2024 Alcérreca-Huerta, Reyes-Mendoza, Sánchez-Sánchez, Álvarez-Legorreta and Carrillo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Laura Carrillo, [email protected]

COMMENTS

Data Structures and Algorithms authors/titles recent submissions
The PRODSAT phase of random quantum satisfiability. Joon Lee, Nicolas Macris, Jean Bernoulli Ravelomanana, Perrine Vantalon. Subjects: Information Theory (cs.IT); Data Structures and Algorithms (cs.DS); Quantum Physics (quant-ph)
data structure Latest Research Papers
Our compressed structure allows for directed and undirected graphs, faster arc and neighborhood queries, as well as the ability for arcs and frames to be added and removed directly from the compressed structure (streaming operations). We use publicly available network data sets such as Flickr, Yahoo!, and Wikipedia in our experiments and show ...
Data Structures and Algorithms authors/titles Sep 2022
Authors: Kristóf Bérczi, Alexander Göke, Lydia Mirabel Mendoza-Cadena, Matthias Mnich. Subjects: Data Structures and Algorithms (cs.DS) [18] arXiv:2209.02990 [pdf, other] Title: Õptimal Vertex Fault-Tolerant Spanners in Õptimal Time: Sequential, Distributed and Parallel. Authors: Merav Parter.
On the performance of learned data structures
1. Introduction. Very recently, the unexpected combination of data structures and Machine Learning (ML) has led to the development of a new area of algorithmic research, called learned data structures. The key design idea consists of augmenting, and sometimes even replacing, classic building blocks of data structures, such as arrays, trees or hash tables, with ML models, which are better ...
Algorithms and Data Structures for New Models of Computation
In the early days of computer science, the community settled on a simple standard model of computing and a basic canon of general purpose algorithms and data structures suited to that model. With isochronous computing, heterogeneous multiprocessors, flash memory, energy-aware computing, cache and other anisotropic memory, distributed computing, streaming environments, functional languages ...
125417 PDFs
Data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently. | Explore the latest full-text research PDFs, articles, conference papers ...
Algorithms
Special Issue Information. Dear Colleagues, Machine learning is the study of computer algorithms that allow computer programs to improve automatically through experience. Machine learning algorithms build a model based on training data to make predictions or decisions without being explicitly programmed to do so.
tree data structure Latest Research Papers
Time Algorithm . Tree Data . Tree Data Structure. This paper describes a polynomial time algorithm for solving graph isomorphism and automorphism. We introduce a new tree data structure called Walk Length Tree. We show that such tree can be both constructed and compared with another in polynomial time.
efficient data structures Latest Research Papers
The task is to construct a data structure over T answering the following type of online queries efficiently. Given a range [α, β], return a shortest substring T[i, j] of T with exactly one occurrence in [α, β]. We present an O(n log n)-word data structure with O(log_w n) query time, where w = Ω(log n) is the word size.
Algorithms
Feature papers represent the most advanced research with significant potential for high impact in the field. A Feature Paper should be a substantial original Article that involves several techniques or approaches, provides an outlook for future research directions and describes possible research applications. ... Data structures provide ways of ...
Algorithms and Data Structures
The papers in these RC 2023 proceedings present original research on the theory, design and application of algorithms and data structures. ... This book constitutes the refereed proceedings of the 18th International Symposium on Algorithms and Data Structures, WADS 2023, held during July 31-August 2, 2023.
(PDF) Data Structure: Theoretical Approach
Data structures are generally based on the ability of a computer to fetch and store data at any place in its memory, specified by a pointer, a bit string representing a memory address ...
(PDF) DATA STRUCTURES FOR MODERN APPLICATIONS
PDF | This book contains the following chapters: Chapter 1: Introduction Chapter 2: Data Structures And Algorithms Chapter 3: Data Structures And Its... | Find, read and cite all the research you ...
Data Structures and Algorithms authors/titles Jan 2019
Danny Hucke, Markus Lohrey, Louisa Seelbach Benkner. Comments: A short version of this paper appeared in the IEEE Proceedings of ISIT 2019. Subjects: Data Structures and Algorithms (cs.DS); Information Theory (cs.IT) [20] arXiv:1901.03254 [ pdf, other]
Recent Studies About Teaching Algorithms (CS1) and Data Structures (CS2
This Research Full Paper presents a review of recent studies on SIGCSE about teaching programming (CS1) and data structures (CS2) for university students in computer science courses. Our main contribution is the identification of three categories and their respective subcategories for teaching programming: (i) characterization of contents, (ii) identification of pedagogical strategies and (iii ...
Must read research papers on Data Structures
The must read research papers on Data Structures are: Ordered Hash Table (1973) Randomized Search Trees (1989) EERTREE: An Efficient Data Structure for Processing Palindromes in Strings (2015) Making data structures persistent (1986) Design and implementation of an efficient priority queue (1976)
PDF A Survey Paper on Data Structure and Algorithm Visualization
and teachers visualize data structures and algorithms with their real-life implementation. Keywords: Data Structures, Algorithms, Visualization, Real-Life Implementation. I. INTRODUCTION Data structures and algorithms play a major role in Computer Science and also help people to get hired. Data structure and Algorithms are the foundation.
(Data) STRUCTURES
Abstract: We show that a large fraction of the data-structure lower bounds known today in fact follow by reduction from the communication complexity of lopsided (asymmetric) set disjointness! This includes lower bounds for: (a) high-dimensional problems, where the goal is to show large space lower bounds; (b) constant-dimensional geometric problems, where the goal is to bound the query time ...
[0801.2378] String algorithms and data structures
The string-matching field has grown to such a complicated stage that various issues come into play when studying it: data structure and algorithmic design, database principles, compression techniques, architectural features, cache and prefetching policies. The expertise nowadays required to design good string data structures and algorithms is therefore transversal to many computer science ...
Data Science Journal
The CODATA Data Science Journal is a peer-reviewed, open access, electronic journal, publishing papers on the management, dissemination, use and reuse of research data and databases across all research domains, including science, technology, the humanities and the arts. The scope of the journal includes descriptions of data systems, their implementations and their publication, applications ...
AlphaFold 3 predicts the structure and interactions of all of life's
Google DeepMind's newly launched AlphaFold Server is the most accurate tool in the world for predicting how proteins interact with other molecules throughout the cell. It is a free platform that scientists around the world can use for non-commercial research. With just a few clicks, biologists can harness the power of AlphaFold 3 to model structures composed of proteins, DNA, RNA and a ...
linked list Latest Research Papers
Linked Data . Data Models . The State . Sequential Data . Linked List . Knowledge Graphs . State Of Affairs. Sequences are among the most important data structures in computer science. In the Semantic Web, however, little attention has been given to Sequential Linked Data.
[2102.11169] Recent Advances in Fully Dynamic Graph Algorithms
In recent years, significant advances have been made in the design and analysis of fully dynamic algorithms. However, these theoretical results have received very little attention from the practical perspective. Few of the algorithms are implemented and tested on real datasets, and their practical potential is far from understood. Here, we present a quick reference guide to recent engineering ...
Welcome to the Purdue Online Writing Lab
The Online Writing Lab at Purdue University houses writing resources and instructional material, and we provide these as a free service of the Writing Lab at Purdue.
Frontiers
Coastal karst structures have been recently explored and documented in Chetumal Bay, Mexico, at the southeast of the Yucatan Peninsula. These structures, recognized as blue holes, stand out for their remarkable dimensions within a shallow estuarine environment. Particularly the Taam Ja' Blue Hole (TJBH), revealed a depth of ~274 mbsl based on echo sounder mapping, momentarily positioning it ...