Skin cancer classification with deep learning: a systematic review.
Skin cancer is one of the most dangerous diseases in the world. Correctly classifying skin lesions at an early stage could aid clinical decision-making by providing an accurate disease diagnosis, potentially increasing the chances of cure before cancer spreads. However, achieving automatic skin cancer classification is difficult because the majority of skin disease images used for training are imbalanced and in short supply; meanwhile, the model’s cross-domain adaptability and robustness are also critical challenges. Recently, many deep learning-based methods have been widely used in skin cancer classification to solve the above issues and achieve satisfactory results. Nonetheless, reviews that include the abovementioned frontier problems in skin cancer classification are still scarce. Therefore, in this article, we provide a comprehensive overview of the latest deep learning-based algorithms for skin cancer classification. We begin with an overview of three types of dermatological images, followed by a list of publicly available datasets relating to skin cancers. After that, we review the successful applications of typical convolutional neural networks for skin cancer classification. As a highlight of this paper, we next summarize several frontier problems, including data imbalance, data limitation, domain adaptation, model robustness, and model efficiency, followed by corresponding solutions in the skin cancer classification task. Finally, by summarizing different deep learning-based methods to solve the frontier challenges in skin cancer classification, we can conclude that the general development direction of these approaches is structured, lightweight, and multimodal. Besides, for readers’ convenience, we have summarized our findings in figures and tables. Considering the growing popularity of deep learning, there are still many issues to overcome as well as chances to pursue in the future.
Given the rising prevalence of skin cancer and the importance of early detection, it is crucial to develop an effective method to automatically classify skin cancer. As the largest organ of the human body ( 1 ), the skin shoulders the responsibility of protecting other human systems, which increases its vulnerability to disease ( 2 ). Melanoma alone accounted for approximately 300,000 new cases ( 3 ) diagnosed globally in 2018. In addition to melanoma, two other major skin cancers, basal cell carcinoma (BCC) and squamous cell carcinoma (SCC), also had a relatively high incidence, with over 1 million cases in 2018 ( 4 ). As ( 5 ) reported, more skin cancers are diagnosed each year in the United States than all other cancers combined. Fortunately, if skin cancer is detected early, the chances of cure are greatly improved. According to ( 4 ), melanoma has a 5-year survival rate of 99% when it has not metastasized; if it metastasizes to other organs in the body, the survival rate falls to 20%. However, because early indications of skin cancer are not always visible, diagnostic results often depend on the dermatologist’s expertise ( 6 ). For inexperienced practitioners, an automatic diagnosis system is an essential tool for more accurate diagnoses. Beyond that, diagnosing skin cancer with the naked eye is highly subjective and rarely generalizable ( 7 ). Therefore, it is necessary to develop an automatic classification method for skin cancer that is more accurate, less expensive, and quicker to diagnose ( 8 ). Besides, implementing such automated diagnostic systems can effectively reduce mortality from skin cancers, benefiting both patients and healthcare systems ( 9 ).
However, owing to the complexity and diversity of skin disease images, achieving automatic classification of skin cancer is challenging. First, different skin lesions share many interclass similarities, which can result in misdiagnosis ( 10 ). For example, there exist various mimics of BCC in histopathological images, such as SCC and other skin diseases ( 11 ). As a result, it is difficult for diagnosis systems to effectively discriminate skin malignancies from their known imitators. Second, lesions within the same class can differ in color, feature, structure, size, and location ( 12 ). For example, the appearance of BCC varies considerably across its subcategories, which makes it difficult to classify different subcategories of the same category. Furthermore, classification algorithms are highly sensitive to the type of camera device used to capture images: when the test images come from a different domain, their performance suffers ( 13 ).
Although traditional machine learning approaches are capable of performing well in particular skin cancer classification tasks, these algorithms are ineffective for complicated diagnostic demands in clinical practice. Traditional machine learning methods for skin cancer diagnosis typically involve extracting features from skin-disease images and then classifying the extracted features ( 14 ). For example, ABCD Rule ( 15 ), Menzies Method ( 16 ), and 7-Point Checklist ( 17 ) are effective methods for extracting various features from skin disease images. The handcrafted features are then classified using several classification methods such as SVM ( 18 ), XGBoost ( 19 ), and decision tree ( 20 ). Due to the restricted number of selected features, machine learning algorithms can often only classify a subset of skin cancer diseases and cannot generalize to a broader range of disease types ( 21 ). Besides, given the wide variety of skin cancers, it is not effective to identify each form of cancer solely based on handcrafted features ( 22 ).
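To make the handcrafted-feature pipeline concrete, the sketch below scores a lesion with the ABCD Rule. It assumes the commonly quoted weighting (1.3, 0.1, 0.5, 0.5) and cut-offs (4.75 and 5.45) for the total dermoscopy score; the function names and the example sub-scores are illustrative and are not taken from any of the reviewed systems.

```python
def total_dermoscopy_score(asymmetry, border, colors, structures):
    """ABCD-rule total dermoscopy score (TDS) from four handcrafted sub-scores.

    asymmetry:  0-2 (number of axes along which the lesion is asymmetric)
    border:     0-8 (segments with an abrupt pigment cut-off)
    colors:     1-6 (number of distinct colors present)
    structures: 1-5 (number of distinct dermoscopic structures present)
    """
    return 1.3 * asymmetry + 0.1 * border + 0.5 * colors + 0.5 * structures


def interpret_tds(tds):
    """Map a TDS to a coarse label using commonly quoted cut-offs (assumed here)."""
    if tds < 4.75:
        return "benign"
    if tds <= 5.45:
        return "suspicious"
    return "suspected melanoma"


# Hypothetical sub-scores for a single lesion.
score = total_dermoscopy_score(asymmetry=2, border=5, colors=4, structures=3)
print(round(score, 2), interpret_tds(score))
```

Because the decision depends on only four hand-measured quantities, a rule of this kind is easy to audit but cannot represent lesion types whose distinguishing traits fall outside those four features, which is the limitation described above.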
Without the need for domain expertise and feature extraction, deep learning algorithms have been widely used for skin cancer classification in recent years; however, there are still several difficulties and challenges ahead. Compared with traditional machine learning methods, deep learning algorithms can analyze data from a large-scale dataset faster and more accurately, which allows them to effectively extract relevant characteristics ( 23 ). At the same time, deep learning algorithms can also aid clinicians in more thorough data analysis and examination of test results ( 24 ). A number of studies, such as ( 25 – 27 ) demonstrated that deep learning algorithms can diagnose at a level comparable to that of a dermatologist. However, these algorithms still have many obstacles to becoming a complete diagnostic system. Firstly, data imbalance and the lack of a large volume of labeled images have hindered the widespread use of deep learning methods in skin cancer classification ( 12 ). When these algorithms are used to classify skin cancers that are rare in the training dataset, they frequently result in a misdiagnosis ( 28 ). Furthermore, when working with high-resolution images (such as pathological images) with millions of pixels, the deep learning models often result in significant computing costs and additional training time ( 29 ). Besides, different noises will be generated as a result of the various conditions (such as different imaging devices, backgrounds). Therefore, the robustness and generalization ability of these algorithms should also be taken into account ( 30 ).
In recent years, a number of reviews detailing the diagnostic breakthroughs in skin cancer classification have been published; however, no review has provided a specific analysis of frontier challenges in skin cancer classification tasks, such as data imbalance and limitation, domain adaptability, model robustness, and model efficiency. ( 31 ) reviewed the recent developments in skin lesion classification using dermoscopic images. ( 32 ) presented a detailed overview of studies on using CNNs to classify skin lesions. ( 33 ) showed how the use of CNNs in correctly identifying skin cancer has developed. ( 34 ) presented a review of different machine learning algorithms in dermatology diagnosis, as well as some of the obstacles and limitations. ( 12 ) and ( 28 ) summarized a number of deep learning-based approaches for skin cancer classification, as well as various challenges and difficulties. ( 35 ) provided an in-depth review of the current articles about melanoma classification and compared their results with human experts. ( 36 ) summarized the latest CNN-based methods in skin lesion classification by utilizing image data and patient data. ( 37 ) provided a review of deep learning-based methods for early diagnosis of skin cancer. We present these relevant surveys with details and highlights in Table 1 . By summarizing the previous reviews, we find that all of the preceding publications methodically studied a specific topic in skin cancer classification. However, most of them treated skin cancer classification as a classical classification problem, without addressing the model’s significant practical constraints in clinical work, such as data imbalance and limitation, cross-domain adaptability, model robustness, and model efficiency. Although several earlier reviews summarized some of the methods to solve the abovementioned frontier problems, their summaries were incomplete. Some novel techniques were not covered, such as pruning, knowledge distillation, and transformers. Therefore, in this review, we comprehensively summarize the frontier challenges in skin cancer classification and provide corresponding solutions by analyzing articles published until the year 2022. This gives readers in-depth information on the advances and limitations of deep learning in skin cancer classification and also provides different ideas for researchers to improve these algorithms.
Table 1 A summary of the current review related to skin cancer classification.
The rest of this paper is organized as follows: first of all, Section 2 introduces three types of dermatological images and several popular public datasets. In Section 3, we review several typical CNN frameworks and frontier problems with their corresponding solutions in skin cancer classification tasks. A brief conclusion is given in Section 4.
High-quality images of skin diseases are important for both dermatologists and automated diagnostic systems. On the one hand, dermatologists rely on high-resolution (HR) images to make diagnoses when direct observation is impossible ( 38 ). This is especially common in telemedicine, medical consultations, and regular clinics ( 39 ). On the other hand, training reliable algorithms has always necessitated the use of high-quality data. In particular, deep learning algorithms always need a vast volume of labeled data for a better accuracy ( 28 ). As a result, high-quality dermatological images are critical for both clinical diagnosis and the design of new algorithms. In this section, we go over three different types of images commonly used in skin cancer diagnosis, as well as some public datasets.
The three main types of image modalities used to diagnose skin diseases are clinical images, dermoscopy images, and histopathological images (see Figure 1 ). Clinical images are frequently captured by mobile devices for remote diagnosis or as medical records. Dermoscopy images and histopathological images are commonly utilized in clinical diagnosis to assess the severity of the illness. In the next part, we introduce them separately.
Figure 1 Examples of three types of dermatological images of BCC to show their differences and relationships: (A) Clinical image. (B) Dermoscopy image. (C) Histopathological image.
Clinical images are obtained by photographing the skin disease site directly with a camera. They can be used as a medical record for patients and provide different insights for dermoscopy images ( 12 ). The biggest issue of utilizing clinical images for skin cancer classification is that they include limited morphological information while also introducing considerable inaccuracies into the diagnostic results, owing to the effect of diverse imaging settings (such as lighting, angle, and so on) ( 40 ).
Dermoscopy images are captured with a dermoscope, an optical observation tool used to assess the fine details of skin diseases ( 41 ). Clinicians frequently utilize dermoscopy to diagnose benign nevi and malignant melanoma ( 42 ). Because it serves as a bridge between clinical and pathological findings, the dermoscope is often referred to as the dermatologist’s stethoscope. Dermoscopy images provide a clear visualization of the skin’s surface and are used to analyze the color and microstructure of the epidermis ( 43 ). For some skin diseases, there are already numerous diagnostic guidelines based on dermoscopy images ( 44 ), for example, the ABCD Rule ( 15 ), the CASH algorithm ( 45 ), and the Menzies Method ( 16 ). However, the range of structures that can be observed in dermoscopy images is limited, and diagnostic accuracy is sometimes affected by the experience of the dermatologist ( 46 ).
Histopathological images are obtained by scanning tissue slides with a microscope and digitizing the result ( 28 ). They show the vertical structure and complete internal characteristics of the diseased tissue. In the clinic, pathological examinations serve as the “gold standard” for diagnosing almost all types of cancers, as they are often used to distinguish between types of cancers and guide appropriate treatment plans based on pathological changes. However, different histopathological images of skin cancer exhibit different morphologies, scales, textures, and color distributions, which makes it difficult to find a common pattern for diagnosis ( 12 ).
To create a trustworthy and robust skin cancer classification system, a variety of datasets with all kinds of dermatological images are required. As the need for medical imaging resources in academia grows, more and more datasets are becoming publicly available. To provide readers with a reference, we introduce several commonly used skin-disease datasets in the next part, along with the works based on these datasets.
The PH² dataset was constructed by ( 47 ) to support research on classification and segmentation methods. It contains 200 color dermoscopy images (768 × 560) of three types of skin lesions: common nevi, atypical nevi, and melanomas. It also contains complete medical annotations, such as lesion segmentation results and pathological diagnoses.
PH² is frequently used as a dataset for testing diagnostic algorithms for skin disease. For example, ( 48 ) used the SegNet framework to automatically diagnose and segment the dermoscopic images in PH² and obtained a classification accuracy of 94%. ( 49 ) proposed a novel deep convolutional network for feature extraction and classification of skin lesions. The model was divided into three stages. The first stage performed data augmentation and image contrast enhancement. The second stage used a CNN to extract information from the boundary of the lesion area. The third stage used Hamming distance to fuse and select features obtained with a pretrained Inception v3. The model obtained classification accuracies of 98.4%, 95.1%, and 94.8% on the PH², ISIC-2016, and ISIC-2017 datasets, respectively. ( 50 ) proposed a Multi-Focus Segmentation Network for skin lesion segmentation on the PH² dataset that utilizes feature maps of different scales. Two boundary attention modules and two reverse attention modules were used to generate skin lesion masks. The experimental results revealed that the proposed method achieved a Dice similarity coefficient of 0.954 and an IoU index of 0.914 on the PH² dataset. In addition to the above works, the PH² dataset is being used by an increasing number of algorithms to validate their effectiveness and accuracy.
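The two segmentation metrics quoted above can be computed directly from a predicted mask and a ground-truth mask. The following is a minimal sketch that assumes binary masks flattened into sequences of 0/1 values; the function name and the toy masks are illustrative only.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    pred_area, target_area = sum(pred), sum(target)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty: treat this as perfect agreement.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Toy 6-pixel masks: 2 pixels overlap, each mask covers 3 pixels.
dice, iou = dice_and_iou([1, 1, 1, 0, 0, 0], [0, 1, 1, 1, 0, 0])
print(round(dice, 3), round(iou, 3))
```

For a single pair of masks the two metrics are tied by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Scores averaged over a whole dataset need not satisfy this identity exactly.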
The MED-NODE dataset was collected by the Department of Dermatology of the University Medical Center Groningen (UMCG) and contains 170 digital images of melanoma ( 51 ) and nevi ( 52 ) cases. It is used to build and evaluate the MED-NODE system for detecting skin cancer from macroscopic images ( 53 ).
On the MED-NODE dataset, a variety of approaches have produced strong classification results. For example, in order to improve the generalization ability of the model and alleviate the problem of data imbalance, ( 54 ) proposed a model for melanoma classification based on transfer learning and ensemble learning. The model achieved 93% classification accuracy on the MED-NODE dataset, surpassing other state-of-the-art methods. ( 55 ) applied AlexNet to the skin cancer classification task using three different transfer learning methods: fine-tuning the weight parameters of the model, replacing the classification layer function, and performing data augmentation on the original dataset. In the end, they achieved an accuracy of 96.86% on the MED-NODE dataset. Then, ( 56 ) used two additional networks for the skin cancer classification task, ResNet-101 and GoogleNet. Experimental results revealed that GoogleNet achieved the best classification accuracy of 99.29% on the MED-NODE dataset. Various convolutional neural networks have thus obtained decent classification results on this dataset; however, the number of skin disease images it includes is relatively small.
The HAM10000 dataset was collected by the International Skin Imaging Collaboration (ISIC) to address the problems of data imbalance and data limitation in skin-disease datasets. It contains 10,015 dermoscopic images covering seven representative categories of pigmented skin lesions: actinic keratoses and intraepithelial carcinoma, basal cell carcinoma, benign keratosis-like lesions, dermatofibroma, melanoma, melanocytic nevi, and vascular lesions (including angiomas, pyogenic granulomas, and hemorrhage) ( 57 , 58 ).
The HAM10000 dataset is widely used by many scholars due to its diversity of skin lesions. For example, ( 25 ) used four novel deep CNN models, DenseNet-201, ResNet-152, Inception-v3, and InceptionResNet-v2, to classify eight different types of skin cancers on the HAM10000 and PH² datasets. Experimental results indicated that the diagnostic level of these CNN models exceeded that of dermatologists in terms of ROC AUC score. ( 59 ) trained 30 different models on the HAM10000 dataset to explore the classification performance of different models. They also used two local interpretability methods, GradCAM and Kernel SHAP, to examine the mechanism of the classification model. The model achieved an average AUC of 0.85. ( 60 ) designed a method for classifying seven skin diseases that used ensemble learning and the one-versus-all (OVA) strategy, achieving a classification accuracy of 0.9209 on the HAM10000 dataset. ( 61 ) obtained the best classification result by combining Inception ResNet-v2 with a Soft-Attention mechanism on the HAM10000 dataset, with an accuracy of 0.934, an AUC of 0.984, and an average precision of 0.937. As scholars study skin cancer classification in greater depth, more and more novel classification methods are being tested on the HAM10000 dataset for better comparison; among the results cited here, the Soft-Attention approach yields the best classification results.
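The one-versus-all strategy combined with an ensemble can be illustrated generically. In the sketch below, each class has its own "this class vs. the rest" score from every ensemble member; scores are averaged per class and the highest average wins. This is an assumed, simplified formulation for illustration, not the specific method of ( 60 ), and the class names and scores are made up.

```python
def one_versus_all_predict(binary_scores):
    """Pick a class from per-class one-versus-all scores.

    binary_scores maps a class label to the list of scores that the
    ensemble members assigned to "this class vs. all other classes".
    Returns the winning label and the averaged score per class.
    """
    if not binary_scores:
        raise ValueError("at least one class is required")
    averaged = {
        label: sum(scores) / len(scores)
        for label, scores in binary_scores.items()
    }
    predicted = max(averaged, key=averaged.get)
    return predicted, averaged


# Hypothetical scores from three ensemble members for one image.
scores = {
    "melanoma": [0.7, 0.6, 0.8],
    "nevus": [0.2, 0.4, 0.3],
    "bcc": [0.1, 0.2, 0.1],
}
label, averaged = one_versus_all_predict(scores)
print(label)
```

Splitting a seven-class problem into seven binary problems lets each binary model be tuned for one lesion type, at the cost of training and running one model (or ensemble) per class.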
The Derm7pt dataset contains approximately 2,000 clinical and dermoscopy color images of skin disease, as well as structured information for training and assessing CAD systems. It serves as a database for analyzing the prediction results of the seven-point malignancy checklist of skin lesion ( 62 ).
Due to the multimodal information contained in the Derm7pt dataset, it has gradually become widely used to test various multitask networks. When releasing the dataset, ( 62 ) also proposed a multitask network for predicting melanoma with seven-point checklist criteria and diagnostic results. The model used different loss functions to handle different input modalities, while being able to make predictions on missing data at the output. The model achieved a classification accuracy of 73.7% on the Derm7pt dataset, which also serves as a benchmark for the approach. To increase interpretability, ( 63 ) created a multitask model based on the Derm7pt dataset to illustrate the mechanism between different tasks. Learnable gates were used in the model to show how the method used or combined features from various tasks. This strategy may be used to investigate how CNN models behave, potentially enhancing their clinical utility. ( 64 ) proposed a deep convolutional network for skin lesion classification on the Derm7pt dataset. They implemented regularized DropOut and DropBlock to increase the model’s generalization capabilities and reduce overfitting. In addition, to address the dataset’s imbalance and limitation, they devised a novel loss function that assigns different weights to various samples, as well as an end-to-end cumulative learning technique. The method achieved excellent classification performance on the Derm7pt and ISIC datasets while requiring few computational resources. The release of the Derm7pt dataset has greatly promoted the use of multimodal data in skin cancer classification tasks and has prompted new ideas and solutions.
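The idea of a loss that weights samples differently can be sketched with a plain class-weighted cross-entropy. This is a generic illustration under the assumption of inverse-frequency class weights; it is not the loss function proposed in ( 64 ), and all names and numbers are hypothetical.

```python
import math


def inverse_frequency_weights(class_counts):
    """Weight each class by total / (num_classes * count), so rare classes weigh more."""
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [total / (num_classes * count) for count in class_counts]


def weighted_cross_entropy(probs, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    return -weights[label] * math.log(probs[label])


# Hypothetical training set: 900 nevus images (class 0), 100 melanoma images (class 1).
weights = inverse_frequency_weights([900, 100])

# The same predicted confidence of 0.7 for the true class costs more
# when the true class is the rare one.
loss_common = weighted_cross_entropy([0.7, 0.3], label=0, weights=weights)
loss_rare = weighted_cross_entropy([0.3, 0.7], label=1, weights=weights)
print(round(loss_common, 3), round(loss_rare, 3))
```

With these weights, an error on a melanoma image contributes nine times as much to the training objective as the same error on a nevus image, which counteracts the tendency of the model to favor the majority class.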
The BCN20000 5 dataset comprises 5,583 skin lesions and 19,424 dermoscopic images taken using high-resolution dermoscopy. They were all gathered between 2010 and 2016. At the same time, the collector employed a variety of computer vision techniques to remove noise, background, and other interference from the images. Finally, they were carefully reviewed by numerous experts to ensure the diagnosis’ validity ( 65 ).
BCN20000 is commonly used in skin cancer classification and segmentation tasks as part of the dataset for the ISIC-2019 competition. For example, in order to protect data privacy and avoid data abuse, ( 66 ) proposed a Distributed Analytics method for distributed training on skin disease images, which ensures that the training data remains in the original institution. After training on the BCN20000 dataset, the model achieved classification accuracy comparable to that of centralized training. To evaluate the robustness of different CNN models, ( 67 ) generated a series of out-of-distribution (OOD) images by applying different data augmentation methods to BCN20000, HAM10000, and other skin-disease datasets. This method establishes a benchmark for OOD testing and considerably facilitates the clinical use of skin cancer classification methods. Notably, by combining different data augmentation methods with an ensemble learning strategy (including EfficientNets, SENet, and ResNeXt101_wsl), ( 68 ) achieved the first-place classification result with a balanced accuracy of 74.2% on the BCN20000 dataset.
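Balanced accuracy, the metric quoted for the first-place result, is the unweighted mean of per-class recall, which makes it less forgiving than plain accuracy on imbalanced data. The following is a minimal sketch with made-up labels; the function name is illustrative.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Toy imbalanced set: a classifier that always answers "nevus".
y_true = ["nevus"] * 8 + ["melanoma"] * 2
y_pred = ["nevus"] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy, balanced_accuracy(y_true, y_pred))
```

In this toy case plain accuracy is 0.8 even though every melanoma is missed, while balanced accuracy drops to 0.5, the value expected from a trivial two-class predictor.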
To reduce skin cancer mortality while promoting the development and use of digital skin imaging ( 69 ), the International Skin Imaging Collaboration (ISIC) has established a publicly available skin disease dataset for the computer science community around the world. Currently, the ISIC Archive comprises over 13,000 representative dermoscopic images from clinical facilities throughout the world, all of which have been inspected and annotated by experts to ensure image quality ( 70 ).
The majority of studies that utilized the ISIC dataset focused on skin cancer classification and segmentation tasks, with the binary classification task being the most popular. For example, ( 71 ) designed different modules based on VGGNet for skin disease classification (melanoma or benign) and benchmarked them on the ISIC-2016 dataset. Results showed that this method obtained excellent performance, with an accuracy of 0.8133 and a sensitivity of 0.7866. ( 51 ) achieved the best classification results, with an AUC of 0.911 and a balanced multiclass accuracy of 0.831, on three skin cancer classification tasks of ISIC-2017 by using an ensemble of ResNet-50 networks on normalized images. ( 72 ) used ensemble learning with a stacking scheme and obtained an accuracy of 0.885 and an AUC of 0.983 in the ISIC-2018 competition. ( 73 ) employed two bias removal techniques, “Learning Not to Learn” (LNTL) and “Turning a Blind Eye” (TABE), to alleviate irregularities in model predictions and spurious changes in melanoma images. The LNTL method combined a new regularization loss with a gradient inversion layer to enable the model to debias the CNN’s features during backpropagation. The TABE method reduced biases by using different auxiliary classifiers to identify biases in features. The experimental results revealed that TABE removed bias more effectively, with an improvement of 11.6% in the AUC score benchmark on the ISIC dataset. Since the ISIC dataset is widely used in competitions and research, readers can find more methods for comparison on the competition leaderboards or elsewhere online.
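For the binary melanoma-versus-benign task, the sensitivity and AUC figures reported above can be computed as in the sketch below. It assumes labels coded as 1 (melanoma) and 0 (benign) and uses the pairwise-ranking definition of AUC; the function names and example values are illustrative.

```python
def sensitivity(labels, predictions):
    """Fraction of positive cases (label 1) that were predicted positive."""
    positives = [p for l, p in zip(labels, predictions) if l == 1]
    if not positives:
        raise ValueError("no positive cases in labels")
    return sum(1 for p in positives if p == 1) / len(positives)


def roc_auc(labels, scores):
    """Probability that a random positive scores higher than a random negative.

    Ties count as half a win. This pairwise form is quadratic in the number
    of samples, which is fine for a small illustration.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]            # hypothetical melanoma probabilities
predictions = [int(s >= 0.5) for s in scores]
print(sensitivity(labels, predictions), roc_auc(labels, scores))
```

Sensitivity depends on the chosen decision threshold (0.5 here), whereas AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.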
Table 2 summarizes the above datasets to show the different characteristics between them. What we summarized are the most common datasets in the skin cancer classification task and may not be the most exhaustive summary. Readers can find more datasets from various sources online. At the same time, it can be seen from the above summary that most of the images in the skin-disease dataset are dermoscopic images, while clinical images and histopathological images are still relatively rare. Furthermore, most skin-disease datasets have a relatively small number of images compared with datasets of natural images, which poses certain challenges for skin cancer classification tasks.
Table 2 Characteristics of different skin-disease datasets.
In the past few years, many scholars have been working on developing computer-aided diagnostic (CAD) systems for skin cancer classification. Before the emergence of deep learning, the CAD systems were primarily designed by machine learning (ML) algorithms ( 74 ). However, due to the complexity of feature engineering and limitations of handcrafted features, these ML-based methods can only diagnose a subset of skin diseases. Deep learning algorithms, on the other hand, can automatically learn semantic features from large-scale datasets with higher accuracy and efficiency. As a result, deep learning-based methods such as Convolutional Neural Network (CNN) have been used to solve the great majority of skin cancer classification problems in recent years and obtained satisfactory results.
However, as we dig deeper into the challenges of skin cancer classification, it appears that they are not as straightforward as those in the non-medical domain (e.g., ImageNet, PASCAL-VOC, MS-COCO) ( 12 , 75 ). First, many skin image datasets are imbalanced because of the disproportion among different skin cancer classes, which increases the risk of misdiagnosis by the diagnostic system. Second, since correct annotation requires a great deal of expert knowledge and is time-consuming and labor-intensive, many datasets provide only a limited number of images (e.g., the ISIC dataset, the largest publicly available skin disease dataset to date, contains about 13,000 skin images). As a result, more labeled data is required to design a more accurate system, and when the amount of training data is insufficient, the model’s generalization performance degrades. In addition, noise introduced by different devices or shooting conditions also biases the model, reducing diagnostic accuracy. Furthermore, the operational efficiency and resource consumption of the model limit its clinical deployment on various medical devices.
As a result, in the following part, we present a complete overview of the use of deep learning methods in skin cancer classification. We begin by introducing the use of typical CNN frameworks in skin cancer classification, then review the frontier challenges in skin cancer classification and provide related solutions. We summarize these methods in Tables 3 – 6 . Among them, Table 3 summarizes the use of typical frameworks in skin cancer classification, as well as their highlights and limitations. Tables 4 – 6 summarize the approaches to address the frontier issues of data imbalance and limitation, model generalization ability and robustness, and model computational efficiency in skin cancer classification. At the same time, we list publications based on the same or similar dataset together to make it easier for readers to compare different approaches.
Table 3 References of skin cancer classification with typical CNN frameworks.
Table 4 Different methods for solving data imbalance and data limitation.
Table 5 Different methods for improving model generalization ability and robustness.
Table 6 Different methods for improving model efficiency.
During the early stages of the development of CNNs, researchers usually built their own networks for a specific task. For example, ( 76 ) presented a self-supervised model for melanoma detection. First, a deep belief network and a self-advised SVM were used to train on the labeled and unlabeled images. After that, a bootstrap approach was used to randomly choose the training images for the network to improve the generalization ability and decrease the redundancy of the model. Experiments showed that the proposed method outperformed other methods such as KNN and SVM. Then, ( 79 ) designed a simple CNN for detecting melanoma. First, all input images were preprocessed to eliminate the effects of noise and artifacts. The processed images were then fed into a pretrained CNN to determine whether they were melanoma or benign. Experimental results showed that the CNN outperformed other classification methods.
With the development of deep learning, various well-known networks, such as VGGNet ( 117 ), GoogleNet ( 118 ), and ResNet ( 119 ), have been applied to skin cancer classification with favorable results. The landmark work was ( 78 ), which was the first to train a CNN on large-scale clinical images for skin cancer classification. The authors designed an end-to-end network for automated skin cancer classification using Inception v3. A total of 129,450 clinical images covering 2,032 distinct skin diseases were used to train the model. Meanwhile, to make use of the fine-grained information in the taxonomy structure, they proposed a disease partitioning algorithm to divide skin cancers into fine-grained classes (e.g., melanoma was subdivided into amelanotic melanoma and acral lentiginous melanoma). The results of the experiments indicated that the skin cancer classification system could attain diagnostic levels equivalent to dermatologists. In the same year, ( 71 ) successfully implemented VGGNet for skin lesion classification (melanoma or benign) and benchmarked it on ISIC datasets using dermoscopic images. In this study, they designed three different modules based on VGG-16 for comparison. The first module trained the network from initial weights. The second module used a pretrained VGG-16 and then used the current dataset to train the fully connected classifier. The third module also used transfer learning to train the network, but the weights in the high-level part of the convolutional layers were initialized from the first module. Results showed that the third module obtained excellent performance in skin cancer classification. Unlike previous classification work, ( 81 ) utilized a CNN framework (VGG-19) for the first time to evaluate the thickness of melanoma. They began by locating the lesion and cropping the region of interest (ROI). 
To solve the problem of data limitation and data imbalance, they then employed the Synthetic Minority Over-sampling Technique (SMOTE) to generate synthetic samples. After that, the pretrained VGG-19 was used for the thickness prediction. Finally, the results demonstrated that the algorithm can estimate the thickness of melanoma with an accuracy of 87.5%. For the first time, a multitask network based on Inception v3 was proposed by ( 82 ), utilizing three different modalities of data to predict the seven-point criteria. In addition, the authors designed a multimodal–multitask loss function to handle combinations of input modalities, which also allowed predictions to be made with incomplete information. Finally, results showed superior performance in classifying skin lesions and the seven-point criteria. Besides, the proposed method was able to identify discriminating information and generate feature vectors for image retrieval. The authors of ( 80 ) built two systems for skin disease classification based on novel deep learning algorithms. Additionally, they added a sonification-derived layer to increase the sensitivity of the model. In the first system, a CNN architecture based on Inception v2 was proposed to identify skin diseases (benign or malignant) from dermoscopic images. The second system transformed the feature representation generated in the first system into sound data. This sound information was then fed into a machine learning classifier or translated into spectrograms for further analysis. In the end, both systems performed exceptionally well in terms of classification and sonification. After deep learning methods had achieved excellent results in the skin cancer classification task, the authors of ( 77 ) investigated how to improve deep learning-based dermoscopy classification and dataset creation. They analyzed four ResNet architectures in dermoscopic classification, namely, ResNet-34, ResNet-50, ResNet-101, and ResNet-152, to understand the mechanisms and certain error causes. 
First, the four ResNet networks were trained to their best fits to see whether the structural differences between the models would result in different classification results. After testing with several epochs, they found that the accuracy of the different models tended to be consistent and varied with the hyperparameter settings. Meanwhile, the models showed a high level of stability during training. Therefore, the training errors of the classification models were attributed to incorrect annotations and the complexity of medical images.
Gradually, researchers discovered that applying a single CNN to a CAD system typically did not produce the desired results due to the large variance of deep neural networks. Ensemble learning was therefore proposed as a way to limit the error generated by a single model by training multiple models and then combining their outputs to obtain the final classification result. The authors of ( 27 ) compared the performance of ensemble models and a single model by utilizing nine different CNN architectures in skin cancer classification. After several comparative experiments, they confirmed the importance of ensemble learning for obtaining optimal classification models. In addition, they investigated the effectiveness of two different selection strategies in ensemble learning: random selection and selection using a validation set. For the smaller ensemble models, they found that the second strategy had more advantages, but the first was also effective. For the larger ensemble models, it was possible to get away with merely picking models arbitrarily. Based on the same idea, the authors of ( 60 ) proposed two different methods for skin cancer classification while reducing the complexity of the model: i) a single CNN model and ii) the combination of seven CNN models. In the first method, images from the dataset were directly fed into the single CNN model for the final prediction. In the second method, a one-versus-all (OVA) strategy was used to combine seven separate two-class models to obtain the final prediction. Each class in this method was classified according to true and false labels, thus increasing the efficiency of the model. The results revealed that the second method outperformed the first in terms of classification accuracy. The authors of ( 83 ) adopted a grid search strategy to find the best ensemble learning method for the classification of seven skin lesions. 
During training, five CNN networks, ResNeXt, SeResNeXt, ResNet, Xception, and DenseNet, were used as baselines. After that, two ensemble learning strategies, namely, average ensemble and weighted ensemble, were evaluated to find the optimal model. In the end, results showed that the weighted ensemble model performed better than the average ensemble model.
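The difference between the two ensemble strategies can be illustrated with a minimal sketch. It assumes each model outputs class probabilities on a validation set; the function names, the grid step, and the accuracy criterion are illustrative choices and are not taken from ( 83 ).

```python
import itertools
import numpy as np

def average_ensemble(probs):
    """probs has shape (n_models, n_samples, n_classes); every model counts equally."""
    return np.mean(probs, axis=0)

def weighted_ensemble(probs, weights):
    """Combine model probabilities with non-negative weights normalized to sum to 1."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return np.tensordot(weights, probs, axes=1)

def grid_search_weights(probs, labels, step=0.25):
    """Try every weight vector on a coarse grid and keep the one with the
    highest validation accuracy (hypothetical selection criterion)."""
    labels = np.asarray(labels)
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    best_w, best_acc = None, -1.0
    for w in itertools.product(grid, repeat=probs.shape[0]):
        if sum(w) == 0:
            continue  # an all-zero weight vector cannot be normalized
        preds = weighted_ensemble(probs, w).argmax(axis=1)
        acc = float(np.mean(preds == labels))
        if acc > best_acc:
            best_w, best_acc = np.array(w) / sum(w), acc
    return best_w, best_acc
```

Because equal weights are one point on the grid, the weighted ensemble found this way can never score below the average ensemble on the validation set used for the search; whether the gain carries over to unseen data has to be checked separately.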
Data imbalance and data limitation in skin disease datasets are common problems in skin cancer classification tasks. In fact, benign lesions account for the majority of data in many skin disease datasets. Meanwhile, many skin disease datasets have large disparities in the number of samples among different skin disease classes. Only the common skin diseases, such as BCC, SCC, and melanoma, are included in the majority of skin disease datasets. Other skin cancers (such as appendiceal carcinomas and cutaneous lymphoma) are relatively rare in these datasets, making it difficult for algorithms to classify them correctly ( 28 ). Besides, the skin lesions in most of the current datasets are from fair-skinned people, with only a few from dark-skinned people ( 12 ). It has been demonstrated that deep learning frameworks that have been validated for skin cancer diagnosis in fair-skinned people are more likely to misdiagnose people of other races or ethnicities ( 120 ). At the same time, the quantity of skin disease images is also relatively restricted. For example, ISIC-2020 ( 121 ) is the dataset with the largest number of images so far, with about 30,000 skin disease images. Although large amounts of skin disease images without any diagnosis information can be obtained from websites or medical institutions, labeling them requires professional knowledge and can be extremely challenging and time-consuming. What is more, sufficient labeled data are a requirement for training a reliable model. When only a limited number of images are provided, overfitting is more likely to occur. As a result, for the skin cancer classification task, a considerable amount of labeled data is required.
Generative adversarial networks (GANs) are widely thought to be a preferable alternative, as they can generate artificial data to compensate for data imbalance in terms of positive and negative proportions, rare cases, and different populations. The authors of ( 84 ) designed a data augmentation method based on generative adversarial networks to address the shortage of skin lesion images in melanoma detection. Firstly, they utilized several data processing methods to locate and eliminate hairs and other artifacts in the input images. Then they used two deep convolutional GANs (DCGANs) to generate 350 images of melanoma and 750 images of seborrheic keratosis. Finally, the results demonstrated that combining the processing module and generative adversarial networks resulted in superior performance when compared with other baselines. Although GANs are extensively employed for data augmentation, the images they generate are typically low-resolution. To overcome this issue, the authors of ( 85 ) proposed a style-based GAN to generate higher-quality images for skin lesion classification. These synthetic images were then added to the training set of the pretrained ResNet-50 model. The experiment showed that the proposed style-based GAN method outperformed other GAN-based methods in terms of Inception Score (IS), Fréchet Inception Distance (FID), precision, and recall. What is more, the accuracy, sensitivity, specificity, and other indicators of the classification model also improved. In ( 88 ), the author proposed a GAN-based framework, “TED-GAN,” to generate skin lesion images artificially for skin cancer classification. Instead of sampling the noise vector of the GAN from a random Gaussian distribution, they used, for the first time, informative noise obtained from a separate network to generate the medical images. TED-GAN had four parts: one variational auto-encoder, two GANs, and one auxiliary classifier. 
Firstly, an auto-encoder network was trained to obtain a vector containing the image manifold’s information. Then one of the GANs sampled the output of the auto-encoder to ensure the stability of training and make it more convenient to use the domain information. After that, the other GAN obtained more training data from the first GAN. In addition, an auxiliary classifier was added to this GAN, and the two were trained together to generate images of various skin diseases. In the end, experimental results showed that TED-GAN had a positive effect on skin cancer classification as it provided more images for training. Although data augmentation methods such as GANs may successfully increase the number of skin cancer images and alleviate the problem of data imbalance, the generated data usually have identical distributions, limiting the improvement in model performance. To solve this issue, the authors of ( 89 ) proposed a data augmentation method based on PGGAN, namely, SPGGAN, to generate skin lesion images with different types and data distributions. Firstly, an attention module was added to SPGGAN to obtain global and local information from skin lesion images, which also enabled PGGAN to generate more diverse high-quality samples. Then, the Two-Timescale Update Rule (TTUR) was added to SPGGAN to limit the growth in signal magnitude and hence enhance the stability of the model. Finally, experiments showed that the GAN-based data augmentation approach can improve classification in terms of accuracy, sensitivity, F1 score, and other metrics. Since skin lesions often have irregular boundaries and varied textures and shapes, training of the GAN framework is sometimes unstable. To address this issue, the authors of ( 86 ) utilized conditional generative adversarial networks (CGANs) to extract key information from all layers and generate skin lesion images. The proposed CGAN has two modules: a generator module and a discriminator module. 
The generator module extracted useful features from high-level and low-level layers and generated synthetic images. The discriminator module accurately mapped latent feature components by combining auxiliary information with training images. After that, the augmented images, together with the original datasets, were fed into the pretrained ResNet-18 network for the classification task. Experiments showed that this model achieved superior results compared with other datasets.
Another popular method for resolving data imbalance is to apply weights to various samples in the loss function. The goal is to calculate the losses differently depending on whether the samples belong to the majority or the minority. For example, the authors of ( 90 ) proposed an end-to-end framework for classifying seven skin lesions in the HAM10000 dataset. In particular, a class-weighted learning strategy was utilized to overcome the problem of data imbalance in the dataset by assigning different weights to different lesion classes when computing the loss function. Meanwhile, focal loss was used to further increase the model’s classification performance. It concentrated training on hard examples, preventing the classifier from being overwhelmed by easy samples. Experimental results revealed that the model obtained an average accuracy of 93%, outperforming dermatologists’ 84% accuracy. Although the problem of data imbalance can be alleviated through the design of the loss function, the minority classes are still learned slowly. To solve this issue, the authors of ( 91 ) proposed a hybrid strategy for skin cancer classification. It combined a loss function method at the algorithm level with a balanced mini-batch logic method for real-time image augmentation at the data level. By applying the balanced mini-batch and real-time image augmentation method, the new loss function can improve its learning ability on minority samples, thereby improving training efficiency. When compared with the previous strategy, this method improved the learning effectiveness of minority classes on an imbalanced dataset by increasing m-Recall by 4.65% and decreasing the standard deviation of recalls by 4.24%. In addition to designing a new loss function, the authors of ( 93 ) also designed two new algorithms based on evolutionary algorithms, the Mixup Extrapolation Balancing (MUPEB) and the Differential Evolution (DE), to solve the problem of data imbalance in melanoma classification. 
The MUPEB method included a set of operations to mix and interpolate the dataset until it was balanced. The DE method mixed and combined three random images with varied clinical information to achieve data balance. Apart from that, weighted loss function and oversampling were also used to alleviate data imbalance. In the end, this algorithm increased the model’s classification precision and recall by 1% and 11%, respectively.
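The two loss-level ideas discussed above, class weighting and focal loss, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the inverse-frequency weighting scheme, the function names, and the default `gamma=2.0` are illustrative and are not taken from ( 90 ), ( 91 ), or ( 93 ).

```python
import numpy as np

def inverse_frequency_weights(labels, n_classes):
    """Weight each class by n_samples / (n_classes * class_count), so that
    rare classes contribute more to the loss (one common weighting scheme)."""
    labels = np.asarray(labels)
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    counts = np.maximum(counts, 1.0)  # guard against classes with no samples
    return len(labels) / (n_classes * counts)

def weighted_cross_entropy(probs, labels, class_weights):
    """Cross-entropy in which each sample is scaled by the weight of its true class."""
    labels = np.asarray(labels)
    class_weights = np.asarray(class_weights, dtype=float)
    p_true = probs[np.arange(len(labels)), labels]
    return float(np.mean(class_weights[labels] * -np.log(p_true + 1e-12)))

def focal_loss(probs, labels, gamma=2.0, class_weights=None):
    """Focal loss: the factor (1 - p_true)**gamma shrinks the loss of easy,
    confidently classified samples so training focuses on hard ones."""
    labels = np.asarray(labels)
    p_true = probs[np.arange(len(labels)), labels]
    w = 1.0 if class_weights is None else np.asarray(class_weights, dtype=float)[labels]
    return float(np.mean(w * (1.0 - p_true) ** gamma * -np.log(p_true + 1e-12)))
```

With `gamma=0` and unit weights the focal loss reduces to ordinary cross-entropy, which makes it easy to check an implementation against a known baseline.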
Data augmentation is an ideal solution to artificially increase the amount of data by generating new data points from existing data. It scales up the number of images by random rotation, padding, rescaling, flipping, translation, etc. At the same time, with the development of technology, various novel approaches for data augmentation have been presented in skin cancer classification ( 58 , 122 ). For instance, the HAM10000 dataset was released with natural data augmentation: the images of skin lesions were captured at various magnifications or angles, or with multiple cameras. To evaluate the effectiveness of data augmentation methods and determine the most effective one, the authors of ( 87 ) explored four types of data augmentation methods (geometric transformation, adding noise, color transformation, and image mix) and a multiple-layer augmentation method (images augmented by more than one operation) in melanoma classification. The first step was to preprocess the images to remove artifacts such as body hair. Then each augmentation method was assessed to decide the optimal one. In the end, they found that single-layer augmentation outperformed multiple-layer augmentation methods. Besides, the region of interest (ROI)-mix method achieved the best performance compared with other approaches. The authors of ( 92 ) proposed a two-stage data augmentation strategy that ran successfully on mobile devices with limited computing resources. The first stage was to search for the optimal augmentation method in the Low-Cost-Augment (LCA) space. The second stage was to fine-tune the deep CNNs with augmented images and choose the model with the highest accuracy. Finally, EfficientNets were trained on the augmented images, which resulted in better accuracy and computational efficiency. Different from previous data augmentation methods, the authors of ( 94 ) proposed a novel Synthetic Minority Oversampling Technique (SMOTE) to solve the problem of image scarcity and imbalance in the skin lesion dataset. 
Firstly, all images in the PH2 dataset were cleaned by preprocessing. Then, in the data augmentation stage, the covariance matrix (CM) was exploited by SMOTE to find dependent connections between attributes. Surrogate instances were then built based on the estimated CM to balance the numbers of minority-class and majority-class samples. Finally, all augmented images were utilized to train SqueezeNet, which resulted in a significant improvement in terms of accuracy, sensitivity, specificity, and F1 score.
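For orientation, the classic SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbors can be sketched as below. Note that this is the standard nearest-neighbor variant operating on feature vectors, not the covariance-matrix-based variant described in ( 94 ); the function name and parameters are illustrative.

```python
import numpy as np

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples. Each one lies on the line
    segment between a random minority sample and one of its k nearest
    minority neighbors. Requires at least two minority samples."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    k = min(k, n - 1)
    # Pairwise Euclidean distances; the diagonal is excluded from the neighbor search.
    dist = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :k]
    synthetic = np.empty((n_new, minority.shape[1]))
    for i in range(n_new):
        a = rng.integers(n)                    # random minority sample
        b = neighbors[a, rng.integers(k)]      # one of its k nearest neighbors
        lam = rng.random()                     # interpolation factor in [0, 1)
        synthetic[i] = minority[a] + lam * (minority[b] - minority[a])
    return synthetic
```

In an image setting the inputs would typically be extracted feature vectors rather than raw pixels, since interpolating pixels of two different lesions rarely yields a realistic image.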
In the skin cancer classification task, the generalization ability of the model is often inferior to that of an experienced dermatologist. Firstly, owing to the small scale of skin image datasets, even if a large amount of similar data is artificially generated, the overfitting problem still exists. Secondly, the majority of research exclusively focuses on dermatological images taken using standardized medical equipment, such as dermoscopic and histological images ( 78 ). Little research has been conducted on dermatological images captured by other devices. When a trained model is applied to a new dataset with a different domain, its performance suffers significantly.
Transfer learning (TL) is commonly utilized for improving the generalization ability of computer-aided diagnostic systems on test data. The fundamental idea of TL is to preserve information gained while addressing one problem and apply it to a new but related problem ( 52 ). It can not only drastically reduce the time overhead and labor cost associated with partially repetitive work but also compensate for the shortcomings of the skin disease datasets. The authors of ( 96 ) presented two methods to improve the generalization ability of models to new samples and reduce cross-domain shift. The first method used a two-step transfer learning strategy to acquire new knowledge from diverse domains. It began with pretraining on ImageNet and fine-tuned the model with a single skin dataset. In the end, they used the target set to fine-tune the model to get the prior information. The second method used a pixel-wise image synthesizing adaptation method to transfer the features between the source domain and target domain. In comparison to the previous transfer learning approach, this method was semi-supervised and did not need any labels for domain adaptation. Finally, cross-domain experiments showed that the proposed methods could transform images between different modalities and thereby improve classification performance. To address the problem of poor generalization performance due to low interclass variance and class imbalance in skin disease images, the authors of ( 98 ) proposed a two-stage framework with adversarial training and transfer learning for melanoma detection. The first stage was to solve the data scarcity and class imbalance problem by generating samples of underrepresented classes. The second stage was to train deep neural networks for melanoma classification by using the newly synthesized images and original datasets. A focal loss was employed to assist the model in learning from hard examples. 
In the end, results showed a significant improvement in classification performance and the superiority of the proposed method. With the application of transfer learning in skin cancer diagnosis, it has been discovered that most existing transfer learning methods extract knowledge from all of the source data, so many inaccurate samples that are very different from the target data are incorporated into the process. Meanwhile, most skin cancer classification methods simply learn from raw skin disease images, so information from different aspects (such as texture and shape) is disturbed by noise during the learning process. Therefore, the authors of ( 99 ) proposed a multi-view-filtered transfer learning (MFTL) method to solve the poor scalability problem of skin cancer classification models. MFTL consisted primarily of two modules: a multi-view-weighing representation module and a filtered domain adaption module. The first module applied the view weights obtained from the feature representation procedure to the final prediction. The second module selected key source information to transfer knowledge between the source domain and target domain. Finally, the results showed significantly improved performance in classifying melanoma and seborrheic keratosis. In ( 103 ), the author proposed a transfer learning approach to address the issue of insufficient data in medical image datasets, as well as to improve the performance of other related medical image classification tasks. Because the volume of unlabeled medical images has increased significantly, the proposed approach first trained deep learning models on a large number of unlabeled images for a specific task. Then the models were fine-tuned on a relatively small labeled dataset to perform the same task. Besides, they utilized a hybrid deep CNN model to accurately extract features and ensure training stability while avoiding overfitting. 
Experiments showed the effectiveness of the approach in skin cancer and breast cancer classification in terms of classification accuracy, recall, precision, and F1 score. With the growing use of transfer learning in the field of computer vision, an increasing number of studies have shown that large-scale pretraining on natural images can be beneficial in a variety of tasks. However, research on medical images is still limited. For this purpose, the authors of ( 95 ) investigated the advantages of large-scale supervised pretraining with three types of medical images: chest radiography, mammography, and dermatological images. Five aspects, namely in-domain performance, generalization under distribution shift, data efficiency, subgroup fairness, and uncertainty estimation, were evaluated to test whether large-scale pretraining aided in the modeling of medical images. Finally, experimental results indicated that, despite significant differences from the pretraining data, employing larger pretraining datasets can achieve significant improvements across a wide range of medical disciplines. Besides, they discovered that pretraining at scale may allow downstream tasks to reuse deeper features more effectively.
In addition to TL, many novel methods such as adding innovative regularization terms, estimating model uncertainty, and lifelong learning models are beginning to be introduced into the skin cancer classification task to improve the generalization ability of the model across different domains. The authors of ( 104 ) proposed a method that can improve the generalization ability of a model under limited samples by combining data augmentation and domain alignment. They observed that domain changes in medical images were compact and related to a certain extent. To model such dependencies, the author introduced a dependency regularization term to learn a representative feature space that captured sharable information across different medical image domains. At the same time, a variational encoder was used to ensure that the latent features followed a predetermined distribution. Through theoretical derivation, the author then obtained the upper bound of the empirical risk for any relevant target domain under this method, which alleviated the problem of overfitting. Finally, the generalization ability of the model was well confirmed on seven skin-disease datasets. In order to obtain the uncertainty quantification (UQ) of the deep learning model and prevent overfitting, the authors of ( 102 ) applied three UQ methods: Monte Carlo (MC) dropout, Ensemble MC (EMC) dropout, and Deep Ensemble (DE). They next presented a novel hybrid Bayesian deep learning model based on the three-way decision (TWD) theory to obtain the residual uncertainty after using the three methods of MC, EMC, and DE. It also enabled different UQ methods to be used in different neural networks or different classification stages. Finally, the experimental findings demonstrated that the proposed model can be employed efficiently in analyzing different stages of medical images, and the model’s uncertainty was accurately quantified. 
Since a deep learning model might forget much of the previous information while learning new data, updating the system with new data would reduce its performance on previously learned classes, which poses a greater challenge to autonomous medical diagnosis systems. To this end, the authors of ( 105 ) designed a Bayesian generative model for continual learning based on a fixed pretrained feature extractor. Different from previous continual learning methods, which stored a small number of images for each old class, the proposed method stored the statistical information of each class computed with the fixed feature extractor, which allowed the model to naturally retain the knowledge of each old class. Therefore, there was no need to store or regenerate old images. Finally, the model performed well on both the Skin7 and Skin40 datasets, and it was able to retain knowledge of previous classes during continual learning. The model’s scalability and generalization were thereby greatly enhanced.
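The idea of storing per-class statistics instead of old images can be illustrated with a minimal sketch. It assumes that features have already been produced by a frozen extractor and models each class as a diagonal Gaussian with equal priors; these modeling choices and the class name `GaussianClassMemory` are illustrative assumptions, not the exact formulation of ( 105 ).

```python
import numpy as np

class GaussianClassMemory:
    """Keeps only a mean and a per-dimension variance for every class seen so far."""

    def __init__(self):
        self.means = {}
        self.variances = {}

    def add_class(self, label, features):
        """Register a new class from its feature vectors; old classes are untouched."""
        f = np.asarray(features, dtype=float)
        self.means[label] = f.mean(axis=0)
        self.variances[label] = f.var(axis=0) + 1e-6  # small floor for stability

    def predict(self, features):
        """Assign each feature vector to the class with the highest Gaussian log-likelihood."""
        f = np.atleast_2d(np.asarray(features, dtype=float))
        labels = list(self.means)
        scores = np.stack([
            -0.5 * np.sum((f - self.means[c]) ** 2 / self.variances[c]
                          + np.log(self.variances[c]), axis=1)
            for c in labels
        ], axis=1)
        return [labels[i] for i in scores.argmax(axis=1)]
```

Because adding a class only writes new entries, the stored statistics of earlier classes never change, which is why no old images need to be kept or regenerated in this kind of scheme.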
Various noises obtained from heterogeneous sources and skin disease images pose challenges to the robustness of models in the task of skin cancer classification. When trained on high-quality skin lesion datasets, the deep learning model can reach the same diagnostic level as dermatologists, even surpassing them. However, since the skin cancer classification model is sensitive to images captured with different devices, lighting settings, and backgrounds, it frequently fails to obtain satisfactory classification results when tested with different images. Furthermore, photographic images (such as smartphone images) vary greatly in terms of zoom, perspective, and lighting, making classification much more difficult.
Therefore, many scholars have worked to integrate adversarial training into the field of skin cancer classification to enhance the robustness of the classification models. In ( 100 ), the author introduced a novel Attention-based DenseUnet (Att-DenseUnet) network combined with adversarial training for skin lesion segmentation and classification. With the addition of the attention module, the model can pay more attention to discriminative features while also successfully suppressing irrelevant features in the DenseBlocks output. In this way, the interference of artifacts in skin disease images is reduced. Att-DenseUnet had two main modules: a segmentor module and a discriminator module. The segmentor module was a U-Net-shaped structure, which contained a down-sampling path, an up-sampling path, and a related attention module to ensure the information transfer between different layers. Additionally, it adopted an attention module to focus on the essential features and speed up the training process. The discriminator module employed adversarial training to force the segmentor module to obtain diverse features with different sizes and shapes and to direct the attention module to concentrate on multiscale lesions. Besides, the adversarial loss was used to prevent overfitting by providing a regularization term for the networks. Finally, the results showed that this network achieved excellent performance and was robust enough for different skin image datasets. In clinical applications, it has been discovered that noise that is difficult for humans to detect frequently causes significant interference to the diagnostic model, limiting the utility of deep learning in the real world. To improve the model’s robustness, the authors of ( 97 ) performed adversarial training on MobileNet and VGG-16 using the attack methods FGSM and PGD for skin cancer classification. 
Firstly, two white-box attacks based on Projected Gradient Descent (PGD) and the Fast Gradient Sign Method (FGSM) were used to test the robustness of these models. Then, to increase the robustness of these models, the author performed PGD-based adversarial training against white-box attacks. In the end, the results showed that the robustness of these models improved significantly. To move beyond simple adversarial attacks, the authors of ( 101 ) used the more realistic and riskier Universal Adversarial Perturbation (UAP) to adversarially train seven classification models (VGG-16, VGG-19, ResNet-50, Inception ResNet-V2, DenseNet-121, and DenseNet-169). During the adversarial attack phase, the author used an iterative algorithm to generate perturbations for non-targeted and targeted attacks, and the Fast Gradient Sign Method (FGSM) was used to generate perturbations for input images. After that, they conducted adversarial retraining to improve the robustness of these seven models. The results showed that these models were easily deceived by adversarial attacks. In addition, they found that adversarial retraining had a limited effect on non-targeted perturbations. Although adversarial retraining considerably lowered the vulnerability to adversarial perturbations in targeted attacks, it did not eliminate it entirely.
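The two white-box attacks named above can be illustrated on a toy model. The sketch below uses a logistic-regression classifier so that the input gradient has a closed form; the model, the function names, and the step sizes are illustrative assumptions, whereas the cited studies attack deep CNNs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_grad_wrt_input(x, y, w, b):
    """Gradient of the binary cross-entropy loss with respect to the input x
    for the model p = sigmoid(w . x + b): dL/dx = (p - y) * w."""
    return (sigmoid(x @ w + b) - y) * w

def fgsm(x, y, w, b, eps):
    """Fast Gradient Sign Method: a single step of size eps along the sign of the gradient."""
    return x + eps * np.sign(loss_grad_wrt_input(x, y, w, b))

def pgd(x, y, w, b, eps, alpha, steps):
    """Projected Gradient Descent: repeated small signed-gradient steps, each
    followed by projection back into the L-infinity ball of radius eps around x."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(loss_grad_wrt_input(x_adv, y, w, b))
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv
```

Adversarial training then amounts to generating such perturbed inputs during each training step and including them, with their original labels, in the loss.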
Although an increasing number of deep learning algorithms have been successfully applied to skin cancer classification with excellent classification results, the computational complexity of the model still needs to be considered. Firstly, due to improvements in imaging technology, many high-resolution skin disease images contain a very large number of pixels. For example, histological scans are made up of millions of pixels, and their resolution is often larger than 50,000 × 50,000 ( 123 ). As a result, training on them takes more time and additional computing resources. Secondly, the computational complexity of deep learning models increases as their accuracy improves, which makes deploying them on medical equipment or mobile devices more costly. Here we introduce three recent approaches to designing an efficient network for skin cancer classification.
Over the past few years, many lightweight convolutional neural networks have been designed and successfully applied in skin cancer classification to meet the demands of practical applications, and many scholars have deployed them on various mobile devices. For example, the authors of ( 107 ) proposed an automated classification method based on MobileNet and successfully deployed it as an Android application and a website for public use. With the vigorous development of mobile health (mHealth), more and more mobile applications are designed for cancer classification and prediction. However, applications for the automatic classification of skin cancer are still limited. To solve this problem, the authors of ( 116 ) proposed an innovative expert system based on SqueezeNet, namely, “i-Rash,” to classify four classes of skin diseases in real time. Due to the small size of “i-Rash” (i.e., 3 MB), identifying an unknown image took the system only 0.09 s. Inspired by predecessors, the authors of ( 111 ) proposed a novel method that incorporated the attention residual learning (ARL) mechanism into EfficientNet with fewer parameters. Besides, they also investigated how the mechanism related to an existing attention mechanism, Squeeze and Excitation (SE). By comparing experimental results between models with and without SE, they speculated that the attention module accounts for a large portion of EfficientNet’s outstanding performance. What is more, the addition of ARL increased the accuracy of EfficientNet and its variants. In ( 112 ), three different lightweight models (MobileNet, MobileNetV2, and NASNetMobile) were adopted for skin cancer classification. To find the model with the best performance, they tested a total of nine configurations, combining the three models with three different batch sizes. In the end, they found that the NASNetMobile model showed the best performance with a batch size of 16. 
Meanwhile, they benchmarked the lightweight models with fewer parameters and less computational time.
Pruning removes parameters from an existing network to increase its efficiency while maintaining its accuracy ( 124 ). To enable CNNs to be used in medical devices with limited power and resources, ( 114 ) built a pruning framework that simplifies complicated architectures by choosing the most informative color channels in skin lesion detection. The proposed method has two purposes: removing redundant color channels and simplifying the whole network. Firstly, all color channels were fed into the network. Then, the weights associated with the non-essential color channels were deleted to select the most informative color channels. After that, to generate a simplified network, they utilized CNN models as the target network and trained them on the chosen color channels. Besides, the hardware requirements of these models were calculated to analyze the complexity of the various networks. Finally, results showed that this color channel pruning strategy improved segmentation accuracy while also simplifying the network. Designing an efficient and generalizable deployment strategy is an extremely challenging problem for lightweight networks. To this end, ( 109 ) proposed a weight pruning strategy for lightweight neural networks to make up for the accuracy loss and improve model performance and reliability in skin cancer classification. Five lightweight CNNs, namely, SqueezeNet, MnasNet, MobileNetV2, ShuffleNetV2, and Xception, were investigated in this task. Firstly, a dense–sparse–dense (DSD) training strategy was used to avoid underfitting and high bias in the networks. Then, a detailed analysis was used to build a pruning method that not only pruned connections with various relations but also examined a novel pruning mechanism that adaptively removes weights according to their distribution in each layer. In the end, the pruning strategy achieved higher accuracy and less computation compared with unpruned networks.
( 110 ) designed a new pruning method, “MergePrune”, to reduce the computational cost of retraining the network by combining pruning and training into a single stage. Firstly, different units were assigned to learn each domain independently, as they contribute differently to the classification result. Then, for one domain, culprit network units with high “culpability” scores were pruned, reset, and assigned to learn new domains. At the same time, non-culprit units were preserved. In this way, MergePrune reduced the amount of computation and improved the efficiency of the classification model. Finally, the results showed that the network can perform accurately and effectively on real-world clinical imaging data with various domains, even with high pruning ratios.
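To make the idea of layer-adaptive weight pruning more concrete, the following is a minimal sketch and not the implementation used in any of the cited works. It assumes a simple rule in which each layer's threshold is a fraction of the standard deviation of that layer's own weights; the function names, the `sensitivity` parameter, and the toy weights are hypothetical and chosen only for illustration.

```python
import statistics


def prune_layer(weights, sensitivity=0.5):
    """Zero out weights whose magnitude falls below a layer-specific threshold.

    Assumption for this sketch: the threshold is sensitivity * (population
    standard deviation of the layer's weights), so it adapts to each layer.
    """
    threshold = sensitivity * statistics.pstdev(weights)
    pruned = [w if abs(w) >= threshold else 0.0 for w in weights]
    return pruned, threshold


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Toy layers with very different weight scales (hypothetical values).
layers = {
    "conv1": [0.9, -0.05, 0.4, 0.02, -0.7, 0.01],
    "fc": [0.003, -0.002, 0.05, -0.04, 0.001, 0.06],
}

pruned_layers = {}
thresholds = {}
for name, weights in layers.items():
    pruned_layers[name], thresholds[name] = prune_layer(weights)

for name in layers:
    print(name, "threshold:", round(thresholds[name], 4),
          "sparsity:", sparsity(pruned_layers[name]))
```

Because the threshold is derived from each layer's own distribution, a layer with small weights is not wiped out by a single global cutoff. In practice, pruning of this kind is usually followed by fine-tuning to recover accuracy.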
Knowledge distillation is the process of distilling information from a large model or group of models into a smaller model that can be deployed under real-world restrictions ( 106 , 125 ). One such study proposed a knowledge distillation-based method that enabled knowledge to be transferred between models simultaneously in skin cancer classification and brain tumor detection. Firstly, a pretrained ResNet-50 was chosen as the base model because of its excellent out-of-the-box performance. Then, given the significant degree of resemblance across the images in the medical image dataset, they let the knowledge transfer only between the two bottom-most layers. As a result, high-level visual comprehension was preserved, and information was added to the granular distinction. The experiments revealed that, in order to gather remote knowledge and enhance global accuracy, some local accuracy was lost. To improve the robustness and reduce the computational cost of the model, ( 115 ) proposed a knowledge distillation method based on curriculum training for distinguishing herpes zoster from other skin diseases. Firstly, three kinds of models, namely, basic models, mobile models, and ensemble models, were chosen as benchmarks. Then, to improve the performance of a single network, ensemble knowledge distillation was utilized. This allowed the student network to learn more robust and representative features from the ensemble while keeping a low computational cost. After that, they proposed curriculum training for ensemble knowledge distillation in order to distill the ensemble teachers more efficiently with an adaptive learning technique. In the end, the results showed that the proposed method achieved improved performance together with higher efficiency.
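The general mechanism of ensemble knowledge distillation can be sketched as follows. This is an illustrative example under stated assumptions rather than the method of the cited works: it uses the common formulation in which the student matches the temperature-softened average of the teachers' class probabilities (via a KL divergence term) while also fitting the ground-truth label (via cross-entropy). The function names, the `temperature` and `alpha` parameters, and the toy logits are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_targets(teacher_logits, temperature):
    """Average the softened class probabilities of all teachers."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_classes = len(probs[0])
    return [sum(p[i] for p in probs) / len(probs) for i in range(n_classes)]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy."""
    soft_targets = ensemble_soft_targets(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(soft_targets, student_soft) if t > 0)
    cross_entropy = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft-target term on a scale comparable to
    # the hard-label term when the temperature changes.
    return alpha * temperature ** 2 * kl + (1 - alpha) * cross_entropy


# Two hypothetical teachers and two candidate students on a 3-class problem.
teachers = [[3.0, 0.0, -1.0], [2.5, 0.5, -0.5]]
good_student = [2.8, 0.2, -0.8]
bad_student = [-1.0, 0.0, 3.0]

print("good student loss:", distillation_loss(good_student, teachers, label=0))
print("bad student loss:", distillation_loss(bad_student, teachers, label=0))
```

A student whose predictions resemble the teachers' averaged output receives a lower loss than one that disagrees with them, which is the signal that lets a small network inherit behavior from a larger ensemble.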
Transformer ( 126 ) is a deep learning model designed by a Google team in 2017 that was originally utilized in Natural Language Processing (NLP) and is now frequently employed in medical image processing, including the analysis of skin lesion images. It uses the self-attention mechanism to weigh the relevance of different parts of the input sequence, resulting in shorter training periods and improved accuracy ( 126 , 127 ). The introduction of the attention mechanism has generated great interest in the research community, but there is still a lack of systematic ways to select hyperparameters that guarantee model improvement. To this end, ( 108 ) presented an assessment of the effectiveness of the attention module and the self-attention module in skin cancer classification based on the ResNet architecture. Of the two modules, the attention module was used to recompute the features of the input tensor in each layer, while the self-attention module was used to connect multiple positions of the input images to obtain different representations of the input. In the experiment stage, the authors investigated and compared a variety of alternative attention mechanisms with images from the HAM10000 dataset. In the end, the results showed that many of the self-attention structures outperformed the ResNet-based architectures while containing fewer parameters. At the same time, applying the attention mechanism reduced the image noise; however, it did not behave consistently across different structural parameters. Skin cancer classification is often treated as a simple classification task, ignoring the potential benefits of lesion segmentation. To this end, ( 113 ) proposed MT-TransUNet, an approach that combined the attention module with the CNN module for skin cancer classification. The CNN module was in charge of extracting lesion texture information, while the attention module was responsible for obtaining context information such as the shape and size of the lesion.
In addition, dual-task and attended region consistency losses were adopted to mediate the classification and segmentation heads without pixel-level annotation, which increased the robustness of the model when it was trained with various augmented images. Finally, MT-TransUNet achieved excellent performance in skin lesion segmentation and classification while preserving compelling computational efficiency and speed.
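As a rough illustration of how self-attention weighs the relevance of different input positions, the sketch below implements scaled dot-product self-attention in plain Python. It is a simplification and not the architecture of any cited model: the queries, keys, and values are taken to be the input tokens themselves (no learned projections, no multiple heads), and the function names and toy "patch" vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections.

    Each token attends to every token (including itself); the output for a
    token is the attention-weighted average of all token vectors.
    """
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        row = softmax(scores)
        weights.append(row)
        outputs.append([sum(row[j] * tokens[j][i] for j in range(len(tokens)))
                        for i in range(dim)])
    return outputs, weights


# Three toy "patch" embeddings; the first two are similar to each other.
patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
attended, attention_weights = self_attention(patches)

for i, row in enumerate(attention_weights):
    print("patch", i, "attends with weights", [round(w, 3) for w in row])
```

Each row of weights sums to one, and a patch assigns more weight to patches with similar embeddings, which is how attention can gather context such as lesion shape from distant image regions.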
With the development of science and technology, the accuracy and efficiency of skin cancer diagnosis are constantly improving. In traditional clinical scenarios, the final diagnosis of skin cancer often depended on the imaging quality and the experience of dermatological experts, which is highly subjective and leads to a high rate of misdiagnosis. With the advent of machine learning, various CAD systems have been designed to aid dermatologists in diagnosing skin cancer. In some skin cancer classification tasks, these CAD systems achieved excellent performance by utilizing handcrafted features. Recently, with the success of deep learning in medical image analysis, several researchers have applied deep learning methods to skin cancer classification in an end-to-end manner and achieved satisfactory results. It is expected that artificial intelligence and the diagnosis of skin cancer will become closely associated in the future.
In this study, we present a comprehensive overview of the most recent breakthroughs in deep learning algorithms for skin cancer classification. First, we introduce three different types of dermatological images used in diagnosis and some commonly used datasets. Next, we present the applications of typical CNN-based methods in skin cancer classification. After that, we introduce several frontier problems in the skin cancer classification task, such as data imbalance and limitation, cross-domain adaptability, model robustness, and model efficiency, along with relevant deep learning-based approaches. Finally, we provide a summary of the entire review. The key points are as follows:
● Skin cancer develops as a result of uncontrolled cell proliferation in the skin. It frequently appears on sun-exposed skin. The three major types of skin cancers are basal cell carcinoma (BCC), squamous cell carcinoma (SCC), and melanoma. Early skin cancer classification increases the chances of a successful treatment (refer to Section 1 for more information).
● Clinical images, dermoscopic images, and histopathological images are three common types of images used for skin disease diagnosis. Among them, dermoscopic images are the most common. With the growing need for medical imaging resources in academia, more and more datasets are becoming publicly available. We list several popular datasets of skin-disease images along with works based on these datasets. However, compared with natural image datasets, the diversity and quantity of skin-disease datasets are still very limited, which poses great challenges to the automatic diagnosis of skin cancer (refer to Section 2 for more information).
● Among CNN-based methods for skin cancer classification, VGGNet, GoogLeNet, ResNet, and their variants are the most frequently used deep learning models. Ensemble learning has also been proposed to limit the error generated by a single model and has achieved satisfactory results. Although various deep learning models have performed admirably on skin cancer classification tasks, several challenges still need to be resolved, such as imbalanced datasets, a lack of labeled data, cross-domain generalization ability, noisy data from heterogeneous devices and images, and how to design effective models for complicated classification tasks. Methods to address these challenges include generative adversarial networks, data augmentation, new loss functions, transfer learning, continual learning, adversarial training, lightweight CNNs, pruning strategies, knowledge distillation, and transformers. It can be expected that AI has the potential to play an active role in a paradigm shift in skin cancer diagnosis in the near future (refer to Section 3 for more information).
Compared with other reviews, this paper presents a comprehensive review on the topic of skin cancer classification with a focus on contemporary deep learning applications. It can be seen that the general evolutionary trend of these frameworks is structured, lightweight, and multimodal. With the help of this review, one can gain an intuitive understanding of the core principles and issues in this field. Furthermore, anyone eager to engage in this field in the future should explore a number of different approaches to dealing with these issues. It is believed that the problems described above will remain research hotspots for a long time to come.
YW was responsible for writing the paper and supplementing materials. BC and RW were responsible for proposing amendments. AZ was responsible for revising and reviewing the article. DP was responsible for making the article more complete. SZ was the overall guide and was responsible for the whole project. All authors contributed to the article and approved the submitted version.
This work was supported by the National Natural Science Foundation of China (Youth Project) under Grant 62101607 and National Natural Science Foundation of China under Grant 62071502.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
1. Swann G ed. Journal of Visual Communication in Medicine (2010), pp. 148–9. doi: 10.3109/17453054.2010.525439. New York, NY, United States:Foundation of Computer Science (FCS)
2. Montagna W. The Structure and Function of Skin . Amsterdam, Netherlands:Elsevier (2012).
3. Samuel E, Moore M, Voskoboynik M, Shackleton M, Haydon A. An Update on Adjuvant Systemic Therapies in Melanoma. Melanoma Manage (2019) 6:MMT28. doi: 10.2217/mmt-2019-0009
4. ACS. Cancer Facts & Figures 2018. In: Cancer Facts Fig. Atlanta, GA,U.S.:American Cancer Society (ACS) (2018). p. 1–71.
5. Rogers HW, Weinstock MA, Feldman SR, Coldiron BM. Incidence Estimate of Nonmelanoma Skin Cancer (Keratinocyte Carcinomas) in the Us Population, 2012. JAMA Dermatol (2015) 151: 1081–6. doi: 10.1001/jamadermatol.2015.1187
6. Sheha MA, Mabrouk MS, Sharawy A. Automatic Detection of Melanoma Skin Cancer Using Texture Analysis. Int J Comput Appl (2012) 42: 22–6. doi: 10.5120/5817-8129
7. Massone C, Di Stefani A, Soyer HP. Dermoscopy for Skin Cancer Detection. Curr Opin Oncol (2005) 17:147–53. doi: 10.1097/01.cco.0000152627.36243.26
8. Hoang L, Lee SH, Lee EJ, Kwon KR. Multiclass Skin Lesion Classification Using a Novel Lightweight Deep Learning Framework for Smart Healthcare. Appl Sci (2022) 12:2677. doi: 10.3390/app12052677
9. Anas M, Gupta K, Ahmad S. Skin Cancer Classification Using K-Means Clustering. Int J Tech Res Appl (2017) 5:62–5.
10. Ali AR, Li J, O’Shea SJ, Yang G, Trappenberg T, Ye X. A Deep Learning Based Approach to Skin Lesion Border Extraction With a Novel Edge Detector in Dermoscopy Images, in: International Joint Conference on Neural Networks (IJCNN), Manhattan, New York, U.S.: Institute of Electrical and Electronics Engineers (IEEE) (2019). pp. 1–7.
11. Stanoszek LM, Wang GY, Harms PW. Histologic Mimics of Basal Cell Carcinoma. Arch Pathol Lab Med (2017) 141:1490–502. doi: 10.5858/arpa.2017-0222-RA
12. Goyal M, Knackstedt T, Yan S, Hassanpour S. Artificial Intelligence-Based Image Classification for Diagnosis of Skin Cancer: Challenges and Opportunities. Comput Biol Med (2020) 127:104065. doi: 10.1016/j.compbiomed.2020.104065
13. Weingast J, Scheibböck C, Wurm EM, Ranharter E, Porkert S, Dreiseitl S, et al. A Prospective Study of Mobile Phones for Dermatology in a Clinical Setting. J Telemed Telecare (2013) 19:213–8. doi: 10.1177/1357633x13490890
14. Arroyo JLG, Zapirain BG. Automated Detection of Melanoma in Dermoscopic Images. In: Computer Vision Techniques for the Diagnosis of Skin Cancer . New York, NY, United States:Springer (2014). p. 139–92.
15. Nachbar F, Stolz W, Merkle T, Cognetta AB, Vogt T, Landthaler M, et al. The Abcd Rule of Dermatoscopy: High Prospective Value in the Diagnosis of Doubtful Melanocytic Skin Lesions. J Am Acad Dermatol (1994) 30:551–9. doi: 10.1016/S0190-9622(94)70061-3
16. Menzies SW, Ingvar C, Crotty KA, McCarthy WH. Frequency and Morphologic Characteristics of Invasive Melanomas Lacking Specific Surface Microscopic Features. Arch Dermatol (1996) 132:1178–82. doi: 10.1001/archderm.1996.03890340038007
17. Argenziano G, Fabbrocini G, Carli P, De Giorgi V, Sammarco E, Delfino M. Epiluminescence Microscopy for the Diagnosis of Doubtful Melanocytic Skin Lesions: Comparison of the Abcd Rule of Dermatoscopy and a New 7-Point Checklist Based on Pattern Analysis. Arch Dermatol (1998) 134:1563–70. doi: 10.1001/archderm.134.12.1563
18. Noble WS. What Is a Support Vector Machine? Nat Biotechnol (2006) 24:1565–7. doi: 10.1038/nbt1206-1565
19. Chen T, He T, Benesty M, Khotilovich V, Tang Y, Cho H, et al. Xgboost: Extreme Gradient Boosting. In: R Package Version 0.4-2 . Brookline, Massachusetts, U.S: Microtome Publishing (2015). 24 p. 1–4.
20. Safavian SR, Landgrebe D. A Survey of Decision Tree Classifier Methodology. IEEE Trans Syst Man Cybernet (1991) 21:660–74. doi: 10.1109/21.97458
21. Pomponiu V, Nejati H, Cheung NM. Deepmole: Deep Neural Networks for Skin Mole Lesion Classification, in: IEEE International Conference on Image Processing (ICIP), Manhattan, New York, U.S.: Institute of Electrical and Electronics Engineers (IEEE) (2016). pp. 2623–7.
22. Habif TP, Chapman MS, Dinulos JG, Zug KA. Skin Disease E-Book: Diagnosis and Treatment . Amsterdam, Netherlands:Elsevier Health Sciences (2017).
23. Guo Y, Liu Y, Oerlemans A, Lao S, Wu S, Lew MS. Deep Learning for Visual Understanding: A Review. Neurocomputing (2016) 187:27–48. doi: 10.1016/j.neucom.2015.09.116
24. Castiglioni I, Rundo L, Codari M, Di Leo G, Salvatore C, Interlenghi M, et al. Ai Applications to Medical Images: From Machine Learning to Deep Learning. Physica Med (2021) 83:9–24. doi: 10.1016/j.ejmp.2021.02.006
25. Rezvantalab A, Safigholi H, Karimijeshni S. Dermatologist Level Dermoscopy Skin Cancer Classification Using Different Deep Learning Convolutional Neural Networks Algorithms. ArXiv (2018) abs/1810.10348:arXiv:1810.10348. doi: 10.48550/arXiv.1810.10348
26. Li KM, Li EC. Skin Lesion Analysis Towards Melanoma Detection via End-to-End Deep Learning of Convolutional Neural Networks. ArXiv (2018) abs/1807.08332:arXiv:1807.08332. doi: 10.48550/arXiv.1807.08332
27. Perez F, Avila S, Valle E. Solo or Ensemble? Choosing a Cnn Architecture for Melanoma Classification. Proc IEEE/CVF Conf Comput Vision Pattern Recognit Workshops (2019), 0–0. doi: 10.1109/CVPRW.2019.00336
28. Li H, Pan Y, Zhao J, Zhang L. Skin Disease Diagnosis With Deep Learning: A Review. Neurocomputing (2021) 464:364–93. doi: 10.1016/j.neucom.2021.08.096
29. Hameed N, Ruskin A, Hassan KA, Hossain MA. A Comprehensive Survey on Image-Based Computer Aided Diagnosis Systems for Skin Cancer. In: International Conference on Software, Knowledge, Information Management & Applications (SKIMA) (IEEE) , 10th. Manhattan, New York, U.S: Institute of Electrical and Electronics Engineers (IEEE) (2016). p. 205–14.
30. Kuntz S, Krieghoff-Henning E, Kather JN, Jutzi T, Höhn J, Kiehl L, et al. Gastrointestinal Cancer Classification and Prognostication From Histology Using Deep Learning: Systematic Review. Eur J Cancer (2021) 155: 200–15. doi: 10.1016/j.ejca.2021.07.012
31. Pathan S, Prabhu KG, Siddalingaswamy P. Techniques and Algorithms for Computer Aided Diagnosis of Pigmented Skin Lesions—A Review. Biomed Signal Process Control (2018) 39:237–62. doi: 10.1016/j.bspc.2017.07.010
32. Brinker TJ, Hekler A, Utikal JS, Grabe N, Schadendorf D, Klode J, et al. Skin Cancer Classification Using Convolutional Neural Networks: Systematic Review. J Med Internet Res (2018) 20:e11936. doi: 10.2196/11936
33. Manne R, Kantheti S, Kantheti S. Classification of Skin Cancer Using Deep Learning, Convolutional Neural Networks-Opportunities and Vulnerabilities-a Systematic Review. Int J Modern Trends Sci Technol (2020) 6:2455–3778. doi: 10.46501/ijmtst061118
34. Chan S, Reddy V, Myers B, Thibodeaux Q, Brownstone N, Liao W. Machine Learning in Dermatology: Current Applications, Opportunities, and Limitations. Dermatol Ther (2020) 10:365–86. doi: 10.1007/s13555-020-00372-0
35. Haggenmüller S, Maron RC, Hekler A, Utikal JS, Barata C, Barnhill RL, et al. Skin Cancer Classification via Convolutional Neural Networks: Systematic Review of Studies Involving Human Experts. Eur J Cancer (2021) 156:202–16. doi: 10.1016/j.ejca.2021.06.049
36. Höhn J, Hekler A, Krieghoff-Henning E, Kather JN, Utikal JS, Meier F, et al. Integrating Patient Data Into Skin Cancer Classification Using Convolutional Neural Networks: Systematic Review. J Med Internet Res (2021) 23:e20708. doi: 10.2196/20708
37. Dildar M, Akram S, Irfan M, Khan HU, Ramzan M, Mahmood AR, et al. Skin Cancer Detection: A Review Using Deep Learning Techniques. Int J Environ Res Public Health (2021) 18:5479. doi: 10.3390/ijerph18105479
38. Sheng M, Tang M, Lin W, Guo L, He W, Chen W, et al. The Value of Preoperative High-Resolution Mri With Microscopy Coil for Facial Nonmelanoma Skin Cancers. Skin Res Technol (2021) 27:62–9. doi: 10.1111/srt.12909
39. Moreno-Ramirez D, Ferrandiz L. Skin Cancer Telemedicine. In: Telemedicine in Dermatology . New York, NY, United States:Springer (2012). p. 113–21.
40. Fujisawa Y, Otomo Y, Ogata Y, Nakamura Y, Fujita R, Ishitsuka Y, et al. Deep-Learning-Based, Computer-Aided Classifier Developed With a Small Dataset of Clinical Images Surpasses Board-Certified Dermatologists in Skin Tumour Diagnosis. Br J Dermatol (2019) 180:373–81. doi: 10.1111/bjd.16924
41. Zhang L, Yang G, Ye X. Automatic Skin Lesion Segmentation by Coupling Deep Fully Convolutional Networks and Shallow Network With Textons. J Med Imaging (2019) 6:024001. doi: 10.1117/1.JMI.6.2.024001
42. Kasuya A, Aoshima M, Fukuchi K, Shimauchi T, Fujiyama T, Tokura Y. An Intuitive Explanation of Dermoscopic Structures by Digitally Reconstructed Pathological Horizontal Top-Down View Images. Sci Rep (2019) 9:1–7. doi: 10.1038/s41598-019-56522-8
43. Pellacani G, Seidenari S. Comparison Between Morphological Parameters in Pigmented Skin Lesion Images Acquired by Means of Epiluminescence Surface Microscopy and Polarized-Light Videomicroscopy. Clinics Dermatol (2002) 20:222–7. doi: 10.1016/S0738-081X(02)00231-6
44. Ali ARH, Li J, Yang G. Automating the Abcd Rule for Melanoma Detection: A Survey. IEEE Access (2020) 8:83333–46. doi: 10.1109/ACCESS.2020.2991034
45. Henning JS, Dusza SW, Wang SQ, Marghoob AA, Rabinovitz HS, Polsky D, et al. The Cash (Color, Architecture, Symmetry, and Homogeneity) Algorithm for Dermoscopy. J Am Acad Dermatol (2007) 56:45–52. doi: 10.1016/j.jaad.2006.09.003
46. Lorentzen H, Weismann K, Petersen CS, Grønhøj Larsen F, Secher L, Skødt V. Clinical and Dermatoscopic Diagnosis of Malignant Melanoma: Assessed by Expert and Non-Expert Groups. Acta Dermato-venereol (1999) 79:301–4. doi: 10.1080/000155599750010715
47. Mendonça T, Ferreira PM, Marques JS, Marcal AR, Rozeira J. Ph 2-a Dermoscopic Image Database for Research and Benchmarking, in: 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). Manhattan, New York, U.S: Institute of Electrical and Electronics Engineers (IEEE) (2013). pp. 5437–40.
48. Brahmbhatt P, Rajan SN. Skin Lesion Segmentation Using Segnet With Binary Cross-Entropy. In: International Conference on Artificial Intelligence and Speech Technology (Aist2019) , 15th, vol. 14. Delhi, India: Excel India Publishers (2019).
49. Saba T, Khan MA, Rehman A, Marie-Sainte SL. Region Extraction and Classification of Skin Cancer: A Heterogeneous Framework of Deep Cnn Features Fusion and Reduction. J Med Syst (2019) 43:1–19. doi: 10.1007/s10916-019-1413-3
50. Basak H, Kundu R, Sarkar R. Mfsnet: A Multi Focus Segmentation Network for Skin Lesion Segmentation. Pattern Recognit (2022) 128:108673. doi: 10.1016/j.patcog.2022.108673
51. Matsunaga K, Hamada A, Minagawa A, Koga H. Image Classification of Melanoma, Nevus and Seborrheic Keratosis by Deep Neural Network Ensemble. ArXiv (2017) abs/1703.03108:arXiv:1703.03108. doi: 10.48550/arXiv.1703.03108
52. West J, Ventura D, Warnick S. Spring Research Presentation: A Theoretical Foundation for Inductive Transfer . Brigham Young University: College of Physical and Mathematical Sciences (2007).
53. Giotis I, Molders N, Land S, Biehl M, Jonkman MF, Petkov N. Med-Node: A Computer-Assisted Melanoma Diagnosis System Using non-Dermoscopic Images. Expert Syst Appl (2015) 42:6578–85. doi: 10.1016/j.eswa.2015.04.034
54. Manzo M, Pellino S. Bucket of Deep Transfer Learning Features and Classification Models for Melanoma Detection. J Imaging (2020) 6:129. doi: 10.3390/jimaging6120129
55. Hosny KM, Kassem MA, Foaud MM. Classification of Skin Lesions Using Transfer Learning and Augmentation With Alex-Net. PloS One (2019) 14:e0217293. doi: 10.1371/journal.pone.0217293
56. Hosny KM, Kassem MA, Foaud MM. Skin Melanoma Classification Using Roi and Data Augmentation With Deep Convolutional Neural Networks. Multimedia Tools Appl (2020) 79:24029–55. doi: 10.1007/s11042-020-09067-2
57. Codella N, Rotemberg V, Tschandl P, Celebi ME, Dusza S, Gutman D, et al. Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (Isic). ArXiv (2019) abs/1902.03368:arXiv:1902.03368. doi: 10.48550/arXiv.1902.03368
58. Tschandl P, Rosendahl C, Kittler H. The Ham10000 Dataset, a Large Collection of Multi-Source Dermatoscopic Images of Common Pigmented Skin Lesions. Sci Data (2018) 5:1–9. doi: 10.1038/sdata.2018.161
59. Young K, Booth G, Simpson B, Dutton R, Shrapnel S. Deep Neural Network or Dermatologist? In: Interpretability of Machine Intelligence in Medical Image Computing and Multimodal Learning for Clinical Decision Support . New York, NY, United States:Springer (2019). p. 48–55.
60. Polat K, Koc KO. Detection of Skin Diseases From Dermoscopy Image Using the Combination of Convolutional Neural Network and One-Versus-All. J Artif Intell Syst (2020) 2:80–97. doi: 10.33969/AIS.2020.21006
61. Datta SK, Shaikh MA, Srihari SN, Gao M. Soft Attention Improves Skin Cancer Classification Performance. In: Interpretability of Machine Intelligence in Medical Image Computing, and Topological Data Analysis and Its Applications for Medical Data . New York, NY, United States:Springer (2021). p. 13–23.
62. Kawahara J, Daneshvar S, Argenziano G, Hamarneh G. Seven-Point Checklist and Skin Lesion Classification Using Multitask Multimodal Neural Nets. IEEE J Biomed Health Inf (2018) 23:538–46. doi: 10.1109/JBHI.2018.2824327
63. Coppola D, Lee HK, Guan C. Interpreting Mechanisms of Prediction for Skin Cancer Diagnosis Using Multi-Task Learning. Proc IEEE/CVF Conf Comput Vision Pattern Recognit Workshops (2020), 734–5. doi: 10.1109/CVPRW50498.2020.00375
64. Yao P, Shen S, Xu M, Liu P, Zhang F, Xing J, et al. Single Model Deep Learning on Imbalanced Small Datasets for Skin Lesion Classification. IEEE Trans Med Imaging (2021) 41:1242–54. doi: 10.1109/TMI.2021.3136682
65. Combalia M, Codella NC, Rotemberg V, Helba B, Vilaplana V, Reiter O, et al. Bcn20000: Dermoscopic Lesions in the Wild. ArXiv (2019) abs/1908.02288:arXiv:1908.02288. doi: 10.48550/arXiv.1908.02288
66. Mou Y, Welten S, Yediel YU, Kirsten T, Beyan OD. Distributed Learning for Melanoma Classification Using Personal Health Train. ArXiv (2021) abs/2103.13226:arXiv:2103.13226. doi: 10.48550/arXiv.2103.13226
67. Maron RC, Schlager JG, Haggenmüller S, von Kalle C, Utikal JS, Meier F, et al. A Benchmark for Neural Network Robustness in Skin Cancer Classification. Eur J Cancer (2021) 155:191–9. doi: 10.1016/j.ejca.2021.06.047
68. Gessert N, Nielsen M, Shaikh M, Werner R, Schlaefer A. Skin Lesion Classification Using Ensembles of Multi-Resolution Efficientnets With Meta Data. MethodsX (2020) 7:100864. doi: 10.1016/j.mex.2020.100864
69. Gutman D, Codella NC, Celebi E, Helba B, Marchetti M, Mishra N, et al. Skin Lesion Analysis Toward Melanoma Detection: A Challenge at the International Symposium on Biomedical Imaging (Isbi) 2016, Hosted by the International Skin Imaging Collaboration (Isic). IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018) (2016) abs/1605.01397:arXiv:1605.01397. doi: 10.1109/ISBI.2018.8363547
70. Cassidy B, Kendrick C, Brodzicki A, Jaworek-Korjakowska J, Yap MH. Analysis of the Isic Image Datasets: Usage, Benchmarks and Recommendations. Med Image Anal (2022) 75:102305. doi: 10.1016/j.media.2021.102305
71. Lopez AR, Giro-i Nieto X, Burdick J, Marques O. Skin Lesion Classification From Dermoscopic Images Using Deep Learning Techniques, in: 13th IASTED International Conference on Biomedical Engineering (BioMed), Manhattan, New York, U.S: Institute of Electrical and Electronics Engineers (IEEE) (2017). pp. 49–54.
72. Nozdryn-Plotnicki A, Yap J, Yolland W. (2018). Ensembling Convolutional Neural Networks for Skin Cancer Classification. International Skin Imaging Collaboration (ISIC) Challenge on Skin Image Analysis for Melanoma Detection. MICCAI.
73. Bevan PJ, Atapour-Abarghouei A. Skin Deep Unlearning: Artefact and Instrument Debiasing in the Context of Melanoma Classification. ArXiv (2021) abs/2109.09818:arXiv:2109.09818. doi: 10.48550/arXiv.2109.09818
74. Ali AR, Li J, Yang G, O’Shea SJ. A Machine Learning Approach to Automatic Detection of Irregularity in Skin Lesion Border Using Dermoscopic Images. PeerJ Comput Sci (2020) 6:e268. doi: 10.7717/peerj-cs.268
75. Krizhevsky A, Sutskever I, Hinton GE. Imagenet Classification With Deep Convolutional Neural Networks. Adv Neural Inf Process Syst (2012) 60:84–90. doi: 10.1145/3065386
76. Masood A, Al-Jumaily A, Anam K. Self-Supervised Learning Model for Skin Cancer Diagnosis, in: 7th International IEEE/EMBS Conference on Neural Engineering (NER), Manhattan, New York, U.S.: Institute of Electrical and Electronics Engineers (IEEE) (2015). pp. 1012–5. doi: 10.1109/NER.2015.7146798
77. Mishra S, Imaizumi H, Yamasaki T. Interpreting Fine-Grained Dermatological Classification by Deep Learning. Proc IEEE/CVF Conf Comput Vision Pattern Recognit Workshops (2019), 0–0. doi: 10.1109/CVPRW.2019.00331
78. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, et al. Dermatologist-Level Classification of Skin Cancer With Deep Neural Networks. Nature (2017) 542:115–8. doi: 10.1038/nature21056
79. Nasr-Esfahani E, Samavi S, Karimi N, Soroushmehr S, Jafari M, Ward K, et al. Melanoma Detection by Analysis of Clinical Images Using Convolutional Neural Network, in: 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Manhattan, New York, U.S.: Institute of Electrical and Electronics Engineers (IEEE) (2016). pp. 1373–6. doi: 10.1109/EMBC.2016.7590963
80. Walker B, Rehg J, Kalra A, Winters R, Drews P, Dascalu J, et al. Dermoscopy Diagnosis of Cancerous Lesions Utilizing Dual Deep Learning Algorithms via Visual and Audio (Sonification) Outputs: Laboratory and Prospective Observational Studies. EBioMedicine (2019) 40:176–83. doi: 10.1016/j.ebiom.2019.01.028
81. Jaworek-Korjakowska J, Kleczek P, Gorgon M. Melanoma Thickness Prediction Based on Convolutional Neural Network With Vgg-19 Model Transfer Learning. Proc IEEE/CVF Conf Comput Vision Pattern Recognit Workshops (2019), 0–0. doi: 10.1109/CVPRW.2019.00333
82. Kawahara J, Daneshvar S, Argenziano G, Hamarneh G. Seven-Point Checklist and Skin Lesion Classification Using Multitask Multimodal Neural Nets. IEEE J Biomed Health Inf (2019) 23:538–46. doi: 10.1109/JBHI.2018.2824327
83. Rahman Z, Hossain MS, Islam MR, Hasan MM, Hridhee RA. An Approach for Multiclass Skin Lesion Classification Based on Ensemble Learning. Inf Med Unlocked (2021) 25:100659. doi: 10.1016/j.imu.2021.100659
84. Bisla D, Choromanska A, Berman RS, Stein JA, Polsky D. Towards Automated Melanoma Detection With Deep Learning: Data Purification and Augmentation. Proc IEEE/CVF Conf Comput Vision Pattern Recognit Workshops (2019), 0–0. doi: 10.1109/CVPRW.2019.00330
85. Qin Z, Liu Z, Zhu P, Xue Y. A Gan-Based Image Synthesis Method for Skin Lesion Classification. Comput Methods Programs Biomed (2020) 95:105568. doi: 10.1016/j.cmpb.2020.105568
86. Kaur R, GholamHosseini H, Sinha R. Synthetic Images Generation Using Conditional Generative Adversarial Network for Skin Cancer Classification, in: TENCON 2021-2021 IEEE Region 10 Conference (TENCON), Manhattan, New York, U.S.: Institute of Electrical and Electronics Engineers (IEEE) (2021). pp. 381–6.
87. Lee KW, Chin RKY. The Effectiveness of Data Augmentation for Melanoma Skin Cancer Prediction Using Convolutional Neural Networks, in: IEEE 2nd International Conference on Artificial Intelligence in Engineering and Technology (IICAIET), Manhattan, New York, U.S.: Institute of Electrical and Electronics Engineers (IEEE) (2020). pp. 1–6.
88. Ahmad B, Jun S, Palade V, You Q, Mao L, Zhongjie M. Improving Skin Cancer Classification Using Heavy-Tailed Student T-Distribution in Generative Adversarial Networks (Ted-Gan). Diagnostics (2021) 11:2147. doi: 10.3390/diagnostics11112147
89. Abdelhalim ISA, Mohamed MF, Mahdy YB. Data Augmentation for Skin Lesion Using Self-Attention Based Progressive Generative Adversarial Network. Expert Syst With Appl (2021) 165:113922. doi: 10.1016/j.eswa.2020.113922
90. Le DN, Le HX, Ngo LT, Ngo HT. Transfer Learning With Class-Weighted and Focal Loss Function for Automatic Skin Cancer Classification. ArXiv (2020) abs/2009.05977:arXiv:2009.05977. doi: 10.48550/arXiv.2009.05977
91. Pham TC, Doucet A, Luong CM, Tran CT, Hoang VD. Improving Skin-Disease Classification Based on Customized Loss Function Combined With Balanced Mini-Batch Logic and Real-Time Image Augmentation. IEEE Access (2020) 8:150725–37. doi: 10.1109/ACCESS.2020.3016653
92. Shen S, Xu M, Zhang F, Shao P, Liu H, Xu L, et al. Low-Cost and High-Performance Data Augmentation for Deep-Learning-Based Skin Lesion Classification. ArXiv (2021) abs/2101.02353:arXiv:2101.02353. doi: 10.34133/2022/9765307
93. Castro PB, Krohling B, Pacheco AG, Krohling RA. An App to Detect Melanoma Using Deep Learning: An Approach to Handle Imbalanced Data Based on Evolutionary Algorithms, in: International Joint Conference on Neural Networks (IJCNN), Manhattan, New York, U.S.: Institute of Electrical and Electronics Engineers (IEEE) (2020). pp. 1–6.
94. Abayomi-Alli OO, Damasevicius R, Misra S, Maskeliunas R, Abayomi-Alli A. Malignant Skin Melanoma Detection Using Image Augmentation by Oversampling in Nonlinear Lower-Dimensional Embedding Manifold. Turkish J Electric Eng Comput Sci (2021) 29:2600–14. doi: 10.3906/elk-2101-133
95. Mustafa B, Loh A, Freyberg J, MacWilliams P, Wilson M, McKinney SM, et al. Supervised Transfer Learning at Scale for Medical Imaging. ArXiv (2021) abs/2101.05913:arXiv:2101.05913. doi: 10.48550/arXiv.2101.05913
96. Gu Y, Ge Z, Bonnington CP, Zhou J. Progressive Transfer Learning and Adversarial Domain Adaptation for Cross-Domain Skin Disease Classification. IEEE J Biomed Health Inf (2020) 24:1379–93. doi: 10.1109/JBHI.2019.2942429
97. Huq A, Pervin MT. Analysis of Adversarial Attacks on Skin Cancer Recognition, in: International Conference on Data Science and Its Applications (ICoDSA), Manhattan, New York, U.S.:Institute of Electrical and Electronics Engineers (IEEE) (2020). 1–4 pp.
98. Zunair H, Hamza AB. Melanoma Detection Using Adversarial Training and Deep Transfer Learning. Phys Med Biol (2020) 65:135005. doi: 10.1088/1361-6560/ab86d3
99. Bian J, Zhang S, Wang S, Zhang J, Guo J. Skin Lesion Classification by Multi-View Filtered Transfer Learning. IEEE (2021) 9:66052–61. doi: 10.1109/ACCESS.2021.3076533
100. Wei Z, Song H, Chen L, Li Q, Han G. Attention-Based Denseunet Network With Adversarial Training for Skin Lesion Segmentation. IEEE Access (2019) 7:136616–29. doi: 10.1109/ACCESS.2019.2940794
101. Hirano H, Minagi A, Takemoto K. Universal Adversarial Attacks on Deep Neural Networks for Medical Image Classification. BMC Med Imaging (2021) 21:1–13. doi: 10.1186/s12880-020-00530-y
102. Abdar M, Samami M, Mahmoodabad SD, Doan T, Mazoure B, Hashemifesharaki R, et al. Uncertainty Quantification in Skin Cancer Classification Using Three-Way Decision-Based Bayesian Deep Learning. Comput Biol Med (2021) 13:104418. doi: 10.1016/j.compbiomed.2021.104418
103. Alzubaidi L, Al-Amidie M, Al-Asadi A, Humaidi AJ, Al-Shamma O, Fadhel MA, et al. Novel Transfer Learning Approach for Medical Imaging With Limited Labeled Data. Cancers (2021) 13:1590. doi: 10.3390/cancers13071590
104. Li H, Wang Y, Wan R, Wang S, Li TQ, Kot A. Domain Generalization for Medical Imaging Classification With Linear-Dependency Regularization. Adv Neural Inf Process Syst (2020) 33: 3118–29.
105. Yang Y, Cui Z, Xu J, Zhong C, Wang R, Zheng WS. Continual Learning With Bayesian Model Based on a Fixed Pre-Trained Feature Extractor. In: International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer) . New York, NY, United States: Springer (2021). p. 397–406.
106. Goldstein O, Kachuee M, Sarrafzadeh M. Decentralized Knowledge Transfer on Edge Networks for Detecting Cancer in Images, in: IEEE EMBS International Conference on Biomedical and Health Informatics (BHI), Manhattan, New York, U.S:Institute of Electrical and Electronics Engineers (IEEE) (2021). 1–5 pp.
107. Velasco J, Pascion C, Alberio JW, Apuang J, Cruz JS, Gomez MA, et al. A Smartphone-Based Skin Disease Classification Using Mobilenet Cnn. ArXiv (2019) abs/1911.07929:arXiv:1911.07929. doi: 10.30534/ijatcse/2019/116852019
108. Pedro R, Oliveira AL. Assessing the Impact of Attention and Self-Attention Mechanisms on the Classification of Skin Lesions. ArXiv (2021) abs/2112.12748:arXiv:2112.12748. doi: 10.48550/arXiv.2112.12748
109. Xiang K, Peng L, Yang H, Li M, Cao Z, Jiang S, et al. A Novel Weight Pruning Strategy for Light Weight Neural Networks With Application to the Diagnosis of Skin Disease. Appl Soft Comput (2021) 111:107707. doi: 10.1016/j.asoc.2021.107707
110. Bayasi N, Hamarneh G, Garbi R. Culprit-Prune-Net: Efficient Continual Sequential Multi-Domain Learning With Application to Skin Lesion Classification. In: International Conference on Medical Image Computing and Computer-Assisted Intervention . New York, NY, United States:Springer (2021). p. 165–75.
111. Alche MN, Acevedo D, Mejail M. Efficientarl: Improving Skin Cancer Diagnoses by Combining Lightweight Attention on Efficientnet. Proc IEEE/CVF Int Conf Comput Vision (2021), 3354–60. doi: 10.1109/ICCVW54120.2021.00374
112. Yilmaz A, Kalebasi M, Samoylenko Y, Guvenilir ME, Uvet H. Benchmarking of Lightweight Deep Learning Architectures for Skin Cancer Classification Using Isic 2017 Dataset. ArXiv} (2021) abs/2110.12270:arXiv:2110.12270. doi: 10.48550/arXiv.2110.12270
113. Chen J, Chen J, Zhou Z, Li B, Yuille A, Lu Y. Mt-Transunet: Mediating Multi-Task Tokens in Transformers for Skin Lesion Segmentation and Classification. ArXiv (2021) abs/2112.01767:arXiv:2112.01767. doi: 10.48550/arXiv.2112.01767
114. Hajabdollahi M, Esfandiarpoor R, Khadivi P, Soroushmehr SMR, Karimi N, Samavi S. Simplification of Neural Networks for Skin Lesion Image Segmentation Using Color Channel Pruning. Comput Med Imaging Graphics (2020) 82:101729. doi: 10.1016/j.compmedimag.2020.101729
115. Back S, Lee S, Shin S, Yu Y, Yuk T, Jong S, et al. Robust Skin Disease Classification by Distilling Deep Neural Network Ensemble for the Mobile Diagnosis of Herpes Zoster. IEEE (2021) 9:20156–69. doi: 10.1109/ACCESS.2021.3054403
116. Hameed N, Shabut A, Hameed F, Cirstea S, Harriet S, Hossain A. Mobile Based Skin Lesions Classification Using Convolution Neural Network. Ann Emerging Technol Comput (AETiC) (2020) 4:26–37. doi: 10.33166/AETiC.2020.02.003
117. Simonyan K, Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition. CoRR (2014) abs/1409.1556:arXiv:1409.1556. doi: 10.48550/arXiv.1409.1556
118. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, et al. Going Deeper With Convolutions. Proc IEEE Conf Comput Vision Pattern Recognit (2015), 1–9. doi: 10.1109/CVPR.2015.7298594
119. He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. Proc IEEE Conf Comput Vision Pattern Recognit (2016), 770–8. doi: 10.1109/CVPR.2016.90
120. Marcus G, Davis E. Rebooting Ai. In: Building Artificial Intelligence We can Trust (Vintage) . New York, NY, United States: Vintage (2019).
121. Rotemberg V, Kurtansky N, Betz-Stablein B, Caffery L, Chousakos E, Codella N, et al. A Patient-Centric Dataset of Images and Metadata for Identifying Melanomas Using Clinical Context. Sci Data (2021) 8:81–8. doi: 10.1038/s41597-021-00815-z
122. Mikołajczyk A, Grochowski M. Data Augmentation for Improving Deep Learning in Image Classification Problem, in: international interdisciplinary PhD workshop (IIPhDW), Manhattan, New York, U.S.:Institute of Electrical and Electronics Engineers (IEEE) (2018) 117–22 pp.
123. Tizhoosh HR, Pantanowitz L. Artificial Intelligence and Digital Pathology: Challenges and Opportunities. J Pathol Inf (2018) 9:1–6 doi: 10.4103/jpi.jpi_53_18
124. Blalock D, Ortiz JJG, Frankle J, Guttag J. What is the State of Neural Network Pruning? ArXiv (2020) abs/2003.03033:arXiv:2003.03033.
125. Hinton GE, Vinyals O, Dean J. Distilling the Knowledge in a Neural Network. ArXiv (2015) abs/1503.02531:ArXiv abs/1503.02531. doi: 10.48550/arXiv.1503.02531
126. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is All You Need. Adv Neural Inf Process Syst (2017) 30:5998–6008. doi: 10.48550/arXiv.1706.03762
127. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. ArXiv (2020) abs/2010.11929:arXiv:2010.11929. doi: 10.48550/arXiv.2010.11929
Keywords: generative adversarial networks, convolutional neural network, deep learning, skin cancer, image classification
Citation: Wu Y, Chen B, Zeng A, Pan D, Wang R and Zhao S (2022) Skin Cancer Classification With Deep Learning: A Systematic Review. Front. Oncol. 12:893972. doi: 10.3389/fonc.2022.893972
Received: 11 March 2022; Accepted: 16 May 2022; Published: 13 July 2022.
Copyright © 2022 Wu, Chen, Zeng, Pan, Wang and Zhao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Shen Zhao, [email protected]
Journal of Electrical Systems and Information Technology volume 11, Article number: 36 (2024)
Skin conditions are becoming increasingly prevalent across the world in current times. With the rise in dermatological disorders, there is a need for computerized techniques that are completely noninvasive to patients’ skin. As a result, deep learning models have become standard for the computerized detection of skin diseases. The performance efficiency of these models improves with access to more data with their primary aim being image classification. In this paper, we present a skin disease detection methodology using image processing techniques, non-local means denoising and convolutional neural network (CNN) backed by sparse dictionary learning. Here, the major benefit of using NLM denoising followed by sparse dictionary learning with CNNs in image classification lies in leveraging a multi-stage approach that enhances the quality of input data, extracts meaningful and discriminative features, and improves the overall performance of the classification model. This combined approach addresses challenges such as noise robustness, feature extraction, and classification accuracy, making it particularly effective in complex image analysis tasks. For denoising, the average Peak Signal to Noise Ratio (PSNR) obtained for images from HAM-10000 dataset is 33.59 dB. For the ISIC-2019 dataset, the average PSNR for the train folder is 34.37 dB, and for the test folder it is 34.39 dB. The deep learning network is trained for the analysis of skin cancer images using a CNN model and is achieving acceptable results in classifying skin cancer types. The datasets used contain high-resolution images. After all the tests, the accuracy obtained is 85.61% for the HAM-10000 dataset and 81.23% for the ISIC-2019 dataset, which is on par with existing approaches validated by benchmarking results.
The skin plays a key role by providing a barrier against invading pathogens and protecting against bacterial, fungal, and parasitic infections, making it a very important organ. An aberrant state of the skin is termed a skin disease. Therefore, the proper diagnosis of skin diseases has a crucial role in determining the most effective treatment plan and achieving successful outcomes at the earliest.
Skin diseases such as skin cancer, if diagnosed and treated early, have a substantial possibility of recovery. According to the World Health Organization (WHO), skin diseases are estimated to affect 10–12% of the population in India [ 1 ]. There are very few skilled dermatologists available to attend to this vast population. Experienced dermatologists conventionally follow a procedure that begins with naked-eye inspection of suspected lesions, followed by dermatoscopy, and then a biopsy to confirm the condition. This whole process is invasive to the patient’s skin and takes a significant amount of time, during which the disease might advance to a subsequent stage [ 2 ]. Furthermore, the reliability of the diagnosis is inconsistent, as it depends on the practitioner’s expertise. These factors make the manual detection of skin diseases challenging and onerous for patients. The prognosis for a particular condition may also vary with the proficiency of clinicians and their familiarity with the dermatoscopy procedures in which they have received formal training. Hence, to diagnose skin cancer at an early phase and address the related issues, extensive research has been conducted to enhance algorithms for computer-based image analysis.
There are various traditional or handcrafted feature-based classification approaches, in which the feature selection process is time-consuming and the selection of pertinent features is critical for correct detection. Several machine learning models, such as SVM and logistic regression, can be utilized for classification. Another efficient way of tackling disease classification is to employ Deep Learning models. Deep Learning is widely used as a classification method because features are computed within the convolutional layers rather than designed by hand, and it has been shown to outperform classical methods in accuracy. For Convolutional Neural Networks (CNNs), various pre-trained models like U-Net, ResNet, AlexNet and MobileNet have been developed and their efficiency is constantly evolving. The key to the success of CNNs is that the feature extraction part is not fixed. It provides a framework that is flexible enough to learn a suitable feature representation for the given task and yet constrained enough to be tractable and avoid the issue of overfitting. The advantage of the classifier component is its convenience, as it facilitates joint training through backpropagation, while the learned feature extraction component of the network remains intact.
In this paper, image processing and CNN along with sparse dictionary learning are employed for the detection of dermatological disorders in humans. The paper underlines the present techniques utilized for the detection of skin diseases, presents an approach for detection, and lists the numerous benefits. The article also includes an in-depth exploration of the various transformations employed for the implementation of the proposed method and the comparisons made are based on the accuracy parameter.
The main propositions for this paper are:
Proposal of a CNN Model: This paper aims to propose a CNN model approach to aid in the development of digital models for detecting skin diseases.
Image Processing and Noise Removal: To significantly improve accuracy, the use of Non-local Means for denoising is proposed.
Sparse Dictionary Learning: Implementation of sparse dictionary learning to further enhance the classification efficiency of the model.
Non-Local Means (NLM) denoising excels in preserving structural details and improving image quality by averaging similar patches from across the image, rather than just nearby pixels. This method effectively reduces noise by leveraging the redundancy of natural image textures and structures. NLM’s adaptive filtering, which adjusts based on local content, allows it to handle varying noise levels and textures robustly. The result is enhanced perceptual quality, maintaining sharpness and clarity without excessive blurring. Its versatility makes it ideal for medical imaging applications, where high image quality is essential.
Sparse-based dictionary learning provides a compact and efficient data representation using a sparse linear combination of atoms from a learned dictionary. This method requires only a few atoms to approximate each data point, resulting in a more compact representation than other techniques. It reduces storage needs and speeds up computational tasks in classification and image processing. The learned dictionary captures the underlying data structure, making it an effective feature extraction tool that uncovers important patterns. Additionally, sparse representations are robust to noise, as they focus on significant coefficients, minimizing the impact of noise on data representation. The dictionary adapts to the specific features of data, further enhancing its ability to capture domain-specific features, making sparse-based dictionary learning efficient and effective for data representation.
The datasets utilized in this work include:
HAM-10000 Dataset: An open-source dataset available on the internet for public use, containing 10,015 images of seven types of pigmented skin lesion images.
ISIC-2019 Dataset: A dataset with a similar classification of pigmented skin lesion images, also used for this study.
The rest of this paper is organized as follows. Previous work is reviewed in Section “Literature survey”. The proposed work is explained in detail in Section “Proposed method”. Experiments and results are discussed in Section “Experiments and results”. Finally, the conclusion and future scope are given in Section “Conclusion and future scope”.
For the rapid detection and classification of skin diseases and to address the difficulties in diagnosing these conditions, many research solutions have been devised over the years by improving computer image analysis algorithms. Although the type of data cannot be regulated, the algorithms have potential for improvement and can diagnose diseases more precisely. Consequently, many researchers are continuously working on developing efficient systems, and some relevant work is discussed below.
In the work by [ 3 ], novel image processing techniques and geometry-based features of skin cancer, involving the ABCD rule and geometrical features of lesions, are used for the detection of melanoma lesions. In [ 4 ], the authors aimed to classify skin diseases by preprocessing, segmentation and feature extraction using the ABCD rule, which assesses parameters such as asymmetry, border irregularity, color, and lesion diameter. They achieved an accuracy of up to 91.6% for detecting melanoma.
As discussed in [ 5 ], the authors focused on detecting melanocytic lesions using a dataset collected from Dermweb, where segmentation is executed using clustering and classification using SVM. The accuracy obtained is 96.8%. However, this study addresses only the detection of melanocytic skin lesions, i.e. melanoma. In similar work, presented in [ 6 ], a pre-trained neural network model is used for feature extraction and an Error-Correcting Output Codes (ECOC) SVM classifier is employed for skin cancer classification. The work in [ 7 ] demonstrated that while a CNN alone achieved an accuracy of about 91%, this was enhanced to 95.3% when combined with SVM. In [ 8 ], an SVM classifier is used with extracted features of shape, color and GLCM to classify images into malignant or benign classes. Another study on the SVM classifier is discussed in [ 9 ], where color segmentation combined with an SVM classifier recognized eight classes of skin diseases with a performance of 94.79%.
The authors of [ 10 ] applied K-means clustering to gamma-corrected images for melanoma segmentation, using GLCM to compute textural features from the segmented images. Their model distinguishes melanoma into various types, such as nodular, acral, superficial and lentigo, achieving 90% accuracy.
In [ 11 ], the authors used the Gaussian Radial Basis Kernel in SVM, where segmentation was performed by the GrabCut algorithm. Features like shape, color and geometry were computed using image processing techniques, and lesions were then classified as non-cancerous (benign mole) or cancerous (malignant). The best feature combination achieved an accuracy of 86.67%.
In [ 12 ], a method that takes a color image as input, pre-processes it, and passes it to a trained CNN achieved a 100% performance rate for detecting three different types of skin diseases. The features were classified using SVM, with results presented to the user including details on disease type, severity and extent of spread.
In [ 13 ], a dual-stage approach combining Computer Vision and Machine Learning was proposed. It uses maximum entropy with ANN in the first stage, followed by KNN, decision trees, and ANN for prediction in the second stage, achieving an accuracy of up to 95% for six diseases. The authors in [ 14 ] also proposed a system for detecting nine types of dermatological skin conditions. They first pre-processed color skin images to extract significant features, followed by disease identification using ANN, in a two-phase technique achieving an accuracy rate of 90%. The study in [ 15 ] focuses on developing a malignant melanoma detection system using dermatoscopic images and available patient records. The model, which works on resource-poor devices, is a balanced CNN+ANN model with a higher accuracy (92.34%) than the CNN model alone (73.69%), based on the ISIC Archive dataset.
As outlined by [ 16 ], their proposed model aggregates a robust CNN into a framework and performs final classification based on the weighted outputs of member CNNs.
In [ 2 ], the authors employed a deep learning model to analyze cancer images from more than 24,000 skin samples in the benchmark ISIC dataset using three ConvNet architectures: VGG-19, ResNet and InceptionV3. Numerous parameters were utilized to determine the optimal architecture for classification. The best accuracy of 86.90% was achieved with InceptionV3, which also yielded a precision of 87.47%, sensitivity of 86.14%, and specificity of 87.66%. Similarly, the image classification task in [ 17 ] used Inception V3 and MobileNet V1. Here, the Inception V3 model achieved 72% accuracy while MobileNet V1 showed 58% accuracy. Research in [ 18 ] attained an overall accuracy of 83.1% using a pre-trained MobileNet.
In recent work [ 19 ], the authors developed a deep learning framework that uses a CNN, specifically VGG-16, to classify skin diseases from dermoscopy images and to generalize and accurately predict skin diseases. The predictions are then analyzed and the results are displayed through a web application, providing a practical interface for users to view the outcomes. Another recent work on VGG, [ 20 ], investigates the application of machine learning (ML) algorithms, particularly CNNs, in diagnosing skin diseases using the ISIC and Dermofit datasets, which contain 26,150 images. The study found that the CNN model, especially with Transfer Learning (TL) using VGG19, achieved a high accuracy of 94.91% in detecting various skin conditions, and evaluated it with metrics like Receiver Operating Characteristics (ROC) and Mean Squared Error (MSE).
In another work [ 21 ], the authors focused on the classification of nine skin diseases that share similar physical characteristics, using a diverse dataset. Preprocessing techniques were employed to prepare the data for training and testing. The LeNet architecture was then applied to classify these diseases, achieving a 95% accuracy rate. The proposed deep learning model not only demonstrated state-of-the-art performance in skin disease detection but also excelled in segmentation tasks, accurately delineating diseased regions. The model achieved a sensitivity of 90%, specificity of 94%, precision of 92%, and recall of 94%, underscoring its effectiveness in skin disease diagnosis.
The work in [ 22 ] introduced a new deep learning model based on optimal probability, where features extracted from pre-processed images were applied to the model for training using MATLAB software. This approach achieved an accuracy of 95%, specificity of 0.97, and sensitivity of 0.91. In [ 23 ], a region-based CNN combined with fuzzy k-means (FKM) clustering was employed. In this method, visual information underwent enhancement before applying Faster R-CNN to obtain fixed-length feature vectors. FKM was then used for segmenting skin regions affected by melanoma. This method delivered an average accuracy of 95.40% on the ISIC-2016 dataset, 93.1% on ISIC-2017, and 95.6% on the PH2 dataset.
The work by [ 24 ] integrated imaging modalities and patient metadata to improve the precision of automated skin lesion diagnosis. They used a modified ResNet-50 architecture and a multimodal classifier, surpassing the capabilities of a baseline classifier for melanoma detection.
In [ 25 ], a DCNN with a cross-modality learning strategy was proposed, extracting comprehensive features from sub-networks using Class Activation Mapping-Bilinear Pooling (CAM-BP). This technique produced probability maps that improved overall performance and facilitated the decision-making.
The authors in [ 26 ] worked on a modified GoogLeNet algorithm to reduce the time complexity and achieved the highest classification accuracy of 0.9309.
In [ 27 ], the authors proposed an automated system for computer-aided diagnosis that focused on the classification of multi-class skin (MCS) cancer. The fine-tuning was performed on the seven classes from the HAM-10000 dataset, comparing five trained CNNs and four ensemble models. They reported a maximum accuracy of 93.20% with the individual ResNeXt101 model and 92.83% with the ensemble model. ResNeXt101 was recommended for the MCS cancer classification due to its optimally designed architecture and superior accuracy.
In [ 28 ], an acne type classification system using a custom sequential CNN model was studied, examining accuracy variations based on model parameters. The accuracy ranged from 90 to 95%, with skin sensitivity and acne density varying from 93 to 96%. The authors of [ 29 ] proposed a custom CNN model with image processing techniques on a dormant dataset of 500 images, successfully detecting various diseases with a precision of 73%. The work in [ 30 ] employed a CNN-based algorithm trained on clinical images encompassing forty skin diseases. This customized variant of DenseNet-161, with a hybrid implementation combining focal loss and log loss, achieved 76.93 ± 0.88% accuracy and an average AUC of 0.95 ± 0.02 on a collection of affected skin images. The clinical diagnosis of a broad range of skin conditions highlights the potential of smartphone applications powered by artificial intelligence as convenient point-of-care guidance tools.
In recent work by [ 31 ], a hybrid architecture combining deep CNN techniques inspired by pre-trained models was proposed for the detection of skin cancer. It included three core techniques: uniform distribution of convolutional filters throughout the entire architecture, use of residual connections to address vanishing gradients, and a cyclic learning rate for annealing. The authors of [ 32 ] described a model that cascades ensemble networks, leveraging an integrated ConvNet and a multi-layer perceptron based on computed handcrafted features. This approach showed a significant improvement in the performance of the ensemble model, with an increase from 85.3 to 98.3% compared to the CNN.
In [ 33 ], a comprehensive survey of various deep learning techniques for diagnosing skin diseases was presented, including the performance evaluation metrics for various deep learning architectures, popular frameworks, and algorithms.
Hence, the literature reviewed in this paper demonstrates that machine learning in skin disease detection is a rapidly growing field with tremendous potential to enhance diagnostic accuracy and efficiency, potentially revolutionizing the field of dermatology.
The proposed method involves a multi-step process, beginning with pre-processing, which includes denoising the image, followed by the implementation of sparse dictionary learning to achieve enhanced accuracy, and finally using a CNN model for classification. The block diagram of the method is shown in Fig. 1.
Block diagram of the proposed model
The initial stage of the process involves resizing, denoising, and data augmentation, which are critical for preparing the data for accurate and efficient analysis. These steps are shown in Fig. 2.
Preprocessing and data augmentation
The first step is resizing, where the images are resized to 100 × 100. Image noise is then removed using the Non-local Means Denoising algorithm, as implemented in the OpenCV library.
Non-Local Means (NLM) denoising is used because of its excellent capability to preserve image details and textures while effectively handling various types of noise. A comparative study of some denoising techniques is provided in Table 1. NLM denoising works efficiently for both Gaussian and salt-and-pepper noise, and preserves skin textures better than other techniques. The algorithm works on the fundamental principle that the color of a single pixel can be replaced by an average of the colors of pixels whose surrounding sub-windows are similar to the neighborhood of that pixel. As seen in Fig. 3, the pixel value is substituted by a mean computed over a set of neighboring pixels. This involves comparing the patches centered on the neighboring pixels with the patch centered on the pixel of interest. Only those pixels whose patches are similar to the current patch are used for averaging. The computation of non-local means is given in Eq. 1.
\[ NLu(p) = \frac{1}{C(p)} \int f\big(d(B(p), B(q))\big)\, u(q)\, dq \qquad (1) \]
where NLu(p) represents the result of applying the Non-Local Means filter at pixel p, and d(B(p), B(q)) denotes the Euclidean distance between the image patches centered at p and q respectively. u(q) is the pixel value at q, which is weighted by the function f based on the distance between patches. C(p) denotes the normalization factor and f is a decreasing function.
For pixel-wise implementation,
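\[ \hat{u}_i(p) = \frac{1}{C(p)} \sum_{q} u_i(q)\, w(p,q), \qquad C(p) = \sum_{q} w(p,q) \qquad (2) \]
where \(\hat{u}_i(p)\) represents the denoised value at pixel p, C(p) is the normalization factor for pixel p, and the sums run over the pixels q in the search window around p. \(u_i(q)\) is the pixel value at q, which is weighted by w(p,q), the weight assigned to pixel q based on the similarity between the patch centered at p and the patch centered at q.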
Further for a patch this could be represented as,
where Bi represents the denoised value of the patch centered at pixel p, C is the normalization factor for patch B. ui(Q) is the value of the patch Q. w(B,Q) is the weight assigned to patch Q based on its similarity to patch B. N is normalization constant, representing the size of the neighborhood.
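The pixel-wise averaging described above can be illustrated with a short NumPy sketch. This is a minimal illustration for a single-channel image and is not the OpenCV routine used in this work; the function name `nlm_pixelwise`, the exponential weight `exp(-d^2 / h^2)` chosen as the decreasing function f, the filtering strength `h`, and the reflect padding at the borders are all assumptions made for the example.

```python
import numpy as np


def nlm_pixelwise(img, patch_radius=3, search_radius=10, h=0.1):
    """Pixel-wise non-local means for a 2-D grayscale image (hypothetical helper).

    patch_radius=3 gives 7 x 7 patches and search_radius=10 gives a 21 x 21
    search window. The weight exp(-d^2 / h^2) is an assumed choice of the
    decreasing function f.
    """
    img = np.asarray(img, dtype=np.float64)
    rows, cols = img.shape
    # Reflect-pad so that every candidate pixel has a full patch around it.
    pad = patch_radius + search_radius
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros_like(img)

    for r in range(rows):
        for c in range(cols):
            pr, pc = r + pad, c + pad
            ref = padded[pr - patch_radius:pr + patch_radius + 1,
                         pc - patch_radius:pc + patch_radius + 1]
            weighted_sum = 0.0
            norm = 0.0  # plays the role of C(p)
            for dr in range(-search_radius, search_radius + 1):
                for dc in range(-search_radius, search_radius + 1):
                    qr, qc = pr + dr, pc + dc
                    cand = padded[qr - patch_radius:qr + patch_radius + 1,
                                  qc - patch_radius:qc + patch_radius + 1]
                    # Mean squared difference between the two patches.
                    d2 = np.mean((ref - cand) ** 2)
                    w = np.exp(-d2 / (h * h))
                    weighted_sum += w * padded[qr, qc]
                    norm += w
            out[r, c] = weighted_sum / norm
    return out
```

The nested loops keep the correspondence with the weighted sum explicit; they are far too slow for full datasets, which is one reason an optimized library implementation is normally preferred.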
The implementation uses a patch size of 7 × 7 and a window size of 21 × 21. This algorithm is capable of effectively restoring textures that are blurred by other denoising algorithms. Figure 4 illustrates the flowchart describing the main steps of the denoising algorithm. In the proposed work, the mean Peak Signal to Noise Ratio (PSNR) obtained for the 10,015 images in HAM-10000 is 33.59 dB. Similarly, for the ISIC-2019 dataset, the average values obtained are 34.37 dB and 34.39 dB for the train and test folders respectively, which are acceptable values for effective denoising.
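For reference, PSNR values such as those quoted above are conventionally computed from the mean squared error between two images. The sketch below uses the standard definition; the function name `psnr` and the 8-bit peak value of 255 are assumptions, and the text does not state which image pair was compared.

```python
import numpy as np


def psnr(reference, processed, max_value=255.0):
    """Peak Signal to Noise Ratio in dB between two images of equal shape.

    max_value=255 assumes 8-bit images; this is an assumption, not a detail
    taken from the paper.
    """
    reference = np.asarray(reference, dtype=np.float64)
    processed = np.asarray(processed, dtype=np.float64)
    mse = np.mean((reference - processed) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)
```

A dataset-level figure would then be the mean of `psnr(...)` over all image pairs in the folder.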
Flowchart for non-local means denoising
Filtering using non-local means
Figure 5 displays the result of applying Non-Local Means Denoising to an image. The process effectively mitigates the noise that is present in the original image.
Further, data augmentation is applied to HAM-10000, since a CNN learns well when abundant data is provided. Images are rotated by 90\(^{\circ }\) clockwise, 90\(^{\circ }\) counterclockwise, and 180\(^{\circ }\), and flipped along the x- and y-axes. Since the lesion class does not depend on the orientation of the image, the dataset is enlarged by changing the orientation of all lesion images except Melanocytic nevi, which is already present in sufficient numbers. Bar graphs depicting the dataset size before and after augmentation are shown in Figs. 6 and 7, respectively.
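The five orientation changes listed above can be sketched with NumPy as follows. The helper name `orientation_variants` and the dictionary keys are hypothetical, and the mapping of "x-axis" to a vertical flip and "y-axis" to a horizontal flip is an assumption about the intended convention.

```python
import numpy as np


def orientation_variants(image):
    """Return the rotated and flipped copies of one image (H x W or H x W x C)."""
    image = np.asarray(image)
    return {
        "rot90_ccw": np.rot90(image, k=1),   # 90 degrees counterclockwise
        "rot90_cw": np.rot90(image, k=-1),   # 90 degrees clockwise
        "rot180": np.rot90(image, k=2),      # 180 degrees
        "flip_x": np.flipud(image),          # mirror about the horizontal axis
        "flip_y": np.fliplr(image),          # mirror about the vertical axis
    }
```

Applying this to every image of a minority class yields five extra samples per original; the majority class (Melanocytic nevi) would simply be skipped.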
HAM 10000 before augmentation
HAM 10000 after augmentation
From HAM-10000, seven disease classes are taken for assessment. After augmentation the image count exceeds 26,000, and the data are split into training and testing sets in an 8:2 ratio. This ratio was chosen because it is a commonly adopted split in the literature, ensuring consistency with previous studies. Since the HAM-10000 dataset remains unbalanced even after augmentation, as can be seen in Fig. 7, class weights are assigned, with higher weights allocated to minority classes and lower weights to majority classes.
scikit-learn is used to compute the class weights through its built-in function “compute_class_weight”. Passing “balanced” as the class-weight option yields weights that are inversely proportional to the class frequencies; the same option is available in many scikit-learn estimators. The following class weights are assigned to the seven classes of HAM-10000: [0: 0.565, 1: 0.568, 2: 0.575, 3: 1.230, 4: 1.934, 5: 4.454, 6: 5.5].
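The "balanced" heuristic assigns each class the weight n_samples / (n_classes × class count). The sketch below re-implements that formula in plain Python so the arithmetic is visible. The helper name `balanced_class_weights` and the label counts are hypothetical and are not the HAM-10000 counts.

```python
from collections import Counter

def balanced_class_weights(labels):
    """weight_c = n_samples / (n_classes * count_c), the 'balanced' heuristic."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in sorted(counts)}

# Hypothetical label list for three classes; not the HAM-10000 distribution.
labels = [0] * 60 + [1] * 30 + [2] * 10
weights = balanced_class_weights(labels)
print(weights)
```

A dictionary of this form, mapping class index to weight, can be passed as the `class_weight` argument of Keras `model.fit`, so that errors on rare classes contribute more to the loss.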
Further, a second dataset from ISIC-2019 is also included for validation of the model. In this case, the dataset has already been divided into training and testing directories.
In the proposed work, the images are subjected to sparse dictionary learning, which is based on the fact that natural signals such as images admit a sparse representation, i.e. they can be expressed as a linear combination of a small number of basis vectors (atoms). These atoms are discriminative in nature and are therefore useful for classification. Sparse dictionary learning plays a crucial role in image denoising and restoration, enabling the recovery of clean images from degraded versions by leveraging sparse representations. It also supports automatic feature extraction, dimensionality reduction, and robust classification and recognition. It facilitates efficient representation and processing of signals and improves adaptive filtering and compression algorithms by learning tailored dictionaries. Applying sparse dictionary learning before the CNN provides a structured and principled way of handling complex data representations, which improves the performance and accuracy of the CNN-based model. This step is implemented with scikit-learn's mini-batch dictionary learning, a technique for finding a set of atoms that sparsely encodes the fitted data. The sparsity-controlling parameter (alpha) is set to 1, the number of dictionary elements to extract is set to 64, and the batch size is 100, for 100 iterations.
The images are then passed to the CNN model for training, and the model is subsequently tested on lesion type, for which its prediction accuracy is reported.
A convolutional layer contains a set of filters (kernels). Each kernel slides across the image in steps of the stride length, and at every spatial location the dot product between the kernel and the underlying image patch is computed. The activation function is the final component of a convolutional layer; it introduces nonlinearity into the output, which is referred to as a feature map. Subsequently, a pooling layer reduces the size of the feature map and makes the detected features more robust. Batch normalization is integrated between the layers of the network to speed up training, permit higher learning rates, and ease learning; it normalizes activations using statistics computed over each mini-batch rather than over the raw data as a whole. Furthermore, an optimizer is used to minimize the loss by adjusting attributes such as the learning rate and the weights of the network. The complete flow is illustrated in Fig. 8.
Overview of the model
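The convolution, activation, and pooling steps described above can be illustrated with a small NumPy sketch. It handles a single channel with no padding, and the kernel and input values are arbitrary choices for illustration; a real layer would use learned kernels and many channels.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Valid (no padding) 2-D cross-correlation of a single-channel image."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)  # dot product at this location
    return out

def relu(x):
    """Elementwise max(x, 0)."""
    return np.maximum(x, 0.0)

def max_pool2d(x, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.arange(49, dtype=float).reshape(7, 7)
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)  # illustrative horizontal-gradient kernel

feature_map = relu(conv2d(image, kernel, stride=2))
pooled = max_pool2d(feature_map, size=2)
print(feature_map.shape, pooled.shape)
```

With a 7 × 7 input, a 3 × 3 kernel, and a stride of 2, the feature map has (7 − 3) / 2 + 1 = 3 rows and columns, which shows how the stride controls the output size.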
The processing layers encompass Convolution Layer, Pooling Layer, Rectified Linear Units (ReLU), Normalization, Softmax layer, and Fully Connected dense layers.
For the CNN model, the following parameters have been used:
Filters: The suggested approach is a bottleneck structure in which the number of filters increases over the first few layers and then decreases; in our case the order is 128, 256, 512, 512, 256, for better learning.
Kernel Size: The size of the convolutional filter in pixels. Larger kernels, (11 × 11) and (5 × 5), are used in the initial layers to learn coarse details, and smaller kernels, (3 × 3) and (1 × 1), are used afterwards to learn fine details.
Activation: The activation functions used are ReLU and Softmax, given as
\[ f(x) = \begin{cases} x, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{7} \]
\[ \mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{k} e^{z_j}} \tag{8} \]
In equation (7), the function f(x) is defined as x when x is greater than or equal to 0, and 0 when x is less than 0. In equation (8), z is a vector, \(z_i\) is the i-th component of z, and k is the number of components in z. The ReLU activation function is employed in the hidden layers to mitigate the vanishing-gradient problem and to improve computational performance. The Softmax function is employed in the final output layer to transform the network’s output into a normalized probability distribution across the predicted output classes.
Optimizer: Adaptive Moment Estimation (Adam) combines the merits of both momentum-based optimization and root mean square propagation (RMSprop) to achieve fast convergence, low memory requirements, and robustness to noisy gradients, making it a popular choice for deep learning practitioners.
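To make the combination of momentum and RMSprop-style scaling concrete, the following sketch applies the standard Adam update rule to a one-dimensional toy problem. The objective f(x) = x², the starting point, and the hyperparameter values are illustrative assumptions and are not the settings used for the CNN.

```python
import math

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a scalar function with Adam, given its gradient `grad`."""
    x, m, v = x0, 0.0, 0.0
    history = [x]
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first moment (momentum term)
        v = beta2 * v + (1 - beta2) * g * g    # second moment (RMSprop-style term)
        m_hat = m / (1 - beta1 ** t)           # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
        history.append(x)
    return history

# Toy objective f(x) = x^2 with gradient 2x, chosen only for illustration.
trajectory = adam_minimize(lambda x: 2 * x, x0=5.0)
print(trajectory[1], trajectory[-1])
```

Only two running averages per parameter are stored, which is the sense in which the memory overhead stays modest. The first step has a size close to the learning rate regardless of the gradient's magnitude, which illustrates the per-parameter scaling.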
The batch size is the number of training instances propagated through the neural network during each iteration (training step). An epoch consists of the series of training steps needed to pass over the whole training set once. It takes several epochs to train the neural network so that the detected error approaches zero. Choosing an appropriate batch size and epoch count is therefore crucial for achieving good performance: smaller batch sizes tend to result in noisier gradients, while larger batch sizes can converge more slowly per epoch and generalize less well. The batch size is set to 32; the number of epochs is initially 20 for testing the effect of denoising and is subsequently set to 100.
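The relationship between batch size, training steps, and epochs reduces to simple arithmetic, sketched below. The training-set size is an assumption: the text only says that the augmented dataset exceeds 26,000 images, so a round 26,000 with the 8:2 split is used for illustration.

```python
import math

def steps_per_epoch(n_train: int, batch_size: int) -> int:
    """Number of mini-batch updates needed for one pass over the training set."""
    return math.ceil(n_train / batch_size)

# Assumption for illustration: 26,000 images with an 8:2 split -> 20,800 training images.
n_train = 26_000 * 8 // 10
per_epoch = steps_per_epoch(n_train, batch_size=32)
print(per_epoch, per_epoch * 20, per_epoch * 100)
```

Under these assumptions, one epoch corresponds to 650 weight updates, so the 20-epoch and 100-epoch runs would perform 13,000 and 65,000 updates, respectively.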
After successfully implementing the proposed methodology, the model has demonstrated promising results, confirming its potential to effectively detect skin diseases to a great extent.
HAM-10000, where HAM stands for “Human Against Machine” [34], contains a total of 10,015 images and is a large collection of dermatoscopic images of pigmented lesions obtained from different sources. It is an open-source dataset and covers all significant diagnostic categories related to pigmented lesions. The dermatoscopic images have been sourced from diverse populations and were acquired and stored using various modalities. The dataset contains seven different classes of skin lesions, i.e. Melanocytic nevi (nv), Melanoma (mel), Benign keratosis-like lesions (bkl), Basal cell carcinoma (bcc), Actinic keratoses (ak), Vascular lesions (vasc), and Dermatofibroma (df).
Additionally, the model has been trained on a dataset provided by the International Skin Imaging Collaboration (ISIC) [35]. ISIC-2019 is freely accessible online for research purposes and also contains these seven types of skin lesion images. The dataset is organized into two directories, one for training and one for testing, each of which is further divided into folders of lesion images for Melanocytic nevi (nv), Dermatofibroma (df), Basal cell carcinoma (bcc), Actinic keratoses (ak), Vascular lesions (vasc), Benign keratosis-like lesions (bkl), and Melanoma (mel).
The proposed methodology is initially implemented in Python, using Google Colab.
The final results, which include sparse dictionary learning, are obtained on an Nvidia RTX-4000 GPU with 8 GB RAM running Windows 11.
The frameworks used are described below:
TensorFlow: An open-source Python library created and maintained by Google. It is designed to handle matrix operations efficiently and is therefore well suited to building neural networks. It is widely used for machine learning and deep learning tasks.
Keras: A deep learning framework designed to work with the Python programming language. This library sits “on top” of TensorFlow, offering concise syntax for constructing neural networks, which are then converted into TensorFlow models to carry out the machine learning operations.
OpenCV: A widely adopted open-source library for computer vision, image processing, and machine learning. It provides support for real-time operations such as segmentation and detection. In Python, OpenCV images are represented as NumPy arrays, so they can be processed efficiently with NumPy and other array libraries.
scikit-learn (sklearn): A reliable Python library for machine learning that offers efficient tools for machine learning and statistical modeling, including classification, clustering, regression, and dimensionality reduction.
To measure the performance of the proposed skin disease recognition model, we have analyzed related research that resembles our own. This work contributes a machine learning strategy that makes it possible to detect a large variety of skin conditions in humans by enlarging the image dataset.
The first step in the technique we implemented is to obtain images of the affected skin. The available images usually differ in their properties, so preprocessing techniques are applied. The datasets used are the standard HAM-10000 and ISIC-2019 datasets of lesion images. Denoising greatly improves the image quality and the performance of the model, and sparse dictionary learning further enhances its accuracy. For both datasets, denoising is performed first, followed by sparse dictionary learning, before the CNN is applied for classification.
We can see from the resulting plots that, for 20 epochs on the HAM-10000 dataset, the validation accuracy achieved without denoising is 56.79%, whereas with denoising it is 71.30%. Further, the model loss curve has larger spikes for non-denoised images than for images denoised using non-local means. This can be seen for both datasets and arises because denoising repairs poor-quality, noisy images, which otherwise make it harder for the model to detect the underlying patterns. The initial results without denoising are illustrated in Figs. 9 and 11, and Figs. 10 and 12 show the plots obtained after denoising. These plots show that the results obtained with denoising are better than those obtained without it.
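The size of the improvement from denoising can be expressed either in percentage points or as a relative change. The short sketch below computes both from the two HAM-10000 accuracies quoted above. The helper name `accuracy_gain` is hypothetical, and the relative-change definition (gain divided by the baseline) is an assumption about how such comparisons are usually made.

```python
def accuracy_gain(baseline: float, improved: float) -> tuple:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = improved - baseline
    relative = 100.0 * absolute / baseline
    return absolute, relative

# Accuracies reported in the text for HAM-10000 after 20 epochs.
absolute, relative = accuracy_gain(baseline=56.79, improved=71.30)
print(f"{absolute:.2f} percentage points, {relative:.2f}% relative")
```

For these two figures, the gain is 14.51 percentage points, or roughly 25.55% relative to the accuracy without denoising.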
The curves depicting model accuracy and model loss for HAM-10000 are given in Figs. 13 and 14, respectively, and those for ISIC-2019 are given in Figs. 15 and 16, respectively. The accuracy obtained for the HAM-10000 dataset is 85.61%, and the accuracy obtained for the ISIC-2019 dataset is 81.23%. This shows that accuracy can be greatly improved using sparse dictionary learning. The accuracy for HAM-10000 is 17.27% higher, in relative terms, than that reported for Inception V3 by Eddy et al., and the results for the ISIC dataset are also comparable, as can be observed in Table 2.
Besides accuracy, ROC curves, confusion matrices, and heat maps were obtained; they are given in Figs. 17, 19, and 21 for HAM-10000 and in Figs. 18, 20, and 22 for the ISIC dataset. ROC curves and AUC metrics were computed for multiclass classification on both datasets, since they show how separable the classes are across all possible thresholds, i.e. how well the model classifies each class. The ROC curves are illustrated in Figs. 17 and 18.
Similarly, the Area Under the Curve (AUC) is a metric that evaluates the capability of a classifier to differentiate between classes (here, each class against the rest) and serves as an aggregate summary of the corresponding ROC curve. The model’s effectiveness in discerning between positive and negative classes improves as the AUC rises. The model obtained an AUC score of 0.977 for HAM-10000 and 0.963 for ISIC-2019. These results demonstrate the significant impact of denoising and sparse dictionary learning on model performance, with substantial improvements in accuracy and reduced model loss. The high AUC scores and other evaluation metrics further support the model’s robustness in detecting and classifying various skin conditions.
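One common way to obtain a single AUC for a multiclass model is to compute a one-vs-rest AUC per class and average the results. The sketch below does this in plain Python, using the rank interpretation of AUC (the probability that a positive sample receives a higher score than a negative one). The averaging scheme is an assumption, since the text does not say how the reported scores were aggregated, and the probabilities are hypothetical.

```python
def binary_auc(scores, is_positive):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(y_true, y_prob):
    """Unweighted mean of per-class one-vs-rest AUCs, plus the per-class values."""
    n_classes = len(y_prob[0])
    aucs = []
    for c in range(n_classes):
        scores = [row[c] for row in y_prob]
        aucs.append(binary_auc(scores, [y == c for y in y_true]))
    return sum(aucs) / n_classes, aucs

# Hypothetical predicted probabilities for five samples and three classes.
y_true = [0, 1, 2, 1, 0]
y_prob = [
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.5, 0.25, 0.25],
    [0.6, 0.3, 0.1],
]
macro_auc, per_class = macro_ovr_auc(y_true, y_prob)
print(per_class, macro_auc)
```

In this toy example, class 1 has one sample ranked below two negatives, so its AUC drops to 2/3 while the other classes remain at 1.0. The same per-class view is what the individual ROC curves display.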
Without non-local means denoising on HAM-10000
With non-local means denoising on HAM-10000
Without non-local means denoising on ISIC-2019 dataset images
With non-local means denoising on ISIC-2019 dataset images
Model accuracy curve obtained for HAM 10000 with sparse dictionary learning
Model loss curve obtained for HAM 10000 with sparse dictionary learning
Model accuracy curve for ISIC-2019 with sparse dictionary learning
Model loss curve obtained for ISIC-2019 with sparse dictionary learning
ROC curve obtained for HAM-10000 where defined classes are 0-nv, 1-mel, 2-bkl, 3-bcc, 4-ak, 5-vasc, 6-df
ROC curve obtained for ISIC-2019 where defined classes are 0-nv, 1-df, 2-bcc, 3-ak, 4-vasc, 5-bkl, 6-mel
Confusion Matrix obtained for HAM-10000
Confusion Matrix obtained for ISIC-2019
Heat Map obtained for HAM-10000
Heat Map obtained for ISIC-2019
Another way to understand the effectiveness of categorical classifiers is the confusion matrix. The confusion matrices obtained for the two datasets are given in Figs. 19 and 20. A heat map gives a graphical representation of data, using color coding to represent the relative values of different data points. The heat maps obtained for the two datasets are given in Figs. 21 and 22, respectively.
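A confusion matrix simply counts how often each true class is predicted as each possible class. The following plain-Python sketch builds one and derives overall accuracy and per-class recall from it. The labels are hypothetical and are not results from the paper.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def per_class_recall(matrix):
    """Fraction of each true class predicted correctly (diagonal over row sum)."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]

# Hypothetical labels for three classes; not results from the paper.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
accuracy = sum(cm[i][i] for i in range(3)) / len(y_true)
print(cm, per_class_recall(cm), accuracy)
```

A heat map of such a matrix colors each cell by its count, so strong diagonals (correct predictions) and off-diagonal confusions between particular lesion types become visible at a glance.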
While the chosen NLM denoising parameters were effective in reducing noise, they might not be optimal across all image variations within the dataset. A fixed patch size and window size may not account for all types of noise, potentially leaving some residual noise unaddressed. Implementing adaptive NLM denoising, where the patch size and window size are dynamically adjusted based on the image characteristics, could lead to more effective noise reduction and potentially improve the model’s accuracy. Further hyperparameter tuning within the sparse dictionary learning framework, such as adjusting the number of dictionary atoms or the sparsity level, could improve the model’s ability to capture underlying patterns. With these improvements, the model’s accuracy and robustness could be further enhanced, making it more effective in real-world applications.
The technique proposed in this manuscript is a computer-based system for the detection of skin conditions. First, the images from the database are gathered and pre-processed by denoising them with the non-local means denoising algorithm. The diseases are then classified using sparse dictionary learning and the CNN model. The accuracy achieved by the proposed model is 85.61% for the HAM-10000 dataset and 81.23% for the ISIC-2019 dataset.
The application of automated diagnostics for skin diseases has the potential to yield significant advantages. However, accurate diagnosis requires a robust automated diagnostic process that can be used by both skilled experts and novice clinicians. Skin types and skin disease patterns differ from country to country. This work can be extended by using a more diverse dataset, with images covering all skin tones and image qualities, and by including more categories of skin disease to make the model more versatile. The efficiency of the model can be improved by varying the hyperparameters of the CNN model. Further, the work can be extended to real-time images of lesions and affected skin, so that the system could be used in mobile applications and made available to people in remote areas as well.
Data will be made available on request.
Ajith A, Goel V, Vazirani P, Roja MM (2017) Digital dermatology skin disease detection model using image processing. In: International conference on intelligent computing and control systems (ICICCS)
Mijwil MM (2021) Skin cancer disease images classification using deep learning solutions. Multimed Tools Appl 80:26255–26271
Jain S, Jagtap V, Pise N (2015) Computer aided melanoma skin cancer detection using image processing. Procedia Comput Sci 48:735–740
Garg N, Sharma V, Kaur P (2018) Melanoma skin cancer detection using image processing. Sensors and image processing, vol 651. Advances in Intelligent Systems and Computing. Springer, Singapore
Suganya R (2016) An automated computer aided diagnosis of skin lesions detection and classification for dermoscopy images. In: International conference on recent trends in information technology (ICRTIT), pp 1–5
Dorj U-O, Lee K-K, Choi J-Y, Lee M (2018) The skin cancer classification using deep convolutional neural network. Multimed Tools Appl 77:9909–9924
Hasija Y, Garg N, Sourav S (2017) Automated detection of dermatological disorders through image-processing and machine learning. In: 2017 international conference on intelligent sustainable systems (ICISS), pp 1047–1051
Thaajwer MA, Ishanka UP (2020) Melanoma skin cancer detection using image processing and machine learning techniques. In: 2020 2nd international conference on advancements in computing (ICAC), pp 363–368
Nawar A, Sabuz NK, Siddiquee SMT, Rabbani M, Biswas AA, Majumder A (2021) Skin disease recognition: a machine vision based approach. In: 2021 7th international conference on advanced computing and communication systems (ICACCS), pp 1029–1034
Thiyaneswaran B, Anguraj K, Kumarganesha S, Thangaraj K (2020) Early detection of melanoma images using gray level co-occurrence matrix features and machine learning techniques for effective clinical diagnosis. Int J Imaging Syst Technol 31:682–694
Mustafa S, Kimura A (2018) A SVM-based diagnosis of melanoma using only useful image features. In: 2018 international workshop on advanced image technology (IWAIT), pp 1–4
Alenezi NSA (2019) A method of skin disease detection using image processing and machine learning. Procedia Comput Sci 163:85–92
Kumar VB, Kumar SS, Saboo V (2016) Dermatological disease detection using image processing and machine learning. In: 2016 third international conference on artificial intelligence and pattern recognition (AIPR), pp 1–6
Yasir MR, Rahman A, Ahmed N (2014) Dermatological disease detection using image processing and artificial neural network. In: 8th international conference on electrical and computer engineering, pp 687–690
Ningrum DNA, Yuan S-P, Kung W-M, Wu C-C, Tzeng I-S, Huang C-Y, Yu-Chuan J, Wang Y-C (2021) Deep learning classifier with patient’s metadata of dermoscopic images in malignant melanoma detection. J Multidiscip Healthc 14:877–885
Harangi B (2018) Skin lesion classification with ensembles of deep convolutional neural networks. J Biomed Inform 86:25–32
Eddy PIK, Kusuma HA, Ratna AAP, Nurtanio I, Hidayati AN, Purnomo MH, Nugroho SMS, Rachmadi RF (2019) Disease classification based on dermoscopic skin images using convolutional neural network in teledermatology system. In: 2019 international conference on computer engineering, network, and intelligent multimedia (CENIM), pp 1–5
Chaturvedi SS, Gupta K, Prasad PS (2020) Skin lesion analyser: an efficient seven-way multi-class skin cancer classification using MobileNet. In: Advanced machine learning technologies and applications: proceedings of AMLTA 2020
Saranya K, Vijayashaarathi S, Sasirekha N, Rishika M, Rajeswari PSR (2024) Skin disease detection using CNN (convolutional neural network). In: International conference on data engineering and communication systems (ICDECS)
Gupta M, Kumar R, Nandan Pradhan AK, Obaid AJ (2024) Skin disease detection using neural networks. In: International conference on advancements in smart, secure and intelligent computing (ASSIC)
Ahalya RK, Babu G, Sathish S, Shruthi K (2024) Automated skin disease detection using deep learning algorithms. In: International conference on communication and signal processing (ICCSP)
Jain A, Rao ACS, Jain PK, Abraham A (2022) Multi-type skin diseases classification using OP-DNN based feature extraction approach. Multimed Tools Appl 81:6451–6476
Nawaz M, Mehmood Z, Nazir T, Naqvi RA, Rehman A, Iqbal M, Saba T (2022) Skin cancer detection from dermoscopic images using deep learning and fuzzy k-means clustering. Microsc Res Tech 85(1):339–351
Yap J, Yolland W, Tschandl P (2018) Multimodal skin lesion classification using deep learning. Exp Dermatol 27(11):1261–1267
Ge Z, Demyanov S, Chakravorty R, Bowling A, Garnavi R (2017) Skin disease recognition using deep saliency features and multimodal learning of dermoscopy and clinical images. In: Medical image computing and computer assisted intervention MICCAI
Yilmaz E, Trocan M (2021) A modified version of GoogLeNet for melanoma diagnosis. J Inf Telecommun 5:395–405
Chaturvedi SS, Tembhurne JV, Diwan T (2020) A multi-class skin cancer classification using deep convolutional neural networks. Multimed Tools Appl 79:28477–28498
Karunanayake RK, Dananjaya WM, Peiris MY, Gunatileka BR, Lokuliyana S, Kuruppu A (2020) CURETO: skin diseases detection using image processing and CNN. In: 2020 14th international conference on innovations in information technology (IIT), pp 1–6
Rimi TA, Sultana N, Foysal MFA (2020) Derm-NN: skin diseases detection using convolutional neural network. In: 2020 4th international conference on intelligent computing and control systems (ICICCS), pp 1205–1209
Pangti R, Mathur J, Chouhan V, Kumar S, Rajput L, Shah S, Gupta A, Dixit A, Dholakia D, Gupta S, Gupta S, George M, Sharma VK, Gupta S (2020) A machine learning-based, decision support, mobile phone application for diagnosis of common dermatological diseases. Acad Dermatol Venereol 35(2):536–545
Diwan T, Shukla R, Ghuse E, Tembhurne JV (2023) Model hybridization & learning rate annealing for skin cancer detection. Multimed Tools Appl 82:2369–2392
Sharma AK, Tiwari S, Aggarwal G, Goenka N, Chakrabarti AKP, Chakrabarti T, Gono R, Leonowicz Z, Jasinski M (2022) Dermatologist-level classification of skin cancer using cascaded ensembling of convolutional neural network and handcrafted features based deep neural network. IEEE Access 10:17920–17932
Li H, Pan Y, Zhao J, Zhang L (2021) Skin disease diagnosis with deep learning: a review. Neurocomputing 464:364–393
Tschandl P, Rosendahl C, Kittler H (2018) The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci Data 5:1–9
Combalia M, Codella NC, Rotemberg V, Helba B, Vilaplana V, Reiter O, Carrera C, Barreiro A, Halpern AC, Puig S, Malvehy J (2019) Isic-2019 challenge. Bcn20000: dermoscopic lesions in the wild
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Authors and affiliations.
Department of Electronics and Communication Engineering, Indian Institute of Information Technology Nagpur, Nagpur, Maharashtra, 441108, India
Apeksha Pandey, Manepalli Sai Teja, Parul Sahare & Mayur Parate
Department of Electronics and Communication Engineering, Visvesvaraya National Institute of Technology, Nagpur, Maharashtra, 440010, India
Vipin Kamble
Department of Electronics and Communication Engineering, National Institute of Technology Warangal, Hanamkonda, Telangana, 506004, India
Mohammad Farukh Hashmi
Apeksha Pandey and Manepalli Sai Teja: Data curation, Software, Writing, Original draft preparation. Parul Sahare: Conceptualization, Methodology, Supervision. Vipin Kamble and Mayur Parate: Visualization, Investigation, Validation, Writing, Reviewing and Editing. Parul Sahare and Mohammad Farukh Hashmi: Writing, Reviewing and Editing. All authors have given approval to the final version of the manuscript.
Correspondence to Parul Sahare.
Ethics approval and consent to participate.
Not applicable.
Competing interests.
The authors of this publication declare there is no conflict of interest.
Publisher's note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .
Cite this article.
Pandey, A., Teja, M.S., Sahare, P. et al. Skin cancer classification using non-local means denoising and sparse dictionary learning based CNN. Journal of Electrical Systems and Inf Technol 11 , 36 (2024). https://doi.org/10.1186/s43067-024-00162-0
Received : 30 January 2024
Accepted : 27 August 2024
Published : 06 September 2024
DOI : https://doi.org/10.1186/s43067-024-00162-0