Research Article

Hate speech detection: Challenges and solutions

* E-mail: [email protected]

Affiliation: Information Retrieval Laboratory, Georgetown University, Washington, DC, United States of America


  • Sean MacAvaney, 
  • Hao-Ren Yao, 
  • Eugene Yang, 
  • Katina Russell, 
  • Nazli Goharian, 
  • Ophir Frieder


  • Published: August 20, 2019
  • https://doi.org/10.1371/journal.pone.0221152

Abstract

As online content continues to grow, so does the spread of hate speech. We identify and examine the challenges faced by automatic approaches for detecting hate speech in online text. Among these difficulties are subtleties in language, differing definitions of what constitutes hate speech, and limited availability of data for training and testing these systems. Furthermore, many recent approaches suffer from an interpretability problem—that is, it can be difficult to understand why the systems make the decisions they do. We propose a multi-view SVM approach that achieves near state-of-the-art performance while being simpler and producing more easily interpretable decisions than neural methods. We also discuss both technical and practical challenges that remain for this task.

Citation: MacAvaney S, Yao H-R, Yang E, Russell K, Goharian N, Frieder O (2019) Hate speech detection: Challenges and solutions. PLoS ONE 14(8): e0221152. https://doi.org/10.1371/journal.pone.0221152

Editor: Minlie Huang, Tsinghua University, CHINA

Received: April 3, 2019; Accepted: July 22, 2019; Published: August 20, 2019

Copyright: © 2019 MacAvaney et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript, its Supporting Information files, and the provided data links as follows. Forum dataset: https://github.com/aitor-garcia-p/hate-speech-dataset. Instructions to get the TRAC dataset: https://sites.google.com/view/trac1/shared-task. HatebaseTwitter dataset: https://github.com/t-davidson/hate-speech-and-offensive-language. HatEval dataset: https://competitions.codalab.org/competitions/19935.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Hate crimes are unfortunately nothing new in society. However, social media and other means of online communication have begun playing a larger role in hate crimes. For instance, suspects in several recent hate-related terror attacks had an extensive social media history of hate-related posts, suggesting that social media contributes to their radicalization [ 1 , 2 ]. In some cases, social media can play an even more direct role; video footage from the suspect of the 2019 terror attack in Christchurch, New Zealand, was broadcast live on Facebook [ 2 ].

Vast online communication forums, including social media, enable users to express themselves freely and, at times, anonymously. While the ability to freely express oneself is a human right that should be cherished, inducing and spreading hate towards another group is an abuse of this liberty. For instance, the American Bar Association asserts that in the United States, hate speech is legal and protected by the First Amendment, although not if it directly calls for violence [3]. As such, many online forums such as Facebook, YouTube, and Twitter consider hate speech harmful and have policies to remove it [4–6]. Due to the societal concern and how widespread hate speech is becoming on the Internet [7], there is strong motivation to study its automatic detection; by automating detection, the spread of hateful content can be reduced.

Detecting hate speech is a challenging task, however. First, there are disagreements over how hate speech should be defined, meaning that some content can be considered hate speech by some and not by others. We start by covering competing definitions, focusing on the different aspects that contribute to hate speech. We are not, nor can we be, comprehensive, as new definitions appear regularly; our aim is simply to illustrate how the definitions vary and the difficulties that arise from this variance.

Competing definitions create challenges for evaluating hate speech detection systems: existing datasets differ in their definitions of hate speech, and so not only come from different sources but also capture different information. This can make it difficult to assess directly which aspects of hate speech a system identifies. We discuss the various datasets available for training and measuring the performance of hate speech detection systems in the next section. Nuance and subtlety in language provide further challenges for automatic hate speech identification, again depending on the definition.

Despite these differences, some recent approaches have found promising results for detecting hate speech in textual content [8–10]. The proposed solutions employ machine learning techniques to classify text as hate speech. One limitation of these approaches is that their decisions can be opaque, making it difficult for humans to understand why a given decision was made. This is a practical concern because systems that automatically censor a person’s speech likely need a manual appeal process. To address this problem, we propose a new hate speech classification approach that allows for a better understanding of its decisions, and we show that it can even outperform existing approaches on some datasets. Some existing approaches use external sources, such as a hate speech lexicon, in their systems. This can be effective, but it requires maintaining these sources and keeping them up to date, which is a problem in itself. Our approach does not rely on external resources and achieves reasonable accuracy. We cover these topics in the following section.

In general, however, there are practical challenges that remain among all systems. For instance, armed with the knowledge that the platforms they use are trying to silence them, those seeking to spread hateful content actively try to find ways to circumvent measures put in place. We cover this topic in more detail in the last section.

In summary, we discuss the challenges and approaches in automatic detection of hate speech, including competing definitions, dataset availability and construction, and existing approaches. We also propose a new approach that in some cases outperforms the state of the art and discuss remaining shortcomings. Ultimately, we conclude the following:

  • Automatic hate speech detection is technically difficult;
  • Some approaches achieve reasonable performance;
  • Specific challenges remain among all solutions;
  • Without societal context, systems cannot generalize sufficiently.

Defining hate speech

The definition of hate speech is neither universally accepted, nor are individual facets of the definition fully agreed upon. Ross et al. believe that a clear definition of hate speech can help the study of hate speech detection by making annotation an easier task, and thus making the annotations more reliable [11]. However, the line between hate speech and appropriate free expression is blurry, making some wary of giving hate speech a precise definition. For instance, the American Bar Association does not give an official definition, but instead asserts that speech that contributes to a criminal act can be punished as part of a hate crime [12]. Similarly, we opt not to propose a specific definition, but instead examine existing definitions to gain insights into what typically constitutes hate speech and what technical challenges the definitions might bring. Below, we summarize leading definitions of hate speech from varying sources, as well as aspects of the definitions that make the detection of hate speech difficult.

  • Encyclopedia of the American Constitution: “Hate speech is speech that attacks a person or group on the basis of attributes such as race, religion, ethnic origin, national origin, sex, disability, sexual orientation, or gender identity.” [ 13 ]
  • Facebook: “We define hate speech as a direct attack on people based on what we call protected characteristics—race, ethnicity, national origin, religious affiliation, sexual orientation, caste, sex, gender, gender identity, and serious disease or disability. We also provide some protections for immigration status. We define attack as violent or dehumanizing speech, statements of inferiority, or calls for exclusion or segregation.” [ 4 ]
  • Twitter: “Hateful conduct: You may not promote violence against or directly attack or threaten other people on the basis of race, ethnicity, national origin, sexual orientation, gender, gender identity, religious affiliation, age, disability, or serious disease.” [ 6 ]
  • Davidson et al.: “Language that is used to express hatred towards a targeted group or is intended to be derogatory, to humiliate, or to insult the members of the group.” [9]
  • de Gibert et al.: “Hate speech is a deliberate attack directed towards a specific group of people motivated by aspects of the group’s identity.” [14]
  • Fortuna et al.: “Hate speech is language that attacks or diminishes, that incites violence or hate against groups, based on specific characteristics such as physical appearance, religion, descent, national or ethnic origin, sexual orientation, gender identity or other, and it can occur with different linguistic styles, even in subtle forms or when humour is used.” [8]. This definition is based on their analysis of various definitions.

Notably, some of the definitions above require that the speech be directed at a group. This differs from the Encyclopedia of the American Constitution definition, under which an attack on an individual can also be considered hate speech. A common theme among the definitions is that the attack is based on some aspect of the group’s or people’s identity. While de Gibert et al.’s definition leaves the identity itself vague, some of the other definitions provide specific identity characteristics. In particular, protected characteristics are aspects of the Davidson et al. and Facebook definitions. Fortuna et al.’s definition specifically calls out variations in language style and subtlety. This can be challenging, and goes beyond what conventional text-based classification approaches are able to capture.

Fortuna et al.’s definition is based on an analysis of the following characteristics from other definitions [ 8 ]:

  • Hate speech is to incite violence or hate
  • Hate speech is to attack or diminish
  • Hate speech has specific targets
  • Whether humor can be considered hate speech

A particular problem not covered by many definitions relates to factual statements. For example, “Jews are swine” is clearly hate speech by most definitions (it is a statement of inferiority), but “Many Jews are lawyers” is not. In such cases, to determine whether a statement is hate speech, we would need to check whether the statement is factual using external sources. This type of hate speech is difficult to detect because it relates to real-world fact verification, itself a difficult task [15]. Moreover, to evaluate validity, we would first need to define precise word interpretations, namely, whether “many” denotes an absolute number or a relative percentage of the population, further complicating verification.

Another issue that arises in defining hate speech is the potential praising of a hateful group. For example, praising the KKK is hate speech; praising another group, however, can clearly be non-hate speech. In this case, it is important to know which groups are hate groups and what exactly is being praised about the group, as some praise is undoubtedly, and unfortunately, true: the Nazis were, for example, very efficient in carrying out their “Final Solution”. Thus, processing praise alone is, at times, difficult.

Collecting and annotating data for training automatic classifiers to detect hate speech is challenging. Specifically, identifying and agreeing whether specific text is hate speech is difficult because, as previously mentioned, there is no universal definition of hate speech. Ross et al. studied the reliability of hate speech annotations and suggest that annotators are unreliable [11]. Agreement between annotators, measured using Krippendorff’s α, was very low (up to 0.29). However, when they compared annotations based on the Twitter definition with annotations based on the annotators’ own opinions, they found a strong correlation.

Furthermore, social media platforms are a hotbed for hate speech, yet many have very strict data usage and distribution policies. This results in a relatively small number of datasets available to the public to study, with most coming from Twitter (which has a more lenient data usage policy). While the Twitter resources are valuable, their general applicability is limited due to the unique genre of Twitter posts; the character limitation results in terse, short-form text. In contrast, posts from other platforms are typically longer and can be part of a larger discussion on a specific topic. This provides additional context that can affect the meaning of the text.

Another challenge is that there simply are not many publicly available, curated datasets that identify hateful, aggressive, and insulting text. A representative sampling of publicly available training and evaluation datasets is shown in Table 1:

  • HatebaseTwitter [ 9 ]. One Twitter dataset is a set of 24,802 tweets provided by Davidson et al. [9]. Their procedure for creating the dataset was as follows. First, they took a hate speech lexicon from Hatebase [16] and searched for tweets containing these terms, resulting in a set of tweets from about 33,000 users. Next, they collected the timelines of all these users, resulting in roughly 85 million tweets. From this set, they took a random sample of 25,000 tweets containing terms from the lexicon. Via crowdsourcing, they annotated each tweet as hate speech, offensive (but not hate speech), or neither hate speech nor offensive. If the agreement between annotators was too low, the tweet was excluded from the set. A commonly-used subset of this dataset is also available, containing 14,510 tweets.
  • WaseemA [ 17 ]. Waseem and Hovy also provide a dataset from Twitter, consisting of 16,914 tweets labeled as racist, sexist, or neither [ 17 ]. They first created a corpus of about 136,000 tweets that contain slurs and terms related to religious, sexual, gender, and ethnic minorities. From this corpus, the authors themselves annotated (labeled) 16,914 tweets and had a gender studies major review the annotations.
  • WaseemB [ 18 ]. In a second paper, Waseem created another dataset by sampling a new set of tweets from the 136,000-tweet corpus [18]. For this collection, Waseem recruited feminists and anti-racism activists, along with crowdsourcing, to annotate the tweets. The labels therein are racist, sexist, neither, or both.
  • Stormfront [ 14 ]. de Gibert et al. provide a dataset of posts from a white supremacist forum, Stormfront [14]. They annotate the posts at the sentence level, resulting in 10,568 sentences labeled with Hate, NoHate, Relation, or Skip. The Hate and NoHate labels indicate the presence or absence, respectively, of hate speech in each sentence. The label “Relation” indicates that the sentence is hate speech when combined with the sentences around it. Finally, the label “Skip” is for sentences that are not in English or do not contain information related to hate or non-hate speech. They also capture the amount of context (i.e., previous sentences) that an annotator used to classify the text.
  • TRAC [ 19 ]. The 2018 Workshop on Trolling, Aggression, and Cyberbullying (TRAC) hosted a shared task focused on detecting aggressive text in both English and Hindi [ 19 ]. Aggressive text is often a component of hate speech. The dataset from this task is available to the public and contains 15,869 Facebook comments labeled as overtly aggressive, covertly aggressive, or non-aggressive. There is also a small Twitter dataset, consisting of 1,253 tweets, which has the same labels.
  • HatEval [ 20 ]. This dataset is from the SemEval 2019 (Task 5) competition on multilingual detection of hate targeting women and immigrants in tweets [20]. It consists of several sets of labels: the first indicates whether the tweet expresses hate towards women or immigrants; the second, whether the tweet is aggressive; and the third, whether the tweet is directed at an individual or an entire group. Note that targeting an individual is not necessarily considered hate speech by all definitions.
  • Kaggle [ 21 ]. Kaggle.com hosted a shared task on detecting insulting comments [21]. The dataset consists of 8,832 social media comments labeled as insulting or not insulting. While not necessarily hate speech, insulting text may indicate hate speech.
  • GermanTwitter [ 11 ]. As part of their study of annotator reliability, Ross et al. created a German-language Twitter dataset about the European refugee crisis [11]. It consists of 541 tweets in German, labeled as expressing hate or not.

[Table 1. A representative sampling of publicly available training and evaluation datasets. https://doi.org/10.1371/journal.pone.0221152.t001]

Note that these datasets vary considerably in their size, scope, characteristics of the data annotated, and characteristics of hate speech considered. The most common source of text is Twitter, which consists of short-form online posts. While the Twitter datasets capture a wide variety of hate speech aspects in several languages, such as attacks on different groups, their construction processes, including the filtering and sampling methods, introduce uncontrolled factors into analyses of the corpora. Furthermore, corpora constructed from social media platforms and websites other than Twitter are rare, making it difficult for hate speech analysis to cover the entire landscape.

There is also the issue of imbalance between the amounts of hate and non-hate text within datasets. On a platform such as Twitter, hate speech occurs at a very low rate compared to non-hate speech. Although datasets reflect this imbalance to an extent, they do not match the actual percentages, due to training needs. For example, in the WaseemA dataset [17], 20% of the tweets were labeled sexist, 11.7% racist, and 68.3% neither. There is still an imbalance among sexist, racist, and neither tweets, but it may not be as pronounced as the imbalance expected on Twitter.

Automatic approaches for hate speech detection

Most social media platforms have established user rules that prohibit hate speech; enforcing these rules, however, requires copious manual labor to review every report. Some platforms, such as Facebook, recently increased the number of content moderators. Automatic tools and approaches could accelerate the reviewing process or allocate human resources to the posts that require close examination. In this section, we overview automatic approaches for detecting hate speech in text.

Keyword-based approaches

A basic approach to identifying hate speech is keyword matching. Using an ontology or dictionary, text containing potentially hateful keywords is flagged. For instance, Hatebase [16] maintains a database of derogatory terms for many groups across 95 languages. Such well-maintained resources are valuable, as terminology changes over time. However, as we observed in our study of the definitions of hate speech, simply using a hateful slur is not necessarily enough to constitute hate speech.

Keyword-based approaches are fast and straightforward to understand. However, they have severe limitations. Detecting only racial slurs would result in a highly precise system with low recall (precision is the percentage of detected items that are relevant; recall is the percentage of relevant items in the global population that are detected). In other words, a system that relies chiefly on keywords would not identify hateful content that does not use those terms. In contrast, including terms that could be, but are not always, hateful (e.g., “trash”, “swine”) would create too many false alarms, increasing recall at the expense of precision.

Furthermore, keyword-based approaches cannot identify hate speech that contains no hateful keywords (e.g., figurative or nuanced language). The slogan “build that wall” literally refers to constructing a physical barrier; given the political context, however, some interpret it as a condemnation of immigrants in the United States.
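To make the precision and recall trade-offs above concrete, the following is a minimal sketch of a keyword-based filter; the term list, names, and example post are hypothetical placeholders rather than anything used in this work.

```python
# Minimal sketch of a keyword-based detector. HATEFUL_TERMS and keyword_flag
# are hypothetical names; the term list is a placeholder, e.g., terms drawn
# from a lexicon such as Hatebase.
HATEFUL_TERMS = {"slur_a", "slur_b"}

def keyword_flag(text: str) -> bool:
    """Flag a post if any token matches the term list."""
    return any(token in HATEFUL_TERMS for token in text.lower().split())

# A short list of unambiguous slurs yields high precision but low recall;
# adding ambiguous terms ("trash", "swine") raises recall but hurts precision.
print(keyword_flag("an example post"))  # False: no listed term appears
```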

Source metadata

Additional information from social media can help further characterize posts and potentially lead to better identification approaches. Information such as the demographics of the posting user, location, timestamp, or even social engagement on the platform can all provide further understanding of a post at different granularities.

However, this information is often not readily available to external researchers, as publishing data with sensitive user information raises privacy issues. External researchers might have only part, or even none, of the user information, so models built on such data may solve the wrong problem or learn from incorrect signals. For instance, a system trained on these data might naturally become biased towards flagging content by certain users or groups as hate speech based on incidental dataset characteristics.

Using user information potentially raises some ethical issues. Models or systems might be biased against certain users and frequently flag their posts as hateful even if some of them are not. Similarly, relying too much on demographic information could miss posts from users who do not typically post hateful content. Flagging posts as hate based on user statistics could create a chilling effect on the platform and eventually limit freedom of speech.

Machine learning classifiers

Machine learning models take samples of text, labeled by content reviewers, and produce a classifier able to detect hate speech. Various models have been proposed and proven successful in the past. We describe a selection of open-source systems presented in recent research.

Content preprocessing and feature selection.

To identify or classify user-generated content, text features indicating hate must be extracted. Obvious features are individual words or phrases (n-grams, i.e., sequences of n consecutive words). To improve feature matching, words can be stemmed to their roots, removing morphological differences. Metaphor processing, e.g., Neuman et al. [22], can likewise extract features.
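As a brief illustration of such features, the sketch below derives stems and word 2-grams from a few tokens; it assumes NLTK's PorterStemmer, which is one common stemmer choice rather than one mandated here.

```python
# Illustration of stem and n-gram features (NLTK's PorterStemmer is an
# assumed choice; the toy tokens are placeholders).
from nltk.stem import PorterStemmer

tokens = ["haters", "hating", "hated"]
stems = [PorterStemmer().stem(t) for t in tokens]  # 'hating' and 'hated' both reduce to 'hate'
bigrams = list(zip(tokens, tokens[1:]))            # word 2-grams over the token sequence
print(stems, bigrams)
```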

The bag-of-words assumption is commonly used in text categorization. Under this assumption, a post is represented simply as a set of words or n-grams without any ordering. This assumption certainly omits an important aspect of language but has nevertheless proven powerful in numerous tasks. In this setting, there are various ways to assign higher weights to terms that may be more important, such as TF-IDF [23]. For a general information retrieval review, see [24].
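The following is a minimal sketch of the bag-of-words representation with TF-IDF weighting, using scikit-learn's TfidfVectorizer on placeholder posts.

```python
# Bag-of-words with TF-IDF weighting (toy posts are placeholders).
from sklearn.feature_extraction.text import TfidfVectorizer

posts = ["they are the problem", "they are our neighbors"]
vectorizer = TfidfVectorizer(ngram_range=(1, 2))  # word unigrams and bigrams
X = vectorizer.fit_transform(posts)               # one unordered, weighted vector per post
print(dict(zip(vectorizer.get_feature_names_out(), X.toarray()[0].round(2))))
```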

Besides distributional features, word embeddings (i.e., vectors assigned to words), such as word2vec [25], are common when applying deep learning methods in natural language processing and text mining [26, 27]. Some deep learning architectures, such as recurrent and transformer neural networks, challenge the bag-of-words assumption by modeling word order, processing a sequence of word embeddings [28].

Hate speech detection approaches and baselines.

Naïve Bayes, Support Vector Machine and Logistic Regression. These models are commonly used in text categorization. Naïve Bayes models label probabilities directly, under the assumption that the features do not interact with one another. Support Vector Machines (SVM) and Logistic Regression are linear classifiers that predict classes based on a combination of scores for each feature. Open-source implementations of these models exist, for instance in the well-known Python machine learning package scikit-learn [29].
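A minimal sketch of these three baselines with scikit-learn appears below; the toy texts and labels are placeholders for the real datasets described earlier.

```python
# The three baseline classifiers, each in a TF-IDF pipeline (placeholder data).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["hateful example one", "benign example two", "hateful example three", "benign example four"]
labels = [1, 0, 1, 0]  # 1 = hate, 0 = not hate (placeholder labels)

for clf in (MultinomialNB(), LinearSVC(), LogisticRegression()):
    model = make_pipeline(TfidfVectorizer(), clf).fit(texts, labels)
    print(type(clf).__name__, model.predict(["hateful example five"]))
```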

Davidson et al. [ 9 ]. Davidson et al. proposed a state-of-the-art feature-based classification model that incorporates distributional TF-IDF features, part-of-speech tags, and other linguistic features using support vector machines. The incorporation of these linguistic features helps identify hate speech by distinguishing between different usages of terms, but it still suffers from some subtleties, such as when typically offensive terms are used in a positive sense (e.g., queer in “He’s a damn good actor. As a gay man, it’s awesome to see an openly queer actor given the lead role for a major film.”, from the HatebaseTwitter dataset [9]).

Neural Ensemble [ 10 ]. Zimmerman, et al. propose an ensemble approach, which combines the decisions of ten convolutional neural networks with different weight initializations [ 10 ]. Their network structure is similar to the one proposed by [ 30 ], with convolutions of length 3 pooled over the entire document length. The results of each model are combined by averaging the scores, akin to [ 31 ].

FastText [ 32 ]. FastText is an efficient classification model proposed by researchers at Facebook. The model produces embeddings of character n-grams and makes predictions based on those embeddings. Over time, this model has become a strong baseline for many text categorization tasks.
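A hedged usage sketch follows, assuming the open-source fasttext Python package and a training file in its expected "__label__ text" format; neither is prescribed by the original work.

```python
# Sketch of FastText's supervised classifier. Assumes the `fasttext` package
# and a file train.txt with lines like "__label__hate some example text".
import fasttext

model = fasttext.train_supervised("train.txt", wordNgrams=2)  # n-gram embeddings
labels, probs = model.predict("example post to classify")
print(labels, probs)
```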

BERT [ 26 ]. BERT is a recent transformer-based pre-trained contextualized embedding model extendable to a classification model with an additional output layer. It achieves state-of-the-art performance in text classification, question answering, and language inference without substantial task-specific modifications. When we experiment with BERT, we add a linear layer atop the classification token (as suggested by [ 26 ]), and test all suggested tuning hyperparameters.
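The sketch below shows this setup using the Hugging Face transformers library, an implementation assumption on our part; the library's sequence classification model places a linear layer over the classification token as described.

```python
# BERT with a linear classification head over the [CLS] token, via the
# Hugging Face `transformers` library (an assumed toolkit, not named in the paper).
import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

inputs = tokenizer("example post to classify", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # one score per class (hate / not hate)
print(logits.softmax(dim=-1))
```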

C-GRU [ 33 ]. C-GRU, a Convolution-GRU Based Deep Neural Network proposed by Zhang et al., combines convolutional neural networks (CNN) and gated recurrent units (GRU) to detect hate speech on Twitter. The authors conduct several evaluations on publicly available Twitter datasets, demonstrating the model’s ability to capture word sequence and order in short text. Note that in the HatebaseTwitter [9] dataset, they treat both Hate and Offensive as Hate, resulting in a binary label instead of the original multi-class labels. In our evaluation, we use the original multi-class labels, so different model evaluation results are expected.

Our proposed classifier: Multi-view SVM

We propose a multi-view SVM model for the classification of hate speech. It applies a multiple-view stacked Support Vector Machine (mSVM) [ 34 ]. Each type of feature (e.g., a word TF-IDF unigram) is fitted with an individual Linear SVM classifier (inverse regularization constant C = 0.1), creating a view-classifier for those features. We further combine the view classifiers with another Linear SVM ( C = 0.1) to produce a meta-classifier . The features used in the meta-classifier are the predicted probability of each label by each view-classifier. That is, if we have 5 types of features (e.g., character unigram to 5-gram) and 2 classes of labels, 10 features would serve as input into the meta-classifier.
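The following is a minimal sketch of this architecture under stated assumptions: scikit-learn is used, and because LinearSVC does not expose predicted probabilities, CalibratedClassifierCV supplies the probability estimates fed to the meta-classifier. The toy texts and labels stand in for real training data.

```python
# Minimal sketch of the multi-view stacked SVM (mSVM) idea.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.svm import LinearSVC

texts = ["hateful post one", "benign post two", "hateful post three",
         "benign post four", "hateful post five", "benign post six"]  # placeholders
labels = [1, 0, 1, 0, 1, 0]

views = [  # one view per feature type
    TfidfVectorizer(ngram_range=(1, 1)),                   # word unigram TF-IDF
    TfidfVectorizer(ngram_range=(2, 2)),                   # word bigram TF-IDF
    CountVectorizer(analyzer="char", ngram_range=(4, 4)),  # char 4-gram counts
]

view_probs = []
for vec in views:
    X = vec.fit_transform(texts)
    # Calibration approximates per-label probabilities for the linear SVM.
    clf = CalibratedClassifierCV(LinearSVC(C=0.1), cv=3).fit(X, labels)
    view_probs.append(clf.predict_proba(X))

meta_X = np.hstack(view_probs)           # 3 views x 2 labels = 6 meta-features
meta = LinearSVC(C=0.1).fit(meta_X, labels)  # the meta-classifier
```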

Combining machine learning classifiers is not a new concept [ 35 ]. Previous efforts have shown that combining SVM with different classifiers provides improvements to various data mining tasks and text classification [ 36 , 37 ]. Combining multiple SVMs (mSVMs) has also been proven to be an effective approach in image processing tasks for reducing the large dimensionality problem [ 38 ].

However, applying multiple SVMs to identify hate speech expands the domain of use for such classification beyond that previously explored. Multi-view learning is known for capturing different views of the data [ 34 ]. In the context of hate speech detection, incorporating different views captures differing aspects of hate speech within the classification process. Instead of combining all features into a single feature vector, each view-classifier learns to classify the sentence based on only one type of feature. This allows the view-classifiers to pick up different aspects of the pattern individually.

Integrating all feature types into one model risks masking relatively weak but key signals through regularization. For example, “yellow” and “people” individually appear more often than “yellow people” combined, and posts containing these terms individually are unlikely to be hate. However, “yellow people” is likely hate speech (especially when other hate speech aspects are present), but the signal may be rare in the collection and is therefore likely masked by regularization when all features are combined. In this case, mSVM is able to pick up this feature in one of the view-classifiers, where there are fewer parameters.

Furthermore, this model offers interpretability: identifying, through the meta-classifier, which view-classifier contributes most to a decision provides human intuition for the classification. The view-classifier contributing most to the final decision identifies the key vocabulary (features) resulting in a hate speech label. This contrasts with well-performing neural models, which are often opaque and difficult to understand [10, 39, 40]. Even state-of-the-art methods that employ self-attention (e.g., BERT [26]) suffer from considerable noise that vastly reduces interpretability.

Experimental setup

Using multiple hate speech datasets, we evaluated the accuracy of existing hate speech detection approaches as well as our own.

Data preprocessing and features.

For simplicity and generality, preprocessing and feature identification are intentionally minimal. For preprocessing, we apply case-folding, tokenization, and punctuation removal (while keeping emoji). For features, we simply extract word TF-IDF for unigrams to 5-grams and character n-gram counts for unigrams to 5-grams.
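A small sketch of this preprocessing is shown below; the regular expression removes ASCII punctuation only, which is one simple way to leave emoji intact.

```python
# Case-folding, whitespace tokenization, and punctuation removal that
# leaves emoji untouched (one simple realization of the described steps).
import re
import string

PUNCT = re.compile("[" + re.escape(string.punctuation) + "]")

def preprocess(text: str) -> list:
    text = text.casefold()       # case-folding
    text = PUNCT.sub(" ", text)  # strip ASCII punctuation; emoji survive
    return text.split()          # simple whitespace tokenization

print(preprocess("They did WHAT?! Unbelievable... 😡"))
# ['they', 'did', 'what', 'unbelievable', '😡']
```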

We evaluate the approach on the Stormfront [ 14 ], TRAC [ 19 ], HatEval, and HatebaseTwitter [ 9 ] datasets previously described. These datasets provide a variety of hate speech definitions and aspects (including multiple types of aggression), and multiple types of online content (including online forums, Facebook, and Twitter content). For Stormfront, we use the balanced train/test split proposed in [ 14 ], with a random selection of 10% of the training set held out as validation data. For the TRAC dataset, we use the English Facebook training, validation, and test splits provided by [ 19 ]. For HatEval, we use a split of the training set for validation and use the official validation dataset for testing because the official test set is not public. Finally, for the HatebaseTwitter dataset [ 9 ], we use the standard train-validation-test split provided by [ 9 ].

Evaluation.

We evaluate the performance of each approach using accuracy and macro-averaged F1 score. There is no consensus in the literature about which evaluation metrics to use; however, we believe that reporting both accuracy and macro-F1 offers good insight into the relative strengths and weaknesses of each approach.
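Both metrics are available in scikit-learn; a brief sketch on hypothetical predictions:

```python
# Accuracy and macro-averaged F1 on placeholder predictions.
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 0, 0, 1, 1]
y_pred = [0, 0, 1, 1, 0]
print(accuracy_score(y_true, y_pred))             # 0.6: fraction of correct labels
print(f1_score(y_true, y_pred, average="macro"))  # unweighted mean of per-class F1
```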

Experimental results

We report the highest score of the approaches described above on each dataset in Table 2. Complete evaluation results are available in supporting document S1 Table (including accuracy breakdown by label).

[Table 2. The top two approaches on each dataset are reported. https://doi.org/10.1371/journal.pone.0221152.t002]

On the Stormfront and TRAC datasets, our proposed approach provides state-of-the-art or competitive results for hate speech detection. On Stormfront, the mSVM model achieves 80% accuracy in detecting hate speech, a 7% improvement over the best published prior work (which achieved 73% accuracy). BERT performs 2% better than our approach, but the decisions the BERT model makes are difficult to interpret.

On the TRAC dataset, our mSVM approach achieves a 53.68% macro-F1 score. Note that through optimization on the validation set, we found that using TF-IDF weights for character n-grams works better on the Facebook dataset, so we report results using those TF-IDF weights instead of raw counts. This outperforms all other approaches we experimented with, including the strong BERT system. We also compared our approach to the other systems that participated in the shared task [19] and observed that we outperform them as well, in terms of the metric they reported (weighted F-score), by 1.34% or more. This is particularly impressive because our approach outperformed systems that rely on external datasets and data augmentation strategies.

Our approach outperformed the top-ranked ensemble method [41] by 3.96% in terms of accuracy and 2.41% in terms of F1. This indicates that mSVM learns from different aspects and preserves more signals than a simple ensemble method that uses all features for each first-level classifier. BERT achieved 3% lower accuracy and 1% lower F1 than our proposed method while still providing minimal interpretability, demonstrating that forgoing interpretability does not necessarily yield higher accuracy. For HatEval and HatebaseTwitter, the neural ensemble approach outperforms our method, suggesting that neural approaches are better suited to Twitter data than our mSVM-based solution.

Previous works reported various metrics, e.g., a support-weighted F1 in Davidson et al. [9], making comparison between models difficult. We report macro-F1 to mitigate the effect of class imbalance, an effect baked in during the construction of the datasets. For a fair and complete comparison between the systems, we execute the systems from previous works and calculate macro-F1 on the datasets reported in this study. The previous best performance on the Stormfront dataset used a recurrent neural network to achieve an accuracy of 0.73 [14]; our approach easily outperforms this method. On the TRAC dataset, others reported a weighted F1 of 0.6425 using a recurrent neural network, without reporting accuracy or macro-averaged F1 [19, 42]. On HatebaseTwitter, others reported a macro-F1 score of 0.94 [33], but this was achieved by combining the hate and offensive categories, greatly simplifying the task.

In S1 Table , we observe that for most datasets and approaches, the accuracy is biased towards the majority class in the training data. This suggests the need for datasets that are more representative of real data distributions for future evaluation.

Considering these mixed evaluation results, and given the potential ethical concerns related to hate speech detection, we err on the side of caution and opt for interpretability over uncertain improvements in evaluation metrics.

Interpretation of mSVM.

We analyzed the top features of the mSVM classifier on the Stormfront dataset. The meta-classifier weights character 4-grams and word unigrams as the highest contributors to the overall score. Character 4-grams such as “jew”, “ape”, “mud”, and “egro” are among the strongest signals of hate. (Note that whitespace contributes to character 4-grams.) This category appears to capture the aspect of a group’s identity. Word unigrams such as “invasion” and “violence” contribute highly to hate classification, and appear to capture the attack aspect. The top word unigrams, 2-grams, and 3-grams from the view-classifier results of each dataset are in S3 Table. We found that the accuracy of every view-classifier is at least two percent lower than that of the meta-classifier. The full comparison between view-classifier and meta-classifier results is given in supplementary information S2 Table. We also found that, although three other view-classifiers outperform the word unigram model, the meta-classifier still weights its score higher than those models, further suggesting that it captures a different hate speech aspect.
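A sketch of how such top features can be read off a view-classifier's weights is given below, on toy data; a real analysis would use the trained Stormfront view-classifiers.

```python
# Reading top features off a linear SVM view-classifier (placeholder data).
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

texts = ["hateful toy post", "benign toy post", "hateful toy text", "benign toy text"]
labels = [1, 0, 1, 0]

vec = CountVectorizer(analyzer="char", ngram_range=(4, 4))  # char 4-gram view
clf = LinearSVC(C=0.1).fit(vec.fit_transform(texts), labels)

names = vec.get_feature_names_out()
top = np.argsort(clf.coef_[0])[::-1][:10]  # largest weights push toward 'hate'
print([names[i] for i in top])
```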

Interpretation of BERT.

Because the BERT model employs a self-attention mechanism, one can visualize the terms the model relies on most for classification. We present attention visualizations from BertViz [43] for the trained BERT model on the misclassified forum post “I don’t think anyone is insinuating that we are equal to non whites, or that we would ignore white nations.” (this post does not satisfy the authors’ conditions for hate speech, but the BERT model classified it as hateful). We present detailed attention weights for all 12 attention heads of the classification token on layer 11 in Fig 1. Despite appearing to be the most informative layer, layer 11 does not provide a clear answer to why the model labeled the post as hateful; the attention is distributed among most words in the sentence, and many of the terms receiving the most attention do not appear to be informative (e.g., “we”). When investigating the other layers (an overview is given in S1 Fig in the supplementary information) and other posts, we similarly do not see strong trends that would enable interpretability. This demonstrates the limitation of deep neural models, even those with claims of interpretability, when trying to interpret the decisions made. These observations are in line with prior work that has found attention signals to be noisy and not necessarily indicative of term importance [39, 40]. While our approach can be combined with neural models, doing so would come at the expense of increased model complexity and reduced interpretability.

[Fig 1. Each color represents a different attention head, and the lightness of the color represents the amount of attention. For instance, the figure indicates that nearly all attention heads focus heavily on the term ‘we’. https://doi.org/10.1371/journal.pone.0221152.g001]

Error analysis.

To gain a better understanding of our mSVM classifier’s mistakes, we qualitatively analyzed its false positive (FP) and false negative (FN) samples on the Stormfront dataset. We categorized the misclassified posts based on their mutual linguistic features, semantic features, and length. 41% of the posts misclassified as not hate needed surrounding context to understand that the post is hate speech. 7% of the FN were implicit hate, making them difficult to classify, such as “Indeed, I haven’t seen or heard machines raping or robbing people in the streets of Stockholm yet, non-european immigrants however…”. Furthermore, given that inter-annotator agreement is not perfect in the dataset [14] (prior work shows that high inter-annotator agreement for hate speech is difficult to achieve [11, 44]), we analyzed some borderline cases against the definition of hate speech used for annotation. When manually re-assessing the misclassified posts, we found that the gold labels of 17% of the FN and 10% of the FP posts do not match our interpretation of the post content. Another major problem is posts that are aggressive but do not meet the necessary conditions to be considered hate speech; these constitute 16% of the FP. Finally, short posts (6 or fewer terms, representing less than 3% of hate speech sentences found in the dataset) increased FP as well, occurring 7% of the time. The remaining misclassified posts were miscellaneous cases, including posts that are sarcastic or metaphoric.

Shortcomings and future work

A challenge faced by automatic hate speech detection systems is the changing of attitudes towards topics over time and historical context. Consider the following excerpt of a Facebook post:

“…The merciless Indian Savages, whose known rule of warfare, is an undistinguished destruction of all ages, sexes and conditions…”

Intuition suggests that this is hate speech; it refers to Native Americans as “merciless Indian savages”, and dehumanizes them by suggesting that they are inferior. Indeed, the text satisfies conditions used in most definitions of hate speech. However, this text is actually a quote from the Declaration of Independence. Given the historical context of the text, the user who posted it may not have intended the hate speech result, but instead meant to quote the historical document for other purposes. This shows that user intent and context play an important role in hate speech identification.

As another example, consider the phrase “the Nazi organization was great.” This would be considered hate speech because it shows support for a hate group. However, “the Nazis’ organization was great” does not support their ideals but instead comments on how well the group was organized. In some contexts, this might not be considered hate speech, e.g., if the author was comparing organizational effectiveness over time. The difference between these two phrases is subtle, but could be enough to make the difference between hate speech and not.

Another remaining challenge is that automatic hate speech detection is a closed-loop system; individuals are aware that it is happening, and actively try to evade detection. For instance, online platforms removed hateful posts from the suspect in the recent New Zealand terrorist attack (albeit manually), and implemented rules to automatically remove the content when re-posted by others [2]. Users who desired to spread the hateful messages quickly found ways to circumvent these measures by, for instance, posting the content as images containing the text rather than the text itself. Although optical character recognition can be employed to solve this particular problem, it further demonstrates the difficulty of hate speech detection going forward. It will be a constant battle between those trying to spread hateful content and those trying to block it.

As hate speech continues to be a societal problem, the need for automatic hate speech detection systems becomes more apparent. We presented the current approaches to this task and proposed a new system that achieves reasonable accuracy, in some cases outperforming existing systems, with the added benefit of improved interpretability. Given all the challenges that remain, there is a need for more research on this problem, covering both technical and practical matters.

Supporting information

S1 Table. Full comparison of hate speech classifiers.

https://doi.org/10.1371/journal.pone.0221152.s001

S2 Table. Full comparison of view classifiers in mSVM.

https://doi.org/10.1371/journal.pone.0221152.s002

S3 Table. Top 10 weighted terms learned by the word-level view-classifier.

This list has been sanitized.

https://doi.org/10.1371/journal.pone.0221152.s003

S1 Fig. Visualization of self-attention weights for the forum BERT model.

All layers and attention heads for the sentence “ I don’t think anyone is insinuating that we are equal to non whites, or that we would ignore white nations. ” are included. Darker lines indicate stronger attention between terms. The first token is the special classification token.

https://doi.org/10.1371/journal.pone.0221152.s004

Acknowledgments

We thank Shabnam Behzad and Sajad Sotudeh Gharebagh for reviewing early versions of this paper and for helpful feedback on this work. We also thank the anonymous reviewers for their insightful comments.

  • 1. Robertson C, Mele C, Tavernise S. 11 Killed in Synagogue Massacre; Suspect Charged With 29 Counts. 2018;.
  • 2. The New York Times. New Zealand Shooting Live Updates: 49 Are Dead After 2 Mosques Are Hit. 2019;.
  • 3. Hate Speech—ABA Legal Fact Check—American Bar Association;. Available from: https://abalegalfactcheck.com/articles/hate-speech.html .
  • 4. Community Standards;. Available from: https://www.facebook.com/communitystandards/objectionable_content .
  • 5. Hate speech policy—YouTube Help;. Available from: https://support.google.com/youtube/answer/2801939 .
  • 6. Hateful conduct policy;. Available from: https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy .
  • 7. Mondal M, Silva LA, Benevenuto F. A Measurement Study of Hate Speech in Social Media. In: ACM HyperText; 2017.
  • 9. Davidson T, Warmsley D, Macy MW, Weber I. Automated Hate Speech Detection and the Problem of Offensive Language. ICWSM. 2017;.
  • 10. Zimmerman S, Kruschwitz U, Fox C. Improving Hate Speech Detection with Deep Learning Ensembles. In: LREC; 2018.
  • 11. Ross B, Rist M, Carbonell G, Cabrera B, Kurowsky N, Wojatzki M. Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis. In: The 3rd Workshop on Natural Language Processing for Computer-Mediated Communication @ Conference on Natural Language Processing; 2016.
  • 14. de Gibert O, Perez N, García-Pablos A, Cuadros M. Hate Speech Dataset from a White Supremacy Forum. In: 2nd Workshop on Abusive Language Online @ EMNLP; 2018.
  • 15. Popat K, Mukherjee S, Yates A, Weikum G. DeClarE: Debunking Fake News and False Claims using Evidence-Aware Deep Learning. In: EMNLP; 2018.
  • 16. Hatebase;. Available from: https://hatebase.org/ .
  • 17. Waseem Z, Hovy D. Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter. In: SRW@HLT-NAACL; 2016.
  • 18. Waseem Z. Are you a racist or am i seeing things? annotator influence on hate speech detection on twitter. In: Proceedings of the first workshop on NLP and computational social science; 2016. p. 138–142.
  • 19. Kumar R, Ojha AK, Malmasi S, Zampieri M. Benchmarking Aggression Identification in Social Media. In: Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018). ACL; 2018. p. 1–11.
  • 20. CodaLab—Competition;. Available from: https://competitions.codalab.org/competitions/19935 .
  • 21. Detecting Insults in Social Commentary;. Available from: https://kaggle.com/c/detecting-insults-in-social-commentary .
  • 24. Grossman DA, Frieder O. Information Retrieval: Algorithms and Heuristics. Berlin, Heidelberg: Springer-Verlag; 2004.
  • 25. Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J. Distributed Representations of Words and Phrases and their Compositionality. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, editors. Advances in Neural Information Processing Systems 26. Curran Associates, Inc.; 2013. p. 3111–3119.
  • 26. Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 [cs]. 2018;.
  • 27. Yang Z, Chen W, Wang F, Xu B. Unsupervised Neural Machine Translation with Weight Sharing. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics; 2018. p. 46–55. Available from: http://aclweb.org/anthology/P18-1005 .
  • 28. Kuncoro A, Dyer C, Hale J, Yogatama D, Clark S, Blunsom P. LSTMs Can Learn Syntax-Sensitive Dependencies Well, But Modeling Structure Makes Them Better. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Melbourne, Australia: Association for Computational Linguistics; 2018. p. 1426–1436. Available from: http://aclweb.org/anthology/P18-1132 .
  • 30. Kim Y. Convolutional Neural Networks for Sentence Classification. In: EMNLP; 2014.
  • 31. Hagen M, Potthast M, Büchner M, Stein B. Webis: An Ensemble for Twitter Sentiment Detection. In: SemEval@NAACL-HLT; 2015.
  • 32. Joulin A, Grave E, Bojanowski P, Mikolov T. Bag of Tricks for Efficient Text Classification. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers. ACL; 2017. p. 427–431.
  • 33. Zhang Z, Robinson D, Tepper J. Detecting hate speech on twitter using a convolution-gru based deep neural network. In: European Semantic Web Conference. Springer; 2018. p. 745–760.
  • 34. Zhao J, Xie X, Xu X, Sun S. Multi-view learning overview: Recent progress and new challenges. Information Fusion. 2017;.
  • 36. Chand N, Mishra P, Krishna CR, Pilli ES, Govil MC. A comparative analysis of SVM and its stacking with other classification algorithm for intrusion detection. In: 2016 International Conference on Advances in Computing, Communication, & Automation (ICACCA)(Spring). IEEE; 2016. p. 1–6.
  • 37. Dong YS, Han KS. Boosting SVM classifiers by ensemble. In: Special interest tracks and posters of the 14th international conference on World Wide Web. ACM; 2005. p. 1072–1073.
  • 38. Abdullah A, Veltkamp RC, Wiering MA. Spatial pyramids and two-layer stacking SVM classifiers for image categorization: A comparative study. In: 2009 International Joint Conference on Neural Networks. IEEE; 2009. p. 5–12.
  • 39. Jain S, Wallace BC. Attention is not Explanation. ArXiv. 2019;abs/1902.10186.
  • 40. Serrano S, Smith NA. Is Attention Interpretable? In: ACL; 2019.
  • 41. Arroyo-Fernández I, Forest D, Torres JM, Carrasco-Ruiz M, Legeleux T, Joannette K. Cyberbullying Detection Task: The EBSI-LIA-UNAM system (ELU) at COLING’18 TRAC-1. In: The First Workshop on Trolling, Aggression and Cyberbullying @ COLING; 2018.
  • 42. Aroyehun ST, Gelbukh A. Aggression Detection in Social Media: Using Deep Neural Networks, Data Augmentation, and Pseudo Labeling. In: Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018). Santa Fe, New Mexico, USA: Association for Computational Linguistics; 2018. p. 90–97. Available from: https://www.aclweb.org/anthology/W18-4411 .
  • 43. Vig J. Visualizing Attention in Transformer-Based Language Representation Models. arXiv preprint arXiv:1904.02679. 2019;.
  • 44. Waseem Z. Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter. In: NLP+CSS @ EMNLP; 2016.


Detecting Twitter hate speech in the COVID-19 era using machine learning and ensemble learning techniques

Akib Mohi Ud Din Khanday

a Department of Computer Sciences, Baba Ghulam Shah University, Rajouri, Jammu & Kashmir 185234, India

Syed Tanzeel Rabani

Qamar Rayees Khan, Showkat Hassan Malik

b Department of Computer Sciences, University of Kashmir, Srinagar, Jammu & Kashmir 190006, India

The COVID-19 pandemic has impacted every nation, and social isolation is the major protective measure against the coronavirus. People express themselves via Facebook and Twitter, and some disseminate disinformation and hate speech on Twitter. This research seeks to detect hate speech during COVID-19 using machine learning and ensemble learning techniques. Twitter data was extracted using the Twitter API with the help of hashtags trending during the COVID-19 pandemic. Tweets were manually annotated into two categories based on different factors. Features were extracted using TF/IDF, Bag of Words and Tweet Length. The study found the Decision Tree classifier to be effective: compared to other typical ML classifiers, it achieves 98% precision, 97% recall, 97% F1-score, and 97% accuracy. The Stochastic Gradient Boosting classifier outperforms all others with 99% precision, 97% recall, 98% F1-score, and 98.04% accuracy.

1. Introduction

Information and communication technology (ICT) advancements have altered how information is conveyed and received. Regardless of their diversity in behaviour, everyone wants to be kept up to date. People share views on online social networking sites; various clients use these platforms for spreading dubious or false information ( Joseph, Kar, & Ilavarasan, 2021 ). To secure information, we have to secure all links in the chain comprising PPT (People, Process, Technology), and people are usually the weakest link in any communication. Nowadays, adversarial use of social media is ubiquitous, and it is frequently used to distribute fake or misleading statements, posing a social, economic, and political risk ( Spohr, 2017 ; World Economic Forum, 2017 ).

As the COVID-19 pandemic expands, more and more people practice physically distancing themselves from one another. The coronaviruses include the Severe Acute Respiratory Syndrome (SARS), Middle East Respiratory Syndrome (MERS) and Acute Respiratory Distress Syndrome (ARDS) viruses. According to the World Health Organisation, symptoms of this virus are mild fever, sore throat, dry cough and runny nose ( Khanday, Rabani, Khan, Rouf, & Mohi ud Din, 2020 d). As of 6 July 2020, no vaccine or drug had been approved to cure this deadly virus. The COVID-19 pandemic has had a severe political, economic and social effect, and social media and communication systems have been affected in extraordinary ways. While classical media has tried to adjust to the quickly evolving situation, alternative news media on the internet gave the coronavirus their own ideological spin. These media have been criticized for promoting social confusion and spreading theoretically hazardous “fake news” or conspiracy philosophies via social media and other available platforms ( Bail et al., 2018 ; Kar & Aswani, 2021 ). Facebook, a social networking company which also owns WhatsApp and Instagram, published a report revealing that messaging doubled with the rise of the pandemic.

In certain countries, hate speech falls under the umbrella of free speech, though there are prohibitions against encouraging violence or societal disruption in the United States, Canada, France, the United Kingdom, and Germany ( Hua et al., 2013 ; Gillani, Yuan, Saveski, Vosoughi, & Roy, 2018 ). Facebook and Twitter have been criticised for not doing enough to prevent their services from being used to attack persons of a certain race, ethnicity, or gender ( Opinion | Twitter Must Do More to Block ISIS - The New York Times 2021 ). They have stated that they will work to eliminate prejudice and intolerance ( Facebook's Mark Zuckerberg ‘Understands Need to Stamp out Hate Speech’, Germany says | Daily Mail Online 2021 ). Meanwhile, remedial approaches, such as those used by Facebook and Twitter, have relied on users to report improper remarks, which is a manual effort ( Facebook, Google, and Twitter agree German hate speech deal - BBC News 2021 ; Grover, Kar, Dwivedi, and Janssen, 2019 ). This not only necessitates a lot of effort on the part of human experts but also increases the risk of biased judgements. Furthermore, a computer-based solution can accomplish this activity significantly faster than humans; a non-automated process performed by human annotators would significantly slow system reaction time.
The tremendous increase in user-generated content on social networking sites, together with the inability to scale manual screening, highlights the need to automate the detection of online hate speech ( Kushwaha, Kar, & Vigneswara Ilavarasan, 2020 ; Grover, Kar, & Ilavarasan, 2019 ; Wu & Gerber, 2018 ).

COVID-19 has created a social crisis by increasing inequality, exclusion and discrimination. Various rumours, philosophies, and propaganda regarding the coronavirus were shared massively on social network platforms (Facebook, Twitter, WhatsApp, etc.). Unfounded theories on potential causes and medicines made the rounds, triggering misperception and unsafe behaviour among people who followed these distorted and false recommendations. Hoaxes and propaganda are also being shared enormously through online social networks ( Khanday, Khan, & Rabani, 2020 b; Aswani, Kar, & Ilavarasan, 2019 ). With the advent of this pandemic in India, propaganda has spread many fabrications. Hatemongers spread hate by criticizing a specific community ( Khanday, Khan, & Rabani, 2020 a). Due to the Tableegi Jaamat event held at Nizamuddin Markaz, Delhi, India, various hate speech has been used to target a particular community. According to the latest Twitter data, various trending hashtags are being used to criticize a specific community, and many hate words are tweeted every day, which can lead to a hazardous situation ( Neubaum & Krämer, 2017 ). Detecting hate speech in real time remains a challenge in social networking, and it has attracted many researchers to develop scalable, automated methods for detecting hate speech using semantic content analysis based on Machine Learning (ML) and Natural Language Processing (NLP) ( Burnap & Williams, 2015 ; Ji Ho Park & Pascale, 2017 ). We extracted data from Twitter using trending hashtags, labelling tweets manually into two categories, Normal and Hate. Using the Twitter API and keywords such as #CoronaJihad, #CoronaTerrorisma, #COVID-19 and #MuslimCorona, about 11K tweets were retrieved. The main aim of this study is to create a classifier that can classify tweets into hate and non-hate categories using several fine-tuned Machine Learning techniques.
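A hedged sketch of such hashtag-based collection is shown below, assuming the tweepy library and placeholder credentials; the paper itself only states that the Twitter API was used.

```python
# Hashtag-based tweet collection with `tweepy` (an assumed library choice).
# All credential strings are placeholders.
import tweepy

auth = tweepy.OAuth1UserHandler("API_KEY", "API_SECRET", "ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth)

query = "#CoronaJihad OR #MuslimCorona OR #COVID-19"
tweets = [
    status.full_text
    for status in tweepy.Cursor(api.search_tweets, q=query, tweet_mode="extended").items(100)
]
```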

The noteworthy contributions of this paper are:

  • Hybrid feature engineering is performed by merging TF/IDF, Bag of Words and tweet length features.
  • Thirty thousand tweets are extracted from Twitter to form a dataset, of which 11,000 relate to hate speech and are labelled into the corresponding class.
  • Traditional machine learning and ensemble learning algorithms are trained and tested on the proposed hybrid features to classify the hate content shared through Twitter in the COVID-19 era.

The paper consists of six sections. A brief background on hate speech and machine learning is given in Section 2; Section 3 details the proposed methodology; experimental results are discussed in Section 4; Section 5 discusses the implications and limitations of the proposed work; and Section 6 concludes our work.

2. Related work

As social media has grown in popularity, research on automated hate speech detection has become a subject of public interest. When used to block text posting or to blocklist users, simple word-based algorithms fail to uncover subtle offending content and jeopardize the right to free speech and expression ( Khanday, 2022 ). The issue of word ambiguity originates from the fact that a single word can have multiple meanings in different contexts, and it is the fundamental cause of these approaches' high false-positive rate. Some conventional NLP techniques are also unsuccessful at detecting uncommon spellings in user-generated comment content ( Kar & Dwivedi, 2020 ). This is known as the "spelling variety problem", and it occurs when single characters in a token are intentionally or unintentionally replaced in order to evade detection. By and large, the intricacy of natural language constructions makes the task genuinely difficult, and the limited availability of datasets has kept researchers from solving it with the latest technology. Hate speech detection using supervised learning classification algorithms is not a new concept. Del Vigna, Cimino, Dell'Orletta, Petrocchi, and Tesconi (2017) found that a simple LSTM classifier did not outperform a standard SVM. Another method used a supervised model to detect objectionable language in tweets ( Davidson, Warmsley, Macy, & Weber, 2017 ), applying a binary classifier to separate abusive language from hate speech. Nobata, Tetreault, Thomas, Mehdad, and Chang (2016) built a supervised algorithm that uses various linguistic and grammatical features of the text, evaluated at the character unigram and bigram levels and validated on Yahoo data. In general, the most critical weakness of these NLP-based models is that they are not language agnostic, which leads to low detection scores.

Machine learning methods can be used for classification, but these algorithms need a large amount of training data. The term hate speech was used by Burnap and Williams (2015) , Gitari, Zuping, Damien, and Long (2015) , and Silva, Mondal, Correa, Benevenuto, and Weber (2016) . The state of the art typically casts hate speech detection as a supervised text classification task ( Schmidt & Wiegand, 2017 ; Dubois & Blank, 2018 ), and various classical machine learning algorithms that rely on manual feature selection can perform this binary classification ( Warner & Hirschberg, 2012 ; Kwok & Wang, 2013 ). Waseem (2016) showed how important annotation is to the classification task by comparing expert and amateur annotations: about 6,909 tweets were annotated using CrowdFlower, with annotators chosen based on their knowledge of hate speech, and the results showed that expert annotation yields better accuracy in classifying hate speech. Davidson et al. (2017) define hate speech as rhetoric that is used to show hatred toward a certain group, or is projected to denigrate, embarrass, or abuse the members of that group. Crowd-sourcing was used both to collect tweets comprising hate speech keywords and to label the tweets into multiple classes: tweets containing hate speech, tweets with offensive words, and tweets containing neither. The overall precision, recall and F1-score of their best model are 91%, 90% and 90%, respectively. However, about 40% of the tweets are misclassified: the hate class obtains a precision of 44% and a recall of 61%, and around 5% of offensive tweets and 2% of inoffensive tweets are wrongly classified as hate speech.

Hate speech can be detected using NLP concepts that exploit the lexical and syntactic features of sentences ( Waseem, 2016 ), as well as AI solutions with bag-of-words-based text representations ( Dubois & Blank, 2018 ). Unsupervised learning approaches for detecting offensive remarks in text are also widespread. Hateful users employ numerous obfuscation strategies, such as swapping a single character in insulting remarks, making automatic identification more difficult. Djuric et al. (2015), for example, trained a binary classifier on paragraph2vec representations of comments; it worked successfully, but only on a binary classification problem. In another solution based on unsupervised learning, the authors offered a set of criteria for judging whether or not a tweet is offensive ( Waseem & Hovy, 2016 ); they also discovered that differences in the geographic distribution of users have only a minor influence on detection performance. Another researcher used a crowd-sourced strategy for combating hate speech, constructing a new collection of annotations that supplements an existing dataset ( Waseem, 2016 ) and exploring the effect of annotators' experience on labelling performance. Jha and Mamidi (2017) also dealt with tweet classification, but their main focus was on sexism, which they classified as "hostile," "benevolent," or "other." They employed the Waseem and Hovy (2016) dataset of tweets, relabelling the existing 'Sexism' tweets as 'Hostile' while gathering their own tweets for the 'Benevolent' class, to which they then applied FastText ( Joulin, Grave, Bojanowski, & Mikolov, 2017 ) and SVM classifiers. To solve the challenge, a supervised learning model based on a neural network was also deployed; the technique surpassed every previously known unsupervised learning solution on the same dataset of tweets ( Badjatiya, Gupta, Gupta, & Varma, 2017 ).

In that work, character n-grams are used to extract features, and Gradient Boosted Decision Trees support an LSTM model. Convolutional Neural Networks (CNNs) using character n-grams and word2vec pre-trained vectors have also been studied as a potential solution to the hate speech problem in tweets. Park and Fung (2017) turned the categorization into a two-step problem, in which abusive language is first differentiated from non-abusive material and the sort of abuse (sexism or racism) is then determined. Gambäck and Sikdar (2017) forecast four classes using CNNs with pre-trained vectors and were marginally better than character n-grams in terms of F-score. Despite the success of NLP approaches in hate speech classification ( Schmidt & Wiegand, 2017 ), we believe machine learning models can still make a significant contribution to the problem. It is also worth highlighting the inherent difficulty of the challenge at this time, as indicated by the fact that no solution has yet managed to attain an F-score higher than 0.93. Table 1 summarizes the related work done in the field of hate speech on social networks.

Table 1. Summary of related work.

Research is needed to identify people who use hate speech on social media, focusing on their features and motivations as well as on the social structures in which they are embedded. From the literature review, the following findings can be drawn:

  • The majority of the work has been done on already existing datasets.
  • There is more scope for feature engineering; if done properly, the accuracy of the machine learning algorithms may increase.
  • The datasets used in existing work suffer from class imbalance.

3. Methodology

The proposed methodology for detecting hate speech using machine learning, shown in Fig. 1 , comprises a series of steps: (i) data collection, (ii) preprocessing, (iii) feature engineering, (iv) machine learning classification and (v) ensemble learning classification.

Fig 1

Proposed methodology.

3.1. Data collection

Data is extracted from Twitter, a social media platform widely used by celebrities and politicians to express their opinions, via its Application Program Interface (API) ( Verma, Khanday, Rabani, Mir, & Jamwal, 2019 ). A series of steps is followed to extract data using the Twitter API. We used the hashtags #CoronaJihad, #CoronaTerrorism and #MuslimCorona to extract data from 4th April to 8th April 2020. About 30K tweets were extracted, but only 11K were relevant. The data is saved as a CSV file so that it can be used for further analysis. The extracted dataset consists of about 16 attributes, such as Created, Text, Id and Screen Name.
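As a rough illustration of this collection step, the sketch below pulls tweets for the three hashtags with the tweepy client and saves them to a CSV file. The paper states only that the Twitter API was used, so the library choice, placeholder credentials, and column names here are our assumptions, not the authors' code.

```python
# Hypothetical data-collection sketch (library choice, credentials,
# and column names are assumptions; the paper only names the Twitter API).
import tweepy
import pandas as pd

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")  # placeholder credentials
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth, wait_on_rate_limit=True)

rows = []
for tag in ["#CoronaJihad", "#CoronaTerrorism", "#MuslimCorona"]:
    # search_tweets is the tweepy v4 name; older versions expose api.search
    for status in tweepy.Cursor(api.search_tweets, q=tag, lang="en",
                                tweet_mode="extended").items(1000):
        rows.append({"Created": status.created_at,
                     "Id": status.id,
                     "ScreenName": status.user.screen_name,
                     "Text": status.full_text})

pd.DataFrame(rows).to_csv("covid_hate_tweets.csv", index=False)  # saved for later analysis
```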

3.1.1. Tweet length distribution

We compute the length of each tweet in characters so that the sizes of hate speech and non-hate speech tweets can be compared. Fig. 2 gives the tweet length distribution of the whole dataset.

Fig 2

Length of tweets extracted.
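A minimal sketch of how such a distribution can be computed and plotted, assuming the CSV file and column names from the collection sketch above:

```python
# Tweet-length analysis: character length per tweet as a histogram
# (file and column names follow the earlier sketch and are assumptions).
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("covid_hate_tweets.csv")
df["Length"] = df["Text"].str.len()  # length in characters
df["Length"].plot(kind="hist", bins=30, title="Tweet length distribution")
plt.xlabel("Characters per tweet")
plt.show()
```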

3.1.2. Human annotation

Human annotation is a significant step in our research, since labelled tweets are needed to train the supervised machine learning models. Various researchers working in this area were given the task of annotation and asked to classify the text based on the context and the words used in each tweet. The task is a binary classification problem with two classes, Hate and Normal: tweets containing words like F**k, S**t, hate, worst, etc. were put in the Hate class, and the others in the Normal class. After annotation we obtained a dataset of 11K records, but it was unbalanced; to remove the imbalance, we retained 4,093 tweets, as shown in Fig. 3 .

Fig 3

Balanced data of each class and their corresponding length.
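The paper does not specify how the 4,093 balanced tweets were selected; one plausible realisation is to downsample the majority class, as in this hypothetical pandas sketch (file and column names are assumptions):

```python
# Balancing the annotated data by downsampling the majority class
# (an assumed procedure; the paper only reports the final count).
import pandas as pd

df = pd.read_csv("annotated_tweets.csv")   # assumed columns: Text, Label (Hate/Normal)
n_min = df["Label"].value_counts().min()   # size of the smaller class
balanced = (df.groupby("Label", group_keys=False)
              .apply(lambda g: g.sample(n=n_min, random_state=42)))
balanced.to_csv("balanced_tweets.csv", index=False)
```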

3.2. Preprocessing

The data collected from Twitter is unstructured and contains noise, null values, etc.; to refine it, the data must be preprocessed before it can be used for classification. Preprocessing is critical for deciphering the meaning of short texts in classification, clustering and anomaly detection applications. It has a large impact on overall system performance, yet it receives less attention than feature extraction and classification. The preprocessing stage prepares tweets for tasks such as event recognition, fraudulent information detection and sentiment analysis. On social media, people frequently follow their own informal language rules; as a result, each Twitter user has a personal writing style, complete with abbreviations, unusual punctuation, and misspelt words. Emoticons and emojis are used in tweets to convey nuance, sentiment, and ideas, and slang, acronyms, "URLs", "hashtags", and "user mentions" are common. Data noise is caused by unwanted strings and Unicode characters left over from the crawling process. In addition, practically all user-posted tweets include URLs that link to extra information, user mentions (@username), and the hashtag symbol (#coronaterrorism) that connects a message to a specific subject; hashtags can also express mood. These markers provide vital supplementary information to people, but they supply no information to machines and can be considered noise that must be dealt with. Researchers have proposed a number of techniques for handling this additional user-supplied data, including replacing URLs with tags in one study ( Agarwal et al., 2011 ) and removing user mentions (@username) in another ( Khan, Bashir, & Qamar, 2014 ).

To communicate sentiment and opinion, Twitter users use emoticons and emojis such as :-), ;-), :-( and others. This vital information must be captured to classify tweets effectively; in one approach, emojis and emoticons were replaced with words ( Gimpel et al., 2011 ). Twitter's character limit discourages natural language usage, prompting users to adopt acronyms, abbreviations, and slang. Abbreviations include MIA (missing in action), gr8 (great), and ofc (of course). Slang is a casual way of expressing thoughts or meaning that is sometimes limited to specific groups or settings; on Twitter, OMG usually signals surprise or emphasis rather than the literal expansion "oh my God". Consequently, replacing such casual insertions in tweets with their genuine word meaning improves automatic classifier performance without information loss; in one study, abbreviations and slang were translated into word meanings that standard text analysis tools could then easily process ( Scheuer et al., 2011 ). Humans understand punctuation well, but it is less useful for automatic text classification, so punctuation is usually eliminated when preparing text for tasks like sentiment analysis ( Lin & He, 2009 ). However, some punctuation characters, such as ! and ?, can communicate emotion, and replacing a question mark or exclamation mark with an appropriate tag can capture the astonishment they often express ( Balahur, 2013 ). Like stemming, lemmatization simplifies a word; in lemmatization, linguistic knowledge is used to turn a word into its base form. Only tweets written in English are considered, and they are converted to lowercase. Stopwords like a, an, the, etc. are removed using a stopword lexicon. Punctuation removal is also performed, and the text is divided into tokens (tokenization). Stemming is used to obtain root words (for example, "understanding" is converted to "understand"), links and URLs are removed, and lemmatization is applied. Fig. 4 shows a visual representation of the preprocessed dataset.

Fig 4

Preprocessed dataset.
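The following sketch strings these steps together with NLTK, which the paper reports using; the regular expressions and the exact order of operations are our assumptions rather than the authors' code.

```python
# Preprocessing sketch using NLTK (regex patterns and step order are assumptions).
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

nltk.download("stopwords"); nltk.download("punkt"); nltk.download("wordnet")
STOP = set(stopwords.words("english"))
stemmer, lemmatizer = PorterStemmer(), WordNetLemmatizer()

def preprocess(tweet: str) -> list[str]:
    text = tweet.lower()                              # English text, lowercased
    text = re.sub(r"http\S+|www\.\S+", " ", text)     # remove links/URLs
    text = re.sub(r"@\w+", " ", text)                 # remove user mentions
    text = re.sub(r"#", " ", text)                    # keep hashtag word, drop symbol
    text = re.sub(r"[^a-z\s]", " ", text)             # remove punctuation and digits
    tokens = nltk.word_tokenize(text)                 # tokenization
    tokens = [t for t in tokens if t not in STOP]     # stopword removal
    # the paper applies both stemming and lemmatization; combining them is our reading
    return [lemmatizer.lemmatize(stemmer.stem(t)) for t in tokens]
```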

3.3. Feature engineering

Feature engineering largely determines whether a machine learning classifier will perform well. In this step, features are extracted using multiple techniques: TF/IDF, Bag of Words and sentence length; emphatic features are also taken into consideration. Eq. (1) (reconstructed here from the standard TF/IDF definition) gives the weight of a term in the context of our corpus:

$$ \mathrm{tfidf}(t, w, D) = \mathrm{tf}(t, w) \times \log\frac{|D|}{|\{w' \in D : t \in w'\}|} \tag{1} $$

where t stands for the word used as a feature, w stands for a tweet in the corpus, and D stands for the set of all tweets in the corpus (the document space).

Bag of Words features: these are composed of words and lemmas. We used bigram and trigram terms to extract more information from the text. Some of the selected features are corona jihad, COVID, dangerous muslim, india, india come, dangerous, muslim coronavirus, report, coronavirus, coronajihad, billyperrigo, etc.
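A sketch of the hybrid feature construction with scikit-learn, stacking TF/IDF and Bag-of-Words n-gram features with the tweet length as an extra column; capping each vectorizer at 25 terms to approximate the paper's roughly 50 features is our assumption.

```python
# Hybrid feature engineering sketch: TF/IDF + Bag of Words + tweet length
# (vectorizer caps and parameter values are illustrative assumptions).
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer

texts = balanced["Text"].tolist()        # assumed: already-preprocessed tweet strings

tfidf = TfidfVectorizer(ngram_range=(1, 3), max_features=25)  # TF/IDF features
bow = CountVectorizer(ngram_range=(1, 3), max_features=25)    # Bag-of-Words features
length = csr_matrix(np.array([len(t) for t in texts]).reshape(-1, 1))  # length column

X = hstack([tfidf.fit_transform(texts), bow.fit_transform(texts), length])
y = (balanced["Label"] == "Hate").astype(int)  # 1 = Hate, 0 = Normal
```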

3.4. Classification using traditional machine learning

Machine learning algorithms are used to perform binary classification of tweets into the classes Hate and Normal. In this paper, traditional supervised machine learning algorithms are applied to the binary classification task: Logistic Regression (LR), Multinomial Naïve Bayes (MNB), Support Vector Machine (SVM) and Decision Tree.

3.4.1. Logistic regression

Logistic Regression (LR) predicts the probability that an instance belongs to a class based on its correlation with the labels ( Khanday, Khan, & Rabani, 2020 c). The input is a table of feature values; about 50 features are chosen using TF/IDF and Bag of Words during feature engineering. LR computes the class-membership probability for the two classes y ∈ {0, 1}. This probability can be computed using Eq. (2) (reconstructed here as the standard logistic function), where x is the feature vector and w, b are the learned weights and bias:

$$ P(y = 1 \mid x) = \frac{1}{1 + e^{-(w^{\top}x + b)}} \tag{2} $$

3.4.2. Multinomial naïve bayes

This machine learning algorithm uses the Bayes rule to calculate the class probabilities of tweets ( Rabani, Khan, & Khanday, 2020 ). Let d denote a class from the set {Hate, Normal} and N the total number of features; in our problem d ∈ {0, 1} and N = 50. Multinomial Naïve Bayes assigns a test tweet t_i to the class with the highest posterior probability P(d | t_i), computed with the Bayes rule shown in Eq. (3):

$$ P(d \mid t_i) = \frac{P(d)\, P(t_i \mid d)}{P(t_i)} \tag{3} $$

P(d) is estimated as the number of labelled tweets of class d divided by the total number of labelled tweets. P(t_i | d) is the probability of observing a tweet like t_i in class d and is computed by

$$ P(t_i \mid d) = \prod_{n=1}^{N} P(w_n \mid d)^{f_{ni}} $$

where f_{ni} is the count of word n in tweet t_i and P(w_n | d) is the probability of word n given class d. The latter probability is estimated from the training data (with Laplace smoothing, as in the standard formulation) as

$$ P(w_n \mid d) = \frac{1 + F_{nd}}{N + \sum_{x=1}^{N} F_{xd}} $$

where F_{xd} is the count of word x in the training tweets of class d.

3.4.3. Support vector machine

Support Vector Machine (SVM) is a supervised machine learning algorithm used to classify text into various classes ( Khanday, Khan, & Rabani, 2021 ). Each tweet is represented by about 50 unigram and bigram features, supplied as a table, and paired with a label, giving the training set {(y_k, x_k)}_{k=1}^{n}, where n is the number of training points. The aim of the SVM is to build a classifier of the form of Eq. (4):

$$ y(x) = \operatorname{sign}\!\left[\sum_{k=1}^{n} \alpha_k\, y_k\, K(x, x_k) + b\right] \tag{4} $$

where α_k is a positive real constant, b is a real constant, and K(·,·) is a kernel function, for example the radial basis function K(x, x_k) = exp(−‖x − x_k‖² / σ²) or the multilayer perceptron kernel K(x, x_k) = tanh(k xᵀx_k + θ), where k and σ are constants.

Assuming the following constraints, a classifier can be constructed in which +1 denotes the Hate class and −1 the Normal class:

$$ w^{\top}\varphi(x_k) + b \ge +1 \;\text{ if } y_k = +1, \qquad w^{\top}\varphi(x_k) + b \le -1 \;\text{ if } y_k = -1, $$

which is equivalent to Eq. (5):

$$ y_k\!\left[w^{\top}\varphi(x_k) + b\right] \ge 1, \quad k = 1, \dots, n \tag{5} $$

where φ(·) is a nonlinear mapping function used to map the input into a higher-dimensional space.

Classification is performed using a hyperplane that separates the two classes (Hate and Normal). To construct a hyperplane when the data are not perfectly separable, a slack variable ξ_k is introduced, giving the hyperplane constraints in Eq. (6):

$$ y_k\!\left[w^{\top}\varphi(x_k) + b\right] \ge 1 - \xi_k, \quad \xi_k \ge 0, \quad k = 1, \dots, n \tag{6} $$

3.4.4. Decision trees

Decision trees are an alternative method for performing binary classification ( Khanday, Khan, & Rabani, 2020 a). A decision tree partitions the input space into regions and classifies each region autonomously. It takes the 50 features as input in the form of a table, recursively splits the space according to the input, and classifies the tweets at the leaf nodes. An important function to consider while building a decision tree is the splitting criterion, which describes how the data must be split to maximize the performance of the tree. We use the information gain ratio: the information gain divided by the intrinsic information, shown in Eq. (7):

$$ \mathrm{IGR}(Ex, a) = \frac{\mathrm{IG}(Ex, a)}{\mathrm{IV}(Ex, a)} \tag{7} $$

where IG is the information gain and IV the intrinsic information. IG is computed from the entropy H of the training set Ex, where val(x, a) denotes the value of feature a for a particular training instance x ∈ Ex:

$$ \mathrm{IG}(Ex, a) = H(Ex) - \sum_{v \in \mathrm{vals}(a)} \frac{|\{x \in Ex \mid \mathrm{val}(x, a) = v\}|}{|Ex|}\; H\!\left(\{x \in Ex \mid \mathrm{val}(x, a) = v\}\right) $$

IV can be computed by:

$$ \mathrm{IV}(Ex, a) = -\sum_{v \in \mathrm{vals}(a)} \frac{|\{x \in Ex \mid \mathrm{val}(x, a) = v\}|}{|Ex|}\; \log_2 \frac{|\{x \in Ex \mid \mathrm{val}(x, a) = v\}|}{|Ex|} $$
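To make the criterion concrete, here is a small self-contained implementation of the gain ratio written directly from the standard definitions above (not taken from the paper's code):

```python
# Toy gain-ratio implementation (Eq. 7): information gain over intrinsic value.
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, labels, feature):
    # rows: list of dicts of feature values; labels: parallel list of classes
    n = len(labels)
    ig, iv = entropy(labels), 0.0
    for v in set(r[feature] for r in rows):
        subset = [l for r, l in zip(rows, labels) if r[feature] == v]
        p = len(subset) / n
        ig -= p * entropy(subset)   # information gain term
        iv -= p * math.log2(p)      # intrinsic value term
    return ig / iv if iv > 0 else 0.0
```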

3.5. Ensemble learning techniques

Ensemble classifiers are also used for the binary (Hate and Normal) classification task, since ensemble machine learning classifiers can improve accuracy. In our work, we used the Bagging, AdaBoost, Random Forest and Stochastic Gradient Boosting ensemble learning techniques.

3.5.1. Bagging

Ensemble learning techniques are applied to increase the accuracy of classification and regression tasks, and the bagging technique helps us avoid overfitting. Given a training set X of size n, bagging generates m new training sets X_i, each of size n′, by sampling from X uniformly with replacement. The input is presented as a table of values for the 50 selected features. Because of the replacement, some observations may repeat in each X_i; if n′ = n, then for large n each X_i is expected to contain a fraction (1 − 1/e) ≈ 63.2% of the unique instances of X, the rest being duplicates. Such a sample is known as a bootstrap sample. Using the m bootstrap samples, m models are fitted and combined by voting.

3.5.2. Adaboost

The AdaBoost algorithm reweights instances of the dataset ( Zimmerman, Fox, & Kruschwitz, 2019 ). The input is given as a table of values for the 50 selected features. AdaBoost starts by assigning an equal weight to every observation and trains a weak learner using the weighted data. Depending on the performance of the resulting weak classifier, it picks a coefficient α and updates the weights, increasing the weights of misclassified observations and decreasing those of correctly classified ones. The weak learning algorithm is then applied to the newly weighted data to generate the next weak classifier; iterating this process yields the AdaBoost classifier.

3.5.3. Random forest classifier

Random Forest is used for classification tasks and builds on decision trees. The bootstrap aggregating strategy is used to train this ensemble classifier: for regression trees the prediction is the average of the individual trees' forecasts, while for classification trees a majority vote is taken. Random Forest uses a modified tree-learning procedure that, at each split in the learning process, selects a random subset of the features. The data is presented as a table of values for the 50 selected features. The algorithm builds a forest from a set of decision trees fitted on randomly chosen subsets of the data and aggregates the trees' votes to choose the final class of the item.

3.5.4. Stochastic gradient boosting

Stochastic Gradient Boosting grows trees greedily from the training dataset. The data is presented as a table of values for the 50 selected features. Stochasticity is used to reduce the correlation among the trees in gradient boosting: at each iteration, a subsample of the training data is drawn at random, without replacement, from the full training set, and this randomly chosen subsample, rather than the full sample, is used to fit the base learner.
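A sketch instantiating the four ensemble learners named above with scikit-learn; the hyperparameter values are illustrative assumptions (note that setting `subsample` below 1.0 is what makes gradient boosting stochastic):

```python
# Ensemble learners used in this work, sketched with scikit-learn
# (hyperparameters are assumptions, not the paper's settings).
from sklearn.ensemble import (BaggingClassifier, AdaBoostClassifier,
                              RandomForestClassifier,
                              GradientBoostingClassifier)

ensembles = {
    "Bagging": BaggingClassifier(n_estimators=100, random_state=42),
    "AdaBoost": AdaBoostClassifier(n_estimators=100, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    # subsample < 1.0 draws a random fraction per iteration, i.e. *stochastic* boosting
    "Stochastic Gradient Boosting": GradientBoostingClassifier(
        n_estimators=100, subsample=0.8, random_state=42),
}
```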

4. Experimental results

The experimentation is performed on a workstation with 4 GB RAM and six parallel 2.3 GHz processors. Machine learning is performed using the scikit-learn toolkit; other libraries, such as the Natural Language Toolkit (NLTK), are used for tasks like tokenisation, lemmatization and stopword removal. After performing the statistical computations, further insight into the data was obtained. We used a 70:30 ratio for this task, with 70% of the tweets used to train the ML models and 30% used for testing. We extracted 30K tweets, of which 11K were relevant; after annotation, 4,093 tweets were retained to balance the dataset and were labelled into the two classes, Hate and Normal. Hybrid features are selected by merging standard feature engineering techniques (TF/IDF, Bag of Words and length). Classification was performed by supplying these features to various machine and ensemble learning algorithms. Fivefold cross-validation was performed, as we do not have other data on which the models could be validated. Table 2 shows the classification report of all machine learning classifiers.
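Putting the pieces together, the evaluation protocol described here might look as follows in scikit-learn, reusing `X` and `y` from the feature sketch and the `ensembles` dictionary from the previous sketch; the exact settings are assumptions:

```python
# Evaluation sketch: 70:30 split, five-fold CV, per-model classification report.
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42, stratify=y)  # 70:30 split

models = {"LR": LogisticRegression(max_iter=1000),
          "MNB": MultinomialNB(),
          "SVM": SVC(kernel="rbf"),
          "DT": DecisionTreeClassifier(random_state=42)}
models.update(ensembles)  # ensemble learners from the sketch above

for name, model in models.items():
    cv = cross_val_score(model, X_train, y_train, cv=5)  # fivefold cross-validation
    model.fit(X_train, y_train)
    print(name, "CV accuracy: %.3f" % cv.mean())
    print(classification_report(y_test, model.predict(X_test),
                                target_names=["Normal", "Hate"]))
```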

Table 2. Classification report of the proposed methodology with ML classifiers.

The results show that the Decision Tree classifier outperformed all other traditional machine learning algorithms, achieving a precision of 98%, a recall of 97% and an accuracy of 97.96%. Fig. 5 , Fig. 6 , Fig. 7 and Fig. 8 show the actual and predicted tweets of the traditional machine learning algorithms, visualised as confusion matrices. Table 3 shows the classification report of all machine and ensemble learning classifiers; the results show that the Stochastic Gradient Boosting classifier outperforms all other algorithms.

Fig 5

Logistic regression.

Fig 6

Multinomial naïve bayes.

Fig 7

Support vector machine.

Fig 8

Decision tree.

Table 3. Classification report of machine and ensemble learning techniques using the proposed methodology.

Figs. 9–12 show the confusion matrices of the corresponding ensemble learning techniques. The results show that the Decision Tree gives the best results among the traditional algorithms, with 98% precision, 97% recall, 97% F1 score and 97.9% overall accuracy. The Stochastic Gradient Boosting classifier shows the highest performance among all ensemble and machine learning classifiers, with 99% precision, 97% recall, 98% F1 score and 98.04% overall accuracy. Other ensemble learning classifiers, such as Random Forest, Bagging and AdaBoost, also show promising results. The accuracy of these models could be improved by supplying more data. Fig. 13 depicts a comparative study of all algorithms employed in our research.

Fig 9

Stochastic gradient boosting.

Fig 10

Random forest.

Fig 13

Comparative study of ML and ensemble learning classifiers.

5. Discussion

Hate speech detection on social media is a pressing issue, and in this paper we used machine learning algorithms to detect hate speech in the COVID-19 era. As the pandemic rose, online social networks saw a drastic change in behaviour, with users sharing information regarding COVID-19 at an enormous pace, and hatemongers exploited the pandemic to spread hate and panic, triggering mass hysteria. In this project, the Twitter API is used to extract data from Twitter using various hate-related terms. Since supervised machine learning requires a labelled dataset, manual annotation was performed to label the tweets into the Hate and Non-Hate classes; owing to the semantic and contextual nature of tweets, manual annotation is preferred by many researchers. Various techniques, such as tokenization, stemming and normalisation, are used for data preprocessing.

Since hate is spread in the form of text, feature selection is one of the most important steps in detecting it. Features are selected using the TF/IDF and Bag of Words techniques, and data exploration showed that tweet length plays a vital role in spreading hate; owing to this critical role, length was also considered as a feature. After selecting features, the supervised machine learning classifiers were trained and tested in a 70:30 ratio. With our methodology, the Decision Tree showed the best accuracy among the traditional algorithms. This work can help government officials tackle hate speech by analysing the tweets that carry it.

When compared with existing work, the HatebaseTwitter dataset ( Davidson, Warmsley, Macy, & Weber, 2017 ) was used with the proposed methodology, and the results showed that Decision Tree and Stochastic Gradient Boosting achieved better accuracy than all other algorithms. Table 4 shows the comparative analysis of our best-performing algorithms with previous work.

Table 4. Comparative analysis with existing work.

5.1. Contribution to literature

Following a series of experiments, it was determined that our method performed far better than the methods used in earlier investigations of hate speech. When training their model, Zimmerman et al. (2019) employed only TF/IDF features, whereas other researchers selected TF/IDF features together with PoS features. In this study, the hybrid features Bag of Words, TF/IDF, and tweet length were chosen. In the course of this research, a novel dataset was generated, with tweets collected without regard to their spatial context.

5.2. Practical implications

This work has several practical implications. One is that the ability to detect hate speech in real time will help combat hate speech on social networks. Hateful actors take advantage of the fact that social media platforms can be used as a medium of communication, and these platforms are consequently used to spread hatred among users; manually checking content for hate speech is a rigorous and time-consuming process, whereas machine learning can be used to identify those who engage in such behaviour. It was discovered that tweets containing hate speech about COVID-19 are, measured in characters, significantly longer than typical tweets of the Normal class. The work could be extended to other social networking platforms, such as Facebook, LinkedIn and Reddit. Automatic annotation programmes, had they existed, would have contributed significantly to this work: with automatic annotation, the resulting dataset size would have allowed more effective training of the machine learning classifiers. Additionally, the phrases used in hate speech fluctuate with the subject under discussion; in the near future, features based on emphasis and semantics may be employed to improve hate speech prediction.

6. Conclusion

The world was paralyzed by COVID-19, which affected social life, with no vaccine or medication available as of 26th July. Online social networks were used enormously in this pandemic for communication, and a vast amount of information was shared through these platforms, including much misinformation and hate speech about this deadly virus; hatemongers used COVID-19 as a platform for spreading hatred. Tweets were extracted using hashtags such as #CoronaJihad and #CoronaTerrorism and were labelled into the Hate and Normal classes. Hybrid feature selection was performed using TF/IDF, Bag of Words and tweet length, after which all machine learning models were trained and tested. The Decision Tree classifier shows promising results: 98% precision, 97% recall, 97% F1 score and an accuracy of 97.9%. Ensemble models were also trained and tested on the binary classification task; among the ensemble learning classifiers, Stochastic Gradient Boosting shows the highest performance, with 99% precision, 97% recall, 98% F1 score and an accuracy of 98.04%. Random Forest, AdaBoost and Bagging also showed promising results. The effectiveness of the classifiers can be improved by expanding the amount of data. In the future, hate speech may be categorized by gender, and Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) models may be used for multi-class classification.

Algorithm 1. Classification of tweets into the Hate and Non-Hate classes.

References

  • Balahur, A. (2013). Sentiment analysis in social media texts. Association for Computational Linguistics, Atlanta, Georgia, pp. 120–128.
  • Agarwal, A., Xie, B., Vovsha, I., Rambow, O., & Passonneau, R. (2011). Sentiment analysis of Twitter data. pp. 30–38.
  • Aswani, R., Kar, A. K., & Ilavarasan, P. V. (2019). Experience: Managing misinformation in social media – insights for policymakers from Twitter analytics. Journal of Data and Information Quality, 12(1).
  • Badjatiya, P., Gupta, S., Gupta, M., & Varma, V. (2017). Deep learning for hate speech detection in tweets. In Proceedings of the 26th International Conference on World Wide Web Companion (WWW 2017 Companion), pp. 759–760.
  • Bail, C. A., et al. (2018). Exposure to opposing views on social media can increase political polarization. Proceedings of the National Academy of Sciences of the United States of America, 115(37), 9216–9221.
  • Burnap, P., & Williams, M. L. (2015). Cyber hate speech on Twitter: An application of machine classification and statistical modeling for policy and decision making. Policy and Internet, 7(2), 223–242.
  • CodaLab – Competition. [Online]. Available: https://competitions.codalab.org/competitions/19935. [Accessed: 22-Dec-2021].
  • Davidson, T., Warmsley, D., Macy, M., & Weber, I. (2017). Automated hate speech detection and the problem of offensive language. In Proceedings of the Eleventh International AAAI Conference on Web and Social Media (ICWSM), pp. 512–515.
  • de Gibert, O., Perez, N., García-Pablos, A., & Cuadros, M. (2019). Hate speech dataset from a white supremacy forum. pp. 11–20.
  • Del Vigna, F., Cimino, A., Dell'Orletta, F., Petrocchi, M., & Tesconi, M. (2017). Hate me, hate me not: Hate speech detection on Facebook. MATEC Web of Conferences, 125, 86–95.
  • Djuric, N., Zhou, J., Morris, R., Grbovic, M., Radosavljevic, V., & Bhamidipati, N. (2015). Hate speech detection with comment embeddings. In Proceedings of the 24th International Conference on World Wide Web Companion (WWW Companion), pp. 29–30.
  • Dubois, E., & Blank, G. (2018). The echo chamber is overstated: The moderating effect of political interest and diverse media. Information, Communication & Society, 21(5), 729–745.
  • Facebook, Google and Twitter agree German hate speech deal – BBC News. [Online]. Available: https://www.bbc.com/news/world-europe-35105003. [Accessed: 22-Dec-2021].
  • Facebook's Mark Zuckerberg 'understands need to stamp out hate speech', Germany says | Daily Mail Online. [Online]. Available: https://www.dailymail.co.uk/news/article-3464501/Mark-Zuckerburg-understands-needs-stamp-hate-speech-Facebook-says-German-minister-meeting-discuss-deleting-neo-Nazi-comments-faster.html. [Accessed: 22-Dec-2021].
  • Gambäck, B., & Sikdar, U. K. (2017). Using convolutional neural networks to classify hate-speech. pp. 85–90.
  • Gillani, N., Yuan, A., Saveski, M., Vosoughi, S., & Roy, D. (2018). Me, my echo chamber, and I: Introspection on social media polarization. In Proceedings of the World Wide Web Conference, pp. 823–831.
  • Gimpel, K., et al. (2011). Part-of-speech tagging for Twitter: Annotation, features, and experiments. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT), Vol. 2, pp. 42–47.
  • Gitari, N. D., Zuping, Z., Damien, H., & Long, J. (2015). A lexicon-based approach for hate speech detection. International Journal of Multimedia and Ubiquitous Engineering, 10(4), 215–230.
  • Grover, P., Kar, A. K., Dwivedi, Y. K., & Janssen, M. (2019). Polarization and acculturation in US Election 2016 outcomes – Can Twitter analytics predict changes in voting preferences? Technological Forecasting and Social Change, 145, 438–460.
  • Grover, P., Kar, A. K., & Ilavarasan, P. V. (2019). Impact of corporate social responsibility on reputation – insights from tweets on sustainable development goals by CEOs. International Journal of Information Management, 48, 39–52.
  • Hua, T., et al. (2013). Analyzing civil unrest through social media. Computer, 46(12), 80–84.
  • Jha, A., & Mamidi, R. (2017). When does a compliment become sexist? Analysis and classification of ambivalent sexism using Twitter data. In Proceedings of the Second Workshop on NLP and Computational Social Science (pp. 7–16). Vancouver, Canada: Association for Computational Linguistics.
  • Park, J. H., & Fung, P. (2017). One-step and two-step classification for abusive language detection on Twitter. Association for Computational Linguistics, Vancouver, BC, Canada, pp. 41–45.
  • Joseph, N., Kar, A. K., & Ilavarasan, P. V. (2021). How do network attributes impact information virality in social networks? Information Discovery and Delivery, 49(2), 162–173.
  • Joulin, A., Grave, E., Bojanowski, P., & Mikolov, T. (2017). Bag of tricks for efficient text classification. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Vol. 2, pp. 427–431.
  • Kar, A. K., & Aswani, R. (2021). How to differentiate propagators of information and misinformation – insights from social media analytics based on bio-inspired computing. Journal of Information and Optimization Sciences, 42(6), 1307–1335.
  • Kar, A. K., & Dwivedi, Y. K. (2020). Theory building with big data-driven research – moving away from the 'What' towards the 'Why'. International Journal of Information Management, 54.
  • Khan, F. H., Bashir, S., & Qamar, U. (2014). TOM: Twitter opinion mining framework using hybrid classification scheme. Decision Support Systems, 57(1), 245–257.
  • Khanday, A. M. U. D., Khan, Q. R., & Rabani, S. T. (2020). Identifying propaganda from online social networks during COVID-19 using machine learning techniques. International Journal of Information Technology.
  • Khanday, A. M. U. D., Khan, Q. R., & Rabani, S. T. (2020). Detecting textual propaganda using machine learning techniques. Baghdad Science Journal, 199–209.
  • Khanday, A. M. U. D., Khan, Q. R., & Rabani, S. T. (2020). Analysing and predicting propaganda on social media using machine learning techniques. In Proceedings of the 2nd International Conference on Advances in Computing, Communication Control and Networking (ICACCCN), pp. 122–127.
  • Khanday, A. M. U. D., Khan, Q. R., & Rabani, S. T. (2021). SVMBPI: Support vector machine-based propaganda identification. In Cognitive Informatics and Soft Computing (pp. 445–455). Singapore: Springer.
  • Khanday, A. M. U. D., Rabani, S. T., Khan, Q. R., Rouf, N., & Mohi ud Din, M. (2020). Machine learning based approaches for detecting COVID-19 using clinical text data. International Journal of Information Technology.
  • Khanday, A. M. U. D., et al. (2022). NNPCov19: Artificial neural network-based propaganda identification on social media in COVID-19 era. Mobile Information Systems, 2022, 1–10. https://doi.org/10.1155/2022/3412992
  • Kumar, R., Ojha, A. K., Malmasi, S., & Zampieri, M. (2018). Benchmarking aggression identification in social media. TRAC, (1), 1–11.
  • Kushwaha, A. K., Kar, A. K., & Ilavarasan, P. V. (2020). Predicting information diffusion on Twitter: A deep learning neural network model using custom weighted word features. In 19th Conference on e-Business, e-Services and e-Society (I3E), Skukuza, South Africa, pp. 456–468. https://doi.org/10.1007/978-3-030-44999-5_38
  • Kwok, I., & Wang, Y. (2013). Locate the hate: Detecting tweets against blacks. Association for the Advancement of Artificial Intelligence, pp. 1621–1622.
  • Lin, C., & He, Y. (2009). Joint sentiment/topic model for sentiment analysis. In Proceedings of the ACM International Conference on Information & Knowledge Management, pp. 375–384.
  • MacAvaney, S., Yao, H.-R., Yang, E., Russell, K., Goharian, N., & Frieder, O. (2019). Hate speech detection: Challenges and solutions. PLOS ONE, 14(8), 1–16.
  • Neubaum, G., & Krämer, N. C. (2017). Opinion climates in social media: Blending mass and interpersonal communication. Human Communication Research, 43(4), 464–476.
  • Nobata, C., Tetreault, J., Thomas, A., Mehdad, Y., & Chang, Y. (2016). Abusive language detection in online user content. In Proceedings of the 25th International World Wide Web Conference (WWW), pp. 145–153.
  • Opinion | Twitter must do more to block ISIS – The New York Times. [Online]. Available: https://www.nytimes.com/2017/01/13/opinion/twitter-must-do-more-to-block-isis.html. [Accessed: 22-Dec-2021].
  • Rabani, S. T., Khan, Q. R., & Khanday, A. M. U. D. (2020). Detection of suicidal ideation on Twitter using machine learning & ensemble approaches. Baghdad Science Journal, 17(4), 1328–1339.
  • Scheuer, C., et al. (2011). Twitter sentiment analysis: The good the bad and the OMG! Physical Education and Sport for Children and Youth with Special Needs: Researches – Best Practices – Situation, pp. 538–541.
  • Silva, L., Mondal, M., Correa, D., Benevenuto, F., & Weber, I. (2016). Analyzing the targets of hate in online social media. In Proceedings of the 10th International Conference on Web and Social Media (ICWSM), pp. 687–690.
  • Spohr, D. (2017). Fake news and ideological polarization: Filter bubbles and selective exposure on social media. Business Information Review, 34(3), 150–160.
  • Verma, P., Khanday, A. M. U. D., Rabani, S. T., Mir, M. H., & Jamwal, S. (2019). Twitter sentiment analysis on Indian government project using R. International Journal of Recent Technology and Engineering, 8(3), 8338–8341.
  • Warner, W., & Hirschberg, J. (2012). Detecting hate speech on the world wide web. Association for Computational Linguistics, pp. 19–26.
  • Waseem, Z. (2016). Are you a racist or am I seeing things? Annotator influence on hate speech detection on Twitter. In Proceedings of the First Workshop on NLP and Computational Social Science, pp. 138–142.
  • Waseem, Z., & Hovy, D. (2016). Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. In Proceedings of the NAACL-HLT 2016 Student Research Workshop, pp. 88–93.
  • World Economic Forum (2017). The Global Risks Report 2017, 12th edition. Global Competitiveness and Risks Team, p. 103.
  • Wu, C., & Gerber, M. S. (2018). Forecasting civil unrest using social media and protest participation theory. IEEE Transactions on Computational Social Systems, 5(1), 82–94.
  • Zimmerman, S., Fox, C., & Kruschwitz, U. (2019). Improving hate speech detection with deep learning ensembles. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), pp. 2546–2553.
  • Schmidt, A., & Wiegand, M. (2017). A survey on hate speech detection using natural language processing. In Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media, pp. 1–10.


Cross-Platform Hate Speech Detection with Weakly Supervised Causal Disentanglement

Abstract: Content moderation faces a challenging task, as social media's ability to spread hate speech contrasts with its role in promoting global connectivity. With rapidly evolving slang and hate speech, the adaptability of conventional deep learning to the fluid landscape of online dialogue remains limited. In response, causality-inspired disentanglement has shown promise by segregating platform-specific peculiarities from universal hate indicators. However, its dependency on available ground-truth target labels for discerning these nuances faces practical hurdles with the incessant evolution of platforms and the mutable nature of hate speech. Using confidence-based reweighting and contrastive regularization, this study presents HATE WATCH, a novel framework of weakly supervised causal disentanglement that circumvents the need for explicit target labeling and effectively disentangles input features into invariant representations of hate. Empirical validation across four platforms (two with target labels and two without) positions HATE WATCH as a novel method in cross-platform hate speech detection with superior performance. HATE WATCH advances scalable content moderation techniques towards developing safer online communities.


Detection of Hate Speech in Videos Using Machine Learning



Hate Speech Detection Using Machine Learning Techniques

Akileng Isaac, Raju Kumar and Aruna Bhat. Conference paper in Sustainable Advanced Computing, Lecture Notes in Electrical Engineering, vol. 840, Springer, 2022, pp. 125–135.

Hate speech concerns most governments and the public due to the emergence of diverse social media applications and the increasing use of such media to disseminate hate speech against individuals, groups, communities, or races. Without hate speech detection and analysis, it is impossible to ensure that social media is free of malicious information. This paper provides a survey of hate speech detection across different approaches and compares them, focusing on aspects such as machine learning models, the features put to use, and datasets.


IMAGES

  1. Hate Speech Detection Using Machine Learning

    detecting hate speech on social media using machine learning

  2. How to Detect Hate Speech with Machine Learning?

    detecting hate speech on social media using machine learning

  3. GitHub

    detecting hate speech on social media using machine learning

  4. GitHub

    detecting hate speech on social media using machine learning

  5. Algorithms

    detecting hate speech on social media using machine learning

  6. Hate Speech Detection with Machine Learning

    detecting hate speech on social media using machine learning

VIDEO

  1. Sentimental analysis on twitter data using Hybrid Machine Learning Models

  2. EP 35 Omprakash parihar advocate supreme court @courtse #hate speech #canada bill

  3. Fake News Prediction using Python

  4. The Clever Hans Effect in ML: a case example of voice spoofing detection, by Bhusan Chettri

  5. AI Model for Detecting Offensive Language and Promoting Online Safety| Tensorflow Keras

  6. stress detection through speech analysis using machine learning

COMMENTS

  1. A survey on hate speech detection and sentiment analysis using machine learning and deep learning models

    Machine learning models for hate speech detection and sentiment analysis. The anonymity of social networks attracts hate speech, which presents a problem for the entire world, to hide their unlawful online behaviour. Detecting hate speech is crucial given the growing volume of social media data since it can have negative impacts on society [44 ...

  2. Exploring Automatic Hate Speech Detection on Social Media: A Focus on

    This paper presents a survey on automatic hate speech detection on social media, providing a structured overview of theoretical aspects and practical resources. ... Nobata et al. (2016) developed a machine learning based method to detect hate speech from the "Yahoo! Finance and News" dataset that outperformed a deep learning approach.

  3. Social Media Hate Speech Detection Using Machine Learning ...

    Data from Facebook has been gathered by the author. Formatting the dataset is the first step. The author used machine learning to incorporate hate speech that was taken from Facebook. The SVM approach had the loftiest F1 score of 0.71, while the Naive Bayes approach had the loftiest F1 score of 0.73.

  4. Hate speech detection in social media: Techniques, recent trends, and

    This analysis aims to create a valuable resource by summarizing the methods and strategies used to combat hate speech in social media. We perform a detailed review to achieve a deep knowledge of the hate speech detection landscape from 2018 to 2023, revealing global incidents of hate speech in 2022-2023.

  5. Hate speech detection: A comprehensive review of recent works

    In literature, there have been efforts to recognize and categorize hate speech using varied Machine Learning (ML) and Deep Learning (DL) techniques. Hence, considering the need and provocations for hate speech detection we aim to present a comprehensive review that discusses fundamental taxonomy as well as recent advances in the field of online ...

  6. (PDF) HATE SPEECH DETECTION USING MACHINE LEARNING: A SURVEY

    Abstract and Figures. This survey paper aims to provide a comprehensive overview of the existing research on hate speech detection using machine learning. We review various methodologies and ...

  7. Automatic Hate Speech Detection in English-Odia Code Mixed Social Media

    Hate speech on social media may spread quickly through online users and subsequently, may even escalate into local vile violence and heinous crimes. This paper proposes a hate speech detection model by means of machine learning and text mining feature extraction techniques. In this study, the authors collected the hate speech of English-Odia code mixed data from a Facebook public page and ...

  8. Deep Learning for Hate Speech Detection: A Comparative Study

    Automated hate speech detection is an important tool in combating the spread of hate speech, particularly in social media. Numerous methods have been developed for the task, including a recent proliferation of deep-learning based approaches. A variety of datasets have also been developed, exemplifying various manifestations of the hate-speech detection problem. We present here a large-scale ...

  9. Hate speech detection: Challenges and solutions

    The proposed solutions employ machine learning techniques to classify text as hate speech. ... Automatic approaches for hate speech detection. Most social media platforms have established user rules that prohibit hate speech; enforcing these rules, however, requires copious manual labor to review every report. ...

  10. Hate Speech Prediction on Social Media

    In order to stop hate speech spreaders on social media, researchers started developing machine learning systems that automatically detect hate speech. In this paper we present our proposed approach to detect hate speech on twitter. We develop two different models, the first one with a traditional approach using random forest model and the second one with the autoencoder as a deep learning ...

  11. Advances in Machine Learning Algorithms for Hate Speech Detection in

    The aim of this paper is to review machine learning (ML) algorithms and techniques for hate speech detection in social media (SM). Hate speech problem is normally model as a text classification task. In this study, we examined the basic baseline components of hate speech classification using ML algorithms. There are five basic baseline components - data collection and exploration, feature ...

  12. Social Media Hate Speech Detection Using Explainable Artificial ...

    Explainable artificial intelligence (XAI) characteristics have flexible and multifaceted potential in hate speech detection by deep learning models. The aims of this research were to interpret and explain the decisions made by complex artificial intelligence (AI) models in order to understand their decision-making process. As part of this research study, two datasets were taken to ...
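
A hedged illustration of one common XAI technique, LIME, applied to a simple text classifier; the classifier, data, and class names below are placeholders, not the models examined in the cited study:

```python
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder training data; 1 = hate speech, 0 = not.
train_texts = ["i hate those people", "what a lovely day"] * 10
train_labels = [1, 0] * 10

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(train_texts, train_labels)

explainer = LimeTextExplainer(class_names=["not hate", "hate"])
explanation = explainer.explain_instance(
    "i hate those people so much",
    clf.predict_proba,   # LIME perturbs the text and queries this function
    num_features=5)
print(explanation.as_list())  # per-word weights behind the prediction
```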

  13. Hate Speech Detection in Multi-social Media Using Deep Learning

    The terms "abusive", "hate speech", and "harmful speech" refer to uncontrolled messages that target individuals or a particular group based on characteristics such as religion, gender, country, color, organization, etc. Social media platforms such as YouTube, Twitter, Facebook, Instagram, Gab, Reddit, Stormfront, etc., continuously ...

  14. Detecting and visualizing hate speech in social media: A cyber Watchdog

    Machine learning approaches present a limitation concerning the learning process. ... A. & Wiegand, M. (2017). A Survey on Hate Speech Detection Using Natural Language Processing. In Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media. ... K.-S., Zhu, X. & Bellmore, A. (2012). Learning from bullying ...

  15. Detecting twitter hate speech in COVID-19 era using machine learning

    Hate speech detection on social media is a pressing issue, and in this paper, we used machine learning algorithms to detect hate speech in the COVID-19 era. As the pandemic rose, online social networks saw a drastic change in user behaviour, as users shared information regarding COVID-19 at an enormous pace.

  16. Social Media Hate Speech Detection Using Machine Learning Approach

    This investigation aims to examine the performance of multiple feature engineering approaches with five machine learning algorithms. The datasets contain the class categories hate speech, not hate speech, and ...

  17. Taming The Hate: Machine Learning Analysis Of Hate Speech

    This work implemented various machine learning-based algorithms for hate speech detection on various social media platforms and found that XGBoost, when used with TF-IDF transformer embeddings, gave an accuracy of 94.43%, the highest among the three models for the given benchmark dataset.
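
A sketch of the TF-IDF-plus-XGBoost combination, assuming the xgboost package; with the placeholder data below, the 94.43% figure should not be expected to reproduce:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Placeholder corpus; 1 = hate speech, 0 = not.
texts = ["you people are awful", "what a lovely morning"] * 50
labels = [1, 0] * 50

X = TfidfVectorizer().fit_transform(texts)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, labels, test_size=0.2, random_state=0, stratify=labels)

# Gradient-boosted trees over the sparse TF-IDF features.
model = XGBClassifier(n_estimators=200, max_depth=6, eval_metric="logloss")
model.fit(X_tr, y_tr)
print("accuracy:", model.score(X_te, y_te))
```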

  18. A transfer learning approach for detecting offensive and hate speech on

    This work introduces transfer learning as an approach for detecting offensive and hate speech on social media. While past research has focused on machine learning methods, transfer learning is yet to be explored for hate and offensive speech detection on social media platforms. We use two different models to address the problem.
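
One plausible form of such a transfer learning setup, fine-tuning a pretrained transformer with the HuggingFace transformers library; the model name, label set, and toy data are assumptions, not the paper's actual configuration:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# "bert-base-uncased" is a placeholder choice of pretrained encoder.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # transfer: pretrained body, new head

texts = ["example hateful post", "example benign post"]  # toy data
labels = torch.tensor([1, 0])  # 1 = hate/offensive, 0 = neither

enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few gradient steps stand in for full fine-tuning
    out = model(**enc, labels=labels)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```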

  19. Hate Speech in Social Networks and Detection using Machine Learning

    The use of social networking sites has increased considerably in the last few years, and as a result, user-generated content on the web has also increased manifold. These data are mostly present in unstructured and quasi-structured formats. Many social media platforms are affected by the presence of hate speech. It is present in many forms, such as verbal aggression and through photos ...

  20. Social Media Hate and Offensive Speech Detection Using Machine Learning

    A shared task (2024) opened the door for participation in detecting hate and offensive speech in Telugu social media by providing gold-standard datasets. This shared task offered an opportunity for researchers to come up with solutions leveraging existing technology to identify hate speech and objectionable pieces of information. This study aims to determine ...

  21. [2404.11036] Cross-Platform Hate Speech Detection with Weakly

    Content moderation faces a challenging task, as social media's ability to spread hate speech contrasts with its role in promoting global connectivity. With rapidly evolving slang and hate speech, the adaptability of conventional deep learning to the fluid landscape of online dialogue remains limited. In response, causality-inspired disentanglement has shown promise by segregating platform ...

  22. Challenges of Hate Speech Detection in Social Media

    The detection of hate speech in social media is a crucial task. The uncontrolled spread of hate has the potential to gravely damage our society, and severely harm marginalized people or groups. A major arena for spreading hate speech online is social media. This significantly contributes to the difficulty of automatic detection, as social media posts include paralinguistic signals (e.g ...

  23. Hate Speech Detection in Tweets using Support Vector Machine

    The proliferation of hate speech has become a significant social concern because of social media's explosive growth. This work describes a new approach that blends Sentiment Analysis and Support Vector Machine (SVM) techniques. The objective is to build a robust and precise system that can identify hate speech instantly and promote a safer online community. Many experiments are conducted ...
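
A minimal sketch of blending sentiment features with an SVM, assuming NLTK's VADER analyzer for sentiment scores; the feature design and data are illustrative placeholders, not the system the paper describes:

```python
import numpy as np
from nltk.sentiment import SentimentIntensityAnalyzer
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Requires a one-time: nltk.download("vader_lexicon")
sia = SentimentIntensityAnalyzer()

# Placeholder corpus; 1 = hate speech, 0 = not.
texts = ["i hate you so much", "have a wonderful day"] * 20
labels = [1, 0] * 20

X_text = TfidfVectorizer().fit_transform(texts)
# One extra feature per post: VADER's compound polarity score in [-1, 1].
X_sent = np.array([[sia.polarity_scores(t)["compound"]] for t in texts])

X = hstack([X_text, X_sent])  # lexical + sentiment features side by side
clf = LinearSVC().fit(X, labels)
```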

  24. Machine Learning for Hate Speech Detection in Arabic Social Media

    Hate speech detection is usually modeled as a supervised classification problem, where a machine learning algorithm is trained on data labeled as either "hateful" or "inoffensive." Many efforts have been made to apply this process to Arabic social media, mainly Twitter, as it is a highly popular platform in the Middle East ...

  25. Detection of Hate Speech in Videos Using Machine Learning

    With the progression of the Internet and social media, people are given multiple platforms to freely share their thoughts and opinions about various subject matters. However, this freedom of speech is misused to direct hate towards individuals or groups of people due to their race, religion, gender, etc. The rise of hate speech has led to conflicts and cases of cyberbullying, causing many ...

  26. Hate Speech Detection Using Machine Learning Techniques

    Hate speech is a concern for most governments and the public due to the increased emergence of different social media applications and the growing use of such media to disseminate hate speech against individuals, groups of persons, communities, or races. Therefore, without detection and analysis of hate speech, it is impossible ...