Pragmatic Supervised Learning Methodology of Hate Speech Detection in Social Media


A Pragmatic Supervised Learning Methodology of Hate Speech Detection in Social Media

1G.Priyadharshini, 2Dr.M.Balamurugan

1Research Scholar, 2Professor and Head

1School of Computer Science, Engineering and Applications

1Bharathidasan University, Tiruchirappalli, India


Abstract: In recent decades, information technology has evolved enormously, with widespread adoption of online social networks and social media platforms. This progress has revolutionized communication by enabling rapid, easy and almost costless digital interaction between users. Despite these numerous advantages, the anonymity associated with such interactions often leads to more aggressive and hateful communication styles. These emerge at a fast and uncontrollable pace and usually cause severe damage to their targets, so it is crucial that governments and social network platforms be able to detect and regulate the aggressive and hateful behavior that occurs regularly on multiple online platforms. Detecting this type of speech is far from trivial because the topic is abstract. This paper therefore aims to summarize and complement current methodologies and solutions for detecting hate speech online, focusing on social media.

Index Terms – Preprocessing, Feature Extraction, Machine Learning, Classification.



Hate speech is language that attacks or diminishes, that incites violence or hate against groups, based on specific characteristics such as physical appearance, religion, descent, national or ethnic origin, sexual orientation, gender identity or other, and it can occur with different linguistic styles, even in subtle forms or when humour is used. However, any distinct group may be targeted. Hate comes in different shapes and formats, targeting several different groups and minorities. A systematic large scale measurement study of the main targets of hate speech was conducted on the social media platforms Twitter and Whisper, capturing not only common targets of hate but also their frequency on these platforms.

This paper provides a summarized overview of the pragmatic approaches to automatic hate speech detection that currently exist. It is intended for newcomers to NLP research who want to become familiar with the current state of the art.


Feature extraction consists of building a set of derived values from a collection of raw data, and it is often a decisive step in improving the performance of machine learning models. Before features can be extracted, the raw text is usually preprocessed with the steps below.

2.1. Tokenization: Tokenization slices a stream of text into pieces called tokens. The procedure varies from language to language, and lexical characteristics such as colloquialisms (e.g. ‘u’ instead of ‘you’), contractions (e.g. ‘aren’t’ instead of ‘are not’) and others (e.g. ‘O’Neil’) make the task harder. Removal of the least frequent tokens is sometimes included in this step.

2.2. Filtering: This involves removing punctuation marks and irrelevant or invalid characters (e.g. ‘?|%&!’), as well as stop words, which are frequently used words that carry little useful meaning. Filtering them out is worthwhile because they do not contribute to the classification task.

2.3. Stemming: Stemming is the process of reducing inflected words to a common base form (e.g. ‘ponies’ turns into ‘poni’ and ‘cats’ into ‘cat’). It also improves performance by reducing the dimensionality of the data, since words such as ‘fishing’, ‘fished’ and, with more aggressive stemmers, ‘fisher’ are treated as the same word, ‘fish’.

2.4. Spellchecker: misspelling is common in online platforms due to their informal nature. A spell checker is needed to avoid unidentified or intentionally camouflaged words (e.g. ‘niggr’, ‘fck’).

2.5. Lemmatization: Although very similar to stemming, lemmatization considers the morphological analysis of the words. While stemming would shorten the words ‘studies’ to ‘studi’ and ‘studying’ to ‘study’, lemmatization would shorten both to ‘study’.

2.6. PoS tagging: Part-of-speech tagging assigns a grammatical part of speech to each word of the corpus. It is then common to remove words belonging to parts of speech that may not be relevant (e.g. pronouns).

2.7. Lowercasing: Converting a stream of text to lowercase can improve classification performance because it reduces the dimensionality of the data. Without it, ‘tomorrow’, ‘TOMORROW’ and ‘ToMoRroW’ would be treated as different words.
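The preprocessing steps above can be chained into a small pipeline. The following is a minimal sketch using only the Python standard library. The stop-word list, the suffix list and all function names are hypothetical choices made for illustration, and the suffix stripper is a crude stand-in for a real stemmer such as Porter's, not an implementation of it.

```python
import re

# Hypothetical, deliberately tiny stop-word list; real systems use larger ones.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in"}

# Hypothetical suffix list for a naive stemmer (this is NOT the Porter algorithm).
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stop_words(tokens):
    """Filter out tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run lowercasing, tokenization, filtering and stemming in sequence."""
    return [naive_stem(t) for t in remove_stop_words(tokenize(text))]


print(preprocess("The cats are FISHING, tomorrow!"))
```

The order of the steps matters in this sketch: lowercasing happens before stop-word removal so that ‘The’ and ‘the’ are filtered alike.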


Feature extraction consists of collecting derived values (features) from the input data (text in this specific scenario) and generating distinctive properties that are, ideally, informative and non-redundant, in order to improve the learning and generalization of the machine learning algorithms. After extraction, there is usually a subset of features that contains the most relevant information. Some of the frequently used feature extraction approaches are presented here.

3.1. N-Grams: N-grams are one of the most used techniques in automatic hate speech detection and related tasks [1,3,14]. The most common n-gram approach combines sequential words into lists of size N. The goal is to enumerate all expressions of size N and count their occurrences. This can improve a classifier’s performance because it incorporates, to some degree, the context of each word. Instead of words, it is also possible to build n-grams from characters or syllables. This approach is less susceptible to spelling variations than word n-grams. In one study, character n-gram features proved to be more predictive than token n-gram features for the specific problem of abusive language detection [2].
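As an illustration of the difference between word and character n-grams, here is a minimal sketch in plain Python. The function names are hypothetical, and the misspelling ‘haate’ is an invented example.

```python
from collections import Counter


def word_ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return every substring of n consecutive characters."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


tokens = ["i", "hate", "you"]
print(Counter(word_ngrams(tokens, 2)))   # bigram counts for one sentence

# A misspelled word still shares character trigrams with the original,
# which is why character n-grams tolerate spelling variation better.
shared = set(char_ngrams("hate", 3)) & set(char_ngrams("haate", 3))
print(shared)
```

At the word level, ‘hate’ and ‘haate’ are unrelated tokens, while at the character level they still overlap.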

3.2. Bag of Words: Bag of words (BoW) is a representation of text that disregards grammar and the order of the words in sentences, while keeping multiplicity. Similarly to n-grams, BoW can be encoded using TF-IDF, token counts or a hashing function. Although it is typically used to group textual elements as tokens, it can also group other representations such as parts of speech.

3.3. TF-IDF: Term frequency-inverse document frequency is a numerical statistic that measures how important a word is to a document within a corpus. It can be an important feature for understanding how much certain words contribute to specific types of speech (e.g. ‘hate’) [29].
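To make the bag-of-words and TF-IDF representations concrete, the sketch below computes both with the standard library. It assumes one common unsmoothed variant, tf = count / document length and idf = log(N / document frequency). Other weightings exist, and the function names and toy documents are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(tokens):
    """Count how often each token occurs, ignoring word order."""
    return Counter(tokens)


def tf_idf(documents):
    """Return one {term: weight} dict per document (documents are token lists)."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in documents:
        counts = bag_of_words(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = [["i", "hate", "you"], ["i", "like", "you"]]
for doc_weights in tf_idf(docs):
    print(doc_weights)
```

Terms that occur in every document (‘i’ and ‘you’ here) receive a weight of zero, while the terms that distinguish the documents keep a positive weight.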

3.4. Word Embeddings: A word embedding is a learned representation for text in which words with similar meanings have similar representations. It is a class of techniques where individual words are represented as real-valued vectors in a predefined vector space. Each word is mapped to one vector, and the vector values are learned in a way that resembles training a neural network. One of the word embedding techniques that has attracted the most interest from text mining researchers is Word2vec.

· Word2Vec: The granularity of the embedding is word-wise, generating a vector for each word of the corpus. There are two possible models: CBOW (continuous bag of words), which learns to predict a word from its context, and skip-gram, which is designed to predict the context from the word. According to [22], CBOW is faster to train and has slightly better accuracy for frequent words, whereas skip-gram works well with a small amount of training data and represents even rare words or phrases well. Most of the approaches that used Word2Vec [20] apply the skip-gram model.
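The difference between CBOW and skip-gram is easiest to see in the training examples each model consumes. The sketch below only builds those examples and does not train any embedding. The function names and the window size are hypothetical.

```python
def context_window(tokens, index, window):
    """Return up to `window` tokens on each side of position `index`."""
    start = max(0, index - window)
    return tokens[start:index] + tokens[index + 1:index + 1 + window]


def skipgram_pairs(tokens, window):
    """Skip-gram: (input word, one context word to predict)."""
    return [
        (tokens[i], context)
        for i in range(len(tokens))
        for context in context_window(tokens, i, window)
    ]


def cbow_pairs(tokens, window):
    """CBOW: (list of context words, target word to predict)."""
    return [
        (context_window(tokens, i, window), tokens[i])
        for i in range(len(tokens))
    ]


sentence = ["hate", "speech", "detection"]
print(skipgram_pairs(sentence, 1))
print(cbow_pairs(sentence, 1))
```

Skip-gram produces one example per (word, neighbour) pair, so a rare word still yields several training signals. This is consistent with the observation that it handles rare words well.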

3.5. Sentiment Analysis: It is important to grasp the sentiment behind the message, otherwise its true meaning will probably be misunderstood and/or misinterpreted (e.g. sarcasm). Users, mainly on social media, tend to formulate opinions on a diversity of topics, especially when they express an extremist attitude, in which we include hate speech. Regarding social media, sentiment analysis approaches usually focus on identifying the polarity (positive or negative connotation) of comments and sentences as a whole.

3.6. Template Based Strategy: The basic idea of this strategy is to build a corpus of words and, for each word in the corpus, collect the K words that occur around it. This information can be used as context.


Hate speech detection in text is mostly treated as a supervised classification task solved with machine learning algorithms. The use of deep learning approaches has increased significantly because of their high accuracy, which has led to the large-scale adoption of neural networks for text classification.


4.1.1 Support Vector Machines: SVMs are widely used in classification problems. The algorithm finds a hyperplane that separates the input data (text in this case) into categories. Around 2017, SVMs reported some of the best results for text classification tasks, but by 2018 deep learning approaches had overtaken them, especially in hate speech detection, as described in [24].

4.1.2 Logistic Regression: Logistic regression is a predictive regression analysis that estimates the parameters of a logistic model, a statistical model that uses a logistic function to model a binary dependent variable [28].

4.1.3 Naive Bayes: This is an algorithm based on the Bayes’ theorem with strong naive independence assumptions between the features of the data. It generally assumes that a particular feature in a class is unrelated to any other feature. Naive Bayes is a model useful for large datasets and does well despite being a simple method.
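The independence assumption is what makes Naive Bayes so simple to implement. The sketch below is a minimal multinomial variant with add-one (Laplace) smoothing. The smoothing choice, the class name and the four-document toy corpus are assumptions made for illustration, not details from the surveyed works.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over token lists, with add-one smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)    # token counts per class
        self.vocabulary = set()
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def log_score(self, tokens, label):
        """log P(label) plus the sum of log P(token | label)."""
        total_docs = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        vocab_size = len(self.vocabulary)
        for token in tokens:
            count = self.word_counts[label][token]
            # Each token contributes independently: the "naive" assumption.
            score += math.log((count + 1) / (total_words + vocab_size))
        return score

    def predict(self, tokens):
        return max(self.class_counts, key=lambda c: self.log_score(tokens, c))


docs = [["you", "are", "awful"], ["awful", "people"],
        ["have", "a", "nice", "day"], ["nice", "people"]]
labels = ["abusive", "abusive", "neutral", "neutral"]
model = NaiveBayes().fit(docs, labels)
print(model.predict(["awful", "day"]))
```

Scores are accumulated in log space so that long messages do not underflow to zero.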

4.1.4 Random Forest: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest [27]. This model requires almost no input preparation, performs implicit feature selection and is very quick to train, performing well overall.

4.1.5 Decision Tree: This is an algorithm that supports decision making by providing a tree-like model of decisions and their possible consequences and other measures (e.g. resource cost, utility). Decision trees are often used because their output is usually readable and simple for humans to understand and interpret. They are also fast and perform well on large datasets, but they are prone to overfitting.

4.1.6 Gradient Boosting: This is a prediction model consisting of an ensemble of weak prediction models, typically decision trees (which is why it may also be called gradient boosted trees), in which the predictions are not made independently (as in bagging) but sequentially. The sequential modeling allows each model to learn from the mistakes made by the previous one [23].


4.2.1 CNN (Convolutional neural networks): CNNs are a class of deep feed-forward artificial neural networks. A CNN consists of an input layer, an output layer and multiple hidden layers, which include convolutional layers, pooling layers and fully connected layers [26].

4.2.2 RNN (Recurrent neural networks): Unlike CNNs, RNNs are designed to handle sequential data, which allows them to exhibit temporal dynamic behavior over a time sequence. The connections between nodes form a directed graph. RNNs have feedback loops in the recurrent layer, which act as a memory mechanism. Even so, long-term temporal dependencies are hard for the standard architecture to capture, because the gradient of the loss function decays exponentially with time (the vanishing gradient problem). For this reason, new architectures have been introduced:

· LSTM : Long short-term memory neural networks, are a type of RNN that use special units in addition to standard units, by including a memory cell able to keep information in memory for long periods of time. A set of gates is used to control when information enters the memory, when it’s output, and when it’s forgotten enabling this architecture to learn longer-term dependencies as detailed in [25] and [26].

· GRU: Gated recurrent unit neural networks, are similar to LSTM’s, but their structure is slightly simpler. Although they also use a set of gates to control the flow of information, these are fewer when compared to LSTM’s [25,26].
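For reference, the gating mechanism of a GRU can be written compactly. The equations below follow the formulation commonly given for GRUs, as in [25]. The symbols are notation assumed here rather than taken from this paper: $x_t$ is the input at step $t$, $h_t$ the hidden state, $z_t$ the update gate, $r_t$ the reset gate, $W$ and $U$ learned weight matrices, $\sigma$ the logistic sigmoid and $\odot$ element-wise multiplication. Bias terms are omitted.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1}\right) \\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1}\right) \\
\tilde{h}_t &= \tanh\left(W x_t + U \left(r_t \odot h_{t-1}\right)\right) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```

The GRU has two gates ($z_t$ and $r_t$) and no separate memory cell, whereas an LSTM uses three gates plus a cell state. This is the sense in which the GRU structure is simpler.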

RNNs have a sequential architecture, whereas CNNs have a hierarchical one. When GRU and CNN results are compared with respect to text length, GRU performs better when the sentences are somewhat longer. Finally, the comparison in [26] concluded that deep neural network performance depends heavily on hyperparameter tuning.


The measures for evaluating the performance of a machine learning algorithm are built from a confusion matrix, where the output can have two or more classes. The confusion matrix records which samples of the data have been correctly and incorrectly predicted for each class.

Accuracy is a generic performance measure that assesses the overall effectiveness of the algorithm by dividing the number of correct predictions by the total number of predictions made. Although it is commonly used, accuracy does not distinguish between classes. Consequently, this metric may be misleading, especially when the classes of the data are unbalanced.

There is a subset of performance metrics that consider classes. These are usually more useful for data with unbalanced classes, since the performance of the algorithm can be assessed class-wise. Such imbalance is quite common in hate speech datasets. The most used class-wise performance measures in hate speech detection are:

Recall (R), also known as Sensitivity or True Positive Rate, is defined as the proportion of real positives that are correctly predicted as positive. Precision (P) denotes the proportion of predicted positive cases that are actually positive.

F1 score is defined as the harmonic mean of Precision and Recall. Unlike accuracy, it is not inflated by a large majority class, hence its wide usage in hate speech detection.
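These definitions translate directly into code. With TP, FP, FN and TN denoting true positives, false positives, false negatives and true negatives, P = TP / (TP + FP), R = TP / (TP + FN) and F1 = 2PR / (P + R). The sketch below uses hypothetical function names and an invented ten-sample example to show how accuracy can look good while the class-wise metrics reveal a useless classifier.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN for one class treated as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1), using 0.0 when a ratio is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


# Unbalanced toy data: one hateful message out of ten, and a classifier
# that always answers "none".
y_true = ["hate"] + ["none"] * 9
y_pred = ["none"] * 10
tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive="hate")
print(accuracy(tp, fp, fn, tn))          # high, despite finding no hate speech
print(precision_recall_f1(tp, fp, fn))   # all zero for the "hate" class
```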

Using these performance metrics, a graphical visualization of the algorithm’s predictions can be computed, known as the ROC (receiver operating characteristic) curve. It shows the trade-off between the sensitivity and the specificity of the algorithm and is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at different decision thresholds. The higher the TPR at a given FPR, the larger the area under the ROC curve, also known as AUC (area under the curve).
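A ROC curve can be traced by lowering the decision threshold one sample at a time and recording (FPR, TPR) after each step. The area is then obtained with the trapezoidal rule. The sketch below assumes binary labels (1 for the positive class), distinct scores and at least one sample of each class. The function names and the scores are invented for illustration.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, starting at (0, 0) and ending at (1, 1)."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Visit samples from the highest score to the lowest, as if the
    # threshold were lowered just below each score in turn.
    for _, label in sorted(zip(scores, y_true), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under a piecewise-linear curve, by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


y_true = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
curve = roc_points(y_true, scores)
print(curve)
print(auc(curve))
```

In this toy example three of the four (positive, negative) pairs are ranked correctly, which matches the area of 0.75 computed from the curve.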


This section presents a comprehensive review of the key works and existing studies on the automatic detection of hate speech, in the English language in particular. In English, hate speech detection has been intensively investigated in more than 14 contributions across all the categories of hate speech (racial, sexist, religious and general hate). Hate speech in other languages such as Dutch, German, Italian, Turkish, Indonesian, Arabic and Portuguese has also been investigated, but only in a limited number of studies. This paper surveys hate speech detection in English, which accounts for the majority of the research.


One of the issues in hate speech detection in text is dataset availability. The majority of existing works were carried out on privately collected datasets, often built for a different problem. [3] claimed to have created the largest datasets for abusive language by annotating comments posted on Yahoo!. The datasets were later used by [2]. However, they are not publicly available. Currently, the only publicly available hate speech datasets are those reported in [1,4,14,17,21]. All of these publicly available corpora were collected from Twitter by searching for tweets containing terms that frequently occur (based on some manual analysis) in tweets that contain hate speech and references to specific entities.

In order to annotate a data set manually, either expert annotators are used or crowd sourcing services, such as Amazon Mechanical Turk (AMT), are employed. Crowd sourcing has obvious economical and organizational advantages, especially for a task as time-consuming as the one at hand, but annotation quality might suffer from employing non-expert annotators.

[14] annotate 16,914 tweets, including 3,383 as ‘sexist’, 1,972 as ‘racist’ and 11,559 as ‘neither’. It is then annotated by crowd-sourcing over 600 users. The dataset is later expanded in [21], where some 6,900 tweets are collected, of which about 4,000 are new to the previous dataset. This dataset is then annotated by two groups of users to create two different versions: domain experts who are either feminist or anti-racism activists, and amateurs who are crowd-sourced. Experiments show that amateur annotators are more likely than expert annotators to label tweets as hate speech. Later, in [17], the authors merge the expert and amateur annotations in this dataset by majority vote, giving expert annotations double weight; and in [4], the dataset in [14] is merged with the expert annotations in [21] to create a single dataset. [1] annotate some 24,000 tweets as ‘hate speech’, ‘offensive language’ (offensive but not hateful), or ‘neither’. Distinguishing hate speech from non-hateful offensive language is found to be a challenging task, as hate speech does not always contain offensive words, while offensive language does not always express hate.
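The merging rule described for [17], a majority vote in which expert annotations count double, can be sketched in a few lines. The function name, the default weight parameter and the example labels below are hypothetical. The source does not say how ties were resolved, so the sketch leaves tie-breaking to the order in which labels are first seen.

```python
from collections import Counter


def weighted_majority(expert_labels, amateur_labels, expert_weight=2):
    """Pick the label with the most votes; each expert vote counts double."""
    votes = Counter()
    for label in expert_labels:
        votes[label] += expert_weight
    for label in amateur_labels:
        votes[label] += 1
    # Ties are resolved by first-seen order, which is an assumption here.
    return votes.most_common(1)[0][0]


# One expert outweighs a single amateur...
print(weighted_majority(["racist"], ["neither"]))
# ...but not three amateurs who agree with each other.
print(weighted_majority(["racist"], ["neither", "neither", "neither"]))
```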

In addition to the issues mentioned above, which to some extent challenge the comparability of research conducted on different datasets, the lack of a commonly accepted definition of hate speech further exacerbates the situation. Previous works remain fairly vague about the annotation guidelines their annotators were given. Even when annotators are provided with a definition of hate speech, they may still fail to produce annotations at an acceptable level of reliability.


The next tables present a summary of all the discussed papers in English language in all the categories of hate speech (racial, sexism, religious and general hate). These tables can serve as a quick reference for all the key works done in the automatic detection in social media. All the approaches and their respective experiments results are listed in a concise manner.

Table 1: Summary of the current state of anti-social behaviour detection, and their respective results, in the metric: Precision (P), Recall (R), F1-Score (F).












Character and Word2vec | Hybrid CNN
Youtube, MySpace, SlashDot | Word embeddings | Fast Text
Twitter, Wikipedia, UseNet | Lexical, Linguistics and Word embeddings
Tf-idf, lexicon, PoS tag, bigram
Bag of Words | M-NB and Stochastic Gradient Descent
Semantic Context
Yahoo News Group | Template-based, PoS tagging
Naïve Bayes
BOW, Dependencies, Hateful Terms | Bayesian Logistic Regression
Yahoo Finance | Paragraph2vec, CBOW | Logistic regression
Character n-grams | Logistic regression
Sentiment Based, Semantic, Unigram
N-grams, Skipgrams, hierarchical word clusters | RBF kernel SVM
Character n-grams, word2vec
Random Embedding
Word-based frequency vectorization
Word embeddings




Choosing the most appropriate machine learning approach is another challenging decision. Previous works employed almost every variety of technique. According to the table, the majority of researchers relied on supervised machine learning approaches for the automatic detection task. One major factor in the choice is the size of the corpus: some ML algorithms work well with small datasets, while others, such as neural networks, need more intensive and complex training.

Recent research is oriented towards deep learning to solve complex learning tasks. Researchers have claimed that deep learning is powerful at finding data representations for classification, and it has a promising future in the field of automatic detection. Adopting deep learning requires commitment both in preparing the model and in training it with a large amount of data. Generally, two main deep neural network architectures are used for NLP tasks: RNN and CNN. In the previous tables, four hate speech studies adopted deep learning; two of them used RNNs and the other two used CNNs. These studies concluded that both approaches are effective. For that reason, more investigation is needed to choose the appropriate deep learning architecture.


· Agreement among humans on hate speech classification is low, which indicates that the classification is even harder for machines.

· Annotating a dataset is also difficult because it requires expertise about culture and social structure.

· The evolution of social phenomena and language makes it difficult to track all racial and minority insults. In addition, language evolves quickly, mainly among young populations that communicate frequently on social networks.

· Despite the offensive nature of hate speech, abusive language may be very fluent and grammatically correct, it can cross sentence boundaries, and it commonly uses sarcasm.

· The majority of the studies focus on English, and only isolated studies have been conducted in other languages such as German, Dutch and Italian. Research in other languages commonly used on the internet is therefore also needed (e.g. French, Mandarin, Portuguese, Spanish).

· Finally, hate speech detection is more than simple keyword spotting.


This paper set out to understand the state of the art and the opportunities in the field of automatic hate speech detection. We presented a comprehensive study of the methodology of automatic hate speech detection in social networks. We also investigated some challenges, which can guide the implementation of more accurate hate speech detection. Additionally, in order to obtain a picture of the state of the art in the field, we conducted a Systematic Literature Review. We concluded that the number of studies and papers published on automatic hate speech detection in text is limited, and that these works usually treat the problem as a machine learning classification task. In this field, researchers tend to start by collecting and classifying new messages, and the datasets used often remain private. This slows down progress in the field, because less data is available, and makes it more difficult to compare results across studies.

Future work will include incorporating the latest deep learning architectures to build a model that can detect and classify hate speech in languages other than English. Comparative studies and surveys are also scarce in this area and deserve more attention. Finally, for better comparability of different features and methods, we argue for a benchmark dataset for hate speech detection.


[1] Thomas Davidson, Dana Warmsley, Michael Macy, and Ingmar Weber. Automated hate speech detection and the problem of offensive language. arXiv preprint arXiv:1703.04009, 2017.

[2] Yashar Mehdad and Joel Tetreault. Do characters abuse more than words? In Proceedings of the SIGdial 2016 Conference: The 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 299–303, 2016.

[3] Chikashi Nobata, Joel Tetreault, Achint Thomas, Yashar Mehdad, and Yi Chang. Abusive language detection in online user content. In Proceedings of the 25th International Conference on World Wide Web, pages 145–153. International World Wide Web Conferences Steering Committee, 2016.

[4] J. H. Park and P. Fung, “One-step and Two-step Classification for Abusive Language Detection on Twitter,” in AICS Conference, 2017.

[5] H. Chen, S. McKeever, and S. J. Delany, “Abusive text detection using neural networks,” in CEUR Workshop Proceedings, 2017, vol. 2086, pp. 258–260.

[6] M. Wiegand, J. Ruppenhofer, A. Schmidt, and C. Greenberg, “Inducing a Lexicon of Abusive Words – a Feature-Based Approach,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2018, pp. 1046–1056.

[7] K. Dinakar, R. Reichart, and H. Lieberman, “Modeling the detection of Textual Cyberbullying.,” Soc. Mob. Web, vol. 11, no. 02, pp. 11–17, 2011.

[8] R. Pawar, Y. Agrawal, A. Joshi, R. Gorrepati, and R. R. Raje, “Cyberbullying Detection System with Multiple Server Configurations,” 2018 IEEE Int. Conf. Electro/Information Technol., pp. 90–95, 2018.

[9] M. Fernandez and H. Alani, “Contextual semantics for radicalisation detection on Twitter,” CEUR Workshop Proc., vol. 2182, 2018.

[10] W. Warner and J. Hirschberg, “Detecting Hate Speech on the World Wide Web,” no. Lsm, pp. 19–26, 2012.

[11] Kwok and Y. Wang, “Locate the Hate: Detecting Tweets against Blacks,” Twenty-Seventh AAAI Conf. Artif. Intell., pp. 1621–1622, 2013.

[12] P. Burnap and M. L. Williams, “Hate Speech, Machine Classification and Statistical Modelling of Information Flows on Twitter: Interpretation and Communication for Policy Decision Making,” in Proceedings of the Conference on the Internet, Policy & Politics, 2014, pp. 1–18

[13] N. Djuric, J. Zhou, R. Morris, M. Grbovic, V. Radosavljevic, and N. Bhamidipati, “Hate Speech Detection with Comment Embeddings,” in Proceedings of the 24th International Conference on World Wide Web, 2015, pp. 29–30.

[14] Z. Waseem and D. Hovy, “Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter,” Proc. NAACL Student Res. Work., pp. 88–93, 2016.

[15] H. Watanabe, M. Bouazizi, and T. Ohtsuki, “Hate Speech on Twitter: A Pragmatic Approach to Collect Hateful and Offensive Expressions and Perform Hate Speech Detection,” IEEE Access, vol. 6, pp. 13825–13835, 2018

[16] S. Malmasi and M. Zampieri, “Challenges in Discriminating Profanity from Hate Speech,” J. Exp. Theor. Artif. Intell., vol. 30, pp. 187–202, 2018

[17] B. Gambäck and U. K. Sikdar, “Using Convolutional Neural Networks to Classify Hate-Speech,” Assoc. Comput. Linguist., no. 7491, pp. 85–90, 2017.

[18] P. Badjatiya, S. Gupta, M. Gupta, and V. Varma, “Deep Learning for Hate Speech Detection in Tweets,” in Proceedings of the 26th International Conference on World Wide Web Companion, 2017, pp. 759–760

[19] G. K. Pitsilis, H. Ramampiaro, and H. Langseth, “Effective hate-speech detection in Twitter data using recurrent neural networks,” Appl. Intell., vol. 48, no. 12, pp. 4730–4742, Dec. 2018.

[20] Z. Zhang and L. Luo, “Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter,” vol. 1, no. 0, pp. 1–5, 2018.

[21] Zeerak Waseem. 2016. Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter. In Proceedings of the First Workshop on NLP and Computational Social Science. Association for Computational Linguistics, Austin, Texas, 138–142.

[22] Yoav Goldberg and Omer Levy. word2vec Explained: deriving Mikolov et al.’s negative-sampling word-embedding method. Computing Research Repository, abs/1402.3722, 2014.

[23] Alexey Natekin and Alois Knoll. Gradient boosting machines, a tutorial. Frontiers in Neurorobotics, 7:21, 2013.

[24] Hajime Watanabe, Mondher Bouazizi, and Tomoaki Ohtsuki. Hate speech on twitter a pragmatic approach to collect hateful and offensive expressions and perform hate speech detection. IEEE Access, 2 2018. ISSN 2169-3536.

[25] Junyoung Chung, Çaglar Gülçehre, Kyung Hyun Cho, and Yoshua Bengio. Empirical evaluation of gated recurrent neural networks on sequence modeling. Computing Research Repository, abs/1412.3555, 2014.

[26] Wenpeng Yin, Katharina Kann, Mo Yu, and Hinrich Schütze. Comparative study of CNN and RNN for natural language processing. Computing Research Repository, abs/1702.01923, 2017.

[27] Leo Breiman. Random forests. Machine Learning, 45(1):5–32, Oct 2001. ISSN 1573-0565. doi: 10.1023/A:1010933404324.

[28] Sandro Sperandei. Understanding logistic regression analysis. Biochemia medica, 24(1):12–18, 2014.

[29] Sanjana Sharma, Saksham Agrawal, and Manish Shrivastava. Degree based classification of harmful speech using twitter data. Computing Research Repository, abs/1806.04197, 2018.


Cite this Page

Pragmatic Supervised Learning Methodology of Hate Speech Detection in Social Media. (2022, December 27). Edubirdie. Retrieved May 29, 2023, from