Critical Review: Automatic Detection of Hate Speech


The research emphasized that some of the most significant problems on social media are abusive and harassing text messages, including controversial topics, swearing, abusive language, and taboo words that are ethically unacceptable. Social media is an open platform where people can express their thoughts in text messages without knowing how those messages will affect other people's minds and behavior. The vast majority of people who regularly engage with social media platforms will have encountered a harasser. Even the biggest enthusiasts acknowledge that harassing behavior is a widespread phenomenon. In this research paper the researcher used text mining and machine learning algorithms to detect and identify harassing behavior and abusive text messages. The researcher also focused on automating harassment classification so that supervised action can be taken against harassers. Finally, the researcher discussed some of the significant research issues and challenges in hate speech detection, and how to identify abusive text and detect the harassing behavior of people who use social media.

Keywords: NLP, Machine Learning, Social Media

I. Introduction

Social media has made it simple for us to communicate quickly and effectively with family, friends, and colleagues, as well as to share experiences and tell others about our opinions and beliefs. These opinions and beliefs might be about world events or local issues, politics or religion, interests, affiliations, organizations, products, people, and a wide variety of other subjects. Our conversations and comments can be narrowly targeted or broadcast so widely that, depending on the subject, they go viral [1]. Unfortunately, social media is also widely used by abusers, for precisely the reasons listed above. Many perpetrators hide behind the fact that they cannot be readily identified, saying things that they would not consider saying face to face, which could be regarded as cowardly. Online abuse takes several forms, and victims are not limited to public figures. They can have any occupation, be of any age, gender, sexual orientation, or social or ethnic background, and live anywhere [2].

II. Literature Review

Cyberbullying can happen online only, or as part of more general harassment. Cyberbullies may be people who are known to you or anonymous. Like all bullies, they frequently try to persuade others to join in. You could be bullied for your religious or political beliefs, race or skin color, or body image, because you have a mental or physical disability, or for no apparent reason at all [3].

Cyberbullying generally consists of sending threatening or otherwise nasty messages or other communications to people via social media, gaming sites, text, or email; posting embarrassing or humiliating videos on hosting sites such as YouTube or Vimeo; or harassing through repeated texts, instant messages, or chats. Increasingly, it is carried out by posting or forwarding images, videos, or private details obtained via sexting, without the victim's permission. Some cyberbullies set up Facebook pages and other social media accounts purely to bully others [4] [5].

The effects of cyberbullying range from annoyance and mild distress to, in the most extreme cases, self-harm and suicide. This can be a reality for vulnerable people, or indeed anyone made to feel vulnerable through cyberbullying or other personal circumstances [5].

Chikashi Nobata et al. (2016) underlined that the detection of abusive language in user-generated online content has become an issue of increasing importance in recent years. Most current commercial methods use blacklists and regular expressions; however, these measures fall short when contending with more subtle, less ham-fisted examples of hate speech. In this work, the authors develop a machine learning based method to detect hate speech in online user comments from two domains, which outperforms a state-of-the-art deep learning approach. They also develop a corpus of user comments annotated for abusive language, the first of its kind. Finally, they use their detection tool to analyze abusive language over time and in different settings to further enhance knowledge of this behavior [1].

Hossein Hosseini et al. (2017) observed that social media platforms provide an environment where people can freely engage in discussions. Unfortunately, they also enable several problems, such as online harassment. Recently, Google and Jigsaw started a project called Perspective, which uses machine learning to automatically detect toxic language. A demonstration website has also been launched, which allows anyone to type a phrase into the interface and instantly see its toxicity score [2].

In that paper the researchers proposed an attack on the Perspective toxicity detection system based on adversarial examples. They show that an adversary can subtly modify a highly toxic phrase in such a way that the system assigns it a significantly lower toxicity score. They apply the attack to the sample phrases given on the Perspective website and show that they can consistently reduce the toxicity scores to the level of non-toxic phrases. The existence of such adversarial examples is very harmful for toxicity detection systems and seriously undermines their usability [2].
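The idea of subtly modifying a phrase to lower its score can be illustrated with a minimal sketch. The scorer below is a hypothetical blocklist-based stand-in written only for this illustration; it is not the Perspective system, and the word list, the function names `toy_toxicity_score` and `perturb`, and the dot-insertion trick are all assumptions chosen to show the general weakness rather than the specific attack in [2].

```python
import re

# Hypothetical blocklist used by the toy scorer (an assumption, not a real lexicon).
TOXIC_WORDS = {"idiot", "stupid", "moron"}

def toy_toxicity_score(text):
    """Return the fraction of alphabetic tokens found in the toy blocklist."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in TOXIC_WORDS)
    return hits / len(tokens)

def perturb(text):
    """Insert a dot in the middle of each blocklisted word, e.g. 'idiot' -> 'id.iot'."""
    def break_word(match):
        word = match.group(0)
        if word.lower() in TOXIC_WORDS:
            mid = len(word) // 2
            return word[:mid] + "." + word[mid:]
        return word
    return re.sub(r"[A-Za-z]+", break_word, text)

original = "you are a stupid idiot"
modified = perturb(original)

print(toy_toxicity_score(original))  # 0.4
print(modified)                      # you are a stu.pid id.iot
print(toy_toxicity_score(modified))  # 0.0
```

A human reader still understands the modified phrase, but the toy scorer no longer recognizes any of its words, which is the kind of gap an adversary can exploit.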

B. Sri Nandhini and J. I. Sheeba (2015) stated that the use of social networking sites (SNS) has increased rapidly in recent years, providing a platform to connect people all over the world and share their interests. However, social networking sites also provide opportunities for cyberbullying. Cyberbullying is harassing or insulting a person by sending messages of a hurtful or threatening nature using electronic communication. It poses a significant threat to the physical and mental health of victims. Detection of cyberbullying and the provision of subsequent preventive measures are the main courses of action to combat it. The proposed method is an effective way to detect cyberbullying activities on social media. It can identify the presence of cyberbullying terms and classify cyberbullying activities in social networks, such as flaming, harassment, racism, and terrorism, using fuzzy logic and a genetic algorithm [3].

Divya Bansal and Sanjeev Sofat (2016) stressed that social spam is a huge and complicated problem plaguing social networking sites in several ways. It includes posts, reviews, or blogs containing product promotions and contests, adult content, and general spam. It has been found that social media sites such as Twitter also act as distributors of pornographic content, even though this is against their own stated policy. In this paper, the authors examined the case of Twitter and found that spammers contributing pornographic content follow legitimate Twitter users and send URLs that link users to pornographic sites. A behavioral analysis of this kind of spammer was conducted using graph-based as well as content-based information, obtained with simple text operators, to study their characteristics. In the study, around 74,000 tweets containing pornographic adult content posted by around 18,000 users were collected and analyzed. The analysis shows that the users posting pornographic content meet the characteristics of spammers as stated in the rules and guidelines of Twitter. It was observed that the illegitimate use of social media for spreading social spam has been growing at a fast pace, with the network companies turning a blind eye to this growing problem. Clearly, there is an immense need to build an effective solution to remove such objectionable and defamatory content from social networking sites, in order to promote and protect public decency and the welfare of children and adults. It is also essential in order to improve the experience of genuine users of social media and to protect them from harm to their public identity on the World Wide Web.
Further in this paper, classification of pornographic spammers and genuine users was also performed using machine learning techniques. Experimental results show that a Random Forest classifier can predict pornographic spammers with a reasonably high precision of 91.96%. To the best of the authors' knowledge, this is the first attempt to analyze and classify the behavior of pornographic users on Twitter as spammers. Until now, work had been done on detecting spammers, but it did not specifically target pornographic spammers [4].

Karthik Dinakar et al. (2012) underscored that cyberbullying (harassment on social networks) is widely recognized as a serious social problem, especially for adolescents. It is as much a threat to the viability of online social networks for youth today as spam once was to email in the early days of the Internet. Current work to tackle this problem has involved social and psychological studies of its prevalence as well as its negative effects on adolescents. While true solutions rest on teaching youth to have healthy personal relationships, few have considered innovative design of social network software as a tool for mitigating this problem. Mitigating cyberbullying involves two key components: robust techniques for effective detection, and reflective user interfaces that encourage users to reflect on their behavior and their choices [5][4].

Spam filters have been successful by applying statistical approaches such as Bayesian networks and hidden Markov models. They can, like Google's Gmail, aggregate human spam judgments, because spam is sent nearly identically to many people. Bullying is more personalized, varied, and contextual. In this work, the authors present an approach to bullying detection based on state-of-the-art natural language processing and a common sense knowledge base, which permits recognition over a broad spectrum of topics in everyday life. They analyze a narrower range of particular subject matter associated with bullying (for example appearance, intelligence, racial and ethnic slurs, social acceptance, and rejection), and construct BullySpace, a common sense knowledge base that encodes particular knowledge about bullying situations. They then perform joint reasoning with common sense knowledge about a wide range of everyday life topics. They analyze messages using their novel AnalogySpace common sense reasoning technique. They also take into account social network analysis and other factors. They evaluate the model on real-world instances that have been reported by users on Formspring, a social networking website that is popular with teenagers. On the intervention side, they explore a set of reflective user-interaction paradigms with the goal of promoting empathy among social network participants. They propose an 'air traffic control'-like dashboard, which alerts moderators to large-scale outbreaks that appear to be escalating or spreading and helps them prioritize the current deluge of user complaints. For potential victims, they provide educational material that informs them about how to cope with the situation and connects them with emotional support from others.
A user evaluation shows that in-context, targeted, and dynamic help during cyberbullying situations fosters end-user reflection that promotes better coping strategies [5].

Paula Fortuna and Sérgio Nunes (2018) emphasized that the scientific study of hate speech, from a computer science point of view, is recent. Their survey organizes and describes the current state of the field, providing a structured overview of previous approaches, including core algorithms, methods, and main features used. The work also discusses the complexity of the concept of hate speech, which is defined differently across many platforms and contexts, and provides a unifying definition. This area has unquestionable potential for societal impact, particularly in online communities and digital media platforms. The development and systematization of shared resources, such as guidelines, annotated datasets in multiple languages, and algorithms, is a crucial step in advancing the automatic detection of hate speech [6].

Anna Schmidt and Michael Wiegand (2017) emphasized the term hate speech. They decided in favour of using this term since it can be considered a broad umbrella term for the numerous kinds of insulting user-created content addressed in the individual works they summarize. Hate speech is also the most frequently used expression for this phenomenon, and is even a legal term in several countries. They also list other terms that are used in the NLP community, which should help readers find further literature on the task. Hate speech is commonly defined as any communication that disparages a group on the basis of some characteristic such as race, color, ethnicity, gender, sexual orientation, nationality, religion, or other characteristics (Nockleby, 2000) [7].

III. Methodology

Text Mining Approaches in Automatic Hate Speech Detection

In this research article the researcher described algorithms for hate speech detection, as well as other studies focusing on related concepts (e.g., cyberbullying). Finding the right features for a classification problem can be one of the more demanding tasks when using machine learning. Therefore, this section describes the features already used by other authors. The features are divided into two categories: general features used in text mining, which are common in other text mining fields; and specific hate speech detection features, which are found in hate speech detection papers and are intrinsically related to the characteristics of this problem. We present our analysis in this section.

  1. General Features Used in Text Mining. The majority of the papers found try to adapt strategies already known in text mining to the specific problem of automatic detection of hate speech. General features are defined here as the features commonly used in text mining. We start with the simplest approaches, which use dictionaries and lexicons.
  2. Dictionaries. One strategy in text mining is the use of dictionaries. This approach consists of making a list of words (the dictionary) that are searched for and counted in the text. These frequencies can be used directly as features or to compute scores. In the case of hate speech detection, this has been conducted using:
     • Content words (such as insults and swear words, reaction words, and personal pronouns) collected from
     • The number of profane words in the text, with a dictionary that consists of 414 words, including acronyms and abbreviations, where the majority are adjectives and nouns.
     • Label-specific features, which consist of frequently used forms of verbal abuse as well as widely used stereotypical utterances.
     • The Ortony lexicon, which was also used for negative affect detection; it contains a list of words denoting a negative connotation and can be useful because not every rude comment necessarily contains profanity, yet it can be equally harmful.
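The dictionary strategy described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions: the three tiny word lists and the function name `dictionary_features` are hypothetical and stand in for the much larger lexicons used in the surveyed studies.

```python
import re
from collections import Counter

# Hypothetical mini-dictionaries (assumptions for illustration only).
DICTIONARIES = {
    "profanity": {"damn", "crap"},
    "insult": {"idiot", "loser"},
    "second_person": {"you", "your", "yours"},
}

def dictionary_features(text):
    """Count how many tokens of the text fall into each dictionary."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for name, words in DICTIONARIES.items():
            if token in words:
                counts[name] += 1
    # Return every dictionary name, including those with zero matches.
    return {name: counts[name] for name in DICTIONARIES}

features = dictionary_features("You are such a loser, you idiot. Damn!")
print(features)  # {'profanity': 1, 'insult': 2, 'second_person': 2}
```

The resulting counts could be passed directly to a classifier as features, or combined into a single score.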

This methodology can be used with an additional step of normalization, by considering the total number of words in each comment. It is also possible to use this kind of approach with regular expressions. Other general approaches include rule-based approaches, sentiment analysis, and deep learning. For the specific hate speech detection features, we found mainly othering language, the superiority of the in-group, and a focus on stereotypes. We also observed that the majority of the studies consider only generic features and do not use particular features for hate speech. This can be problematic because hate speech is a complex social phenomenon in constant evolution and grounded in language nuances. Finally, we identified challenges and opportunities in this field, namely the scarcity of open-source code and platforms that automatically classify hate speech; the lack of comparative studies that evaluate the existing approaches; and the absence of studies in languages other than English.
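The normalization step and the regular-expression variant mentioned at the start of the paragraph above could look like the sketch below. The pattern `OBFUSCATED_INSULT` and the function `normalized_match_rate` are hypothetical names introduced for this illustration; the pattern handles a single word and is only meant to show how a regular expression can catch spellings that a plain dictionary lookup would miss.

```python
import re

# Hypothetical pattern: matches "idiot" even when letters are repeated or
# separated by non-word characters such as dots or dashes.
OBFUSCATED_INSULT = re.compile(
    r"\bi+[\W_]*d+[\W_]*i+[\W_]*o+[\W_]*t+\b", re.IGNORECASE
)

def normalized_match_rate(text):
    """Number of pattern matches divided by the number of words in the comment."""
    words = text.split()
    if not words:
        return 0.0
    matches = OBFUSCATED_INSULT.findall(text)
    return len(matches) / len(words)

print(normalized_match_rate("What an i.d.i.o.t you are"))  # 0.2
print(normalized_match_rate("Have a nice day"))            # 0.0
```

Dividing by the comment length keeps long and short comments comparable, since a single match in a five-word comment weighs more than one match in a long post.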

IV. Cases of hate speech

Hate speech has become a popular topic in recent years. This is reflected not only by the increased media coverage of this problem but also by the growing political attention. There are several reasons to focus on automatic hate speech detection, which we discuss in the following list:

  1. European Union Commission directives. In recent years, the European Union Commission has been conducting different initiatives for decreasing hate speech. Several programs are being founded in the fight against hate speech. Also, European regulators accused Twitter of not being good enough at removing hate speech from its platform.
  2. Automatic techniques not available. Automated techniques aim to programmatically classify text as hate speech, making its detection easier and faster for those who have the responsibility to protect the public [9, 65]. These techniques can give a response in less than 24h. Some studies have been conducted on the automatic detection of hate speech, but the tools provided are scarce.
  3. Lack of data about hate speech. There is a general lack of systematic monitoring, documentation, and data collection of hate and violence, namely against LGBTI (lesbian, gay, bisexual, transgender, and intersex) people. Nevertheless, detecting hate speech is a very important task, because it is connected with actual hate crimes, and automatic hate speech detection in text can also provide data about this phenomenon.
  4. Hate speech removal. Some companies and platforms might be interested in hate speech detection and removal. For instance, online media publishers and online platforms in general need to attract advertisers and therefore cannot risk becoming known as platforms for hate speech.

V. Issues and challenges

Hate speech is a complex phenomenon and its detection is problematic. Some challenges and difficulties were highlighted by the authors of the surveyed papers:

  1. Low agreement in hate speech classification by humans, indicating that this classification would be harder for machines.
  2. The task requires expertise about culture and social structure.
  3. The evolution of social phenomena and language makes it difficult to track all racial and minority insults.
  4. Language evolves quickly, in particular among young populations that communicate frequently in social networks.
  5. Despite the offensive nature of hate speech, abusive language may be very fluent and grammatically correct, and it can cross sentence boundaries.
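The first challenge in the list above, low agreement between human annotators, is often quantified with a chance-corrected agreement score. The sketch below uses Cohen's kappa as one such measure; the choice of metric, the function name `cohen_kappa`, and the two annotation lists are assumptions made for this illustration and are not taken from the surveyed papers.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items.")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both annotators labeled independently at their own rates.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical labels from two annotators for the same eight comments.
annotator_1 = ["hate", "hate", "none", "none", "hate", "none", "none", "none"]
annotator_2 = ["hate", "none", "none", "none", "none", "none", "hate", "none"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))  # 0.143
```

Here the annotators agree on five of eight comments, yet kappa is low because most of that agreement would be expected by chance when "none" is the dominant label.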

VI. Discussion and conclusion

This research article is based on a critical overview of how the automatic detection of hate speech in text has evolved over the past years. First, we analyzed the concept of hate speech in different contexts, from social network platforms to other organizations. Based on our analysis, we proposed a unified and clearer definition of this concept that can help to build a model for the automatic detection of hate speech. Additionally, we presented examples and rules for classification found in the literature, together with the arguments in favour of or against those rules. Our critical view pointed out that we have a more inclusive and general definition of hate speech than other perspectives found in the literature. This is the case because we propose that subtle forms of discrimination on the internet and online social networks should also be spotted. With our analysis, we also concluded that it would be important to compare hate speech with cyberbullying, abusive language, discrimination, toxicity, flaming, extremism, and radicalization. Because standard datasets are lacking, it is more difficult to compare results from different studies. Nevertheless, we found three available datasets, in English and German. Additionally, we compared the diverse studies using algorithms for hate speech detection, and we ranked them in terms of performance. Our goal was to reach conclusions about which approaches are more successful. However, in part due to the lack of standard datasets, we found that no particular approach proved to reach better results across the several articles.

In this paper, the researcher presented a critical review of the automatic detection of hate speech. This task is usually framed as a supervised learning problem. Fairly generic features, such as bag of words or embeddings, systematically yield reasonable classification performance. Character-level approaches work better than token-level approaches. Lexical resources, such as lists of slurs, may help classification, but usually only in combination with other types of features. Various complex features using more linguistic knowledge, such as dependency parse information, or features modelling specific linguistic constructs, such as imperatives or politeness, have also been shown to be effective. Information derived from text may not be the only cue suggesting the presence of hate speech. It may be complemented by meta-information or information from other modalities (e.g. images attached to messages). Making judgments about the general effectiveness of many of the complex features is
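The contrast between token-level and character-level features noted in the paragraph above can be illustrated with a minimal sketch. The function names and the example spelling variant are assumptions for this illustration; the sketch shows only one plausible reason character n-grams help, namely that they still overlap when a word is deliberately misspelled.

```python
from collections import Counter

def token_features(text):
    """Bag of words: counts of whitespace-separated, lowercased tokens."""
    return Counter(text.lower().split())

def char_ngram_features(text, n=3):
    """Counts of character n-grams, with padding spaces marking word edges."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def jaccard_overlap(features_a, features_b):
    """Share of distinct features that the two feature sets have in common."""
    keys_a, keys_b = set(features_a), set(features_b)
    union = keys_a | keys_b
    if not union:
        return 0.0
    return len(keys_a & keys_b) / len(union)

clean = "stupid"
variant = "stupiid"  # hypothetical obfuscated spelling

token_overlap = jaccard_overlap(token_features(clean), token_features(variant))
char_overlap = jaccard_overlap(char_ngram_features(clean), char_ngram_features(variant))

print(token_overlap)  # 0.0
print(char_overlap)   # 0.625
```

At the token level the two spellings share nothing, while at the character level five of the eight distinct trigrams are shared, so a classifier trained on the clean spelling still receives a signal from the variant.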


References

  1. Chikashi Nobata, Joel Tetreault, Achint Thomas, Yashar Mehdad, Yi Chang (2016). Abusive Language Detection in Online User Content, International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 2016. ISBN: 978-1-4503-4143-1. doi: 10.1145/2872427.2883062.
  2. Hossein Hosseini, Sreeram Kannan, Baosen Zhang, Radha Poovendran (2017). Deceiving Google's Perspective API Built for Detecting Toxic Comments, Machine Learning, 27 Feb 2017.
  3. B. Sri Nandhini, J. I. Sheeba (2015). Online Social Network Bullying Detection Using Intelligence Techniques, Procedia Computer Science, Volume 45, 2015, Pages 485-492, Elsevier.
  4. Divya Bansal, Sanjeev Sofat (2016). Behavioural analysis and classification of spammers distributing pornographic content in social media, Social Network Analysis and Mining, 24 June 2016, Springer.
  5. Karthik Dinakar, Birago Jones, Catherine Havasi, Henry Lieberman, Rosalind Picard (2012). Common Sense Reasoning for Detection, Prevention, and Mitigation of Cyberbullying, ACM Transactions on Interactive Intelligent Systems (TiiS), Special Issue on Common Sense for Interactive Systems, Volume 2, Issue 3, September 2012.
  6. Paula Fortuna, Sérgio Nunes (2018). A Survey on Automatic Detection of Hate Speech in Text, ACM Computing Surveys, Vol. 51, No. 4, Article 85. Publication date: July 2018.
  7. Anna Schmidt, Michael Wiegand (2017). A Survey on Hate Speech Detection using Natural Language Processing, Fifth International Workshop on Natural Language Processing for Social Media, pages 1–10, Valencia, Spain, April 3-7, 2017.

Cite this Page

Critical Review: Automatic Detection of Hate Speech. (2022, December 27). Edubirdie. Retrieved September 24, 2023, from