A Pilot Study On Sign Language Detection


ABSTRACT

People with physical limitations such as speech and hearing impairment are often unable to convey their message properly, which leads to them being left out of many aspects of life. To help these people express themselves more easily, we have developed a sign language detection application: a translator that takes hand gestures as input and gives the equivalent alphabet as output, helping them communicate. A convolutional neural network was used for image recognition and classification in order to distinguish the hand from other objects on the screen and classify the sign represented by the hand gesture at any given time, enabling us to translate the signs into English alphabets.

Thus, this application can be used by specially abled people to communicate with others in a more efficient and hassle-free way. The tools used were Anaconda, Python, OpenCV, TensorFlow, Matplotlib, NumPy, and convolutional neural networks.


INTRODUCTION

Many of us are born differently and may have difficulty learning spoken language due to some complication. For such people, a different method of expression has been developed in which only hand gestures are used to express their thoughts; it is known as sign language. Sign language is the primary language used by people with impaired hearing and speech. People use sign language gestures as a means of non-verbal communication to express their thoughts and emotions. But non-signers find it extremely difficult to understand, hence trained sign language interpreters are needed during medical and legal appointments and educational and training sessions [1].

According to the latest census, there are 5 million deaf and hearing-impaired people who use sign language on a daily basis to express themselves. The problem with sign language is that it is not easily understood by people who have not studied it, and hence there is a communication barrier between those people and the impaired people. Thus, people who use sign language as their primary language face various problems in their day-to-day lives as they are not able to communicate effectively with the majority of the population.

Helping people who can only use sign language to communicate with the rest of the world, without the other person having to learn sign language, is what motivated us to build the sign language detection application. It can be used by disabled people to translate their signs into the English alphabet, making communication with people who do not understand sign language simpler and hassle-free.

Our methodology uses convolutional neural networks to categorize the images, distinguishing the hand from other objects and then classifying the gesture made by the hand. Gestures, or so-called sign languages, are used as a tool of communication in human-to-human interaction. Such communication is bound to a specific set of protocols or symbols, such as Indian Sign Language (ISL) or American Sign Language (ASL). The aim of the research is to illustrate the extraction of unique features from each symbol or gesture with reference to an existing data set (ISL or ASL) and then train the machine on the obtained feature vectors using standard classification models. The main objective is to recognize the hand separately from the other objects on the screen, classify the gesture being made by the hand at any given time, and output the mapped English alphabet of that gesture on the screen [2].

In sign language recognition, where the motion of the hand and its location in consecutive frames are key features in the classification of different signs, a fixed reference point must be chosen. The hand's contour was chosen to obtain information on the shape of the hand, and the hand's center of gravity (COG) was used as the reference point, which alleviated the bias introduced by other choices of reference point. After defining the reference point, the distance of every point on the contour with respect to the COG of the hand was estimated. The location of the tip of the hand was then easily extracted as the local maximum of this distance vector [3].
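A minimal sketch of this contour-and-COG idea using OpenCV and NumPy is shown below; the function name, the use of the largest contour as the hand, and the global (rather than local) maximum are our own illustrative assumptions, not the exact implementation of [3]:

```python
import cv2
import numpy as np

def fingertip_from_contour(binary_mask):
    """Locate the hand tip as the contour point farthest from the hand's
    center of gravity (COG). `binary_mask` is a single-channel image in
    which hand pixels are non-zero."""
    contours, _ = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    if not contours:
        return None
    hand = max(contours, key=cv2.contourArea)  # assume largest blob = hand

    # Center of gravity from image moments.
    m = cv2.moments(hand)
    cog = np.array([m["m10"] / m["m00"], m["m01"] / m["m00"]])

    # Distance of every contour point to the COG; the tip is taken as the
    # maximum of this distance vector.
    pts = hand.reshape(-1, 2).astype(np.float32)
    dists = np.linalg.norm(pts - cog, axis=1)
    tip = pts[np.argmax(dists)]
    return cog, tip
```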

The importance of the end result is that it helps disabled people convey their message to people who do not understand sign language by converting sign language gestures into the English alphabet, empowering them to communicate more efficiently and to express themselves better. The application can be used both by disabled people and by people whose relative, customer, business partner, or other associate is specially abled, letting them communicate without learning sign language themselves. It thus forms a bridge that allows English speakers and sign language users to communicate with each other without difficulty.

METHODOLOGY

A neural network is a machine learning technique modelled after the structure of the brain. It comprises learning units called neurons, which learn how to convert input signals into corresponding output signals, forming the basis of automated recognition. The methodology adopted in this paper is based on a convolutional neural network (CNN, or ConvNet), a type of feed-forward artificial neural network in which the connectivity pattern between neurons is inspired by the organization of the animal visual cortex.

The first step in image processing is the detection of skin-colored pixels. The aim is to extract the skin (face and hand) from the rest of the image and, after that, to extract only the hand. Skin detection is a very popular and useful technique for detecting and monitoring human body parts. The main aim of a skin detector is to form a decision rule that separates skin pixels from non-skin pixels. Identifying skin-colored pixels involves finding the range of values in a given color format into which the largest number of skin pixels fall, while minimizing the misclassification of non-skin pixels. In this paper, however, skin detection is not used, in order to see whether hand gestures can be detected without skin-color pixel detection [4].
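For illustration, skin-pixel detection of the kind described above is commonly implemented by thresholding a color range; the following OpenCV sketch uses typical YCrCb bounds as an assumption (and, as noted, this step is deliberately omitted in the paper itself):

```python
import cv2
import numpy as np

def skin_mask(bgr_frame):
    """Return a binary mask of likely skin pixels by thresholding a range
    in the YCrCb color space (bounds are typical values, not tuned here)."""
    ycrcb = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2YCrCb)
    lower = np.array([0, 135, 85], dtype=np.uint8)
    upper = np.array([255, 180, 135], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower, upper)
    # Light morphological clean-up to reduce isolated misclassified pixels.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    return mask
```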

The data set used in this paper is American Sign Language [5], which consists of the 26 English letters and 3 additional signs indicating "Space", "Del", and "Nothing". For training, 3000 images are used for each sign. First, a video of a sign language demonstration is sampled and concatenated into an image [6]. A video camera with standard specifications is used to acquire the images. The videos are converted into frames, which are cropped to select a specific region before the algorithm is applied. After that, the image becomes the input of the convolutional neural network (CNN). Since the method uses only 2D images, a cheap camera can be used, which is its main advantage.
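A hedged sketch of the frame-extraction and cropping step using OpenCV; the sampling rate and the crop box are placeholders, since the paper does not specify them:

```python
import cv2

def extract_frames(video_path, roi=(100, 100, 300, 300), every_nth=5):
    """Sample frames from a sign language video and crop a fixed region of
    interest (x, y, w, h). Both parameters are illustrative placeholders."""
    cap = cv2.VideoCapture(video_path)
    x, y, w, h = roi
    frames, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % every_nth == 0:
            frames.append(frame[y:y + h, x:x + w])
        i += 1
    cap.release()
    return frames
```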

The process of recognizing the hand gesture is described in the flow chart above: first, the training data come into the network and go out as probabilities for selecting the candidate signs.

The sign language actions are learned through a CNN, which is known to have strong performance on image classification problems. The network in this paper consists of four convolution layers and two fully connected layers. The main building block of a convolutional neural network is the convolution layer, which comprises a set of independent filters [7]. The CNN has four convolutional layers with the ReLU (Rectified Linear Unit) activation function, which controls how the signal flows from one layer to the next, emulating how neurons are fired in our brain. A pooling layer is another building block of a CNN. The function of the pooling layer is to progressively reduce the spatial size of the representation in order to reduce the number of parameters and the amount of computation in the network. The pooling layer operates on each feature map independently. There is mean pooling and max pooling; the most common approach is max pooling, in which the maximum of a region is taken as its representative. Fully connected layers are the last layers of the network, meaning that every neuron of the preceding layer is connected to every neuron in the subsequent layer [8].
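The architecture described above (four convolution layers with ReLU, max pooling, and two fully connected layers) could be sketched in Keras as follows; the filter counts, the 64×64 input size, and the optimizer are assumptions not stated in the paper:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 29  # 26 letters plus "Space", "Del", "Nothing"

def build_sign_cnn(input_shape=(64, 64, 3)):
    """Four conv+ReLU blocks with max pooling, followed by two
    fully connected layers, as outlined in the text."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),             # first FC layer
        layers.Dense(NUM_CLASSES, activation="softmax"),  # second FC / output
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```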

In this paper, the Inception v3 model from the TensorFlow library has been used. It is a large image classification model with millions of parameters that can differentiate a large number of kinds of images; only the final layer of the network is trained. Inception v3 was trained for the ImageNet Large Scale Visual Recognition Challenge using data from 2012, where it reached a top-5 error rate as low as 3.46%. The pre-trained Inception v3 model (trained on the ImageNet data set consisting of 1000 classes) is downloaded, and a new final layer corresponding to the number of categories is added and trained on our data set.
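A sketch of the retrain-only-the-final-layer idea using the Keras InceptionV3 application; this mirrors the approach described above but is not the exact script used in the paper:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 29  # ASL letters plus "Space", "Del", "Nothing"

# Load Inception v3 pre-trained on ImageNet, without its 1000-class head.
base = tf.keras.applications.InceptionV3(weights="imagenet",
                                         include_top=False,
                                         input_shape=(299, 299, 3),
                                         pooling="avg")
base.trainable = False  # freeze the base: only the new final layer is trained

model = models.Sequential([
    base,
    layers.Dense(NUM_CLASSES, activation="softmax"),  # new final layer
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```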

In the first approach, we extracted spatial features for individual frames using the Inception model (CNN). Each video (a sequence of frames) was then represented by the sequence of predictions made by the CNN for each of the individual frames. The prediction is done against an idle (white) background. First, the frames are extracted from the video sequences of each gesture, then the noise is removed from the frames, and a region of interest is created in the form of a rectangular box to extract more relevant features from the frame.
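One possible way to represent each video as a sequence of per-frame CNN predictions, assuming the `model` from the Inception sketch above and BGR frames from OpenCV; the preprocessing details are assumptions:

```python
import cv2
import numpy as np
import tensorflow as tf

def video_to_prediction_sequence(model, frames, size=(299, 299)):
    """Run the frame-level CNN on every sampled frame and stack the
    class-probability vectors into one sequence per video."""
    prepped = []
    for f in frames:
        rgb = cv2.cvtColor(f, cv2.COLOR_BGR2RGB)
        rgb = cv2.resize(rgb, size).astype(np.float32)
        prepped.append(tf.keras.applications.inception_v3.preprocess_input(rgb))
    return model.predict(np.stack(prepped))  # shape: (num_frames, num_classes)
```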

ROI segmentation is used: a wide image is first created by sampling and concatenating the original video frames [9]. Then, using a network that detects the hand area, the ROI-segmented hand region is obtained. The second step is sign language learning, where the ROI-segmented image is the input of the classification network and the output is a vector of class probabilities, from which the sign is determined. The data set used here was captured from a distance of 1 m in various combinations of backgrounds, clothes, etc. The success rate in the 1 m tests was 84% without ROI segmentation and 97% with it. The frames are input to the CNN model, which is trained as described above, and testing of the model is done against an idle or white background.
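A high-level sketch of this two-step ROI pipeline; `hand_detector` and `classifier` are placeholders for the detection and classification networks, and the sampling and resizing details are assumptions:

```python
import cv2
import numpy as np

def classify_with_roi(frames, hand_detector, classifier, num_samples=8):
    """Sketch of the ROI pipeline: concatenate sampled frames into one wide
    image, detect the hand area, then classify the cropped ROI."""
    # Step 1: sample frames evenly and concatenate them side by side.
    idx = np.linspace(0, len(frames) - 1, num_samples).astype(int)
    wide = np.concatenate([frames[i] for i in idx], axis=1)

    # Step 2: the detection network returns the hand bounding box (x, y, w, h).
    x, y, w, h = hand_detector(wide)
    roi = wide[y:y + h, x:x + w]

    # Step 3: the classification network outputs a probability vector;
    # the sign is the class with the highest probability.
    probs = classifier(cv2.resize(roi, (64, 64)))
    return int(np.argmax(probs)), probs
```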

CONCLUSION

We have been successful in building an application that can translate sign language into the English alphabet with acceptable accuracy. We did so by using a convolutional neural network to categorize the images, distinguish the hand from other objects present in the image, and classify the sign represented by the image; the sign language gestures were then mapped to the English alphabet, and the equivalent alphabet for each sign was displayed on the monitor. The application is ready to be used by disabled people for communication, enabling them to express themselves better and interact with people who do not understand sign language.

REFERENCES

  1. Lihong Zheng, Bin Liang, and Ailian Jiang, "Recent Advances in Deep Learning for Sign Language Recognition".
  2. Himadri Nath Saha, Sayan Tapadar, Shinjini Ray, Suhrid Krishna Chatterjee and Sudipta Saha, "A Machine Learning Based Approach for Hand Gesture Recognition Using Distinctive Feature Extraction".
  3. Ashok K Sahoo, Gouri Sankar Mishra and Kiran Kumar Ravulakollu, "Sign Language Recognition: State of the Art", ARPN Journal of Engineering and Applied Sciences, Vol. 9, No. 2, February 2014.
  4. Marko Z. Šušić, Saša Z. Maksimović, Sofija S. Spasojević and Željko M. Đurović, "Recognition and Classification of Deaf Signs using Neural Networks", 11th Symposium on Neural Network Applications in Electrical Engineering (NEUREL-2012), Faculty of Electrical Engineering, University of Belgrade, Serbia, September 20-22, 2012.
  5. https://www.kaggle.com/grassknoted/asl-alphabet
  6. Purva A. Nanivadekar and Dr. Vaishali Kulkarni, "Indian Sign Language Recognition: Database Creation, Hand Tracking and Segmentation", 2014 International Conference on Circuits, Systems, Communication and Information Technology Applications (CSCITA).
  7. Yangho Ji, Sunmok Kim, and Ki-Baek Lee, "Sign Language Learning System with Image Sampling and Convolutional Neural Network", 2017 First IEEE International Conference on Robotic Computing.
  8. Marlon Oliveira, Houssem Chatbri, Suzanne Little, Noel E. O'Connor, and Alistair Sutherland, "A Comparison Between End-to-End Approaches and Feature Extraction Based Approaches for Sign Language Recognition".
  9. Sunmok Kim, Yangho Ji, and Ki-Baek Lee, "An Effective Sign Language Learning with Object Detection Based ROI Segmentation", 2016 13th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI), August 19-22, 2016, Xi'an, China.