Indian Sign Language Recognition System: Approaches And Challenges

This essay sample was donated by a student to help the academic community.



Communication is a fundamental need for every individual. People communicate with each other to share thoughts, ideas and information, and, most crucially, to feel connected with one another. Normal individuals often fail to communicate with deaf-dumb people because they lack knowledge of the sign language the deaf-dumb use. According to Census 2011, India has 5,072,914 people with hearing disabilities and 1,998,692 with speech disabilities. This population needs a translator that can solve the communication problem between deaf-dumb and normal individuals. Developing a gesture recognition system is one way to solve this problem, and such systems also have several other applications, which are elaborated in this paper. This study surveys various gesture recognition systems based on Indian Sign Language (ISL) and their different approaches, and also discusses recent work and the challenges in developing Sign Language Recognition (SLR) systems.


Communication is the basic need of every person. Individuals speak with one another to share thoughts, ideas and information, and, most vitally, to feel connected with one another. Ordinary individuals use speech, visuals, signs and behaviour to communicate [1]. People who have a hearing or speech disability, or both, confront many issues in communication; at the same time, when ordinary individuals attempt to speak with deaf-dumb individuals, they too fail to communicate.


Deaf-dumb people use sign language as their method of communication with other deaf-dumb people. Sign language is not known to all typical individuals, so deaf-dumb people generally take the help of a human interpreter to speak with normal individuals. Keeping a human interpreter always with oneself is not possible in every circumstance [2]. Likewise, very few human interpreters with knowledge of sign language are available [3]. Deaf-dumb people become isolated in society because of their disabilities. They too are human beings, so why are they not treated equally by society? [4] They confront many issues whenever they face the real world, and they cannot live the life a normal human being lives. To address their issues and serve this disabled population, there is a need for an automated interpreter.

In India, according to Census 2011 (updated 2016), 2.68 crore persons out of a total population of 121 crore are disabled. Of this disabled population, 5,072,914 (male 2,678,584; female 2,394,330) are hearing disabled and 1,998,692 (male 1,122,987; female 875,705) are speech disabled [5].

Much research is currently in progress to develop a computerized interpreter for the deaf-dumb using ICT-based methodologies. In the USA, the first ICT-based learning tool was developed in 2002 by Sabahat [6]. Gesture recognition is one such ICT-based method used to build these interpreters. The next section discusses gesture recognition and sign language with respect to communication between deaf-dumb and normal individuals, along with its approaches and applications.


Gesture recognition is a method that processes gestures made by a person so they can be interpreted by another party. Gestures are expressive body motions involving movement of the fingers, arms, hands, face, head or body, made with the aim of passing information and interacting with others. Gesture recognition systems are an application area of Human-Computer Interaction (HCI) [7]; HCI research designs various systems for communicating and working with computers. Gestures are basically classified as static or dynamic, and some gestures have both static and dynamic elements. Gestures can broadly be of the following types [8]:

A. Hand and arm gestures

Recognition of hand poses, sign languages, and entertainment applications (allowing children to play and interact in virtual environments). This category includes postures and gestures: a posture is a static arrangement of the fingers with no hand movement, while a gesture is a dynamic hand movement with or without finger movements.

B. Head and face gestures

Some examples are nodding or shaking the head, direction of eye gaze, raising the eyebrows, opening the mouth to speak, winking, flaring the nostrils, and expressions of surprise, happiness, disgust, fear, anger, sadness, contempt, etc.

C. Body gestures

Involvement of full body motion, as in: a) tracking movements of two people interacting outdoors; b) analyzing movements of a dancer for generating matching music and graphics; and c) recognizing human gaits for medical rehabilitation and athletic training.

Human gestures commonly constitute a space of motion expressed by the body, face and hands; hand gestures are the most expressive and the most frequently used [8].

Gesture recognition systems are classified as hardware-based and vision-based systems [4][9].

A. Hardware-based gesture recognition systems

Hardware-based systems incorporate wireless gloves or data gloves that can be used to extract hand signs [10]. These gloves contain sensors, an Arduino circuit board and an accelerometer to recognize gestures [10][11][12]. The accelerometer is used to sense tilt, which can be static as well as dynamic; the sensors used include flex and tactile sensors, among others. The drawbacks of hardware-based systems are that one cannot always carry the hardware with its heap of cables, the hardware is not suitable for all kinds of climate conditions, and, most importantly, its cost is quite high [4].
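As a sketch of how a glove's static tilt is typically derived from raw accelerometer axes (these are the standard tilt-sensing formulas, not taken from any of the cited glove designs):

```python
import math

def tilt_angles(ax, ay, az):
    """Estimate pitch and roll (degrees) from raw accelerometer axes.

    Standard tilt-sensing formulas; gravity is assumed to be the only
    acceleration on the sensor (i.e. the hand is momentarily still).
    """
    pitch = math.degrees(math.atan2(ax, math.sqrt(ay * ay + az * az)))
    roll = math.degrees(math.atan2(ay, az))
    return pitch, roll

# A glove lying flat (gravity entirely on the z axis) reads zero tilt:
print(tilt_angles(0.0, 0.0, 1.0))   # (0.0, 0.0)
# Hand tipped fully forward (gravity entirely on the x axis):
print(tilt_angles(1.0, 0.0, 0.0))   # (90.0, 0.0)
```

A real glove would sample these angles, together with flex-sensor readings per finger, and map the combined vector to a sign.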

B. Vision-based gesture recognition systems

The vision-based methodology simply requires a camera device, such as a cell phone, to capture video or images. It applies image processing fundamentals together with artificial intelligence concepts to extract sign features and recognize a sign, and based on that it produces text or audio output [7].
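A common first step in such vision pipelines is skin-color segmentation. A minimal per-pixel sketch in HSV space follows; the thresholds are illustrative only and would be tuned per dataset and lighting conditions:

```python
import colorsys

def is_skin(r, g, b):
    """Rough skin-pixel test in HSV space (illustrative thresholds).

    r, g, b are 0-255 channel values; colorsys works in the 0-1 range.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # Skin hues sit near red/orange (low h); moderate saturation,
    # reasonably bright value.
    return h <= 0.14 and 0.15 <= s <= 0.8 and v >= 0.35

print(is_skin(220, 170, 140))  # True  (a typical skin tone)
print(is_skin(0, 0, 255))      # False (pure blue background)
```

Applying this test to every pixel yields a binary mask from which the hand region can be cropped before feature extraction.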


Sign language is a visual language used by the deaf-dumb for communication. It consists of three major components: finger-spelling, word-level sign vocabulary and non-manual features [8]. Finger-spelling is used to spell words letter by letter and is commonly used to state a person's name. The word-level vocabulary (the sign language dictionary) is used for the majority of communication, while non-manual features consist of facial expression and the position of the tongue, mouth and body.

Fundamentally, sign language has its own semantics, language structure and grammatical rules [4]. The signer uses hand movements and facial expressions to form various gestures, each presenting a specific character, number or word [2][13]. Deaf-dumb people use sign language while communicating with other deaf-dumb people [4]. However, when communication occurs between a deaf-dumb person and a normal individual, conveying the message becomes a problem because gesture-based communication is not known to all.

There are 143 sign languages used around the world [4]. Different nations have their own sign languages, such as American Sign Language, Arabic Sign Language, Austrian Sign Language, Indian Sign Language, British Sign Language, German Sign Language, Persian Sign Language, Chinese Sign Language and Pakistani Sign Language [7][6][4]. American Sign Language is acknowledged as the standard sign language by numerous nations worldwide [3], while some countries use a sign language suited to their own culture, such as Indian Sign Language in India.

Distinct sign languages use diverse ways to present characters, numbers and words; for instance, some use one hand and some use two. A characteristic of Indian Sign Language is that it uses two hands to frame various characters [1], whereas American Sign Language uses just a single hand. In sign language, grammar is not emphasized; the emphasis is on the words themselves. The articles 'a', 'an' and 'the' are omitted, and tenses are not expressed. The order of a sentence is Subject, Object, Verb (SOV), whereas in the English language it is Subject, Verb, Object (SVO) [14]. For instance, the English sentence "I am not reading a book" is presented in sign language as "Not reading book", omitting the grammar of the English language.

IV. APPLICATIONS OF GESTURE RECOGNITION

A. Sign language Interpreter/Translator

It makes communication possible between deaf-dumb and normal individuals. Otherwise, the deaf-dumb need a human translator with them at all times, which is costly, and human translators have limited availability anywhere, anytime.

B. Sign Language learning tool

Most deaf-dumb kids start out with incomplete or no knowledge of sign language and learn it only at deaf-dumb schools, so any system serving such kids as a teaching tool would be a great help and would enable them to express themselves through signs.

C. Human computer interface

It can be used by deaf-dumb individuals to provide input to a computer interface.

D. Interfacing in virtual environment

Gestures are used to control computer systems, music systems, virtual games, many home appliances, medical equipment used in surgery by doctors, etc. This provides a more natural way to interact with computers.


The communication problem between deaf-dumb and normal people is addressed in [4]. The proposed approach is a computer-vision-based algorithm that enables dual-way (full-duplex) communication between deaf-dumb and normal people: image frames are extracted from live video, the hand gesture is extracted from each frame, and the final output is text or speech, and vice versa. Owing to the unavailability of an ISL dataset, the authors made their own video dataset of 6 different ISL words from 4-5 persons under different illumination conditions. Video pre-processing comprises filtering and segmentation, with the HSV color model used for skin-color-based segmentation. Eigenvectors and eigenvalues are used for feature extraction, followed by an eigenvalue-weighted Euclidean-distance-based classifier. Finally, the classified gesture is converted into text or speech, with the reverse processing providing the second direction of communication.
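The eigenvalue-weighted Euclidean distance classification described above amounts to nearest-template matching under a weighted metric. A minimal sketch follows; the feature vectors, sign labels and weight values here are invented purely for illustration, not taken from [4]:

```python
import math

def weighted_euclidean(a, b, w):
    """Euclidean distance with per-dimension weights (e.g. eigenvalues)."""
    return math.sqrt(sum(wi * (ai - bi) ** 2
                         for ai, bi, wi in zip(a, b, w)))

def classify(sample, templates, weights):
    """Return the label of the stored template nearest to the sample."""
    return min(templates,
               key=lambda lbl: weighted_euclidean(sample, templates[lbl], weights))

# Hypothetical 3-D feature vectors for two ISL signs:
templates = {"hello": [0.9, 0.1, 0.4], "thanks": [0.2, 0.8, 0.6]}
weights = [1.0, 0.5, 0.25]  # e.g. eigenvalues of the training covariance
print(classify([0.85, 0.2, 0.5], templates, weights))  # hello
```

Weighting each dimension lets directions of higher variance (larger eigenvalues) dominate the comparison, which is the intuition behind the paper's classifier.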

The problem of extracting complex head and hand movements with constantly changing shapes for sign language recognition is addressed in [23], which proposes a CNN-based Indian Sign Language (ISL) recognition system. Mobile selfie-based video is used as the input. Due to the unavailability of a dataset, the authors created their own of 200 different sign words from 5 ISL users, captured at 5 different viewing angles under various background environments. Training is performed in different batches to determine the training regimes CNNs require for robustness: Batch-1 trains on one set of 200 images from 1 user, Batch-2 on two sets of 200 images from 2 users, and Batch-3 on five sets of sign images. For higher recognition rates, different CNN architectures are implemented and tested on this dataset, along with three pooling methods (mean, max and stochastic pooling), of which stochastic pooling proved best in their case. To demonstrate the capabilities of CNNs, the results are compared with Mahalanobis distance, AdaBoost, ANN and deep ANN classifiers; the average recognition rate of the proposed CNN model is 92.88%, higher than the other state-of-the-art classifiers.
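The three pooling variants compared in [23] can be illustrated on a single flattened 2x2 window. This is a generic sketch (stochastic pooling following Zeiler and Fergus's formulation, where an activation is sampled with probability proportional to its value), not the paper's implementation:

```python
import random

def mean_pool(window):
    return sum(window) / len(window)

def max_pool(window):
    return max(window)

def stochastic_pool(window, rng=random):
    """Sample one activation with probability proportional to its value.

    Activations are assumed non-negative (e.g. post-ReLU).
    """
    total = sum(window)
    if total == 0:
        return 0.0
    r = rng.uniform(0, total)
    acc = 0.0
    for v in window:
        acc += v
        if r < acc:
            return v
    return window[-1]

w = [0.0, 1.0, 2.0, 5.0]   # one 2x2 pooling window, flattened
print(mean_pool(w))        # 2.0
print(max_pool(w))         # 5.0
print(stochastic_pool(w))  # one of 1.0, 2.0 or 5.0, biased toward 5.0
```

At test time stochastic pooling is typically replaced by a probability-weighted average, which keeps the regularizing effect of the random sampling used during training.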

In [18], a method is proposed in which the input video is taken as 640 x 480 VGA recordings. The frames are converted to grey-scale, and for segmenting the hand a threshold of 0.2 times the average luminance of the original image is used, followed by a Canny edge detector; the YCbCr skin color model is used for better feature extraction. The authors created their own dataset consisting of alphabets A to Z, numbers 1 to 10 and a few phrases (daily conversation in emergencies); for variation in signs, they used one left-handed signer and one signer who had an extra thumb.
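The YCbCr skin model mentioned above is usually applied as a box test on the two chrominance channels, which are less sensitive to brightness than raw RGB. The conversion below is the standard BT.601 formula, but the Cb/Cr ranges are common illustrative values, not necessarily those used in [18]:

```python
def rgb_to_ycbcr(r, g, b):
    """ITU-R BT.601 full-range RGB -> YCbCr conversion (0-255 inputs)."""
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def is_skin_ycbcr(r, g, b):
    """Chrominance-box skin test; luminance Y is deliberately ignored."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return 77 <= cb <= 127 and 133 <= cr <= 173

print(is_skin_ycbcr(220, 170, 140))  # True
print(is_skin_ycbcr(0, 0, 255))      # False
```

Ignoring Y is what makes this model more robust to illumination changes than thresholding in RGB directly.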

In [3], a webcam is used for capturing the image. Segmentation is done using skin-color-based HSV histograms, then a median filter is applied, followed by morphological erosion; the Harris algorithm is used for feature extraction. The output is given in both textual and audio form. The limitations of this approach are that both hands cannot be used and that lighting conditions can affect the results.

In [2], the scale-invariant feature transform (SIFT) algorithm is used to compute feature vectors from the source image to recognize gestures of American Sign Language (ASL), with Euclidean distance used for vector comparison. Using the SIFT algorithm, the delay time is reduced and the accuracy is increased.
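Matching descriptors by Euclidean distance can be sketched as below. The 4-D vectors are hypothetical stand-ins for real 128-D SIFT descriptors, and the ratio test shown is Lowe's standard refinement for rejecting ambiguous matches, not necessarily the exact comparison used in [2]:

```python
import math

def dist(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match(desc, candidates, ratio=0.8):
    """Match one descriptor against a database (needs >= 2 candidates).

    Lowe's ratio test: accept only if the best distance is clearly
    smaller than the second best; return the index, or None if ambiguous.
    """
    order = sorted(range(len(candidates)), key=lambda i: dist(desc, candidates[i]))
    best, second = order[0], order[1]
    if dist(desc, candidates[best]) < ratio * dist(desc, candidates[second]):
        return best
    return None

# Hypothetical 4-D descriptors for three stored gesture keypoints:
db = [[0.0, 0.0, 1.0, 1.0], [0.9, 0.9, 0.1, 0.0], [0.5, 0.5, 0.5, 0.5]]
print(match([0.05, 0.0, 0.95, 1.0], db))  # 0
```

Counting how many keypoints of a query image match each stored sign, and picking the sign with the most matches, gives a simple SIFT-based recognizer.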

In [9], a method is proposed for automatically recognizing Indian Sign Language finger-spelling, with text as the output. Digital image processing techniques combined with an artificial neural network are used to recognize static gestures of the numerals and alphabets of Indian Sign Language. Skin-color-based segmentation extracts the hand region from the image, and the shape of the hand is used for feature extraction. This method has an accuracy rate of 91.11%.

In [27], two new feature extraction techniques, Combined Orientation Histogram and Statistical (COHST) features and wavelet features, are proposed to recognize static signs of numerals in American Sign Language. System performance is measured by extracting four different features (orientation histogram, statistical measures, COHST features and wavelet features) and training a neural network to recognize the ASL numbers with each feature individually. The observations state that COHST forms a stronger feature than the orientation histogram or statistical features alone, while the wavelet-feature-based system gives the best performance of all the systems designed for static ASL number recognition, with a maximum average recognition rate of 98.17%.
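The orientation-histogram component of such features accumulates gradient magnitudes into orientation bins. A toy version for a small grayscale array (list of lists) follows; this is a minimal sketch of the general technique, not the COHST implementation of [27]:

```python
import math

def orientation_histogram(img, bins=8):
    """Histogram of gradient orientations via central differences.

    img is a 2-D grayscale array (list of rows); border pixels are skipped
    because central differences need both neighbours.
    """
    hist = [0.0] * bins
    h, w = len(img), len(img[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue  # flat region contributes nothing
            ang = math.atan2(gy, gx) % (2 * math.pi)
            hist[int(ang / (2 * math.pi) * bins) % bins] += mag
    return hist

# A vertical edge produces purely horizontal gradients (orientation 0),
# so all mass lands in the first bin:
img = [[0, 0, 9, 9]] * 4
print(orientation_histogram(img))
```

The resulting fixed-length histogram is what gets fed to the neural network, making the feature insensitive to where in the frame the hand appears.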

Microsoft Kinect, a motion-capture device, was used by Rajaganapathy S., Aravind B., Keerthana B. and Sivagami M. [28] for converting human sign language to voice. Their project was also used to control PowerPoint presentations by moving the hands left to right to change slides, and further to control electric appliances.


A. Real-time Sign Language translator usable anywhere and anytime

The improvement of sign language recognition systems lies in their ability to work in uncontrolled real-time environments, which implies coping with complex scenes that possess cluttered backgrounds, various moving objects, diverse illumination conditions and a variety of users. At present, many frameworks are intended for controlled conditions, with proper light illumination, a static background behind the signer, and the signer wearing dull or non-skin-colored clothes, so that detection of the hand and face is tractable for such recognition systems.

B. Indian Sign Language Dataset based on standard dictionary

Computer-vision-based methods face numerous challenges, such as the lack of a standard dataset for Indian Sign Language (ISL) and the unavailability of a standard ISL dictionary. Recently, the Indian Sign Language Research and Training Centre (ISLRTC) formally published an ISL dictionary [34]. In India, people of different regions use their own signs for a given word, and such ambiguity in signs may pose a problem for any sign language recognition system.

C. Recognition of Dynamic gestures and false gesture

Dynamic hand gesture recognition is still a challenging task, because numerous researchers work on static gestures only, and they also consider very few words, characters or numbers in their experiments. Sometimes one gesture represents more than one meaning, e.g. 'very good' and 'beautiful' [4]; such ambiguity in signs is still considered a challenge for sign language recognition systems.

Sometimes the presence of an unwanted gesture between two words, while the signer presents a sentence, is considered a false gesture; detection of such false gestures is a challenge. In addition, many ISL alphabets use two hands to form a character, and the use of two hands for gestures makes recognition difficult.


The above study illustrates various techniques used in vision-based systems, along with some hardware-based techniques. In vision-based techniques, accuracy depends on many factors, such as the position of the hand, the distance from the camera, the background of the signer, whether the signer uses one hand or two, illumination conditions, and the dataset. Many researchers use skin-color-based segmentation to segment the hand and face from the image, with the HSV and YCbCr color models used by many. Only a few systems recognize both alphabets and numbers for specific regional sign languages such as American Sign Language and Arabic Sign Language. A real-time sign recognition system usable in real environments could be developed, but capturing and interpreting signs from real-time video is not yet up to the mark; hence most methods have used static images. The unavailability of a standard ISL dataset is also reported as a crucial issue for ISL-based SLR. Many machine learning and deep learning methods may be helpful and effective in the development of sign language recognition systems, but their effectiveness relies on a proper dataset. The future scope of this study is to create a standard dataset based on the standard dictionary.

Cite this paper

Indian Sign Language Recognition System: Approaches And Challenges. (2022, February 18). Edubirdie.
