Abstract
The various methods humans have at their disposal to communicate with one another are what set them apart from the rest of the animal kingdom. Communication is also an integral part of daily life and of the exchange of ideas. Sign language therefore holds just as much importance for its users as spoken language does for people with the gift of speech. Those who rely solely on sign language and those unable to interpret it invariably get lost in translation. The result is a communication gap whose ripple effects hinder the progress of society as a whole. The lack of a suitable communication platform makes even simple everyday activities unnecessarily difficult, and the limitations placed on such people, owing to their inability to put forth their thoughts and interpret those of others, hamper their progress drastically. With this perspective, we set out to create a system that helps people translate American Sign Language, whether out of necessity or passion.
Introduction
A growing number of students have been showing interest in learning a sign language, presumably out of a willingness to communicate with the speech- and hearing-impaired. Sign language is an effective means of communication for people with impaired hearing and speech. However, it is always difficult for a person with no knowledge of sign language to interact with this section of society. The absence of a convenient mechanism for conveying thoughts and ideas hinders daily activities, so a handy interpreter is essential to facilitate two-way communication. People unable to use traditional means of communication should not be disregarded merely for this lack of ability, and it should never become a reason for them to be unable to convey their thoughts to the general populace.
The experiments in [1] were carried out on a set of five kinds of digit images and were also evaluated on the publicly available Cambridge Hand Gesture Data (CHGD) set. The experiments are conducted in two parts: one to evaluate the preprocessing and the other to measure CNN accuracy. Original and RGB images are used to test the former, while SIFT with SVM serves as the baseline against which the CNN accuracy is compared. The paper concludes that the proposed model meets the expected standards of robustness.
The study presented in [2] proposes a model to recognise American Sign Language. The aim is to correctly identify gestures and classify them as letters of the ASL alphabet.
For preprocessing, a Region of Interest (ROI) is selected, the background is subtracted, and a grayscale image is created after noise removal. A deep Convolutional Neural Network is used to extract a feature vector from the video frames, which is then stored in a file. The “AlexNet” architecture is proposed for the CNN, with five convolutional layers and three fully connected layers. Layer ‘fc7’ is used for feature extraction, with the deeper layers providing a richer image representation. Since an SVM can only classify into two sets, a multi-class SVM (MCSVM) built from kernel functions is used for the twenty-six letters of ASL. The dataset consists of the twenty-six letters signed by three different persons; a hundred and twenty images per person were considered, with a 70-30 split between training and test sets. An accuracy of 94.57% was achieved with this model.
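As a rough illustration of this kind of pipeline, the sketch below pulls fc7-style features from a pretrained AlexNet and feeds them to a kernel multi-class SVM. The use of torchvision and scikit-learn is an assumption (the authors of [2] do not name their framework), and the image paths and labels are hypothetical placeholders.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image
from sklearn.svm import SVC

# Pretrained AlexNet; keep the classifier only up to the second fully
# connected layer (the Caffe-era "fc7"), yielding 4096-dimensional features.
alexnet = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
alexnet.classifier = torch.nn.Sequential(*list(alexnet.classifier.children())[:6])
alexnet.eval()

preprocess = T.Compose([
    T.Resize((224, 224)),  # AlexNet's expected input size
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def fc7_features(image_path):
    """Return the fc7 feature vector for one preprocessed gesture image."""
    img = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        return alexnet(preprocess(img).unsqueeze(0)).squeeze(0).numpy()

# Hypothetical training data: ROI-cropped gesture images and letter labels.
train_paths = ["gestures/A_001.png", "gestures/B_001.png"]   # placeholders
train_labels = ["A", "B"]

X_train = [fc7_features(p) for p in train_paths]
clf = SVC(kernel="rbf")   # kernel SVM; multi-class handling is built in
clf.fit(X_train, train_labels)
```

With features precomputed this way, the SVM step is cheap to retrain, which is one reason a fixed pretrained feature extractor plus a classical classifier is a common baseline.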
The research in [3] discusses recent advancements and technologies that have surfaced for sign language recognition in the field of deep learning. The paper mentions feature-extraction techniques used for sign language detection such as SIFT, HOG and STIP for 2D, and Random Occupancy Patterns (ROP) and Space-Time Occupancy Patterns (STOP) for 3D/4D. The classification algorithms considered are Support Vector Machines (SVMs) and Hidden Markov Models (HMMs). Two types of sensor technologies are covered: accelerometers, digital cameras and data gloves for touch-based recognition, which require the user to wear a device with cables attached to the system and make recognition a tedious affair; and touchless sensors such as Kinect and Google Tango, which provide depth maps but are limited by colour separation and accuracy. The major datasets available for ASL are reviewed: there are seven important ones, and their contents, training/test splits, sizes and availability are detailed. The paper then presents a detailed study of eight models proposed or implemented recently, from 2014 to 2017, comparing their accuracy rates and performance and giving a comprehensive description of the technologies used.
A model for recognising sign language is proposed in [4]. The Softmax regression function, the Haar cascade algorithm (executed with the OpenCV library), neural networks, and the Keras and TensorFlow Python libraries are explained before the implementation of the system is described. The datasets to be used for the project are assessed. For implementation, the first step is to preprocess the images for the CNN and bring them to a 1:1 aspect ratio; this is done using ImageMagick, an open-source library for image manipulation. The CNN model for the system has five layers, is built with the Sequential API provided by Keras, and is trained with the Stochastic Gradient Descent (SGD) optimizer.
Training starts at a learning rate of 0.01, which is adjusted in steps as training proceeds with the SGD optimizer. Twenty percent of the dataset was used for validation, with the goal of achieving an accuracy of 98 percent. Convex hull detection and background subtraction are used to isolate the hand gestures, and the preprocessed images are given to the CNN in real time for classification.
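A minimal sketch of such a Keras pipeline is shown below. The filter counts, input shape (64x64 grayscale, as used later in this work) and 26 output classes are assumptions, since [4] does not publish the exact layer configuration; only the Sequential API, SGD optimizer, 0.01 learning rate and 20% validation split come from the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers

# Illustrative five-layer CNN; layer sizes are assumptions.
model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(26, activation="softmax"),   # one class per ASL letter
])

# SGD optimizer with the 0.01 starting learning rate described in [4].
model.compile(optimizer=optimizers.SGD(learning_rate=0.01),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# x_train: (N, 64, 64, 1) preprocessed masks, y_train: integer labels 0-25.
# model.fit(x_train, y_train, epochs=20, validation_split=0.2)
```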
A system to recognise two-hand gestures in real time with Hidden Markov Models is presented in [5]. For image processing, face detection is first performed with Haar-like features. Background subtraction and skin detection in the YCbCr colour space then follow, with which the face is separated out, only the hands are retained, and the palms are localised. The movement of the hands is tracked by the condensation algorithm. The gesture path and orientation are encoded with the help of vector quantization, which allows the features to be extracted and the gestures understood. The HMM model is trained for eight gestures with fifty video samples from four people. The system yielded an average accuracy of 96.25%.
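For intuition, the sketch below trains one HMM per gesture and classifies a new sequence by the highest log-likelihood. It is a simplification of [5]: it uses the hmmlearn library (an assumption, the authors do not name a toolkit) and Gaussian emissions over continuous hand-trajectory features instead of the vector-quantised symbols described in the paper.

```python
import numpy as np
from hmmlearn import hmm

# One HMM per gesture class, trained on sequences of hand-trajectory
# features (e.g. 2-D palm centroid per frame).
def train_gesture_models(sequences_by_gesture, n_states=5):
    models = {}
    for gesture, sequences in sequences_by_gesture.items():
        X = np.concatenate(sequences)            # stack all frames
        lengths = [len(s) for s in sequences]    # per-sequence lengths
        m = hmm.GaussianHMM(n_components=n_states,
                            covariance_type="diag", n_iter=100)
        m.fit(X, lengths)
        models[gesture] = m
    return models

def classify(models, sequence):
    # Pick the gesture whose HMM assigns the highest log-likelihood.
    return max(models, key=lambda g: models[g].score(sequence))
```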
Proposed System
Problem statement
Everyday tasks prove to be monumental for people with hearing and speaking disabilities. NGOs and communities are putting in a lot of effort to reach out and help them, but this has not been easy owing to the very limited means of communication available. We aim to tackle this problem by devising a system that translates sign language, so that the presence of translators is no longer mandatory for such people to put their ideas forth, saving the time and resources spent on making translators available.
Algorithm
- Step 1: The captured input frame is cropped to the Region of Interest (ROI), which focuses on the hand.
- Step 2: The suitable lower and upper bound HSV values for skin colour are identified using the trackbars.
- Step 3: The cropped ROI is then converted into a binary mask by HSV thresholding over the specified range.
- Step 4: The masked image is then resized to 64x64 pixels and given as input to the model, which predicts and maps the corresponding alphabet letter (a minimal sketch of these steps follows this list).
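The following OpenCV sketch illustrates the four steps above on a webcam feed. The ROI coordinates, HSV skin-colour bounds and the model file name ("asl_cnn.h5") are placeholders; in practice the bounds are tuned with the trackbars mentioned in Step 2.

```python
import cv2
import numpy as np
import string
import tensorflow as tf

# Placeholder skin-colour bounds; tune these with trackbars in practice.
LOWER_HSV = np.array([0, 40, 60])
UPPER_HSV = np.array([25, 255, 255])
LETTERS = list(string.ascii_uppercase)            # 26 ASL letters

model = tf.keras.models.load_model("asl_cnn.h5")  # hypothetical model file

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    roi = frame[100:300, 100:300]                  # Step 1: crop the ROI
    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)     # Steps 2-3: HSV mask
    mask = cv2.inRange(hsv, LOWER_HSV, UPPER_HSV)
    resized = cv2.resize(mask, (64, 64)) / 255.0   # Step 4: resize and predict
    pred = model.predict(resized.reshape(1, 64, 64, 1), verbose=0)
    letter = LETTERS[int(np.argmax(pred))]         # map to the alphabet
    cv2.putText(frame, letter, (100, 90), cv2.FONT_HERSHEY_SIMPLEX,
                2, (0, 255, 0), 3)
    cv2.imshow("ASL translator", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```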
Implementation
The system can be improved by supporting a greater number of the sign languages prevalent in our society. This would incorporate a broader section of people and extend its reach to different parts of the world. Furthermore, a mobile version of the system would enable users to access it on the go, making it accessible regardless of location.
Conclusion
A machine-learning-based translator was developed that recognises letters of American Sign Language. The CNN model was trained on a set of 52,000 images belonging to 26 unique classes, achieving a training accuracy of 97% and a validation accuracy of 95.3%. Recognition of the sign language is performed on the HSV conversion of the captured BGR images, and the testing accuracy of the model is 85%.
References
- Naresh Kumar, “Sign Language Recognition for Hearing Impaired People based on Hands Symbols Classification”, International Conference on Computing, Communication and Automation, 2017.
- Lihong Zheng, Bin Liang, Ailian Jiang, “Recent Advances of Deep Learning for Sign Language Recognition”, School of Computing and Mathematics, Charles Sturt University, Wagga Wagga, Australia.
- Shashank Salian, Indu Dokare, Dhiren Serai, “Proposed System for Sign Language Recognition”, International Conference on Computation of Power, Energy, Information and Communication, 2017.
- Meenakshi Panwar, Pawan Singh Mehra, “Hand Gesture Recognition for Human Computer Interaction”, International Conference on Image Information Processing, 2011.
- Soeb Hussain, Rupal Saxena, Xie Han, Jameel Ahmed Khan, Prof. Hyunchul Shin, “Hand Gesture Recognition Using Deep Learning”, Electronics and Communication Engineering, Hanyang University, Sangnok-gu, Korea
- Md Rashedul Islam, Ummey Kulsum Mitu, “Hand Gesture Feature Extraction Using Deep Convolutional Neural Network for Recognizing American Sign Language”, 4th International Conference on Frontiers of Signal Processing, 2018.
- Tanatcha Chaikhumpha, Phattanaphong Chomphuwiset, “Real-time Two Hand Gesture Recognition with Condensation and Hidden Markov Models”, Computer Science Department, Mahasarakham University, Mahasarakham, Thailand.
- Xing Yingxin, Li Jinghua, Wang Lichun, Kong Dehui, “A Robust Hand Gesture Recognition Method Via Convolutional Neural Network”, 6th International Conference on Digital Home, 2016.