Arabic Deaf Sign Recognition with Deep Learning


Abstract

Sign language, or so-called hand gestures, is one of the principal means of communication both among Deaf people themselves and between Deaf and hearing people. In Arab society, only deaf people and specialists can deal with Arabic sign language, which keeps the deaf community narrow and makes communication with hearing people difficult. Moreover, the problem of Arabic sign language recognition (ArSLR) has only recently received attention, which emphasizes the need to investigate new approaches to it. This paper proposes a novel ArSLR scheme based on an unsupervised deep learning algorithm, specifically a deep belief network (DBN) applied directly to tiny images, to recognize and classify Arabic alphabetical letters. Deep learning extracts the most important, sparsely represented features and plays an important role in simplifying the overall recognition task. In total, around 6,000 samples of the 28 Arabic alphabetic signs were used, after resizing and normalization, for feature extraction. Classification was carried out with a simple classifier, softmax regression, and achieved an overall accuracy of 95.6%, showing the high reliability of DBN-based recognition of Arabic alphabetical characters.

Introduction

The most natural ways for human beings to communicate with each other are voice, gestures, and human-machine interfaces. The last method is still very primitive and forces the user to adapt to the machine's requirements. Moreover, the use of voice signals to communicate with hearing-impaired people is impossible or at least undesirable. Hand gestures, however, are desirable and can be used to communicate with deaf people in their daily lives. It is well known that sign language, the language of deaf people, relies on body movements, particularly of the hands and arms, or any other visual-manual means to convey meaning, and that it typically differs from one language to another and from one country to another.


In Arab society, the community of Arab deaf people is narrow and very limited, since only specialists deal with them. It has moreover been shown that ArSL is the most difficult recognition task among foreign sign languages, owing to its unique structure and complex grammar [8]. Developing recognition systems for ArSL therefore remains an open question for researchers. Furthermore, most existing ArSLR systems have focused on recognizing the Arabic alphabets, shown in figure 1, as we also propose in this work.

Generally, ArSLR is performed in two main phases: detection and classification. In the detection phase, each captured image is pre-processed and enhanced, and the Region of Interest (ROI) is then identified using a segmentation algorithm. The output of the segmentation process is used to perform the classification. Indeed, the accuracy and speed of detection play an important role in obtaining an accurate and fast recognition process. In the recognition phase, a set of features is extracted from each segmented hand sign and then used to perform the recognition; these features serve as a reference for distinguishing among the different signs. A minimal sketch of such a detection stage is given below.
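This work does not commit to a particular detection method, so the following is purely an illustration of a generic detection stage in Python with OpenCV; the YCrCb skin-color thresholds and the largest-contour heuristic are assumptions for the example, not the method used in this paper.

```python
import cv2
import numpy as np

def extract_hand_roi(image_bgr):
    """Illustrative detection stage: segment a skin-colored region and crop
    its bounding box. The YCrCb thresholds below are a common heuristic and
    are assumed values, not this paper's method."""
    ycrcb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2YCrCb)
    # Rough skin-color mask in YCrCb space (assumed thresholds).
    mask = cv2.inRange(ycrcb, np.array([0, 133, 77]), np.array([255, 173, 127]))
    mask = cv2.medianBlur(mask, 5)  # suppress salt-and-pepper noise
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    # Assume the largest skin-colored blob is the signing hand.
    hand = max(contours, key=cv2.contourArea)
    x, y, w, h = cv2.boundingRect(hand)
    return image_bgr[y:y + h, x:x + w]
```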

ArSLR systems have only recently received attention; see for example [4]–[6]. Investigating and developing new ArSLR models is therefore important. This paper proposes a new Arabic sign recognition system based on deep feature extraction followed by a simple linear classifier.

The rest of the paper is organized as follows. Section 2 presents an overview of related work. Section 3 presents our proposed algorithm for the recognition of Arabic alphabetical sign gestures. Section 4 details the experimental results. Conclusions and future work are presented in section 5.

Related Works

Generally, sign language recognition systems for American, British, Indian, Chinese, Turkish, and many other international sign languages have received much attention compared to Arabic sign language. Developing an Arabic sign language recognition (ArSLR) system is therefore needed. A review of recent developments in sign language recognition can be found in [1]–[3].

Although most of the proposed approaches to ArSLR have been sensor-based techniques [Refs:12-15 Ohood’s thesis], image-based ArSLR techniques have recently been investigated as well; see [Refs] for instance. The task of ArSLR usually requires first producing an appropriate code for the initial data and second using this code to classify and learn the alphabets.

The sensor-based model usually employs sensors attached to a hand glove; look-up table software is usually provided with the glove and used for hand gesture recognition. Some recently developed sensor-based models can be seen in [[12]–[15]:Ohood’s thesis], for instance. The vision-based model, which usually uses video cameras to capture the movement of the hand, has also been used to address the problem of ArSLR. However, image-based techniques exhibit a number of challenges, including lighting conditions, image background, face and hand segmentation, and different types of noise.

Different classification approaches (generative and discriminative) have already been developed and used for ArSLR; see for instance [33]–[36]. In particular, the authors in [34] developed a neuro-fuzzy system comprising five main stages: image acquisition, filtering, segmentation, hand outline detection, and feature extraction. The experiment considered the use of the bare hand and achieved a hit rate of 93.6%. The author in [8] also introduced an automatic recognition system for Arabic sign language letters, in which Hu's moment invariants are extracted as features and fed to an SVM for classification. A correct classification rate of 87% was achieved.

The authors in [11] used a polynomial classifier to recognize alphabet signs in a glove-based experiment with different colors for the fingertips and wrist region. Lengths, angles, and other geometric measures were considered as features. The hit rate was about 93.4% over 42 gestures, but with only about 200 samples. In [12], the authors proposed recurrent neural networks for alphabet recognition. Two different signers were used to build a database of 900 samples representing 30 different gestures, with colored gloves similar to those in [11]. This model achieved an accuracy of 89.7%, while a fully recurrent network improved the accuracy to 95.1%.

Gesture recognition systems are therefore either glove-based, relying on sensors to collect data, or free-hand based, if no gloves or sensors are used. We have seen that most learning and recognition methods have focused on neural networks, K-nearest neighbors, support vector machines, and hidden Markov models. The recognized data have to be classified into classes and labeled with a specific gesture using a simple classifier such as softmax regression. In the following section, we introduce a detailed description of our proposed model.

Model Description

The general methodology of this research includes three main stages (see figure 1): 1) image pre-processing, 2) unsupervised feature space construction, and 3) sign language recognition. The first two steps of the proposed model were investigated in our previously published work [Hasasneh & Sameh 2017]. Briefly, the typical input image for DBN training has to be small (approximately 1000 pixels) and normalized or whitened in order to extract features with higher-order statistics [Hasasneh, Frenoux, Tarroux, ICINCO 2012]. The initial data are therefore first resized to tiny images (42×24 pixels) and then normalized using a local normalization technique: subtracting the mean and dividing by the standard deviation of the image pixels. In one of our previous works, we showed that data whitening completely suppresses the low-frequency information, which carries important information for the feature extraction and classification processes [Hasasneh, Frenoux, Tarroux, 2012]. Data normalization, by contrast, corresponds to local brightness and contrast normalization; it does not suppress the low frequencies and thus extracts better features covering a wide spectrum of spatial frequencies, leading to more accurate classification results [Hasasneh, Frenoux, Tarroux, 2012].
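As a minimal sketch of the preprocessing just described, the following Python function resizes a grayscale sign image to a 42×24 tiny image (42×24 = 1008 pixels, matching the DBN input size reported below) and applies the per-image local normalization; the function name and the use of PIL are our own assumptions, not part of the original pipeline.

```python
import numpy as np
from PIL import Image

def preprocess(image, size=(42, 24), eps=1e-8):
    """Resize a grayscale (uint8) sign image to a tiny image and apply the
    local normalization described above: subtract the image mean and divide
    by its standard deviation (per-image brightness/contrast normalization)."""
    tiny = np.asarray(Image.fromarray(image).resize(size), dtype=np.float64)
    tiny = tiny.flatten()  # 42 * 24 = 1008-dimensional input vector
    return (tiny - tiny.mean()) / (tiny.std() + eps)
```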

After preprocessing, the normalized data were used to train the first Restricted Boltzmann Machine (RBM) layer; a detailed description of this model, its training parameters, and the training protocol can be found in our previous work [Hasasneh & Sameh 2017].

Using an unsupervised learning strategy, the model parameters are learned by training the first RBM layer with the contrastive divergence learning technique. The network has converged once the difference between the data statistics and the statistics of the representation generated by Gibbs sampling approaches zero, the training dataset being fed to the network over the epochs. After the network has converged, a simple classifier, such as softmax, is used to perform the classification on new samples from the validation dataset.
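For concreteness, here is a compact sketch of a single contrastive divergence (CD-1) update for a binary RBM, written in plain NumPy. The learning rate and the binary visible units are simplifying assumptions for illustration (normalized real-valued pixels would more naturally use Gaussian visible units); the actual training protocol is the one described in [Hasasneh & Sameh 2017].

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, b_vis, b_hid, v0, lr=0.01):
    """One CD-1 step for a binary RBM (assumed hyperparameters).
    v0: batch of visible vectors, shape (batch, n_visible)."""
    # Positive phase: sample hidden units given the data.
    h0_prob = sigmoid(v0 @ W + b_hid)
    h0 = (np.random.rand(*h0_prob.shape) < h0_prob).astype(float)
    # Negative phase: one step of Gibbs sampling (reconstruction).
    v1_prob = sigmoid(h0 @ W.T + b_vis)
    h1_prob = sigmoid(v1_prob @ W + b_hid)
    # Contrastive divergence approximation of the gradient.
    W += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / v0.shape[0]
    b_vis += lr * (v0 - v1_prob).mean(axis=0)
    b_hid += lr * (h0_prob - h1_prob).mean(axis=0)
    # Reconstruction error as a rough convergence monitor.
    return np.mean((v0 - v1_prob) ** 2)
```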

It has been shown in many recent studies [Refs] that the deep learning approach plays an important role in:

  • Reducing the dimensionality of the data by retaining only the most significant features to represent an object, which speeds up subsequent tasks such as classification.
  • Improving the linear separability of the 28 Arabic letter signs, which simplifies the overall classification process.

Therefore, the use of softmax regression, a linear classifier, was based on the assumption that DBNs render the data linearly separable, as stated in many studies [Refs]. To test this hypothesis, a non-linear classification algorithm, such as a support vector machine, will also be applied in the classification phase. A sketch of the softmax stage is shown below.
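As an illustration of this final stage, the sketch below fits a softmax (multinomial logistic regression) classifier on DBN feature codes using scikit-learn; the stand-in random data and the train/validation split are assumptions chosen only to match the dimensions reported in this paper (252-dimensional codes, 28 classes, ~6,000 samples).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-in data with the shapes used in this paper; replace X and y
# with the real DBN codes and letter labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(6000, 252))       # 252-dimensional DBN feature codes
y = rng.integers(0, 28, size=6000)     # 28 Arabic letter classes

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
clf = LogisticRegression(max_iter=1000)  # lbfgs solver -> multinomial softmax
clf.fit(X_train, y_train)
print("validation accuracy:", clf.score(X_val, y_val))
```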

Results

Several preliminary empirical tests showed that the optimal DBN structure, in terms of final classification score, is 1008-1008-252. After coding the initial dataset with the features extracted by the first RBM, the second RBM layer was trained on the coded images. The second RBM layer forces the network to learn linear combinations of the first-layer features, which partially correspond to larger structures of the hand shapes; a sketch of this layer-wise stacking is given below.
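The following sketch illustrates this greedy layer-wise stacking, reusing the `sigmoid` and `cd1_update` helpers from the earlier RBM sketch; the 1008-1008-252 layer sizes come from the text, while the epoch, batch, and learning-rate values are assumed.

```python
import numpy as np

def train_rbm(data, n_hidden, epochs=30, lr=0.01, batch=100):
    """Train one RBM layer with CD-1 (epochs/lr/batch are assumed values).
    Reuses cd1_update from the earlier sketch."""
    n_visible = data.shape[1]
    W = 0.01 * np.random.randn(n_visible, n_hidden)
    b_vis, b_hid = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        for i in range(0, len(data), batch):
            cd1_update(W, b_vis, b_hid, data[i:i + batch], lr)
    return W, b_hid

# Greedy layer-wise stacking for the 1008-1008-252 structure reported above.
# tiny_images: matrix of preprocessed 1008-pixel vectors (see preprocessing sketch).
W1, c1 = train_rbm(tiny_images, 1008)    # first RBM on the normalized tiny images
codes1 = sigmoid(tiny_images @ W1 + c1)  # code the dataset with the first layer
W2, c2 = train_rbm(codes1, 252)          # second RBM on the first-layer codes
dbn_codes = sigmoid(codes1 @ W2 + c2)    # final 252-dimensional feature codes
```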

We have seen that the extracted features are sparsely represented and localized; they capture small parts of the hand, such as finger edges and hand borders and shapes.

We have also seen in our previous work that DBNs extract interesting features representing most of the 28 Arabic letter signs [Hasasneh & Sameh 2017].
