Predicting Parkinson’s disease (PD) at an early age has been a challenging task for researchers because the symptoms of the disease usually appear in middle and late middle age. Many symptoms are associated with Parkinson’s disease, but this article focuses on the speech articulation difficulties of people affected by PD and tries to formulate a model on the basis of three data mining methods. These three methods are taken from three different domains of data mining: tree classifiers, statistical classifiers, and support vector machines. We also aim to compare parametric and non-parametric models on the collected Parkinson’s disease dataset. The data are tested with both kinds of model to determine which provides the higher classification accuracy. For parametric modelling, Logistic Regression is used to classify the Parkinson’s data. For non-parametric modelling, the K-Nearest Neighbours and Random Forest algorithms are used to classify the training and test data. The classification accuracy of the parametric and non-parametric models is then computed from the classified data, and the two kinds of model are compared to evaluate their performance on the Parkinson’s dataset.
Parkinson’s disease is a long-term degenerative disorder of the central nervous system. The cause of Parkinson’s is generally unknown, but it is believed to involve both genetic and environmental factors. More than one million cases per year occur in India. The disease cannot be cured, but treatment may help. Medication can help to control the symptoms of Parkinson’s disease. The symptoms generally emerge slowly over time; the most obvious are shaking, rigidity, slowness of movement and difficulty in walking. For most people, symptoms take years to develop, and they live for years with the disease. Several famous personalities have been affected by Parkinson’s disease. The boxer Muhammad Ali was diagnosed with Parkinson’s disease in 1984, as a result of sustaining several severe head injuries. Pope John Paul II was diagnosed with Parkinson’s disease in 2001. Michael J. Fox was diagnosed with Parkinson’s disease in 1991. Adolf Hitler suffered from Parkinson’s disease; the first symptoms were observed in 1937. Dave Jennings, an American football player, died from complications of Parkinson’s disease in 2013.
The symptoms of PD can be classified into two types: motor and non-motor symptoms. Many people are aware of the motor symptoms because they can be perceived visually. It is now established that there exists a timespan in which the non-motor symptoms can be observed. These symptoms are called dopamine-non-responsive symptoms. They include cognitive impairment, sleep difficulties, loss of sense of smell, constipation, speech and swallowing problems, unexplained pains, drooling and low blood pressure when standing. It must be noted that none of these non-motor symptoms is decisive; however, when these features are used along with other biomarkers from cerebrospinal fluid (CSF) measurements and dopamine transporter imaging, they may help to predict PD.
This work takes into consideration the non-motor symptoms and biomarkers such as cerebrospinal fluid measurements and dopamine transporter imaging. In this paper we follow a similar approach; however, we use different machine learning algorithms that can help to improve the performance of the model and play a vital role in the early prediction of PD, which in turn will help to initiate neuroprotective therapies at the right time.
- A persisting frustration in the diagnosis, treatment, and research of Parkinson’s disease is the lack of an objective measure of the nigrostriatal dopaminergic deficit.
- In particular, we need a tool to monitor the progress of the neuronal degeneration.
- This is very difficult to achieve clinically in Parkinson’s disease because of the complex clinical presentation and the confounding effect of symptomatic therapy.
- Although PET, with markers of presynaptic dopaminergic function such as 6-fluorodopa, is an accepted measuring tool, PET is complex, expensive, cumbersome, and not widely available.
Elbaz A, Bower JH, Peterson BJ, Maraganore DM, McDonnell SK, Ahlskog JE, et al. Survival Study of Parkinson Disease in Olmsted County, Minnesota. Arch Neurol 2003; 60:91‑6.
To compare survival in incident cases of Parkinson disease (PD) with survival in subjects free of PD from the general population. We used the medical records linkage system of the Rochester Epidemiology Project to identify incident cases of PD in Olmsted County, Minnesota, for the period 1976-1995. Cases were matched by age and sex to referent subjects from the same population. For 196 cases and 185 referent subjects, we studied survival between the date of diagnosis of PD (or index date) and death, loss to follow-up, or end of the study (May 1, 2000).
Parkinson J. An essay on the shaking palsy. 1817. J Neuropsychiatry Clin Neurosci 2002; 14:223‑36.
The term Shaking Palsy has been vaguely employed by medical writers in general. By some it has been used to designate ordinary cases of Palsy, in which some slight tremblings have occurred; whilst by others it has been applied to certain anomalous affections, not belonging to Palsy.
Species of tremor, which here occurs, is chosen to furnish the epithet by which this species of Palsy, may be distinguished.
Tanner CM, Ross GW, Jewell SA, Hauser RA, Jankovic J, Factor SA. Occupation and risk of Parkinsonism: A multicenter case‑control study. Arch Neurol 2009;66:1106‑13.
We examined the risk of parkinsonism in occupations (agriculture, education, health care, welding, and mining) and toxicant exposures (solvents and pesticides) putatively associated with parkinsonism. The objective was to investigate occupations, specific job tasks, or exposures and the risk of parkinsonism and its clinical subtypes. The design was case-control, and the setting was eight movement disorders centres in North America.
Marras C, Tanner C. Epidemiology of Parkinson’s Disease. Movement Disorders. In: Watts RL, Koller WC, editors Neurologic Principles and Practice, 2nd ed. New York: The McGraw‑Hill Companies; 2004. p. 177.
Neurological therapeutics: principles and practice is a two volume book consisting of 2874 pages by 345 authors. It is divided into 14 system-based sections that are further divided into 271 subject-based chapters. The chapters are generally short and accessible, making this large book surprisingly practical. Each chapter is formulated to contain sufficient background information to direct treatment decisions. The book works best, therefore, when the diagnosis is established and a review of the issues surrounding a treatment decision is required—a format that allows for daily use.
http://www.rightdiagnosis.com/p/parkinsons_disease/stats‑country.htm. [Last Accessed on 2012 Apr. 7].
Parkinson’s disease (PD) is a degenerative disorder of the central nervous system. It was first described in 1817 by James Parkinson, a British physician who published a paper on what he called ‘the shaking palsy.’ In this paper, he set forth the major symptoms of the disease that would later bear his name. PD belongs to a group of conditions called movement disorders. The four main symptoms are tremor or trembling in hands, arms, legs, jaw, or head; rigidity, or stiffness of the limbs and trunk; bradykinesia, or slowness of movement; and postural instability, or impaired balance.
Dauer W, Przedborski S. Parkinson’s disease: Mechanisms and Models. Neuron 2003; 39: 889‑909.
Parkinson’s disease (PD) results primarily from the death of dopaminergic neurons in the substantia nigra. Current PD medications treat symptoms; none halt or retard dopaminergic neuron degeneration. The main obstacle to developing neuroprotective therapies is a limited understanding of the key molecular events that provoke neurodegeneration. The discovery of PD genes has led to the hypothesis that misfolding of proteins and dysfunction of the ubiquitin-proteasome pathway are pivotal to PD pathogenesis. Previously implicated culprits in PD neurodegeneration, mitochondrial dysfunction and oxidative stress, may also act in part by causing the accumulation of misfolded proteins, in addition to producing other deleterious events in dopaminergic neurons. Neurotoxin-based models (particularly MPTP) have been important in elucidating the molecular cascade of cell death in dopaminergic neurons.
MACHINE LEARNING LANGUAGE
Machine learning algorithms in disease
Machine learning algorithms have a long history in disease diagnosis and prediction. A large number of published papers demonstrate the application of machine learning algorithms in the medical field, for example in the diagnosis, prediction and identification of disease and in survivability analysis. Initially, three branches of machine learning came into view: symbolic learning, statistical methods and neural networks. Symbolic learning was described by Hunt, statistical methods by Nilsson and neural networks by Rosenblatt. The machine learning community has developed a large number of tools that have been widely used to obtain classification models, including medical prognostic models.[16,17] For cancer diagnosis and research, artificial neural network (ANN) and decision tree classifiers have been used, and these methods provided remarkable results. Pendharker applied several data mining methods to diagnose patterns in breast cancer. Delen et al. applied ANN, decision tree and logistic regression methods to predict breast cancer survivability.
Logistic regression and the K‑nearest neighbor model, along with six different machine learning algorithms, have been used to predict pneumonia mortality. Support vector machines (SVMs) have been used for the detection and diagnosis of a wide range of biomedical conditions, such as the detection of oral cancers in optical images, polyps in CT colonography and microcalcifications in mammograms, and the analysis of gene expression measured via microarrays. A study of several machine learning approaches for microcalcification detection has shown that SVMs provide better classification performance than other approaches such as ANNs. Bayesian networks have been applied in biomedicine, especially in probabilistic expert systems for clinical diagnosis and in computational biology, because a Bayesian network is able to deal with biomedical data that are either incomplete or only partially correct. At present, machine learning techniques are also used for detecting and classifying tumors via X‑ray and CRT images and for the classification of malignancies from proteomic and genomic assays. According to PubMed statistics, nearly 1,800 papers on cancer that use machine learning techniques have been published.
In this paper, three different types of classification methods are used, i.e., decision stump (tree classifier), logistic regression (statistical classifier) and sequential minimal optimization (support vector machine).
A tree is a classifier that can be defined as a recursive partition of the dataset. A tree classifier consists of a set of nodes, one of which acts as the root node; every other node has exactly one incoming edge. Nodes with outgoing edges are known as internal nodes, and the remaining nodes, which have no outgoing edges, are known as terminal nodes or leaf nodes.
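A decision stump is the smallest such tree: a root node with a single test and two leaf nodes. The following is a minimal sketch in Python, not the implementation used in this paper. It assumes numeric features and 0/1 labels, and it picks the split that minimises training errors, which is only one of several possible split criteria. The names `fit_stump` and `predict_stump` are hypothetical.

```python
from collections import Counter


def majority(labels):
    """Return the most common label (ties broken by first occurrence)."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(X, y):
    """Find the single (feature, threshold) split with the fewest training errors.

    X is a list of numeric feature rows and y a list of 0/1 labels.
    Returns (feature_index, threshold, left_label, right_label).
    """
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            left_label, right_label = majority(left), majority(right)
            errors = (sum(label != left_label for label in left)
                      + sum(label != right_label for label in right))
            if best is None or errors < best[0]:
                best = (errors, j, t, left_label, right_label)
    if best is None:
        raise ValueError("no feature has two distinct values to split on")
    return best[1:]


def predict_stump(stump, row):
    """Route a row through the root test to one of the two leaves."""
    j, t, left_label, right_label = stump
    return left_label if row[j] <= t else right_label
```

For example, `fit_stump([[0.1, 5.0], [0.2, 1.0], [0.8, 4.0], [0.9, 2.0]], [0, 0, 1, 1])` splits on the first feature, because that feature separates the two classes without error.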
LOGISTIC REGRESSION (LR)
Statistical algorithms include linear discriminant analysis, least mean square quadratic, kernel, logistic regression and k-nearest neighbors. In this paper, logistic regression is used to obtain the desired results. Logistic regression is a statistical classifier that is used for the analysis of data. It is a type of regression used for predicting binary or multi‑class dependent variables. Mathematically, LR models the posterior probabilities Pr(G = k | X = x) as nonlinear functions of x that each range from 0 to 1 and together sum to 1 over the classes.
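One common way to write these posterior probabilities is given below as an illustration. The symbols are assumptions that do not appear in the text above: $K$ is the number of classes, $\beta_{k0}$ an intercept and $\beta_k$ a coefficient vector for class $k$, with class $K$ taken as the reference class.

```latex
\Pr(G = k \mid X = x) =
  \frac{\exp(\beta_{k0} + \beta_k^{\top} x)}
       {1 + \sum_{l=1}^{K-1} \exp(\beta_{l0} + \beta_l^{\top} x)},
  \qquad k = 1, \dots, K-1,
\\[4pt]
\Pr(G = K \mid X = x) =
  \frac{1}{1 + \sum_{l=1}^{K-1} \exp(\beta_{l0} + \beta_l^{\top} x)} .
```

In the two-class case (PD versus healthy) this reduces to the familiar sigmoid form $\Pr(G = 1 \mid X = x) = 1 / \bigl(1 + \exp(-(\beta_0 + \beta^{\top} x))\bigr)$, which lies between 0 and 1 and sums to 1 with the probability of the other class.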
Support vector machine (SVM)
Support vector machines (SVMs) were first used for classification purposes. At present, SVMs are used in a wide range of problems, including pattern recognition, bioinformatics and text categorization. SVM classification is performed by constructing a linear or nonlinear separating surface. Training an SVM, however, requires solving a quadratic optimization problem. A large number of algorithms, such as sequential minimal optimization (SMO) and the nearest point algorithm (NPA), have been proposed to solve this problem.[60,61] Hence, in this paper, the SMO algorithm, which Platt proposed for SVM classifier design, is used to obtain the desired result.
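For reference, the quadratic optimization problem in question is usually stated in its dual form, sketched here under assumed notation that does not appear in the text above: $n$ training pairs $(x_i, y_i)$ with labels $y_i \in \{-1, +1\}$, Lagrange multipliers $\alpha_i$, a kernel function $K$, and a regularization constant $C$.

```latex
\max_{\alpha}\;
  \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
      \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j)
\quad \text{subject to} \quad
  0 \le \alpha_i \le C \;\; (i = 1, \dots, n),
  \qquad \sum_{i=1}^{n} \alpha_i y_i = 0 .
```

SMO solves this problem by repeatedly selecting two multipliers, optimizing them analytically while the others are held fixed, and iterating until the optimality conditions are satisfied. A dataset whose labels are coded as 1 and 0 would first be recoded to +1 and -1 for this formulation.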
System design is the next development stage, in which the overall architecture of the desired system is decided. The system is organized as a set of subsystems interacting with each other. While designing the system as a set of interacting subsystems, the analyst takes care of the specifications observed in system analysis as well as what the end user requires of the new system.
As the basic philosophy of Object-Oriented method of system analysis is to perceive the system as a set of interacting objects, a bigger system may also be seen as a set of interacting smaller subsystems that in turn are composed of a set of interacting objects. While designing the system, the stress lies on the objects comprising the system and not on the processes being carried out in the system as in the case of traditional Waterfall Model where the processes form the important part of the system.
In this paper, we try to develop a prediction model for the identification of Parkinson’s disease. For this purpose, three data mining methods are used: decision stump (tree classifier), logistic regression (statistical classifier) and sequential minimal optimization (support vector machine). The dataset used in this paper is taken from the UCI repository. It is composed of a range of biomedical voice measurements from 31 people, 23 of whom have Parkinson’s disease (PD). Each column in the table is a particular voice measure, and each row corresponds to one of the 195 voice recordings from these individuals. A PD-affected person is represented by the value 1 and a healthy person by 0. To obtain the desired result, 10-fold cross-validation is used with the classifiers discussed above, and three parameters are used to analyze their performance.
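The evaluation procedure described above can be sketched as follows. This is a minimal, library-free illustration in Python, not the code used in this paper. The text does not name the three performance parameters, so accuracy, sensitivity and specificity are assumed here, and the stratified fold assignment is likewise an assumption. The function names `stratified_folds` and `cross_validate` are hypothetical, as are the `fit` and `predict` callables supplied by the caller.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Split row indices into k folds with roughly equal class balance."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    offset = 0
    for cls in sorted(set(labels)):
        idx = [i for i, label in enumerate(labels) if label == cls]
        rng.shuffle(idx)
        # Deal indices round-robin, continuing where the previous class
        # stopped, so that fold sizes differ by at most one.
        for n, i in enumerate(idx):
            folds[(offset + n) % k].append(i)
        offset += len(idx)
    return folds


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Pool predictions over k held-out folds and summarise them.

    fit(train_X, train_y) returns a model; predict(model, row) returns 0 or 1.
    Label 1 (PD) is treated as the positive class.
    """
    tp = tn = fp = fn = 0
    for fold in stratified_folds(y, k, seed):
        held_out = set(fold)
        train_X = [row for i, row in enumerate(X) if i not in held_out]
        train_y = [label for i, label in enumerate(y) if i not in held_out]
        model = fit(train_X, train_y)
        for i in fold:
            pred = predict(model, X[i])
            if y[i] == 1:
                tp += pred == 1
                fn += pred != 1
            else:
                tn += pred == 0
                fp += pred != 0
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }
```

Because each person contributes several recordings, splitting by recording can place the same speaker in both the training and the test folds; assigning folds by person is a stricter alternative.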
- Elbaz A, Bower JH, Peterson BJ, Maraganore DM, McDonnell SK, Ahlskog JE, et al. Survival Study of Parkinson Disease in Olmsted County, Minnesota. Arch Neurol 2003; 60:91‑6.
- Parkinson J. An essay on the shaking palsy. 1817. J Neuropsychiatry Clin Neurosci 2002; 14:223‑36.
- Tanner CM, Ross GW, Jewell SA, Hauser RA, Jankovic J, Factor SA. Occupation and risk of Parkinsonism: A multicenter case‑control study. Arch Neurol 2009;66:1106‑13.
- Marras C, Tanner C. Epidemiology of Parkinson’s Disease. Movement Disorders. In: Watts RL, Koller WC, editors Neurologic Principles and Practice, 2nd ed. New York: The McGraw‑Hill Companies; 2004. p. 177.
- US Census Bureau. US interim projections by age, sex, race, and Hispanic origin: 2000‑2050. Available from: http://www.census.gov/population/www/projections/usinterimproj. [Last Accessed on 2012 Apr. 7].
- Available from: http://www.rightdiagnosis.com/p/parkinsons_disease/stats‑country.htm. [Last Accessed on 2012 Apr. 7].
- Dauer W, Przedborski S. Parkinson’s disease: Mechanisms and Models. Neuron 2003; 39:889‑909.
- Alonso JB, de Leon J, Alonso I, Ferrer MA. Automatic detection of pathologies in the voice by HOS based parameters. EURASIP J Appl Signal Process 2001; 14:275‑84.
- Cnockaert L, Schoentgen J, Auzou P, Ozsancak C, Defebvre L, Grenez F. Low‑frequency vocal modulations in vowels produced by Parkinsonian subjects. Speech Commun 2008;50:288‑300.
- Revett K, Gorunescu F, Mohamed Salem AB. Feature Selection in Parkinson’s disease: A rough Sets approach. Proceedings of the International Multiconference on Computer Science and Information Technology; Oct. 12‑14 Mragowo, Poland 2009;4: p. 425‑8.