Parkinson’s Disease Detection Using a Support Vector Machine
In this tutorial, we will predict Parkinson’s disease with a support vector machine (SVM) model in Python.
You can learn more about the support vector machine model here.
Parkinson’s disease is a progressive nervous system disorder that affects movement, leading to shaking, stiffness, and difficulty with walking, balance, and coordination. Symptoms usually begin gradually and get worse over time.
You can download the dataset here: parkin data.
So let’s implement the code.
Import the necessary Python libraries.
# Importing the dependencies
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn import svm
from sklearn.metrics import accuracy_score
The first step is data collection.
1. Load the data.
# Load the data
parkin_data = pd.read_csv("C:\\Users\\users drive\\OneDrive\\Desktop\\parkin_data.csv")
To confirm that the dataset loaded correctly, print the first five rows with the head method.
# Printing the first five rows of the dataframe
parkin_data.head()
Now print the last five rows by using the tail method.
# Printing the last five rows of the dataframe
parkin_data.tail()
The status column holds only two values: 1 means the person has Parkinson’s, and 0 means the person does not.
Next, check how many rows and columns the dataset contains.
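The row and column count can be read from the DataFrame's `shape` attribute. The sketch below uses a tiny hypothetical stand-in for `parkin_data` (the column names `feature_a` and `feature_b` are made up for illustration), so it runs without the CSV file.

```python
import pandas as pd

# Hypothetical miniature stand-in for parkin_data; column names are illustrative only
parkin_data = pd.DataFrame({
    "name": ["s01", "s02", "s03", "s04"],
    "feature_a": [119.9, 122.4, 197.0, 199.2],
    "feature_b": [0.0078, 0.0097, 0.0029, 0.0024],
    "status": [1, 1, 0, 0],
})

# shape is an attribute (no parentheses) that holds (rows, columns)
n_rows, n_cols = parkin_data.shape
print(n_rows, n_cols)
```

On the real dataset the same call would be `parkin_data.shape`.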
Now let’s check the basic information for the data.
# Getting more information about the dataset
parkin_data.info()
The values are stored as floats and integers, so the algorithm can work with the numbers directly.
Check for missing values in each column.
No missing values are present in the dataset.
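A common way to run this check is `isnull().sum()`, which counts missing entries per column. The sketch below uses a small hypothetical DataFrame with one missing value planted on purpose, so the nonzero count is visible; on the dataset described here, every count would be 0.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one deliberately missing value in feature_a
toy = pd.DataFrame({
    "feature_a": [119.9, np.nan, 197.0, 199.2],
    "feature_b": [0.0078, 0.0097, 0.0029, 0.0024],
    "status": [1, 1, 0, 0],
})

# isnull() marks missing cells as True; sum() counts them column by column
missing_per_column = toy.isnull().sum()
print(missing_per_column)
```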
Let’s get some statistical measures of the dataset.
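One way to get these measures is the DataFrame `describe` method, which reports count, mean, standard deviation, minimum, quartiles, and maximum for each numeric column. This is a minimal sketch on hypothetical toy values, not the tutorial's actual data.

```python
import pandas as pd

# Hypothetical toy data; on the real dataset this would be parkin_data.describe()
toy = pd.DataFrame({
    "feature_a": [1.0, 2.0, 3.0, 4.0],
    "status": [1, 1, 0, 0],
})

# describe() summarizes every numeric column in one table
stats = toy.describe()
print(stats)
```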
DISTRIBUTION OF TARGET VALUES
Let’s check how many people in the dataset are affected by Parkinson’s, that is, how many rows have status 0 and how many have status 1.
1    147
0     48
Name: status, dtype: int64
Here 0 means the person is not affected by Parkinson’s disease.
1 means the person is affected by Parkinson’s disease.
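Counts like these are typically produced with `value_counts`. The sketch below rebuilds a status column from the two counts shown above (147 ones and 48 zeros) instead of reading the CSV, and also computes the class shares, which shows that roughly three quarters of the records are positive.

```python
import pandas as pd

# Status column rebuilt from the counts reported above (147 positive, 48 negative)
status = pd.Series([1] * 147 + [0] * 48, name="status")

# Absolute count per class
counts = status.value_counts()
print(counts)

# Share of each class; 147 / 195 is about 0.754
shares = status.value_counts(normalize=True)
print(shares.round(3))
```

On the real DataFrame the equivalent call would be `parkin_data['status'].value_counts()`.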
Group the data by the target variable.
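A minimal sketch of this grouping, assuming the goal is to compare per-class feature means. The toy DataFrame and its column names are hypothetical; `numeric_only=True` is used so that a non-numeric column such as an identifier is skipped instead of raising an error.

```python
import pandas as pd

# Hypothetical toy data with an identifier column and one numeric feature
toy = pd.DataFrame({
    "name": ["s01", "s02", "s03", "s04"],
    "feature_a": [1.0, 3.0, 10.0, 20.0],
    "status": [1, 1, 0, 0],
})

# Mean of every numeric feature within each status group
group_means = toy.groupby("status").mean(numeric_only=True)
print(group_means)
```

Large differences between the two rows of such a table hint at features that separate the classes well.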
In the data preprocessing step, separate the status column from the rest of the data, because status is the target we want to predict: whether the person is affected by Parkinson’s disease or not.
Separate the features and the target.
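The separation can be sketched as follows. It assumes the DataFrame has an identifier column called `name` that is dropped along with `status`; that column name is an assumption, since the text only shows that the feature matrix ends up with 22 columns.

```python
import pandas as pd

# Hypothetical toy data; "name", "feature_a" and "feature_b" are illustrative column names
toy = pd.DataFrame({
    "name": ["s01", "s02", "s03", "s04"],
    "feature_a": [119.9, 122.4, 197.0, 199.2],
    "feature_b": [0.0078, 0.0097, 0.0029, 0.0024],
    "status": [1, 1, 0, 0],
})

# X keeps only the numeric features; Y is the target column
X = toy.drop(columns=["name", "status"])
Y = toy["status"]

print(X.shape, Y.shape)
```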
Let’s print the X values.
Let’s print the Y values.
0      1
1      1
2      1
3      1
4      1
      ..
190    0
191    0
192    0
193    0
194    0
Name: status, Length: 195, dtype: int64
TRAINING AND TEST DATA
After splitting the data into training and test sets, let’s check the size of the full, training, and test feature sets.
(195, 22) (156, 22) (39, 22)
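The shapes above are consistent with an 80/20 split, since 39 is 20 percent of 195. The sketch below reproduces them on random stand-in data of the same size. `test_size=0.2` is inferred from those shapes, and `random_state=2` is an arbitrary assumption used only to make the split repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Random stand-in with the same dimensions as the tutorial's data (195 rows, 22 features)
rng = np.random.default_rng(0)
X = rng.normal(size=(195, 22))
Y = np.array([1] * 147 + [0] * 48)

# Hold out 20 percent of the rows for testing; random_state is an assumed value
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=2
)

print(X.shape, X_train.shape, X_test.shape)
```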
Let’s standardize the feature values in the training and test sets.
Let’s print the X_train and X_test values.
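A minimal sketch of this step with `StandardScaler`, on hypothetical toy values. The scaler is fitted on the training data only and then applied to both sets, so that no information from the test set leaks into the scaling; whether the original code fitted it exactly this way is an assumption.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy features; the single test row equals the training mean
X_train = np.array([[100.0, 0.001], [150.0, 0.003], [200.0, 0.005]])
X_test = np.array([[150.0, 0.003]])

scaler = StandardScaler()
scaler.fit(X_train)  # learn per-column mean and standard deviation from training data

X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)  # reuse the training statistics

print(X_train)
print(X_test)
```

After scaling, each training column has mean 0 and unit standard deviation.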
Now let’s build the support vector machine model.
Train the model.
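Building and training the classifier can be sketched with `svm.SVC`. The linear kernel is an assumption, as the text does not say which kernel was used, and the four training points below are hypothetical, chosen to be trivially separable.

```python
import numpy as np
from sklearn import svm

# Hypothetical, linearly separable toy data standing in for the scaled training set
X_train = np.array([[-2.0, -2.0], [-1.0, -1.0], [1.0, 1.0], [2.0, 2.0]])
Y_train = np.array([0, 0, 1, 1])

# kernel="linear" is an assumed choice, not confirmed by the tutorial
model = svm.SVC(kernel="linear")

# Fit the classifier on the training features and labels
model.fit(X_train, Y_train)

print(model.predict(X_train))
```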
Let’s find the accuracy score.
# Accuracy score on training data
X_train_prediction = model.predict(X_train)
training_data_accuracy = accuracy_score(Y_train, X_train_prediction)
Let’s print the accuracy on the training data.
print('Accuracy score :',training_data_accuracy)
Accuracy score : 0.8717948717948718
The training accuracy is about 87 percent, so the model fits the training data reasonably well.
Let’s check the accuracy score on the test data.
# Accuracy score on test data
X_test_prediction = model.predict(X_test)
test_data_accuracy = accuracy_score(Y_test, X_test_prediction)
print('Accuracy score :',test_data_accuracy)
Accuracy score : 0.7948717948717948
MAKING A PREDICTIVE SYSTEM
Instead of predicting on the entire dataset, we can run the model on a single row of data.
This is useful for checking one record at a time.
# Building a predictive system
input_data = (116.67600, 137.87100, 111.36600, 0.00997, 0.00009, 0.00502, 0.00698, 0.01505, 0.05492, 0.51700, 0.02924, 0.04005, 0.03772, 0.08771, 0.01353, 20.64400, 0.434969, 0.819235, -4.117501, 0.334147, 2.405554, 0.368975)

# Change the input data to a NumPy array
input_data_as_numpy_array = np.asarray(input_data)

# Reshape to one row, because we are predicting for a single record
input_data_reshaped = input_data_as_numpy_array.reshape(1, -1)

# Standardize the data with the already fitted scaler
std_data = scaler.transform(input_data_reshaped)

prediction = model.predict(std_data)
print(prediction)

# prediction is a one-element array, so compare its first entry
if prediction[0] == 0:
    print("The person does not have Parkinsons")
else:
    print("The person has Parkinsons")
The person has Parkinsons
Let’s check the output for another input.
# For another output
input_data = (198.38300, 215.20300, 193.10400, 0.00212, 0.00001, 0.00113, 0.00135, 0.00339, 0.01263, 0.11100, 0.00640, 0.00825, 0.00951, 0.01919, 0.00119, 30.77500, 0.465946, 0.738703, -7.067931, 0.175181, 1.512275, 0.096320)

# Change the input data to a NumPy array
input_data_as_numpy_array = np.asarray(input_data)

# Reshape to one row, because we are predicting for a single record
input_data_reshaped = input_data_as_numpy_array.reshape(1, -1)

# Standardize the data with the already fitted scaler
std_data = scaler.transform(input_data_reshaped)

prediction = model.predict(std_data)
print(prediction)

# prediction is a one-element array, so compare its first entry
if prediction[0] == 0:
    print("The person does not have Parkinsons")
else:
    print("The person has Parkinsons")
The person has Parkinsons