Heart Attack Detection Using TensorFlow | Python
In this post, we will predict heart attacks using deep neural networks in Python with the help of TensorFlow and the Keras deep learning API.
To achieve this goal, I am going to use an open-source dataset and I will create a deep neural network model with the help of Keras deep learning API. You can download the dataset from the link Dataset
Download the dataset and look at it carefully. You will find that it contains a categorical target variable in the form of 0's and 1's. Now let's move forward and implement the model with the help of TensorFlow and the Keras deep learning API.
NOTE: Please install all the required libraries on your system if possible. Otherwise, please follow the tutorial with me using Google Colab's runtime environment.
import numpy as np
import pandas as pd
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
In the above two code snippets, you can see that I am importing the required libraries. The two most important ones are Keras and TensorFlow, which will play an important role in building our model for heart attack detection or prediction.
Let’s move forward and load our dataset.
df = pd.read_csv("/content/drive/My Drive/Internship/heart.csv")
df.tail()
The above code loads the dataset into our notebook. I am using tail() to look at the dataset from the bottom.
If you want to see the dataset from the top, you can use head() instead.
Now we will get some information about our dataset. For this purpose we will use info().
Next, describe() is helpful to summarize the dataset. Your small task is to look at the statistics it reports, such as the mean, std, min, and max values.
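As a minimal sketch of these two calls, the snippet below runs info() and describe() on a tiny made-up table. The name df_demo and all of its values are hypothetical stand-ins for heart.csv, so the numbers will differ from what you see on the real data.

```python
import pandas as pd

# Hypothetical mini-dataset standing in for heart.csv (all values are made up).
df_demo = pd.DataFrame({
    "age": [63, 37, 41, 56],
    "chol": [233, 250, 204, 236],
    "target": [1, 1, 0, 0],
})

# info() prints the column names, non-null counts and dtypes.
df_demo.info()

# describe() returns count, mean, std, min, quartiles and max per column.
summary = df_demo.describe()
print(summary.loc[["mean", "std", "min", "max"]])
```

On the real dataset you would call the same two methods on df.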
Now let's move forward and look at the correlation matrix of our dataset.
mpl.rcParams['figure.figsize'] = 20, 14
plt.matshow(df.corr())
# df.shape is (rows, columns); we need one tick per column, so use df.shape[1]
plt.yticks(np.arange(df.shape[1]), df.columns)
plt.xticks(np.arange(df.shape[1]), df.columns)
plt.colorbar()
You can visualize the correlation matrix of the given dataset.
It is also useful to plot a histogram of each column of our dataset, for example with df.hist().
I hope you are enjoying the tutorial and following along with me.
Now we will look at our dataset more closely and in a precise manner.
dataset = df
mpl.rcParams['figure.figsize'] = 8, 6
# value_counts() returns the count per class; its index holds the matching class labels,
# so labels and bar heights cannot get out of order
counts = dataset['target'].value_counts()
plt.bar(counts.index, counts.values, color=['pink', 'green'])
plt.xticks([0, 1])
plt.xlabel('Target Classes')
plt.ylabel('Count')
plt.title('Count of each Target Class')
You can visualize the count of each target class in the given dataset.
from sklearn.preprocessing import StandardScaler
# One-hot encode the categorical columns
df = pd.get_dummies(df, columns = ['sex', 'cp', 'fbs', 'restecg', 'exang', 'slope', 'ca', 'thal'])
# Standardize the numeric columns
standardScaler = StandardScaler()
columns_scale = ['age', 'trestbps', 'chol', 'thalach', 'oldpeak']
df[columns_scale] = standardScaler.fit_transform(df[columns_scale])
As we have seen, our dataset has many categorical input features, stored as small integers such as 0 and 1, so it is a good idea to create dummy variables for them.
For this purpose, you can use a function provided by pandas. As you can see in the above code, I am using pd.get_dummies() to get the dummy variables for each categorical input feature.
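To make the effect of pd.get_dummies() concrete, here is a small sketch on a made-up table. The frame demo and its values are hypothetical, and only the cp column is encoded to keep the output short.

```python
import pandas as pd

# Hypothetical rows: 'cp' is a categorical code, 'age' is numeric (values made up).
demo = pd.DataFrame({"age": [63, 37, 41], "cp": [3, 2, 1]})

# Each distinct value of 'cp' becomes its own indicator column (cp_1, cp_2, cp_3);
# columns that are not listed, such as 'age', are left unchanged.
encoded = pd.get_dummies(demo, columns=["cp"])
print(encoded.columns.tolist())
print(encoded)
```

Every row has exactly one of the cp_* indicators switched on, which is why this is also called one-hot encoding.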
If you have followed the last tutorial on diabetes prediction, you will know that we have to standardize our dataset. As I explained there, standard scaling removes the mean and scales each feature to unit variance.
So in the above piece of code, I am doing two things: first getting the dummy variables, and then standardizing the numeric input features.
Let's move forward to the next important step.
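The standardization step can be sketched by hand with NumPy: subtract the mean and divide by the standard deviation. The ages array below is made up for illustration. StandardScaler uses the population standard deviation, which is also NumPy's default (ddof=0).

```python
import numpy as np

# Made-up values for one numeric feature.
ages = np.array([63.0, 37.0, 41.0, 56.0, 57.0])

# Remove the mean, then scale to unit variance.
scaled = (ages - ages.mean()) / ages.std()

# After scaling, the mean is (numerically) 0 and the standard deviation is 1.
print(scaled)
print(scaled.mean(), scaled.std())
```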
from sklearn.model_selection import train_test_split
y = df['target']
X = df.drop(['target'], axis = 1)
# Hold out 33% of the rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.33, random_state = 0)
In the above code, I am splitting my dataset into a train set and a test set using train_test_split from sklearn.
I am also using a random seed so that the pseudo-random numbers used by our tf graph are reproducible.
Let’s collect the shape of our input feature and create our model.
from keras.models import Sequential
from keras.layers import Dense, Dropout
model = Sequential()
# First layer takes the 30 input features
model.add(Dense(15, input_dim=30, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dropout(.2))
# Single sigmoid unit for the binary target
model.add(Dense(1, activation='sigmoid'))
As you can see in the above snippet of code, I am using a sequential model. The shape of our training input was 203 rows and 30 columns, which is why I set input_dim to 30 in the first layer.
I am using the relu activation function in the first three Dense layers and sigmoid activation in the output layer.
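The 203 training rows are consistent with how train_test_split sizes a fractional test set: it rounds the test share up and gives the remainder to training. The sketch below reproduces that arithmetic. The total of 303 rows is an assumption about the heart.csv file used here, since the post does not state it.

```python
import math

n_samples = 303   # assumption: total number of rows in heart.csv
test_size = 0.33  # same fraction as in the split above

# The test share is rounded up; the training set gets the remaining rows.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)  # 203 100
```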
I am also using a 20% dropout layer.
If you look at the summary of our model (model.summary()), you will see that we have a total of 722 trainable parameters.
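The figure of 722 can be checked by hand. A Dense layer has one weight per input-output pair plus one bias per output unit, and Dropout adds no parameters. The short sketch below does the arithmetic for the layer sizes used above. The variable names are only for illustration.

```python
# Number of inputs followed by the units of each Dense layer in the model.
layer_sizes = [30, 15, 10, 8, 1]

# Each Dense layer: (inputs * units) weights + (units) biases.
params = [n_in * n_out + n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

print(params, sum(params))  # [465, 160, 88, 9] 722
```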
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=['accuracy'])
model_history = model.fit(X_train, y_train, epochs=200, validation_data=(X_test, y_test))
You can see in the above code that I am compiling my model with the binary cross-entropy loss function and the Adam optimizer, and then training it for 200 epochs.
At the end of this post I have also given a link to code where I have used the SGD optimizer instead. If you compare the two outputs, you can see the difference: the Adam model above appears to overfit, while the SGD model below does not.
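To get a feeling for what the binary cross-entropy loss measures, here is a simplified sketch in plain Python. The function and the sample values are hypothetical. Keras additionally clips the probabilities to avoid log(0), which is left out here.

```python
import math

def binary_crossentropy(y_true, y_prob):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

y_true = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]  # made-up predictions close to the labels
unsure = [0.6, 0.4, 0.5, 0.5]     # made-up predictions close to 0.5

# Predictions closer to the true labels give a lower loss.
print(binary_crossentropy(y_true, confident))
print(binary_crossentropy(y_true, unsure))
```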
model.compile(loss="binary_crossentropy", optimizer="SGD", metrics=['accuracy'])
You can visualize the training history of the created model using the model_history object returned by fit().
Now let's move forward and predict values using our model.
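The history attribute of the object returned by fit() is a dictionary of per-epoch values, with keys such as loss and val_loss when validation data is passed. As a rough sketch of how to spot overfitting in it, the snippet below uses a made-up dictionary of the same shape. The numbers are hypothetical and not taken from the actual training run.

```python
# Hypothetical per-epoch values shaped like model_history.history (made up).
history = {
    "loss":     [0.65, 0.45, 0.30, 0.20, 0.12],
    "val_loss": [0.66, 0.50, 0.42, 0.47, 0.55],
}

# Epoch (0-based) at which the validation loss was lowest.
val_loss = history["val_loss"]
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# A typical overfitting sign: training loss keeps falling after that epoch
# while validation loss rises again.
overfitting = (val_loss[-1] > val_loss[best_epoch]
               and history["loss"][-1] < history["loss"][best_epoch])

print(best_epoch, overfitting)  # 2 True
```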
y_pred = model.predict(X_test)
print(y_pred)
You can see the predicted output, and you can also verify it against the actual values.
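Because the output layer is a sigmoid, the predictions are probabilities rather than class labels. One way to compare them with the actual values is to threshold them at 0.5, as sketched below. The arrays are made up for illustration and only mimic the (n, 1) shape of the model output.

```python
import numpy as np

# Made-up probabilities with the same (n, 1) shape as the model output.
y_prob = np.array([[0.91], [0.12], [0.55], [0.40]])
y_true = np.array([1, 0, 0, 0])  # made-up actual labels

# Probabilities above 0.5 are treated as class 1, the rest as class 0.
y_label = (y_prob.ravel() > 0.5).astype(int)

# Fraction of predictions that match the actual labels.
accuracy = (y_label == y_true).mean()
print(y_label, accuracy)
```

With the real model you would use y_pred and y_test in place of the made-up arrays.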
I hope you enjoyed the tutorial and followed along with me. Any further suggestions are welcome.
You can further modify the code to try to increase the accuracy, for example by trying another optimizer or increasing the number of epochs.
As you have seen, the two optimizers influence our model differently, so always keep your requirements in mind when you choose the parameters.
Note that the final layer should use sigmoid rather than relu, because sigmoid gives a final probability between 0 and 1.
You can make some changes in the code and see how the parameters you use influence the model's accuracy. You can get the code with the help of the given link Heart_attack_detection
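The difference between the two activations is easy to see on a few numbers. The sketch below writes both functions by hand in NumPy. The input values are made up.

```python
import numpy as np

def relu(z):
    # Negative inputs become 0, positive inputs pass through unchanged.
    return np.maximum(0.0, z)

def sigmoid(z):
    # Squashes any real number into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-3.0, 0.0, 3.0])  # made-up pre-activation values

print(relu(z))     # unbounded above, so not usable as a probability
print(sigmoid(z))  # always between 0 and 1
```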
Thanks for your time.