Linear Regression Using Keras | Simplified
Hello everyone! In this post we will look at what linear regression is and how it can be implemented with Keras, the deep learning API included in TensorFlow. Let us begin with simple linear regression, also known as basic regression.
What is Basic Regression?
Linear regression is a supervised machine learning algorithm. It gives us a model of the relationship between the dependent variable (y) and the independent variables (x), expressed as a straight line. Hence the name linear regression.
If a problem has just one independent variable, say ‘x’, it is called simple linear regression. If there is more than one independent variable, such as ‘x1, x2, x3, …, xn’, we call it multiple linear regression. A regression model outputs a continuous value, such as a price or a probability.
The mathematical representation for linear regression is given as:
Y = β0 + β1X + ε
β0 is the Y-intercept
β1 is the slope
ε is the random error
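To make the roles of β0 and β1 concrete, here is a minimal sketch that estimates both with the closed-form least-squares formulas. The data is made up for illustration (generated from Y = 2 + 3X with no error term), and the helper name fit_line is hypothetical.

```python
import numpy as np

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # slope = covariance(x, y) / variance(x)
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # the fitted line always passes through the point of means
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Illustrative data (an assumption, not taken from the salary dataset): Y = 2 + 3X
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x

beta0, beta1 = fit_line(x, y)
print(beta0, beta1)  # intercept and slope recovered from the data
```

Because the toy data has no random error, the recovered intercept and slope match the values used to generate it. On real data such as salaries, ε is not zero and the line only approximates the points.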
Implementation using Keras
For this example, we will try to predict an employee's salary from their years of experience. Our dependent variable, also called the label, is the salary, and the independent variable, also called the feature, is the experience. I downloaded the dataset from Kaggle.
First, we import the required Python libraries:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf  # Keras is used below through tf.keras
Now we read and plot our dataset into a scatter plot:
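# Load the salary dataset (the CSV file is expected in the working directory)
dataset = pd.read_csv("Salary_Data.csv")
print(dataset.head())  # show the first five rows

# Keras expects 2-D arrays of shape (samples, features)
X = dataset['YearsExperience'].values.reshape(-1, 1)
Y = dataset['Salary'].values.reshape(-1, 1)

plt.scatter(X, Y)
plt.show()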
We get the following output, displaying the first 5 entries of the dataset:
   YearsExperience  Salary
0              1.1   39343
1              1.3   46205
2              1.5   37731
3              2.0   43525
4              2.2   39891
and the scatter plot of the entire dataset:
Now we build a model to learn from the values in the dataset, and set its loss function and optimizer:
# A single Dense unit: one input feature in, one predicted value out
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(1, input_shape=[1]))
model.compile(loss="mean_squared_error", optimizer=tf.keras.optimizers.SGD(0.1))
fit = model.fit(dataset["YearsExperience"], dataset["Salary"], epochs=850)
You can think of this as a neural network with a single layer containing one neuron and no hidden layers. We pass 1 as the first argument to Dense because the layer has one output unit, the predicted salary, and input_shape is 1 because we have a single feature, the years of experience.
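As a rough sketch of what a single Dense(1) unit computes, the NumPy code below multiplies each input by a weight and adds a bias. The weight and bias values are arbitrary placeholders chosen for illustration, not values learned by the model in this post, and dense_forward is a hypothetical helper.

```python
import numpy as np

def dense_forward(x, kernel, bias):
    """Mimic a Dense layer with no activation: output = x @ kernel + bias."""
    return x @ kernel + bias

# Three samples with one feature each, shape (3, 1), laid out like X in this post
x = np.array([[1.0], [2.0], [3.0]])
kernel = np.array([[10.0]])  # shape (input features, units) = (1, 1); placeholder value
bias = np.array([5.0])       # one bias per unit; placeholder value

out = dense_forward(x, kernel, bias)
print(out.shape)    # (3, 1)
print(out.ravel())  # [15. 25. 35.]
```

With one feature and one unit, the kernel plays the role of the slope β1 and the bias plays the role of the intercept β0.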
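We have used mean squared error (MSE) as the loss function and stochastic gradient descent (SGD) as the optimizer, with a learning rate of 0.1. MSE averages the squared differences between the predicted and actual salaries, and SGD repeatedly adjusts the weight and bias in the direction that reduces this loss. If the loss grows or becomes NaN during training, which can happen with unscaled features, try a smaller learning rate.
Now that we have fit the model to our data points, we are ready to predict the labels and plot the resulting regression line: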
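The following is a minimal NumPy sketch of what minimizing MSE with gradient descent looks like for a single weight and bias. It is not how Keras implements its optimizer, and it uses the whole toy dataset at every step rather than mini-batches. The data, learning rate, and step count are assumptions chosen so that the loop converges on this small input.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return np.mean((y_true - y_pred) ** 2)

def gradient_descent(x, y, learning_rate=0.05, steps=2000):
    """Fit y ~ w * x + b by full-batch gradient descent on the MSE."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        error = (w * x + b) - y
        # Partial derivatives of mean(error ** 2) with respect to w and b
        grad_w = 2.0 * np.mean(error * x)
        grad_b = 2.0 * np.mean(error)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Toy data (assumed for illustration): y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x

w, b = gradient_descent(x, y)
print(w, b, mse(y, w * x + b))  # w and b approach 2 and 1, the loss approaches 0
```

Raising learning_rate too far makes each update overshoot, so the loss grows instead of shrinking. That is the usual reason a training run ends in NaN.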
# predict() returns an array of shape (samples, 1); flatten it to fill one column
dataset['predict'] = model.predict(X).flatten()
plt.scatter(X, Y)
plt.plot(X, dataset['predict'], color='r')
plt.show()
We can see that our model has produced a reasonable regression line.
Now, we can pass new values to our model and check if it predicts the salary correctly. For example, let’s pass four random values as a list and see how the model performs:
The output is a NumPy array:
[[ 63327.97]
 [ 96740.31]
 [ 49008.39]
 [111059.89]]
Our model has predicted salaries pretty close to the actual values.
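One way to put a number on "pretty close" is the mean absolute error between predicted and actual salaries. The sketch below is only illustrative: the actual salaries are the first five rows shown earlier in this post, while the predicted values are hypothetical placeholders, not outputs of the model trained above. The helper mean_absolute_error is defined here by hand.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))

# Actual salaries: first five rows of the dataset shown in this post
actual = np.array([39343, 46205, 37731, 43525, 39891])
# Hypothetical predictions, made up for illustration only
predicted = np.array([36000, 38000, 40000, 44000, 46000])

mae = mean_absolute_error(actual, predicted)
print(mae)  # 4080.2
```

With a trained Keras model, the predicted array would instead come from something like model.predict(X).flatten(), ideally on rows that were held out from training.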
That’s all for this article.