Classification of Celestial Bodies using CNN in Python


Have you ever wondered how we classify objects from an image? Here it is! In this article, we will build a fundamental image classification model using TensorFlow and Keras. Our main aim is to predict whether an image contains a star or a galaxy with the help of Python programming. To implement our model efficiently using neural networks, we need to know about certain libraries beforehand.

The libraries we will be using are TensorFlow and Keras.

  • TensorFlow: This is a free and open-source software library developed by researchers and engineers at Google. It can be used for fast computation of deep learning models. It has gained this level of popularity because it supports multiple languages for building deep learning models, for instance, Python, C++, JavaScript, and R. The latest stable release at the time of writing is 2.3.1 (24th September 2020).
  • Keras: This is also an open-source library that provides a Python interface for artificial neural networks, designed to be easy for humans to use. Keras supports multiple backend neural network computation engines, such as TensorFlow and Theano. The latest release at the time of writing is 2.4.0 (18th June 2020).

Now that we are aware of these basic yet important libraries, let's get into our model-building task. Hey! C'mon, don't worry about installing the libraries; we'll be doing that in this tutorial. I would request you to open any Python notebook environment, say Google Colab, enable the GPU from the runtime settings, and practice this tutorial with me.

Steps involved in Model Building

Let’s now move forward and start stepwise.

Step 1: Understanding the Dataset

Before jumping directly into model building, one should always get insights into the dataset. For this tutorial, you can download the dataset for the respective operations by clicking here. (The dataset is large, so a drive link has been posted.) You'll find 2 folders, named as follows:

  • training_set: required for training our model.
  • test_set: required for testing our model.

Dataset Folder

Inside both the above folders you’ll find 2 more folders containing images of STARS and GALAXY.

Individual folders

So, here’s your Dataset all set.
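If you'd like to sanity-check the download before training, here is a minimal sketch that counts the images in each class folder. The 'dataset/training_set' and 'dataset/test_set' paths are assumptions – adjust them to wherever you extracted the dataset on your machine (or in Colab).

import os

# Hypothetical locations - change these to wherever you extracted the dataset
dataset_dirs = {
    'training_set': 'dataset/training_set',
    'test_set': 'dataset/test_set',
}

for split_name, split_path in dataset_dirs.items():
    print(split_name)
    # Each class (stars / galaxy) lives in its own subfolder
    for class_name in sorted(os.listdir(split_path)):
        class_path = os.path.join(split_path, class_name)
        if os.path.isdir(class_path):
            print(' ', class_name, ':', len(os.listdir(class_path)), 'images')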

Before getting ahead, you need to install the required libraries (TensorFlow, Keras, matplotlib). For that, you can run the following command in your “.ipynb” file.

pip install tensorflow keras matplotlib

Step 2: Importing the libraries and packages

Libraries are the fundamental building blocks of any programming language; we use them to make our task easier. In this section, we will import all the necessary libraries.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
import matplotlib.pyplot as plt

We can also check the version of TensorFlow:

print(tf.__version__)

Output:

2.3.0

Step 3: Building the Convolutional Neural Network Model.

So, while building a model, we follow a series of steps; in other words, the model itself is built up of several layers.

The layers are as follows:

Layer 1: Convolution (the first layer, where feature extraction is performed on the input data)
|
Layer 2: Max Pooling (this layer slides a window over each feature map and keeps only the maximum value in each window, shrinking the data it passes forward)
|
Layer 3: Flattening (here, the pooled feature maps are converted into a single column and passed to a fully connected layer)
|
Layer 4: Full Connection (here, fully connected layers of neurons combine the extracted features and make the final decision of our model)

Here, ‘relu’ and ‘sigmoid’ are the Activation Functions.

Are you excited to know the implementation of these layers?? Let’s do it then.

# Initialising the CNN
my_model = Sequential()

# Step 1 - Convolution
my_model.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))

# Step 2 - Pooling
my_model.add(MaxPooling2D(pool_size = (2, 2)))

# Adding a second convolutional layer
my_model.add(Conv2D(64, (3, 3)))
my_model.add(Activation('relu'))
my_model.add(MaxPooling2D(pool_size=(2, 2)))

# Adding a third convolutional layer
my_model.add(Conv2D(64, (3, 3)))
my_model.add(Activation('relu'))
my_model.add(MaxPooling2D(pool_size=(2, 2)))

# Step 3 - Flattening
my_model.add(Flatten())

# Step 4 - Full connection
my_model.add(Dense(128, activation = 'relu'))
my_model.add(Dense(1, activation = 'sigmoid'))
my_model.summary()

Output:

Model Summary

Step 4: Compiling the model.

Now, to compile our model, we use the ‘adam’ optimizer with ‘binary_crossentropy’ as the loss function, along with the ‘accuracy’ metric.

# Compiling the CNN 
my_model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

Step 5: Fitting the CNN to the images.

In this step, you'll be introduced to the ImageDataGenerator class of Keras. It helps you build powerful CNN models with a very small dataset – just a few hundred or a few thousand pictures from each category. After this, we'll train our model. For training, we will call the ‘fit’ method on our model (the older ‘fit_generator’ method does the same job but is deprecated in recent versions of Keras).

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('ENTER THE LOCATION OF TRAINING_SET FOLDER',
                                                 target_size = (64, 64),
                                                 batch_size = 25,
                                                 class_mode = 'binary')

test_set = test_datagen.flow_from_directory('ENTER THE LOCATION OF TEST_SET FOLDER',
                                            target_size = (64, 64),
                                            batch_size = 25,
                                            class_mode = 'binary')

# Training the model (we store the result in 'Analysis' so we can plot it later)
Analysis = my_model.fit(training_set,
                        epochs = 25,
                        validation_data = test_set)

Output:

Image count

training

As we can see in the output above, the 1st epoch yields a test_set accuracy of only 70%, but as training progresses, the final epoch yields an accuracy of 97.66% on the test_set, which is pretty good.

We are good to go now. Do you know a trick? Let me tell you, YOU CAN SAVE YOUR MODEL AS WELL FOR FUTURE USE!

Step 6: Saving the model.

my_model.save('Finalmodel.h5')
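
To reuse the saved model later, you can load it back with load_model and run a prediction on a single image. The sketch below is a minimal example; 'my_test_image.jpg' is a hypothetical file name, so replace it with any star or galaxy image of your own, and remember that training_set.class_indices tells you which class the sigmoid output of 0 or 1 corresponds to.

import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

# Load the model we saved above
loaded_model = load_model('Finalmodel.h5')

# 'my_test_image.jpg' is a placeholder - use any star/galaxy image you like
test_image = image.load_img('my_test_image.jpg', target_size = (64, 64))
test_image = image.img_to_array(test_image) / 255.0   # same rescaling as during training
test_image = np.expand_dims(test_image, axis = 0)     # add a batch dimension

# The sigmoid output is a probability; compare it with training_set.class_indices
# to see which class 0 and 1 stand for
prediction = loaded_model.predict(test_image)[0][0]
print('Predicted probability:', prediction)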

Now, let’s visualize our analysis using the most popular library – matplotlib

Step 7: Visualization.

Visualization is a tool that always helps you draw better and more useful insights from your model.

  1. Plot for train_set loss and test_set loss:
import matplotlib.pyplot as plt
# Plot the Loss
plt.plot(Analysis.history['loss'],label = 'loss')
plt.plot(Analysis.history['val_loss'],label = 'val_loss')
plt.legend()
plt.show()

Output:

loss_plot

2. Plot for train_set accuracy and test_set accuracy:

# Plot the Accuracy
plt.plot(Analysis.history['accuracy'],label = 'acc')
plt.plot(Analysis.history['val_accuracy'],label = 'val_acc')
plt.legend()
plt.show()

accuracy plot

This completes our tutorial on image classification of celestial bodies in Python. So, what are you waiting for? Open your Python notebook and get your hands dirty working with the layers of neurons. We all know machine learning is the technology of the future, and I hope this tutorial has helped you gain an understanding of it. For more such tutorials on ML and DL using Keras and TensorFlow, do check out the valueml blog page.

For any queries about the above article, you can ask in the comment section. I hope you learned something valuable today. Good luck ahead!

Thank You for reading!
