Is your room Clean or Messy? | Python
Hello Guys! Today I have brought you this messy yet clean tutorial. Haha… confused, and wondering what I am talking about? Let me tell you how it all started. Once, I came into my room and what did I see? A messy room… This made me recall a picture of my clean and beautiful room. Since I was learning about neural networks at the time, my brain asked me: is it possible for a machine to just look at a room and say whether it is clean or not? So, do you also want to know IS YOUR ROOM CLEAN OR MESSY?? Then let's implement a new model to find the answer.
Today we will build a model on an image dataset whose images fall into 2 categories – Messy and Clean.
The steps involved in building our model are:
- Load the dataset
- Import the required model
- Add layers to build our model
- Compile and train the model
- Evaluate the trained model
- Test the model on new images
Together, these 6 steps will give us our finished model.
Importing the libraries
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
Loading the dataset
You can get the data by clicking here.
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
Have a look at what these two kinds of rooms look like:
import cv2

clean_room = '/content/drive/MyDrive/Clean_Messy/images/images/train/clean/0.png'
messy_room = '/content/drive/MyDrive/Clean_Messy/images/images/train/messy/0.png'

plt.figure(1, figsize = (10, 5))
plt.subplot(1, 2, 1)
# OpenCV loads images in BGR order; convert to RGB for Matplotlib
plt.imshow(cv2.cvtColor(cv2.imread(clean_room), cv2.COLOR_BGR2RGB))
plt.title('Clean Room')
plt.xticks([]), plt.yticks([])
plt.subplot(1, 2, 2)
plt.imshow(cv2.cvtColor(cv2.imread(messy_room), cv2.COLOR_BGR2RGB))
plt.title('Messy Room')
plt.xticks([]), plt.yticks([])
plt.show()
Divide the data into Train and Validation set
train_gen = ImageDataGenerator(rescale = 1./255,
                               rotation_range = 25,
                               zoom_range = 0.2,
                               height_shift_range = 0.2,
                               width_shift_range = 0.2,
                               shear_range = 0.2,
                               fill_mode = 'nearest',
                               horizontal_flip = True)
train_set = train_gen.flow_from_directory('/content/drive/MyDrive/Clean_Messy/images/images/train/',
                                          batch_size = 16,
                                          target_size = (224, 224),
                                          color_mode = 'rgb',
                                          class_mode = 'binary')

# The validation set only needs rescaling; augmenting it would distort
# the metrics we use to judge the model
valid_gen = ImageDataGenerator(rescale = 1./255)
valid_set = valid_gen.flow_from_directory('/content/drive/MyDrive/Clean_Messy/images/images/val/',
                                          batch_size = 2,
                                          target_size = (224, 224),
                                          color_mode = 'rgb',
                                          class_mode = 'binary')
Found 192 images belonging to 2 classes.
Found 20 images belonging to 2 classes.
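Before moving on, it is worth checking how flow_from_directory mapped the folder names to numeric labels, since the sigmoid output we interpret later depends on this ordering. By default, Keras assigns class indices alphabetically, so a quick sanity check looks like this:

# Inspect the label mapping produced by flow_from_directory
print(train_set.class_indices)

With folders named clean and messy, this should print {'clean': 0, 'messy': 1}.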
Creating a callback
We create a callback to ensure that our model stops training once it reaches a desired accuracy on the validation set.
class my_callback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs = {}):
        # Stop training once validation accuracy crosses 95%
        # (logs.get with a default guards against a missing 'val_acc' key)
        if logs.get('val_acc', 0) > 0.95:
            print('\n\n\nReached 95% accuracy, stopping training.\n\n\n')
            self.model.stop_training = True

callback = my_callback()
Selecting and importing the model
So, I will be using the pretrained VGG-16 model that ships with Keras.
from tensorflow.keras.applications.vgg16 import VGG16

# Load VGG16 pretrained on ImageNet, without its fully connected top layers
vgg = VGG16(input_shape = (224, 224, 3), weights = 'imagenet', include_top = False)

# Freeze the convolutional base so only our new layers are trained
for layer in vgg.layers:
    layer.trainable = False
After importing our model, our next step is to add layers in the model.
Adding layers and compiling the model
x = tf.keras.layers.Flatten()(vgg.output)
x = tf.keras.layers.Dense(224, activation = 'relu')(x)
x = tf.keras.layers.Dense(1, activation = 'sigmoid')(x)

model = tf.keras.models.Model(vgg.input, x)
model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['acc'])
model.summary()
Model: "functional_1" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= input_1 (InputLayer) [(None, 224, 224, 3)] 0 _________________________________________________________________ block1_conv1 (Conv2D) (None, 224, 224, 64) 1792 _________________________________________________________________ block1_conv2 (Conv2D) (None, 224, 224, 64) 36928 _________________________________________________________________ block1_pool (MaxPooling2D) (None, 112, 112, 64) 0 _________________________________________________________________ block2_conv1 (Conv2D) (None, 112, 112, 128) 73856 _________________________________________________________________ block2_conv2 (Conv2D) (None, 112, 112, 128) 147584 _________________________________________________________________ block2_pool (MaxPooling2D) (None, 56, 56, 128) 0 _________________________________________________________________ block3_conv1 (Conv2D) (None, 56, 56, 256) 295168 _________________________________________________________________ block3_conv2 (Conv2D) (None, 56, 56, 256) 590080 _________________________________________________________________ block3_conv3 (Conv2D) (None, 56, 56, 256) 590080 _________________________________________________________________ block3_pool (MaxPooling2D) (None, 28, 28, 256) 0 _________________________________________________________________ block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160 _________________________________________________________________ block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808 _________________________________________________________________ block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808 _________________________________________________________________ block4_pool (MaxPooling2D) (None, 14, 14, 512) 0 _________________________________________________________________ block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808 _________________________________________________________________ block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808 _________________________________________________________________ block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808 _________________________________________________________________ block5_pool (MaxPooling2D) (None, 7, 7, 512) 0 _________________________________________________________________ flatten (Flatten) (None, 25088) 0 _________________________________________________________________ dense (Dense) (None, 224) 5619936 _________________________________________________________________ dense_1 (Dense) (None, 1) 225 ================================================================= Total params: 20,334,849 Trainable params: 5,620,161 Non-trainable params: 14,714,688 _________________________________________________________________
Fit the model on train and validation set
# model.fit accepts generators directly; fit_generator is deprecated in TF 2.x
history = model.fit(train_set,
                    steps_per_epoch = 8,
                    validation_data = valid_set,
                    validation_steps = 8,
                    epochs = 10,
                    verbose = 1,
                    callbacks = [callback])
The training log shows what the callback does: since we set the threshold to 95% validation accuracy, training stops as soon as the model reaches that value, keeping the best result without wasting further epochs. This is the advantage of using a callback in our model.
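As a side note, Keras also ships a built-in EarlyStopping callback that achieves a similar effect without writing a custom class. A minimal sketch, monitoring the same 'val_acc' metric (the patience value here is just an illustrative choice):

# Built-in alternative: stop when validation accuracy stops improving
# and restore the weights from the best epoch
early_stop = tf.keras.callbacks.EarlyStopping(monitor = 'val_acc',
                                              patience = 3,
                                              restore_best_weights = True)

Unlike our threshold-based callback, EarlyStopping waits until the monitored metric stops improving rather than until it crosses a fixed value.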
Evaluating the model
For this, we will plot the accuracy and loss curves.
acc = history.history['acc']
val_acc = history.history['val_acc']
epochs = range(len(acc))

plt.title('Training and Validation accuracies as functions of Epochs')
plt.plot(epochs, acc, 'b', label = 'Training Accuracy')
plt.plot(epochs, val_acc, 'r', label = 'Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()

loss = history.history['loss']
val_loss = history.history['val_loss']

plt.figure()
plt.title('Training and Validation losses as functions of Epochs')
plt.plot(epochs, loss, 'b', label = 'Training Loss')
plt.plot(epochs, val_loss, 'r', label = 'Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
We find that our model performs at least as well on the validation set as on the training set, which suggests it is not overfitting. Now that we have successfully trained the model, let's test it on a new image input.
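To put a single number on this, we can also evaluate the trained model on the validation generator directly (a quick check, using the valid_set defined earlier):

# Compute final loss and accuracy on the validation generator
val_loss, val_acc = model.evaluate(valid_set)
print('Validation accuracy: {:.2%}'.format(val_acc))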
Model Testing
# 1st image
# Convert from OpenCV's BGR order to RGB so that both the display
# and the model input match the RGB images used in training
img1 = cv2.cvtColor(cv2.imread('/content/drive/MyDrive/Clean_Messy/images/images/test/1.png'), cv2.COLOR_BGR2RGB)
plt.imshow(img1)
# 2nd image
img2 = cv2.cvtColor(cv2.imread('/content/drive/MyDrive/Clean_Messy/images/images/test/2.png'), cv2.COLOR_BGR2RGB)
plt.imshow(img2)
Check the shape of the images
print(img1.shape)
print(img2.shape)
(299, 299, 3)
(299, 299, 3)
Here the shape differs from the (224, 224, 3) input our VGG model expects. We can resolve this by resizing the images. Note that the training images were rescaled to [0, 1] by ImageDataGenerator, so we rescale the test images the same way:
# image 1
img1 = tf.image.resize(img1, (224, 224), method = tf.image.ResizeMethod.BILINEAR,
                       preserve_aspect_ratio = True)
img1 = np.array(img1) / 255.0   # rescale to [0, 1] to match the training generator
img1 = img1.reshape(1, 224, 224, 3)

# image 2
img2 = tf.image.resize(img2, (224, 224), method = tf.image.ResizeMethod.BILINEAR,
                       preserve_aspect_ratio = True)
img2 = np.array(img2) / 255.0
img2 = img2.reshape(1, 224, 224, 3)
Now, it’s the FINAL SHOWDOWN for our model – the time to predict…
# 1st image is of a clean room
model.predict(img1)
array([[0.84346724]], dtype=float32)
# 2nd image is of a messy room
model.predict(img2)
array([[0.99996936]], dtype=float32)
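Keep in mind that model.predict returns the raw sigmoid output (the estimated probability of class 1 in train_set.class_indices), rather than an accuracy. A small sketch of how we might map a score back to a class name, assuming the same generators as above:

# Invert the class mapping so we can turn an index back into a name
idx_to_class = {v: k for k, v in train_set.class_indices.items()}

def label_from_score(score, threshold = 0.5):
    # Scores above the threshold belong to class 1, below it to class 0
    return idx_to_class[int(score >= threshold)]

print(label_from_score(model.predict(img1)[0][0]))
print(label_from_score(model.predict(img2)[0][0]))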
We observe that our model performs quite well: it is confident about both test images, with sigmoid scores of roughly 0.84 and 0.99. Well, then, let me tell you a disadvantage of the model: if your mother gets to know about this, she can tell whether your room is messy or clean with just a click. Haha… Best of luck with that!
Hope you liked today’s tutorial.
Thank You for reading!