Instance Segmentation with Custom Datasets in Python

Instance segmentation detects the objects in an input image, isolates them from the background, and goes a step further: it separates each individual object within a cluster of similar objects and draws a boundary around each one. It can therefore tell us not only which classes are present but also how many individuals of each class there are. In the example image below, semantic segmentation can tell us that there are goats in the picture but cannot distinguish one goat from another. Instance segmentation can tell us that there are 3 different goats standing together.
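
To make the difference in output format concrete, here is a small toy sketch in plain Python. It takes a binary "goat / not goat" mask, which is the kind of result semantic segmentation gives, and assigns a separate label to each connected region. This is only an illustration of what per-instance labels look like: the function name and the tiny mask are made up for this example, and connected-component labelling is not how Mask R-CNN works (it predicts one mask per detected object, so it can also separate objects that touch).

```python
from collections import deque

def label_instances(mask):
    """Label 4-connected foreground regions of a binary mask with 1, 2, 3, ..."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 1 and labels[r][c] == 0:
                # Found an unlabelled foreground pixel: start a new instance.
                next_label += 1
                labels[r][c] = next_label
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels, next_label

# Semantic-style output: every "goat" pixel carries the same value (1).
semantic_mask = [
    [1, 1, 0, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 1],
]
instance_labels, num_instances = label_instances(semantic_mask)
print(num_instances)  # 3
for row in instance_labels:
    print(row)
```

The semantic mask only says "goat here"; the labelled version says "goat 1, goat 2, goat 3".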

[Image: image recognition]

This article is split into 5 steps for ease of reading:

  • Installations
  • Dataset
  • Training
  • Inference
  • Testing

In this article, you will learn what instance segmentation is, implement it in Python with the help of an example test case, and learn how to do instance segmentation with custom datasets. We will start by building the dataset and its directory structure, then train the model, and finish with inference and testing on the dataset.

Instance segmentation is one of the more recent deep learning techniques, following image recognition, object detection, and semantic segmentation. As a result, open-source information and custom-training guides are still scarce. I believe this tutorial will help you understand the concept better and take your understanding to the next level. The next step after instance segmentation is panoptic segmentation, which combines semantic and instance segmentation.

There are two things to be done before diving into the code:

  • Creating the dataset
  • Organizing it into the proper directories

Happy Reading!!!

ZIP FILE STRUCTURE

Structure of the Zip file for the dataset to be custom trained:

  • Train Directory – Contains the JPG images used for training and the annotations for each of them.
  • Validation Directory – Contains the JPG images used for validation and the annotations for each of them.
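
Before zipping the dataset, it can help to check that both directories are in place. The sketch below is one possible sanity check, not part of the original notebook. It assumes the two directories are named train and val (val matches the paths printed later in this article, train is an assumption) and that annotations are stored as JSON files; the file name annotations.json is hypothetical. It builds a throwaway dataset in a temporary directory so that it runs anywhere.

```python
import os
import tempfile

def summarize_split(dataset_dir, split):
    """Count JPG images and JSON annotation files in dataset_dir/split."""
    split_dir = os.path.join(dataset_dir, split)
    if not os.path.isdir(split_dir):
        raise FileNotFoundError("missing directory: " + split_dir)
    names = os.listdir(split_dir)
    images = [n for n in names if n.lower().endswith((".jpg", ".jpeg"))]
    annotations = [n for n in names if n.lower().endswith(".json")]
    return {"images": len(images), "annotations": len(annotations)}

# Build a small fake dataset (empty files) purely for demonstration.
with tempfile.TemporaryDirectory() as root:
    for split, count in (("train", 3), ("val", 1)):
        os.makedirs(os.path.join(root, split))
        for i in range(count):
            open(os.path.join(root, split, "dog_%03d.jpg" % i), "w").close()
        # Hypothetical annotation file name; use whatever your tool exports.
        open(os.path.join(root, split, "annotations.json"), "w").close()
    layout = {split: summarize_split(root, split) for split in ("train", "val")}

print(layout)
```

Pointing summarize_split at your own dataset directory gives a quick count per split before you upload the zip file.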

The annotations for both the training and validation images can be created with tools such as LabelImg or VGG Image Annotator. The structure of the dataset has to be clearly defined and laid out first.
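
The article does not show the annotation files themselves, so the following is only a sketch under an assumption: that the annotations were drawn as polygons in VGG Image Annotator (VIA) and exported as JSON. In that export, each entry has a filename and a set of regions whose shape_attributes hold all_points_x and all_points_y. The function name polygons_per_image and the sample data are made up for illustration.

```python
import json

def polygons_per_image(via_json_text):
    """Map each filename to its polygons, each a list of (x, y) points."""
    result = {}
    for entry in json.loads(via_json_text).values():
        regions = entry.get("regions", [])
        # VIA 1.x stores regions as a dict, VIA 2.x stores them as a list.
        if isinstance(regions, dict):
            regions = list(regions.values())
        polygons = []
        for region in regions:
            shape = region["shape_attributes"]
            if shape.get("name") != "polygon":
                continue  # this sketch ignores rectangles, circles, etc.
            polygons.append(list(zip(shape["all_points_x"], shape["all_points_y"])))
        if polygons:
            result[entry["filename"]] = polygons
    return result

# Made-up sample covering both region layouts.
sample = json.dumps({
    "dog_001.jpg1234": {
        "filename": "dog_001.jpg",
        "size": 1234,
        "regions": {
            "0": {"shape_attributes": {"name": "polygon",
                                       "all_points_x": [10, 50, 30],
                                       "all_points_y": [10, 10, 40]},
                  "region_attributes": {}},
        },
    },
    "dog_002.jpg5678": {
        "filename": "dog_002.jpg",
        "size": 5678,
        "regions": [
            {"shape_attributes": {"name": "polygon",
                                  "all_points_x": [5, 25, 25, 5],
                                  "all_points_y": [5, 5, 20, 20]},
             "region_attributes": {}},
            {"shape_attributes": {"name": "rect", "x": 1, "y": 1,
                                  "width": 4, "height": 4},
             "region_attributes": {}},
        ],
    },
})
polygons = polygons_per_image(sample)
print(polygons)
```

Each polygon is the outline of one object instance, which is the information a mask-based model needs in addition to the class label.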

INSTALLATIONS

Install the Python libraries and packages that the project needs. Clone the Mask_RCNN repository first, then install its requirements from inside the cloned directory.

%cd
!git clone --quiet https://github.com/matterport/Mask_RCNN.git
/root

%cd ~/Mask_RCNN
!pip install -q PyDrive
!pip install -r requirements.txt
!python setup.py install

REQUIRED LIBRARIES

The required libraries are imported in this section. Some of the important ones in this project are NumPy, shutil, and TensorFlow.

import os
from zipfile import ZipFile
from shutil import copy
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
import cv2
import sys
import random
import math
import re
import time
import numpy as np
import tensorflow as tf
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import skimage
import glob
from mrcnn import utils
from mrcnn import visualize
from mrcnn.visualize import display_images
import mrcnn.model as modellib
from mrcnn.model import log
import dog

REQUIRED PACKAGES

Update the fileId variable with the Google Drive file ID of your image.zip dataset.

%cd ~/Mask_RCNN

fileId = '1p11kagop07-LyNyTIQ5_bDHx6I2TSDN9'
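
If you only have a sharing link for the zip file, the file ID is the long token inside it. The helper below is a small sketch, not part of the original notebook: drive_file_id is a made-up name, and the example URL is built from the ID above purely for illustration. It handles the two common link shapes, .../file/d/<id>/... and ...?id=<id>.

```python
import re

def drive_file_id(url):
    """Return the file ID embedded in a Google Drive sharing URL, or None."""
    patterns = (
        r"/file/d/([A-Za-z0-9_-]+)",  # https://drive.google.com/file/d/<id>/view
        r"[?&]id=([A-Za-z0-9_-]+)",   # https://drive.google.com/open?id=<id>
    )
    for pattern in patterns:
        match = re.search(pattern, url)
        if match:
            return match.group(1)
    return None

# Illustrative URL assembled from the ID used in this article.
url = "https://drive.google.com/file/d/1p11kagop07-LyNyTIQ5_bDHx6I2TSDN9/view?usp=sharing"
file_id = drive_file_id(url)
print(file_id)  # 1p11kagop07-LyNyTIQ5_bDHx6I2TSDN9
```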

EXTRACTING DATASET

Extract the dataset and store it in its own directory for ease of use while training.

# Create the dataset directory (no error if it already exists) and move into it.
os.makedirs('dataset', exist_ok=True)
os.chdir('dataset')

# Authenticate the Colab user and build a PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Download the zip file from Google Drive, extract it, and delete the archive.
fileName = fileId + '.zip'
downloaded = drive.CreateFile({'id': fileId})
downloaded.GetContentFile(fileName)
ds = ZipFile(fileName)
ds.extractall()
os.remove(fileName)
print('Extracted zip file ' + fileName)
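
The extraction step can also be tried locally without Google Drive. The sketch below uses only the standard library; extract_dataset is a made-up helper and the archive is created on the fly with two empty placeholder files, so nothing here is taken from the real dataset. It extracts into an explicit target directory instead of changing the working directory, which makes the cell safe to re-run.

```python
import os
import tempfile
from zipfile import ZipFile

def extract_dataset(zip_path, target_dir):
    """Extract zip_path into target_dir and return the top-level entries."""
    os.makedirs(target_dir, exist_ok=True)
    with ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
    return sorted(os.listdir(target_dir))

with tempfile.TemporaryDirectory() as work:
    # Create a tiny stand-in archive with the train/val layout.
    zip_path = os.path.join(work, "images.zip")
    with ZipFile(zip_path, "w") as archive:
        archive.writestr("train/dog_001.jpg", b"")
        archive.writestr("val/dog_002.jpg", b"")
    top_level = extract_dataset(zip_path, os.path.join(work, "dataset"))

print(top_level)  # ['train', 'val']
```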

TRAINING THE MODEL

The model is trained in this section on the dataset extracted in the previous section.

%cd ~/Mask_RCNN

!python dog.py train --dataset=dataset/ --weights=coco
/root/Mask_RCNN 
Using TensorFlow backend. 
Weights: coco Dataset: dataset/ 
Logs: /logs 
Configurations: 
BACKBONE resnet101 
BACKBONE_STRIDES [4, 8, 16, 32, 64] 
BATCH_SIZE 2 
BBOX_STD_DEV [0.1 0.1 0.2 0.2] 
COMPUTE_BACKBONE_SHAPE None 
DETECTION_MAX_INSTANCES 100 
DETECTION_MIN_CONFIDENCE 0.9 
DETECTION_NMS_THRESHOLD 0.3 
FPN_CLASSIF_FC_LAYERS_SIZE 1024 
GPU_COUNT 1 
GRADIENT_CLIP_NORM 5.0 
IMAGES_PER_GPU 2 
IMAGE_MAX_DIM 1024 
IMAGE_META_SIZE 14 
IMAGE_MIN_DIM 800 
IMAGE_MIN_SCALE 0 
IMAGE_RESIZE_MODE square 
IMAGE_SHAPE [1024 1024 3] 
LEARNING_MOMENTUM 0.9 
LEARNING_RATE 0.001 
LOSS_WEIGHTS {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0} 
MASK_POOL_SIZE 14 
MASK_SHAPE [28, 28] 
MAX_GT_INSTANCES 100 
MEAN_PIXEL [123.7 116.8 103.9] 
MINI_MASK_SHAPE (56, 56) 
NAME dog 
NUM_CLASSES 2 
POOL_SIZE 7 
POST_NMS_ROIS_INFERENCE 1000 
POST_NMS_ROIS_TRAINING 2000 
ROI_POSITIVE_RATIO 0.33 
RPN_ANCHOR_RATIOS [0.5, 1, 2] 
RPN_ANCHOR_SCALES (32, 64, 128, 256, 512) 
RPN_ANCHOR_STRIDE 1 
RPN_BBOX_STD_DEV [0.1 0.1 0.2 0.2] 
RPN_NMS_THRESHOLD 0.7 
RPN_TRAIN_ANCHORS_PER_IMAGE 256 
STEPS_PER_EPOCH 100 
TOP_DOWN_PYRAMID_SIZE 256 
TRAIN_BN False 
TRAIN_ROIS_PER_IMAGE 200 
USE_MINI_MASK True 
USE_RPN_ROIS True 
VALIDATION_STEPS 50 
WEIGHT_DECAY 0.0001 
Downloading pretrained model to /mask_rcnn_coco.h5 ... 
... done downloading pretrained model! 
Loading weights /mask_rcnn_coco.h5 
2018-09-12 11:40:58.009140: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:897] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 
2018-09-12 11:40:58.009590: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235 
pciBusID: 0000:00:04.0 
totalMemory: 11.17GiB freeMemory: 11.10GiB 
2018-09-12 11:40:58.009645: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0 
2018-09-12 11:40:58.387963: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-09-12 11:40:58.388035: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0 
2018-09-12 11:40:58.388057: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0: N 
2018-09-12 11:40:58.388354: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10759 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7) 
Training network heads 
Starting at epoch 0. LR=0.001 
Checkpoint Path: /logs/dog20180912T1141/mask_rcnn_dog_{epoch:04d}.h5 
Selecting layers to train 
fpn_c5p5 (Conv2D) 
fpn_c4p4 (Conv2D) 
fpn_c3p3 (Conv2D) 
fpn_c2p2 (Conv2D) 
fpn_p5 (Conv2D) 
fpn_p2 (Conv2D) 
fpn_p3 (Conv2D) 
fpn_p4 (Conv2D) 
In model: rpn_model 
rpn_conv_shared (Conv2D) 
rpn_class_raw (Conv2D) 
rpn_bbox_pred (Conv2D) 
mrcnn_mask_conv1 (TimeDistributed) 
mrcnn_mask_bn1 (TimeDistributed) 
mrcnn_mask_conv2 (TimeDistributed) 
mrcnn_mask_bn2 (TimeDistributed) 
mrcnn_class_conv1 (TimeDistributed) 
mrcnn_class_bn1 (TimeDistributed) 
mrcnn_mask_conv3 (TimeDistributed) 
mrcnn_mask_bn3 (TimeDistributed) 
mrcnn_class_conv2 (TimeDistributed) 
mrcnn_class_bn2 (TimeDistributed) 
mrcnn_mask_conv4 (TimeDistributed) 
mrcnn_mask_bn4 (TimeDistributed) 
mrcnn_bbox_fc (TimeDistributed)
mrcnn_mask_deconv (TimeDistributed) 
mrcnn_class_logits (TimeDistributed) 
mrcnn_mask (TimeDistributed) 
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gradients_impl.py:108: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory. 
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. " 
/usr/local/lib/python3.6/dist-packages/keras/engine/training.py:2087: UserWarning: Using a generator with `use_multiprocessing=True` and multiple workers may duplicate your data. Please consider using the`keras.utils.Sequence class. UserWarning('Using a generator with `use_multiprocessing=True`' 
Epoch 1/5 
2018-09-12 11:41:45.132967: W tensorflow/core/framework/allocator.cc:108] Allocation of 80281600 exceeds 10% of system memory. 
1/100 [..............................] - ETA: 1:07:59 - loss: 2.8056 - rpn_class_loss: 0.0096 - rpn_bbox_loss: 0.0135 - mrcnn_class_loss: 1.2340 - mrcnn_bbox_loss: 0.4672 - mrcnn_mask_loss: 1.0813
2018-09-12 11:41:57.981260: W tensorflow/core/framework/allocator.cc:108] Allocation of 80281600 exceeds 10% of system memory. 
2/100 [..............................] - ETA: 36:14 - loss: 2.6954 - rpn_class_loss: 0.0049 - rpn_bbox_loss: 0.0258 - mrcnn_class_loss: 0.9097 - mrcnn_bbox_loss: 0.7131 - mrcnn_mask_loss: 1.0420
2018-09-12 11:42:01.132180: W tensorflow/core/framework/allocator.cc:108] Allocation of 80281600 exceeds 10% of system memory. 
3/100 [..............................] - ETA: 25:32 - loss: 2.4394 - rpn_class_loss: 0.0037 - rpn_bbox_loss: 0.0210 - mrcnn_class_loss: 0.6421 - mrcnn_bbox_loss: 0.6986 - mrcnn_mask_loss: 1.0741
2018-09-12 11:42:04.002321: W tensorflow/core/framework/allocator.cc:108] Allocation of 80281600 exceeds 10% of system memory. 
4/100 [>.............................] - ETA: 20:06 - loss: 2.2060 - rpn_class_loss: 0.0041 - rpn_bbox_loss: 0.0196 - mrcnn_class_loss: 0.4968 - mrcnn_bbox_loss: 0.6090 - mrcnn_mask_loss: 1.0765
2018-09-12 11:42:06.979433: W tensorflow/core/framework/allocator.cc:108] Allocation of 80281600 exceeds 10% of system memory. 
99/100 [============================>.] - ETA: 3s - loss: 0.4736 - rpn_class_loss: 0.0014 - rpn_bbox_loss: 0.0285 - mrcnn_class_loss: 0.0256 - mrcnn_bbox_loss: 0.2176 - mrcnn_mask_loss: 0.2005
/usr/local/lib/python3.6/dist-packages/keras/engine/training.py:2348: UserWarning: Using a generator with `use_multiprocessing=True` and multiple workers may duplicate your data. Please consider using the`keras.utils.Sequence class. 
UserWarning('Using a generator with `use_multiprocessing=True`' 
100/100 [==============================] - 424s 4s/step - loss: 0.4712 - rpn_class_loss: 0.0014 - rpn_bbox_loss: 0.0284 - mrcnn_class_loss: 0.0254 - mrcnn_bbox_loss: 0.2163 - mrcnn_mask_loss: 0.1996 - val_loss: 0.3452 - val_rpn_class_loss: 0.0013 - val_rpn_bbox_loss: 0.0684 - val_mrcnn_class_loss: 0.0016 - val_mrcnn_bbox_loss: 0.1338 - val_mrcnn_mask_loss: 0.1403 
Epoch 2/5 
100/100 [==============================] - 373s 4s/step - loss: 0.2107 - rpn_class_loss: 0.0011 - rpn_bbox_loss: 0.0244 - mrcnn_class_loss: 0.0035 - mrcnn_bbox_loss: 0.0764 - mrcnn_mask_loss: 0.1053 - val_loss: 0.2822 - val_rpn_class_loss: 6.7858e-04 - val_rpn_bbox_loss: 0.0805 - val_mrcnn_class_loss: 0.0036 - val_mrcnn_bbox_loss: 0.0786 - val_mrcnn_mask_loss: 0.1188
Epoch 3/5 
100/100 [==============================] - 375s 4s/step - loss: 0.1767 - rpn_class_loss: 7.3554e-04 - rpn_bbox_loss: 0.0270 - mrcnn_class_loss: 0.0034 - mrcnn_bbox_loss: 0.0509 - mrcnn_mask_loss: 0.0947 - val_loss: 0.2633 - val_rpn_class_loss: 5.3679e-04 - val_rpn_bbox_loss: 0.0980 - val_mrcnn_class_loss: 0.0036 - val_mrcnn_bbox_loss: 0.0500 - val_mrcnn_mask_loss: 0.1112 
Epoch 4/5 
100/100 [==============================] - 374s 4s/step - loss: 0.1554 - rpn_class_loss: 7.4969e-04 - rpn_bbox_loss: 0.0280 - mrcnn_class_loss: 0.0030 - mrcnn_bbox_loss: 0.0340 - mrcnn_mask_loss: 0.0897 - val_loss: 0.2709 - val_rpn_class_loss: 4.5353e-04 - val_rpn_bbox_loss: 0.1080 - val_mrcnn_class_loss: 0.0036 - val_mrcnn_bbox_loss: 0.0448 - val_mrcnn_mask_loss: 0.1140 
Epoch 5/5 
100/100 [==============================] - 374s 4s/step - loss: 0.1321 - rpn_class_loss: 7.3339e-04 - rpn_bbox_loss: 0.0230 - mrcnn_class_loss: 0.0028 - mrcnn_bbox_loss: 0.0213 - mrcnn_mask_loss: 0.0843 - val_loss: 0.2466 - val_rpn_class_loss: 4.1589e-04 - val_rpn_bbox_loss: 0.1037 - val_mrcnn_class_loss: 0.0017 - val_mrcnn_bbox_loss: 0.0272 - val_mrcnn_mask_loss: 0.1136
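
A quick way to read the log above: with every entry in LOSS_WEIGHTS set to 1.0, the reported loss is approximately the sum of the five component losses (rounding and any small regularization term account for the rest). The sketch below checks this for the final epoch. The parse_losses helper is made up for this example, and the numbers are copied from the Epoch 5/5 line.

```python
import re

def parse_losses(line):
    """Return {metric_name: value} for every 'name: number' pair in a Keras log line."""
    pairs = re.findall(r"(\w*loss): ([0-9.]+(?:e[+-]?\d+)?)", line)
    return {name: float(value) for name, value in pairs}

epoch5 = (
    "loss: 0.1321 - rpn_class_loss: 7.3339e-04 - rpn_bbox_loss: 0.0230 - "
    "mrcnn_class_loss: 0.0028 - mrcnn_bbox_loss: 0.0213 - mrcnn_mask_loss: 0.0843 - "
    "val_loss: 0.2466 - val_rpn_class_loss: 4.1589e-04 - val_rpn_bbox_loss: 0.1037 - "
    "val_mrcnn_class_loss: 0.0017 - val_mrcnn_bbox_loss: 0.0272 - val_mrcnn_mask_loss: 0.1136"
)

metrics = parse_losses(epoch5)
components = ["rpn_class_loss", "rpn_bbox_loss", "mrcnn_class_loss",
              "mrcnn_bbox_loss", "mrcnn_mask_loss"]

# Sum the five parts for the training and validation sets.
train_sum = sum(metrics[name] for name in components)
val_sum = sum(metrics["val_" + name] for name in components)

print(round(train_sum, 4), metrics["loss"])    # 0.1321 0.1321
print(round(val_sum, 4), metrics["val_loss"])  # 0.2466 0.2466
```

Over the five epochs the training loss falls from 0.4712 to 0.1321, while the validation loss falls more slowly, from 0.3452 to 0.2466, with a small rise at epoch 4. Most of the remaining validation loss comes from val_rpn_bbox_loss and val_mrcnn_mask_loss.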

RUN INFERENCE ON TEST DATASET

# Root directory of the project.
ROOT_DIR = os.getcwd()
# Add the project root to the path so that the local version of the Mask R-CNN library is found.
sys.path.append(ROOT_DIR)
# Path to the most recently saved trained weights.
custom_WEIGHTS_PATH = sorted(glob.glob("/logs/*/mask_rcnn_*.h5"))[-1]
%matplotlib inline
# Directory to save logs and the trained model.
MODEL_DIR = os.path.join(ROOT_DIR, "logs")
config = dog.DogConfig()
custom_DIR = os.path.join(ROOT_DIR, "dataset")
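
The sorted(glob.glob(...))[-1] expression picks the newest checkpoint because of how the files are named: the run directory embeds a timestamp (dog20180912T1141) and the epoch number is zero-padded (0005), so alphabetical order matches chronological order. The sketch below reproduces this with empty placeholder files in a temporary directory. The earlier run name dog20180911T0930 is invented for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as logs:
    # Two fake training runs: an older one with 2 epochs, a newer one with 5.
    for run, epochs in (("dog20180911T0930", 2), ("dog20180912T1141", 5)):
        os.makedirs(os.path.join(logs, run))
        for epoch in range(1, epochs + 1):
            name = "mask_rcnn_dog_%04d.h5" % epoch
            open(os.path.join(logs, run, name), "w").close()

    # Same pattern as above: sort all checkpoints and take the last one.
    candidates = sorted(glob.glob(os.path.join(logs, "*", "mask_rcnn_*.h5")))
    latest = os.path.relpath(candidates[-1], logs).replace(os.sep, "/")

print(latest)  # dog20180912T1141/mask_rcnn_dog_0005.h5
```

Note that this relies on the naming scheme. If the logs directory holds runs with different model names, alphabetical order no longer follows time, and sorting by modification time (os.path.getmtime) is a safer choice.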

Run detection on one image at a time. Setting GPU_COUNT and IMAGES_PER_GPU to 1 gives a batch size of 1, so detect() can be called with a single image.

class InferenceConfig(config.__class__):
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

config = InferenceConfig()
config.display()
# Device to load the neural network on. Useful if you're training a model on the
# same machine, in which case use the CPU and leave the GPU for training.
DEVICE = "/gpu:0"  # /cpu:0 or /gpu:0
# Inspect the model in training or inference mode.
# Values: 'inference' or 'training'
# TODO: code for 'training' test mode not ready yet
TEST_MODE = "inference"
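
Why override GPU_COUNT and IMAGES_PER_GPU? The batch size is derived from these two values, which is why the training configuration printed BATCH_SIZE 2 and the inference configuration prints BATCH_SIZE 1. The sketch below mimics that relationship with simplified stand-in classes. These are not the real mrcnn classes, and the detect function here is a hypothetical placeholder that only checks the batch-size constraint.

```python
class Config:
    """Simplified stand-in for a Mask R-CNN style config (not the real class)."""
    GPU_COUNT = 1
    IMAGES_PER_GPU = 2

    def __init__(self):
        # The effective batch size is derived from the two settings above.
        self.BATCH_SIZE = self.IMAGES_PER_GPU * self.GPU_COUNT


class InferenceConfig(Config):
    # One image per batch, so a single image can be passed for detection.
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1


def detect(images, config):
    """Hypothetical detect() that only enforces the batch-size constraint."""
    if len(images) != config.BATCH_SIZE:
        raise ValueError("len(images) must equal BATCH_SIZE")
    return ["result for " + str(image) for image in images]


training_batch = Config().BATCH_SIZE
inference_batch = InferenceConfig().BATCH_SIZE
print(training_batch, inference_batch)  # 2 1

# A single image works with the inference config but not with the training one.
single = detect(["image_0"], InferenceConfig())
try:
    detect(["image_0"], Config())
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(single, mismatch_rejected)
```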

Return a Matplotlib Axes array for visualizations in the notebook. This provides a central point to control graph sizes; adjust the size argument to control how large the images are rendered.

def get_ax(rows=1, cols=1, size=16):
    _, ax = plt.subplots(rows, cols, figsize=(size*cols, size*rows))
    return ax

Load the validation dataset from the directory structured to train the model.

dataset = dog.DogDataset()
dataset.load_dog(custom_DIR, "val")

prepare() must be called before the dataset is used; it prepares the dataset (class and image information) for use.

dataset.prepare()
print("Images: {}\nClasses: {}".format(len(dataset.image_ids), dataset.class_names))

Create a model in inference mode with MaskRCNN. RCNN stands for Region-based Convolutional Neural Network.

with tf.device(DEVICE):
    model = modellib.MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)

Loading weights using load_weights()

print("Loading weights ", custom_WEIGHTS_PATH)
model.load_weights(custom_WEIGHTS_PATH, by_name=True)

The visualization code changes often during development, so reload just the visualize module instead of restarting the notebook.

from importlib import reload
reload(visualize)
Using TensorFlow backend.
Configurations:
BACKBONE resnet101
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     1
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE         None
DETECTION_MAX_INSTANCES        100
DETECTION_MIN_CONFIDENCE       0.9
DETECTION_NMS_THRESHOLD        0.3
FPN_CLASSIF_FC_LAYERS_SIZE     1024
GPU_COUNT                      1
GRADIENT_CLIP_NORM             5.0
IMAGES_PER_GPU                 1
IMAGE_MAX_DIM                  1024
IMAGE_META_SIZE                14
IMAGE_MIN_DIM                  800
IMAGE_MIN_SCALE                0
IMAGE_RESIZE_MODE              square
IMAGE_SHAPE                    [1024 1024    3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
LOSS_WEIGHTS                   {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               100
MEAN_PIXEL                     [123.7 116.8 103.9]
MINI_MASK_SHAPE                (56, 56)
NAME                           dog
NUM_CLASSES                    2
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.5, 1, 2]
RPN_ANCHOR_SCALES              (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE              1
RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD              0.7
RPN_TRAIN_ANCHORS_PER_IMAGE    256
STEPS_PER_EPOCH                100
TOP_DOWN_PYRAMID_SIZE          256
TRAIN_BN                       False
TRAIN_ROIS_PER_IMAGE           200
USE_MINI_MASK                  True
USE_RPN_ROIS                   True
VALIDATION_STEPS               50
WEIGHT_DECAY                   0.0001


Images: 4
Classes: ['BG', 'dog']
Loading weights  /logs/dog20180912T1141/mask_rcnn_dog_0005.h5
Re-starting from epoch 5
<module 'mrcnn.visualize' from '/root/Mask_RCNN/mrcnn/visualize.py'>

TESTING

Testing of the model trained earlier.

for image_id in dataset.image_ids:
    # Load the image and its ground-truth annotations.
    image, image_meta, gt_class_id, gt_bbox, gt_mask = \
        modellib.load_image_gt(dataset, config, image_id, use_mini_mask=False)
    info = dataset.image_info[image_id]
    print("image ID: {}.{} ({}) {}".format(info["source"], info["id"], image_id,
                                           dataset.image_reference(image_id)))

    # Run object detection with detect() and store the output in results.
    results = model.detect([image], verbose=1)

    # Display the results with display_instances().
    ax = get_ax(1)
    r = results[0]
    visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'],
                                dataset.class_names, r['scores'], ax=ax,
                                title="Predictions")
    log("gt_class_id", gt_class_id)
    log("gt_bbox", gt_bbox)
    log("gt_mask", gt_mask)
image ID: dog.dog_002.jpg (0) /root/Mask_RCNN/dataset/val/dog_002.jpg
Processing 1 images 
image shape: (1024, 1024, 3) min: 0.00000 max: 255.00000 uint8 
molded_images shape: (1, 1024, 1024, 3) min: -123.70000 max: 151.10000 float64 
image_metas shape: (1, 14) min: 0.00000 max: 1024.00000 int64 
anchors shape: (1, 261888, 4) min: -0.35390 max: 1.29134 float32 
gt_class_id shape: (1,) min: 1.00000 max: 1.00000 int32 
gt_bbox shape: (1, 4) min: 187.00000 max: 683.00000 int32 
gt_mask shape: (1024, 1024, 1) min: 0.00000 max: 1.00000 bool
image ID: dog.dog_016.jpg (1) /root/Mask_RCNN/dataset/val/dog_016.jpg 
Processing 1 images 
image shape: (1024, 1024, 3) min: 0.00000 max: 255.00000 uint8 
molded_images shape: (1, 1024, 1024, 3) min: -123.70000 max: 150.10000 float64 
image_metas shape: (1, 14) min: 0.00000 max: 1024.00000 int64 
anchors shape: (1, 261888, 4) min: -0.35390 max: 1.29134 float32 
gt_class_id shape: (1,) min: 1.00000 max: 1.00000 int32 
gt_bbox shape: (1, 4) min: 141.00000 max: 821.00000 int32 
gt_mask shape: (1024, 1024, 1) min: 0.00000 max: 1.00000 bool 
image ID: dog.dog_020.jpg (2) /root/Mask_RCNN/dataset/val/dog_020.jpg 
Processing 1 images 
image shape: (1024, 1024, 3) min: 0.00000 max: 255.00000 uint8 
molded_images shape: (1, 1024, 1024, 3) min: -123.70000 max: 151.10000 float64 
image_metas shape: (1, 14) min: 0.00000 max: 1024.00000 int64 
anchors shape: (1, 261888, 4) min: -0.35390 max: 1.29134 float32 
gt_class_id shape: (1,) min: 1.00000 max: 1.00000 int32 
gt_bbox shape: (1, 4) min: 349.00000 max: 616.00000 int32 
gt_mask shape: (1024, 1024, 1) min: 0.00000 max: 1.00000 bool 
image ID: dog.dog_034.jpg (3) /root/Mask_RCNN/dataset/val/dog_034.jpg 
Processing 1 images 
image shape: (1024, 1024, 3) min: 0.00000 max: 255.00000 uint8 
molded_images shape: (1, 1024, 1024, 3) min: -123.70000 max: 151.10000 float64 
image_metas shape: (1, 14) min: 0.00000 max: 1024.00000 int64 
anchors shape: (1, 261888, 4) min: -0.35390 max: 1.29134 float32 
gt_class_id shape: (1,) min: 1.00000 max: 1.00000 int32 
gt_bbox shape: (1, 4) min: 221.00000 max: 751.00000 int32 
gt_mask shape: (1024, 1024, 1) min: 0.00000 max: 1.00000 bool 
image ID: dog.dog_046.jpg (4) /root/Mask_RCNN/dataset/val/dog_046.jpg 
Processing 1 images 
image shape: (1024, 1024, 3) min: 0.00000 max: 255.00000 uint8 
molded_images shape: (1, 1024, 1024, 3) min: -123.70000 max: 151.10000 float64 
image_metas shape: (1, 14) min: 0.00000 max: 1024.00000 int64 
anchors shape: (1, 261888, 4) min: -0.35390 max: 1.29134 float32 
gt_class_id shape: (1,) min: 1.00000 max: 1.00000 int32 
gt_bbox shape: (1, 4) min: 283.00000 max: 850.00000 int32 
gt_mask shape: (1024, 1024, 1) min: 0.00000 max: 1.00000 bool
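
The loop above loads the ground-truth boxes (gt_bbox) alongside the predictions (r['rois']), so a natural next step is to measure how well they agree. A common measure is intersection over union (IoU). The sketch below is a plain-Python illustration for two boxes in (y1, x1, y2, x2) order; the function name and both boxes are made up, and the Mask_RCNN repository ships its own evaluation helpers in mrcnn.utils that you would normally use instead.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as (y1, x1, y2, x2)."""
    # Corners of the overlapping rectangle.
    y1 = max(box_a[0], box_b[0])
    x1 = max(box_a[1], box_b[1])
    y2 = min(box_a[2], box_b[2])
    x2 = min(box_a[3], box_b[3])
    intersection = max(0, y2 - y1) * max(0, x2 - x1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

gt_box = (100, 100, 300, 300)    # made-up ground-truth box
pred_box = (150, 150, 350, 350)  # made-up predicted box
iou = box_iou(gt_box, pred_box)
print(round(iou, 4))  # 0.3913
```

An IoU of 1.0 means the boxes coincide and 0.0 means they do not overlap at all.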

[Images: image segmentation results]

FINAL THOUGHTS

In this article, we discussed instance segmentation with the help of examples and walked through training on a custom dataset. The first step is to split the images into training and validation sets and annotate them with tools such as LabelImg or VGG Image Annotator; this is followed by training and testing the model. Some applications of image segmentation are automatic traffic control, biometrics, and the inspection of electronic components and chips. It is an effective and rapidly developing technique. I hope this article helps you understand the basics and customization of instance segmentation.

Check out my other blogs for further articles.

Also to learn more about TensorFlow and Keras, refer to these blogs.

Thank you!!!
