Colorization Through Image Patterns Using Deep Learning
Submitted By:
1. NevinMathews Kuruvilla - 19BCE2507
2. Diva Bhatia -19BCE2452
3. Nisheta Gupta- 19BCE2233
Introduction
In our project, we use a Convolutional Neural Network (CNN) in which the initial layers convolve the input image with a large number of filters, and the filter responses are stacked into a set of feature maps.
The network contains many layers, and the output of each layer is fed as the input to the next layer.
One of the most important differences between a CNN and a general neural network is that in a CNN each so-called 'neuron' is connected to only a small region of the input image, and the neurons belonging to one filter share the same connection weights.
The filter values are initialized randomly, and as the training error is minimized the filters progressively improve.
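To make this concrete, the sketch below (plain NumPy, with illustrative random filter values rather than anything taken from our model) shows how convolving one grayscale image with a bank of filters produces a stack of feature maps:

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid 2D convolution of a grayscale image with a single filter."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow), dtype=np.float32)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(8, 8).astype(np.float32)

# A small bank of 3x3 filters, initialized randomly as described above.
filters = [np.random.randn(3, 3).astype(np.float32) for _ in range(4)]

# One feature map per filter, stacked along a new channel axis.
feature_maps = np.stack([convolve2d(image, f) for f in filters], axis=-1)
print(feature_maps.shape)  # (6, 6, 4): one 6x6 map per filter
```

In a real CNN the loop is replaced by an optimized convolution kernel and the filter values are updated by backpropagation, but the stacking of per-filter responses into feature maps is exactly this.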
Abstract
Today, colourization is often done by hand in Photoshop. A single picture can take up to a month to colourize and requires extensive research; a face alone may need up to 20 layers of pink, green, and blue shades to get it just right. Colouring black-and-white or grayscale images is therefore a slow and tedious process. With deep learning, it can be done much more quickly. Since most images today are represented in the RGB colour space, our project also helps in converting black-and-white images to RGB images.
CNNs owe much of their success to their ability to learn visual features. We believe these qualities lend themselves well to colourizing pictures, since object classes, patterns, and shapes generally correlate with the choice of colour.
Proposed Methodology
In our proposed model, we intend to use a Convolutional Neural Network where the initial layers convolve the input image with a set of filters whose responses are stacked into feature maps. This combination will help us map the features of the image.
The model will use an autoencoder architecture, which consists of an encoder and a decoder. Each stage is built from CNN layers, which aids the training process. The model will be trained on black-and-white images mapped to their corresponding RGB colour channels.
We treat image colorization as a regression problem in which the values to be estimated are the colour channel values.
The main idea is to create a model that can convert black-and-white images to coloured images. The model must be able to recognize which part of the image should be coloured in which colour. This is done using an autoencoder, which consists of two parts: an encoder and a decoder. The model is trained using black-and-white images as input and their corresponding coloured images as output. Images first pass into the encoder, where the image is scaled down and its features are extracted. The features are then sent to the decoder, which determines the colour of each region. Both the encoder and the decoder are built from CNN layers. Once the model is sufficiently trained, we input black-and-white images and it returns coloured versions of them.
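The encoder/decoder flow described above can be sketched in terms of shapes alone (a NumPy toy with no learned weights; the 2x pooling and upsampling factors are illustrative, chosen only to mirror the down/up path):

```python
import numpy as np

def downsample(x):
    """Encoder step: halve spatial resolution with 2x2 average pooling."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Decoder step: double spatial resolution by nearest-neighbour repeat."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

gray = np.random.rand(224, 224).astype(np.float32)      # black-and-white input
code = downsample(downsample(gray))                     # encoder: 224 -> 56
ab = np.stack([upsample(upsample(code))] * 2, axis=-1)  # decoder: 56 -> 224, 2 colour channels
print(code.shape, ab.shape)  # (56, 56) (224, 224, 2)
```

In the actual model, learned convolution layers replace the fixed pooling and repeating, but the shape trajectory (compress, then expand back to two colour channels at input resolution) is the same.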
In the image colorization problem, there are mainly two approaches or techniques that can be
used to solve it:
● Convert the RGB image into a LAB image, then separate the L channel from the AB channels and train the model
● Convert the RGB image into a LUV image, then separate the L channel from the UV channels and train the model
L - Lightness
A - green-red spectrum
B - blue-yellow spectrum
Using OpenCV, we will extract these channels and feed the L value to the neural network as the input feature. Since we treat this as a regression problem, the feature is the L value and the model has to predict the A and B values of the image.
The dataset that we intend to use is the ImageNet dataset, which contains 14,197,122 annotated images. We plan to train our deep learning model on this dataset, using the colour channels of each image as the regression targets so that the model learns to colourize black-and-white images. ImageNet consists of random objects, available both as grayscale and in the RGB colourspace. Given the huge size of the dataset and the variety of the images, our model should be able to predict the colours of images regardless of their origin or type.
Block Diagram
To convert BW images to the RGB colorspace, we use multiple convolution blocks. Each block has two to three convolution layers, each followed by a rectified linear unit, and terminates in a batch normalization layer. Our objective is to predict the colour channels of the black-and-white images.
The convolution layers are sets of small learnable filters that help us identify patterns in an image. The layers close to the input look for simple patterns such as edges and outlines, while the layers close to the output look for more complex patterns and structures.
Encoder:
● The encoder is responsible for converting the input images into a lower-dimensional representation.
● This is done by passing the image through multiple layers.
● Feature extraction is also performed in the encoder by estimating feature importance; this can be achieved using the feature-importance scores of a Random Forest Classifier.
● The encoder identifies the colour of the important features.
Decoder
● The decoder is responsible for bringing the output of the encoder back to the original colorspace.
● The decoder has two parts: a classifier and a judge.
● During training, the classifier provides the RGB probabilities of a particular feature.
● The judge is also deployed during training and provides the colour probabilities of a particular feature.
Algorithm
Code
import numpy as np
import pandas as pd
import os
import cv2
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tensorflow import keras
from keras import backend as K
from keras.layers import Conv2D, MaxPooling2D, UpSampling2D, Input, BatchNormalization
from keras.layers import concatenate
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator
print(os.listdir(r"D:\Documents\SEM6\Image Processing\Project\dataset\dataset_updated"))
ImagePath = r"D:\Documents\SEM6\Image Processing\Project\dataset\dataset_updated\training_set\painting/"
img = cv2.imread(ImagePath+"1179.jpg")
img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (224, 224))
plt.imshow(img)
img.shape
HEIGHT=224
WIDTH=224
ImagePath = r"D:\Documents\SEM6\Image Processing\Project\dataset\dataset_updated\training_set\painting/"
def ExtractInput(path):
    X_img = []
    y_img = []
    for imageDir in os.listdir(path):
        try:
            img = cv2.imread(path + imageDir)
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
            img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
            img = img.astype(np.float32) / 255.0  # scale to [0, 1] so RGB2Lab gives L in [0, 100]
            img_lab = cv2.cvtColor(img, cv2.COLOR_RGB2Lab)
            img_lab_rs = cv2.resize(img_lab, (WIDTH, HEIGHT))  # resize image to network input size
            img_l = img_lab_rs[:, :, 0]    # pull out the L channel
            img_ab = img_lab_rs[:, :, 1:]  # extract the ab channels
            img_ab = img_ab / 128          # normalize ab to roughly [-1, 1]
            X_img.append(img_l)
            y_img.append(img_ab)
        except Exception:
            pass  # skip unreadable files
    X_img = np.array(X_img)
    y_img = np.array(y_img)
    return X_img, y_img
X_,y_ = ExtractInput(ImagePath)
K.clear_session()
def InstantiateModel(in_):
    model_ = Conv2D(16, (3, 3), activation='relu', padding='same', strides=1)(in_)
    model_ = Conv2D(32, (3, 3), activation='relu', padding='same', strides=1)(model_)
    model_ = BatchNormalization()(model_)
    model_ = MaxPooling2D(pool_size=(2, 2), padding='same')(model_)
    model_ = Conv2D(64, (3, 3), activation='relu', padding='same', strides=1)(model_)
    model_ = BatchNormalization()(model_)
    model_ = MaxPooling2D(pool_size=(2, 2), padding='same')(model_)
    model_ = Conv2D(128, (3, 3), activation='relu', padding='same', strides=1)(model_)
    model_ = BatchNormalization()(model_)
    model_ = Conv2D(256, (3, 3), activation='relu', padding='same', strides=1)(model_)
    model_ = BatchNormalization()(model_)
    # Upsample back to the input resolution before predicting the two colour channels.
    model_ = UpSampling2D((2, 2))(model_)
    model_ = Conv2D(64, (3, 3), activation='relu', padding='same', strides=1)(model_)
    model_ = BatchNormalization()(model_)
    model_ = UpSampling2D((2, 2))(model_)
    model_ = Conv2D(32, (3, 3), activation='relu', padding='same', strides=1)(model_)
    # tanh keeps the predicted ab values in [-1, 1], matching the ab/128 normalization.
    model_ = Conv2D(2, (3, 3), activation='tanh', padding='same', strides=1)(model_)
    return model_
# Build the model from the function above before compiling it.
Input_Sample = Input(shape=(HEIGHT, WIDTH, 1))
Output_ = InstantiateModel(Input_Sample)
Model_Colourization = Model(inputs=Input_Sample, outputs=Output_)
Model_Colourization.compile(optimizer="adam", loss='mean_squared_error')
Model_Colourization.summary()
def GenerateInputs(X_, y_):
    while True:  # loop indefinitely so the generator is never exhausted between epochs
        for i in range(len(X_)):
            X_input = X_[i].reshape(1, 224, 224, 1)
            y_input = y_[i].reshape(1, 224, 224, 2)
            yield (X_input, y_input)

Model_Colourization.fit_generator(GenerateInputs(X_, y_), epochs=53, verbose=1, steps_per_epoch=38, shuffle=True)
TestImagePath = r"D:\Documents\SEM6\Image Processing\Project\dataset\dataset_updated\training_set\iconography/"
def ExtractTestInput(image_path):
    img = cv2.imread(image_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    img_ = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
    img_ = img_.astype(np.float32) / 255.0  # match the training preprocessing
    img_ = cv2.cvtColor(img_, cv2.COLOR_RGB2Lab)
    img_lab_rs = cv2.resize(img_, (WIDTH, HEIGHT))  # resize image to network input size
    img_l = img_lab_rs[:, :, 0]  # pull out the L channel
    img_l_reshaped = img_l.reshape(1, 224, 224, 1)
    return img_l_reshaped
ImagePath = TestImagePath + "15.jpg"
image_for_test = ExtractTestInput(ImagePath)
Prediction = Model_Colourization.predict(image_for_test)
Prediction = Prediction * 128  # undo the ab/128 normalization
Prediction = Prediction.reshape(224, 224, 2)
plt.figure(figsize=(30, 20))
plt.subplot(5, 5, 1)
img = cv2.imread(TestImagePath + "15.jpg")
img_1 = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.cvtColor(img_1, cv2.COLOR_RGB2GRAY)
img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
img = cv2.resize(img, (224, 224))
plt.title("Grayscale input")
plt.imshow(img)
plt.subplot(5, 5, 2)
img_ = cv2.cvtColor(img.astype(np.float32) / 255.0, cv2.COLOR_RGB2Lab)
img_[:, :, 1:] = Prediction  # replace the ab channels with the predicted values
img_ = cv2.cvtColor(img_, cv2.COLOR_Lab2RGB)
plt.title("Predicted Image")
plt.imshow(np.clip(img_, 0, 1))
plt.subplot(5, 5, 3)
plt.title("Ground truth")
plt.imshow(img_1)
Results
Conclusions
As we can see, the neural network does a fairly decent job of colorizing a black-and-white image. The model could be improved further by adding more layers, applying hyperparameter tuning with grid search, and using methods such as PCA for better image preprocessing. The model is reasonably accurate and provides insight into how black-and-white images would look in colour.
Future Work
In the future, this approach can be used to recreate old paintings, and to view old black-and-white movies in colour.
This could have a large impact on the study of historical paintings, as people would be able to understand how people lived and adapted in the past. Ancient buildings and architecture shown in paintings that no longer exist, or have been destroyed, could also be better understood.
It may also find use in cyber forensics for investigation purposes.
The model could be exposed as an API so that anyone, anywhere, can use it.