
Image Processing Project

COLORIZATION THROUGH IMAGE PATTERNS USING DEEP LEARNING

Final Report Submission

Guided by Dr. Akila Victor

Submitted By:
1. Nevin Mathews Kuruvilla - 19BCE2507
2. Diva Bhatia - 19BCE2452
3. Nisheta Gupta - 19BCE2233
Introduction

In our project, we use a convolutional neural network (CNN) whose initial layers convolve the input image with a large number of filters; the filter outputs are stacked together, producing a large number of feature maps.

The network contains many layers, where the output of each layer is fed as the input to the next.

One of the most important differences between a CNN and a general neural network is that in a CNN, each so-called 'neuron' is connected to only a small region of the input image, and the neurons within a feature map share the same connection weights.

The values of the first set of filters are initialised randomly, and as the training error is minimised, the filters improve.
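As a minimal illustration of these ideas (a sketch, not code from the report), a single Keras convolution layer with 64 filters of size 3×3 produces 64 feature maps from a grayscale input while reusing the same small set of weights at every spatial position:

from tensorflow import keras

inp = keras.Input(shape=(224, 224, 1))              # a 224x224 grayscale image
conv = keras.layers.Conv2D(64, (3, 3), padding='same')
out = conv(inp)                                     # one 224x224 feature map per filter
print(out.shape)                                    # (None, 224, 224, 64)
print(conv.count_params())                          # 3*3*1*64 shared weights + 64 biases = 640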

Abstract

Convolutional neural networks have emerged as a standard in image classification problems.

Since this project depends mainly on identifying the patterns of an image and colourizing it accordingly, a CNN serves best, with error rates lower than 4% in the ImageNet challenge.

Today, colourization is often done by hand in Photoshop, and a single picture can take up to one month to colourize. It requires extensive research; a face alone may need up to 20 layers of pink, green and blue shades to get it just right. Colouring black-and-white or grayscale images by hand is therefore a slow and tedious process, whereas deep learning can do it much more quickly. Moreover, since the RGB colour space dominates present-day imaging, our project also helps in converting black-and-white images to RGB images.

CNNs owe much of their success to their capacity to learn. We believe these qualities lend themselves well to colourizing pictures, since object classes, patterns and shapes generally correlate with colour choice.
Proposed Methodology

In our proposed model, we intend to use a convolutional neural network whose initial layers convolve the input image with a set of filters, with the results stacked into a single volume. This combination will help us map the features of the image.

The model will use an autoencoder architecture, which consists of an encoder and a decoder, each built from its own CNN layers; this structure aids the training process. The model will be trained using black-and-white images mapped to their corresponding RGB colour channels.

We treat image colorization as a regression problem, with the colour channel values as the quantities to be estimated.

The main idea is to create a model that can convert black-and-white images to coloured images. The model must recognise which part of the image should be coloured in which colour. This is done using an autoencoder, which consists of two parts: an encoder and a decoder. The model is trained using black-and-white images as input and their corresponding coloured images as output. Images first pass into the encoder, where they are scaled down and their features are extracted. They are then sent to the decoder, which determines the colour of each part of the image. Both the encoder and the decoder are built from CNN layers. Once the model is sufficiently trained, we feed it black-and-white images and it returns coloured versions of them.

In the image colorization problem, there are mainly two approaches or techniques that can be
used to solve it:

● Turn the RGB image into a LAB image, then separate the L value and AB value from
the image and train the model
● Turn the RGB image into a LUV image, then separate the L value and UV value from
the image and train the model

In this project, we have decided to use the LAB method.

L - lightness

A - green-red spectrum

B - blue-yellow spectrum

Using OpenCV, we extract these channels and feed the L value to the neural network as the input feature. Since we treat this as a regression problem, the feature is the L value and the model has to predict the A and B values of the image.

In doing so, we are able to convert greyscale images back to RGB.
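As a small, self-contained sketch of this channel separation (the file name is illustrative):

import cv2

bgr = cv2.imread("example.jpg")             # OpenCV reads images as BGR
lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2Lab)  # convert to the Lab colour space
L, A, B = cv2.split(lab)                    # L: lightness, A: green-red, B: blue-yellow
# L is the model's input feature; A and B are the values to be predicted.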


Literature Survey

1. Image Colorization: A Survey and Dataset
Authors: Saeed Anwar, Muhammad Tahir, Chongyi Li, Ajmal Mian, Fahad Shahbaz Khan, Abdul Wahab Muzaffar
Advantages: The exceptional success of deep learning approaches has resulted in rapid growth of deep convolutional techniques for image colourization. Building on these innovations, various methods have been proposed to exploit network structures, training methods, learning paradigms, etc.
Disadvantages: Lack of appropriate evaluation metrics: colorization papers typically employ metrics from image restoration tasks, which may not be appropriate for the task at hand. Lack of a benchmark dataset: the public datasets were originally collected for tasks such as detection and classification, not image colorization, so the quality of the images may not be sufficient for colorization.

2. Image Colorization with Deep Convolutional Neural Networks
Authors: Jeff Hwang, You Zhou
Advantages: A deep CNN is able to learn basic filters automatically and combine them hierarchically to enable the description of latent concepts for pattern recognition.
Disadvantages: None noted.

3. Deep Koalarization: Image Colorization using CNNs and Inception-ResNet-v2
Authors: Federico Baldassarre, Diego Gonzalez Morín, Lucas Rodés-Guirao
Advantages: Shows that an end-to-end deep learning architecture can be suitable for some image colourization tasks.
Disadvantages: Since only a reduced subset of ImageNet was used, only a small portion of the spectrum of possible subjects is represented; performance on unseen images therefore depends highly on their specific contents.

4. Colorization using Network Ensemble
Authors: Cheng, Z., Q. Yang, and B. Sheng. 2017
Advantages: The paper reports a naturalness score of 58.75%, which is excellent compared to other colorization techniques. The authors used many post-processing enhancers, such as semantic histograms, clustering and GT, to increase the accuracy of the images.
Disadvantages: The proposed model can only accurately colourize natural scenes such as coasts, forests and highways; man-made structures failed to be colorized accurately. The authors used an MLP, which gives weaker results than newer, more robust neural networks.

5. Color and engagement in touristic Instagram pictures
Authors: Yu, Joanne, and Roman Egger. 2021
Advantages: By classifying tourism photos on Instagram using machine learning, the study uncovers the relationship between color and user engagement. The findings show that the presence of blue in photos featuring natural scenery, high-end gastronomy and sacral architecture contributes to user engagement; a red/orange colour scheme enhances pictures of local delicacies and ambience, while the coexistence of violet and warm colours is crucial for photographs of cityscapes and interior design. Taking a broad lens from aesthetic philosophy and narrowing down to color psychology, the study offers guidelines for marketers to promote tourism activities through the application of color.
Disadvantages: While the study tries to uncover user engagement across a multitude of colour spectrums, it does not consider edge cases such as black-and-white or red-tone filters.

6. Signature Image Hiding in Color Image using Steganography and Cryptography based on Digital Signature Concepts
Authors: Pramanik, Sabyasachi, Samir Kumar Bandyopadhyay, and Ramkrishna Ghosh
Advantages: Shows how steganography can be blended with the concept of a digital signature to provide better support for improved secrecy and safety of data (keywords: cryptography, encryption, decryption, steganography, stego-image).
Disadvantages: None noted.

7. Colorful Image Colorization
Authors: Zhang R., Isola P., Efros A.A. (2016)
Advantages: The authors' approach of converting a regression problem into a multinomial classification problem is very intuitive and actually produced better outcomes than a regression network. The network not only learns to colour but also learns a representation useful for object classification, detection and segmentation, and performed well compared to other self-supervised pre-training methods.
Disadvantages: The model fails to capture long-range consistency, frequently confuses red and blue, and defaults to a sepia tone on complex indoor scenes.

8. Bilateral Res-Unet for Image Colorization with Limited Data via GANs
Authors: Guo, Haojie, Zhe Guo, Zhaojun Pan, and Xuewen Liu. 2021
Advantages: The novel Bilateral ResUnet architecture proposed by the authors proved better than the existing U-Net, resolving the loss of information U-Net suffers from. The PatchGAN used as the discriminator also helps with image enhancement. The proposed model performed better than existing models trained on the same datasets.
Disadvantages: The model works well with the objects specified in the dataset but fails in real-life scenarios; for instance, it would not be able to colour a natural scene. There are no post-processing methodologies for image resolution or image correction.

9. A Reversible Watermarking System for Medical Color Images: Balancing Capacity, Imperceptibility, and Robustness
Authors: Zhou, Xiaoyi, et al. (2021)
Advantages: The proposed scheme is robust against common and geometric attacks and has a high embedding capacity without obvious distortion of the image. The paper contributes towards improving the security of medical images in remote healthcare.
Disadvantages: The proposed scheme does not scale to large organisations and cannot be applied to a huge number of images.
Dataset

The dataset that we intend to use is ImageNet, which contains 14,197,122 annotated images. We plan to train our deep learning model on this dataset. The colour channels of the images serve as the regression targets, and the model colourizes the black-and-white input. ImageNet consists of a wide range of objects, which we use both in their black-and-white and RGB colourspaces. With the huge size of the dataset and the diversity of the images, our model should be able to predict the colours of images regardless of their origin or type.

Block Diagram

To convert BW images to the RGB colorspace, we use multiple convolution blocks. Each block has two to three convolution layers followed by a rectified linear unit, terminating in a batch normalization layer. Our objective is to predict the RGB channels of the black-and-white images.

The convolution layers are sets of small learnable filters that help us identify patterns in an image. The layers close to the input look for simple patterns such as edges and outlines, while the layers close to the output look for more complex patterns such as luminance.
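A single such block could be sketched in Keras as follows (the filter count is illustrative; the full model appears in the Code section):

from keras.layers import Conv2D, BatchNormalization

def conv_block(x, filters):
    # two convolution layers with ReLU activations, closed by batch normalization
    x = Conv2D(filters, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(filters, (3, 3), activation='relu', padding='same')(x)
    return BatchNormalization()(x)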

The functionalities of the Encoder and Decoder are as follows:

Encoder:

● The encoder is responsible for converting the input images into lower dimensions.
● This is done by passing the image through multiple layers.
● The encoder also performs feature extraction by finding feature importance; this can be achieved using the feature importance of a Random Forest classifier.
● The encoder identifies the colour of the important features.

Decoder

● The decoder is responsible for bringing the output of the encoder back to the original colorspace.
● The decoder has two parts: a classifier and a judge.
● During training, the classifier provides the RGB probabilities of a particular feature.
● The judge is also deployed during training and provides the colour probabilities of a particular feature.
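A minimal sketch of this encoder/decoder layout, assuming the 224×224 grayscale input used later (the layer counts here are illustrative, not the final model):

from keras.layers import Conv2D, MaxPooling2D, UpSampling2D, Input
from keras.models import Model

inp = Input(shape=(224, 224, 1))
# Encoder: convolution plus pooling reduces the spatial dimensions
x = Conv2D(32, (3, 3), activation='relu', padding='same')(inp)
x = MaxPooling2D((2, 2), padding='same')(x)                    # 224 -> 112
# Decoder: upsampling restores the original resolution
x = UpSampling2D((2, 2))(x)                                    # 112 -> 224
out = Conv2D(2, (3, 3), activation='tanh', padding='same')(x)  # predicted A and B channels
autoencoder = Model(inp, out)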
Algorithm

Code

!pip uninstall keras -y


!pip uninstall keras-nightly -y
!pip uninstall keras-Preprocessing -y
!pip uninstall keras-vis -y
!pip uninstall tensorflow -y
!pip install tensorflow==2.3.0
!pip install keras==2.4
!pip install opencv-python

import numpy as np
import pandas as pd
import os
import cv2
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tensorflow import keras
from keras import backend as K
from keras.layers import Conv2D, MaxPooling2D, UpSampling2D, Input, BatchNormalization
from keras.layers.merge import concatenate
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator

print(os.listdir(r"D:\Documents\SEM6\Image Processing\Project\dataset\dataset_updated"))

ImagePath = r"D:\Documents\SEM6\Image Processing\Project\dataset\dataset_updated\training_set\painting/"

img = cv2.imread(ImagePath + "1179.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR; convert to RGB for display
img = cv2.resize(img, (224, 224))
plt.imshow(img)
img.shape

HEIGHT = 224
WIDTH = 224
ImagePath = r"D:\Documents\SEM6\Image Processing\Project\dataset\dataset_updated\training_set\painting/"

def ExtractInput(path):
    X_img = []
    y_img = []
    for imageDir in os.listdir(ImagePath):
        try:
            img = cv2.imread(ImagePath + imageDir)
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
            img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)

            # OpenCV expects float images in [0, 1] for Lab conversion
            img = img.astype(np.float32) / 255.0
            img_lab = cv2.cvtColor(img, cv2.COLOR_RGB2Lab)
            img_lab_rs = cv2.resize(img_lab, (WIDTH, HEIGHT))  # resize image to network input size
            img_l = img_lab_rs[:, :, 0]    # pull out L channel
            img_ab = img_lab_rs[:, :, 1:]  # extract the ab channels
            img_ab = img_ab / 128          # scale a and b into roughly [-1, 1] for the tanh output
            X_img.append(img_l)
            y_img.append(img_ab)
        except Exception:
            # skip unreadable or corrupt files
            pass
    X_img = np.array(X_img)
    y_img = np.array(y_img)

    return X_img, y_img

X_, y_ = ExtractInput(ImagePath)
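# Sanity check (an editorial addition, not in the original listing):
# X_ should have shape (N, 224, 224) and y_ shape (N, 224, 224, 2).
print(X_.shape, y_.shape)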

K.clear_session()

def InstantiateModel(in_):
    model_ = Conv2D(16, (3, 3), activation='relu', padding='same', strides=1)(in_)
    model_ = Conv2D(32, (3, 3), activation='relu', padding='same', strides=1)(model_)
    model_ = BatchNormalization()(model_)
    model_ = MaxPooling2D(pool_size=(2, 2), padding='same')(model_)  # 224 -> 112

    model_ = Conv2D(64, (3, 3), activation='relu', padding='same', strides=1)(model_)
    model_ = BatchNormalization()(model_)
    model_ = MaxPooling2D(pool_size=(2, 2), padding='same')(model_)  # 112 -> 56

    model_ = Conv2D(128, (3, 3), activation='relu', padding='same', strides=1)(model_)
    model_ = BatchNormalization()(model_)

    model_ = Conv2D(256, (3, 3), activation='relu', padding='same', strides=1)(model_)
    model_ = BatchNormalization()(model_)

    model_ = UpSampling2D((2, 2))(model_)  # 56 -> 112

    model_ = Conv2D(128, (3, 3), activation='relu', padding='same', strides=1)(model_)
    model_ = BatchNormalization()(model_)

    model_ = UpSampling2D((2, 2))(model_)  # 112 -> 224

    model_ = Conv2D(64, (3, 3), activation='relu', padding='same', strides=1)(model_)
    # skip connection: concatenate the decoded features with the original L input
    concat_ = concatenate([model_, in_])

    model_ = Conv2D(64, (3, 3), activation='relu', padding='same', strides=1)(concat_)
    model_ = BatchNormalization()(model_)

    model_ = Conv2D(32, (3, 3), activation='relu', padding='same', strides=1)(model_)

    # two output channels (a and b), tanh keeps them in [-1, 1]
    model_ = Conv2D(2, (3, 3), activation='tanh', padding='same', strides=1)(model_)

    return model_

Input_Sample = Input(shape=(HEIGHT, WIDTH, 1))

Output_ = InstantiateModel(Input_Sample)
Model_Colourization = Model(inputs=Input_Sample, outputs=Output_)

Model_Colourization.compile(optimizer="adam", loss='mean_squared_error')
Model_Colourization.summary()

def GenerateInputs(X_, y_):
    # loop forever so the generator is never exhausted during training
    while True:
        for i in range(len(X_)):
            X_input = X_[i].reshape(1, 224, 224, 1)
            y_input = y_[i].reshape(1, 224, 224, 2)
            yield (X_input, y_input)

Model_Colourization.fit_generator(GenerateInputs(X_, y_), epochs=53, verbose=1, steps_per_epoch=38, shuffle=True)
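# Note: fit_generator is deprecated in TensorFlow 2.x; an equivalent call
# using the same generator would be:
# Model_Colourization.fit(GenerateInputs(X_, y_), epochs=53, verbose=1, steps_per_epoch=38)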

TestImagePath = r"D:\Documents\SEM6\Image Processing\Project\dataset\dataset_updated\training_set\iconography/"

def ExtractTestInput(ImagePath):
    img = cv2.imread(ImagePath)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    img_ = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
    # match the training preprocessing: float in [0, 1] before the Lab conversion
    img_ = img_.astype(np.float32) / 255.0
    img_ = cv2.cvtColor(img_, cv2.COLOR_RGB2Lab)
    img_lab_rs = cv2.resize(img_, (WIDTH, HEIGHT))  # resize image to network input size
    img_l = img_lab_rs[:, :, 0]                     # pull out L channel
    img_l_reshaped = img_l.reshape(1, 224, 224, 1)

    return img_l_reshaped

ImagePath = TestImagePath + "15.jpg"
image_for_test = ExtractTestInput(ImagePath)
Prediction = Model_Colourization.predict(image_for_test)
Prediction = Prediction * 128  # undo the /128 scaling applied during training
Prediction = Prediction.reshape(224, 224, 2)

plt.figure(figsize=(30, 20))

plt.subplot(5, 5, 1)
img = cv2.imread(TestImagePath + "15.jpg")
img_1 = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.cvtColor(img_1, cv2.COLOR_RGB2GRAY)
img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
img = cv2.resize(img, (224, 224))
plt.imshow(img)

plt.subplot(5, 5, 2)
# rebuild a Lab image from the grayscale L channel and the predicted a/b channels
img_ = cv2.cvtColor(img.astype(np.float32) / 255.0, cv2.COLOR_RGB2Lab)
img_[:, :, 1:] = Prediction
img_ = cv2.cvtColor(img_, cv2.COLOR_Lab2RGB)
img_ = np.clip(img_, 0, 1)  # keep RGB in the valid display range
plt.title("Predicted Image")
plt.imshow(img_)

plt.subplot(5, 5, 3)
plt.title("Ground truth")
plt.imshow(img_1)
Results
Conclusions

As we can see, the neural network does a fairly decent job of colorizing a black-and-white image. The model can be improved further by adding more layers, tuning hyperparameters with a grid search, and using methods such as PCA for better image preprocessing. The model is reasonably accurate and provides an insight into how black-and-white images would look in colour.
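As a sketch of what such hyperparameter tuning might look like (the build_model helper and the value range below are hypothetical, reusing GenerateInputs, X_ and y_ from the Code section):

from tensorflow import keras

def build_model(base_filters):
    # hypothetical helper: rebuilds a small colourization network with a
    # different number of filters in its first convolution layer
    inp = keras.Input(shape=(224, 224, 1))
    x = keras.layers.Conv2D(base_filters, (3, 3), activation='relu', padding='same')(inp)
    out = keras.layers.Conv2D(2, (3, 3), activation='tanh', padding='same')(x)
    return keras.Model(inp, out)

best_loss, best_filters = float('inf'), None
for base_filters in [16, 32, 64]:  # grid over a single hyperparameter
    model = build_model(base_filters)
    model.compile(optimizer='adam', loss='mean_squared_error')
    history = model.fit(GenerateInputs(X_, y_), epochs=1, steps_per_epoch=38, verbose=0)
    loss = history.history['loss'][-1]
    if loss < best_loss:
        best_loss, best_filters = loss, base_filters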

Future Work

In the future, this can be used to recreate old paintings, and to view old black-and-white movies in colour.
This will have a huge impact on historical paintings, as people will be able to understand how people lived in the past; ancient buildings and architecture depicted in paintings that no longer exist, or have been destroyed, can also be better understood through colorization.
This will find use in cyber forensics for investigation purposes.
The model can be exposed as an API that anyone, anywhere, can use.
