1 Phase 1.1

ACHARYA INSTITUTE OF TECHNOLOGY
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

(Affiliated to Visvesvaraya Technological University, Belagavi, Approved by AICTE, New Delhi and Accredited by NBA & NAAC)
Acharya Dr. Sarvepalli Radhakrishnan Road, Achitnagar Post, Soladevanahalli, BENGALURU – 560107
Academic Project Phase-1.1 (18CSP77) Presentation on

"Deep Learning-based Image Super-Resolution with Real ESRGAN and PyTorch Framework"
Presented by:
1. Sayebul Emuun Dar (1AY20CS136)
2. Shamanth K V (1AY20CS140)
3. Srinivas G (1AY20CS161)
4. Yashaswini B (1AY20CS189)

Under the guidance of:
Mrs. Rakshitha B T
Assistant Professor
Department of CS&E
1 Department of CS&E, Acharya Institute of Technology 11-Dec-23

AGENDA
❖ The primary objective of our project is to enhance video quality without requiring
additional storage.
❖ Our goal is to improve the performance of the existing methodology.
❖ We are utilizing PyTorch for increased efficiency.

INTRODUCTION TO THE PROJECT
INTRODUCTION
1.1. PROBLEM DEFINITION:
➢ Digital Media and Entertainment:
➢ Historical and Archival Imagery:

1.2. OVERVIEW OF THE TECHNICAL AREA:
• Super-Resolution Techniques:
Real-ESRGAN leverages advanced deep learning for superior image super-resolution in videos.
• Real-ESRGAN Architecture:
Evolution of ESRGAN with user-controlled parameters and adversarial training for realistic results.
• Application Methodology:
Application of Real-ESRGAN to video frames, adapting to diverse content for improved visual fidelity.
• Optimization and Fine-Tuning:
Ongoing efforts to tune Real-ESRGAN for clearer video output.

1.3. OVERVIEW OF EXISTING SYSTEM:
The current image super-resolution system relies on traditional interpolation methods, such as bilinear and bicubic
interpolation, which often produce blurred and artifact-prone results. These techniques lack adaptability to diverse image
content and struggle with non-linear patterns. The existing system is limited in scalability, fails to prioritize perceptual
quality, and lacks user interaction.
➢ Key Limitations:
• Conventional Interpolation: Produces blurred and unrealistic results.
• Limited Adaptability: Struggles with diverse image content.
• Artifact Generation: Introduces aliasing and smoothing artifacts.
• Scalability Issues: Does not take advantage of advancements in hardware.
• Perceptual Quality Ignored: Favors numerical metrics over visual appeal.
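
The blurring that interpolation introduces can be seen in a small example. The following is a minimal sketch of bilinear upscaling in plain Python, assuming a grayscale image stored as a list of rows and an integer scale factor; the function name `bilinear_upscale` and the pixel-centre alignment convention are illustrative choices, not part of the existing system.

```python
def bilinear_upscale(image, scale):
    """Upscale a 2-D grayscale image (list of rows of floats) by an integer factor."""
    src_h, src_w = len(image), len(image[0])
    dst_h, dst_w = src_h * scale, src_w * scale
    out = []
    for y in range(dst_h):
        # Map the output pixel centre back into source coordinates (clamped to the image).
        sy = min(max((y + 0.5) / scale - 0.5, 0.0), src_h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        wy = sy - y0
        row = []
        for x in range(dst_w):
            sx = min(max((x + 0.5) / scale - 0.5, 0.0), src_w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            wx = sx - x0
            # Blend the four nearest source pixels, weighted by distance.
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


# A sharp vertical edge: left half dark (0.0), right half bright (1.0).
edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
upscaled = bilinear_upscale(edge, 2)
print(upscaled[0])
```

In the upscaled row, the step from 0.0 to 1.0 is replaced by intermediate values, which is the softening of edges described above. Every output pixel is a weighted average of its neighbours, so the method cannot create detail that is absent from the input.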

1.4. OVERVIEW OF PROPOSED SYSTEM:
➢ Real-ESRGAN Integration:
The project incorporates the Real-Enhanced Super-Resolution Generative Adversarial Network architecture, a state-of-the-art deep learning model for image super-resolution.
➢ PyTorch Framework Utilization:
Real-ESRGAN is implemented within the PyTorch framework, offering flexibility, dynamic computation, and an extensive ecosystem for deep learning.
➢ Adversarial Training:
Adversarial training is employed, introducing a generator and discriminator to improve the realism and quality of the enhanced images.

➢ Quantitative and Perceptual Evaluation:
➢ Redefined Image Super-Resolution:
➢ Mitigation of Existing System Deficiencies:

LITERATURE SURVEY
S.N | PAPER TITLE & PUBLICATION DETAILS | NAME OF THE AUTHORS | TECHNICAL IDEAS / ALGORITHMS ACQUIRED FROM THE PAPER USEFUL IN DESIGNING THE PROPOSED SYSTEM
1 | A new resolution enhancement method for sandstone thin-section images using perceptual GAN | Ye Liu, Chao Guo, Jie Cao, Zhong Cheng, Xiangxiang Ding, Lintao Lv, Fan Li, Meichen Gong | Standard GAN discriminator architecture, Perceptual GAN SR workflow
2 | ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks | Xintao Wang, Ke Yu, Shixiang Wu, Jinjin Gu, Yihao Liu | Relativistic Discriminator, Perceptual Loss, Network Interpolation

REQUIREMENTS SPECIFICATION
FUNCTIONAL REQUIREMENTS
➢ Real-ESRGAN Integration
➢ User Input
➢ Image Enhancement
➢ Adversarial Training
➢ Perceptual Loss Functions
➢ Diverse Training Dataset
➢ Performance Evaluation
➢ Scalability Considerations
➢ Output Visualization
➢ Error Handling
NON-FUNCTIONAL REQUIREMENTS
➢ Performance: Performance is measured mainly in terms of time and space. The project uses very little storage space, and operations are expected to complete within a fraction of a second. No out-of-memory issues are expected.
➢ Security: Security and authorization are major concerns for all computerized applications. As the details are confidential, no malicious user must be allowed to operate the system.
➢ Portable: The system can be used on all operating systems that support Python.

SOFTWARE REQUIREMENTS
➢ PyCharm or VS Code IDE
➢ PyTorch framework
➢ Python 3.11 or later
➢ Python libraries

HARDWARE REQUIREMENTS
➢ Central Processing Unit (CPU):
Requirement: Multi-core processor (e.g., quad-core or higher)
Image processing tasks often benefit from parallel processing. A multi-core CPU enhances the speed and efficiency of image-related computations.
➢ Graphics Processing Unit (GPU):
Requirement: Dedicated GPU with CUDA support (NVIDIA) or ROCm support (AMD)
➢ Display:
Requirement: High-resolution monitor (e.g., 1920 x 1080 or higher)
A high-resolution display ensures accurate visualization of images and helps in detailed analysis during the image processing workflow.
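
Whether the GPU requirement is met on a given machine can be checked at start-up. The following is a minimal sketch, assuming PyTorch may or may not be installed; the helper name `pick_device` is hypothetical, while `torch.cuda.is_available()` is the standard PyTorch call for detecting a usable GPU.

```python
import importlib.util


def pick_device():
    """Return "cuda" if PyTorch is installed and reports a usable GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so only CPU work is possible
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```

Falling back to the CPU keeps the program usable on machines without a supported GPU, although processing will be slower.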

PROPOSED METHODOLOGY
ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks
Pipeline for the image super-resolution task, based on a frequently cited paper, "ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks" (Xintao Wang et al.), published in 2018.
In brief, image super-resolution (SR) techniques reconstruct a higher-resolution (HR) image or sequence from the observed lower-resolution (LR) images, e.g. upscaling a 720p image to 1080p.
One common approach to this task is to use deep convolutional neural networks capable of recovering HR images from LR ones. ESRGAN (Enhanced SRGAN) is one such network. Key points of ESRGAN:
➢ SRResNet-based architecture with residual-in-residual blocks;
➢ Mixture of context, perceptual, and adversarial losses. Context and perceptual losses are used for proper image upscaling, while the adversarial loss pushes the neural network towards the natural image manifold using a discriminator network that is trained to differentiate between the super-resolved images and original photo-realistic images.
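
The 720p to 1080p example can be worked through numerically. The sketch below assumes the usual 16:9 frame sizes (1280 x 720 and 1920 x 1080); the table `RESOLUTIONS` and the helper `upscale_stats` are illustrative names only.

```python
from fractions import Fraction

# Assumed 16:9 frame sizes as (width, height) in pixels.
RESOLUTIONS = {"720p": (1280, 720), "1080p": (1920, 1080)}


def upscale_stats(src, dst):
    """Return the per-axis scale factor and the ratio of total pixel counts."""
    (sw, sh), (dw, dh) = RESOLUTIONS[src], RESOLUTIONS[dst]
    linear = Fraction(dh, sh)
    pixels = Fraction(dw * dh, sw * sh)
    return linear, pixels


linear, pixels = upscale_stats("720p", "1080p")
print(float(linear), float(pixels))  # prints: 1.5 2.25
```

Under these assumptions the model must produce 2.25 times as many pixels as it receives, so more than half of the output pixels have no direct counterpart in the input and have to be inferred.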

ESRGAN

Algorithm
1. Initialize the Generator (G) and Discriminator (D) networks with random weights.
2. Define the loss functions (adversarial loss, content loss) and set the hyperparameters.
3. Obtain training data: high-resolution (HR) images and corresponding low-resolution (LR) images.
4. Preprocess data: crop the HR images and downscale them to create the LR images.
5. Train the network:
   for each epoch do
     for each batch of m image pairs do
       a. Sample a batch of LR images and corresponding HR images.
       b. Generate super-resolved (SR) images from LR using the generator: SR = G(LR).
       c. Train the discriminator:
          - Compute discriminator loss: L_D = -(1/m) * Σ [log(D(HR)) + log(1 - D(SR))].
          - Update discriminator weights to minimize L_D.
       d. Train the generator:
          - Compute adversarial loss: L_adv = -(1/m) * Σ log(D(SR)).
          - Compute content loss: L_content = ||VGG19(HR) - VGG19(SR)||_2^2.
          - Compute total loss: L_G = 10^-3 * L_adv + L_content.
          - Update generator weights to minimize L_G.
     end for
   end for
6. Save the trained generator for future use.
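
The loss computations in steps 5c and 5d can be illustrated without any deep learning library. The following is a minimal sketch in plain Python, assuming the discriminator outputs are already available as probabilities in (0, 1) and the VGG19 features as flat lists of numbers. All function names are hypothetical, the content loss is assumed to be a mean squared difference, and the small `eps` term is added only to avoid taking log(0).

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Step 5c: -(1/m) * sum(log D(HR) + log(1 - D(SR))), to be minimized."""
    m = len(d_real)
    total = sum(
        math.log(r + eps) + math.log(1.0 - f + eps) for r, f in zip(d_real, d_fake)
    )
    return -total / m


def generator_adversarial_loss(d_fake, eps=1e-12):
    """Step 5d: -(1/m) * sum(log D(SR)); small when D is fooled by SR images."""
    return -sum(math.log(f + eps) for f in d_fake) / len(d_fake)


def content_loss(feat_hr, feat_sr):
    """Step 5d: mean squared difference between HR and SR feature vectors (assumed form)."""
    return sum((a - b) ** 2 for a, b in zip(feat_hr, feat_sr)) / len(feat_hr)


def generator_total_loss(d_fake, feat_hr, feat_sr, adv_weight=1e-3):
    """Step 5d: weighted sum of the adversarial and content terms."""
    return adv_weight * generator_adversarial_loss(d_fake) + content_loss(feat_hr, feat_sr)


# Toy discriminator outputs for a batch of two image pairs.
confident = discriminator_loss([0.9, 0.8], [0.1, 0.2])
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])
print(confident < undecided)  # a discriminator that separates HR from SR has lower loss
```

In a real training loop these values would be computed on tensors so that gradients can flow back to the network weights; the sketch only shows how the individual terms combine.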
CONCLUSION AND FUTURE ENHANCEMENT
CONCLUSION:-
Real-ESRGAN, as an Enhanced Super-Resolution Generative Adversarial Network, represents a significant advancement in
the field of image super-resolution. Its integration of adversarial training, perceptual loss functions, and state-of-the-art
architecture within the PyTorch framework has demonstrated remarkable capabilities in generating high-quality, realistic, and
visually appealing high-resolution images from low-resolution inputs.
The project has contributed to overcoming limitations associated with traditional interpolation techniques, offering a more
sophisticated and data-driven approach to image enhancement. Real-ESRGAN has shown promise in addressing challenges
such as artifact reduction, perceptual quality improvement, and adaptability to diverse image content.

FUTURE ENHANCEMENT:-
• Multi-Modal Super-Resolution:
• Robustness to Image Variability:
• Real-Time Processing:
• Edge Device Compatibility:
• User Interaction and Control:

THANK YOU
