UNIT-2 ADL

Recent trends in DL architecture


● Self-supervised learning: This approach trains models on unlabeled data,
allowing them to learn useful representations from vast amounts of
information without needing human-annotated labels for everything. This
matters because labeling data is expensive and time-consuming.
● Neuroscience-inspired architectures: Researchers are increasingly
drawing inspiration from the human brain to develop more efficient and
powerful deep learning models. This includes exploring new network
structures and learning algorithms that mimic how the brain processes
information.
● Vision Transformers (ViT): This architecture has shown great promise in
computer vision tasks like image classification and object detection. Unlike
traditional CNNs (Convolutional Neural Networks), ViTs split an image into
patches and process all patches together with self-attention, which can lead
to better performance, especially when trained on large datasets.
● Hybrid model integration: Combining different types of AI models, such as
deep learning and symbolic AI (rule-based systems), is becoming more
common. This allows the strengths of each approach to be leveraged for
more robust and interpretable systems.
● Focus on explainability (XAI): As deep learning models become more
complex, there's a growing need to understand how they make decisions.
There's a lot of research on developing techniques to make these models
more interpretable, which is crucial for building trust and ensuring fairness.

Residual Network
Intro-
Deep Neural Networks are becoming deeper and more complex. Adding more layers
to a neural network can make it more powerful for image-related tasks, but beyond a
point it can also cause accuracy to drop. That's where Residual Networks come in.

Practitioners add many layers in order to extract important features from complex
images: the first layers may detect edges, while later layers may detect recognizable
shapes, like the tires of a car. But if we add more than about 30 layers to a plain
network, its performance suffers and accuracy drops. This is contrary to the
expectation that adding layers makes a neural network better. The cause is not
overfitting, which could be handled with dropout and regularization techniques; it is
mainly due to the well-known vanishing gradient problem.
Core idea:
● Traditional deep neural networks can suffer from vanishing or exploding
gradients during training. This makes it difficult for the network to learn
complex relationships, especially in very deep architectures.
● ResNets address this by introducing residual connections, also called skip
connections. These connections bypass some layers in the network and
directly add the input to the output of a later layer.
● The building block of a ResNet is the residual block, which typically consists
of convolutional layers, activation functions (like ReLU), and batch
normalization. The input to the block is added to the output of the
convolutional layers within the block.
The skip connection works as follows: a residual block has a 3 x 3 convolution layer
followed by a batch normalization layer and a ReLU activation function, then another
3 x 3 convolution layer and a batch normalization layer. The skip connection bypasses
both of these layers and adds the block's input directly before the final ReLU
activation. Such residual blocks are repeated to form a residual network.
Residual Network: In order to solve the problem of the vanishing/exploding
gradient, this architecture introduced the concept of Residual Blocks. In this
network, we use a technique called skip connections. A skip connection
connects the activations of a layer to later layers by skipping some layers in
between; this forms a residual block. ResNets are made by stacking these
residual blocks together.
The approach behind this network is that instead of the layers learning the
underlying mapping H(x) directly, we allow the network to fit the residual
mapping F(x) = H(x) - x, so that the block's output becomes F(x) + x.
The advantage of adding this type of skip connection is that if any layer hurts
the performance of the architecture, it can effectively be skipped (its residual
pushed towards zero by regularization). This allows training very deep neural
networks without the problems caused by the vanishing/exploding gradient.
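To make the residual block concrete, here is a minimal PyTorch-style sketch, assuming a block that keeps the number of channels fixed; the class and layer names are illustrative, not taken from any particular ResNet implementation.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal residual block: two 3x3 conv + BN layers with a skip connection."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        identity = x                          # saved for the skip connection
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + identity                  # add the input before the final ReLU
        return self.relu(out)

# Stacking such blocks forms a (toy) residual network.
x = torch.randn(1, 64, 32, 32)
net = nn.Sequential(ResidualBlock(64), ResidualBlock(64))
y = net(x)                                    # same shape as the input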

What are Skip Connections?

Skip Connections (or Shortcut Connections), as the name suggests, skip some of the
layers in the neural network and feed the output of one layer as the input to later
layers.
Skip Connections were introduced to solve different problems in different
architectures. In the case of ResNets, skip connections solved the degradation
problem addressed earlier, whereas in the case of DenseNets, they ensured feature
reusability.

The primary difference between ResNets and DenseNets lies in how the skip
connection combines features: DenseNets concatenate the output feature maps of a
layer with those of later layers, whereas ResNets add them together. In other words,
DenseNets use concatenation for their skip connections, while ResNets use
summation.

1. ResNet: explained above.

2. DenseNet: The idea behind concatenation is to reuse features learned in earlier
layers in deeper layers as well. This concept is known as Feature Reusability. As a
result, DenseNets can learn mappings with fewer parameters than a traditional CNN,
since there is no need to re-learn redundant feature maps.
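The difference between the two skip-connection styles can be seen in a small sketch (the tensor shapes below are illustrative, not tied to any specific network):

import torch

x = torch.randn(1, 64, 32, 32)                # input feature map
f_x = torch.randn(1, 64, 32, 32)              # output of a block applied to x

resnet_style = x + f_x                        # summation: shape stays (1, 64, 32, 32)
densenet_style = torch.cat([x, f_x], dim=1)   # concatenation: channels grow to (1, 128, 32, 32)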

Image Denoising using Deep Learning

Image denoising is the process of removing noise from images. Deep learning has

revolutionized this field by offering powerful techniques that can achieve impressive

results. Here's a breakdown of how deep learning tackles image denoising:

The Challenge of Noise

Images can be corrupted by noise during acquisition, transmission, or compression.

This noise can come in various forms, like Gaussian noise (random variations in

pixel intensity) or salt-and-pepper noise (randomly occurring black or white pixels).

Noise makes images appear grainy, blurry, or distorted, reducing their quality and

usefulness.
Traditional Denoising Methods

Traditional image denoising methods often rely on filters or statistical techniques.

These methods can be effective for certain types of noise, but they may struggle to

preserve image details while removing noise completely.

Deep Learning for Denoising

Deep learning models, particularly convolutional neural networks (CNNs), have

emerged as powerful tools for image denoising. Here's how they work:

● Training on Noisy/Clean Image Pairs: Deep learning models are trained on
large datasets of paired noisy and clean images. The model learns to identify
the noise patterns in the noisy image and map it to the corresponding clean
image.
● Network Architectures: Common architectures used for denoising include
autoencoders and U-Nets (a minimal sketch follows this list). Autoencoders
learn to compress the image into a lower-dimensional representation and then
reconstruct it, effectively denoising it in the process. U-Nets combine features
from different network layers to capture both high-level and low-level details,
allowing for better noise removal while preserving image features.
● Advantages of Deep Learning: Deep learning models offer several
advantages over traditional methods:
○ Learning Complex Noise Patterns: Deep learning models can learn
complex noise patterns that might be difficult to capture with traditional
filters.
○ Preserving Image Details: Deep learning models can be trained to
remove noise while preserving important image details like edges and
textures.
○ Adaptability: These models can be adapted to different types of noise
by training on specific noise models.
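As a rough illustration of the noisy-to-clean training setup, here is a minimal PyTorch sketch of a small convolutional denoising autoencoder; the layer sizes, the Gaussian noise level, and the random tensors standing in for a real dataset are all illustrative assumptions.

import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Tiny convolutional autoencoder for image denoising (illustrative sizes)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2), nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = DenoisingAutoencoder()
clean = torch.rand(8, 1, 28, 28)                 # stand-in for a batch of clean images
noisy = clean + 0.3 * torch.randn_like(clean)    # corrupt with Gaussian noise
loss = nn.MSELoss()(model(noisy), clean)         # learn the mapping noisy -> clean
loss.backward()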

Here are some additional points to consider:

● Challenges: Training deep learning models for denoising requires large

datasets and significant computational resources. Additionally, interpreting

how these models remove noise can be challenging.

● Applications: Deep learning-based denoising has applications in various

fields, including:

○ Medical imaging: Denoising medical images can improve visualization

and diagnosis.

○ Astronomy: Denoising telescope images can reveal fainter objects

and improve clarity.

○ Microscopy: Denoising microscopic images can enhance the

visualization of biological structures.

Overall, deep learning has become a powerful tool for image denoising, offering

significant advancements over traditional methods. As research continues, we can

expect even more sophisticated models that can achieve even better denoising

performance.

SEMANTIC SEGMENTATION

Semantic segmentation is a deep learning technique that assigns a label or
category to every pixel in an image. This technique is employed to identify
groups of pixels representing distinct categories. For instance, in autonomous
vehicles, semantic segmentation is crucial for recognizing vehicles,
pedestrians, traffic signs, pavement, and other road features.

Semantic segmentation is a natural step in the progression from coarse to fine
inference. The origin could be located at classification, which consists of
making a prediction for a whole input. The next step is localization / detection,
which provides not only the classes but also additional information regarding
the spatial location of those classes. Finally, semantic segmentation achieves
fine-grained inference by making dense predictions, inferring labels for every
pixel, so that each pixel is labeled with the class of its enclosing object or
region.

What are the existing Semantic Segmentation approaches?

A general semantic segmentation architecture can be broadly thought of as an
encoder network followed by a decoder network (a minimal sketch follows the list):

● The encoder is usually a pre-trained classification network like VGG/ResNet.
● The task of the decoder is to semantically project the
discriminative features (lower resolution) learnt by the encoder
onto the pixel space (higher resolution) to get a dense
classification.
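To make the encoder-decoder idea concrete, here is a minimal PyTorch sketch of a toy segmentation network; it is not a published architecture, and the layer sizes, the class name TinySegNet, and the choice of 21 classes are illustrative assumptions.

import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """Toy encoder-decoder for semantic segmentation (illustrative, not a real model)."""
    def __init__(self, num_classes=21):
        super().__init__()
        # Encoder: downsamples the image while learning discriminative features
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Decoder: projects features back onto the pixel space at full resolution
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2), nn.ReLU(),
            nn.Conv2d(32, num_classes, kernel_size=1),   # one score per class per pixel
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))             # (N, num_classes, H, W)

logits = TinySegNet()(torch.randn(1, 3, 224, 224))
pred = logits.argmax(dim=1)                              # dense per-pixel class labels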
Unlike classification where the end result of the very deep network is
the only important thing, semantic segmentation not only requires
discrimination at pixel level but also a mechanism to project the
discriminative features learnt at different stages of the encoder onto
the pixel space. Different approaches employ different mechanisms as
a part of the decoding mechanism. Let’s explore the 3 main
approaches:

1 — Region-Based Semantic Segmentation

The region-based methods generally follow the “segmentation using


recognition” pipeline, which first extracts free-form regions from an
image and describes them, followed by region-based classification. At
test time, the region-based predictions are transformed to pixel
predictions, usually by labeling a pixel according to the highest scoring
region that contains it.

[Figure: R-CNN architecture]
R-CNN (Regions with CNN features) is one representative work among the
region-based methods. It performs semantic segmentation based on the object
detection results. To be specific, R-CNN first utilizes selective search to extract
a large quantity of object proposals and then computes CNN features for each
of them. Finally, it classifies each region using class-specific linear SVMs.
Compared with traditional CNN structures, which are mainly intended for
image classification, R-CNN can address more complicated tasks such as
object detection and image segmentation, and it has even become an
important basis for both fields. Moreover, R-CNN can be built on top of any
CNN benchmark structure, such as AlexNet, VGG, GoogLeNet, or ResNet.

For the image segmentation task, R-CNN extracted 2 types of


features for each region: full region feature and foreground feature,
and found that it could lead to better performance when concatenating
them together as the region feature. R-CNN achieved significant
performance improvements due to using the highly discriminative
CNN features. However, it also suffers from a couple of drawbacks for
the segmentation task:

● The feature is not compatible with the segmentation task.


● The feature does not contain enough spatial information for
precise boundary generation.
● Generating segment-based proposals takes time and would
greatly affect the final performance.
Due to these bottlenecks, more recent work has been proposed to address
these problems, including SDS, Hypercolumns, and Mask R-CNN.

2 — Fully Convolutional Network-Based Semantic Segmentation

The original Fully Convolutional Network (FCN) learns a mapping from pixels to
pixels, without extracting region proposals. The FCN pipeline is an extension of
the classical CNN. The main idea is to make the classical CNN take
arbitrary-sized images as input. The restriction of CNNs to accept and produce
labels only for inputs of a specific size comes from the fully-connected layers,
which have a fixed size. In contrast, FCNs only have convolutional and pooling
layers, which gives them the ability to make predictions on arbitrary-sized
inputs, as the sketch below illustrates.
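A small sketch of this point, assuming a hypothetical feature extractor that outputs 512 channels and an illustrative 21-class problem: replacing a fully-connected classifier with a 1 x 1 convolution lets the same head run on feature maps of any spatial size.

import torch
import torch.nn as nn

# A 1x1 convolution plays the role of a fully-connected classifier,
# so the network no longer requires a fixed input size.
head = nn.Conv2d(512, 21, kernel_size=1)

features_a = torch.randn(1, 512, 7, 7)       # features from a 224x224 input
features_b = torch.randn(1, 512, 12, 15)     # features from a larger, non-square input

print(head(features_a).shape)                # torch.Size([1, 21, 7, 7])
print(head(features_b).shape)                # torch.Size([1, 21, 12, 15])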

[Figure: FCN architecture]
One issue with this specific FCN is that, by propagating through several
alternating convolutional and pooling layers, the resolution of the output
feature maps is downsampled. Therefore, the direct predictions of FCN are
typically in low resolution, resulting in relatively fuzzy object boundaries. A
variety of more advanced FCN-based approaches have been proposed to
address this issue, including SegNet, DeepLab-CRF, and Dilated Convolutions.

3 — Weakly Supervised Semantic Segmentation

Most of the relevant methods in semantic segmentation rely on a large number
of images with pixel-wise segmentation masks. However, manually annotating
these masks is quite time-consuming, frustrating, and commercially expensive.
Therefore, some weakly supervised methods have recently been proposed that
perform semantic segmentation using only annotated bounding boxes.

[Figure: BoxSup training]

For example, BoxSup employed the bounding box annotations as supervision
to train the network and iteratively improve the estimated masks for semantic
segmentation. Simple Does It treated the weak supervision limitation as an
issue of input label noise and explored recursive training as a de-noising
strategy. Pixel-level Labeling interpreted the segmentation task within the
multiple-instance learning framework and added an extra layer to constrain the
model to assign more weight to important pixels for image-level classification.

Object Detection

Object detection with deep learning is a powerful technique for identifying and
locating objects within images and videos. It's a crucial component in many computer
vision applications like self-driving cars, facial recognition, and medical image
analysis.

● Two-Stage Detectors:
○ This approach involves two stages: a region proposal stage and a
classification stage.
○ In the first stage, the model proposes candidate regions where objects
might be present.
○ Then, in the second stage, the model classifies these proposed regions
and refines the bounding boxes around the objects.
○ Examples of two-stage detectors include R-CNN (Regions with CNN
features) and its variants like Fast R-CNN and Faster R-CNN; a usage
sketch with Faster R-CNN follows this list.
● Single-Stage Detectors:
○ This approach is faster and simpler than two-stage detectors.
○ The model directly predicts bounding boxes and class labels for objects
in a single step.
○ Single-stage detectors are often preferred for real-time applications
due to their speed.
○ Popular single-stage detectors include YOLO (You Only Look Once)
and SSD (Single Shot MultiBox Detector).
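As a rough usage sketch, the following shows how a pre-trained two-stage detector can be run with torchvision's Faster R-CNN; the exact weight-selection argument can differ between torchvision versions, and the random tensor stands in for a real image.

import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# Load a Faster R-CNN detector pre-trained on COCO (a two-stage detector).
model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 480, 640)              # dummy RGB image with values in [0, 1]
with torch.no_grad():
    predictions = model([image])             # the model expects a list of images

# Each prediction is a dict with bounding boxes, class labels, and confidence scores.
boxes = predictions[0]["boxes"]
labels = predictions[0]["labels"]
scores = predictions[0]["scores"]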

Benefits of Deep Learning for Object Detection

-High Accuracy

-Real-Time Capability

-Adaptability

Applications of Object Detection: self-driving cars, facial recognition, object
recognition and tracking, medical imaging.

Neural Attention Models

Attention mechanisms have become a fundamental concept in deep learning,

particularly for tasks involving sequences like natural language processing (NLP)

and computer vision. Here's a breakdown of what they are and how they

revolutionized deep learning models:

Focus Like a Human: Attention in Deep Learning

● Unlike traditional deep learning models that process all parts of the input data

equally, attention models allow the network to focus on specific, relevant parts

of the input.
● This is similar to how humans focus their attention when reading a sentence

or looking at a scene. We don't pay equal attention to every word or detail, but

rather prioritize the information that's most important for understanding the

context.

How Attention Works

There are different ways to implement attention mechanisms, but the core idea
involves three steps (a minimal code sketch follows the list):

1. Calculating Scores: The model assigns a score to each element in the input

sequence. This score reflects how relevant that element is to the current

processing step.

2. Softmax Distribution: A softmax function is used to convert these scores into a

probability distribution. This distribution indicates the weight or importance of

each element.

3. Context Vector Creation: A context vector is created by taking a weighted sum

of the elements in the input sequence, using the attention weights calculated

in step 2. This context vector essentially captures the most relevant

information from the input.
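The three steps map naturally onto (scaled) dot-product attention. Below is a minimal sketch, assuming query, key, and value tensors of matching dimension; the function name and toy shapes are illustrative.

import torch
import torch.nn.functional as F

def dot_product_attention(query, keys, values):
    """Minimal scaled dot-product attention following the three steps above."""
    d_k = keys.size(-1)
    scores = query @ keys.transpose(-2, -1) / d_k ** 0.5   # 1. relevance score per element
    weights = F.softmax(scores, dim=-1)                     # 2. softmax -> probability distribution
    context = weights @ values                              # 3. weighted sum = context vector
    return context, weights

# Example: one query attending over a sequence of 5 elements of dimension 8
query = torch.randn(1, 1, 8)
keys = values = torch.randn(1, 5, 8)
context, weights = dot_product_attention(query, keys, values)
print(weights.sum(dim=-1))   # each attention distribution sums to 1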

Benefits of Attention Models

Attention mechanisms have significantly improved the performance of deep learning

models in various tasks:

● Machine Translation: By focusing on relevant words in the source sentence,

attention models can generate more accurate and nuanced translations.

● Text Summarization: Attention helps identify key points in a document, leading

to more concise and informative summaries.

● Speech Recognition: Attention allows models to focus on the speaker's voice

and ignore background noise, improving recognition accuracy.


● Image Captioning: Attention models can attend to specific objects and regions

in an image, leading to more accurate and descriptive captions.

Beyond Sequences: Attention's Growing Impact

While initially developed for sequential data, attention mechanisms are being

explored for other tasks as well, such as:

● Visual Question Answering: Attention can help models focus on relevant parts

of an image to answer questions about it.

● Time Series Forecasting: Attention can be used to identify important patterns

in past data points that might influence future predictions.

Overall, attention models have become a powerful tool in deep learning, allowing

models to focus on the most critical information and achieve superior performance

on various tasks. As research continues, we can expect even broader applications of

attention mechanisms in the future of deep learning.

Neural Machine Translation

Neural machine translation (NMT) is a cutting-edge approach to machine


translation that leverages the power of deep learning for superior translation
quality compared to traditional methods.
Working:

From Rule-Based to Deep Learning

● Traditional machine translation relied on rule-based systems or statistical

methods. These approaches involved defining complex rules or using

statistical probabilities to translate text.

● NMT takes a different approach. It utilizes deep neural networks, specifically

designed to learn complex relationships between languages.

The Encoder-Decoder Architecture

NMT models typically use an encoder-decoder architecture (a minimal sketch follows the list):

● Encoder: This part takes the source language sentence as input and

processes it through a series of neural network layers. The encoder

essentially captures the meaning and context of the source sentence.

● Decoder: The decoder takes the encoded representation from the encoder

and generates the target language sentence word by word. During this

process, the decoder might attend back to the source sentence encoded by

the encoder to ensure accuracy and fluency.
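For a concrete picture of the encoder-decoder split, here is a toy PyTorch sequence-to-sequence sketch without attention; the vocabulary sizes, hidden size, GRU layers, and random token IDs are all illustrative assumptions rather than a real NMT system.

import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    """Toy encoder-decoder translator (no attention, illustrative sizes)."""
    def __init__(self, src_vocab, tgt_vocab, hidden=256):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, hidden)
        self.tgt_embed = nn.Embedding(tgt_vocab, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        _, state = self.encoder(self.src_embed(src_ids))            # encode the source sentence
        dec_out, _ = self.decoder(self.tgt_embed(tgt_ids), state)   # decode conditioned on it
        return self.out(dec_out)                                    # scores over the target vocabulary

model = TinySeq2Seq(src_vocab=8000, tgt_vocab=8000)
src = torch.randint(0, 8000, (2, 10))      # batch of 2 source sentences, 10 tokens each
tgt = torch.randint(0, 8000, (2, 12))      # shifted target sentences (teacher forcing)
logits = model(src, tgt)                   # shape (2, 12, 8000)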

Learning from Massive Datasets

NMT models are trained on massive datasets of text that have already been

translated by humans. These datasets serve as a reference for the model to learn

how to map sentences in one language to their corresponding translations in

another.

Advantages of Neural Machine Translation

NMT offers several advantages over traditional methods:


● Higher Quality Translations: NMT models can produce more natural-sounding

and accurate translations, especially for complex sentences or phrases.

● Context-Aware Translation: NMT takes context into account when translating,

leading to more nuanced and accurate translations.

● Ability to Learn New Languages: NMT models can be relatively easily adapted

to translate between new languages by training them on corresponding

datasets.

Challenges and Considerations

While powerful, NMT also has some limitations:

● Data Dependency: NMT models heavily rely on large amounts of training

data, which might not be available for all language pairs.

● Explainability: Understanding how NMT models arrive at a specific translation

can be challenging compared to rule-based systems.

● Computational Cost: Training NMT models requires significant computational

resources.

Applications of Neural Machine Translation

NMT is finding its way into various real-world applications:

● Real-time translation: NMT powers features like Google Translate, enabling

communication across language barriers.

● Document translation: Businesses can use NMT to translate documents,

emails, or websites to reach a wider audience.

● Multilingual customer service: NMT can be used to provide customer support

in multiple languages, enhancing customer experience.

Overall, neural machine translation has revolutionized machine translation, offering a

more accurate and natural way to bridge the language gap. As NMT models
continue to develop and training data becomes more available, we can expect even

more impressive translation capabilities in the future.

BASELINE METHODS

In deep learning, baseline models are fundamental for establishing a benchmark to

evaluate the performance of more complex models. They serve as a reference point

to gauge the effectiveness of new architectures or training approaches.

Here's a deeper dive into what baseline models are and why they're important:

Why Baseline Models Matter

Imagine you're developing a new deep learning model for image classification. You

train the model and achieve a certain level of accuracy. But is this accuracy good?

Without a baseline for comparison, it's difficult to assess how well your model is

performing. Here's where baseline models come in:

● Establishing a Benchmark: By training and evaluating a simple baseline

model on the same task and data, you create a reference point. You can then

compare the performance of your new model to the baseline. If your model

significantly outperforms the baseline, it's an indication that your approach is

effective.

● Understanding Data Difficulty: Baseline models can also help you

understand the inherent difficulty of the dataset you're working with. If a

simple model achieves high accuracy, it suggests that the task itself might be

relatively easy. Conversely, a low baseline accuracy indicates a challenging

task where more sophisticated models might be necessary.

● Managing Expectations: Baseline models help set realistic expectations for

the performance of your new model. By understanding the limitations of

simpler approaches, you can focus your efforts on developing models that can

achieve significant improvements over the baseline.


Common Types of Baseline Models in Deep Learning

There are several approaches to creating a baseline model, depending on the
specific task and data (a short example follows the list):

● Random Guessing: This is the simplest baseline, where the model randomly

predicts a class label or output for each input. The accuracy of random

guessing gives you a lower bound for performance on a classification task.

● Majority Class Classifier: In classification problems, this baseline always

predicts the most frequent class in the training data. This is a good starting

point to see how well a model can learn to differentiate between classes

compared to simply picking the most common one.

● Simple Statistical Models: Linear regression for continuous target variables

or logistic regression for binary classification can be used as baselines. These

models capture basic linear relationships in the data and provide a benchmark

for more complex deep learning models.

● Pre-trained Models on Smaller Datasets: Sometimes, pre-trained models

on smaller datasets related to your task can be a good baseline. These

models can capture some underlying patterns in the data and serve as a

reference for more complex architectures trained on the full dataset.
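For instance, scikit-learn makes it easy to compare a majority-class baseline with a simple statistical model; the Iris dataset here is only a stand-in for whatever data the deep learning model would be trained on.

from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Majority-class baseline: always predicts the most frequent class in the training data.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
print("majority-class accuracy:", baseline.score(X_test, y_test))

# Simple statistical baseline: logistic regression.
simple = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("logistic regression accuracy:", simple.score(X_test, y_test))

# A new deep learning model should be judged against numbers like these.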

Choosing the Right Baseline Model

The best type of baseline model depends on the specific task and the complexity of

the data. Here are some general guidelines:

● For simple tasks with well-defined patterns, random guessing or

majority class classification might be sufficient.

● For more complex tasks with intricate relationships, consider using

simple statistical models or pre-trained models as baselines.


Remember, the goal of a baseline model is not to achieve optimal

performance, but to provide a clear starting point for evaluating the

effectiveness of your deep learning models. By incorporating baselines into your

development workflow, you can gain valuable insights into your data, manage

expectations, and ultimately build more powerful and efficient deep learning models.

DATA REQUIREMENTS
Data is the fuel that drives deep learning models. The amount of data you need

depends on several factors, but it's generally true that deep learning models require

significantly more data compared to traditional machine learning algorithms. Here's

a breakdown of why data is so crucial and how much you might need:

Why Deep Learning Needs a Lot of Data

Deep learning models have complex architectures with many parameters. These

parameters act like learnable filters that extract patterns from the data. The more

data you have, the better the model can learn these patterns and generalize well to

unseen examples.

● High Capacity for Complex Patterns: Deep learning models can learn very

complex patterns from data. However, this also means they are prone to

overfitting if they don't have enough data to learn from. Overfitting happens

when the model memorizes the training data too well and fails to perform well

on new data.

● Statistical Learning: Deep learning models rely on statistical learning

techniques. They learn by identifying patterns that appear frequently in the

data. With more data, these patterns become more statistically robust, leading

to better model performance.


How Much Data is Enough?

There's no one-size-fits-all answer to how much data you need. Here are some

factors to consider:

● Model Complexity: More complex models with many parameters typically

require more data to avoid overfitting.

● Data Quality: High-quality, well-labeled data is essential. Noisy or poorly

labeled data can hinder learning, even with a large dataset.

● Task Difficulty: More complex tasks like image recognition with fine-grained

details might require more data than simpler tasks like sentiment analysis.

Here are some general guidelines:

● Small Datasets (1000s-10,000s of data points): This might be sufficient for

very simple tasks or as a starting point for transfer learning (using pre-trained

models on a different task).

● Medium Datasets (100,000s-1,000,000s of data points): This is a common

range for many deep learning tasks, especially with careful model design and

data augmentation techniques (artificially creating more data from existing

data).

● Large Datasets (Millions-Billions of data points): These are often used for

very complex tasks like image recognition with millions of categories or large

language models trained on massive amounts of text data.

Mitigating Data Scarcity

Several techniques can help address data scarcity (a short sketch follows the list):

● Transfer Learning: Leverage pre-trained models on a related task with a

large dataset and fine-tune them for your specific task with less data.
● Data Augmentation: Artificially create more data from your existing dataset

through techniques like cropping, flipping, or adding noise. This can help

improve the model's ability to generalize to unseen variations.

● Active Learning: This approach focuses on acquiring data points that are

most informative for the model's learning process.
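As a brief sketch of the first two ideas with torchvision (the specific transforms, the frozen ResNet-18 backbone, and the 10-class head are illustrative choices, and the weight-loading argument may vary between torchvision versions):

import torch.nn as nn
from torchvision import models, transforms

# Data augmentation: each pass over the dataset sees a slightly different image.
augment = transforms.Compose([
    transforms.RandomResizedCrop(224),        # random cropping
    transforms.RandomHorizontalFlip(),        # random flipping
    transforms.ColorJitter(brightness=0.2),   # mild photometric perturbation
    transforms.ToTensor(),                    # applied to PIL images from a dataset
])

# Transfer learning: reuse a pretrained backbone and replace only the final layer.
backbone = models.resnet18(weights="DEFAULT")
for p in backbone.parameters():
    p.requires_grad = False                   # freeze the pretrained weights
backbone.fc = nn.Linear(backbone.fc.in_features, 10)  # new head for a 10-class task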

Conclusion

Data is a critical element for deep learning success. While the amount of data

required can vary greatly depending on the specific task and model, it's safe to say

that deep learning models are data-hungry. By understanding the role of data and

employing techniques to mitigate scarcity, you can effectively train deep learning

models and achieve good performance.

HYPERPARAMETER TUNING

Hyperparameter tuning is the process of finding the optimal configuration for a
model's hyperparameters. These hyperparameters are settings that control the
learning process of the model (for example, the learning rate, batch size, or
number of layers), but unlike regular parameters, they are not learned from the
data itself.
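One common strategy is random search over a small space of candidate settings. The sketch below is illustrative: the search space, the number of trials, and the placeholder train_and_validate function are assumptions standing in for a real training loop.

import random

search_space = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [32, 64, 128],
}

def train_and_validate(learning_rate, batch_size):
    # Placeholder: train a model with these settings and return validation accuracy.
    return random.random()

best_config, best_score = None, float("-inf")
for _ in range(5):                                    # try 5 random configurations
    config = {k: random.choice(v) for k, v in search_space.items()}
    score = train_and_validate(**config)
    if score > best_score:
        best_config, best_score = config, score

print("best configuration:", best_config)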
