Lecture 02 (Learning Aproach, Patern Recognition Technique, OCR)

1.
Learning approaches in Machine Learning:

a. What Is Supervised Learning?
In a supervised learning model, the algorithm learns on a labeled dataset,
providing an answer key that the algorithm can use to evaluate its accuracy on
training data.
If you’re learning a task under supervision, someone is present judging whether

you’re getting the right answer. Similarly, in supervised learning, that means
having a full set of labeled data while training an algorithm.
Fully labeled means that each example in the training dataset is tagged with the
answer the algorithm should come up with on its own. So, a labeled dataset of
flower images would tell the model which photos were of roses, daisies and
daffodils. When shown a new image, the model compares it to the training
examples to predict the correct label.
There are two main areas where supervised learning is useful: classification
problems and regression problems.
b. What Is Unsupervised Learning?

An unsupervised model, provides unlabeled data that the algorithm tries to make
sense of by extracting features and patterns on its own.
Clean, perfectly labeled datasets aren’t easy to come by. And sometimes,
researchers are asking the algorithm questions they don’t know the answer to.
That’s where unsupervised learning comes in.
In unsupervised learning, a deep learning model is handed a dataset without

explicit instructions on what to do with it. The training dataset is a collection of
examples without a specific desired outcome or correct answer. The neural
network then attempts to automatically find structure in the data by extracting
useful features and analyzing its structure.
c. What Is Reinforcement Learning?
Reinforcement learning trains an algorithm with a reward system, providing
feedback when an artificial intelligence agent performs the best action in a
particular situation.
Video games are full of reinforcement cues. Complete a level and earn a badge.
Defeat the bad guy in a certain number of moves and earn a bonus. Step into a
trap — game over.
These cues help players learn how to improve their performance for the next
game. Without this feedback, they would just take random actions around a
game environment in the hopes of advancing to the next level.
Reinforcement learning operates on the same principle — and actually, video

games are a common test environment for this kind of research.
In this kind of machine learning, AI agents are attempting to find the optimal way
to accomplish a particular goal, or improve performance on a specific task. As the
agent takes action that goes toward the goal, it receives a reward. The overall
aim: predict the best next step to take to earn the biggest final reward.
2. Pattern Recognition Techniques
There are three main models of pattern recognition:
a. Statistical:
to identify where the specific piece belongs (for example, whether it is a cake or not).
This model uses supervised machine learning;
b. Syntactic / Structural:
to define a more complex relationship between elements (for example, parts of speech).
This model uses semi-supervised machine learning;
c. Template Matching:
to match the object's features with the predefined template and identify the object by
proxy. One of the uses of such a model is plagiarism checking.
3. Natural Language Processing
Natural Language Processing (aka NLP) is a field of Machine Learning focused on teaching
machines to comprehend human language and generate its messages.
While it sounds like hard sci-fi, in reality, it doesn’t deal with the substance of communication
(i.e., reading between the lines) - it only deals with what is directly expressed in the message.
Steps:
a. NLP breaks the text to pieces with differentiating the sentences;
b. finds the connections : sorts out the words and parts of the speech where they belong
c. then constructs its variation: finally defines the ways these words can be used in a
sentence.
To do that, NLP uses a combination of techniques that includes parsing, segmentation, and
tagging to construct a model upon which the proceedings are handled. Supervised and
unsupervised machine learning algorithms are involved in this process at various stages.
4. Use of NLP in real life:
a. Text analysis - for content categorization, topic discovery and modeling (content
marketing tools like Buzzsumo use this technique);
b. Plagiarism detection - a variation of text analysis focused on a comparative study of
the text with the assistance of the web crawler. The words are broken down to tokens
that are checked for matches elsewhere. The exemplary tool for this is Copyscape.
c. Text summarization and contextual extraction - finding the meaning of the text. There
are many online tools for this task, for example, Text Summarizer;
d. Text generation - for chatbots and AI Assistants or automated content generation (for
example, auto-generated emails, Twitterbot updates, etc.);
e. Text translation - in addition to text analysis and word substitution, the engine also uses
a combination of context and sentiment analysis to make closer matching recreation of
the message in the other language. The most prominent example is Google Translate;
f. Text correction and adaptation - in addition to correcting grammar and formal

mistakes, this technique can be used for the simplification of the text - from the structure
to the choice of words. Grammarly, a startup founded by two Ukrainians in Kyiv, Ukraine,
is one of the most prominent examples of such NLP pattern recognition uses.
5. Optical Character Recognition
Optical Character Recognition (aka OCR) refers to analysis and subsequent conversion of the
images considered as alphanumeric text into the machine-encoded text.
The most common source of the optical characters are scanned documents or photographs, but
the thing can also be used on computer-generated unlabeled images. Either way, the OCR
algorithm applies a library of patterns and compares them with the available input document to
mark up the text and construct these. These matches are then assessed with the assistance
language corpus and thus perform the “recognition” itself.
In the heart of OCR is a combination of pattern recognition and comparative algorithms attached
to the reference database.
6. The most common uses of OCR include:

a. Text Transcription is the most basic process. The text is presented in recognizable
characters is recognized and transposed into the digital space. This technology is
well-presented on the market. A good example might be ABBYY Fine Reader.
b. Handwriting Recognition is a variation of text transcription with a more significant

emphasis on the visual element. However, this time, the OCR algorithm uses a
comparative engine to process the handwriting sample. A good example of this is
Google Handwriting Input. While this technique’s primary goal is to the transcript, it is
also used to verify signature and other handwriting samples (for example, for signing
contracts or handwritten will);
c. Document Classification involves deeper processing of the document with a bigger

focus on its structure and format. This technique is used for digitization of the paper
documents and also for the reconstruction of the scattered elements of the damaged
documents (for example, if the thing is shredded or the ink is partially blurred). Parascript
is a product that provides such services for document classification.

Lecture 02 (Learning Aproach, Patern Recognition Technique, OCR)

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Lecture 02 (Learning Aproach, Patern Recognition Technique, OCR)

Uploaded by

Copyright:

Available Formats

1.

Learning approaches in Machine Learning:

If you’re learning a task under supervision, someone is present judging whether

b. What Is Unsupervised Learning?

In unsupervised learning, a deep learning model is handed a dataset without

Reinforcement learning operates on the same principle — and actually, video

There are three main models of pattern recognition:

3. Natural Language Processing

4. Use of NLP in real life:

f. Text correction and adaptation - in addition to correcting grammar and formal

5. Optical Character Recognition

6. The most common uses of OCR include:

b. Handwriting Recognition is a variation of text transcription with a more significant

c. Document Classification involves deeper processing of the document with a bigger

You might also like

Lecture 02 (Learning Aproach, Patern Recognition Technique, OCR)

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Lecture 02 (Learning Aproach, Patern Recognition Technique, OCR)

Uploaded by

Copyright:

Available Formats

1.

Learning approaches in Machine Learning:

If you’re learning a task under supervision, someone is present judging whether

b. What Is Unsupervised Learning?

In unsupervised learning, a deep learning model is handed a dataset without

Reinforcement learning operates on the same principle — and actually, video

There are three main models of pattern recognition:

3. ​Natural Language Processing

​4. Use of NLP in real life:

f. Text correction and adaptation​ - in addition to correcting grammar and formal

5. ​Optical Character Recognition

6.​ The most common uses of OCR include:

b. Handwriting Recognition​ is a variation of text transcription with a more significant

c. Document Classification​ involves deeper processing of the document with a bigger

You might also like

3. Natural Language Processing

4. Use of NLP in real life:

f. Text correction and adaptation - in addition to correcting grammar and formal

5. Optical Character Recognition

6. The most common uses of OCR include:

b. Handwriting Recognition is a variation of text transcription with a more significant

c. Document Classification involves deeper processing of the document with a bigger