
AI・Machine Learning・Big Data

Heligate JSC’s Projects Introduction

OUR PERSPECTIVE

RESEARCH
To conduct high-quality research in AI, Machine Learning and Deep Learning.

PRACTICE
To provide the best consultation and compatible solutions, and deliver the best products to customers.

Strictly Private and Confidential 2


Target Direction
BIG DATA
Data Analysis
Real time Processing
Parallel Processing
A/B Test

COMPUTER VISION
Image recognition
Machine Vision

OCR
Traditional OCR, AI-OCR …

HEALTH CARE
Image recognition
Machine Vision

INFORMATION SEARCHING
Information abstraction
Voice recognition


CASE STUDIES of
OCR PROJECTS
1. AI-OCR PROJECT
To solve many different layouts of documents by NLP and Data Structure

Domain: AI – MACHINE LEARNING – DEEP LEARNING

Location: TOKYO, JAPAN – YEAR: 2018


Overview:
Traditional OCR for documents such as receipts and work orders requires rules and templates to be defined for each field.

With AI-OCR, which combines form analysis, noise processing and NLP, data can be extracted from documents without manual input, and the effort to define rules and templates for each field is reduced.

We apply AI-OCR as an effective solution for financial and insurance documents.

Algorithm:Form Analysis, Noise Processing, NLP


Processing time:0.1s〜0.3s on average

Technology:



Using NLP and Data Structure to solve many different layouts

The data was well structured, which enabled AI-OCR to read various document layouts and increased the precision of the system.


2. COLOR STAMP REMOVAL PROJECT
To remove the stamps on scanned or captured images

Domain: AI, DEEP LEARNING, MACHINE LEARNING


Location: TOKYO, JAPAN – YEAR: 2018
Overview:
This project aimed to develop an engine that automatically detects and deletes stamps from scanned or captured images. There were many issues to deal with: stamps have no standard shape or color, and scanned images differ from captured ones.

Targeting two main metrics, precision and performance, we tried many methods and finally deployed YOLOv3 for object detection and used K-means, scikit-learn and OpenCV to generate the output.

Algorithm : Object Detection/K-means


Processing time:0.15s~ 4s on average (depends on image size)

Technology:

YOLOv3



Color Stamp Removal Project
Implementation Methodology

Approach by Deep Learning

Stamp Removal and Output Generation

Input Image
[1] Stamp Detection (YOLOv3)
[2] Stamp Area Deletion (K-means, OpenCV)
[3] Generate Output (Background Color Adjust, Noise removal)
Image after stamp removal
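Step [2] of the flow above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: it uses a small pure-Python K-means in place of the scikit-learn and OpenCV calls named on the slide, assumes the detector has already returned a bounding box, and treats the most colorful cluster as the stamp and the brightest cluster as the background. The names `kmeans` and `remove_stamp` and the choice `k=3` are hypothetical.

```python
def dist2(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: dist2(p, centers[j])) for p in points]


def kmeans(points, k, iters=10):
    """Tiny K-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        labels = assign(points, centers)
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centers, assign(points, centers)


def saturation(rgb):
    """Colorfulness proxy: spread between the largest and smallest channel."""
    return max(rgb) - min(rgb)


def remove_stamp(image, box, k=3):
    """image: rows of (r, g, b) tuples; box: (x0, y0, x1, y1), end-exclusive (assumed detector output)."""
    x0, y0, x1, y1 = box
    coords = [(x, y) for y in range(y0, y1) for x in range(x0, x1)]
    centers, labels = kmeans([image[y][x] for x, y in coords], k)
    stamp = max(range(k), key=lambda j: saturation(centers[j]))   # most colorful cluster
    background = max(range(k), key=lambda j: sum(centers[j]))     # brightest cluster
    fill = tuple(round(c) for c in centers[background])
    out = [row[:] for row in image]
    for (x, y), label in zip(coords, labels):
        if label == stamp:
            out[y][x] = fill                                      # repaint stamp pixels
    return out
```

Dark text pixels fall into their own low-saturation cluster, so they are left untouched while the colored stamp pixels are repainted with the background color.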


Output of Color Stamp Removal (1)

The color stamp was removed cleanly and the text could be read clearly.

Image with stamp Image with stamp removed



Output of Color Stamp Removal (2)

Image with stamp Image with stamp removed



Output of Color Stamp Removal (3)

Image with stamp Image with stamp removed



Output of Color Stamp Removal (4)

Image with stamp Image with stamp removed



3. INSURANCE CARD INFORMATION READING SYSTEM
DEVELOPMENT PROJECT
Reading Name, Birthday, ID number from Insurance Card

Domain: AI, MACHINE LEARNING, DEEP LEARNING


Service: RESEARCH & DEVELOPMENT
Location: JAPAN – YEAR: 2018

Overview:
The main challenge in reading insurance card information in this project was OCR of Japanese characters.
Because three fields had to be read (name, birthday and ID number), we chose a Deep Learning approach. We used the
Tesseract engine to read all information on the card, then tried several combinations of methods in an iterative
“try and improve” process, which considerably improved the recognition result.

Algorithm: FASTER RCNN/ Object Detection


Processing time:1s on average (Duration from Picture input to Getting the result of translation)

Technology:



Insurance Card Information Reading
Work Flow

Health Insurance Card
Image Pre-processing (OpenCV)
De-noised Image
Text Classification (MSER)
Character recognition (Tesseract / Deep Learning)
Post-processing


Detection Result



4. RECOGNITION OF JAPANESE HAND-WRITING
CHARACTERS PROJECT

Domain: DEEP LEARNING, COMPUTER VISION, NLP, OCR

Location: TOKYO, JAPAN – YEAR: 2019

Overview:
To develop an offline handwritten character recognition engine that automatically reads Japanese handwritten
character strings from scanned or captured images. There were many pain points to solve, such as differences in
handwriting between people, and noise, backgrounds and grid lines in images, which lead to a huge number of
character variations. We built our own OCR solution by optimizing YOLOv3 and the CCEAD algorithm.

Algorithm: Object detection, Machine translation


Processing time: 0.4s with GPU, 1.2s with CPU per image
Performance: 94% on images without grid lines
Technology:


Recognition of Japanese Hand-writing Characters
Processing Flow

Input
Image Pre-processing
YOLOv3
Denoiser
Merge
Post-processing (Spell Check, Text Correction)
Real Output:

Example shown on the slide: ¥ 433,000 / Y 433,000


5. RETRAINING TESSERACT FOR ITALIC&BOLD CHARACTER
AND HALF-WIDTH KATAKANA RECOGNITION PROJECT

Domain: AI, Machine learning, Deep learning


Location: TOKYO, JAPAN – YEAR: 2019
Overview:
Retraining Tesseract to recognize Japanese italic and bold characters
and half-width Katakana. The scope of work was to collect data,
balance the dataset, and then retrain Tesseract. Training and tuning
were repeated until performance improved.

Algorithm: LSTM TRAINING, SUPPORT TESSERACT

Technology:



Retraining Tesseract for Italic & Bold Characters,
and Half-width Japanese Katakana Recognition

• Algorithm: LSTM training, Support Tesseract


• Hardware: Training on a 4-core CPU

Input: Font Data; Training Text (e.g. 日付請求書)
Process: Tesseract training files generation; Tesseract training process; Tuning model
Output: Model


6. RECOGNITION OF HAND-WRITING SYMBOLS
To create an OCR engine to recognize the Hand writing symbols from documents

Domain: Deep Learning, Computer Vision

Location: Tokyo, Japan – Year: 2020


Overview:
To develop an offline recognition engine that automatically detects handwritten symbols in
scanned documents. The engine is structured as a single OCR solution, with an optimized
YOLOv3 algorithm to process handwritten symbols better. The pain point was the shortage of
training data; to solve it, we created a GAN model to generate more data.
Algorithm: Object Detection
Performance: 3s on average (CPU) and 99% for document images whose width and
height are above 1000 pixels.

Technology:


Recognition of Hand writing symbols
To create a detection model for characters or character strings


7. JAPANESE CHARACTERS OCR PROJECT
To improve the character string segmentation engine

Domain: DEEP LEARNING, COMPUTER VISION

Location: TOKYO, JAPAN – YEAR: 2020


Overview:
The purpose of this project is to replace the text line segmentation engine used in the customer's product with a new
text segmentation deep learning model. The deliverable of the project is a trained text segmentation model, which
must be embedded in a module written in Java. First, we labelled the scanned image data given by the customer and
transformed the annotations into mask images. Then we configured and trained the Pix2Pix model, which is widely
used in the Image-to-Image Translation task, on our dataset. Next, the Pix2Pix model's outputs, which are originally
mask images, are refined and post-processed by using image processing techniques to extract bounding boxes.
Finally, we converted the trained Pix2Pix model so that it can be loaded with Tensorflow Java.
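The post-processing step, turning a predicted mask into bounding boxes, can be illustrated with a connected-component pass. This is a minimal sketch, assuming the Pix2Pix output has already been thresholded to a binary mask; the function name, the 4-connectivity and the `min_area` noise filter are illustrative choices, not details from the project.

```python
from collections import deque


def mask_to_boxes(mask, min_area=2):
    """mask: rows of 0/1 values. Returns inclusive (x0, y0, x1, y1) boxes, one per 4-connected blob."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over one blob, tracking its extent.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            area = 0
            while queue:
                cx, cy = queue.popleft()
                area += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if area >= min_area:  # drop isolated noise pixels
                boxes.append((x0, y0, x1, y1))
    return boxes
```

Boxes come out in scan order (top to bottom, then left to right), which is usually convenient for text lines.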

Algorithm: PIX2PIX, GAN, IMAGE PROCESSING

Technology:



Japanese Characters OCR
To improve the character string segmentation engine

Image Pre-Processing
Pix2Pix Model
Image Post-Processing


8. JAPANESE CHARACTERS DETECTION PROJECT
Build an engine to detect Japanese and Latin characters from line-level images

Domain: Deep Learning, Computer vision


Location: TOKYO, JAPAN – YEAR: 2020
Overview:
To build an engine to detect Japanese and Latin characters from line-level images by using object detection algorithms.
We generated pseudo text line images from character images given by the customer.
Then we configured and fine-tuned a CenterNet model, a state-of-the-art algorithm for the object detection task, on
our generated text line image data and a small training dataset provided by the customer. To improve the model's
accuracy on images with many characters, we also implemented a regression model with a MobileNet backbone to find
an appropriate zoom scale for each image and break these images into smaller patches accordingly.
Finally, the trained CenterNet PyTorch model was converted so that it can be loaded with TensorFlow Java.
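The patch-splitting step described above can be sketched as follows. The source does not say how the predicted zoom scale maps to a patch size, so the rule in `patch_width_for` (a fixed model input width divided by the zoom scale), the overlap value and all function names are assumptions for illustration.

```python
def patch_width_for(zoom, model_width=512):
    """Hypothetical rule: a larger zoom scale means narrower patches of the original line image."""
    return max(1, round(model_width / zoom))


def split_into_patches(width, patch_width, overlap):
    """Return (start, end) column ranges that cover [0, width) with the given overlap."""
    if patch_width <= overlap:
        raise ValueError("patch_width must exceed overlap")
    if width <= patch_width:
        return [(0, width)]
    step = patch_width - overlap
    starts = list(range(0, width - patch_width, step))
    starts.append(width - patch_width)  # last patch sits flush with the right edge
    return [(s, s + patch_width) for s in starts]


def to_line_coords(patch_start, boxes):
    """Shift patch-local (x0, y0, x1, y1) boxes back into line-image coordinates."""
    return [(x0 + patch_start, y0, x1 + patch_start, y1) for x0, y0, x1, y1 in boxes]
```

The overlap keeps a character that straddles a patch border fully visible in at least one patch; duplicate detections in the overlap region would then need to be merged after shifting the boxes back.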

Algorithm : CENTERNET, MOBILENET

Technology:



Japanese Characters Detection
To build an engine to detect Japanese and Latin Characters from a line-level image

Image
CenterNet
Classification Module
Merging
Output


CASE STUDIES of
HEALTH CARE PROJECTS
1. DENTAL CLASSIFICATION SYSTEM
Teeth Detection and Labeling

Domain: AI, Machine learning, Deep learning


Service: Research and Development
Location: TOKYO, JAPAN – YEAR: 2018
Overview:
The purpose is to create an engine that automatically detects all teeth in dental X-ray images and then numbers and labels them. There are 3 types of teeth: elderly people's teeth (“oldie teeth”), adult teeth and infant teeth.

We used the TensorFlow Object Detection API to detect all teeth in an X-ray image, then OpenCV to label them. The Object Detection API is built on Convolutional Neural Networks (CNN). By training the model on the features of teeth, the precision of detection and labeling reached 94% for oldie teeth and 84% for adult and infant teeth, which was considered a high rate.

Algorithm: Faster
Processing time:6s on average(The duration from image input to receipt of labeling result)

Technology:



Dental Classification Work Flow
Detection by Transfer Learning

• X-ray images collection (1230 images)
• Teeth Labeling
• Convert label to readable file
• Initial Model & Pre-training model: Parameter selection (Faster RCNN inception v2)
• Model retraining
• Model Evaluation
• Feedback

Labeling (Computer Vision)
• User Interface of Labeled teeth


Dental Classification System
Implementation Result

Adult Teeth Oldie Teeth

Infant Teeth



Dental Classification System
Precision Grade

1st Dataset: Labeling Precision

Teeth type      Image number (Random Pick-up)   Precision
Adult teeth     9                               94%
Infant teeth    21                              80%
Oldie teeth     10                              74%

2nd Dataset: Labeling Precision

Teeth type      Image number (Random Pick-up)   Precision
Adult teeth     10                              99%
Infant teeth    10                              70%
Oldie teeth     10                              74%


2. PNEUMONIA DETECTION PROJECT

Domain: AI, Machine learning, Deep learning


Location: HANOI, VIETNAM - YEAR: 2018
Overview:
The purpose is to create an algorithm that detects visual signs of pneumonia in medical images.
Specifically, the algorithm automatically locates the pneumonia area within the chest area in X-ray
images. The dataset had about 23,124 images, of which 2,560 were valid.
Our solution was to build our own U-Net module while strengthening the “resblock”
module, which helps improve the precision of the algorithm. We tested on another dataset of
1000 images and obtained a positive result (F2 score ~ 0.2).
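For reference, the F2 score quoted above is the F-beta score with beta = 2, which weights recall more heavily than precision. The slides do not say whether it was computed per pixel, per box or per image, so the sketch below only shows the standard definition from true positive, false positive and false negative counts; the function name and example counts are illustrative.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score from counts; beta > 1 favors recall, beta = 1 gives the usual F1."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Example with made-up counts: precision 0.8, recall 0.5.
score = f_beta(tp=8, fp=2, fn=8)
```

Because recall dominates, missing pneumonia regions (false negatives) lowers F2 more than reporting extra regions (false positives) does.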

Algorithm: U-Net, resblock


Processing time:0.1s/ 1 image on average
Technology:

U-Net



Result of Pneumonia Detection Project

The test result on the dataset of 1000 images was positive (F2 score ~ 0.2)

Output Sample of Valid Dataset


• Red:Prediction Result(Pneumonia)
• Blue:True(Manual Labeling)



CASE STUDIES of
RECOMMENDATION PROJECTS
1. RECOMMENDATION SYSTEM

Domain: Information Searching, Big Data

Service: DEVELOPMENT & MAINTENANCE

Location: HANOI, VIETNAM - YEAR: 2018


Overview:
This system recommends the most suitable content, such as broadcast programs and
movies, to end users. It analyzes data sources such as mobile application statistics
and the Customer Relationship Management (CRM) system, combined with the end users'
attributes, usage history, habits, etc.

Algorithm: CF/ ALS/ eALS/ NeuCF


Processing time:1s on average (with less than 300 requests)
Technology:



Lambda Architecture of Recommendation System
Data sources and ingestion
• CRM: RDBMS engine / Sqoop (extract all data)
• Metadata: File engine / Hdfs-file-slurper (extract all data)
• Users behavior: Event engine / Kafka (topic event)

Data Lake and storage
• HDFS, HBase, Cassandra

Batch Layer (Spark)
• Batch Aggregation
• Model Training
• Batch View

Speed Layer (Spark)
• Speed Aggregation
• Speed Recommender
• Speed View

Serving Layer
• Merged View → Report API → Consumer: Admin
• Merged View → RS API → Consumer: End Users
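At the serving layer, a Lambda architecture answers queries from a view that merges the Batch View with the Speed View. The sketch below is a minimal illustration, assuming both views map `(user, item)` to a score and that recent speed-layer scores are added to the batch scores. The source does not describe the actual merge rule, so both functions are hypothetical.

```python
def merged_view(batch_view, speed_view):
    """Combine batch scores with increments observed since the last batch run."""
    merged = dict(batch_view)
    for key, delta in speed_view.items():
        merged[key] = merged.get(key, 0.0) + delta
    return merged


def recommend(merged, user, top_n=2):
    """Return the user's top-N items by merged score (ties broken by item name)."""
    scored = [(score, item) for (u, item), score in merged.items() if u == user]
    return [item for score, item in sorted(scored, key=lambda t: (-t[0], t[1]))[:top_n]]


batch = {("u1", "news"): 0.6, ("u1", "drama"): 0.4, ("u2", "news"): 0.9}
speed = {("u1", "drama"): 0.5, ("u1", "sports"): 0.3}
top_for_u1 = recommend(merged_view(batch, speed), "u1")
```

The batch view is recomputed from all data on a schedule, while the speed view only has to cover events that arrived after the last batch run, which keeps the real-time path small.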


Batch Layer:Model Training



Batch Layer: Algorithm

Collaborative Filtering: a method that uses large-scale data on the preferences
of many other users to automatically recommend items that match similar
preferences to a specific user.

ALS was chosen to create the model from the 3 methods below:
• Element-wise Alternating Least Squares (eALS)
• Neural Collaborative Filtering (NeuCF)
• Alternating Least Squares (ALS)
https://www.slideshare.net/MrChrisJohnson/music-recommendations-at-scale-with-spark/15-Alternating_Least_Squares_ALS_151
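ALS alternates between two least-squares problems: fix the item factors and solve for each user's factors, then fix the user factors and solve for each item's factors. The sketch below is a minimal pure-Python version, assuming explicit ratings stored as a `{(user, item): rating}` dictionary and L2 regularization `lam`. All function names and parameter values are illustrative. The project's own training code is not shown in the source, and the architecture slide places model training on Spark rather than plain Python.

```python
import random


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting (a is small and square)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def update(fixed, observed_by_row, rank, lam):
    """Regularized least-squares update of one side while the other side stays fixed."""
    result = []
    for observed in observed_by_row:  # observed: {index on the fixed side: rating}
        a = [[lam if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for idx, rating in observed.items():
            f = fixed[idx]
            for i in range(rank):
                b[i] += rating * f[i]
                for j in range(rank):
                    a[i][j] += f[i] * f[j]
        result.append(solve(a, b))
    return result


def als(ratings, n_users, n_items, rank=2, lam=0.1, iters=20, seed=0):
    """Factorize the observed ratings into user and item factor lists."""
    rng = random.Random(seed)
    users = [[rng.random() for _ in range(rank)] for _ in range(n_users)]
    items = [[rng.random() for _ in range(rank)] for _ in range(n_items)]
    by_user = [{} for _ in range(n_users)]
    by_item = [{} for _ in range(n_items)]
    for (u, i), rating in ratings.items():
        by_user[u][i] = rating
        by_item[i][u] = rating
    for _ in range(iters):
        users = update(items, by_user, rank, lam)   # items fixed, solve for users
        items = update(users, by_item, rank, lam)   # users fixed, solve for items
    return users, items


def predict(users, items, u, i):
    """Predicted rating: dot product of the user and item factors."""
    return sum(a * b for a, b in zip(users[u], items[i]))
```

Each half-step has a closed-form solution, so the regularized squared error never increases from one iteration to the next. The per-user and per-item solves are also independent of each other, which is what makes the method easy to parallelize.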


2. RETAIL TRAFFIC COUNTER SYSTEM

Domain: AI, Machine learning, Deep learning


Service: RESEARCH & DEVELOPMENT
Location: JAPAN – YEAR: 2018
Overview:
Counting the number of customers who enter and exit a shopping mall or store helps boost in-store
analytics and facilitates marketing segmentation. This is a problem of detecting and tracking people in
surveillance video. To detect people in each video frame, a Deep Learning approach is used: the
detection engine is built with the TensorFlow Object Detection API and Faster R-CNN. After people are
recognized and a PeopleID is generated, SORT / Deep SORT, a tracking algorithm for 2D multiple object
tracking in video sequences, is applied to track people in real time. The project is now entering the
evaluation phase.
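The association step of tracking can be sketched as below. This is a deliberately simplified stand-in for SORT: SORT combines a Kalman-filter motion model with Hungarian assignment on IoU, and Deep SORT adds appearance features, whereas this sketch only matches detections to the previous frame's boxes greedily by IoU and drops any track that is not matched. Class and parameter names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


class Tracker:
    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}   # person ID -> box from the previous frame
        self.next_id = 0

    def update(self, detections):
        """Assign a person ID to each detection of the current frame and return the IDs in order."""
        pairs = sorted(
            ((iou(box, det), pid, d)
             for pid, box in self.tracks.items()
             for d, det in enumerate(detections)),
            reverse=True,
        )
        assigned, used = {}, set()
        for score, pid, d in pairs:          # best overlaps first
            if score < self.iou_threshold:
                break
            if pid in used or d in assigned:
                continue
            assigned[d] = pid
            used.add(pid)
        for d in range(len(detections)):     # unmatched detections start new tracks
            if d not in assigned:
                assigned[d] = self.next_id
                self.next_id += 1
        self.tracks = {assigned[d]: det for d, det in enumerate(detections)}
        return [assigned[d] for d in range(len(detections))]
```

With stable IDs per person, entries and exits can then be counted by checking when a track's box crosses a virtual line at the door.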

Algorithm: FASTER RCNN/ OBJECT DETECTION


Processing time:0.01s/frame (Output:16fps)
Technology



Retail Traffic Counter System
Work Flow

Observation Video
People Detection for each frame (TensorFlow)
Detected Personal ID
People Tracking (use of DEEP SORT)
Prediction of people position at each frame
Result Combination
Observation Video Indication


Demo Video

YouTube: Fast Moving
https://www.youtube.com/watch?v=qNSJqGIJSEc&list=PLf5kpWs6HDU2dcX31WnEQz_7xdVAzFpYo&index=2

YouTube: Slow Moving
https://www.youtube.com/watch?v=bYeaWg0FX6E&index=3&list=PLf5kpWs6HDU2dcX31WnEQz_7xdVAzFpYo

CAPTURED AT OFFICE of HELIGATE JSC


3. Interactive Chatbot System
Chatbot and Recommendation

Domain: AI, Machine learning, Deep learning,NLP


Service: Research and Development
Location: TOKYO, JAPAN – YEAR: 2021
Overview:
The purpose is to build a system that includes a chatbot and a
recommendation engine. The chatbot can interact with customers
and give them suggestions based on customer information
such as age, gender, viewing history, purchase history and
past inquiries.

Processing time: 3s on average (with fewer than 300 requests)


Overall solution



Voice Recognition Project
Movie Control System

Domain: AI, Machine learning, Deep learning, Voice Recognition


Service: Research and Development
Location: TOKYO, JAPAN – YEAR: 2020
Overview:
The purpose is to create a system that lets the theater audience
interact directly with the characters in a movie and choose the next
scene via a smartphone app.

Algorithm: Voice Recognition, BERT…


Processing time:2s on average (with less than 300 requests)

Technology:



Overall solution



Annotation Service
Data Labeling Workflow

1. Upload: You can drag and drop files from your computer into our Annotation Tool
2. Manage: Store your private data
3. Labeler: Multiple users can label data with AI smart tools
   Smart Tool: Smart tools speed up productivity
4. Validation (Reviewer, QC): Check the quality of annotation results with statistics and comparisons
5. Download: Download annotations in any format


Key features of our Annotation Tool

• Real-time collaboration
• Issues tracking system
• Smart tools (AI models + interpolation), with built-in or custom models: text detection, object detection, object segmentation (human), pre-defined object detection
• Open APIs to integrate with external systems
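The slide does not say what kind of interpolation the smart tools perform. One common form in video annotation is keyframe interpolation, where an annotator labels a box on two frames and the tool fills in the frames between them. The sketch below is a minimal illustration under that assumption; the function name and box format are hypothetical.

```python
def interpolate_boxes(frame_a, box_a, frame_b, box_b):
    """Linearly interpolate (x0, y0, x1, y1) boxes for the frames strictly between two keyframes."""
    if frame_b <= frame_a:
        raise ValueError("frame_b must come after frame_a")
    span = frame_b - frame_a
    result = {}
    for frame in range(frame_a + 1, frame_b):
        t = (frame - frame_a) / span  # 0 at the first keyframe, 1 at the second
        result[frame] = tuple(a + t * (b - a) for a, b in zip(box_a, box_b))
    return result


# Keyframes at frames 0 and 4; frames 1 to 3 are filled in automatically.
filled = interpolate_boxes(0, (0, 0, 10, 10), 4, (8, 4, 18, 14))
```

The annotator then only needs to correct the frames where the object's motion is not close to linear.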
OUR AI TEAM INTRODUCTION

Our Capability

• Our team members graduated from prestigious universities in Vietnam, Japan and Europe, and all of them can communicate well in English and Japanese.

• Drawing on extensive experience working with European and Japanese customers, we combine the mindset and characteristics of each market to build the best working process.

• With an average age of 27, we quickly learn new technologies, keep up with new AI trends and can work under high pressure to meet customers' expectations.


THANK YOU VERY MUCH!
