
Yicheng Rong

Article Summary
English 202C
Feb. 28, 2021
Going Deeper into Action Recognition
By Samitha Herath, Mehrtash Harandi, Fatih Porikli

Introduction:

This paper summarizes “Going Deeper into Action Recognition: A Survey” by Samitha Herath,
Mehrtash Harandi, and Fatih Porikli. The article examines how a computer can understand and
recognize human actions so that it can make decisions based on those actions without human
intervention. It approaches this goal from several directions, such as using different
motion-based image representations to help computers make decisions and studying the individual
pixels of each image to support that recognition.

Summary:

The authors recommend that research start with action recognition itself, because in real
life a single action may contain many different body movements. For instance, when people kick
a soccer ball, they use their legs, arms, shoulders, and feet. The computer studies each small
movement and stores it in a database, so that in the future an action can be analyzed by
combining those small movements.

Once the image information is stored in the computer, it is important to design a way for
the computer to study those images. Following how the human brain works, the program must have
a symbolic system for representing information about image shape, and a set of processors
capable of deriving this information from images. To achieve this goal, movement is analyzed in
two different ways. The first is holistic representation, in which action recognition is based
on extracting a global representation of human body structure, shape, and movement. The second
is local representation, in which action recognition is based on extracting local features.

A holistic representation requires the computer to capture the whole image of the
movement, which makes small changes in body movement hard to trace. Local representations
therefore help the computer capture more detail about the movement and study those details.

Compared with the holistic representation, the local representation is much harder to
achieve because it involves far more data, but it is more accurate. By studying the pixels of
each 3D graph, the computer can find significant information by identifying the bodies and
their trajectories. When holistic and local representations are combined, the computer gives a
more accurate recognition result.
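
The contrast between the two representations can be illustrated with a small sketch. It is
not the method of the surveyed article. It assumes a grayscale clip stored as a NumPy array of
shape (frames, height, width), uses simple frame differencing as a stand-in for real motion
features, and the function names, threshold, and patch size are all hypothetical choices made
for illustration.

```python
import numpy as np

def motion_energy_image(frames, threshold=0.1):
    """Holistic view: one binary map of every pixel that changed anywhere in the clip."""
    diffs = np.abs(np.diff(frames, axis=0))       # change between consecutive frames
    return (diffs > threshold).any(axis=0)        # shape (height, width)

def local_motion_patches(frames, patch=4, top_k=3):
    """Local view: top-left corners of the top_k patches with the most change."""
    change = np.abs(np.diff(frames, axis=0)).sum(axis=0)
    rows, cols = change.shape[0] // patch, change.shape[1] // patch
    # Sum the change inside each non-overlapping patch (edges that do not fit are cropped).
    grid = change[:rows * patch, :cols * patch]
    grid = grid.reshape(rows, patch, cols, patch).sum(axis=(1, 3))
    best = np.argsort(grid.ravel())[::-1][:top_k]
    return [(int(i // cols) * patch, int(i % cols) * patch) for i in best]

# Synthetic clip (an assumption for the demo): one bright pixel moving along a row.
clip = np.zeros((5, 8, 8))
for t in range(5):
    clip[t, 2, t] = 1.0

whole_body_map = motion_energy_image(clip)             # global silhouette of the motion
detail_regions = local_motion_patches(clip, top_k=2)   # where the motion is concentrated
```

The holistic map summarizes the entire movement in one image, while the local patches point
to the small regions where most of the change happens, which is the extra detail the local
approach is meant to capture.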

Based on real-life experience, most actions are hard to understand without their situation
or setting. Letting the computer recognize the situation is therefore also important, and
research on SLAM (simultaneous localization and mapping) can help achieve this goal.

With all of this information collected, the program also needs a system that produces the
final result. The article calls this Deep Architectures for Action Recognition and describes
four kinds of networks that help the computer build that result: spatiotemporal networks,
multiple stream networks, deep generative networks, and temporal coherency networks. These four
networks cooperate to find the final result so that the robot can make the right move.
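
As a rough illustration of the multiple stream idea, the sketch below fuses the class scores
of two separate streams, one assumed to look at appearance and one at motion, by averaging
their probabilities. The scores, the function names, and the equal weighting are assumptions
made for this example and are not taken from the article.

```python
import numpy as np

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def fuse_streams(spatial_scores, temporal_scores, spatial_weight=0.5):
    """Late fusion: weighted average of the two streams' class probabilities."""
    p_spatial = softmax(np.asarray(spatial_scores, dtype=float))
    p_temporal = softmax(np.asarray(temporal_scores, dtype=float))
    fused = spatial_weight * p_spatial + (1.0 - spatial_weight) * p_temporal
    return int(np.argmax(fused)), fused

# Hypothetical scores for three action classes from each stream.
label, probabilities = fuse_streams([2.0, 1.0, 0.0], [0.0, 3.0, 0.0])
```

Here the appearance stream alone would pick class 0, but the motion stream is confident
enough about class 1 that the fused prediction changes, which shows why combining streams can
give a different answer than either stream alone.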

At this point the functionality is complete, but computation speed still has a long way to
go. The decision-making tree and the calculations need lower time and data complexity; the
algorithm must not only work correctly but also run more efficiently than it does now. In
addition, data augmentation techniques, foveated architecture, and distinct frame sampling
strategies are also areas that can be improved.
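
One simple frame sampling strategy, shown here only as an assumed example and not as the one
the article proposes, is to split the video into equal segments and keep the middle frame of
each. This reduces the number of frames the network has to process while still covering the
whole action. The function name is hypothetical.

```python
def uniform_frame_indices(num_frames, num_samples):
    """Pick the middle frame of each of num_samples equal segments of the video."""
    if num_samples >= num_frames:
        return list(range(num_frames))   # video is short: keep every frame
    segment = num_frames / num_samples
    return [int(segment * i + segment / 2) for i in range(num_samples)]

# A 100-frame clip reduced to 4 frames spread evenly over its length.
indices = uniform_frame_indices(100, 4)
```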

Conclusion:

Action recognition is very important for computer vision, and this new research provides a
new approach to it. Static image analysis does a similar job, but compared with it, the new
action recognition system can provide more detail, better solutions, and more accurate moves.

Summation:

This is a wonderful article that helped me understand computer vision and how to improve
it so that computers can make better decisions. The article also clearly describes a new
approach to gathering data, and it showed me how to let my computer collect and study
different data. This new action recognition system could be used in many robots or machines.

Works Cited:

Herath, Samitha, Mehrtash Harandi, and Fatih Porikli. "Going Deeper into Action Recognition:
A Survey." Image and Vision Computing, vol. 60, 2017, pp. 4-21.
