
Coauthor Writing guide
EISC - UPC
Willy Ugarte

Week 1

• No Meeting

LaTeX

When it comes to writing up scientific documents, there's nothing better than LaTeX. But it can be intimidating for first-time users (and second-time users). This guide should help you get on your feet with LaTeX. This primer is written for Overleaf, an online TeX compiler. There are other options for using LaTeX (including CoCalc, ShareLaTeX, and many applications for running LaTeX locally), but Overleaf is probably the most used.
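To get started on Overleaf, create a new blank project; a minimal document looks like the sketch below (the class and packages are common defaults chosen for illustration, not requirements of any particular venue):

```latex
\documentclass{article}
\usepackage{graphicx} % for \includegraphics
\usepackage{amsmath}  % for equations

\title{A Working Title}
\author{FirstName LastName}

\begin{document}
\maketitle

\begin{abstract}
One concise paragraph summarizing the paper's aims, scope, and conclusions.
\end{abstract}

\section{Introduction}
Body text goes here.

\end{document}
```

Overleaf compiles this with its Recompile button; no local TeX installation is needed.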

A paper should allow readers to quickly discover the main results and then, if interested, to examine the supporting evidence. Structure your paper to support this behavior:

1. Describe the work in the context of accepted scientific knowledge.
2. State the idea that is being investigated, often as a theory or hypothesis.
3. Explain what is new about the idea, what is being evaluated, or what contribution the paper is making.
4. Justify the theory, by proofs or experiments.

1 A Simplified Picture

1. Work on a (relevant) Computer Science/Software Engineering/Information Systems question
2. Write a scientific paper
3. Submit the paper to an appropriate journal/conference
4. If accepted for publication then
   • Add one line to CV
   • Present work at scientific meetings
5. Else (paper rejected or to be modified) Go to Step 2

Table 1 shows the activities week by week for developing the conference paper in TP2.

Week  TP2
 1    No meeting
 2    The coauthor explains the conference paper structure. The students and coauthor get a working title and an abstract.
 3    Introduction.
 4    Introduction.
 5    Context.
 6    Context.
 7    Main contribution.
 8    No meeting (exams).
 9    Main contribution.
10    Related works.
11    Experiments.
12    Experiments.
13    Conclusions and perspectives.

Table 1: A Simplified Picture.

Week 2

• The coauthor explains the paper structure:
  – Title and information about authors
  – Abstract
  – Introduction
  – Context/Background/Overview
  – Main Contribution
  – Related Works
  – Experiments
  – Conclusions and Perspectives
  – Bibliography
• The students present a working title and a proposal of the abstract:
  – Title and information about authors
  – Abstract (see Fig 1):
    ∗ Is typically a single paragraph of 50-200 words.
    ∗ Allows readers to judge the relevance of the paper to them.
    ∗ Is a concise summary of the paper's aims, scope, and conclusions.
    ∗ Should be as short as possible while remaining clear and informative.
    ∗ The more specific, the more interesting.
    ∗ Self-contained and written for as broad a readership as possible.
    ∗ Use past tense, since it refers to work already done.
    ∗ Do not put in an abstract: minor details, paper structure, acronyms, mathematics, citations.
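Following these guidelines, the abstract in LaTeX is a single environment; the wording below is an invented placeholder to show the shape, not model prose:

```latex
\begin{abstract}
We investigated X in the context of Y. Existing approaches fail
when Z. We proposed a method based on W and evaluated it on two
scenarios, where it improved over the baselines we compared against.
\end{abstract}
```

Note that it uses the past tense and contains no acronyms, citations, or mathematics, as recommended above.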

Figure 1: Abstract example

2 TP2

Formats

Most of the conferences use one of these formats: IEEE template, Springer template, ACM template, Scitepress template.

• These can be modified later according to the progress of the paper
Week 3

• The students present a first version of the introduction:
  – What is the problem?
  – Why is it interesting and important?
  – Why is it hard? (e.g., why do naive approaches fail?)
  – Why hasn't it been solved before? (or, what's wrong with previous proposed solutions? How does ours differ?)
  – What are the key components of our approach and results? Also include any specific limitations.

Fig 2 shows the structure of the introduction in the first two pages (in red).

Week 5

• The students present a first version of the context:
  – a mini-version of the theoretical framework
  – each concept/definition/theorem should have a reference
  – Provides necessary (formal) background and terminology.
  – Defines the hypothesis and major concepts.

Fig 3 shows the structure of the context in two pages (in blue).
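In practice, "each concept/definition/theorem should have a reference" means a \cite command pointing at a BibTeX entry; in the sketch below the citation key, style, and file name are made up for illustration:

```latex
A key concept should be introduced together with its
source~\cite{author2020concept}.

% At the end of the document:
\bibliographystyle{plain}  % or the style your template mandates
\bibliography{references}  % entries live in references.bib
```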


Figure 2: Introduction example

Figure 3: Context example


Week 4

• The students present a final version of the introduction, after the observations of the coauthor.

Week 6

• The students present a final version of the context, after the observations of the coauthor.

Week 7

• The students present a first version of the main contribution:
  – describe in great detail your method/algorithm/model
  – put graphics/examples/tables to help the reader
  – step-by-step in the most simple and thoughtful way possible (don't make the reviewer think!!)
  – Explains the chain of reasoning that leads to the results.
    ∗ Provides the details of central proofs.
    ∗ Explains the experimental setup and summarizes the outcomes.
  – Figures, Charts, Tables, Diagrams, etc. can be extremely helpful to communicate ideas, observations, or data to others.
  – Many scientists outline their method by deciding on what figures, graphs, and tables they need in order to convey their story, and then fill the text around these figures.
  – If a figure is reproduced, copied, or adapted from another source, that source must be properly acknowledged in the caption and listed among the other references.

Fig 4 shows the structure of the main contribution in two pages (in green).

Week 9

• The students present a final version of the main contribution, after the observations of the coauthor.

Week 10

• The students present a final version of the related works:
  – a mini-version of the state-of-the-art
  – compare your work with 4 or 5 papers out of the 30 from the State-of-the-Art (see Section ??)
  – Most results are additions to existing knowledge.
  – A literature review is used to:
    ∗ describe existing knowledge,
    ∗ compare the new results to similar previously published results, and
    ∗ explain how the new results extend existing knowledge.
  – Can also be used to explain how existing methods differ from one another and what their respective strengths and weaknesses are.

Fig 5 shows the structure of the related works in two pages (in red).
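In LaTeX, the acknowledgement of a reproduced or adapted figure goes in the caption itself; a sketch of the standard figure environment (the file name, caption text, and citation key are illustrative):

```latex
\begin{figure}[t]
  \centering
  \includegraphics[width=\columnwidth]{figures/pipeline.pdf}
  \caption{Overview of the method (adapted from~\cite{smith2020}).}
  \label{fig:pipeline}
\end{figure}
```

The \label then lets the text reference the figure as Fig.~\ref{fig:pipeline}, so numbering stays consistent as figures move.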

Figure 4: Main contribution example

Figure 5: Related works example
correction or to move them independently to do a certain IK constraints to the limbs by assigning new target nodes neural networks to replace certain steps of the algorithms, another state.
action (i.e. grabbing a door knob to open it). for a simple re-positioning of the limb-ends (hands and feet). concluding in a more efficient Neural Network approach to
phase function provides a simple solution to the reactive Therefore, every layer in the model will present a configura- Mini-introduction of the section
3) System Overview: In this section we will discuss how z }| { Thus allowing us to override the limbs animations for a pose IV. R ELATED WORKS Motion Matching. Our method describes a mixed approach
animation generation. This approach is known as Phased- tion of four sets of weights and bias. The structure we used
all of this was integrated together in Unity 3D thanks to its Now, the different existing solutions using various tech- correction or to move them independently to do a certain such as Learned Motion Matching, with a clear difference, V. E XPERIMENTS
Function Neural Network (henceforth noted by PFNN). In can be represented as a three layer model with 512 units
built-in properties and how our system works step-by-step nologies for the forestry sector will be briefly discussed. action (i.e. grabbing a door knob to open it). the integration of our method can be described as superficial,
[13], the authors provided a method with no phase function for layers 0 and 1, and 311 for layer 2. For the activator Mini-introduction of the section
as seen in Fig. 3. Additionally, blockchain-based solutions were found with ar- 3) System Overview: In this section we will discuss how z }| { adding a layer of interactivity to the known PFNN. Mini-introduction of the section
that presented good results with quadruped characters. of the layers we used Exponential rectified linear function z }| {
a) Incorporation to Unity 3D:: To accomplish the task chitectural designs for different sectors that share similarities all of this was integrated together in Unity 3D thanks to its Now, the different existing solutions using various tech-
. activator. The neural network Θ [5] can be represented as: In this section we will discuss the experiments our project
of generating reactive animations we utilize Unity 3D be- in processes with those of logging management. built-in properties and how our system works step-by-step nologies for the forestry sector will be briefly discussed.
has undergone, as well as, what is needed to replicate said
Φ(x, a) = W2 ELU(W1 ELU(W0 x + b0 ) + b1 ) + b2 (2) cause of its various embedded systems, such as hierarchical as seen in Fig. 3. Additionally, blockchain-based solutions were found with ar-
First Related Work experiments and a discussion of the results obtained after
III. M AIN C ONTRIBUTION where : Game Objects, Transformations, Rigid bodies, collisions and z }| { a) Incorporation to Unity 3D:: To accomplish the task chitectural designs for different sectors that share similarities
this process.
the Animation Rigging package developed for it. Thanks to In [5], the authors propose a novel framework for the synthe- of generating reactive animations we utilize Unity 3D be- in processes with those of logging management.
Mini-introduction of the section ELU = max(x, 0) + exp(min(x, 0)) − 1 (3) cause of its various embedded systems, such as hierarchical
z }| { the mentioned systems we had the capability of building a sis of movements called Phase-Functioned Neural Network. First Related Work Third Related Work
For the development of a Unity package capable of switching Where Wk and bk is the network parameters returned by the hierarchical skeleton composition making possible the use of In contrast to other movement synthesis networks, this uses a Game Objects, Transformations, Rigid bodies, collisions and z }| { z }| {
between animations with every user input or character-object phase function Θ as seen in Equation 1. FK to create traditional animations for specific actions and particular time variable called Phase that is represented by a the Animation Rigging package developed for it. Thanks to In [5], the authors propose a novel framework for the synthe- In [13], the authors propose the usage of the output of one A. Experimental Protocol
interaction we proposed a character controlled PFNN for c) Training:: For the training we take each frame x to use the skeleton joints Transformation properties as input Phase function as seen in equation 1. In this article they used the mentioned systems we had the capability of building a sis of movements called Phase-Functioned Neural Network. network (named Gating Network) as blending coefficients
movement in every direction adapting to terrain variations and its next frame y and the current phase p and create for the PFNN which is built from a collection of binary the Catmull-Rom Cubic Spline function and changed the hierarchical skeleton composition making possible the use of In contrast to other movement synthesis networks, this uses a of expert weights to determine the dynamic weights for the To recreate the process of building, training and test-
and different types of movement (i.e. crouching or running) three matrices as X = [x0 , x1 , ...], Y = [y0 , y1 , ...] and P = files containing the weights and biases. Furthermore, the values of the weights and biases of the network depending FK to create traditional animations for specific actions and particular time variable called Phase that is represented by a Motion Prediction Network, in contrast to the PFNN which ing the model utilized in our project, we begin to
and applying the animation rigs necessary when interacting [p0 , p1 , ...]. We calculate the mean and standard deviation of Rigid body and collisions system allows the detection and on the current phase. On the other hand, our work utilizes the to use the skeleton joints Transformation properties as input Phase function as seen in equation 1. In this article they used weights are calculated with a phase function. This gating describe what was Development
needed to accomplish such task.
Environment
with object to correct the pose of the character. X and Y and normalized the data. For the loss function of the identification of the objects the character is interacting with PFNN to generate basic motion in real time. Nevertheless, for the PFNN which is built from a collection of binary the Catmull-Rom Cubic Spline function and changed the networks allows the character to switch or blend different z }| {
. model we used Mean Square Error and for the optimization, to determine which action to perform in each scenario since we added a simple to use interaction generation system based files containing the weights and biases. Furthermore, the values of the weights and biases of the network depending locomotion phases according to the user input and terrain 1) Development Environment: The environment used as
stochastic gradient descend algorithm [14]. not all animations where incorporated within the possible on Inverse Kinematics (IK) to extend the reach of the PFNN Rigid body and collisions system allows the detection and on the current phase. On the other hand, our work utilizes the variations the character is standing on. However, our method our main platform where all our models were trained was
4 https://www.mathworks.com/discovery/inverse-kinematics.html PFNN outputs. in a simple matter. identification of the objects the character is interacting with PFNN to generate basic motion in real time. Nevertheless, uses the PFNN since it require less data to train and can be Google Colaboratory Pro which provides us a total RAM of
to determine which action to perform in each scenario since we added a simple to use interaction generation system based stored in little space. In addition to this, the MANN needed 25GB and a Tesla T4 or a Tesla P100 GPU. Also having
not all animations where incorporated within the possible on Inverse Kinematics (IK) to extend the reach of the PFNN the expert weights of each desired action so that they could Google One, the first tier subscription, for up to 100 GB of
PFNN outputs. in a simple matter. blend. storage space in Google Drive.

Figure 4: Main Contribution example

Figure 5: Related works example


Week 8
• No meeting because there are exams.

Week 11
• The students present a first version of the experiments:
  – States in detail and analyses the results of the research.
  – The structure should be evident in the section headings.
  – There must be a subsection called “Experimental Protocol” where it’s explained:
    ∗ The PC configuration (RAM, CPU, ...) or the service configuration (Colab, AWS, Azure, ...)
    ∗ The data used (origin, construction, treatment, ...)
    ∗ It’s better if a URL with the code, or access to the service (website, demo, ...), is put here.
  – Theoretical evaluation
    ∗ Performance analysis
    ∗ Properties, complexity, etc.
  – Experimental comparative evaluation
    ∗ State clearly the target of experiments
    ∗ Experiment design: should be fair
    ∗ Draw conclusions from experiment results

Fig 6 shows the structure of the experiments in two pages (in yellow).

Week 12
• The students present a final version of the experiments.

Week 13
• The students present a final version of the conclusions and perspectives:
  – Are used to draw together the topics discussed in the paper.
  – Should include a concise statement of the paper’s important results and an explanation of their significance.
  – Are an appropriate place to (re)state any limitations.
  – Should look beyond the current context to:
    ∗ other problems that were not addressed;
    ∗ questions that were not answered;
    ∗ variations that could also be explored.
  – Perspectives should show possible future works, and show your interest in continuing with this topic.

Fig 7 shows the structure of the conclusions in two pages (in blue).
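The experiments structure described above can be sketched as a LaTeX skeleton. This is a minimal sketch: the subsection names follow the guide, while the hardware figures and the URL are illustrative placeholders, not prescribed values.

```latex
\documentclass{article}
\usepackage{hyperref}

\begin{document}

\section{Experiments}

\subsection{Experimental Protocol}
% PC configuration (RAM, CPU, ...) or service configuration (Colab, AWS, ...):
All experiments were run on an Intel i7 CPU with 16\,GB of RAM.
% The data used (origin, construction, treatment):
The dataset was built from publicly available motion-capture clips.
% A URL with the code, or access to a demo, is a plus (placeholder URL):
Our code is available at \url{https://github.com/user/project}.

\subsection{Theoretical Evaluation}
% Performance analysis; properties, complexity, etc.

\subsection{Experimental Comparative Evaluation}
% State clearly the target of each experiment, use a fair design,
% and draw conclusions from the results.

\end{document}
```

Empty subsections compile, so the skeleton can be filled in incrementally as results come in.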

[Two annotated pages of the example paper, reproduced as images: the Experiments section (overbrace labels: “Experimental Protocol”, “Development Environment”, “Dataset”, “Models Training”, “Testing Environment”, “Source Code”, “Results”, “Discussion”, with Tables I and II) and the Conclusions section (labels: “First Conclusion”, “Second Conclusion”, “Future works”, followed by the reference list).]

Figure 6: Experiments example

Figure 7: Conclusions example

• The coauthor gives his approval of the final draft.

A An editing checklist

You should look for these typos in the drafts presented by students:

• Consistency:
  – Are the titles and headings consistent with the content?
  – Have all terms been defined?
  – Is the style of definition consistent? For example, were all new terms introduced in italics, or only some?
  – Has terminology been used consistently?
  – Are abbreviations and acronyms stated in full when first used? Are any abbreviations or acronyms introduced more than once?
  – Are any abbreviations used less than, say, four times? If not, can they be removed?

• Syntax:
  – Has a term been capitalized in one place and not in another?
  – Is the style and wording of headings and captions consistent?
  – Are names always used in the same way?
  – Is spelling consistent? What about “-ise” versus “-ize”, “dispatch” versus “despatch”, or “disc” versus “disk”?
  – Is tense used correctly?
  – Is the use of indentation consistent?
  – Do the parentheses match?

• References:
  – Are references discussed in a consistent way?
  – Have bold and italic been used logically?
  – In the references, has each field been formatted consistently?

• Experiments:
  – Have units been used logically? If milliseconds have been used for some measurements and microseconds for others, is there a logical reason for doing so?
  – Has “megabyte” been written as “Mb” in some places and “Mbyte” in others?
  – Are all values of the same type presented with the same precision?
  – Are the graphs all the same size?
  – Are the axis units always given? If, say, the x-axes on different graphs measure the same units, do the axes have the same label?
  – Are all tables in the same format?
  – Are units given for every value?
  – Are labels and headings named consistently?
  – If, say, columns have been used for properties A to E in one table, have rows been used elsewhere? That is, do all tables have the same orientation?

B Bibliography

• Each reference should be:
  – relevant;
  – up-to-date: check when taking over references from other papers;
  – reasonably accessible: pay attention to abbreviations of conference names, check validity of pointers to online material;
  – necessary.

• Three main styles:
  – numbered references: [16, 32, 18]
  – named references: (Chen and Li 2005, Deutsch et al. 1997)
  – uppercase abbreviations: [CL05, DPV97]
  Which one to use is usually determined by the conference or format (IEEE, Springer, ACM, ...).

• Avoid making references the subject of a sentence:
  – No: [18] shows that query answering . . .
  – Yes: Chen and Li [18] show that query answering . . .
  – No: (Chen and Li 2005) shows that query answering . . .
  – Yes: Chen and Li (2005) show that query answering . . .

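The citation-style rules above map directly onto LaTeX citation commands. Below is a minimal sketch using the natbib package; the key chen2005 and the refs.bib file are illustrative assumptions, not part of the guide.

```latex
\documentclass{article}
\usepackage[round]{natbib} % round: (Chen and Li 2005); use [numbers] for [18]

\begin{document}

% No:  \citep{chen2005} shows that query answering ...
% Yes: make the authors the grammatical subject of the sentence:
\citet{chen2005} show that query answering \ldots
% \citet{chen2005} prints ``Chen and Li (2005)'';
% \citep{chen2005} prints ``(Chen and Li 2005)''.

\bibliographystyle{plainnat}
\bibliography{refs} % assumes a refs.bib containing the chen2005 entry

\end{document}
```

Switching the natbib option (or the bibliography style required by IEEE, Springer, ACM, ...) changes how the same commands render, so the text itself does not need to be rewritten.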
