
Data-Driven Deep Reinforcement Learning for Animation

Summary: The paper proposes a deep reinforcement learning framework for training virtual
humanoid characters to perform various tasks, bootstrapping learning with known reference
poses. By defining a reward function that rewards both mimicking the reference poses and
achieving a task objective, the character learns smooth motions in the context of the given task.
This framework enables the model to learn skills across different characters, environments, and
tasks.

Strengths: The method appears to outperform previous approaches in realism and smoothness of
movement, since the characters learn to mimic motion-capture data. It can also retarget the
reference poses to different morphologies and train humanoid characters with different
body-weight compositions to reproduce similar movements. The framework adapts to domain
variations, such as differing heights and obstacles. Characters can also learn to complete tasks
while performing these movements, making them robust to a range of environmental
requirements.

Weaknesses: The proposed framework relies heavily on the hyperparameters of the reward
function. The coefficients weighting the task objective against pose mimicking are given in the
paper but may not be optimal for any given task. Additionally, each reference pose must be
mapped to a point in time within a periodic motion cycle. Training time has also been reported
to be on the order of days, making it difficult to iterate and to train on many tasks and
variations. Finally, obtaining the reference poses is itself a non-trivial task, especially for
non-humanoid characters.
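
To make the phase-mapping point more concrete, one plausible reading is that each simulation
time step is converted into a normalized phase of the reference motion cycle, which then indexes
(and interpolates) the target pose used by the imitation reward. The sketch below illustrates that
idea; the clip format and interpolation scheme are assumptions, not details taken from the paper.

```python
import numpy as np

# Rough illustration of mapping simulation time to a phase of a periodic
# reference motion, then indexing the target pose for the imitation reward.

def motion_phase(t, cycle_duration):
    """Normalized phase in [0, 1) of a cyclic reference motion at time t."""
    return (t % cycle_duration) / cycle_duration

def target_pose(phase, reference_clip):
    """Look up (with linear interpolation) the reference pose at this phase.

    reference_clip: array of shape (num_frames, num_joints) holding the
    reference poses sampled uniformly over one motion cycle.
    """
    num_frames = reference_clip.shape[0]
    idx = phase * num_frames
    lo = int(np.floor(idx)) % num_frames
    hi = (lo + 1) % num_frames
    frac = idx - np.floor(idx)
    return (1.0 - frac) * reference_clip[lo] + frac * reference_clip[hi]
```

This also hints at why non-periodic or poorly segmented reference data is hard to use: the
approach assumes a well-defined cycle to which every pose can be assigned a phase.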

Reflection: The most insightful idea in this paper is that data from motion capture or keyframed
animation can bootstrap a character to learn those movements in conjunction with a task
objective, producing motion that looks visually smoother than previous methods. A natural next
step would be to learn the reference poses from video data by predicting and extracting
keypoints, and to make the model robust to the resulting imperfect data. Additionally,
investigating more principled ways to choose the reward hyperparameters, or even adaptive
hyperparameters, may lead to faster training and better results.
