Summary

A multi-robot path-planning algorithm for autonomous navigation using meta-reinforcement learning based on transfer learning

 presents dynamic proximal meta policy optimization with covariance matrix adaptation evolutionary strategies (dynamic-PMPO-CMA) to avoid obstacles and realize autonomous navigation

Multi-robot coordination (MRC) achieves higher efficiency than a single robot in search and rescue (SAR) scenarios.

Traditional approaches use rule-based methods that assume a perfect model of the environment for path planning:

 Sampling-Based Robot Motion Planning: A Review


o Uncertainty in the perception stage leads to accumulated localization errors.
o Processing the collected data and accounting for errors is essential for accurate mapping and localization.
o Path planning is a purely geometric process, concerned only with finding a collision-free path regardless of the path's feasibility.
o Once a path is specified, the final procedure is motion control, or execution.
o Traditional methods
 Exact roadmap methods, such as visibility graphs, Voronoi diagrams, Delaunay triangulation [23], and adaptive roadmaps [24], attempt to capture the connectivity of the robot search space.
 Cell decomposition methods, in which the workspace is subdivided into small cells, have been applied in robotics [25].
 Search algorithms such as Dijkstra [26] and A* [27] find an optimal solution in a connectivity graph, whereas D* [28] and AD* [29] are tailored to dynamic graphs.
o Novel computational methods
 Fuzzy Logic Control [33]–[35], Neural Networks [36], Genetic Algorithms [37], [38], Ant Colony Optimization [39], and Simulated Annealing [20] have all been applied in robot path planning.
 Randomized Potential Planner (RPP): RPP uses random walks to escape local minima of the potential field planner. Later, a planner based entirely on random walks with adaptive parameters was proposed.
 Rapidly-exploring Random Trees (RRT)
 Probabilistic Roadmap Method (PRM)
 Expansive space trees (EST)
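As a concrete illustration of the graph-search planners listed above, here is a minimal A* sketch on a 4-connected occupancy grid with a Manhattan-distance heuristic. The grid layout, unit step cost, and function name are illustrative assumptions, not taken from any cited paper.

```python
import heapq

def astar(grid, start, goal):
    """A* search on a 4-connected grid; cells equal to 1 are obstacles."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    open_set = [(h(start), 0, start, [start])]  # (f, g, node, path so far)
    seen = set()
    while open_set:
        f, g, node, path = heapq.heappop(open_set)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nxt = (nr, nc)
                if nxt not in seen:
                    heapq.heappush(open_set,
                                   (g + 1 + h(nxt), g + 1, nxt, path + [nxt]))
    return None  # no collision-free path exists

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = astar(grid, (0, 0), (2, 0))  # routes around the obstacle row
```

Dijkstra is the special case where the heuristic term is dropped (h always 0); D* and AD* additionally repair the solution incrementally when edge costs change.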

Sampling-based planning overview

 Elman Fuzzy Adaptive Control for Obstacle Avoidance of Mobile Robots Using Hybrid Force/Position Incorporation
 addresses a virtual force field between mobile robots and obstacles to keep them apart at a desired distance.
 an Elman neural network is proposed to compensate for the effect of uncertainties between the dynamic robot model and the obstacles.
 uses an Elman fuzzy adaptive controller to adjust the exact distance between the robot and the obstacles.
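The virtual force field idea above can be sketched as a simple repulsive potential: the force on the robot grows as it gets closer than a safe distance to an obstacle and vanishes beyond it. The gain, safe distance, and function name are illustrative assumptions.

```python
import math

def repulsive_force(robot, obstacle, d_safe=1.0, gain=2.0):
    """Virtual repulsive force pushing the robot away from an obstacle.
    Zero beyond d_safe; grows as the separation shrinks below d_safe."""
    dx, dy = robot[0] - obstacle[0], robot[1] - obstacle[1]
    d = math.hypot(dx, dy)
    if d >= d_safe or d == 0.0:
        return (0.0, 0.0)
    mag = gain * (1.0 / d - 1.0 / d_safe)  # classic potential-field form
    return (mag * dx / d, mag * dy / d)    # unit direction away from obstacle
```

In the cited work, the Elman network then compensates for model uncertainty on top of such a force term; here only the geometric part is shown.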

It is hard to model the dynamics of the environment because of uncertainty, such as observation noise or unknown dynamic obstacles.
 EFFICIENT META REINFORCEMENT LEARNING VIA META GOAL
GENERATION

Meta reinforcement learning (meta-RL) is able to accelerate the acquisition of new tasks by
learning from past experience.

Given a number of tasks with similar structures, meta-RL methods enable agents to learn that structure from previous experience on many tasks. Thus, when encountering a new task, agents can quickly adapt to it with only a small amount of experience.

 Recurrent and recursive meta-RL methods


 gradient-based meta reinforcement learning
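The gradient-based flavor (in the spirit of MAML) can be illustrated on a toy quadratic task family rather than a full RL environment: each task i has loss (theta - c_i)^2, and we meta-learn an initialization theta that adapts well after one inner gradient step. The task family, step sizes, and analytic meta-gradient are illustrative assumptions for this sketch only.

```python
def inner_adapt(theta, c, alpha=0.1):
    """One inner-loop gradient step on the task loss (theta - c)**2."""
    return theta - alpha * 2.0 * (theta - c)

def meta_gradient(theta, cs, alpha=0.1):
    """d/dtheta of sum_i L_i(inner_adapt(theta, c_i)). For this quadratic
    family d(theta')/d(theta) = 1 - 2*alpha, so the chain rule is exact."""
    return sum(2.0 * (inner_adapt(theta, c, alpha) - c) * (1.0 - 2.0 * alpha)
               for c in cs)

theta, cs = 0.0, [1.0, -1.0, 3.0]   # meta-init and three tasks
for _ in range(200):                # outer (meta) loop
    theta -= 0.05 * meta_gradient(theta, cs)
# theta converges toward an initialization from which a single inner
# gradient step moves close to any individual task optimum.
```

In actual meta-RL the inner loss is an expected return estimated from rollouts and theta parameterizes a policy network, but the two-level gradient structure is the same.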

To handle complex environments with humans and other unpredictable moving objects:

 Multi-Path Planning for Autonomous Navigation of Multiple Robots in a Shared Workspace with Humans
o in a complex environment with humans and other unpredictable moving objects, fixing a single path to the goal may lead to a situation where many obstacles lie on the planned path and the robots fail to realise the moving plan.
o a new approach using multiple path planning, where each robot has different options for choosing its path to the goal, is introduced.
o the information about planned moving paths is shared among the robots in the working domain, combined with obstacle avoidance constraints in local ranges, and formulated as an optimisation problem.
 Multiple Path Planning:
o multi-agent path finding (MAPF)
o a novel multiple path planning approach that can deal with an uncertain, dynamic environment containing non-static objects such as humans and robots. The method introduces an effective global planner for avoiding deadlock situations, overcoming the risk of congestion when multiple robots are navigated through an area that is narrow relative to the robots. The combination of a VO-based method and the common DWA planner allows robots to avoid collisions with moving obstacles.
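The DWA part of that combination can be sketched as scoring a small set of (linear, angular) velocity candidates by forward-simulating each one briefly and trading goal progress against obstacle clearance. The weights, velocity grid, collision radius, and function names are illustrative assumptions; a real DWA also restricts candidates to the dynamically reachable window.

```python
import math

def dwa_score(v, w, robot, goal, obstacles, dt=0.5, w_goal=1.0, w_clear=0.5):
    """Score one velocity candidate by a one-step unicycle rollout."""
    x, y, th = robot
    th2 = th + w * dt                      # forward-simulate heading
    x2 = x + v * math.cos(th2) * dt        # then position
    y2 = y + v * math.sin(th2) * dt
    goal_term = -math.hypot(goal[0] - x2, goal[1] - y2)
    clear = min((math.hypot(ox - x2, oy - y2) for ox, oy in obstacles),
                default=float("inf"))
    if clear < 0.2:                        # assumed collision radius
        return -float("inf")
    return w_goal * goal_term + w_clear * min(clear, 2.0)

def pick_velocity(robot, goal, obstacles):
    """Choose the best candidate from a coarse velocity grid."""
    candidates = [(v, w) for v in (0.0, 0.5, 1.0) for w in (-0.5, 0.0, 0.5)]
    return max(candidates,
               key=lambda c: dwa_score(c[0], c[1], robot, goal, obstacles))
```

The VO-based component would additionally prune candidates whose relative velocity leads into a moving obstacle's velocity obstacle cone; only the DWA scoring is shown here.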

Model-free approaches like reinforcement learning have been applied to the robot motion planning problem [5], but they require vast amounts of training data, resulting in low sample efficiency.

 The Q-learning obstacle avoidance algorithm based on EKF-SLAM for NAO autonomous walking under unknown environments
o The SLAM problem is solved with the EKF-SLAM algorithm, whereas the path planning problem is tackled via Q-learning.
o three classes of deep reinforcement learning algorithms:
 temporal-difference learning using Deep Q Networks [24],
 policy gradient using Trust Region Policy Optimization [33], and
 actor-critic using Deep Deterministic Policy Gradients [21]
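Before those deep variants, the tabular Q-learning used in the EKF-SLAM paper can be sketched on a tiny example. The corridor environment, hyperparameters, and function names are illustrative assumptions, not the paper's setup.

```python
import random

random.seed(0)  # for a reproducible sketch

def q_learning(n_states, n_actions, step, episodes=500,
               alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning; step(s, a) -> (next_state, reward, done)."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection
            a = (random.randrange(n_actions) if random.random() < eps
                 else max(range(n_actions), key=lambda i: Q[s][i]))
            s2, r, done = step(s, a)
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])  # TD update
            s = s2
    return Q

# 1-D corridor: states 0..4; action 1 moves right, 0 moves left; goal at 4.
def corridor(s, a):
    s2 = min(4, s + 1) if a == 1 else max(0, s - 1)
    return s2, (1.0 if s2 == 4 else 0.0), s2 == 4

Q = q_learning(5, 2, corridor)  # learned policy: always move right
```

DQN replaces the table with a neural network over raw observations, which is what makes the approach scale to the navigation tasks discussed below.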

Deep Q-learning combined with a CNN (Convolutional Neural Network) is used to address the problem that, in conventional path planning algorithms, robots need to search a comparatively wide area for navigation and move in a predesigned formation in a given environment.

Deep learning combined with reinforcement learning (DRL) has recently been used for a wide range of tasks, such as games [5], robotics control [6,7], and navigation [8–14].

DRL methods are usually limited to a single scenario and fail to work without a large amount of additional training data if the target environment changes.

Meta-learning [16,17] resembles human intelligence in that it can quickly learn a new task from a small amount of new data.

Meta-learning integrated with reinforcement learning (meta-RL) can improve the generalization ability for new tasks by drawing on previous experience from learning tasks with similar structures.
