Download as pdf or txt
Download as pdf or txt
You are on page 1of 14

Computers and Electronics in Agriculture 163 (2019) 104827

Contents lists available at ScienceDirect

Computers and Electronics in Agriculture


journal homepage: www.elsevier.com/locate/compag

Original papers

Agbots: Weeding a field with a team of autonomous robots T


a,⁎ b c b
Wyatt McAllister , Denis Osipychev , Adam Davis , Girish Chowdhary
a
Department of Electrical and Computer Engineering, University of Illinois and UIUC Coordinated Science Lab (CSL), United States
b
Department of Agricultural and Biological Engineering and CSL, University of Illinois, United States
c
Department of Crop Sciences, University of Illinois, United States

ABSTRACT

This work presents a strategy for coordinated multi-agent weeding under conditions of partial environmental information. The goal is to demonstrate the feasibility of
coordination strategies for improving the weeding performance of autonomous agricultural robots. It is shown that, given a sufficient number of agents, a team of
autonomous robots can successfully weed fields with various initial seed bank densities, even when multiple days are allowed to elapse before weeding commences.
Furthermore, the use of information sharing between agents is demonstrated to strongly improve system performance as the number of agents increases.
As a domain to test these algorithms, a simulation environment, Weed World, was developed, which allows real-time visualization of coordinated weeding
policies, and includes realistic weed generation. In this work, experiments are conducted to determine the required number of agents for given initial seed bank
densities and varying allowed days before the start of the weeding process.

1. Introduction: the herbicide resistant weed problem areas between crop rows. Hand weeding of young weeds at the two-leaf
growth stage is difficult and impractical at scale. Mechanized inter-row
Weed management has historically relied on a combination of crop cultivation has disadvantages, such as soil compaction due to use of
rotation, mechanical weed control, and the use of a variety of herbi- heavy machinery, and an inability to work after the crop canopy closes.
cides (Shaner, 2014). The evolution of herbicide-resistant weeds, cou- Due to crop canopy growth, no current mechanical weed control
pled with the fact that new herbicide discovery has ceased in the past method is effective within the crop (Mohler et al., 1997). This work
30 years, has resulted in a crisis for agricultural weed management suggests that a team of collaborative low-cost and lightweight me-
(Heap, 2009; Maxwell et al., 1990). Current crop losses due to herbicide chanical weeding robots (termed here as agbots shown in Fig. 1) may be
resistant weeds are approximately half a billion per year, and may utilized to control herbicide-resistant weeds. The team of agbots targets
climb to $100 billion per year when chemical control is lost (Livingston weeds within and between crop rows, as opposed to tractors, combines,
et al., 2016). Evolution of resistance to multiple sites of herbicide action and planters, which cannot be used after the crop canopy closes. The
is accelerating in dominant weeds, especially in the southern and north- agbots are ideal for working in dense fields, since they are small enough
central U.S. grain production regions (Bagavathiannan and Norsworthy, to drive over plants without damaging them, and do not compact the
2016). Increasingly, farmers are only one site-of-action away from total soil as large machinery would.
loss of chemical control. For example, the five-way multiple resistant Termination of weed seedlings within the critical weed free period
waterhemp (Amaranthus tuberculatus [Moq.] Sauer) in Illinois is now (Booth et al., 2003), where crops are most vulnerable, is essential to
one gene away from total loss of chemical control (Evans, 2016). Seeds preventing crop yield losses in corn and soybeans (Page et al., 2012).
with bred-in herbicide resistance are exacerbating multiple herbicide For many crops, weeding may be done under a canopy, and therefore
resistance problem in soybean production (Gressel et al., 2017). An under conditions of partial environmental information. To address this
alternative to chemical weeding is mechanical weeding, which uses a issue, several companies, such as TerraSentia (TerraSentia), Ecorobotix
physical system, composed of farm equipment or autonomous vehicles. (Ecorobotix), and Naio-Technologies (Naio technologies), have devel-
Mechanical weed management usually targets young weeds, in- oped small agricultural robots for autonomous weeding. However, in
cluding germinating seeds and seedlings that are extremely vulnerable order for robots like these to be employed at scale in the agricultural
to physical damage. Before crop planting, superficial soil disturbance industry, multi-agent planning strategies applicable to real field en-
and subsequent soil cultivation can remove germinated weeds. vironments must be developed.
However, after planting, mechanical weed control is usually limited to The main goal of this paper is to evaluate whether a coordinated

Corresponding author.

E-mail addresses: wmcalli22@illinois.edu (W. McAllister), deniso2@illinois.edu (D. Osipychev), asdavis1@illinois.edu (A. Davis),
girishc@illinois.edu (G. Chowdhary).

https://doi.org/10.1016/j.compag.2019.05.036
Received 6 September 2018; Received in revised form 8 May 2019; Accepted 18 May 2019
Available online 18 June 2019
0168-1699/ © 2019 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license
(http://creativecommons.org/licenses/BY/4.0/).
W. McAllister, et al. Computers and Electronics in Agriculture 163 (2019) 104827

Fig. 1. This solution for robotic mechanical weed control is a dynamically configured team of weeder bots, and automated maintenance barns for persistent
autonomous weed control, leveraging collaboration, and local and remote data sources.

approach to mechanical robotic weeding is feasible in realistic field 1.1. Overview of methods used
environments. A comprehensive case study is presented, which utilizes
strategies for multi-robot coordination to create a scalable weeding This work was conceived with the goal of establishing the feasibility
solution, which leverages shared information between the agents to of a mechanical weeding solutions. In order to do this, a realistic si-
improve weeding performance, even when much of the field is un- mulation framework is created, which incorporates weed growth
observed. For this study, a realistic simulation framework is employed models from existing agricultural studies. This framework is used to test
to determine design criteria for real weeding robots in development. the performance of an algorithm for coordinated robotic weeding across
Foraging, where robots move through an environment and collect a range of field conditions. This simulated case study demonstrates that
objects or information, has long been considered a key problem in coordinated weeding strategies are feasible even when robots en-
multi-agent robotics (Cao et al., 1997). In this case, the foraging pro- counter a highly infested field and have some limitations on the ob-
blem is framed in terms of recognizing and killing weeds while moving servable data. This work does not incorporate Unmanned Aircraft
through the field. This work builds on past work in coordinated robotics Systems (UAS), as canopy closure in many crops such as corn limits
(Liu et al., 2017; Mataric, 1997; Balch, 1999; Mataric, 2001; Shoham aerial visibility of weeds for most of the growing season. Only a general
et al., 2003) to create a system for cooperative robotic weeding which set of ground robots is assumed, with a speed of motion of 1 m per
addresses the problem of partial environmental information without second.
relying on a separate agent for information gathering. In later work, this environment will be used to test algorithms for
In this work, the solution relies on optimization over a “reward” coordinated weeding which leverage predictive models to anticipate
metric (Bellman, 1957), chosen to be the total of the maximum height weed growth in real time. This work was done in collaboration with
of weeds in every 0.8 m2 region of the field. This ensures the system Earth Sense Incorporated, at the University of Illinois at Urbana-
eliminates weeds before they grow too large for the mechanical Champaign, which designs the Terra Sentia Robot, shown in Fig. 2a.
weeding process to deal with. One additional advantage of this me- This platform is currently used for real-time data collection in fields at
chanical weeding method is that killing the weeds when they are small the University of Illinois. A mechanical weeding apparatus for the Terra
prevents them from growing large enough to spread more seeds. Sentia is currently under development, prototype shown in Fig. 2b. This

Fig. 2. Prototypes of Terra Sentia Robot on the South Quad, and Weeding Apparatus at the greenhouse of Turner Hall at the University of Illinois at Urbana-
Champaign, used to estimate simulation parameters.

2
W. McAllister, et al. Computers and Electronics in Agriculture 163 (2019) 104827

apparatus uses a course bristle on a robot arm to break the root system
of young weed seedlings. It is expected to weed each square in under
two minutes, as long as weeds are less than 7.62 cm tall, so these
parameters were used in the simulation.

1.2. Formulation of the weeding problem

The weeding problem is considered within the framework of co-


ordinated multi-robot task allocation. Instead of using naive approaches
like off-the-shelf Reinforcement Learning (RL) applied on the full state
space, or a simple full sweep of the field which does not rely on agent
coordination, the problem is framed in a concise and efficient manner.
The state space is reduced by leveraging the row crop structure of the
field, and plan only over the rows of the field, to compute a centralized
one-step policy using a shared model of the environment, which is
updated at every iteration. This leads to vastly more efficient compu-
tation than the RL approach, while also providing comparable perfor-
mance to the case of a fully observed field, for the one-acre simulated
field environment considered, which the uncoordinated approach fails
to do.

1.3. Weed growth models

Past work has presented detailed analysis of weed growth models, in


which measurements of seed bank density for various species of weeds
were conducted (Nordby et al., 2007; Schutte and Davis, 2014; Sellers
et al., 2003; Horak and Loughin, 2000). In the weed growth model used
here, the rate of seedling emergence agrees with that found in the above
research. A Poisson process is utilized to model the temporal evolution
of emergence events. This assumption is reasonable over the short time
scales in which robots may fully weed a field.
Other work in the field of crop science (Mulugeta and Stoltenberg,
1997; Mulugeta and Boerboom, 1999) has shown that the spatial var-
iation in the seed bank density for some species of weeds may be Fig. 3. Simulation environment, Weed World, designed in JavaScript. In the
modeled via the Gini Coefficient of Concentration (GCC). This model is bottom portion of this figure, the vertical lines mark the boundaries between
the rows, where the crop is assumed to be planted. Each cell represents a small
accurate for common waterhemp (Amaranthus tuberculatus), and is thus
0.8 m square portion of the field. The colors of the squares represent weed
utilized in this work to govern the spatial evolution of the seed bank
density, with darker colors representing higher density. (For interpretation of
density. Furthermore, in (Davis et al., 2013; Werle et al., 2014), the
the references to color in this figure legend, the reader is referred to the web
relationship between seed bank emergence patterns and environmental version of this article.)
conditions such as temperature and moisture is examined.

1.4. Outline
in uncertain environments. In the bottom portion of this figure, the
Section 2 presents a detailed explanation of the algorithms used and vertical lines mark the boundaries between the rows, where the crop is
the simulation environment created to test them. Section 3 presents an assumed to be planted. The crops are assumed to be arranged in evenly
interpretation of the experiments conducted, along with their sig- spaced rows, which are parallel. This environment incorporates the
nificance. Section 4 presents further analysis of these results. Finally, weed growth model and planning algorithm detailed above, as well as a
Section 5 presents conclusions. framework for multi-agent collaboration, which enables a scalable
amount of agents to easily share information. This framework also al-
2. Methods lows the amount of environmental information available to agents to be
changed, in order to determine the effect on the algorithm performance.
This section details the methods used in this work. First, the weed Weed World allows Monte Carlo runs to be conducted, in order to
growth model is explained, as well as the state, action, and reward test the planning algorithm’s performance in fields with varying char-
models utilized. The optimization framework is then introduced, as well acteristics, and with agents with different capabilities, which have ac-
as the value function utilized for optimization, and the optimization cess to a varying amount of information about the environment.
algorithm used to solve the optimization problem. The algorithm for Through these Monte Carlo runs, the conditions under which full
targeted information gathering, which allows agents to simultaneously weeding of a given field is feasible are determined, informing the design
gather environmental information while performing coordinated considerations utilized in the development of real weeding robots, and
weeding, is explained. Finally, an outline of the experiments conducted thus expediting the design process.
in this work is presented. The goal of developing the Weed World simulation environment
was to enable future researchers to easily design and prototype algo-
2.1. Simulation environment rithms for coordinated multi-agent weeding in uncertain environments.
The stochastic and non-stationary nature of the evolution of the reward,
The simulation environment utilized for the above experiments, as governed by the growth of weeds in a dynamic environment, may
Weed World (shown in Fig. 3), was developed to allow large scale si- make this domain of strong theoretical interest to the robotic planning
mulations of coordinated weeding algorithms for multi-robot planning community.

3
W. McAllister, et al. Computers and Electronics in Agriculture 163 (2019) 104827

Table 1
Seed bank density parameters.
Parameter GCC S0 (seeds/cell) Num. of Patches Patch Size (Cells in X and Y)
Range [0.31, 0.35] [600, 1560] 50 [0, 20]

2.2. Weed generation RW (x , y , t ) = (x , y , t ) (6)

The weed growth model is based on Bernoulli random variables,


2.3. State and action model
operating on a matrix of cells, each 0.8 m2, comprising a gridded field of
0.4 hectares, or a cellular automata model (Chopard and Droz, 1998).
In the following equations, Ndim is the number of rows (85), Nagents is
Seeds emerge from a limited seed bank, forming a binomial distribution
the number of agents, Ylen. is the length of each row (64 m), RW (x , y , t )
over time. The parameters used, summarized in Table 1, are aligned
is the reward for each location (x , y ), and v is the velocity of the agents.
with the growth model for common waterhemp determined in
The environmental state, S, depends on the x and y positions of each
(Mulugeta and Stoltenberg, 1997). The initial density of the seed bank
agent in I. The action, a, is defined to be the target row chosen by each
in each cell is S0 on average (between 600 and 1560 seeds per cell).
agent. Here, Ndim is the number of rows in the field, 85, and Nagents is the
However, at the start of the simulation, the seed bank density in each
number of agents.
cell, S0 (x , y ) , is chosen so that Gini Coefficient of Concentration (GCC)
between all the cells, used to ensure the relative density of weeds aligns S {1, …, Ndim} × {1, …, Ndim} (7)
with that seen in real experiments, is from 0.31 to 0.35, as was de-
termined experimentally in (Mulugeta and Stoltenberg, 1997). The field
I {1, …, Nagents} (8)
is first divided into fifty patches of weeds, with centers chosen uni- (xi (t ), yi (t )) S i I (9)
formly at random, and sizes from zero to twenty cells in each dimen-
sion. Each cell has an initial density between zero and twenty percent of ai (t ) = {xi (t + 1), yi (t + 1)} A S (10)
S0 , chosen uniformly at random. Finally, for each patch of weeds, up to
The initial order of the state space is 85 × 85 × Nagents . The order of the
an additional S0 weeds are added to each cell within each patch, so that
state-action space is then (85 × 85 × Nagents )2 . In this problem, the policy
the density of each patch follows a normal distribution (see Table 1).
must be computed in real-time, and the fact that agents operate in a
Upon initialization of the simulation, a certain number of days, d 0 ,
dynamic and uncertain environment makes the problem non-stationary.
are allowed to elapse before weeding starts. Both parameters, S0 and d 0 ,
Therefore, a reinforcement learning approach which planned over the
are benchmarked against the number of agents in order to determine
whole state space would be too computationally intensive for on-line
the feasibility of mechanical weeding with the team of small robots. The
learning. Thus, the dimensionality of the problem is reduced, and an
number of emerging weeds in each cell, Nemerge , is a randomly generated
Poisson variable with mean, (x , y , t ) , such that 90 percent of the seed optimization-based approach over a one-step planning horizon is uti-
bank, S (x , y , t ) , emerges in Ttotal , which is two months. This emergence lized, instead of using reinforcement learning to attempt to compute the
rate is aligned with past work (Nordby et al., 2007; Schutte and Davis, policy over an extended horizon.
2014; Werle et al., 2014; Sellers et al., 2003; Horak and Loughin, 2000), |S × A| = (85 × 85 × Nagents ) 2 (11)
which all present measurements of the seed bank densities for various
species of weeds, and provide an analysis of weed growth models for Given the assumption that the problem is non-preemptive, and the
these species. capability of agents to observe neighboring rows, the dimensionality of
the problem is reduced. It is assumed that since agents finish rows once
0.9· t· S (x , y, t ) 0.9·d 0 ·S0 starting to weed them, only the x location is relevant for the state and
t (x , y , t ) = , 0 =
Ttotal Ttotal (1) action. The new size of the space is85 × Nagents .
Nemerge (x , y , t ) = Poi( t (x , y , t )) (2) S {1, …, Ndim} (12)
tcurrent
I {1, …, Nagents} (13)
S (x , y , t ) = S0 Nemerge (x , y, t )
t = t0 (3) x i (t ) S i I (14)
The weed density in each cell, (x , y , t ) , grows as seeds emerge from
ai (t ) = x i (t + 1) A S (15)
the seed bank. The maximum weed height at each cell, (x , y, t ) , in-
creases from zero height at a fixed rate cm per day.
tcurrent 2.4. Reward model
(x , y , t ) = Nemerge (x , y , t )
t = t last weeded (4) The reward is composed of the reward for each cell of weeds in the
field, RW (x , y , t ).
tcurrent tlast weeded inch
(x , y , t ) = S {1, …, Ndim} × {1, …, Ndim} (16)
60·60·24 day (5)
Due to limitations of mechanical weeding, it is highly important to I {1, …, Nagents} (17)
remove weeds before they become too large to be eliminated by the
RW (x , y , t ) (x , y ) S, i I (18)
specific weeding tools available to small agbots. We therefore define the
reward for weeding each cell, RW (x , y , t ) , to be equal to the maximum However, the agents plan only over the observed portion on the en-
height of weeds in the cell, (x , y, t ) . vironment. They keep track of the estimated density and maximum

4
W. McAllister, et al. Computers and Electronics in Agriculture 163 (2019) 104827

height for each observed cell, using this to estimate a total scalar re- Ti (xi (t ), ai (t )) R (a (t ))
i i
Vti (x i (t ), ai (t )) = ai (t ) = argmax Vti (x i (t ), ai (t ))
ward for each observed row. This is the only required information on Ti (xi (t ), ai (t )) t
ai (t )
t=0
the reward.
(24)
Ndim
Ri (ai (t )) = RW (ai (t ), y, t ) ai (t ) A {1, …, Ndim}
y=1 (19)
2.6. Information gathering trade-off

2.5. Optimization framework


The naive approach for information gathering is to simply go to the
next available adjacent unexplored row. In order to improve perfor-
In the optimization problem of interest, the total reward for each
mance, an approach which targets information gathering to ensure the
action is optimized, time discounted by the expected operation time to
largest increase in the total explored space is utilized.
complete that action. This value metric has long been used in robot
he average reward, R̄ , is computed as the sum of rewards for all
foraging tasks (Mataric, 1997). To expedite computation, a factored
agents from the time every row was last visited, texp., to the current
approach (Amato et al., 2013) is used, where the reward is an additive
time, divided by the total number of rows weeded since every row was
function of individual agent rewards.
last visited, Nrows weeded .
The planned operation time is the sum of the time it takes to move
to the proposed row, Tto row , the time it takes to move down it, Tdown row , tcurrent Nagents
Ri (ai (t ))
t = texp. i=0
and the time it takes to weed all the cells in the row, Tweed row . R¯ =
Nrows weeded (25)
Ti (x i (t ), ai (t )) = Tto row + Tdown row + Tweed row (20)
The information index of a row, I (ai (t )) , is the number of rows
(ai (t ) x i (t ))
Tto row = which would be explored by going to that row.
v (21)
robs
Ylen. I (ai (t )) =
Tto row = {is explored(x = ai (t ) + i )}
v (22) i = robs (26)

The quantity V¯ ti (x i (t ), ai (t )) is computed as the value function with


Ndim
Tweed row = max 2min,Tkill (x i (t ), y (t )) R̄ times I (ai (t )).
y (t ) = 0 (23)
¯
Ti (xi (t ), ai (t )) RI (ai (t ))
The Gittin’s Index, G (Xi ) , is known to being an optimal metric for V¯ ti (x i (t ), ai (t )) = Ti (xi (t ), ai (t ))
planning on tasks with an uncertain termination time (Weber, 1992). t=0
t
(27)
Here, Xi represents the total state of the field, including the weeds in
each row. The exploration value for each unexplored row is denoted by
Eti (xi (t ), ai (t )) , which is equal to the estimated value function for that
Eai [ tR (X (t )) Xi (0) = x i ]
G (Xi ) = sup t=0 i i row, V¯ ti (x i (t ), ai (t )) .
Eai [ t Xi (0) = xi ]
ai t=0
Eti (xi (t ), ai (t )) = V¯ ti (x i (t ), ai (t )) (28)
It is assumed that the termination time, , is equal to the planned
operation time, Ti (x i (t ), ai (t )) , and that the reward is delayed until that The algorithm then explores rows with exploration value greater
time. The Gitten’s Index, G (Xi ) , then becomes the following. than or equal to the maximum value for explored rows.

Ti (xi (t ), ai (t )) R (a (t ))
i i arg maxEti (xi (t ), ai (t )) argmax Vti (x i (t ), ai (t )) a i (t )
G (Xi ) = sup Ti (xi (t ), ai (t )) t ai (t ) ai (t )
ai
t=0
= argmax Eti (x i (t ), ai (t ))
The optimization algorithm plans across all the agents, evaluating ai (t ) (29)
the value for a transition from the agent’s current state to its proposed
If no rows have been explored, agents then go to the next available
new state. It plans a coordinated policy which sends each agent to the
adjacent unexplored row.
row with maximum value. It then assigns agents asynchronously to the
row with the highest value when they query the planner after com- Algorithm 1. Algorithm for Computing the Policy: ”Coordinated
pleting a row. Information Gathering”

5
W. McAllister, et al. Computers and Electronics in Agriculture 163 (2019) 104827

Table 2
Initial Experimental Parameters: Here, robs is the observation radius, Nagent is the number of agents, d0 is the days of allowed weed growth before weeding, and S0 is the
initial seed bank density. An X denotes a parameter for a Monte Carlo run over the ranges shown in the last column. Here, 1080 is chosen as the median between 600
and 1560, the real-world range determined in (Mulugeta and Stoltenberg, 1997). The values of 15 for Nagent , and 1 for d0 , are chosen to ensure there are enough
agents to weed the field, and that the robots have enough time to do so before the weeds grow too tall.
Exp. 1 2 3 4 5 6 7 8 9 Range

robs 1 0 1 0 1 0 [0, ]
Nagent 15 15 15 X X X X X X [5, 20]
d0 (days) 1 1 1 1 1 1 X X X [1, 6]
S0 (seeds/cell) 1080 1080 1080 X X X 1080 1080 1080 [600, 1560]

2.7. Plan for computational experiments the ranges shown in the last column. Here, for S0, 1080 is chosen as the
median between 600 and 1560, the real-world range determined in
Nine experiments are conducted, each with 100 trials with varying (Mulugeta and Stoltenberg, 1997). For Experiments 1–3, we used the
initial parameters shown in Table 2, in a simulated field of 0.4 hectares, values 15 for Nagent , and 1 for d 0 . These values are chosen to ensure
gridded in 0.8 m2 cells. Each trial is run for 4 days of simulated time. there are enough agents to weed the field, and that the robots have
The algorithm is run in the case of full environmental information, enough time to do so before the weeds grow too tall. The mean and
robs = , where the planner has complete knowledge of the reward for standard deviation of the average reward for the environment (total
each cell within the environment, in order to establish an ideal reward over all the cells) is plotted at each time step. This provides a
benchmark. The algorithm is then run on the worst case scenario in clear comparison between the different cases (robs = , 1, 0 ), and
terms of the information the planner has, where the observation radius, showcases the affect of environmental information on weeding per-
robs = 0 . In this case, the planner effectively runs without the benefit of formance. In Experiments 4–6, Monte Carlo runs are performed over a
coordinated information gathering. Finally, the algorithm is run with an range of values for the parameters Nagent , and S0 , for nominal values of
observation radius, robs = 1, to see how performance improves in the the parameter d 0 = 1. A heat map of the terminal reward (total reward
case of partial environmental information when information about over all the cells at the end of the simulation) is plotted for each set of
neighboring rows is used. For these experiments, Monte Carlo runs are parameters. This reveals how the number of agents affects performance
conducted to determine the feasibility of the method with respect to the for fields with various seed bank densities. Finally, in Experiments 7–9,
change in the number of agents, Nagent , the days allowance, d 0 , and the Monte Carlo runs are performed over a range of values for the para-
initial seed bank density, S0 . meters Nagent , and d 0 , for nominal values of the parameter S0 = 1080 ,
Here, robs is the observation radius, Nagent is the number of agents, d 0 and the results are displayed in the same manner. This shows the effect
is the days of allowed weed growth before weeding, and S0 is the initial of the number of agents on the performance for fields with various days
seed bank density. An X denotes a parameter for a Monte Carlo run over allowance. In all experiments, the robots are assumed to plan utilizing a

Fig. 4. Weeding Performance vs. Environmental Information: The weeding performance over time is plotted for the case of full environmental information,
partial environmental information, robs = 1, and zero environmental information, robs = 0 . The case ofrobs = 1is able to weed the entire field, converging to a nominal value
of 300 for the terminal reward. The case ofrobs = 0 shows a significant drop in performance.

6
W. McAllister, et al. Computers and Electronics in Agriculture 163 (2019) 104827

Fig. 5. Number of Agents vs. Seed Bank Density,robs = : The contour map of the terminal reward for 100 trials with number of agents and seed bank density is
shown. The red end of the spectrum represents a terminal reward of zero, meaning the field has been weeded completely, and the blue end represents a high nonzero
terminal reward, a strong failure case. The blue circles represent the data points and the black circles represent the maximum seed bank density the system can weed
for each number of agents. The connecting line was added to help distinguish the black and blue circles. With more than 8 agents, the case ofrobs = is able to weed the
field for the entire range of seed bank densities, performing equally well for any seed bank density. (For interpretation of the references to color in this figure legend, the
reader is referred to the web version of this article.)

global environmental model, which contains the current location of all represent the maximum seed bank density the system can weed for each
the agents, as well as all the information available to the agents at a number of agents. The connecting line was added to help distinguish
given time. the black and blue circles.
In each experiment, as the seed bank density, S0 , increases, more
3. Results agents are needed to completely weed the field. As seen in Fig. 5, for the
case of robs = , observe that with more than 8 agents, seed bank
3.1. Experiments 1–3 densities of up to 1560 seeds per cell can be handled by the system. As
evidenced by Figs. 6 and 7 agents are required to weed fields with the
Comparison of Partial Environmental Information and Zero same maximum seed bank density, for the cases of robs = 1 and robs = 0
Information Cases With the Case of Full Environmental respectively. For the case of robs = 0 , Fig. 7 shows that the performance
Information for a given number of agents is affected more strongly by the seed bank
As seen in Fig. 4, partial environmental information, robs = 1, im- density. However, there is still a strong positive correlation between the
proves performance over the case of zero information, robs = 0 . The case initial seed bank density and the required number of agents for this
ofrobs = 1is able to weed the entire field, converging to a nominal value density. These results show that given enough agents, though information
of 300 for the terminal reward. In the case of zero information, robs = 0 , sharing improves performance, the system will succeed in weeding the field
when weeds become sparse, new weeds grow faster than the planner for a the entire range of seed bank densities and amounts of environmental
with robs = 0 can find and kill them. This is evidenced by the significant information tested.
drop in performance from the case of robs = 1. For these experiments,
nominal values for the seed bank density (1080 seeds per cell), days 3.3. Experiment 7–9
allowance (1 day), number of agents (15), and agent speed (1 m per
second), were used. Performance for Varying Number of Agents and Days
Allowance:S0 = 1080 seeds/cell withrobs = , 1, 0
3.2. Experiments 4–6 In Experiments 7–9, contour maps of the terminal reward for 100
trials with number of agents and days allowance are shown show the
Performance for Varying Number of Agents and Seed Bank cases of robs = , 1, 0 . The red end of the spectrum represents a term-
Density:d 0 = 1day, withrobs = , 1, 0 inal reward of zero, meaning the field has been weeded completely, and
In Experiments 4–6, contour maps of the terminal reward for 100 the blue end represents a high nonzero terminal reward, a strong failure
trials with number of agents and seed bank density are shown show the case. The blue circles represent the data points and the black circles
cases of robs = , 1, 0 . The red end of the spectrum represents a term- represent the maximum days allowance the system can weed for each
inal reward of zero, meaning the field has been weeded completely, and number of agents. The connecting line was added to help distinguish
the blue end represents a high nonzero terminal reward, a strong failure the black and blue circles.
case. The blue circles represent the data points and the black circles As days allowance, d 0 , increases, more agents are needed to

7
W. McAllister, et al. Computers and Electronics in Agriculture 163 (2019) 104827

Fig. 6. Number of Agents vs. Seed Bank Density,robs = 1: The contour map of the terminal reward for 100 trials with number of agents and seed bank density is
shown. The red end of the spectrum represents a terminal reward of zero, meaning the field has been weeded completely, and the blue end represents a high nonzero
terminal reward, a strong failure case. The blue circles represent the data points and the black circles represent the maximum seed bank density the system can weed
for each number of agents. The connecting line was added to help distinguish the black and blue circles. With more than 10 agents, the case ofrobs = 1is able to weed the
field for the entire range of seed bank densities, performing equally well for any seed bank density. (For interpretation of the references to color in this figure legend, the
reader is referred to the web version of this article.)

completely weed the field. As the shown in Figs. 8–10, for each case of before the weeding process starts. However, as more agents are added,
robs = , 1, 0 with more than thirteen agents, the system can handle a performance continues to improve even when the system fails. This
days allowance up to 2, when weeds start to grow higher than the suggests that with enough agents, all weeds that are not too tall to be
maximum height which the system is capable of weeding, and the eliminated at the start of the simulation may be weeded by the system.
system starts to fail for any number of agents. These results show that the As summarized in Tables 3 and 4, a description of the hardware and
system cannot handle more than 2 days of missed work under any amount of system characteristics found to be critical to the design of the co-
environmental information, given the fact that it can only kill weeds which ordinated weeding system is presented. First, for the weeding appa-
are smaller than7.62 cm. ratus, the parameters of two minutes, and 7.62 cm, were chosen for the
maximum time to weed each cell, and the maximum allowed height for
3.4. Summary of results weeds killed, respectively. These conservative estimates were made
based on the prototype for the weeding apparatus in development.
In Experiments 1–3, it is found that the use of information sharing Decreasing the maximum weeding time, and increasing the maximum
between the agents significantly improves performance. In Experiments weed height would improve system performance. Also, in previous
4, it is found that in the fully observed case, robs = , the system can work (McAllister et al., 2018), the speed of the robot was found to not
weed fields with any of the seed bank densities tested without a drop in affect system performance, as it becomes unimportant once the field is
performance as long as more than 8 agents are used. This fully observed fully infested with weeds. It was determined in this work that the
case corresponds to the real world scenario when there are enough system cannot handle more than two days of weed growth before the
external sensors in the field, confirming that coordination provides a weeding process starts. This is due to the hardware constraints above,
feasible solution to the weeding problem when agents have enough as well as the model developed for weed growth using agricultural
information available. In Experiment 5, it is found that for the case of studies. This constraint would change if the maximum allowed height
robs = 1, with more than 10 agents, the system exhibits comparable for weeds was increased. Finally, the contribution of this work was to
performance to the fully observed scenario, while it takes 12 agents to show that the system using coordination can successfully weed fields
achieve this for the case of robs = 0 . This suggests that even without full with any average seed bank density, as long as weeds are not initially to
environmental observation, information sharing strategies can be uti- high for the system to kill. This is true for the range of seed bank
lized to allow the system to adapt to a rage of field environments, as densities considered, which is realistic across a range of weed species
long as additional agents are used. However, for robs = 0 , the perfor- studied by crop scientists.
mance is more strongly affected by the seed bank density, verifying the
intuition that information sharing between agents significantly im- 4. Discussion: elasticity analysis and design criteria
proves system performance.
In Experiments 7–9, it is seen that with more than 13 agents, fields For each of the Monte Carlo runs, we show the graph of the elas-
with a days allowance of up to 2 days may be weeded in every case. ticity of the maximum seed bank density and days allowance to the
Beyond this point, weeds grow too large for the system to eliminate number of agents, determined by the black data points. This measure

8
W. McAllister, et al. Computers and Electronics in Agriculture 163 (2019) 104827

Fig. 7. Number of Agents vs. Seed Bank Density,robs = 0 : The contour map of the terminal reward for 100 trials with varying numbers of agents and seed bank
density is shown. The red end of the spectrum represents a terminal reward of zero, meaning the field has been weeded completely, and the blue end represents a high
nonzero terminal reward, a strong failure case. The blue circles represent the data points and the black circles represent the maximum seed bank density the system
can weed for each number of agents. The connecting line was added to help distinguish the black and blue circles. With more than 12 agents, the case ofrobs = 0 is able to
weed the field for the entire range of seed bank densities. Though the system performance is more strongly affected by the seed bank density than in Experiment 4, the system still
performs well with enough agents. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

was chosen because the fact that it is normalized to be at most positive This means that agents are assigned tasks at different times, agents have
one gives an easy way to compare the relative affect of parameters identical capabilities, and agents are assigned task from a centralized
across different experiments. This was calculated according to Eq. (30). planner. This work also presents the problem domain of foraging, where
We use Yi , Yi 1 as the parameters of interest (either the seed bank robots move through an environment and collect objects or informa-
density or days allowance), and Xi , Xi 1 are the number of agents, for tion. In this case, the foraging problem is framed in terms of recognizing
data points i , i 1 respectively. and killing weeds while moving through the field.
Past work has explored multi-robot task allocation (MRTA) in sto-
(Yi Yi 1)/ Yi 1 chastic domains (Niko-Mora, 2001; Frazzoli and Bullo, 2004; Munoz-
SYi Xi =
(Xi Xi 1)/ Xi 1 (30) Melendez et al., 2012), leveraging both spatial constraints and pre-
dictive information to perform optimization. In (Gerkey and Mataric,
The elasticity analysis of Experiments 4–9 is presented below. As
2004), a formal taxonomy of MRTA is presented. Three major distinc-
shown in Fig. 11, for Experiments 4–6, the maximum seed bank density
tions are made. The first is between single-robot problems, where each
the system can weeds for the case of robs = , and robs = 1 is much more
pool of tasks is managed by a separate robot, and multi-robot problems,
elastic to the number of agents than the case of robs = 0 . This is as ex-
in which each task pool is shared between multiple robotic agents. This
pected, since information sharing between agents causes performance
problem is a multi-robot problem, as all agents cooperate to weed the
to increase more drastically when new agents are added. Also, as seen
field together in order to complete the weeding task more efficiently.
in Fig. 12, for Experiments 7–9, in every case, robs = 0, 1, , after
This approach allows agents to adapt to changes in the environment,
2 days, weeds grow higher than the system can successfully eliminate,
working together on regions with more weeds.
and therefore this parameter is only elastic to the number of agents
The next distinction is between preemptive task allocation, in which
until this point.
optimization is performed continuously in an on-line manner, and
agents may take over another agent’s task, or switch to another task
4.1. Rationale for coordinated robotic planning algorithm design before completion, and non-preemptive task allocation, in which tasks
must be completed before a new task is assigned. A non-preemptive
A formal taxonomy of coordinated robotics and multi-agent task planning strategy is used, ensuring rows are completed before an agent
allocation is presented. The algorithm used in this work is grounded in is assigned a new row. This allows agents to plan once the task has been
this body of literature by explaining where the method fits within this completed, allowing them to focus on navigation and plant recognition
taxonomy. In (Cao et al., 1997), a formal taxonomy of cooperative while in the row.
robotic planning is presented. According to this taxonomy, the problem Another distinction is between single-agent tasks, where each task is
considered here is an asynchronous, homogenous, centralized problem. performed by one agent, and multi-agent tasks, in which each task must

9
W. McAllister, et al. Computers and Electronics in Agriculture 163 (2019) 104827

Fig. 8. Number of Agents vs. Days Allowance,robs = : The contour map of the terminal reward for 100 trials with varying numbers of agents and days allowance
is shown. The red end of the spectrum represents a terminal reward of zero, meaning the field has been weeded completely, and the blue end represents a high
nonzero terminal reward, a strong failure case. The blue circles represent the data points and the black circles represent the maximum days allowance the system can
weed for each number of agents. The connecting line was added to help distinguish the black and blue circles. With more than thirteen agents, the case ofrobs = is able
to weed any field, if the days allowance is less than 2, when weeds initially grow larger than the system can eliminate. (For interpretation of the references to color in this
figure legend, the reader is referred to the web version of this article.)

be performed by multiple agents. Here, each robot is assigned to one model. In this environment, the field is discretized into a grid world of
row, so this problem is a single-agent task scenario, with multiple 85 rows, 0.8 m wide, totaling 4047 m2, or 0.4 hectares. The simulation
agents collaborating to complete a pool of single-agent tasks. environment allows efficient determination of design heuristics which
In (Gerkey and Mataric, 2004), the problem of time-extended on- will inform implementations of coordinated weeding systems used in
line assignment, in which multiple robots pick single-agent tasks from a real field experiments, and will enable other researchers to test their
pool larger than the number of agents, and complete them in a non- own algorithms in the same framework.
preemptive manner, is considered. This algorithm is an implementation
of that proposed in (Gerkey and Mataric, 2004) for the time-extended
on-line assignment problem, which initially assigns each robot to the 5. Conclusions
most suitable task, and then assigns robots to the most suitable task
from the pool as they become available. This research demonstrates that a scale-neutral approach to co-
ordinated multi-agent weeding in uncertain environments, utilizing a
varying number of robots for different agricultural applications, can
4.2. Contributions adapt to fields with varying seed bank densities. This approach out-
performs the case in which shared information is not used between
This work presents the first comprehensive case study demon- agents, and performs equally well on fields of any seed bank density
strating that collaborative robotic weeding is feasible in uncertain en- tested, once more than 10 agents are used. Since the performance does
vironments with a high weed density. We develop a coordinated not drop when the seed bank density increases, it appears that once
weeding method, and show that it succeeds in weeding simulated fields enough agents are used to successfully weed the field, the algorithm
with a range of initial seed bank densities, and a varying number of will exhibit a similar trend even for fields with greatly increased seed
allowed days of weed growth before weeding commences. The planner bank densities. These results show clear improvement in performance
is able to succeed in every case, as long as enough agents are utilized, for an increased number of agents, demonstrating the usefulness of
and the weeds have not grown too large for the robot to kill before the coordinated weeding strategies for weeding fields which agents would
weeding process starts. not be able to complete on their own. In simulated trials with increasing
To efficiently test algorithms for coordinated weeding, and their seed bank density, it is shown that a larger number of agents are not
performance change with respect to various parameters over time, the only able to fully weed the field when smaller teams cannot do so, but
experiments are performed in a custom simulation environment, Weed that they can drive down the weed population even after the field has
World (shown in Fig. 3.), which enables real-time visualization of co- become fully infested, with some weeds larger than the system is able to
ordinated weeding policies, and incorporates a realistic weed growth kill. These results clearly show that multi-robot coordination is not just

10
W. McAllister, et al. Computers and Electronics in Agriculture 163 (2019) 104827

Fig. 9. Numbers of Agents vs. Days Allowance,robs = 1: The contour map of the terminal reward for 100 trials with varying number of agents and days allowance is shown.
The red end of the spectrum represents a terminal reward of zero, meaning the field has been weeded completely, and the blue end represents a high nonzero terminal reward, a
strong failure case. The blue circles represent the data points and the black circles represent the maximum days allowance the system can weed for each number of agents. The
connecting line was added to help distinguish the black and blue circles. With more than thirteen agents, the case ofrobs = 1is able to weed any field, if the days allowance is less than 2,
when weeds initially grow larger than the system can eliminate. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this
article.)

Fig. 10. Numbers of Agents vs. Days Allowance,robs = 0 : The contour map of the terminal reward for 100 trials with varying number of agents and days allowance is shown.
The red end of the spectrum represents a terminal reward of zero, meaning the field has been weeded completely, and the blue end represents a high nonzero terminal reward, a
strong failure case. The blue circles represent the data points and the black circles represent the maximum days allowance the system can weed for each number of agents. The
connecting line was added to help distinguish the black and blue circles. With more than thirteen agents, the case ofrobs = 0 is able to weed any field, if the days allowance is less than 2,
when weeds initially grow larger than the system can eliminate. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this
article.)
11
W. McAllister, et al. Computers and Electronics in Agriculture 163 (2019) 104827

Table 3
Hardware characteristics.
Property Rational

Maximum Time to Weed One Cell (TKILL ) Based on current prototypes for the robotic weeding apparatus, a conservative estimate of two minutes for this parameter was made.
Decreasing it would certainly improve performance
Maximum Allowed Height for Weeds Based on current prototypes for the robotic weeding apparatus, a conservative estimate of 7.62 cm for this parameter was made.
Killed Increasing it would certainly improve performance
Maximum Robot Speed As shown in previous work (McAllister et al., 2018), this becomes unimportant once the field is fully infested with weeds

Table 4
System characteristics.
Property Rational

Maximum Number of Days the System Can Go Without Given hardware constraints, as well as the model developed for weed growth using agricultural studies, the hard limit is
Weeding Field two days, when weeds initially become too tall for the system to handle. If the maximum allowed height for weeds
killed was increased, this would change
Maximum Average Seed Bank Density the System Can The contribution of this work was to show that the system using coordination can successfully weed fields with any
Weed average seed bank density, as long as weeds are not initially to high for the system to kill. This is true for the range of
seed bank densities considered, which is realistic across a range of weed species studied by crop scientists

Fig. 11. Number of Agents vs. Seed Bank Density: The plot of the elasticity of the maximum seed bank density the system can weed with respect to the number of
agents. In the case ofrobs = androbs = 1, the maximum seed bank density is elastic to the number of agents, since information sharing strongly improves performance. In the
case ofrobs = 0 , the maximum seed bank density is far less elastic to the number of agents, since information sharing between agents is not used, and thus does not cause as large a
change in system performance.

12
W. McAllister, et al. Computers and Electronics in Agriculture 163 (2019) 104827

Fig. 12. Numbers of Agents vs. Days Allowance: The plot of the elasticity of the maximum days allowance the system can weed with respect to the number of
agents utilized. In all cases,robs = , 1, 0 , the maximum days allowance is inelastic to the number of agents. After enough agents are used, the system can weed fields with days
allowance up to 2 days.

useful for coordinated weeding, but is in fact essential, and will be a Seed burial physical environment explains departures from regional hydrothermal
central part of mechanical weeding solutions to the weeding crisis. model of giant ragweed (Ambrosia trifida) seedling emergence in us midwest. Weed
Sci. 61 (3), 415–421.
ecorobotix, Technology for the environment. https://www.ecorobotix.com/en/.
Acknowledgements Evans, C.M., 2016. Characterization of a Novel Five-way-resistant Population of
Waterhemp (Amaranthus tuberculatus) (Ph.D. thesis). University of Illinois at Urbana-
Champaign.
Support provided for this research by the joint USDA National Frazzoli, E., Bullo, F., 2004. Decentralized algorithms for vehicle routing in a stochastic
Institute of Food and Agriculture and the National Science Foundation time-varying environment. In: Decision and Control, 2004. CDC. 43rd IEEE
Cyber Physical Systems program (USDA NIFA 2018-67007-28379). Conference on, vol. 4. IEEE, pp. 3357–3363.
Gerkey, B.P., Mataric, M.J., 2004. A formal analysis and taxonomy of task allocation in
Furthermore, thanks is extended to all the collaborators on this project
multi-robot systems. Int. J. Robot. Res. 23 (9), 939–954.
whose past work in weed recognition, design of the mechanical Gressel, J., Gassmann, A.J., Owen, M.D., 2017. How well will stacked transgenic pest/
weeding system, and field navigation, has inspired this work. This work herbicide resistances delay pests from evolving resistance? Pest Manag. Sci. 73 (1),
22–34.
extends that submitted to International Conference on Intelligent
Heap, I., 2009. The international survey of herbicide resistant weeds. http://www.
Robots (IROS) 2018 (McAllister et al., 2018). weedscience.org.
Horak, M.J., Loughin, T.M., 2000. Growth analysis of four amaranthus species. Weed Sci.
Appendix A. Supplementary material 48 (3), 347–355.
Liu, M., Sivakumar, K., Omidshafiei, S., Amato, C., How, J.P., 2017. Learning for multi-
robot cooperation in partially observablestochastic environments with macro-actions.
Supplementary data associated with this article can be found, in the Computing Research Repository.
online version, at https://doi.org/10.1016/j.compag.2019.05.036. Livingston, M., Fernandez-Cornejo, J., Frisvold, G.B., 2016. Economic returns to herbicide
resistance management in the short and long run: the role of neighbor effects. Weed
Sci. 64 (sp1), 595–608.
References Mataric, M.J., 1997. Reinforcement learning in the multi-robot domain. Auton. Robot. 4
(1), 73–83.
Mataric, M.J., 2001. Learning in behavior-based multi-robot systems: policies, models,
Amato, C., Chowdhary, G., Geramifard, A., Ure, N.K., Kochenderfer, M.J., 2013.
and other agents. Cogn. Syst. Res. 2 (1), 81–93.
Decentralized control of partially observable markov decision processes. In: The 52nd
Maxwell, B.D., Roush, M.L., Radosevich, S.R., 1990. Predicting the evolution and dy-
IEEE Conference on Decision and Control (CDC).
namics of herbicide resistance in weed populations. Weed Technol. 4 (1), 2–13.
Bagavathiannan, M.V., Norsworthy, J.K., 2016. Multiple-herbicide resistance is wide-
McAllister, W., Osipychev, D., Chowdhary, G., Davis, A., 2018. Multi-agent planning for
spread in roadside palmer amaranth populations. PloS One 11 (4), e0148748.
coordinated robotic weed killing. In: Intelligent Robots and Systems (IROS), 2018
Balch, T., 1999. The impact of diversity on performance in multi-robot foraging. In:
IEEE/RSJ International Conference on. IEEE.
Proceedings of the Third Annual Conference on Autonomous Agents. ACM, pp.
Mohler, C.L., Frisch, J.C., Pleasant, J.M., 1997. Evaluation of mechanical weed man-
92–99.
agement programs for corn (zea mays). Weed Technol. 11 (1), 123–131.
Bellman, R., 1957. A markovian decision process. J. Math. Mech. 679–684.
Mulugeta, D., Boerboom, C.M., 1999. Seasonal abundance and spatial pattern of Setaria
Booth, B.D., Murphy, S.D., Swanton, C.J., 2003. Weed Ecology in Natural and
faberi, Chenopodium album, and Abutilon theophrasti in reduced-tillage soybeans. Weed
Agricultural Systems. CABI Pub.
Sci. 95–106.
Cao, Y.U., Fukunaga, A.S., Kahng, A., 1997. Cooperative mobile robotics: antecedents and
Mulugeta, D., Stoltenberg, D.E., 1997. Seed bank characterization and emergence of a
directions. Auton. Robot. 4 (1), 7–27.
weed community in a moldboard plow system. Weed Sci. 54–60.
Chopard, B., Droz, M., 1998. Cellular Automata. Springer.
Munoz-Melendez, A., Dasgupta, P., Lenagh, W., 2012. A stochastic queueing model for
Davis, A.S., Clay, S., Cardina, J., Dille, A., Forcella, F., Lindquist, J., Sprague, C., 2013.
multi-robot task allocation. In: International Conference on Informatics in Control,

13
W. McAllister, et al. Computers and Electronics in Agriculture 163 (2019) 104827

Automation and Robotics, pp. 256–261. Technol. 28 (2), 408–417.


N. Technologies, Autonomous weeding for agricultural robots - naio technologies. Sellers, B.A., Smeda, R.J., Johnson, W.G., Kendig, J.A., Ellersieck, M.R., 2003.
https://www.naio-technologies.com/en/. Comparative growth of six amaranthus species in Missouri. Weed Sci. 51 (3),
Niko-Mora, J., 2001. Stochastic scheduling. Updated version of article in Encyclopedia of 329–333.
Optimization. In: Floudas, C.A., Pardalos, P.M. (Eds.), Kluwer, 2001. (2005). URL Shaner, D.L., 2014. Lessons learned from the history of herbicide resistance. Weed Sci. 62
http://halweb.uc3m.es/jnino/eng/pubs/ssche.pdf. (2), 427–431.
Nordby, D., Hartzler, R., Bradley, K., 2007. Biology and Management of Waterhemp, Shoham, Y., Powers, R., Grenager, T., 2003. Multi-agent reinforcement learning: a critical
Glyphosate, Weeds, and Crop Sciences, Purdue University Extension, Publication survey, Web Manuscript. http://jmvidal.cse.sc.edu/library/shoham03a.pdf.
GWC-13.12. TerraSentia, Terrasentia robot - earthsense, inc. https://www.earthsense.co/home/.
Page, E.R., Cerrudo, D., Westra, P., Loux, M., Smith, K., Foresman, C., Wright, H., Weber, R., 1992. On the Gittins index for multiarmed bandits. Ann. Appl. Probab.
Swanton, C.J., 2012. Why early season weed control is important in maize. Weed Sci. 1024–1033.
60 (3), 423–430. Werle, R., Sandell, L.D., Buhler, D.D., Hartzler, R.G., Lindquist, J.L., 2014. Predicting
Schutte, B.J., Davis, A.S., 2014. Do common waterhemp (Amaranthus rudis) seedling emergence of 23 summer annual weed species. Weed Science.
emergence patterns meet criteria for herbicide resistance simulation modeling? Weed

14

You might also like