Precision Fruit Tree Pruning Using a Learned Hybrid Vision/Interaction Controller
Alexander You, Hannah Kolano, Nidhi Parayil, Cindy Grimm, Joseph R. Davidson
[Figure: System flowchart. Start (cutters ~15 cm from target) → vision control (GAN segmentation → tool velocity) loops until contact is detected → admittance control (wrench measurement → tool velocity) loops until the wrench target is met → end (execute cut).]
which each object is assigned one of 5,000 random synthetic textures and the lighting's source, color, and intensity have been randomized. We generate 10,000 randomized-segmented image pairings and use the pix2pix architecture [20] to learn how to translate the randomized images into their segmented form. Examples of the trained GAN working on different images are shown in Figure 5.

C. Reinforcement learning

We now formally define a pruning episode, as well as the state space S, the action space A, the reward function R, and the transition function T which define a Markov Decision Process (MDP). Each episode is given a maximum time length of T s with a time interval of δt s, resulting in a maximum number of timesteps Nt = T/δt. At the start, we select a target cut point pcut on a random spindle and position the cutter 15-20 cm in front of pcut.

At each timestep we obtain a 424 × 240 segmented image from the GAN. Since the cutter and branch lie in the bottom-right of the image, we crop the bottom-right 360 × 180 region of the image and rescale it to 160 × 80. Our state space is therefore the set of 160 × 80 3-channel images (S = Im_{160×80×3}).

The segmented image is fed through a deep neural network which produces two numeric values, ax, ay ∈ [−1, 1], representing the velocity outputs (A = [−1, 1] × [−1, 1]). By default, the cutter moves along the tool frame's z-axis at a speed of s m/s. The chosen action values modify the x and y components of this velocity vector, such that the final velocity vector is a⃗ = ⟨ax·s, ay·s, s⟩. The transition function T maps to the next state by setting the tool velocity to a⃗ and running the simulator forward for δt s.

The reward function R is engineered to reward the cutter for approaching the target cut point, giving more reward as the cutter approaches the target, while heavily penalizing attempts in which the branch ends up missing the cutters. The total reward per episode ranges between −T and T. Each timestep's reward depends only on the state s_{n+1} reached after applying action a_n at state s_n. At each new state s_{n+1}, we check for one of three terminal conditions:

• (Success) The branch has entered the success zone.
• (Failure) The branch has entered the failure zone.
• (Failure) The episode has reached Nt time steps.

Upon success, the agent receives a reward corresponding to the remaining time, T − δt·n, while upon failure, it receives a large penalty of −T. Otherwise, the reward is a value between 0 and δt depending on how close the cutter is to the target, given the distance d_{t+1} to the target and a minimum distance threshold dmin = 0.10 m:

R(s_{t+1}) = δt · max(1 − d_{t+1}/dmin, 0)    (1)

For training in simulation, we set T = 1 s and δt = 0.10 s (Nt = 10). To encourage fast episode completion, we set the forward velocity to s = 0.30 m/s. For the real robot arm, we limited the forward velocity to s = 0.03 m/s.

We use the Proximal Policy Optimization (PPO) algorithm [21] from the stable-baselines3 package [22] to learn the desired control actions, with the default hyperparameters and deep neural network architecture. This consists of a feature extractor with three convolutional layers and one feedforward layer with ReLU activation. The action and value networks share the output of the feature extractor and process the features through their own feedforward layers to obtain the velocity command and value estimates.

V. ADMITTANCE CONTROLLER

Once a force exceeding 0.75 N is detected, we assume that the target branch has contacted a cutting surface and transition to an interaction controller. Regulating the dynamic interaction with the branch is critical for two reasons. First, we want to minimize contact forces to prevent damage to the tool/robot. Second, to deliver maximum cutting force, the branch should be located as close to the blade pivot point as possible (see Figure 6).

The contact task is characterized by the interaction between a 'stiff' industrial manipulator and an elastically compliant branch. We choose to use admittance control: the robot senses external forces using its internal wrist force-torque sensor and controls its motion in response. An admittance controller imitates the dynamics of a spring-mass-damper system in the robot's task space, as described by Equation 2:

M ẍc + B ẋ + K x = Fext    (2)

Here, M, B, and K are the mass, damping, and stiffness matrices to be imitated, (x, ẋ) are the end-effector position and velocity in task space, Fext is the wrench exerted on the environment, and ẍc is the output acceleration as determined by the controller. Since there is no desired "position" of the end-effector, we can eliminate the stiffness term and rearrange Equation 2 as shown in Equation 3:

ẍc = M⁻¹ (Fext − B ẋ)    (3)
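As a concrete illustration (a minimal sketch, not the authors' implementation), Equation 3 maps to a per-cycle velocity update in which the measured wrench is converted into a commanded twist. The diagonal virtual mass and damping matrices below are placeholder values, and details such as the 0.75 N contact-detection threshold and safety limits are omitted:

```python
import numpy as np

# Placeholder virtual inertia M and damping B (diagonal; not the paper's gains).
M_INV = np.linalg.inv(np.diag([2.0, 2.0, 2.0, 0.2, 0.2, 0.2]))
B = np.diag([20.0, 20.0, 20.0, 2.0, 2.0, 2.0])

def admittance_update(x_dot, f_ext, dt):
    """One control cycle of Equation 3: x_ddot_c = M^-1 (F_ext - B x_dot).

    x_dot: current 6D end-effector twist [vx, vy, vz, wx, wy, wz]
    f_ext: measured wrench from the wrist force-torque sensor [Fx, Fy, Fz, Tx, Ty, Tz]
    dt:    control period in seconds
    Returns the next twist command for the arm.
    """
    x_ddot_c = M_INV @ (f_ext - B @ x_dot)  # commanded acceleration (Equation 3)
    return x_dot + dt * x_ddot_c            # forward-Euler integration to a twist
```

With no stiffness term, the virtual system yields compliantly along the measured wrench, which is what allows the branch to settle toward the blade pivot without building up large contact forces.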
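Likewise, the MDP of Section IV-C and the PPO training call can be summarized in a short sketch. The environment skeleton below is hypothetical: the simulator hooks are stand-ins, and only the constants, observation/action spaces, velocity mapping, and reward structure follow the text:

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

# Constants from the paper: T = 1 s, dt = 0.10 s (N_t = 10),
# d_min = 0.10 m, simulated forward speed s = 0.30 m/s.
T, DT, D_MIN, SPEED = 1.0, 0.10, 0.10, 0.30

class PruningEnv(gym.Env):
    """Hypothetical skeleton of the pruning MDP; simulator hooks are stand-ins."""

    def __init__(self):
        # State: 160x80 3-channel segmented image; action: (a_x, a_y) in [-1, 1].
        self.observation_space = spaces.Box(0, 255, (80, 160, 3), dtype=np.uint8)
        self.action_space = spaces.Box(-1.0, 1.0, (2,), dtype=np.float32)
        self.n = 0  # timestep counter

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.n = 0
        return self._observe(), {}

    def step(self, action):
        ax, ay = float(action[0]), float(action[1])
        velocity = np.array([ax * SPEED, ay * SPEED, SPEED])  # tool-frame command
        self._advance_simulator(velocity, DT)                 # run forward for dt
        self.n += 1
        d = self._distance_to_target()
        if self._in_success_zone():
            reward, done = T - DT * self.n, True              # remaining time
        elif self._in_failure_zone() or self.n >= int(T / DT):
            reward, done = -T, True                           # large penalty
        else:
            reward, done = DT * max(1.0 - d / D_MIN, 0.0), False  # Equation 1
        return self._observe(), reward, done, False, {}

    # Stand-in hooks; a real implementation would query the simulator.
    def _advance_simulator(self, velocity, dt): pass
    def _distance_to_target(self): return 0.15
    def _in_success_zone(self): return False
    def _in_failure_zone(self): return False
    def _observe(self):
        # Stand-in for the cropped, rescaled GAN-segmented image.
        return np.zeros((80, 160, 3), dtype=np.uint8)

model = PPO("CnnPolicy", PruningEnv())  # default hyperparameters, per the paper
model.learn(total_timesteps=100_000)
```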
Controller   Acc.   Pivot Len (cm)   Rem Len (cm)   Max Force (N)
HC*          77%    1.9 ± 2.1        3.5 ± 1.8      4.8 ± 1.7
CL           46%    3.0 ± 2.5        3.0 ± 3.5      3.6 ± 1.1
OL           19%    5.2 ± 1.1        3.0 ± 3.3      4.6 ± 6.6
OL-          23%    5.7 ± 1.4        3.1 ± 3.6      4.1 ± 8.0

TABLE I: Averaged summary statistics of our results for each controller (defined in Section VI-B). Standard deviation values are given to the right of the ± symbol.

• Pivot offset length: the distance of the branch to the cutter pivot, measured by hand.
• Branch remnant length: the length of the branch that would have remained after a cut along the cutter's yz plane (from the plane of the cut surface to the intersection with the vertical leader), measured by hand.
• Max force: the maximum force magnitude from the UR5e's force-torque sensor during the execution.

In total, for each of the four controllers, we ran 26 trials, corresponding to 4 trials each for 6 of the targets and 2 trials for one target which broke off during the experiment.

VII. RESULTS

A summary of the average results is shown in Table I, along with a detailed plot of the values in Figure 8. In summary, our hybrid controller performed significantly better than the closed-loop controller in terms of cutting accuracy and precision, and both closed-loop controllers performed better than both of the open-loop controllers (which generally failed to complete the task). Our hybrid controller had a 31 percentage point increase in accuracy over the closed-loop controller. Furthermore, it avoided bumping into the upright leaders of the tree on every trial and achieved relatively consistent branch remnant lengths, with a standard deviation in remnant length of just 1.8 cm. By comparison, the closed-loop controller had nearly double the standard deviation at 3.5 cm and sometimes drove straight into a leader branch. Our controller also more consistently ended up with low pivot offset lengths, though this was largely a result of situations in which the cutter was placed in front of a vertical leader: our controller would move the cutter away from the leader, while the closed-loop controller would move straight into it.

Generally speaking, the depth estimates provided by the camera fell short of the actual targets, which meant that in most of the open-loop runs, the cutter simply stopped in front of the target without ever touching it. However, on the occasions when the open-loop controllers did hit the branch, the lack of force feedback meant they pushed into the branch with very high force, sometimes upwards of 20 N as a result of hitting a vertical leader branch. Meanwhile, both our hybrid controller and the simple closed-loop controller kept all forces below 10 N, demonstrating the value of the interaction controller in preventing damage to the environment. Although our controller generally incurred higher forces than the simple closed-loop controller, likely due to the acceleration of the tool as it responded to the neural network, at no point did it hit the environment with enough force to cause any notable damage.

One note is that there appeared to be some error in our "ground truth" kinematics when transforming point cloud points into the world frame. This was evidenced by two observations: the similarity of the two open-loop controllers' overall performance, and the fact that, when running the experiments, left-side branch target estimates tended to be far away from the leader while right-side branch target estimates tended to be very close to the leader. While locating and fixing the calibration issues might have improved the overall performance of the simple closed-loop controller, it would also reinforce the point that our system is robust to imperfect kinematic calibration, as well as to other perturbations, such as cart or tree movement, that a point estimate-based controller simply has no way to handle.

VIII. CONCLUSION

In this paper, we introduced a framework for highly accurate robotic pruning that uses 2D RGB images and force data to consistently execute accurate cuts with minimal force. Through our experiments, we demonstrated that our system is robust to a number of issues that have traditionally been associated with depth camera data, primarily kinematic miscalibrations and incorrect depth estimates. While we conducted our experiments in an indoor laboratory setting, our framework represents a step towards moving robotic precision pruning to the outdoor orchard environment. Future work will include adding rotation as a third dimension to the velocity command (to ensure perpendicular alignment), as well as determining how to adjust the segmentation network to work in more complex outdoor settings with background noise from adjacent rows.
REFERENCES
[1] Penn State University College of Agricultural Sciences and Penn State University Cooperative Extension, Penn State Tree Fruit Production Guide. Penn State, College of Agricultural Sciences, 2015.
[2] K. Gallardo, M. Taylor, and H. Hinman, “Cost estimates of estab-
lishing and producing Gala apples in Washington,” Washington State
University, 2009.
[3] L. Calvin and P. Martin, “The US produce industry and labor: Facing
the future in a global economy,” Tech. Rep., 2010.
[4] R. Verbiest, K. Ruysen, T. Vanwalleghem, E. Demeester, and K. Kel-
lens, “Automation and robotics in the cultivation of pome fruit: Where
do we stand today?” Journal of Field Robotics, vol. 38, no. 4, pp. 513–
531, 2021.
[5] A. You, F. Sukkar, R. Fitch, M. Karkee, and J. R. Davidson, “An
efficient planning and control framework for pruning fruit trees,” in
2020 IEEE International Conference on Robotics and Automation
(ICRA). IEEE, 2020, pp. 3930–3936.
[6] A. Zahid, S. Mahmud, L. He, P. Heinemann, D. Choi, and J. Schupp, "Technological advancements towards developing a robotic pruner for apple trees: A review," Computers and Electronics in Agriculture, vol. 189, 2021.
[7] V. R. Corporation. (2017) Vision Robotics Corporation. [Online].
Available: https://www.visionrobotics.com/
[8] T. Botterill, S. Paulin, R. Green, S. Williams, J. Lin, V. Saxton,
S. Mills, X. Chen, and S. Corbett-Davies, “A robot system for pruning
grape vines,” Journal of Field Robotics, vol. 34, no. 6, pp. 1100–1122,
2017.
[9] A. Zahid, L. He, D. D. Choi, J. Schupp, and P. Heinemann, “Collision
free path planning of a robotic manipulator for pruning apple trees,”
in ASABE 2020 Annual International Meeting. American Society of
Agricultural and Biological Engineers, 2020, p. 1.
[10] M. Prats, P. J. Sanz, and A. P. Del Pobil, “Vision-tactile-force
integration and robot physical interaction,” in 2009 IEEE international
conference on robotics and automation. IEEE, 2009, pp. 3975–3980.
[11] A. J. Schmid, N. Gorges, D. Göger, and H. Wörn, “Opening a door
with a humanoid robot using multi-sensory tactile feedback,” Proceed-
ings - IEEE International Conference on Robotics and Automation, pp.
285–291, 2008.
[12] Y. Mezouar, M. Prats, and P. Martinet, “External hybrid vision/force
control,” in Intl. Conference on Advanced Robotics (ICAR07), 2007,
pp. 170–175.
[13] A. C. Leite, F. Lizarralde, and L. Hsu, “Hybrid vision-force robot
control for tasks on unknown smooth surfaces,” in Proceedings 2006
IEEE International Conference on Robotics and Automation, 2006.
ICRA 2006. IEEE, 2006, pp. 2244–2249.
[14] J. Baeten and J. De Schutter, “Hybrid vision/force control at corners
in planar robotic-contour following,” IEEE/ASME Transactions on
Mechatronics, vol. 7, no. 2, pp. 143–151, 2002.
[15] M. Q. Mohammed, K. L. Chung, and C. S. Chyi, "Review of deep reinforcement learning-based object grasping: Techniques, open challenges, and recommendations," IEEE Access, vol. 8, pp. 178450–178481, 2020.
[16] A. Zahid, M. S. Mahmud, and L. He, “Evaluation of Branch Cutting
Torque Requirements Intended for Robotic Apple Tree Pruning,” in
2021 ASABE Annual International Virtual Meeting. American Society
of Agricultural and Biological Engineers, 2021, p. 1.
[17] S. James, P. Wohlhart, M. Kalakrishnan, D. Kalashnikov, A. Irpan, J. Ibarz, S. Levine, R. Hadsell, and K. Bousmalis, "Sim-to-real via sim-to-sim: Data-efficient robotic grasping via randomized-to-canonical adaptation networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 12619–12629.
[18] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley,
S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,”
Advances in neural information processing systems, vol. 27, 2014.
[19] E. Coumans and Y. Bai, "PyBullet, a Python module for physics simulation for games, robotics and machine learning," 2016.
[20] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-image translation with conditional adversarial networks," in CVPR, 2017.
[21] J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, "Proximal policy optimization algorithms," arXiv preprint arXiv:1707.06347, 2017.
[22] A. Raffin, A. Hill, M. Ernestus, A. Gleave, A. Kanervisto, and N. Dormann, "Stable Baselines3," https://github.com/DLR-RM/stable-baselines3, 2019.