https://doi.org/10.1007/s10055-019-00415-8
ORIGINAL ARTICLE
Received: 27 July 2019 / Accepted: 27 November 2019 / Published online: 3 December 2019
© Springer-Verlag London Ltd., part of Springer Nature 2019
Abstract
This study developed an augmented reality (AR)-based training system for conventional manual milling operations. An Intel
RealSense R200 depth camera and a Leap Motion controller were mounted on an HTC Vive head-mounted display to allow
users to walk around freely in a room-size AR environment and operate a full-size virtual milling machine with their bare
hands, using their natural operation behaviors, as if they were operating a real milling machine in the real world, without
additional worn or handheld devices. GPU parallel computing was used to handle dynamic occlusions and accelerate the
machining simulation to achieve real-time performance. Using the developed AR-based training system, novices can receive
hands-on training in a safe environment, without any injury or damage. User test results showed that the AR-based training
resulted in lower failure rates and fewer inquiries than video training. Users also commented that the AR-based training
was interesting and helpful for novices learning the basic manual milling operation techniques.
Keywords Augmented reality · Natural operation behavior · Manual milling operation · Occlusion
real-time machining simulation to increase the realism and immersiveness of the simulation.

1.1 Interaction interfaces

In an AR environment, an effective and real-time interaction interface not only enhances the accuracy of the simulation but also increases the sense of presence. Interaction interfaces can be natural or non-natural. Non-natural interactions often use an intermediary stylus or wand, e.g., a PHANToM, Razer Hydra, or HTC Vive controller, to interact with virtual objects. Users need extra time to learn how to use these intermediaries. In addition, the usage of an intermediary often differs from users’ natural operation behaviors when operating a real machine tool in the real world, so the AR training might not be effective. However, an intermediary device often provides higher controllability and stability.

Some other non-natural interactions use gestures to interact with virtual objects. For example, Shim et al. (2016) developed a gesture-based AR system to interact with virtual objects without extra intermediary devices. However, gesture-based interaction is often neither intuitive nor in line with human natural operation behaviors. In addition, users often need to learn and remember complex gestures.

Unlike non-natural interaction interfaces, natural interactions are more intuitive. Regazzoni et al. (2018) stated that natural user interface design is one of the most important issues in AR/VR applications, and that a good natural user interface allows users to concentrate on achieving the final goals instead of monitoring the correct execution of gestures. Users often do not need to learn complex intermediary devices or gestures; they can move their bodies and interact with objects in a very natural way. For example, Qiu et al. (2013) proposed a VR assembly and maintenance simulation system that integrated a data helmet and a data glove into a virtual human control system. Users could execute tasks by controlling an avatar in a VR environment to grasp, move, and release virtual objects naturally and interactively. Sportillo et al. (2015) placed color markers on the thumb and index fingers to track hand positions. Through color filtering with an RGB depth camera, users could grab and drag virtual objects naturally in a virtual environment.

Recently, with advances in sensor technology, natural interactions without any worn devices have become possible. For example, Kinect is a body motion sensing device released by Microsoft. It consists of an RGB camera, an infrared emitter, an infrared camera, and a multi-array microphone, which together provide full-body 3D motion tracking, facial recognition, and voice recognition capabilities. Corbett-Davies et al. (2012) used Kinect to capture the depth information of a fixed scene and analyzed the relative locations of the foreground and the virtual objects to achieve intuitive interaction and masking. In contrast to Kinect’s full-body tracking, Leap Motion is a hand tracking device which can recognize and track hands and fingers. Weichert et al. (2013) showed that the accuracy of the Leap Motion device was about 0.2 mm for a static setup and 1.2 mm for a dynamic setup.

Penelle and Debeir (2014) proposed the fusion of the 3D data acquired by Leap Motion and Kinect to improve hand tracking performance. By registering the position and orientation of the Leap Motion in the reference frame of the Kinect, their system could accurately detect the interactions between real hands and virtual objects.

Natural user interfaces without worn devices give users more intuitive and immersive sensations. However, in prior research, most Leap Motion or Kinect devices were stationary, which limited the portability and mobility of the interactions in the AR environments. In this research, a portable natural user interface integrating a Leap Motion controller, an Intel RealSense R200 RGB depth (RGB-D) camera, and an HTC Vive HMD was developed. The system allowed users to walk around freely in a room-size AR environment and interact with a full-size conventional manual milling machine using their natural operation behaviors.

1.2 Occlusion

Occlusion is another significant issue which determines the degree of presence and the realism of an AR environment. Without occlusion handling, virtual objects are always rendered on the top layer of the color image captured by an RGB color camera. Therefore, it is difficult for users to correctly identify the relative positions of the virtual objects. Lu and Smith (2009) accessed the images before and after virtual objects appeared and compared the two images to compute the potential occlusion area for stereo matching. Their method divided the image into multiple layers and applied the GPU to reduce the matching cost. Gheorghe et al. (2015) built a virtual CNC machining simulation using AR. They managed the occlusion problem based on prior knowledge of the position and shape of the real objects in two particular scenarios. However, their method was not applicable to objects which were not in their database.

Leal-Meléndrez et al. (2013) presented a pixel-wise occlusion handling method using Kinect to achieve real-time occlusion. Khattak et al. (2014) fixed a Leap Motion controller on a table to track hand movements and attached an RGB-D (RGB depth) camera to an Oculus HMD to create an AR environment. Their method first captured and reconstructed the static environment using the RGB-D camera. Then, they compared the depth maps of the dynamic scene and the static scene to handle occlusions using a fragment shader.

Although most of the prior occlusion methods can successfully handle occlusion problems, their demand for expensive computational resources often resulted in rendering delay, which not only reduced the realism and immersiveness of the AR simulation but also caused discomfort to users. A more efficient and intuitive occlusion method is needed to handle real-time dynamic occlusions in AR applications.

This paper is organized as follows. Section 2 gives an overview of the hardware and software employed in this study. Section 3 describes the unification of the different coordinate systems. Section 4 describes the dynamic occlusion method developed in this study. Section 5 presents the machining simulation and user interface designs. Section 6 gives a user test. Finally, Sect. 7 offers conclusions.
2 System architecture
scenes. Leap Motion was used to track hand positions. Leap Motion is a tiny hand tracking device containing three infrared emitters and two infrared cameras.

To increase users’ mobility in the AR environment, the RealSense R200 camera and the Leap Motion controller were fixed on the HTC Vive HMD (as shown in Fig. 3), so that users’ hand motions could be captured and viewed according to the user’s head position and orientation. This also allowed users to walk freely in a room-size environment and interact with a full-size virtual milling machine using their natural operation behaviors.

Unity3D is a cross-platform game engine developed by Unity Technologies. Unity3D can use compute shaders to accelerate computations on the GPU. In this study, Unity3D was used to build the AR environment. Four modules were constructed: a rendering module, an interaction module, a user interface module, and a milling machine simulation module.

The rendering module integrated the real scene image and the virtual world scene and ensured that the results could be rendered on the HTC Vive HMD. The interaction module integrated the user motion information tracked by Leap Motion and maintained the logic and rules of the interaction behaviors between users and virtual objects. The user interface module provided text or image instructions to users during the AR-based training. The milling machine simulation module handled the machining logic and provided dynamic machining behaviors.

Two HTC Vive lighthouses were placed at two corners of a 3.0 m × 2.25 m room. When users put on the HTC Vive HMD, the two lighthouses tracked the users’ head movements. HTC Vive handheld controllers were not used in this study, so that users could operate the virtual milling machine with their bare hands; hand poses were instead obtained from the head-mounted Leap Motion controller, as sketched below.
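As a concrete illustration of how the interaction module might poll the head-mounted Leap Motion controller once per rendered frame, the following is a hedged C# sketch using the Leap Motion C# SDK inside a Unity script. The class name and the hmdAnchor field are assumptions of this sketch (the paper does not list its actual scripts), and a production adapter would also reconcile the axis conventions between the Leap and Unity frames.

```csharp
using Leap;          // Leap Motion C# SDK
using UnityEngine;

// Hypothetical sketch: polls Leap Motion hand data once per frame.
public class HandTracker : MonoBehaviour
{
    private Controller leap;      // connection to the Leap Motion service
    public Transform hmdAnchor;   // pose of the HTC Vive HMD in Unity world space

    void Start()
    {
        leap = new Controller();
    }

    void Update()
    {
        Frame frame = leap.Frame();   // most recent tracking frame
        foreach (Hand hand in frame.Hands)
        {
            // Leap reports positions in millimeters in its own device frame.
            Vector palm = hand.PalmPosition;
            Vector3 palmLocal = new Vector3(palm.x, palm.y, palm.z) * 0.001f;

            // Because the device is mounted on the HMD, hand positions must be
            // transformed by the head pose (Sect. 3) to reach Unity world space.
            Vector3 palmWorld = hmdAnchor.TransformPoint(palmLocal);
            Debug.Log("Palm at " + palmWorld);
        }
    }
}
```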
3 Coordinate unification

In the AR system, each device had its own coordinate system. These coordinate systems needed to be unified in order to have correct data communication. In this study, the following coordinate systems were conformed to each other:

• Unity3D (virtual world)
• HTC Vive HMD
• Intel RealSense R200 color camera
• Intel RealSense R200 depth camera
• Leap Motion

There were four unification steps: from HTC Vive to Unity3D ($M_{\text{Vive/Unity}}$), from the R200 color camera to Unity3D ($M_{\text{R200Color/Unity}}$), from Leap Motion to Unity3D ($M_{\text{Leap/Unity}}$), and from the R200 depth camera to the color camera ($M_{\text{depth/color}}$).

3.1 Step 1: from HTC Vive to Unity3D

“SteamVR” from the Unity Asset Store was used to manage the Vive in Unity3D. “CameraRig” in “SteamVR” is a Unity gameobject that controls stereo rendering and head tracking. “CameraRig” contains one Unity camera anchored at the center between the poses of the left eye and the right eye. Through “CameraRig”, the script “SteamVR_TrackedObject” can also be accessed to obtain the position and orientation of the HMD between the two lighthouses in each frame. Therefore, the transformation matrix between the Vive and the Unity virtual world could be calculated by

$$M_{\text{Vive/Unity}} = M_t \times M_r \times M_s \tag{1}$$

where $M_t$, $M_r$, and $M_s$ are the translation, rotation, and scaling matrices of the tracked pose. The fixed transformation from the R200 color camera to the Vive HMD, used in the subsequent unification step, was the mounting offset of the camera on the headset:

$$M_{\text{R200Color/Vive}} =
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0.015 \\
0 & 0 & 1 & 0.07 \\
0 & 0 & 0 & 1
\end{bmatrix}$$
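The composition in Eq. (1) can be illustrated in Unity C#. The following is a hedged sketch, not the paper’s actual code; the class and method names are assumptions, while Matrix4x4.TRS and Matrix4x4.Translate are standard Unity APIs.

```csharp
using UnityEngine;

// Hedged sketch: builds the matrices of Sect. 3 from a tracked HMD pose.
public static class CoordinateUnification
{
    // M_Vive/Unity = Mt * Mr * Ms, composed by Unity's TRS helper.
    public static Matrix4x4 ViveToUnity(Vector3 hmdPosition, Quaternion hmdRotation)
    {
        // Matrix4x4.TRS multiplies translation * rotation * scale, matching Eq. (1).
        return Matrix4x4.TRS(hmdPosition, hmdRotation, Vector3.one);
    }

    // Fixed mounting offset of the R200 color camera relative to the HMD:
    // 0.015 m along y and 0.07 m along z, as given in the paper's matrix.
    public static readonly Matrix4x4 R200ColorToVive =
        Matrix4x4.Translate(new Vector3(0f, 0.015f, 0.07f));

    // Chaining the steps maps a point from the color camera frame to the
    // Unity world frame: p_unity = M_Vive/Unity * M_R200Color/Vive * p_camera.
    public static Vector3 ColorCameraToWorld(Matrix4x4 viveToUnity, Vector3 pCamera)
    {
        return (viveToUnity * R200ColorToVive).MultiplyPoint3x4(pCamera);
    }
}
```

In the actual system, the HMD position and rotation would be read from “SteamVR_TrackedObject” in each frame before composing the matrices.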
Fig. 3 The Leap Motion and RealSense R200 were fixed on the HTC Vive HMD
image back to the scene, which was time-consuming and made it difficult to achieve real-time dynamic occlusions. In this study, a more efficient and intuitive method was proposed by overwriting the Z-buffer in the rendering engine of Unity3D.

Figure 8 shows a standard Unity3D rendering pipeline. Before the frame buffer step, per-sample operations perform a series of tests to ensure that the rendering is correct. Among these tests, the depth test checks the depth order of the objects and makes sure that only the forefront objects are drawn in the scene. In this study, before the per-sample operations step, a custom fragment shader was used to overwrite the Z-buffer of the image plane in the pixel processor, so that each pixel would have correct depth information.

Whenever the RealSense R200 RGB-D camera refreshed the captured images, the system mapped the depth data to the color images. Then, both the color
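The conversion from metric camera depth to Z-buffer values that such a depth overwrite relies on can be sketched as follows. This is a hedged illustration, assuming a standard (non-reversed) perspective depth buffer and hypothetical helper names; the paper’s actual fragment shader is not listed, so only the CPU-side conversion is shown.

```csharp
using UnityEngine;

// Hedged sketch: converts R200 depth samples (millimeters) into the
// nonlinear [0,1] depth-buffer values a custom fragment shader would
// write for the image plane, so the depth test can occlude virtual parts.
public static class DepthOverwrite
{
    // Standard perspective depth mapping: 0 at the near plane, 1 at far.
    // depth01 = (1/near - 1/z) / (1/near - 1/far)
    public static float EyeDepthToBuffer(float zMeters, float near, float far)
    {
        return (1f / near - 1f / zMeters) / (1f / near - 1f / far);
    }

    // Fills a single-channel texture with buffer depths; the texture would
    // then be sampled by the fragment shader that overwrites the Z-buffer.
    public static void FillDepthTexture(ushort[] depthMm, Texture2D target,
                                        float near, float far)
    {
        Color[] pixels = new Color[depthMm.Length];
        for (int i = 0; i < depthMm.Length; i++)
        {
            float z = depthMm[i] * 0.001f;              // mm -> m
            float d = (depthMm[i] == 0)                 // 0 = no depth reading
                ? 1f                                    // treat as far away
                : Mathf.Clamp01(EyeDepthToBuffer(z, near, far));
            pixels[i] = new Color(d, d, d, 1f);
        }
        target.SetPixels(pixels);
        target.Apply();                                 // upload to the GPU
    }
}
```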
6 User test
6.1 Training session
composed of four main steps: (1) introduction to the machine tools, (2) introduction to the workpiece setup, (3) introduction to the cutter setup, and (4) introduction to the milling task. Both training sessions took about 30 min.

Twenty-four hours after completing the training sessions, the participants were asked to complete a practical milling task. They had to complete a real milling task with the correct operation steps. After that, subjects in the experiment group were asked to watch the same education video. Finally, subjects in the experiment group were asked to fill out a subjective questionnaire and a System Usability Scale (SUS) analysis (Brooke 1996). The training flowchart is shown in Fig. 12.

6.2 Practical milling operation session

Before operating the real milling machine, subjects needed to wear safety equipment and to understand the safety rules.
During the test, whenever the subjects encountered any problem or forgot any operation step, they could ask for help. Each subject was given a 50 mm × 50 mm × 25 mm workpiece and a two-flute end mill with a 16 mm diameter. A finished sample product and its dimensions were given to the subjects for reference. Subjects were asked to machine the workpiece to the same geometry as the given sample product. The finished product should include two adjacent open slots. One was 16 mm wide, 2 mm deep, and 10 mm away from the reference plane. The other was 8 mm wide, 1 mm deep, and 26 mm away from the reference plane, as shown in Fig. 13. The evaluation consisted of three parts.

In Part 1 (Workpiece), the participants had to correctly fasten the workpiece on the vise of the milling machine. This part included four steps: (1) put the baseplate on the vise, (2) put the workpiece on the baseplate, (3) fasten the vise to hold the workpiece, and (4) knock the workpiece with a rubber hammer to remove the gap between the plate and the workpiece.

In Part 2 (Cutter), the participants had to correctly install the cutter assembly. This part included five steps: (1) assemble the collet nut and the collet, (2) put the milling cutter in the collet, (3) move the lever knob to the “IN” position to lock the rotation of the chuck, (4) fasten the cutter assembly on the chuck with a C spanner, and (5) move the lever knob to the “OUT” position.

In Part 3 (Milling), the subjects had to complete the milling task with the correct milling operations. This part included five steps: (1) turn on the milling machine to the

After the 20 participants completed the user test, their performance was evaluated based on three items: failure rate, the number of inquiries, and the time taken. The recorded time did not include the time spent on inquiries. Since none of the participants had any prior milling machining experience and this was their first time operating a real milling machine, the dimensional accuracy was not rigorously evaluated. The allowances for the locations and widths of the two slots were ± 4 mm, and the allowances for the depths of the two slots were ± 0.5 mm. Table 1 shows the evaluation results. The independent-samples t test was used to check whether there were significant differences in failure rate, number of inquiries, and time taken between the control group and the experiment group.
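For illustration only (the paper does not list its analysis code), the independent-samples t statistic with pooled variance can be computed as in the following C# sketch. The sample arrays and the equal-variance assumption are assumptions of this sketch.

```csharp
using System;
using System.Linq;

// Hedged sketch: Student's independent-samples t test with pooled variance,
// as could be used to compare the control (video) and experiment (AR) groups.
public static class TTest
{
    public static double Statistic(double[] a, double[] b)
    {
        int n1 = a.Length, n2 = b.Length;
        double m1 = a.Average(), m2 = b.Average();

        // Unbiased sample variances.
        double v1 = a.Sum(x => (x - m1) * (x - m1)) / (n1 - 1);
        double v2 = b.Sum(x => (x - m2) * (x - m2)) / (n2 - 1);

        // Pooled variance under the equal-variance assumption.
        double sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2);

        // t statistic with n1 + n2 - 2 degrees of freedom; the p value is
        // then read from a t distribution table or a statistics library.
        return (m1 - m2) / Math.Sqrt(sp2 * (1.0 / n1 + 1.0 / n2));
    }
}
```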
In Part 1 (Workpiece), there were no significant differences between the control group and the experiment group in any of the three evaluation items. This might be because the steps in Part 1 were easy and fast. In Part 2 (Cutter), there was a significant difference in the number of inquiries. Subjects in the video training group had more questions concerning the setup sequence of the cutter, collet, and collet nut. In Part 3 (Milling), the failure rate in the video training group was significantly higher. Subjects in the video training group tended to use the center of the cutter, instead of the edge of the cutter, as the reference, which resulted in incorrect milling dimensions. The number of inquiries in the video training group was also significantly higher than in the AR-based training group.
Table 1 Evaluation results (C = control group, video training; E = experiment group, AR-based training)

Part 1 (Workpiece): inquiries (mean) C 1.1, E 0.7; time taken (s, mean) C 107, E 85.78
  Failure rate (%):
    Put the baseplate on the vise: C 0, E 0
    Put the workpiece on the baseplate: C 0, E 0
    Fasten the vise: C 0, E 10
    Knock the workpiece using a rubber hammer: C 0, E 0
  p values: failure rate 0.339; inquiries 0.213; time 0.163

Part 2 (Cutter): inquiries (mean) C 2.4, E 1.1; time taken (s, mean) C 148.2, E 145.1
  Failure rate (%):
    Assemble the collet nut and the collet: C 10, E 10
    Put the milling cutter in the collet: C 10, E 10
    Move the lever knob to the “IN” position: C 10, E 10
    Fasten the nut on the collet chuck with a C spanner: C 10, E 10
    Move the lever knob to the “OUT” position: C 0, E 10
  p values: failure rate 0.646; inquiries 0.023; time 0.563

Part 3 (Milling): inquiries (mean) C 1.6, E 0.7; time taken (s, mean) C 1018.8, E 913.3
  Failure rate (%):
    Switch on the milling machine: C 0, E 0
    Rotate the x–y–z handle to adjust the cutter to the proper position: C 0, E 0
    Machine the workpiece in the clockwise direction against the feeding direction: C 50, E 20
    Complete the milling assignment with the correct width: C 40, E 10
    Complete the milling assignment with the correct depth: C 50, E 0
  p values: failure rate 0.003; inquiries 0.001; time 0.28
The results showed that AR-based training with natural user interfaces helped users form a deeper impression of the practical operation processes. In contrast, the video training was not interactive, so the subjects might not have gained sufficient subjective experience of how to operate a real milling machine.

6.4 Subjective questionnaire analysis

After completing the practical milling task, subjects in the experiment group were asked to watch the same education video. Then, they were asked to fill out a subjective questionnaire and an SUS questionnaire. For the subjective questionnaire, a 7-point Likert scale was used, with 1 representing “strongly disagree” and 7 representing “strongly agree”. The questionnaire contained 18 questions. Questions 1–8 concerned the interaction mode of the AR system; questions 9–12 concerned the occlusion effects; questions 13–16 concerned the instruction mode in the AR system; and questions 17 and 18 were the subjective comparison between the video training and the AR-based training. The results of the subjective questionnaire are given in Table 2.

Regarding the interaction mode, most scores were higher than 6, except for questions 4, 5, and 7. These three questions were related to the interaction between real hands and virtual objects. One problem might be that although the depth information of the real scenes could be obtained using the RGB-D camera, the real-world images rendered to the users were still 2D images, not in stereoscopic format. Therefore, it was difficult for the users to judge the distance between the virtual objects and their real hands. Another problem might be the tracking error of the Leap Motion. If users moved their hands too fast or rotated their hands to certain angles, the Leap Motion failed to track the hands correctly.

Subjects generally believed that the color-changing features and the augmented instructions were helpful in understanding the milling task. Finally, most subjects agreed that the AR-based training was more helpful and interesting than the video training.

6.5 System usability scale analysis results

The SUS analysis was based on the metric developed by John Brooke (Brooke 1996). The SUS metric provides a reliable usability evaluation for a product. The metric consists of 10 questions. Tullis and Stetson (2004) showed that, using the SUS metric, a sample size of around
8–12 participants was enough to give a reasonably accurate measure of the usability of a system. Table 3 shows the results of the SUS analysis. The developed AR-based training system received an overall mean score of 79.6, which was above the average mean score of 68 (Tullis and Stetson 2004). This indicated that the developed AR-based training system was easy to use. A sketch of the standard SUS scoring rule is given below.
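As a short illustration of how such a score is obtained (this is the standard SUS scoring rule from Brooke 1996, not code from the paper): each odd-numbered item contributes its response minus 1, each even-numbered item contributes 5 minus its response, and the sum is multiplied by 2.5 to reach the 0–100 range.

```csharp
using System;

// Hedged sketch: standard SUS scoring (Brooke 1996) for one respondent.
public static class Sus
{
    // responses: ten answers on the 1-5 scale, in questionnaire order.
    public static double Score(int[] responses)
    {
        if (responses.Length != 10)
            throw new ArgumentException("SUS uses exactly 10 items.");

        int sum = 0;
        for (int i = 0; i < 10; i++)
        {
            // Odd-numbered items (index 0, 2, ...) are positively worded,
            // even-numbered items are negatively worded.
            sum += (i % 2 == 0) ? responses[i] - 1 : 5 - responses[i];
        }
        return sum * 2.5;   // scales the 0-40 raw sum to 0-100
    }
}
```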
7 Conclusion

Manual milling machine operation training is an important part of vocational education. However, most prior AR-based milling applications were for NC machine operations, which do not demand heavy manual skills. Conventional manual milling operations require skillful operators to produce a quality product. In this study, an AR-based training system for conventional manual milling machine operations was developed to provide novices a hands-on, but safe, training environment. Users can operate a full-size virtual milling machine using their natural operation behaviors, without any worn or handheld intermediaries.

The system was portable and allowed users to walk around in a room-size environment. Dynamic occlusion was realized using a Unity3D fragment shader on the GPU to increase the realism and immersiveness of the simulation. GPU parallel processing was used to implement the marching cubes algorithm to achieve a real-time simulation at 30 frames per second. With the augmented instructions and illustrated
3D animation, users were able to learn and practice the milling operations step by step, without any fear of injury or damage. In addition, by presenting the users’ physical hands in their vision, the sense of detachment from the AR environment was eliminated.

The user test results showed that users tended to have a deeper impression after taking the AR-based training. The failure rate in the milling operations, the number of inquiries during the cutter setup, and the number of inquiries during the milling operations were significantly lower for the AR-based training than for the video training.

Although the occlusion handling greatly improved the interaction effects, users still had trouble grasping virtual objects with their bare hands. This was because the color images of the real scenes were rendered as 2D images; users could not use their binocular vision to determine the relative positions of the virtual objects and the real objects. In the future, stereo color cameras will be used to provide stereoscopic color images of the real scenes in the AR environment. In addition, the current system was designed to train novices in the basic manual milling machine operations at beginner’s level. In the future, training at a professional level will be added. The resolution of the marching cubes will also be increased to improve the accuracy of the simulation.

Acknowledgements The authors would like to thank the Ministry of Science and Technology, Taiwan, Republic of China for financially supporting this research under Contract MOST 108-2221-E-002-161-MY2.

References

Brooke J (1996) SUS: a quick and dirty usability scale. Usability Eval Ind 189(194):4–7
Chardonnet J-R, Fromentin G, Outeiro J (2017) Augmented reality as an aid for the use of machine tools. In: The 15th management and innovative technologies (MIT) conference, Sinaia, Romania, pp 1–4
Corbett-Davies S, Green R, Clark A (2012) Physically interactive tabletop augmented reality using the Kinect. In: Proceedings of the 27th conference on image and vision computing, Dunedin, New Zealand, pp 210–215
Gheorghe C, Rizzotti D, Tièche F, Carrino F, Khaled OA, Mugellini E (2015) Occlusion management in augmented reality systems for machine-tools. In: International conference on virtual, augmented and mixed reality, Los Angeles, CA, USA, pp 438–446
Khattak S, Cowan B, Chepurna I, Hogue A (2014) A real-time reconstructed 3D environment augmented with virtual objects rendered with correct occlusion. In: 2014 IEEE games media entertainment, Toronto, ON, Canada, pp 1–8
Kiswanto G, Ariansyah D (2013) Development of augmented reality (AR) for machining simulation of 3-axis CNC milling. In: International conference on advanced computer science and information systems (ICACSIS), Bali, Indonesia, pp 143–148
Leal-Meléndrez JA, Altamirano-Robles L, Gonzalez JA (2013) Occlusion handling in video-based augmented reality using the Kinect sensor for indoor registration. In: Proceedings of the 18th Iberoamerican congress CIARP 2013 on progress in pattern recognition, image analysis, computer vision, and applications, Havana, Cuba, pp 447–454
Lorensen WE, Cline HE (1987) Marching cubes: a high resolution 3D surface construction algorithm. ACM SIGGRAPH Comput Graphics 21(4):163–169
Lu Y, Smith S (2009) GPU-based real-time occlusion in an immersive augmented reality environment. J Comput Inf Sci Eng 9(2):024501
Neugebauer R, Klimant P, Wittstock V (2010) Virtual-reality-based simulation of NC programs for milling machines. In: Proceedings of the 20th CIRP design conference on global product development, Ecole Centrale de Nantes, Nantes, France, pp 697–703
Penelle B, Debeir O (2014) Multi-sensor data fusion for hand tracking using Kinect and Leap Motion. In: Proceedings of the 2014 virtual reality international conference, Laval, France, pp 1–7
Qiu S, Fan X, Wu D, He Q, Zhou D (2013) Virtual human modeling for interactive assembly and disassembly operation in virtual reality environment. Int J Adv Manuf Technol 69:9–12
Regazzoni D, Rizzi C, Vitali A (2018) Virtual reality applications: guidelines to design natural user interface. In: Proceedings of the ASME 2018 international design engineering technical conferences and computers and information in engineering conference, Quebec, Canada, DETC2018-85867
Shim J, Yang Y, Kang N, Seo J, Han T-D (2016) Gesture-based interactive augmented reality content authoring system using HMD. Virtual Real 20(1):57–69
Sportillo D, Avveduto G, Tecchia F, Carrozzino M (2015) Training in VR: a preliminary study on learning assembly/disassembly sequences. In: International conference on augmented and virtual reality, Lecce, Italy, pp 332–343
Tullis TS, Stetson JN (2004) A comparison of questionnaires for assessing website usability. In: Usability professionals association conference, Minneapolis, MN, USA, pp 1–12
Weichert F, Bachmann D, Rudak B, Fisseler D (2013) Analysis of the accuracy and robustness of the Leap Motion controller. Sensors 13(5):6380–6393
Zhang Z (2000) A flexible new technique for camera calibration. IEEE Trans Pattern Anal Mach Intell 22(11):1330–1334
Zhang J, Ong S-K, Nee AY (2008) AR-assisted in situ machining simulation: architecture and implementation. In: Proceedings of the 7th ACM SIGGRAPH international conference on virtual-reality continuum and its applications in industry, p 26