Ocean Engineering: Pengpeng He, Delin Hu, Yong Hu
Ocean Engineering
journal homepage: www.elsevier.com/locate/oceaneng
A R T I C L E I N F O

Keywords: Ship shell plates; Vision measurement; Deep learning; Multi-view stereo; 3D reconstruction

A B S T R A C T

In this work, a measurement system integrated with a deep-learning based multi-view stereo (MVS) approach is developed to measure ship shell plates. Specifically, a deep learning architecture of CasMVSNet is deployed for depth map inference from multi-view images, which can remarkably decrease the dense reconstruction time and GPU memory consumption. This is the first report that a learned MVS architecture is deployed to three-dimensional (3D) reconstruction of ship shell plates. Measurement experiments are then performed for three typical hull plates to evaluate the accuracy, efficiency and completeness of the proposed measurement method. The results suggest that the complete point cloud data of the curved hull plates can be reconstructed in about 3 min with the average of errors less than 1 mm, which fulfills the requirements of precision and efficiency in shipbuilding production. Compared with traditional wooden templates, the proposed measurement method is more accurate, efficient and inexpensive. The developed measurement system with quantitative data can also be readily integrated with the 3D computer numerical control (CNC) plate bending machine. Moreover, the robustness and flexibility of the proposed measurement method have been verified by comparison with the measurement method based on active binocular stereovision.
* Corresponding author. Key Laboratory of High Performance Ship Technology (Wuhan University of Technology), Ministry of Education, Wuhan, 430063, China.
E-mail address: y.hu@163.com (Y. Hu).
https://doi.org/10.1016/j.oceaneng.2022.111968
Received 10 December 2021; Received in revised form 1 June 2022; Accepted 8 July 2022
Available online 18 July 2022
0029-8018/© 2022 Elsevier Ltd. All rights reserved.
P. He et al. Ocean Engineering 260 (2022) 111968
(1) Firstly, taking a single-scale image of an arbitrary size as the input, Feature Pyramid Network (Lin et al., 2017) outputs proportionally sized feature maps at three levels.
(2) Secondly, N feature maps at the first level are warped into different fronto-parallel planes of the reference camera in order to form N feature volumes via the differentiable homography warping. The variance cost metric is then used to aggregate N feature volumes to one cost volume. After that, the cost volume regularization is designed for refining the above cost volume to generate a probability volume for depth inference. In the end, the initial depth map at the first stage is regressed from the regularized probability volume. This part is migrated from MVSNet (Yao et al., 2018).
(3) Thirdly, the hypothesis depth range is narrowed based on the predicted output from the previous stage, and finer hypothesis plane intervals are applied to recover more detailed outputs. The next step is the same as step (2) in order to output a higher resolution depth map.
(4) Finally, step (3) is repeated at the third stage in order to narrow the hypothesis depth range and hypothesis plane interval again, and the final depth map is acquired.

2.1. Feature extraction

The first step of CasMVSNet is to extract the deep features of the N input images at multiple levels. Here, Feature Pyramid Network (FPN) (Lin et al., 2017) is applied to obtain feature maps at three levels. The construction of FPN involves a bottom-up pathway, a top-down pathway and lateral connections, which are designed for building feature maps at multiple scales. The bottom-up pathway computes the feedforward of conv0, conv1 and conv2 in Table 1, which generates feature maps at three scales. The conv0, conv1 and conv2 are the basic blocks consisting of two or three 2D CNNs for learning the feature extraction. The top-down pathway adopts the nearest neighbor interpolation with a scale factor of 2 in order to produce higher-resolution feature maps from spatially coarser, but semantically stronger, feature maps at higher pyramid levels. These upsampled feature maps are then merged with the corresponding bottom-up feature maps by element-wise addition. Lateral connections utilize a 1 × 1 convolutional layer to match the channel dimensions of feature maps from the bottom-up and top-down pathways.

In total, thirteen 2D CNN layers are applied, where the strides of Conv2d_3 and Conv2d_6 in Table 1 are set to two to divide the feature maps into three scales. Within each scale, two or three convolutional layers are applied to extract the higher-level image representation. The outputs of FPN are the feature maps at three levels, viz. output1, output2 and output3 in Table 1. Their corresponding spatial resolutions at three levels are 1/16, 1/4, 1 of the input image size, respectively. Detailed parameters of FPN can be found in Table 1. The conv2d, upsampling and addition refer to a single-layer 2D CNN operation, the interpolation operation and the element-wise addition operation respectively, while conv2d_x, upsampling_x and addition_x represent their middle outputs after the xth corresponding operations. The in_chs, out_chs, kernel_size, stride and padding are parameters of each 2D CNN layer. H and W denote the height and width of the input image, respectively.

Table 1
Parameters of feature pyramid network.
Output | Middle output | Input | in_chs | out_chs | kernel_size | stride | padding | Output dimension

2.2. Cost volume

After the deep feature maps $\{F_i\}_{i=1}^{N}$ at three scales are extracted from
N input images through FPN, the next step of CasMVSNet is to warp all feature maps $\{F_i\}_{i=1}^{N}$ into different fronto-parallel planes of the reference camera frustum by differentiable homographies to build N feature volumes $\{V_i\}_{i=1}^{N}$ in 3D space. CasMVSNet adopts a cascade differentiable homography warping, whose homography at the (s + 1)th stage is

$$H_i\left(d_k^{s} + \Delta_k^{s+1}\right) = K_i \cdot R_i \cdot \left(I - \frac{(t_1 - t_i) \cdot n_1^{T}}{d_k^{s} + \Delta_k^{s+1}}\right) \cdot R_1^{T} \cdot K_1^{-1} \tag{1}$$

where $H_i(d_k^{s} + \Delta_k^{s+1})$ refers to the homography between the ith feature map and the reference feature map at depth $d = d_k^{s} + \Delta_k^{s+1}$. $d_k^{s}$ denotes the predicted depth of the kth pixel at the sth stage, and $\Delta_k^{s+1}$ is the residual depth of the kth pixel to be learned at the (s + 1)th stage. Moreover, $K_i$, $R_i$, $t_i$ are the camera intrinsics, rotations and translations of the ith source view respectively. $K_1$, $R_1$, $t_1$ are the camera intrinsics, rotations and translations of the reference view respectively. $n_1$ denotes the principal axis of the reference camera. $I$ is the identity matrix.

Next, CasMVSNet aggregates multiple feature volumes $\{V_i\}_{i=1}^{N}$ to one cost volume C via a variance cost metric. The cost volume C is defined as:

$$C = \frac{\sum_{i=1}^{N} \left(V_i - \overline{V_i}\right)^{2}}{N} \tag{2}$$

where $\overline{V_i}$ is the average volume among all feature volumes, and all operations above are element-wise.

Subsequently, CasMVSNet regularizes the cost volume C using the multi-scale 3D CNNs, and regresses a coarse depth map. The later stages use the estimated depth maps from the earlier stages to adaptively generate hypothesis planes and construct new cost volumes with higher-resolution feature maps. This adaptive depth sampling and adjustment of feature resolution ensures the computation and memory resources are spent on more meaningful regions (Gu et al., 2020). Thus, the computation time and GPU memory consumption can be decreased remarkably via the coarse-to-fine pattern.

2.3. Loss function

The cascade cost volume with N stages produces N − 1 intermediate outputs and a final prediction. The supervision is applied to all of the outputs, and the total loss (Gu et al., 2020) is defined as:

$$\mathrm{Loss} = \sum_{s=1}^{N} \lambda_s \cdot L_s \tag{3}$$

where $L_s$ refers to the loss at the sth stage and $\lambda_s$ refers to its corresponding loss weight. The smooth L1 loss (Girshick, 2015) is adopted for $L_s$ at each stage.

2.4. Implementations

The source code and the pre-trained model of CasMVSNet provided by Gu et al. (2020) are available at https://github.com/alibaba/cascade-stereo/tree/master/CasMVSNet. Although the model is trained on a DTU indoor dataset, CasMVSNet without any fine-tuning is still able to produce excellent reconstructions on the more complex outdoor scenes, demonstrating the strong generalization ability of the network. Thus, it could be feasible to use the pre-trained model of CasMVSNet for 3D reconstruction of curved hull plates in the absence of a curved hull plates dataset. It is worth noting that each image from different views takes turns to be the reference image to estimate depth maps. Moreover, CasMVSNet adopts a three-stage cascade cost volume and two neighboring images as source images during training.

3. Measurement system

Theoretically, multi-view images of a hull plate can be directly taken manually. However, this mode of image acquisition is quite inefficient, and the quality of the collected images is easily affected by human factors. In order to improve the efficiency and quality of image acquisition, the MVS measurement system of a curved hull plate, as demonstrated in Fig. 5, is established to realize automatic acquisition and synchronous transmission of images. It mainly consists of a digital single lens reflex (DSLR) camera to acquire high-resolution images and a larger field of view (FOV), a motorized camera slider to improve the efficiency of image acquisition, a metal test frame to simulate the working condition and a laptop to process data. The main details of the hardware are presented as follows.

3.1. The DSLR camera

A Canon EOS 90D DSLR camera matched with a Canon EF-S 18–135 mm f/3.5–5.6 IS USM lens is selected, as shown in Fig. 6 (a) and 6 (b). The ordinary consumer-grade DSLR camera serves as the vision sensor in this study, mainly because it has the advantages of high resolution, low cost and flexibility compared with industrial cameras. The camera body adopts a complementary metal-oxide semiconductor (CMOS) sensor with about 6960 × 4640 pixels, and each pixel size is around 3.20 × 3.20 μm. The communication function supports Wi-Fi, Bluetooth and USB 2.0. The camera lens has a focal length of 18–135 mm. The Canon Power Zoom Adapter PZ-E1, as indicated in Fig. 6 (c), is used to achieve a stable and smooth power zoom drive. With this device, the camera focal length can be remotely adjusted so that hull plates of different sizes are kept in the appropriate proportion in the image.
Fig. 6. The camera body, lens and power zoom adapter PZ-E1.

3.2. Motorized camera slider

As shown in Fig. 7, the adopted camera slider is ASHANKS C300S Smooth ONE, which is mainly composed of two main sliding rails, a central guide sliding rail, an electronic control unit, a cable release, a sliding block, a belt and four corner brackets. The main sliding rail is made of carbon fiber material with a total length of 1200 mm and a maximum load of 8 kg. In fact, with the limitation from the size of the sliding block, the movable length of the sliding block on the camera slider is approximately 1100 mm. The total weight of the camera body, lens, and power zoom adapter is much smaller than the load-bearing capacity of the camera slider. The guide sliding rail is located between the two main sliding rails. Different shooting modes are obtained by adjusting its left and right ends. When collecting the images of a curved hull plate, the follow-focus mode should be adopted so that a target hull plate is always kept in the center of the FOV. Follow-focus mode means that the camera rotates while translating along two main sliding rails to achieve the function of shooting around a fixed object. In this mode, the positions of both ends of the guide sliding rail are “left front and right rear”, as illustrated in Fig. 8. The electronic control unit is connected to the application program (APP) in the mobile phone or iPad via Bluetooth, which can remotely control the sliding block through the APP.

Fig. 7. The camera slider.

3.4. Effective FOV

In order to acquire a larger measurement range for a single view under the present hardware conditions, it is necessary to design the downward tilt angle of the camera lens. On the basis of the ideal pinhole model, it can be estimated that when the focal length is 18 mm and the angle between the optical axis of the camera and the horizontal line is 65°, a single view of the camera can cover the largest measurement area.

The effective FOV of the MVS measurement system is composed of the overlapping FOV in different views. Since the camera takes pictures in different views, the FOV of the camera is different in various positions of the slider. CAD software is used to simulate the overlapping FOV of the camera in different views, and the result is shown in Fig. 9. When the camera moves from left to right, the range of the camera rotation angle is [11°, −11°]. The rotation angle of the camera at the center of the slider is defined as 0°, and turning to the right is positive. The final effective FOV is about 2140 × 2140 mm. This effective FOV can measure all hull plates formed by SKWB-800, SKWB-1800 and SKWB-2000 and most of the hull plates formed by SKWB-2500.

Fig. 9. Effective FOV of the MVS measurement system.

Nevertheless, for the SKWB-2500, the effective FOV of the measurement system should be larger than its processing area of 2500 × 2500 mm. A larger effective FOV could be obtained by increasing the working distance or by reducing the focal length of the optical devices. The first choice would be limited by the maximum working distance determined
by the shipyard workshop. Alternatively, the other two lenses with a shorter focal length, EF-S 10–22 mm f/3.5–4.5 USM or EF-S 10–18 mm f/4.5–5.6 IS STM, can be used for the SKWB-2500, which can acquire a larger effective FOV at the same working condition.

4. Measurement objects

Three typical hull plates with different sizes and shapes are selected as measuring objects in this study. They are formed by the 3D CNC plate bending machine SKWB-2500 shown in Fig. 1. The dimensions (l1, l2, l3, l4) of the small-size hull plate, as shown in Fig. 10 (a), are about 837 mm, 844 mm, 838 mm and 841 mm, respectively. Its thickness is about 5 mm. This kind of curved hull plate is called the saddle-shaped plate. Fig. 10 (b) shows that the dimensions (l1, l2, l3, l4) of the middle-size hull plate, named the sail-shape plate, are about 635 mm, 1394 mm, 1846 mm and 1501 mm with a thickness of around 20 mm. Its maximum bending camber (hmax) is around 333 mm. As illustrated in Fig. 10 (c), the dimensions (l1, l2, l3, l4) of the large-size hull plate are about 1873 mm, 1850 mm, 2355 mm and 1855 mm, and its thickness is about 20 mm. All three hull plates are doubly curved surfaces, and their molded hull surfaces, viz. the upper surfaces as indicated in Fig. 10, will be measured.

Fig. 10. The in-kind images of three curved hull plates.

5. Measurement process

Fig. 11 shows an overview of the measuring process for a hull plate, mainly including 3D reconstruction, measurement of approximate theoretical values and point cloud registration.

5.1. The 3D reconstruction of a curved hull plate

The flow of 3D reconstruction for a curved hull plate is illustrated in Part I of Fig. 11. Firstly, arbitrary multi-view images of a curved shell plate are automatically taken by the DSLR camera fixed on a slider and synchronously transmitted to a computer by USB cable. Secondly, the camera parameters of each image and all undistorted images are obtained from COLMAP’s Structure-from-Motion (SfM) (Schonberger and Frahm, 2016) outputs, which generates the CasMVSNet inputs together with sparse point cloud data. Thirdly, the depth maps and dense point cloud data are acquired by CasMVSNet. Finally, the noise points of the dense point cloud are removed. In the following, the 3D reconstruction process is elaborated by taking the sail-shaped plate in Fig. 10 (b) as an example.

5.1.1. Images acquisition and transfer

The time lapse mode is applied for image acquisition of a hull plate. This mode means that the sliding block runs when the shutter is closed, whereas the sliding block stops when the shutter is exposed, viz. moving-stopping-shooting. In this mode, the positions of the starting point (POINT A) and the end point (POINT B), the shooting time (Tshooting), stabilizing time (Tstabilizing), and the number of images (N) should be set in the APP. Tshooting refers to the time for taking one image, including the camera shutter exposure time and automatic focus time. Generally, the shutter exposure time should be set in the camera according to the shooting scene. The shutter exposure time is set to 0.01 s when shooting images of the hull plate. Tstabilizing refers to the time waiting for the camera and lens to come to a dead stop in order to prevent them from shaking during shooting. The stabilizing time is generally set to 0.5–2 s. N refers to the number of images taken in a single journey from POINT A to POINT B. Therefore, the total time (TIAT) for taking all images is

$$T_{\mathrm{IAT}} = N \times \left(T_{\mathrm{shooting}} + T_{\mathrm{stabilizing}}\right) + T_{\mathrm{total\ moving}} \tag{4}$$

where Ttotal moving is the total moving time of the sliding block from POINT A to POINT B.

The specific process of automatic collection of images for a curved hull plate is as follows. First, a hull plate is placed within the measurement area of the metal frame, using a crane. Second, the PZ-E1 is used to remotely adjust the focal length of the lens in order to obtain the appropriate proportion of the different hull plates in the images. It is also necessary to ensure that the hull plate is entirely covered by the FOV of the camera in the different positions of the slider. Next, the sharpness of the image should be adjusted to the optimum by focusing. Finally, the start and end positions as well as the moving speed of the sliding block are set, and the shooting time, stabilizing time and the number of images to acquire are specified in the time lapse mode.

Generally, the transfer of images stored in the secure digital memory card is a simple manual file transfer from the camera to the computer. However, this step requires user interaction and it is generally time-consuming. Therefore, further improvement is required. Ideally, the images taken should be synchronously transferred to the computer without any user interaction and extra transfer time. Fortunately, the EOS 90D can be connected to the EOS Utility software provided free of charge by Canon through a USB cable or WiFi, which allows the images taken by the camera to be synchronously transferred to a designated folder on a computer. Images of the sail-shape plate acquired by the sensor in four different directions are presented in Fig. 12.

5.1.2. Sparse 3D reconstruction

The input folder of CasMVSNet should contain the image files, camera files and view selection file. All undistorted image files are stored in the images folder. The camera parameters of one image are stored in a cam.txt file. The text file contains the camera extrinsic E = [R|t], intrinsic K and the depth range. The depth range and depth resolution are determined by the minimum depth, the interval between two depth samples, and the depth sample number. The view selection result is stored in the pair.txt file. For each reference image, its view selection scores are calculated against each of the other views, and the 10 best views are stored in the pair.txt file.

In order to apply CasMVSNet to reconstruct the curved hull plate, the sparse 3D reconstruction should be conducted for obtaining undistorted images, camera parameters and sparse 3D point cloud data. Here, the open source software COLMAP’s incremental SfM (Schonberger and Frahm, 2016) is adopted to recover a sparse representation of the scene and the camera poses of the input images. It commonly starts with feature extraction and matching, followed by geometric verification. The resulting scene graph serves as the foundation for the reconstruction stage. The sparse reconstruction stage seeds the model with a carefully selected two-view reconstruction, before incrementally registering new images, triangulating scene points, filtering outliers, and refining the
reconstruction using bundle adjustment.

The specific four steps to acquire the CasMVSNet input are given as follows:

(1) In the first step, sparse feature points in the image are extracted by using the scale-invariant feature transform (SIFT) (Lowe, 2004) and their appearance is described by a numerical descriptor. In this process, the OPENCV intrinsic camera model is selected. The camera model includes focal distances, coordinates of the principal point, and radial and tangential distortions.
(2) In the second step, the feature matching and geometric verification are adopted to find correspondences between the feature points in different images. The exhaustive matching mode is adopted. Since the number of images for the hull plate is relatively low (less than one hundred), this matching mode can be fast enough and leads to the best reconstruction results. Here, each image is matched against all other images.
(3) After producing the scene graph in the previous two steps, the incremental reconstruction is conducted. COLMAP first loads all extracted data from the database into memory and seeds the reconstruction from an initial image pair. Then, the scene is incrementally extended by registering new images and triangulating new points. The sparse reconstruction results of the sail-shape plate are finally obtained in Fig. 13.
(4) After recovering the SfM result and undistorting all images, COLMAP exports the following three text files for the reconstructed model: cameras.txt, images.txt, and points3D.txt.

The cameras.txt file contains the intrinsic parameters of all reconstructed cameras in the dataset using one line per camera. The images.txt file contains the pose and keypoints of all reconstructed images in the dataset using two lines per image. The reconstructed pose of an image is specified as the projection from the world to the camera coordinate system of an image by using a quaternion and a translation vector. The points3D.txt file contains the information of all reconstructed 3D points in the dataset using one line per point. Eventually, a script colmap2mvsnet.py provided by MVSNet (Yao et al., 2018) is used to convert the COLMAP SfM result to the input of CasMVSNet.

5.1.3. Generation of depth maps and 3D dense point cloud

CasMVSNet is used to estimate a depth value for every pixel. First, the cascade cost volume is built upon a feature pyramid encoding geometry and context at gradually finer scales. The depth range of each stage is then narrowed by the prediction from the previous stage. With gradually higher cost volume resolution and adaptive adjustment of depth intervals, the output is recovered in a coarse-to-fine manner (Gu et al., 2020).

Before converting the result to the dense point clouds, it is necessary to filter out the outliers at the background and occluded areas. Photometric and geometric consistencies are applied for the robust depth map filtering. The photometric consistency and the geometric constraint measure the matching quality and the depth consistency among multiple views, respectively. This simple two-step filtering strategy shows strong robustness for filtering different kinds of outliers.

CasMVSNet itself only produces per-view depth maps. In order to generate a 3D point cloud, a depth map fusion should be carried out to integrate depth maps from different views. The fused depth maps are then directly reprojected to space in order to generate the 3D point cloud. The depth map fusion is a general step in MVS reconstruction, and there are commendable implementations in the open-source MVS algorithms. Here, a script depthfusion.py provided by MVSNet (Yao et al., 2018) is
adopted to generate 3D dense point cloud with RGB color and normal vector information, as shown in Fig. 14.

Fig. 12. Images of the sail-shape plate acquired by the sensor in four different directions.
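The reprojection of a depth map to space mentioned in Section 5.1.3 can be sketched as below. This is a minimal illustration under the pinhole model, assuming the world-to-camera convention E = [R|t] used for the cam.txt files; it is not the depthfusion.py script itself. The function name `backproject_depth` and the toy intrinsics and depth values are hypothetical and only serve the example.

```python
import numpy as np


def backproject_depth(depth, K, R, t):
    """Back-project a depth map to 3D world points (pinhole model).

    depth : (H, W) array, depth along the optical axis; values <= 0 are skipped.
    K     : (3, 3) intrinsic matrix.
    R, t  : world-to-camera rotation (3, 3) and translation (3,),
            i.e. X_cam = R @ X_world + t (assumed convention).
    Returns an (M, 3) array of world points, one per valid pixel.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]              # pixel rows (v) and columns (u)
    valid = depth > 0
    pix = np.stack([u[valid], v[valid], np.ones(valid.sum())], axis=0)  # (3, M)
    cam = (np.linalg.inv(K) @ pix) * depth[valid]    # points in the camera frame
    world = R.T @ (cam - np.asarray(t, float).reshape(3, 1))
    return world.T


# Hypothetical toy inputs (not values from the paper).
K = np.array([[100.0, 0.0, 1.0],
              [0.0, 100.0, 1.0],
              [0.0, 0.0, 1.0]])
depth = np.full((3, 3), 2.0)
depth[0, 0] = 0.0                           # one filtered-out pixel
points = backproject_depth(depth, K, np.eye(3), np.zeros(3))
```

In a full fusion step, the points of all views that pass the photometric and geometric checks would be merged into one cloud; that part is omitted here.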
In this work, the sparse point cloud data of a hull plate are obtained by Metronor One, whose process of measurement can be seen in Part II of Fig. 11. Metronor One, as shown in Fig. 16, is a single camera electro-optical portable coordinate measuring system. It uses electro-optics and photogrammetry for accurately calculating where the handheld probe, called Lightpen, is located both in terms of position and orientation. The unique highlight of Metronor One is that it uses an optical technology based on a special mathematical optical modelling method. The measuring system mainly includes a camera with tripod, a Lightpen, a laptop, the software and a transportation case. It has a 3D length accuracy of ±0.08 mm as well as parallelism and planarity accuracies of ±0.025° within a 2.5 × 2.5 × 2.5 m³ volume (Metronor, 2019). Moreover, it can measure a whole hull plate formed by the bending machine SKWB-2500 with high precision. With an integrated battery and WiFi pack, the Lightpen is a true cable-less, joint-less design that gives users full flexibility to move and measure in the working area. However, the measuring efficiency is not high due to such point-by-point and contact measurement mode. Nevertheless, it still serves as a powerful device for verifying the accuracy of the MVS reconstruction in this study. Fig. 17 presents 3D points of three curved hull plates measured by Metronor One. They are considered accurate enough because the current measuring error with wooden templates is 1–2 mm according to workers’ experience. Therefore, these 3D points can be regarded as the theoretical values of the molded surface of a hull plate with negligible errors.

Fig. 16. In-kind photo of Metronor One.

In the present study, 3D points measured by Metronor One serve as approximate theoretical values to compare with the results of MVS reconstruction. But it should be noted that in practical applications, the design values of a curved hull plate are compared with the measurement result to determine whether the shape of a curved hull plate meets the design requirement. The four red points marked in Fig. 17 (b) correspond to the four markers shown in Fig. 15, which are subsequently used for point cloud registration.

5.3. Registration and error computation

In this work, the accuracy of measurement is assessed by computing the proximity between the two aligned point clouds. The accuracy evaluation process of 3D reconstruction for a curved hull plate is sketched out in Part III of Fig. 11. First, the dense point cloud and sparse point cloud are preliminarily registered by four corner points or markers of a curved hull plate. Then, they are finely registered by an iterative closest point (ICP) algorithm proposed by Besl and McKay (1992). Finally, the gaps between the two point clouds are estimated by calculating the absolute displacement error, and the results are presented by the error histogram and distribution. 3D point cloud registration and accuracy evaluation are performed in the open-source software CloudCompare, whose key processes will be expounded below.
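The preliminary registration by four markers can be illustrated with a least-squares rigid fit between matched point pairs. The sketch below uses the SVD-based solution (often called the Kabsch algorithm); whether CloudCompare uses exactly this procedure is an assumption, and the function name `rigid_align` and the marker coordinates are hypothetical.

```python
import numpy as np


def rigid_align(src, dst):
    """Least-squares rigid transform (R, t) with R @ src_i + t ~= dst_i.

    src, dst : (M, 3) arrays of matched points (M >= 3, not collinear).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    c_src = src.mean(axis=0)
    c_dst = dst.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t


# Hypothetical marker coordinates (not measured values from the paper).
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 1.0])
markers_src = np.array([[0.0, 0.0, 0.0],
                        [1.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0],
                        [1.0, 1.0, 0.2]])
markers_dst = markers_src @ R_true.T + t_true
R_est, t_est = rigid_align(markers_src, markers_dst)
```

Such a coarse fit only needs to bring the clouds close enough for ICP to converge; the fine registration then refines the transform using all points.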
This method can also allow the optimization of the scale parameter if the two clouds have different scales.

After both clouds are already roughly registered, the ICP algorithm is adopted to automatically and finely register sparse and dense clouds representing the same hull plate. The sparse point cloud and the dense point cloud are specified as the registered cloud (will eventually move) and the reference cloud (will not move), respectively. Fig. 18 presents the alignment and registration for the sail-shape plate.

5.3.4. Error computation

The distances from each point of the compared cloud to the reference cloud are computed to assess the measuring accuracy. The reference cloud should have the widest extents and the highest density. Therefore, the dense point cloud is selected as the reference cloud, while the sparse point cloud is selected as the compared cloud. The octree structure, corresponding to the recursive partition of a cubical volume of space, is applied to increase efficiency for the nearest neighbor extraction.

One way to compute the distances between two point clouds (i.e., the cloud-cloud distances) is the nearest neighbor distance: for each point of the compared cloud, the nearest point in the reference cloud is searched via the octree structure, and the Euclidean distance is computed. If the reference point cloud is dense enough, it is acceptable to approximate the distance from the compared cloud to the underlying surface represented by the reference cloud. In fact, the nearest neighbor is not necessarily the actual nearest point on the surface represented by the cloud. This is especially true if the reference cloud has a low density or has big holes. Therefore, it can be a good idea to locally model the reference cloud underlying surface by fitting a local model on the ‘nearest’ point and several of its neighbors in order to approximate the real surface as well as get a better estimation of the ‘real’ distance, as shown in Fig. 19. The distance to this local model is statistically more precise. However, some distance values can be potentially worse than the nearest neighbor distance. In order to cope with this effect, the smaller distance between the nearest neighbor distance and the distance to the local model is kept for each point in this work. Eventually, these smaller distances are defined as the displacement error of cloud-cloud distances.

6. Results

All reconstruction tasks of curved hull plates in this study are executed on a Linux operating system (Ubuntu 18.04 release) using one consumer-level NVIDIA GeForce GTX 1660 SUPER graphics card with a video memory capacity of 6 GB in combination with an AMD Ryzen 9 3950X 16-core processor with random access memory (RAM) of 32 GB.

6.1. Accuracy, efficiency and completeness

Table 2 presents the influence of the image number and image resolution on the efficiency and accuracy of 3D reconstruction for the sail-shaped plate, where IN, IR and NPC denote the image number, the image resolution and the number of the point cloud after denoising,
respectively. TIAT, TSR, TDR and TTR refer to time of image acquisition and
transfer, time of sparse 3D reconstruction, time of dense 3D recon
struction and total time of 3D reconstruction, respectively. Average of
errors, standard deviation (SD) of errors and the maximum displacement
error are indicated by μ, σ and Dmax, respectively. They are computed
from a grouped frequency distribution:
$$\mu = \frac{\sum_{i=1}^{m} f_i x_i}{n}, \qquad \sigma = \sqrt{\frac{\sum_{i=1}^{m} f_i \left(x_i - \mu\right)^{2}}{n - 1}}, \qquad D_{max} = \max\left(D_1, D_2, \ldots, D_i, \ldots, D_n\right) \tag{5}$$
where xi and fi are the midpoint value and frequency of each interval,
respectively. m and n refer to the total number of intervals and samples,
respectively. Di denotes the displacement error of the ith sample. For
simplicity, four image resolutions of 6960 × 4640 pixels, 4800 × 3200
pixels, 3472 × 2320 pixels and 2400 × 1600 pixels are represented by
high resolution, medium resolution, low resolution and minimum res
olution, respectively.
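As a worked illustration of Eq. (5), the mean and standard deviation can be computed from interval midpoints and frequencies as follows. The histogram values are hypothetical and are not data from Table 2; the function name `grouped_stats` is likewise only illustrative.

```python
import math


def grouped_stats(midpoints, freqs):
    """Mean and sample SD of a grouped frequency distribution (Eq. 5).

    midpoints : interval midpoints x_i
    freqs     : interval frequencies f_i (their sum is the sample count n)
    """
    n = sum(freqs)
    if n < 2:
        raise ValueError("at least two samples are needed for n - 1")
    mu = sum(f * x for f, x in zip(freqs, midpoints)) / n
    var = sum(f * (x - mu) ** 2 for f, x in zip(freqs, midpoints)) / (n - 1)
    return mu, math.sqrt(var)


# Hypothetical error histogram: midpoints in mm and counts per interval.
midpoints = [0.1, 0.3, 0.5, 0.7]
freqs = [2, 4, 3, 1]
mu, sigma = grouped_stats(midpoints, freqs)
```

For this toy histogram, n = 10 and the weighted sum of midpoints is 3.6, so the mean is 0.36 mm; the maximum displacement error of Eq. (5) is taken directly from the individual samples rather than from the histogram.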
Fig. 18. Alignment and registration for the sail-shaped plate: (a) alignment with four markers, (b) fine registration with the ICP algorithm.

The time of image acquisition and transfer (TIAT) is computed by equation (4) with Tshooting = 2.5 s, Tstabilizing = 1.5 s and Ttotal moving = 30 s for all of the cases. In practical applications, four cameras are mounted on the four sides of the top of the metal test frame to take pictures simultaneously. Therefore, the time to collect and transmit images for four cameras is equal to the time taken by any one of them. The time of sparse 3D reconstruction (TSR) mainly includes the time for feature extraction, feature matching and SfM computation. The time of dense 3D reconstruction (TDR) comprises the time for undistorting images, transforming the data format and generating depth maps as well as the dense 3D point cloud. The total time of 3D reconstruction (TTR) is the sum of TIAT, TSR and TDR. In order to reduce the influence of the noise points around the edges of a curved hull plate on the accuracy evaluation, only the approximate theoretical points within the molded surface are used for the error statistical analysis. As illustrated in Fig. 17, the small-size plate, the middle-size plate and the large-size plate have 169, 81 and 88 approximate theoretical points within the molded surface, respectively.
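The timing decomposition described above, TTR = TIAT + TSR + TDR, can be checked against the values reported in Table 3. The short Python sketch below is only a bookkeeping illustration: the function name is hypothetical, and TIAT is taken directly from Table 3 rather than recomputed from equation (4).

```python
def total_reconstruction_time(t_iat, t_sr, t_dr):
    """Total 3D reconstruction time (s) as the sum of its three stages."""
    return t_iat + t_sr + t_dr


# (TIAT, TSR, TDR) in seconds, as reported in Table 3
stage_times = {
    "small-size plate": (50, 32, 103),
    "middle-size plate": (50, 23, 96),
    "large-size plate": (50, 29, 103),
}

totals = {
    plate: total_reconstruction_time(*times)
    for plate, times in stage_times.items()
}

for plate, ttr in totals.items():
    # Prints 185 s, 169 s and 182 s, matching the TTR row of Table 3
    print(f"{plate}: TTR = {ttr} s ({ttr / 60:.1f} min)")
```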
Fig. 19. Illustration of cloud-cloud distances computation.

Table 2
3D reconstruction of the sail-shaped plate with various image numbers and resolutions.
IN   IR/pixel   NPC   TIAT/s   TSR/s   TDR/s   TTR/s   μ/mm   σ/mm   Dmax/mm

Fig. 20. The completeness of 3D reconstruction with different image numbers and resolutions for the sail-shaped plate.

Table 2 indicates that as the number of images decreases from 52 to 20 at the high resolution, the efficiency of 3D reconstruction is improved by more than 2.5 times, while its accuracy presents a slight decline. However, when the number of images decreases from 20 to 12, μ and Dmax increase from 0.422 mm to 0.864 mm and from 1.061 mm to 2.555 mm, respectively. The loss of accuracy is attributed to the incompleteness of 3D reconstruction, as shown in Fig. 20 (d). In addition, with 20 images, the accuracy of 3D reconstruction improves steadily as the image resolution increases, whereas its efficiency is slightly reduced. Besides, when using 20 high-resolution images, TTR of the sail-shaped plate is 169 s, and its μ, σ and Dmax are 0.422 mm, 0.271 mm and 1.061 mm, respectively. TTR is thus within 3 min, and neither μ nor Dmax exceeds 1.061 mm. It can also be found that the number of points in the dense point cloud is mainly related to the number of images, and the current four image resolutions have little effect on this number since they are all larger than the specified maximum resolution (1152 × 864 pixels) in
CasMVSNet. Based on the maximum capacity of GPU memory available in the present device, the maximum resolution is specified to limit the GPU memory consumption of dense reconstruction. Although the four image resolutions are all downsized to the same maximum resolution to reconstruct a curved hull plate, images with higher resolution still yield higher precision, as shown in Table 2.

Fig. 20 shows that when the number of images exceeds 20, all four resolutions can reconstruct a complete surface of the sail-shaped plate, which is represented by a dense 3D point cloud with RGB color and normal vector information. However, the four corner areas and the complete surface of the sail-shaped plate fail to be reconstructed when using 12 high-resolution images. Therefore, according to the accuracy, efficiency and completeness of the 3D reconstruction for the sail-shaped plate, it is suggested that about 20 high-resolution images be used for the 3D reconstruction of a hull plate in the present measurement system.

Table 3 compares the NPC, efficiency and accuracy of 3D reconstruction for three curved hull plates with various sizes. Since the middle-size plate occupies the fewest total pixels owing to its shape, the NPC of the middle-size plate is far less than that of the small-size plate and the large-size plate. Therefore, the TTR of the middle-size plate is slightly shorter than that of the other two plates. Nevertheless, the TTR of all three plates is about 3 min. Moreover, comparing the measurement accuracy, i.e., μ, σ and Dmax, of the three hull plates shows that the measurement accuracy of the small-size plate is the highest, since the small-size plate occupies more pixels per unit area than the other two plates. In addition, the averages of errors of the small-size plate, middle-size plate and large-size plate are 0.215 mm, 0.422 mm and 0.541 mm, respectively. The averages of errors of the three hull plates are all within 1 mm.

Table 3
3D reconstruction of three curved hull plates with various sizes.
Items   Unit    Small-size plate   Middle-size plate   Large-size plate
IN      –       20                 20                  20
IR      pixel   6960 × 4640        6960 × 4640         6960 × 4640
NPC     –       4758094            3872393             5978518
TIAT    s       50                 50                  50
TSR     s       32                 23                  29
TDR     s       103                96                  103
TTR     s       185                169                 182
μ       mm      0.215              0.422               0.541
σ       mm      0.177              0.271               0.405
Dmax    mm      0.871              1.061               2.008

The histograms of the displacement errors for the three curved hull plates are shown in Fig. 21, which assists in understanding the distribution of errors. Four colors (blue, green, yellow and red) are applied to differentiate error values from small to large. Fig. 21 shows that the errors of all points are within 0.871 mm for the small-size plate and within 1.061 mm for the middle-size plate. For the large-size plate,
96.59% of all points are within 1.410 mm, and only three points fall between 1.410 mm and 2.008 mm. In addition, the SDs of the three hull plates are 0.177 mm, 0.271 mm and 0.405 mm, respectively. The data indicate that the deviation between the measured data and the mean value is smallest for the small-size plate, followed by the middle-size plate, and largest for the large-size plate. Since interspaces of 1–2 mm between hull plates and wooden templates are currently tolerable in the evaluation, the adopted measuring method fulfills the requirement of measurement accuracy for various hull plates within 2500 × 2500 mm, according to the measurement accuracy (μ, σ and Dmax) and the error distribution of the three hull plates.
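The share of points below an error threshold, as quoted above for the large-size plate, can be obtained from the per-point displacement errors as in the following minimal Python sketch. The function name is hypothetical, and the error list is a synthetic placeholder that only reproduces the reported counts (88 comparing points, three of them above 1.410 mm); the individual measured errors are not listed here.

```python
def share_within(errors_mm, limit_mm):
    """Fraction of displacement errors that do not exceed limit_mm."""
    if not errors_mm:
        raise ValueError("errors_mm must not be empty")
    return sum(1 for e in errors_mm if e <= limit_mm) / len(errors_mm)


# Synthetic placeholder values (not measured data): 85 points at or below
# 1.410 mm and 3 points above it, 88 comparing points in total.
errors = [1.0] * 85 + [2.0] * 3

share = share_within(errors, 1.410)
print(f"{100 * share:.2f}% of points are within 1.410 mm")  # 96.59%
```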
Table 4
Resampling results for three curved hull plates with different sizes and shapes.
Items                Unit    Small-size plate   Middle-size plate   Large-size plate
IN                   –       20                 20                  20
IR                   pixel   6960 × 4640        6960 × 4640         6960 × 4640
NPC                  –       4758094            3872393             5978518
Comparing points     –       169                81                  88
Subsampling space    mm      5                  10                  20
Subsampling points   –       22037              16435               9229
μ                    mm      0.324              0.550               0.734
σ                    mm      0.255              0.376               0.584
Dmax                 mm      1.272              1.508               2.751

Fig. 21. Histograms of the displacement error: (a) the small-size plate, (b) the middle-size plate, (c) the large-size plate.
Fig. 22. Histogram of the displacement error for the small-size plate.
Fig. 23. The colouring diagram of the displacement error for the small-size plate.
Fig. 24. Histogram of the displacement error for the middle-size plate.
Fig. 25. The colouring diagram of the displacement error for the middle-size plate.
Fig. 26. Histogram of the displacement error for the large-size plate.
Fig. 27. The colouring diagram of the displacement error for the large-size plate.
GPU memory consumption should be further reduced for high-resolution images by improving the network architecture. A promising improvement is to devise a novel cost volume regularization scheme to substitute 3D CNNs.

CRediT authorship contribution statement

Pengpeng He: Conceptualization, Methodology, Formal analysis, Writing – original draft. Delin Hu: Validation, Investigation. Yong Hu: Project administration, Supervision, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This research was supported by the Fundamental Research Funds for the Central Universities, grant number 215202001, and the National Natural Science Foundation of China, grant number 51779200. The first author wishes to thank Professor Wei Chai, at Wuhan University of Technology, for the valuable comments provided.