
Alexandria Engineering Journal (2023) 68, 67–81

HOSTED BY
Alexandria University

Alexandria Engineering Journal


www.elsevier.com/locate/aej
www.sciencedirect.com

ORIGINAL ARTICLE

Research on unbalanced mining of highway project key data based on knowledge graph and cloud model
Yinglin Wang a,b,*, Jiaxin Zhuang c, Guowei Zhou d, Shuhui Wang e

a School of Transportation and Civil Engineering, Fujian Agriculture and Forestry University, Fuzhou 350100, China
b School of Public Affairs, Xiamen University, Xiamen 361005, China
c School of Transportation and Civil Engineering, Fujian Agriculture and Forestry University, Fuzhou 350100, China
d Fujian Huamin Tongda Information Technology Co., Ltd., Fuzhou 350100, China
e School of Economics and Management, Fuzhou University, Fuzhou 350000, China

Received 18 June 2022; revised 19 December 2022; accepted 6 January 2023

KEYWORDS
Highway project; Key data; Knowledge graph; Cloud model; Unbalanced mining; G-mean classification

Abstract
The various stages of highway project construction involve text, image, audio, video and other data sources contributed by many participants, forming a huge amount of data. Accurately tracing the source of responsibility, and refining and applying the unbalanced data in highway project archives, is of great significance for realizing the intelligent transformation of highway construction project management. This paper first sorts out the construction process of highway projects and the main data sources, and constructs a data association network between construction entities and the construction process, as well as a knowledge map of highway construction data. Second, according to the highway construction stage, an index system based on 12 key data is constructed using the entropy weight cloud model method, and the importance of the data is evaluated. Third, based on the unbalanced characteristics of highway project data, a method of mining big data in highway project archives using classification evaluation indexes is proposed, and the accuracy of this method is verified by case calculation. Finally, taking the Shizong Qiubei Expressway in China as an example, intelligent management and control suggestions for key data of transportation projects are proposed. It is found that the key data rated as especially important in highway construction include construction data, supervision data and completion data. The Boosting algorithm is more accurate than the traditional SMOTE algorithm for unbalanced data mining, which helps to save project construction cost and improve the quality of data extraction from the project archives. This study provides a theoretical reference for key data traceability in intelligent management and control platforms for highway projects and for improving intelligent management efficiency.

© 2023 THE AUTHORS. Published by Elsevier BV on behalf of Faculty of Engineering, Alexandria University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

* Corresponding author.
E-mail addresses: 510057661@qq.com (Y. Wang), 467002313@qq.com (J. Zhuang), 13950287936@163.com (G. Zhou), 18705079689@qq.com (S.
Wang).
Peer review under responsibility of Faculty of Engineering, Alexandria University.
https://doi.org/10.1016/j.aej.2023.01.010
1110-0168 © 2023 THE AUTHORS. Published by Elsevier BV on behalf of Faculty of Engineering, Alexandria University.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Road construction involves many participants and complex construction processes. Massive archival materials are formed over the life cycle, providing valuable basic data for enterprises, government departments and other organizations. The archives of highway projects record various forms of valuable information generated from the project start-up stage to the completion stage, such as text, images and audio. They record the time progress of the whole construction project and are the original record of the whole life cycle of the road construction project [1]. These documents reflect information on construction projects such as size, relevant responsible parties, construction quality and costs, and safety conditions. They are therefore essential supporting documents for implementing the project supervision system and the lifetime quality responsibility system, as well as an effective basis for the later operation and management, renovation and expansion, and maintenance and repair of construction projects. Highway project archives are thus the most original records of transportation infrastructure construction projects, containing a large amount of knowledge data that can facilitate the development of construction projects. However, traditional highway project files are recorded on paper, which causes several problems: first, the aggregation workload is large; second, traffic project files are regional, and it is troublesome to transport them to other places; and third, paper files are easily damaged by force majeure events such as natural disasters. Therefore, it is urgent to mine the data in highway project archives so as to address the problems existing in project construction and provide forward-looking advice for it [2].

In recent years, the number and scale of highway projects have shown a gradual upward trend. However, large construction scale, long construction periods, many personnel involved, many uncertainties, and the long-standing issues of high energy consumption and poor efficiency also increase the complexity of highway projects. It is especially important to accurately classify and organize the unbalanced data in road project files and to perform data mining on them. Therefore, it is necessary to sort out the highway construction process, build the correlation network of highway construction entities, clarify the key data indicators in highway construction, and find suitable big data mining methods to effectively extract the data in highway project archives and improve the quality of the extracted information, so as to provide a theoretical basis for building an intelligent data management platform for highway construction projects and to better support predictive decisions on the allocation of urban resources [3,4].

2. Literature review

Highway construction cannot be separated from the control of cost, progress and quality. The advantages brought by big data application to intelligent highway construction still revolve around cost, progress and quality. The authors searched the Emerald Database and SpecialSci with the central search term "transportation archives" until March 9, 2022. The results of the search are shown in Table 1.

Table 1 Search results of related literature.

Keywords | Emerald Database | SpecialSci
Transportation archives | 3617 | 336
Transportation archives unbalanced | 123 | 0
Transportation archives imbalance | 231 | 0
Transportation data | 35,764 | 104,820
Transportation data unbalanced | 1286 | 171
Transportation data imbalance | 2054 | 2
Transportation files | 9314 | 1419
Transportation files unbalanced | 293 | 6
Transportation files imbalance | 582 | 2
Transportation records | 17,828 | 5473
Transportation records imbalance | 1096 | 15
Transportation records unbalanced | 621 | 3

At present, research related to unbalanced mining of big data in road project archives is mainly focused on traffic data and its management. Its main contents are broadly summarized as follows.

The first aspect focuses on issues related to the opening of traffic data. Data is an important basis for driving decision-making. Foreign countries pay attention to cultivating data awareness and optimize scientific decision-making in various fields by opening up data. For example, Stone & Aravopoulou [5] noted that TfL has transformed the ability of customers and employees to access real-time information by opening up data to improve the travel experience. The purpose of traffic data openness is to make it clear that data belongs to the public. It allows third parties to use the information to support the large volume of delivered information through strong digital strategies, partnerships, and data management organizations [6].

The second aspect focuses on traffic data utilization. Grzenda et al. [6] implemented short-term prediction of vehicle delay data by means of algorithmic tools, hybrid methods and machine learning models with multi-layer perceptron training, batch mode and online learning methods. This solves the problem of collecting delayed location data and improves the timeliness of data collection. Based on a discrete choice model, Yap et al. [7] noted that urban tram and bus travel was based only on revealed preference data, with quantitative measures aimed at reducing congestion and supporting the decision-making process of decision makers. Rapid advances in information and communication technologies have revolutionized public transportation. Big data (e.g. smart cards, detailed vehicle location data, mobile phone traces, data generated by social media) has started to replace traditional surveys. Zannat et al. [8] evaluated the usability and the potential strengths and weaknesses of data sources to determine the usefulness of big data. Through model building and tool algorithms, it optimizes the passenger travel experience and reflects the value of data. It can be seen that traffic data has very important utilization value in various fields.

The third aspect centers on the highway construction projects' control objectives of cost, schedule and quality. For example, Vacanas [9] proposed combining building information modeling with UAV and big data analysis technology to achieve 3D illustration of the project progress during the construction process. It visually reflects the project progress, which helps managers make timely decisions and ensures the achievement of progress goals. Babar & Arif [10] proposed that the application of big data in smart highway construction can be divided into three parts: the organization and management of big data, real-time processing of big data, and service management. The planning of smart highway construction is realized by real-time processing of big data. Yovanof & Hazapis [11] proposed that smart cities are based on digital city infrastructure and act as a user-centered service supply system. Smart highway construction is part of the digital city infrastructure, so one of its important goals is to provide advanced, user-centered and user co-created services to the public.

Based on the current research status, it can be seen that there is a strong demand for highway construction in the context of information technology, which has research space and value. In addition, there are many advantages of applying big data to highway construction, which is conducive to achieving highway construction quality, progress and cost objectives. However, there are still policy deficiencies and talent shortages in the application of big data. Based on this, scholars have put forward corresponding solutions to realize the transition of big data application in intelligent highway construction from point-like technology pilot verification to systematic technology integration research. The literature as a whole shows that the highway construction process lacks awareness of key data in data management, i.e., the focus of data management is not clear. In terms of research methodology, there is a lack of quantitative analysis models to analyze the criticality of the data. In addition, experts have begun to pay attention to unbalanced data mining. Most of this research is still in fields such as human health, and there are fewer studies on unbalanced mining of big data in road project archives.

Different from previous studies, this paper is innovative in the following aspects. First, based on the source stage of highway project data and the division of responsibility among the generating subjects, this paper traces the data generation chain, forms the knowledge map of highway construction data and the key data index system, and extracts and classifies the importance level of diverse and complex data to improve the quality of project data summary, classification and application. Second, based on a quantitative data processing method, the Boosting algorithm, the classification evaluation index is applied to check the accuracy of the unbalanced data in highway project archives. The processing scheme of key data integration is explained, which provides the theoretical basis for establishing an intelligent management platform for highway projects.

3. Methods and objectives

This research combines knowledge mapping, the entropy-cloud model, the boosting integration algorithm and the G-mean classification evaluation index to jointly build an intelligent management platform for key data of transportation projects. Knowledge mapping is used to extract and integrate the entity relationships in highway construction, and a network of entity association relationships in highway construction is constructed. Then, the entropy-weight-cloud model is used to establish the key data evaluation index system in highway construction and to rate the importance of each data in the project. Because the project data are unbalanced, an unbalanced data mining design is carried out. In this phase of research, the boosting integration algorithm and the G-mean classification evaluation index are used to perform unbalanced data mining on actual cases, and the method is verified to be applicable to the data collation of road projects. Finally, the intelligent management platform containing survey and design data, progress management data, construction data, supervision data and completion data is built based on key data of highway projects (see Fig. 1 and Fig. 2).

Fig. 1 Identifying key data flow. [Flowchart; recoverable labels: Data; Build entity relationship network; Analysis; Data linked knowledge graph; Expert Ratings; Obtain data weights and entropy values; Evaluation cloud map; Rating results.]

Fig. 2 Key Data Mining Process. [Flowchart; recoverable labels: Real cases; Data; Random samples; Training sets; Training; Basic model; Add weights; Integration model; binary classification problem; Boosting algorithm is applied to this study; Comparison with traditional SMOTE algorithm; G-mean indicator as the main indicator; Data analysis.]

3.1. Knowledge mapping

As a method to visualize the relationships in the knowledge system, knowledge mapping reveals the relationships among the subjects in a project through data mining, information processing, knowledge processing and graphing. It provides an important reference for understanding the project operation process and clarifying the sources and functions of data in the project. This paper first analyzes the types of highway project construction data and classifies all the data involved in the project. Then, a network of entity association relationships is established from the logical relationships among highway construction entities. Finally, based on the entity association relationship network, the data in highway construction is analyzed for relevance and the knowledge map of highway construction data is drawn.

3.2. Entropy-cloud model

The entropy-cloud modeling method can effectively transform fuzzy expressions or qualitative concepts in natural language into quantitative data, and visualize the transformation results by drawing cloud diagrams. This paper analyzes the importance of the 12 main data in highway construction in terms of 7 evaluation indicators: quality management, progress management, cost management, information management, safety management, environmental management and risk management. First, experts from industry and universities are invited to score the importance of the 12 main data under the 7 evaluation indicators using a 5-level scale, and the results are normalized to obtain the entropy value and weight of each evaluation indicator. Then the normal cloud generator is used to generate a large number of cloud drops at different levels, and the reverse cloud generator is used to calculate the cloud digital characteristic values corresponding to each indicator. Finally, the evaluation cloud is drawn. This method provides a good basis for judging the weight of key data evaluation indicators.

3.3. Unbalanced data mining model

In this paper, the Boosting integration algorithm is used to establish an unbalanced data classification model applicable to key data of highway projects. Boosting is one of the most common and effective integrated learning methods. The principle of the Boosting integration algorithm is to take advantage of the independence of the base classifiers, which can greatly reduce the error by balancing the influence of each model. The combined model of the base classifiers has the advantage of high efficiency and high usability.

Two important processes in machine learning are model selection and model evaluation. The performance index is the key indicator for evaluating the effectiveness of a classifier and guiding learning. Accuracy is the most commonly used classification evaluation indicator. To evaluate the performance of the established model, the idea of a binary classification problem is incorporated in this paper, and performance indicators such as accuracy, F-value and G-mean value are used for evaluation. The entries used for performance evaluation are shown in Table 2, where T means a correct prediction, F means a wrong prediction, P means a majority-class prediction and N means a minority-class prediction. TP therefore denotes a correct majority-class prediction, and the other entries are read in the same way.

Table 2 Evaluation indicators (binary classification problem).

 | Predicted majority class | Predicted minority class
Actual majority class | TP | FN
Actual minority class | FP | TN

According to Table 2, the accuracy rate (TP + TN)/(TP + FN + TN + FP) can be obtained, which represents the proportion of all correctly judged samples in the classification dataset. However, in the analysis of problems related to unbalanced datasets, the accuracy is strongly influenced by the majority class. Therefore, to evaluate the classification effect on unbalanced data, new performance evaluation indicators are needed to represent classification performance accurately.

Precision:

$$\mathrm{precision} = \frac{TP}{TP + FP}$$

Recall:

$$\mathrm{recall} = \frac{TP}{TP + FN}$$

F-value:

$$F = \frac{2 \times \mathrm{recall} \times \mathrm{precision}}{\mathrm{recall} + \mathrm{precision}}$$

G-mean values can be used to rate classification on unbalanced data. The basic idea is to make the classification accuracy of both majority and minority classes as high as possible while maintaining a balance between them. If the classification accuracy of the majority category is high but the classification accuracy of the minority category is low, the G-mean value will not be ideal. Thus a larger G-mean value indicates that the model is better at classifying both types of samples. The G-mean is defined as:

$$G\text{-}\mathrm{mean} = \sqrt{\frac{TP}{TP + FN} \times \frac{TN}{TN + FP}}$$
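As a worked illustration of the indicators defined above, the following Python sketch computes accuracy, precision, recall, F-value and G-mean from the four counts of Table 2. The function name `imbalance_metrics` and the example counts are hypothetical and are not taken from the paper; as in Table 2, the majority class plays the role of the positive class.

```python
import math


def imbalance_metrics(tp, fn, fp, tn):
    """Return the Table 2 indicators as a dict.

    tp, fn: majority-class samples predicted as majority / minority.
    fp, tn: minority-class samples predicted as majority / minority.
    Assumes every denominator is non-zero (no guard for empty classes).
    """
    total = tp + fn + fp + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # accuracy on the majority class
    specificity = tn / (tn + fp)   # accuracy on the minority class
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f_value": 2 * recall * precision / (recall + precision),
        "g_mean": math.sqrt(recall * specificity),
    }


# Hypothetical counts: 100 majority samples and 10 minority samples.
metrics = imbalance_metrics(tp=90, fn=10, fp=8, tn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts the accuracy is about 0.84 while the G-mean is about 0.42, because only 2 of the 10 minority samples are classified correctly. This is the effect the text describes: accuracy is dominated by the majority class, whereas the G-mean drops when either class is classified poorly.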

Table 3 Range of highway construction data.

Data type | Detailed data types | Details
Highway construction project data | Project data | Project proposal, feasibility study report, project evaluation
Highway construction survey and design data | Survey and design data | Survey records report, construction design documents, design change documents
Highway construction project management data | Land acquisition and relocation data | Progress plan, demolition cost, land acquisition quantity, demolition quantity
Highway construction project management data | Bidding data | Bidding list, bidding documents, tender documents, bid evaluation documents, successful bid documents
Highway construction project management data | Project contract data | Highway construction contract documents, survey and design contract documents, supervision contract documents, regulation price and change documents
Highway construction implementation data | Measurement payment data | Project temporary works submission documents, measurement audit documents, intermediate measurement submission documents, payment reports
Highway construction implementation data | Progress management data | Construction progress plan, quantity completion report
Highway construction implementation data | Construction data | Construction site text, diagram and image records, site inspection records, measurement records, construction logs
Highway construction implementation data | Supervision data | Supervisory station records, supervisory site text, diagram and image records, supervisory notice documents, supervisory logs, supervisory monthly reports, supervisory annual summaries
Highway construction implementation data | Safety production data | Technical delivery documents, safety production rules and regulations documents, safety accidents and their handling records
Highway construction delivery and completion data | Completion data | Project delivery and completion acceptance report
Highway construction delivery and completion data | Project settlement and final account data | Project settlement documents, project completion documents
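The grouping in Table 3 can be written down as a small lookup structure, which makes the count of 12 detailed data types explicit. The sketch below is only one possible encoding; the names `DATA_TYPES_BY_STAGE` and `stage_of` are illustrative and do not come from the paper.

```python
# Stage -> detailed data types, transcribed from Table 3.
# The constant and function names are illustrative assumptions.
DATA_TYPES_BY_STAGE = {
    "Highway construction project data": ["Project data"],
    "Highway construction survey and design data": ["Survey and design data"],
    "Highway construction project management data": [
        "Land acquisition and relocation data",
        "Bidding data",
        "Project contract data",
    ],
    "Highway construction implementation data": [
        "Measurement payment data",
        "Progress management data",
        "Construction data",
        "Supervision data",
        "Safety production data",
    ],
    "Highway construction delivery and completion data": [
        "Completion data",
        "Project settlement and final account data",
    ],
}


def stage_of(data_type):
    """Return the Table 3 stage that a data type belongs to, or None."""
    for stage, data_types in DATA_TYPES_BY_STAGE.items():
        if data_type in data_types:
            return stage
    return None


# Flatten the mapping to list every detailed data type once.
all_types = [t for types in DATA_TYPES_BY_STAGE.values() for t in types]
print(len(all_types))
print(stage_of("Supervision data"))
```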

By using the G-mean classification evaluation indicator can struction data mainly involves various resources related to
effectively obtain prediction results while ensuring the accu- highway construction activities, such as text, diagrams and
racy of a small number of class samples. In the research on images with preservation value. It includes data generated dur-
the classification of unbalanced datasets of highway project ing the project, survey and design, project management, imple-
archives, the G-mean classification evaluation indicator is used mentation, delivery and completion of highway construction
to accurately identify whether a sample is a minority or major- (see Table 3).
ity category, thus reducing the difficulty of data mining. In The main types of data resources generated at different
short, it is to use G-mean classification evaluation indicator stages vary. The highway construction process has many pro-
to identify the probability of classification algorithm to classify cesses and involves complex participants, and lacks uniform
dataset samples into minority classes. This saving data mining standards in data collection and storage. Therefore, multiple
cost, reducing construction risk and improving the efficiency of types of data in different formats are formed during the high-
unbalanced mining of big data in highway project archives. way construction process.

4. Correlation analysis of road construction data based on 4.2. Construction of highway construction entity association
knowledge mapping relationship network

4.1. Analysis of highway construction data Highway construction data content sources cover highway
projects, project-related units and personnel of related units,
With the advent of the era of big data, knowledge mapping has such as construction personnel, construction unit personnel,
been developed in many fields and realized scenario-based construction units, construction units and other entities. The
applications. This has established a foundation for research relationship between the entities is complex and there exists a
on knowledge graphs in the field of road construction data. large amount of valuable knowledge resources.
To promote the correlation analysis of highway construction There are some logical relationships between highway con-
data, this chapter clarifies the main requirements by sorting struction entities, such as the affiliation between construction
out the construction process. It provides ideas for the applica- personnel and construction units; the cooperation relationship
tion of big data in highway construction from the aspects of between construction personnel; and the output relationship
highway construction data analysis, construction of entity between highway projects and highways. For example, during
association relationship network and data association. the construction of a highway project, the construction unit is
Data analysis is the first step for highway construction data responsible for the management of the project and the differ-
correlation analysis. Data analysis requires a comprehensive ent parties involved, the survey and design unit completes
understanding of the data resources and data types generated the survey and design of the project, and the supervisory unit
in the highway construction process. At present, highway con- supervises the construction unit to complete the construction
72 Y. Wang et al.

Table 4 Description of road construction entity relationships.


Entity(e1) Entity(e2) Relationship description
Construction personnel Constructor Affiliated with
Construction personnel Highway projects The relationship between construction personnel and highway projects
Project Manager Highway projects The relationship between project managers and highway projects
Construction unit personnel Highway projects The relationship between construction unit personnel and highway projects
Survey and design personnel Highway projects The relationship between survey and design personnel and highway projects
Supervisors Highway projects The relationship between supervisors and road projects
Construction unit Constructor The management relationship between the construction unit and contractor
Supervisory unit Constructor The regulatory relationship between the supervisory unit and constructor
Construction personnel Construction Cooperation between construction personnel
personnel
Project Manager Construction The management relationship between the project manager and construction personnel
personnel
Highway projects Highway Output relationship between highway projects and highway

task. In this way, the construction of the highway is realized. Hunan Provincial Transportation Planning and Survey Design
During the road construction process of this project, the rela- Institute, responsible for the preliminary survey and design of
tionship between the road project and the personnel of each highway projects. The construction unit of the project is Shuifa
unit, the management and cooperation relationship between (Hunan) Transportation Construction Group Co., Ltd. The
the relevant personnel, and the management relationship whole highway project has only one construction contract sec-
between units occurred. Detailed highway construction entity tion. Zhang is the project manager of the construction unit,
relationships are described in Table 4. managing the road construction activities and cooperating
with the construction personnel, such as Chen, to complete
4.3. Association network of highway construction entities the highway project.Zhou belongs to Changsha Nuclear Engi-
neering Supervision Consulting Co., LTD., to supervise and
In the process of highway construction, the division of labor is manage the construction units, to ensure the quality of the
quite different, so the personnel entity is divided into five cat- highway. Therefore, the knowledge map of highway construc-
egories: construction personnel, project manager, supervisors, tion data association can be obtained, as shown in Fig. 4.
survey and design personnel, and construction unit personnel. Through the mapping of highway construction data associ-
The construction personnel are mainly responsible for carrying ation knowledge map and the relationship between entities, the
out construction activities and producing highway projects. different data are associated with their corresponding partici-
Project manager is responsible for the management of the pants. The highway construction data correlation knowledge
whole construction activities. Supervisors are mainly responsi- mapping system sorts out the pulse of data generation. Pro-
ble for supervising the construction and checking the quality of vides visualization support to effectively display the correla-
the project. The construction unit personnel are responsible for tion of data. It also provides direction for the application of
the preliminary project establishment, project management big data in the construction of smart highway.
and completion acceptance. Survey and design personnel are
responsible for survey and design work. The current highway 5. Research on key data of highway construction based on
construction project involves four types of participants includ- entropy-cloud model
ing constructors, survey and design units, construction units
and supervision units. Since different participants have its To evaluate the key data of highway construction, it is neces-
own specificity, the participants are set up as 4 different enti- sary to establish the key data evaluation indicator system first.
ties. Combined with the description of the relationship of high- Through extensive reading of highway construction perfor-
way construction entities, the different entities are associated. mance evaluation literature and related literature, systemati-
By output, associate highway projects with highway construc- cally sort out the content of the literature. Ultimately, the
tion. Associate the personnel with the corresponding partici- seven management aspects of quality, schedule, cost, informa-
pant through affiliation. Through supervision and tion, safety, environment and risk are used as evaluation indi-
management, the supervisory unit personnel and the supervi- cators [12–16], to construct a key data evaluation indicator
sory unit are associated with the highway project and the con- system for highway construction.
struction unit respectively. Based on this, a network of association relationships among highway construction entities can be constructed, as shown in Fig. 3.
Taking the G107 Yizhang Bypass as an example, the data are correlated and analyzed on the basis of this network of relationships among highway construction entities. Yizhang County Shuntong Traffic Co., Ltd. is the construction unit of the project. Zhong is in charge of the project and is solely responsible for the highway project. Song is attached to the

Fig. 3 Network of road construction entity linkages.

Fig. 4 Knowledge map of highway construction data association.

Based on Chapter 4, the 12 main types of data involved in the highway construction process, from project data to project settlement and final account data, are clarified. The authors invited 22 experts. Drawing on years of research and practical experience, the experts scored the importance of the above 12 types of highway construction data against each of the seven evaluation indicators. The scoring uses a five-level scale: unimportant, less important, generally important, more important, and especially important, with scores of 1, 2, 3, 4, and 5, respectively. The invited experts come from universities and from the various project participants and cover a wide range of areas, which supports the reliability of the results. By normalizing the initial key data evaluation scores (see Table 5), the entropy value and weight of each evaluation indicator can be calculated (see Table 6).
Based on Table 6, security management has the highest weight, 0.1810. That is, the importance of the 12 types of data differs widely with respect to security management: the dispersion of this indicator is large, its influence on the evaluation results is large, and so its weight is large. Information management has the smallest weight, 0.0826. That is, the importance of the 12 types of data with respect to information management is relatively similar: the dispersion of this indicator is small, it has the least influence on the evaluation results, and so its weight is the smallest.
In order to describe the importance of individual indicators graphically and more intuitively, this paper uses the normal cloud generator to generate 3000 cloud drops at each of the different levels. The reverse cloud generator is then used to calculate the cloud numerical characteristics corresponding to the indicators. Taking quality management, cost management and safety management as examples, the evaluation cloud diagram is drawn (see Fig. 5).
Combining the evaluation of the initial data, the cloud characteristic values of the indicator evaluation levels and the indicator weights in a weighted calculation gives the evaluation results for the highway construction key data (see Table 7).
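The normal and reverse cloud generators mentioned above can be sketched as follows. This is a minimal illustration using the commonly cited formulation of the normal cloud model with characteristics (Ex, En, He); the function names, the seed and the example values Ex = 4.0, En = 0.5, He = 0.05 are assumptions for illustration and are not taken from the paper.

```python
import math
import random

def forward_cloud(ex, en, he, n_drops=3000, seed=0):
    """Forward normal cloud generator: return n_drops (x, membership) pairs."""
    rng = random.Random(seed)
    drops = []
    for _ in range(n_drops):
        # Per-drop entropy En' ~ N(En, He^2); fall back to En if it is exactly 0.
        en_prime = abs(rng.gauss(en, he)) or en
        # Drop position x ~ N(Ex, En'^2) and its membership degree.
        x = rng.gauss(ex, en_prime)
        mu = math.exp(-((x - ex) ** 2) / (2 * en_prime ** 2))
        drops.append((x, mu))
    return drops

def backward_cloud(xs):
    """Reverse cloud generator (membership degrees not required).

    Estimates (Ex, En, He) from drop positions only.
    """
    n = len(xs)
    ex = sum(xs) / n
    en = math.sqrt(math.pi / 2) * sum(abs(x - ex) for x in xs) / n
    s2 = sum((x - ex) ** 2 for x in xs) / (n - 1)  # sample variance
    he = math.sqrt(abs(s2 - en ** 2))
    return ex, en, he

# Hypothetical level centred on score 4, with 3000 drops as in the text.
drops = forward_cloud(ex=4.0, en=0.5, he=0.05)
ex_hat, en_hat, he_hat = backward_cloud([x for x, _ in drops])
```

With 3000 drops the recovered Ex and En should lie close to the values used to generate the cloud, while the estimate of He is noticeably noisier.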

Table 5 Highway construction key data evaluation after normalization.

Data                                  Quality        Progress       Cost           Information    Security       Environmental  Risk
                                      management     management     management     management     management     management     management
Project data                          0.5217         0.4444         0.4615         0.4118         0.3600         0.4583         0.4706
Survey and design data                0.8696         0.2778         0.3846         0.6471         0.7200         0.5833         0.9412
Land acquisition and relocation data  0.0000         0.7778         0.0000         0.0000         0.0400         0.3333         0.2353
Bidding data                          0.0435         0.0000         0.2308         0.4118         0.2000         0.2500         0.1765
Project contract data                 0.6087         0.6667         0.4615         0.7059         0.4400         0.5417         0.6471
Measurement payment data              0.2609         0.3889         1.0000         0.5882         0.0000         0.0417         0.5294
Progress management data              0.4348         1.0000         0.2308         0.9412         0.6000         0.5000         0.4118
Construction data                     1.0000         0.8333         0.6154         1.0000         0.8000         1.0000         0.9412
Supervision data                      1.0000         0.9444         0.6154         0.9412         0.8000         1.0000         1.0000
Safety production data                0.2609         0.3333         0.0000         0.4706         1.0000         0.3750         0.4118
Completion data                       0.8261         0.8333         0.5385         1.0000         0.6400         0.9167         1.0000
Project settlement                    0.2174         0.0000         0.6923         0.4706         0.0400         0.0000         0.0000

Table 6 Entropy value and weight value of each evaluation indicator.

Evaluation indicators  Quality        Progress       Cost           Information    Security       Environmental  Risk
                       management     management     management     management     management     management     management
Entropy value          0.8873         0.8940         0.8923         0.9425         0.8741         0.8984         0.9160
Weight value           0.1621         0.1524         0.1549         0.0826         0.1810         0.1462         0.1208
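The step from entropy values to weights in Table 6 can be illustrated with the standard entropy weight formulas. The sketch below assumes the usual convention that a zero entry contributes nothing to the entropy and that each weight is proportional to one minus the entropy; the function names are illustrative and not from the paper.

```python
import math

def column_entropy(column):
    """Entropy of one indicator column, scaled to [0, 1] by ln(n)."""
    total = sum(column)
    entropy = 0.0
    for value in column:
        p = value / total
        if p > 0:  # assumed convention: 0 * ln(0) = 0
            entropy -= p * math.log(p)
    return entropy / math.log(len(column))

def entropy_weights(entropies):
    """Weights proportional to (1 - entropy), normalised to sum to 1."""
    divergence = [1.0 - e for e in entropies]
    total = sum(divergence)
    return [d / total for d in divergence]

# Entropy row of Table 6, in the table's column order.
table6_entropy = [0.8873, 0.8940, 0.8923, 0.9425, 0.8741, 0.8984, 0.9160]
weights = entropy_weights(table6_entropy)
for name, w in zip(
    ["Quality", "Progress", "Cost", "Information",
     "Security", "Environmental", "Risk"],
    weights,
):
    print(f"{name:14s}{w:.4f}")
```

Computed from the rounded entropies, the weights agree with the weight row of Table 6 to about three decimal places; small differences in the fourth decimal are to be expected from rounding.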

Fig. 5 Evaluation cloud map.



Table 7 Road construction data rating results.

Data                                           Not         Less        General     More        Special     Assessment
                                               important   important   importance  important   importance  results
Project data                                   0.0000      0.0011      0.1756      0.6078      0.0931      More important
Survey and design data                         0.0000      0.0002      0.0658      0.5227      0.3681      More important
Land acquisition and demolition data           0.0000      0.0227      0.6255      0.2332      0.1099      General importance
Bidding data                                   0.0000      0.0173      0.6330      0.2493      0.0183      General importance
Project contract data                          0.0000      0.0003      0.0830      0.7127      0.1638      More important
Measurement of payment data                    0.0003      0.0663      0.2987      0.3383      0.0274      More important
Progress management data                       0.0000      0.0004      0.1071      0.5861      0.1568      More important
Construction data                              0.0000      0.0000      0.0020      0.3087      0.7951      Special importance
Supervisory data                               0.0000      0.0000      0.0017      0.2794      0.7393      Special importance
Safety production data                         0.0000      0.0029      0.2667      0.4028      0.0292      More important
Completion data                                0.0000      0.0000      0.0050      0.4931      0.7068      Special importance
Engineering settlement and final account data  0.0006      0.0840      0.5232      0.1399      0.1139      General importance
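The assessment column of Table 7 is consistent with picking, for each type of data, the level with the largest value. The paper does not state this rule explicitly, so the sketch below treats it as an assumption; the function name and the choice of rows are illustrative.

```python
LEVELS = [
    "Not important",
    "Less important",
    "General importance",
    "More important",
    "Special importance",
]

def assess(level_values):
    """Return the level whose value is largest (assumed decision rule)."""
    best = max(range(len(LEVELS)), key=lambda i: level_values[i])
    return LEVELS[best]

# A few rows copied from Table 7.
table7_rows = {
    "Project data": [0.0000, 0.0011, 0.1756, 0.6078, 0.0931],
    "Bidding data": [0.0000, 0.0173, 0.6330, 0.2493, 0.0183],
    "Measurement of payment data": [0.0003, 0.0663, 0.2987, 0.3383, 0.0274],
    "Construction data": [0.0000, 0.0000, 0.0020, 0.3087, 0.7951],
}
results = {name: assess(values) for name, values in table7_rows.items()}
```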

As can be seen from Table 7, the data rated as generally important are land acquisition and demolition data, bidding data, and project settlement and final account data. The more important data are project data, survey and design data, project contract data, measurement and payment data, progress management data, and safety production data. The especially important data are construction data, supervision data and completion data. Therefore, the key data in highway construction are construction data, supervision data and completion data. Key data play a significant role in project construction. When big data is applied to highway construction, the control of key data should therefore be strengthened so as to improve the highway big data application system.

6. Establishment and verification of unbalanced data mining model

6.1. Theoretical basis

In essence, highway project archives supervise the quality of construction. They record all kinds of traffic engineering information by means of images, text, diagrams and so on, and this information is later summarized and organized. During the construction and completion of the project, the project information is the most original record of the project. It contains all the information about the project and is the information basis for the construction and maintenance audit of the project. The traffic management department can analyze the archives to grasp more engineering facts, making information feedback effective and timely. In addition, traffic archives can reflect various facts and help predict construction risks in advance, which makes it easier to take corresponding measures to strengthen maintenance and ultimately promotes the orderly development of transportation.
The related data in the highway project archives show unbalanced characteristics, which are mainly reflected in the following three aspects:

(1) The distribution of raw data is unbalanced. In one case, the project archive samples are relatively balanced in number but unbalanced across projects; for example, the distribution of traffic signals is either very concentrated or very scattered. In the other case, both the number of raw data and their distribution across projects are unbalanced. Both situations often occur in traffic construction, which requires a high degree of granularity and accuracy in project data analysis and mining.
(2) A large number of small parts exist in highway construction projects. These parts are widely distributed, small in size and used for different purposes. Compared with large parts or instruments, they are easily overlooked and are prone to errors and omissions during construction. Because such errors and omissions can have a significant impact on project safety, the model is required to have accurate identification and classification performance.
(3) Key data account for a relatively small percentage of the project data. However, because they are the part of the project data with the greater impact, more attention should be paid to their analysis in data mining.

Methods for mining unbalanced data in highway project archives work at two levels: the data level and the algorithm level. This paper mainly analyzes and processes the unbalanced data in the highway project archives at the algorithm level. By effectively classifying unbalanced data samples, the specific value of the data in the project archive can be better utilized. Unbalanced data means that the number of samples in one category is far greater than that in another category. Usually, the category with the larger number of samples is called the majority class, and the other is called the minority class. Currently, minority class detection and learning based on unbalanced data have become cross-disciplinary challenges.
Based on the unbalanced characteristics of traffic project data, the following hypotheses are proposed:
Hypothesis 1: The unbalanced data of the highway project archives considered in this paper are mainly reflected in the
damage rate of the facilities. The number of facilities damaged within a certain number of years is low, and the number that remain intact is high.
Hypothesis 2: The dataset samples considered in this paper are not correlated with each other.

6.2. Model construction

Highway project archives are huge and complex, and the distribution of data in them is highly unbalanced. Traditional classification methods are mainly designed for balanced data. For unbalanced datasets, minority category samples are often ignored in favor of the majority category, which reduces the accuracy of classification. Traditional data mining methods likewise tend to ignore categories with few samples, which not only affects the accuracy of data mining but also fails to reflect the value of the data. Suppose there are 1000 highway monitors, of which 10 are damaged and the remaining 990 are working properly, a damage rate of 1 %. Even if all the damaged samples are incorrectly classified as intact, the classification accuracy on the whole dataset is still as high as 99 %. In practice, minority categories cannot be ignored, and the cost of misclassifying a minority category sample as majority category is usually much higher. For unbalanced datasets, risks in project construction cannot be accurately predicted if samples are assigned to the wrong category. This increases the cost of the project and may even delay construction progress. Therefore, unbalanced data mining in highway project archives focuses on the classification of data in the sample dataset.
Unbalanced data in the highway project archives are data with an unbalanced category distribution. The imbalance rate can be defined as:

$$IR = \frac{N^{+}}{N^{-}}$$

where $N^{+}$ denotes the number of majority category samples and $N^{-}$ denotes the number of minority category samples. When IR > 1, the dataset is unbalanced. At present, unbalanced data exist widely in the construction of large-scale engineering projects, so research on unbalanced data mining has become a current hot issue.
In practical applications, unbalanced data classification is an urgent problem. The current ability to categorize unbalanced data effectively is not ideal. Traditional classification algorithms and some evolutionary-based ensemble methods can only be trained using an iterative process. Highway project archives are particularly large, and iterations can lead to an order-of-magnitude increase in memory consumption. The operation steps of the Boosting integration algorithm are as follows (see Fig. 6):

1) Random sampling of data. In each round, n training samples are drawn from the original data set with replacement, and m training sets are obtained in m rounds (the training sets are independent of each other).
2) Training. One training set trains one model, so m training sets yield a total of m base models.
3) Aggregating. Use the m base models to predict the test set and aggregate the m prediction results.

Fig. 6 Boosting integration algorithm flow. [Figure boxes: Training Examples; Test Sample 1, 2, 3; Learning Algorithm (three instances); Classifier 1, 2, 3; New Data, Combined Classifiers, Prediction.]

6.3. Model validation

To verify the performance of the model proposed in this paper, data from three highway projects are analyzed and integrated: the Wenshang County Transportation Infrastructure Construction PPP Project, the Guigang City Nine Roads and Two Bridges Highway Project and the Zhushan County Transportation Infrastructure PPP Project. First, the properties of the unbalanced data sets in each project are examined. Then the accuracy, F value and G-mean of the proposed model are compared with those of the traditional SMOTE oversampling algorithm to analyze the effectiveness of the proposed model.

6.3.1. Base data

The total length of the Wenshang County Transportation Infrastructure Construction PPP Project is 4989 km, with a total investment of 373,442,400 yuan. Highway project files play an important role in the construction of this project. Some of the more common infrastructure on highways are street lights, barrier fences, guardrails, signage and lane monitoring. According to the project implementation plan, a street light is installed every 50 m on both sides of the road, for a total of 2 million. Similarly, the number of other infrastructures can be derived (all units are recorded in tens of thousands), and the unbalanced data information is shown in Table 8.
The Guigang City Nine Roads and Two Bridges Highway Project is a highway and bridge project consisting of 11 bundled and packaged projects, including S511 Jiangnan Industrial Park in Guigang City to the boundary of bridge and embankment. The total length is 147.35 km, and the total investment amounts to 4,110,700,000 yuan. By analyzing the cross-sectional design drawings in the implementation plan of the Guigang City Nine Roads and Two Bridges Highway Project, the following six infrastructure quantities can be obtained.
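The three operation steps listed in Section 6.2 (sampling with replacement, one base model per training set, aggregation of the predictions) can be sketched as below. As listed, these steps match what is usually called bagging (bootstrap aggregating); boosting in the usual sense instead trains its models sequentially on reweighted samples. The paper does not name a base learner, so the one-feature decision stump, the toy facility-age data and all function names are assumptions made only for illustration.

```python
import random
from collections import Counter

def bootstrap_sample(data, n, rng):
    """Step 1: draw n training samples with replacement."""
    return [rng.choice(data) for _ in range(n)]

def train_stump(sample):
    """Step 2: fit a one-feature threshold classifier (decision stump).

    sample is a list of (x, label) pairs with label 0 (intact) or 1 (damaged).
    """
    best = None
    for threshold in sorted({x for x, _ in sample}):
        for polarity in (0, 1):
            # Predict `polarity` when x >= threshold, the other class otherwise.
            errors = sum(
                (polarity if x >= threshold else 1 - polarity) != y
                for x, y in sample
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, polarity)
    _, threshold, polarity = best
    return lambda x: polarity if x >= threshold else 1 - polarity

def fit_ensemble(data, m, n, seed=0):
    """Steps 1 and 2 repeated m times: m training sets, m base models."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(data, n, rng)) for _ in range(m)]

def predict(models, x):
    """Step 3: aggregate the m predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical data: facility age in years; only the two oldest are damaged.
data = [(age, 0) for age in range(1, 19)] + [(19, 1), (20, 1)]
models = fit_ensemble(data, m=25, n=len(data))
```

Because each training set is drawn independently, the m base models can be trained in parallel, and a single bootstrap sample that happens to contain no damaged facility is outvoted in the aggregation step.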
The unbalanced data information for this project is shown in Table 9.

Table 8 Attributes of unbalanced dataset of Wenshang County Transportation Infrastructure Construction PPP Project.

Unbalanced data sets            Sample size (million)  Imbalance rate
Street lights                   200                    1975:1
Isolation fence                 131                    172:1
Protective fence                653                    200:1
Alarm telephone installations   12                     367:1
Signage                         106                    579:1
Lane monitoring                 64                     926:1

Table 9 Attributes of unbalanced dataset of Guigang City Nine Roads and Two Bridges Highway Project.

Unbalanced data sets            Sample size (million)  Imbalance rate
Street lights                   1.47                   923:1
Isolation fence                 1.13                   232:1
Protective fence                1.07                   134:1
Alarm telephone installations   0.04                   265:1
Signage                         1.02                   369:1
Lane monitoring                 0.52                   826:1

The Zhushan County Transportation Infrastructure PPP Project mainly includes five sub-projects, such as the tourism road around the reservoir in the Pankou reservoir area of Zhushan County. The total length is 203.104 km, and the total investment estimate is 466.90 million yuan. By analyzing the project descriptions in the implementation plan of Zhushan County's transportation infrastructure PPP projects, the following six types of infrastructure quantities can be obtained. The unbalanced data information of this project is shown in Table 10.

Table 10 Attributes of unbalanced dataset of Zhushan County Transportation Infrastructure PPP Project.

Unbalanced data sets            Sample size (million)  Imbalance rate
Street lights                   2.87                   826:1
Isolation fence                 2.34                   267:1
Protective fence                2.05                   165:1
Alarm telephone installations   0.06                   433:1
Signage                         2.16                   405:1
Lane monitoring                 2.19                   792:1

The imbalance rates in Tables 8, 9 and 10 are assumed to be the rates of damage to the infrastructure over a certain number of years. The number of damaged facilities is much smaller than the number of intact ones, which leads to unbalanced data. The five datasets are ranked in order of imbalance rate from smallest to largest. The distribution between positive and negative samples in these three data sets is extremely unbalanced: the imbalance rate ranges from 172 to 1975 for the Wenshang County project, from 134 to 923 for the Guigang City project and from 165 to 826 for the Zhushan County project. For the unbalanced data in the highway project archives, the accuracy on the minority category samples should be given priority, provided that the classification accuracy on the whole data set is assured.

6.3.2. Data analysis

Classification evaluation indicators influence how the overall effectiveness of a classification model is judged. If only accuracy is used to evaluate the model, the accuracy on the minority class samples is likely to be ignored, and unbalanced data cannot be analyzed well. The F-value is the weighted harmonic mean of precision and recall, which pays more attention to the minority class samples. G-mean is a commonly used indicator in unbalanced classification problems. In this paper, the precision, recall and F value of the majority category are denoted precision0, recall0 and F0, and those of the minority category are denoted precision1, recall1 and F1. The results are shown in Table 11.
It can be seen from Table 11 that the average accuracy, F1 and G-mean of the traditional SMOTE algorithm are all below 0.8, while its average F0 reaches 0.8392. This indicates that the traditional SMOTE algorithm is suitable for the majority category but relatively poor at classifying minority categories. In contrast, the average values of these three evaluation indexes for the Boosting algorithm proposed in this paper are all above 0.85, and its average F0 is 0.9021, which shows that the Boosting algorithm is also very effective in classifying the majority category. Under the Boosting algorithm, the F1 value for the isolation fence data is as high as 0.9339, and the G-mean value for street lights is 0.9321. This shows that the Boosting algorithm greatly improves the stability of unbalanced data classification.
Combined with the above data, it can be seen that the Boosting algorithm improves unbalanced data mining as a whole, both preserving the minority class sample data and improving the efficiency of data mining. It can save project construction costs, improve the quality of the information extracted from project archives, and reduce project risks. It also supports better allocation of resources in construction decisions, thus meeting the requirements that ongoing development places on highway project archive management.

7. Case study of key data integration of highway projects based on intelligent management platform

This part takes the Yunnan Shizong Qiubei Expressway as an example to verify and explain the key data sources and responsibility traceability of the highway project, and analyzes the data integration principle of the intelligent control platform from the management perspectives of progress, quality and safety. The total length of the project route is 91 km, with an investment estimate of 17.976 billion yuan. The roadbed width is 25.5 m, with two-way 4-lane construction. The Wenshan section of the route is 21 km long, with an estimated investment of 3.431 billion yuan. There are two flyovers, Wenliu and Shuanglongying, and the bridge-tunnel ratio is 54.2 %.

Table 11 Evaluation metrics of the traditional SMOTE algorithm and the Boosting algorithm on the experimental dataset.

                               SMOTE algorithm                         Boosting algorithm
Unbalanced data sets           Accuracy  F0        F1        G-mean    Accuracy  F0        F1        G-mean
Street lights                  0.8324    0.7938    0.8821    0.7435    0.8932    0.9213    0.8794    0.9321
Isolation fence                0.7745    0.9123    0.8657    0.8744    0.9532    0.9254    0.9339    0.9276
Protective fence               0.7322    0.8653    0.7587    0.7836    0.8922    0.8891    0.9198    0.8736
Alarm telephone installations  0.6435    0.8237    0.6217    0.6041    0.8452    0.8348    0.8126    0.8627
Signage                        0.5974    0.7611    0.7823    0.7351    0.8453    0.8672    0.9046    0.8477
Lane monitoring                0.7959    0.8792    0.8635    0.8945    0.8535    0.9748    0.8693    0.8957
Average                        0.7293    0.8392    0.7957    0.7725    0.8804    0.9021    0.8866    0.8899
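The indicators reported in Table 11 can be illustrated with the 1000-monitor example from Section 6.2. The sketch below assumes the common definitions: the F value as the balanced harmonic mean of precision and recall, and G-mean as the square root of the product of the two class recalls. The paper does not spell out its formulas, so these definitions, the function names and the second confusion matrix are assumptions for illustration.

```python
import math

def safe_div(numerator, denominator):
    """Return 0.0 instead of failing when a class is never predicted."""
    return numerator / denominator if denominator else 0.0

def evaluate(tp, fn, fp, tn):
    """tp/fn count minority (class 1) samples, tn/fp majority (class 0)."""
    precision1 = safe_div(tp, tp + fp)
    recall1 = safe_div(tp, tp + fn)
    precision0 = safe_div(tn, tn + fn)
    recall0 = safe_div(tn, tn + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + fn + fp + tn),
        "F0": safe_div(2 * precision0 * recall0, precision0 + recall0),
        "F1": safe_div(2 * precision1 * recall1, precision1 + recall1),
        "G-mean": math.sqrt(recall0 * recall1),
        "IR": safe_div(tn + fp, tp + fn),  # majority count / minority count
    }

# 990 intact and 10 damaged monitors, every one classified as intact.
all_intact = evaluate(tp=0, fn=10, fp=0, tn=990)

# Hypothetical classifier that finds 8 of the 10 damaged monitors
# at the price of 20 false alarms.
detector = evaluate(tp=8, fn=2, fp=20, tn=970)
```

The first classifier reaches 99 % accuracy while its F1 and G-mean are both zero; the second has lower accuracy but a much higher G-mean, which is why accuracy alone is a poor guide on data with an imbalance rate of 99:1.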

These works are invested in and constructed by Yunnan Communications Investment Group. The intelligent management platform of the project combines digital visualization, intelligent collection and electronic file management to realize the full integration and utilization of key project data. Fig. 7 shows the working architecture of the intelligent control platform.
From the data assessment in Chapter 5, it is known that the key data in the construction of highway projects are project data, survey and design data, progress management data, construction data and supervision data, and completion data. Therefore, it is necessary to design functions that can improve the collection and processing efficiency of these critical data in the intelligent management platform.

7.1. Planning stage

Project approval data and survey and design data are mainly provided by Yunnan Trading Group. These data include the basic information, preliminary planning, and survey and design results of the highway project, which are the basis for the smooth implementation of the project. As important data, project approval data and survey data need to be recorded and used effectively. When managers input data on the platform, the platform uses big data analysis technology to perform online intelligent auditing of survey and design data. A library of standard templates for highway construction survey and design documents is established to manage the standard forms for survey and record documents, so that the forms used are unified. Data formats are standardized through online filling examples. Then, through the intelligent audit mode, data are double-checked, which improves their accuracy and utilization. The information intelligent management and control platform has changed the traditional situation in which survey and design data were audited only manually, reducing inaccurate manual audits.

Fig. 7 Intelligent management platform working architecture. [Diagram labels: entrance to the Highway Project Master Control System (basic information management, BIM+GIS visual management, digital business supervision, data supervision, business system entrance; data standard, data assets); business integration master data (project, company and personnel information, divide code) and business result master data (progress, quality, cost and safety data, big data integration), created by the project integration platform, stored synchronously in the intermediate docking theme database and pushed to the statistics board through MQ message queues, with subject data exchanged with each business subsystem through a JSON interface; business management subsystems (integrated management, investment control, plan progress, land acquisition and demolition, contract measurement, quality management, security management, electronic record).]
7.2. Construction stage

The data generated during the project construction process are the main data for project progress management and construction. These data are mainly recorded by the builder and the construction manager.
In the construction of the Yunnan Shizong Qiubei Expressway project, video monitoring and 3D laser scanning equipment are installed at the construction site. The actual completion status of the project is captured by the 3D laser scanning equipment, and the data are transmitted back to the platform. Combined with big data analysis and processing technology and BIM technology, a panoramic image of the site is restored intelligently, which directly reflects the progress of the construction site. The comprehensive comparison on the platform's big data kanban board allows the construction progress to be viewed at any time, which helps managers to control the real progress of the construction site.
For construction data, the platform converts 2D drawings into 3D models and combines them with the construction scheme to produce construction animations, which visually show the specific construction operations. According to different project parts and process requirements, different QR codes are made and pasted on the construction site. Construction personnel can obtain the corresponding process introduction and specific operation method by scanning the QR code. The animations help construction personnel become familiar with the construction process, which greatly helps to improve construction quality. The platform also manages the construction log. Through the combination of the mobile app and the platform, the construction log can be uploaded at the construction site using a mobile phone or tablet. The completion of the main parts is recorded with photos accompanied by text, pictures and images. The specific completion of construction can be traced through the uploaded daily construction logs. If problems are found in follow-up work, the data can be revised uniformly on the platform in time. This enhances the intelligence of construction data management.
The supervision unit can view the whole construction site on the platform through the records of the 24-hour omnidirectional cameras on site, combined with the remote video intelligent monitoring system. The supervision unit can check the construction status of different parts at any time, making supervision methods more diverse and intelligent. By building an information intelligent management platform, the intelligent management of supervision can be realized, thus improving the quality of supervision data.
Highway construction is characterized by a long construction period and complex processes, and the management of progress, construction quality and supervision is often inadequate. Traditional data collection relies on text, photos and so on. In this project, however, the intelligent management platform uses two kinds of 3D technology, construction process QR codes, construction log upload and a 24-hour remote monitoring system to achieve real-time records in progress management. Starting from the construction process, it reduces construction quality risks and improves the comprehensiveness of the rectification issues raised by the supervision unit. In addition, the platform enables intelligent sharing of progress data, construction data and supervision data. It reduces the workload of the supervisors and the construction units and enhances communication between them. This has a very positive effect on improving construction data as well as supervision data.

7.3. Completion stage

The completion data include all the data generated from the preparation of the highway project to its completion. It is therefore difficult to organize the completion data, and they are easily biased by human factors. This reduces the quality of the completion data and can cause highway projects to fail the completion requirements. Therefore, it is necessary to build an information intelligent management platform, build a database, and realize the intelligent classification of data. This can improve the quality of completion data while reducing manpower and material resources.

7.4. Data accountability traceability

In construction projects, the wide range of data sources and the difficulty of managing them have become a major problem in project management. Therefore, data accountability traceability, as an effective method to reduce the difficulty of data management, needs to be studied further. The data in the Yunnan Shizong Qiubei Expressway project are provided by many different units. The construction unit provides data at the beginning of construction, such as project data and project contract data, and also provides data at the completion stage, such as measurement and payment data and project final account data. The survey and design units provide survey and design data. The construction unit provides construction data, safety data, and other data generated during the construction phase of the project. The supervisory unit provides supervision data.
The intelligent management platform divides the data sources and assigns different managers to exercise dedicated control over the data. The system administrator is mainly responsible for the input and management of the basic project data provided by the construction unit and the survey and design unit. The implementation consulting engineer mainly divides the construction drawings and processes to clarify the construction schedule of the project. The construction manager imports the project construction data for inspection and management of the project by the supervision unit and the construction unit. The supervisory unit transmits the project supervision data to the platform and supervises whether the construction progress and the quality of the construction work are qualified. Fig. 8 shows the overall framework of data sources for the project.

Fig. 8 Overall framework of project data sources. [Diagram labels: project preparation (construction unit and survey and design unit supply basic data; the system administrator unifies the formwork base data for the project); planning stage (construction drawings; the implementation consulting engineer divides the project structure according to the formwork and forms quality assurance task plans); process supervision stage (the construction manager takes on-site photos and records, registers progress and imports them into the system; the owner and supervisor check the quality of the project and the quality assurance results, qualified or unqualified, before construction continues); acceptance stage (the chief supervising engineer signs off on divisional works).]

8. Conclusion

The lack of awareness of key data in traditional highway construction hinders the efficiency improvement of intelligent management of highway projects in the context of Industry 4.0. Today, with extensive urbanization, the scope of public transport construction has expanded from the urban center to the urban edges. With large environmental uncertainty and a large number of construction projects, the volume of highway project archival data that needs to be recorded also increases. With the rapid development of cloud computing, big data and intelligent transportation, there are more and more methods for collecting and analyzing highway project archive data. The archives management mode needs to be transformed step by step accordingly.
This paper analyzes the highway construction data and establishes the entity association relation network. Research on data association based on the knowledge graph is carried out, and 12 key data are determined. The key data evaluation model of highway construction is constructed by combining the entropy weight method with the cloud model method. Then, the unbalanced data in the project archives are compared and analyzed by discussing the composition structure of highway project data. A classification mining method for unbalanced data based on the Boosting ensemble algorithm is proposed, which can mine the hidden value of the highway project archive data. Finally, combined with the practical case analysis, the paper puts forward suggestions on key data integration and quality control for the information intelligent management and control platform, as well as a strategy for promoting intelligent applications. For example, 3D laser scanning, video monitoring and other equipment are used to restore the panoramic image of the construction site to achieve integrated progress management data; daily construction logs based on WBS templates and intelligent animation demonstrations are uploaded to improve the quality of construction and cost data; and the key data are mined and displayed on the project big data BI Kanban board, so that, through the real-time comprehensive comparison of site progress, quality control progress and data completion, managers can realize multi-objective integrated control of the project.
The key data integration and imbalance analysis proposed in this paper can help project personnel to understand the project progress quickly and clearly and to find the processes that need to be rectified. It not only saves time for constructors and managers to understand the actual situation of
the project, but also spares managers from repeatedly checking the same process, reducing rework time and unnecessary consumables. Thus, it improves project management efficiency and reduces labor utilization and project costs. However, this paper only validates the accuracy of the Boosting ensemble algorithm in data prediction, and does not classify the transient data in each phase of the highway project for unbalanced analysis. In future research, the key transient data generated by the highway project can be refined, and the data value can be further mined based on the unbalanced algorithm to achieve the data prediction function.

Funding

This research was supported by the Science and technology innovation special fund project of Fujian Agriculture and Forestry University (Grant No. CXZX2022024) and the Fujian Province Innovation Strategy (Soft Science) Research Project (2021R0029).

Ethics declarations

Disclosure of potential conflicts of interest: Not applicable.
Research involving Human Participants and/or Animals: Not applicable.
Informed consent: Not applicable.
Ethics approval: Not applicable.
Consent to participate: Not applicable.
Consent for publication: Not applicable.
Funding Statement: This work was supported in part by the Science and technology innovation special fund project of Fujian Agriculture and Forestry University (Grant No. CXZX2022024) and the Fujian Province Innovation Strategy (Soft Science) Research Project (2021R0029).

CRediT authorship contribution statement

Yinglin Wang: Conceptualization, Methodology, Validation, Formal analysis, Data curation, Writing – original draft, Writing – review & editing, Supervision. Jiaxin Zhuang: Validation, Data curation, Writing – original draft. Guowei Zhou: Validation, Investigation, Resources, Visualization. Shuhui Wang: Conceptualization, Investigation, Visualization.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] C. Yuan, T. Mcclure, H. Cai, et al, Life-Cycle Approach to Collecting, Managing, and Sharing Transportation Infrastructure Asset Data[J], J. Constr. Eng. Manag. 143 (6) (2017) 04017001.1–04017001.15.
[2] G. Srivastava et al, A Pre-Large Weighted-Fusion System of Sensed High-Utility Patterns[J], IEEE Sens. J. 21 (14) (2021) 15626–15634.
[3] Z. Aziz et al, Leveraging BIM and Big Data to deliver well maintained highways[J], Facilities 35 (13–14) (2017) 818–832.
[4] A.M. Bazan et al, New Perspectives for BIM Usage in Transportation Infrastructure Projects[J], Appl. Sci.-Basel 10 (20) (2020) 7072.
[5] M. Stone, E. Aravopoulou, Improving journeys by opening data: the case of Transport for London[J], The Bottom Line 31 (1) (2018) 2–15.
[6] M. Grzenda, K. Kwasiborska, T. Zaremba, Hybrid short term prediction to address limited timeliness of public transport data streams[J], Neurocomputing 391 (2020) 305–317.
[7] M. Yap, O. Cats, B. van Arem, Crowding valuation in urban tram and bus transportation based on smart card data[J], Transport. A: Transp. Sci. 16 (1) (2020) 23–42.
[8] K.E. Zannat, C.F. Choudhury, Emerging Big Data Sources for Public Transport Planning: A Systematic Review on Current State of Art and Future Research Directions[J], J. Indian Inst. Sci. 99 (4) (2019) 601–619.
[9] Y. Vacanas, K. Themistocleous, A. Agapiou, et al, The combined use of Building Information Modelling (BIM) and Unmanned Aerial Vehicle (UAV) technologies for the 3D illustration of the progress of works in infrastructure construction projects[C], International Conference on Remote Sensing & Geoinformation of the Environment, International Society for Optics and Photonics 9688 (2016) 1–8.
[10] M. Babar, F. Arif, Real-time data processing scheme using big data analytics in internet of things based smart transportation environment[J], J. Ambient Intell. Hum. Comput. 10 (10) (2019) 4167–4177.
[11] G.S. Yovanof, G.N. Hazapis, An Architectural Framework and Enabling Wireless Technologies for Digital Cities & Intelligent Urban Environment[J], Wirel. Pers. Commun. 49 (3) (2009) 445–463.
[12] K. Choi, I. Jung, Y. Yin, et al, Holistic Performance Evaluation of Highway Design-Build Projects[J], J. Manag. Eng. 36 (4) (2020) 04020024.
[13] N. Khademi, A.A. Choupani, Investigating the road safety management capacity: Toward a lead agency reform[J], IATSS Research 42 (3) (2018) 105–120.
[14] Y. Chen, J.J. Li, Performance Evaluation of Construction Materials Management in Expressway Project[J], Adv. Mat. Res. 179–180 (2011) 475–481.
[15] Stevenson, et al, Workplace road safety risk management: An investigation into Australian practices[J], Accident Analysis and Prevention 98 (2017) 64–73.
[16] J.O. Han, J.Y. Park, H.B. Kim, et al, Performance Evaluation of JPCP with Changes of Pavement Mix Design Using Pavement Management Data[J], Adv. Civil Eng. (2019).
