Predicting Bus Passenger Flow and Prioritizing Influential Factors Using Multi-Source Data

NAME : VINEETH KUMAR.
ROLL NO : 110520504025.
GROUP : MSC(COMPUTER SCIENCE) 2ND YEAR.
COLLEGE : JAGRUTHI DEGREE & PG COLLEGE.
Predicting Bus Passenger Flow and Prioritizing Influential Factors Using Multi-Source Data:
Scaled Stacking Gradient Boosting Decision Trees
ABSTRACT
Accurate bus passenger flow prediction supports informed decisions and full
utilization of transit supply. Passenger flow is affected by a wide range of
attributes of the travel environment, which can be collected from multi-source
data. A successful prediction model should not only fully exploit the latent
knowledge hidden in multi-source data, but also address the resulting
multicollinearity issue. Based on this principle, we propose a novel scaled stacking
gradient boosting decision tree (SS-GBDT) model to predict bus passenger flow
with multi-source datasets.
SS-GBDT includes two modules:
• The prior feature-generation module and
• The subsequent GBDT-prediction module.
The prior module comprises a couple of base models with similar performance,
which generate several enhanced features from the multi-source data through a
stacking process. In particular, we devise a scaled stacking method by introducing
a quasi-attention-based mechanism. The model can also prioritize the factors that
influence passenger flow prediction. It is flexible and scalable, which enables the
integration of various influential factors in the presence of big data.
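The two-module pipeline described above can be sketched roughly as follows with scikit-learn. This is a minimal sketch under stated assumptions, not the SS-GBDT implementation itself: the source does not give the quasi-attention scaling formula, so a softmax over negative validation errors is used here as a stand-in, and the base models (Ridge, random forest), the helper name scaled_stacking_features, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import cross_val_predict, train_test_split


def scaled_stacking_features(base_models, X_train, y_train, X_test, cv=5):
    """Module 1 (sketch): append scaled base-model predictions as new features."""
    # Out-of-fold predictions keep training labels from leaking into the new features.
    oof = np.column_stack(
        [cross_val_predict(m, X_train, y_train, cv=cv) for m in base_models]
    )
    # ASSUMPTION: softmax over negative (normalized) errors stands in for the
    # quasi-attention weights; the source does not specify the actual formula.
    errors = np.array(
        [mean_absolute_error(y_train, oof[:, j]) for j in range(oof.shape[1])]
    )
    scores = -errors / errors.mean()
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Refit each base model on all training data to produce test-set features.
    test_preds = np.column_stack(
        [m.fit(X_train, y_train).predict(X_test) for m in base_models]
    )
    Z_train = np.hstack([X_train, oof * weights])
    Z_test = np.hstack([X_test, test_preds * weights])
    return Z_train, Z_test, weights


# Synthetic stand-in for a merged multi-source feature table (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 50 + 10 * X[:, 0] - 5 * X[:, 1] + rng.normal(scale=1.0, size=300)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Hypothetical base models for the feature-generation module.
base_models = [Ridge(alpha=1.0), RandomForestRegressor(n_estimators=50, random_state=0)]
Z_train, Z_test, weights = scaled_stacking_features(
    base_models, X_train, y_train, X_test
)

# Module 2 (sketch): a GBDT predicts passenger flow from the enhanced features.
gbdt = GradientBoostingRegressor(random_state=0).fit(Z_train, y_train)
y_pred = gbdt.predict(Z_test)
print("stacking weights:", weights)
print("test MAE:", mean_absolute_error(y_test, y_pred))
```

The stacked columns are appended to the original features rather than replacing them, so the GBDT stage can still draw on the raw multi-source attributes.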
EXISTING SYSTEM
In contrast to parametric approaches, non-parametric approaches build a
nonlinear relationship between the input variables and the output variables
without prior knowledge. Artificial neural network (ANN) models can handle the
complex relationships in datasets and have gained wide popularity in
transportation. However, a drawback of ANNs is the potential for over-fitting or
under-fitting. As another class of non-parametric models, support vector
machine (SVM) and support vector regression (SVR) models can potentially
overcome the drawbacks of neural networks and address the issues of nonlinearity,
small samples, high dimensionality, local minima and over-fitting. Marković et al.
In recent years, the advent and prevalence of deep learning models have attracted
considerable attention in the field of transportation. There are also a handful of
studies on passenger flow prediction using deep learning models. Liu and Chen [20]
developed a multi-stage deep learning architecture to forecast passenger flow
for bus rapid transit stations.
To overcome the drawbacks of single models and take advantage of different models,
an increasing number of researchers have developed hybrid models by integrating
different single models. One such method integrates empirical mode decomposition
and ANN. Ma et al. (2014) [28] presented an integrated approach with an interactive
multi-model pattern for short-term passenger demand forecasting.
Disadvantages
• The existing systems do not implement the novel scaled stacking gradient
boosting decision tree (SS-GBDT) model.
• The existing systems perform worse because they lack an implicit linkage
between features and predicted labels.
PROPOSED SYSTEM
The system proposes a novel scaled stacking gradient boosting decision tree
(SS-GBDT) model to predict bus passenger flow with multi-source datasets. SS-GBDT
includes two modules: the prior feature-generation module and the subsequent
GBDT-prediction module. The prior module comprises a couple of base models with
similar performance, which generate several enhanced features from the
multi-source data through a stacking process.
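Before either module can run, the multi-source data has to be joined into one feature table. The sketch below shows one way to do that in plain Python, keyed by service date. The three sources (daily passenger counts, weather, holiday calendar), their field names, their values, and the helper build_feature_rows are hypothetical illustrations; the source does not list which datasets are used.

```python
from datetime import date

# Hypothetical source 1: daily boardings from the fare-collection system.
passenger_counts = {
    date(2023, 1, 2): 1840,
    date(2023, 1, 3): 1795,
    date(2023, 1, 4): 1910,
}

# Hypothetical source 2: daily weather observations (one day is missing).
weather = {
    date(2023, 1, 2): {"temp_c": 4.0, "rain_mm": 0.0},
    date(2023, 1, 3): {"temp_c": 2.5, "rain_mm": 6.2},
}

# Hypothetical source 3: public-holiday calendar.
holidays = {date(2023, 1, 2)}


def build_feature_rows(counts, weather_by_day, holiday_days):
    """Join the sources on date and return one feature row per complete day."""
    rows = []
    for day in sorted(counts):
        if day not in weather_by_day:
            continue  # skip days for which a source has no record
        rows.append(
            {
                "date": day,
                "weekday": day.weekday(),  # 0 = Monday
                "is_holiday": int(day in holiday_days),
                **weather_by_day[day],
                "passengers": counts[day],  # prediction target
            }
        )
    return rows


rows = build_feature_rows(passenger_counts, weather, holidays)
for row in rows:
    print(row)
```

Dropping incomplete days is only one policy; imputing the missing source is an equally reasonable choice and depends on how sparse each dataset is.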
Results show that SS-GBDT not only achieves superior prediction accuracy and
stability, but also better handles the multicollinearity issue with
multi-source data. It can also prioritize the factors that influence passenger flow
prediction. The prediction model is flexible and scalable, which enables the
integration of various influential factors in the presence of big data.
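One common way to prioritize influential factors with a GBDT is to rank its impurity-based feature importances. The sketch below uses scikit-learn's GradientBoostingRegressor for this; it is an assumption that SS-GBDT ranks factors in a comparable way, since the source does not describe the exact procedure. The feature names and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical factor names and synthetic data, for illustration only.
feature_names = ["hour_of_day", "rainfall", "temperature", "is_holiday"]
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))
# By construction, the target depends strongly on the first factor, weakly on
# the second, and not at all on the last two.
y = 8.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# feature_importances_ is normalized to sum to 1 across all features.
importances = model.feature_importances_
ranking = sorted(
    zip(feature_names, importances), key=lambda pair: pair[1], reverse=True
)
for name, score in ranking:
    print(f"{name:12s} {score:.3f}")
```

Impurity-based importances can be diluted across strongly correlated features, so with multicollinear inputs the ranking is best read as a guide rather than an exact ordering.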
Advantages
• The system is more effective because it applies a scaled stacking process to
multi-source data.
• The system is accurate because it implements the novel scaled stacking gradient
boosting decision tree (SS-GBDT) model.
SYSTEM REQUIREMENTS
H/W SYSTEM CONFIGURATION:
➢ Processor - Pentium IV
➢ RAM - 4 GB (min)
➢ Hard Disk - 20 GB
➢ Keyboard - Standard Windows Keyboard
➢ Mouse - Two or Three Button Mouse
➢ Monitor - SVGA
SOFTWARE REQUIREMENTS:
• Operating system : Windows 7 Ultimate.
• Coding Language : Python.
• Front-End : Python.
• Back-End : Django-ORM.
• Designing : HTML, CSS, JavaScript.
• Database : MySQL (WAMP Server).
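With Django-ORM as the back end and MySQL as the database, the connection is configured in the Django project's settings.py. The fragment below is a sketch of that configuration. The database name, user, and empty password are placeholder assumptions for a local development setup and should be replaced with real values; Django also needs a MySQL driver such as mysqlclient installed for this engine to work.

```python
# settings.py fragment (sketch): point the Django ORM at a local MySQL server.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # Django's built-in MySQL backend
        "NAME": "bus_passenger_flow",  # hypothetical database name
        "USER": "root",                # assumption: local development account
        "PASSWORD": "",                # assumption: set a real password in practice
        "HOST": "127.0.0.1",
        "PORT": "3306",                # MySQL's default port
    }
}
```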
