Original Sales Forecasting Project Documentation
SESSION (2017-2020)
A MAJOR PROJECT REPORT
“SALES FORECASTING”
Submitted by
Danish Rashid, MCA (6th sem.), 17045113002
Shabnam Nisar, MCA (6th sem.), 17045113040
Ovais Bhat, MCA (6th sem.), 17045113043
Project submitted to the Department of Computer Science, South Campus, University of Kashmir, in partial fulfillment of the requirement for the award of the Degree of Master of Computer Applications (MCA).
CERTIFICATE
This is to certify that the Project entitled
“SALES FORECASTING”
is the original work carried out by
1) Danish Rashid 17045113002
2) Shabnam Nisar 17045113040
3) Ovais Ahmad Bhat 17045113043
in partial fulfillment of the requirement for the award of the Degree of Master of Computer Applications (MCA).
Date:
This is to certify that the above statements made by the candidates are correct to the best of my knowledge.
Acknowledgement
In the name of “Allah”, the most beneficent and merciful, the creator of treasures of knowledge and wisdom, who gave us the strength and knowledge to complete this project.
We are highly thankful to our project guide Dr. Hilal Ahmad Khanday (Assistant Prof., Department of Computer Science, University of Kashmir South Campus), without whose support this project would not have been possible. His constructive advice and constant motivation have been responsible for the successful completion of this project.
We are also thankful to our teachers and the other staff members of our Department. They have been very helpful and kind to us throughout our MCA course.
We would also like to extend our thanks to all those authors and researchers whose research papers and articles provided the diversity of interesting material that made this work possible.
CONTENTS
ABSTRACT
INTRODUCTION
    Problem Statement
    Objectives
    System Requirements
    Development Tools
IMPLEMENTATION
Chapter 1: Data Handling
    Import packages
    Import datasets
Chapter 2: Exploratory Data Analysis
    Exploring data through visualizations
Chapter 3: Feature Engineering
    Treating the missing values
    Drawing the correlation matrix
    Dealing with categorical variables
        Label encoding for the categorical variables
        One hot encoding for the categorical variables
Chapter 4: Model Building
    Splitting the dataset
    Labeling of data
Chapter 5: Modelling
    Linear Regression
    Decision Tree Regression
    XGBoost Regression
    Random Forest Regression
    Support Vector Regression
RESULT
FUTURE WORK
REFERENCES
ABSTRACT
Sales forecasting is the process of using a company’s sales records over the past years to predict the short-term or long-term sales performance of that company in the future. This is one of the pillars of proper financial planning. As with any prediction-related process, risk and uncertainty are unavoidable in sales forecasting too. This is the age of the internet, where the amount of data being generated is so huge that humans alone cannot process it all. Nowadays shopping malls and Big Marts keep track of the sales data of each and every individual item in order to predict future customer demand and update inventory management accordingly. These stores basically hold a large volume of customer data and individual item attributes in a data warehouse. Anomalies and frequent patterns are then detected by mining this data warehouse, and the resulting data can be used to predict future sales volume with the help of different machine learning techniques for retailers like Big Mart. Many machine learning techniques have been developed for this purpose. In this project, we try to predict the sales of a retail store using different machine learning techniques and determine the algorithm best suited to our particular problem statement. We have implemented various regression techniques and found that the random forest regressor produces the best performance among the models we compared.
INTRODUCTION
Sales forecasting can be defined as the prediction of upcoming sales based on past sales. Sales forecasting is of paramount importance for companies which are entering new markets, which are adding new services or products, or which are experiencing high growth. The main reason a company makes a forecast is to balance marketing resources and sales against supply capacity planning.
Problem Statement
Predicting the sales of a company needs time series data of that company, and based on that data the model can predict the future sales of that company or product. So, in this research project we will analyze the time series sales data of a company and predict the sales of the company for the coming quarter and for a specific product.
The data scientists at Big Mart have collected 2013 sales data for 1559 products across 10 stores in different cities. Certain attributes of each product and store have also been defined. We will use their data set to train our model. The aim is to build a predictive model and find out the sales of each product at a particular store.
Objectives
Build a predictive machine learning model to obtain sales forecasts.
System Requirements
Operating System: Windows Server 2003 R2 with Service Pack 2, Windows Server 2008 (Service Pack 2 or R2), Windows 7; Mac OS X 10.5 (Leopard) and above, including Mac OS X 10.6.x (Snow Leopard); Red Hat Enterprise Linux; Ubuntu 9.04 and 9.10
Processors: Processors supporting the SSE2 instruction set only
Disk Space: 3-4 GB for a typical installation
RAM: At least 2048 MB (2048 MB recommended)
Development Tools
There are many tools that can adequately cover all our machine learning and artificial intelligence needs. We used several of them to develop our “Sales Forecasting” model. These tools are briefly listed below:-
Programming Languages
a. Python:- a language favored for its readability, relatively mild learning curve and functional structure, used in many domains. The language is beginner friendly and quite simple, and to use it for machine learning you do not have to know all of its intricacies. Python’s machine learning ecosystem is used in this model.
Data Analytics and Visualization Tools
When installing packages, pip may install a version of a dependency different than the one used by Tensorflow. In some cases, the package may appear to work but produce different results in detail.
In contrast, conda analyses the current environment, including everything currently installed, together with any version limitations specified (e.g. the user may wish to have Tensorflow version 2.0 or higher), works out how to install a compatible set of dependencies, and shows a warning if this cannot be done.
c) Pandas
Pandas is a popular library used for retrieving and preparing data to be used later in other machine learning libraries. Pandas enables its users to fetch data from different sources easily. It acts as a tool that simplifies analysis by converting JSON, SQL, TSV or CSV data into a data frame: it makes a Python object look like an SPSS table or an Excel sheet, with rows and columns.
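As a minimal sketch of that conversion, a JSON-like record set (the identifiers and values here are made up) becomes a data frame:

```python
import pandas as pd

# A small JSON-like record set (made-up values) converted into a data frame,
# the same kind of conversion described above for JSON/SQL/TSV/CSV sources.
records = [
    {"Item_Identifier": "FDA15", "Item_MRP": 249.81},
    {"Item_Identifier": "DRC01", "Item_MRP": 48.27},
]
df = pd.DataFrame(records)
print(df.shape)             # (2, 2)
print(df.columns.tolist())  # ['Item_Identifier', 'Item_MRP']
```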
d) Matplotlib
Matplotlib is a plotting library for Python. This means that a user can always visualize data and the results obtained from their models.
e) Seaborn:-
Seaborn is a statistical data visualization library built on top of Matplotlib; it provides a high-level interface for drawing attractive statistical graphics.
IMPLEMENTATION
Chapter 1: Data Handling
Import packages:-
A package is basically a directory with Python files plus a file named __init__.py. This means that every directory inside the Python path which contains a file named __init__.py will be treated as a package by Python. It is possible to put several modules into a package.
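The package rule above can be demonstrated directly; the package name "mypkg" is made up for this sketch:

```python
# A directory containing __init__.py becomes importable as a package.
import importlib
import pathlib
import sys
import tempfile

d = tempfile.mkdtemp()
pkg = pathlib.Path(d, "mypkg")
pkg.mkdir()
(pkg / "__init__.py").write_text("answer = 42\n")

sys.path.insert(0, d)                       # make the directory importable
mypkg = importlib.import_module("mypkg")    # Python treats mypkg as a package
print(mypkg.answer)  # 42
```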
We used various types of packages in our model and imported them at the start of our project.
import warnings
warnings.filterwarnings('ignore')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import xgboost as xgb
Import datasets:-
The first step is to look at the data and try to match the information we hypothesized against the available data. A comparison between the data dictionary on the competition page and our hypotheses is shown below:
Observations: 8,523
Variables: 12

Variable                     Description
Item_Identifier              Unique product ID
Item_Weight                  Weight of product
Item_Fat_Content             Whether the product is low fat or not
Item_Visibility              The % of total display area of all products in a store allocated to the particular product
Item_Type                    The category to which the product belongs
Item_MRP                     Maximum Retail Price (list price) of the product
Outlet_Identifier            Unique store ID
Outlet_Establishment_Year    The year in which the store was established
Outlet_Size                  The size of the store in terms of ground area covered
Outlet_Location_Type         The type of city in which the store is located
Outlet_Type                  Whether the outlet is just a grocery store or some sort of supermarket
Item_Outlet_Sales            Sales of the product in the particular store; this is the outcome variable to be predicted
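In code, the training set would be loaded with pandas; in the project this would be pd.read_csv on the competition files, while here a tiny in-memory sample with made-up values stands in for the file:

```python
import io
import pandas as pd

# Stand-in for pd.read_csv("Train.csv"); the two rows below are made up.
sample = io.StringIO(
    "Item_Identifier,Item_Weight,Item_MRP,Item_Outlet_Sales\n"
    "FDA15,9.3,249.81,3735.14\n"
    "DRC01,5.92,48.27,443.42\n"
)
Train_data = pd.read_csv(sample)
print(Train_data.shape)  # (2, 4)
```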
Chapter 2: Exploratory Data Analysis
We will start off by plotting and exploring all the individual variables to gain some insights.

Univariate Analysis:

for i in Train_data.describe().columns:
    sns.distplot(Train_data[i].dropna())
    plt.show()
We also used one more type of graph to plot these visualizations:-

# Boxplot:-
for i in Train_data.describe().columns:
    sns.boxplot(Train_data[i].dropna())
    plt.show()
Bivariate Analysis:
After looking at every feature individually, let’s now explore them again with respect to the target variable. Here we will make use of scatter plots for continuous or numeric variables.
plt.xlabel("Item_Weight")
plt.ylabel("Item_Outlet_Sales")
sns.scatterplot(x='Item_Weight', y='Item_Outlet_Sales', hue='Item_Type', size='Item_Weight', data=Train_data)

plt.xlabel("Item_Visibility")
plt.ylabel("Item_Outlet_Sales")
sns.scatterplot(x='Item_Visibility', y='Item_Outlet_Sales', hue='Item_Type', size='Item_Weight', data=Train_data)
Impact of Outlet_Type on Item_Outlet_Sales:-

Outlet_Type_Pivot = Train_data.pivot_table(index='Outlet_Type', values="Item_Outlet_Sales", aggfunc=np.median)
Outlet_Type_Pivot.plot(kind='bar', color='brown', figsize=(12,7))
plt.xlabel("Outlet_Type")
plt.ylabel("Item_Outlet_Sales")
plt.xticks(rotation=0)
plt.show()
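The pivot_table/median pattern used above can be checked on a toy frame (the sales numbers are made up):

```python
import numpy as np
import pandas as pd

# Toy sales table: two outlet types, two sales values each (made-up numbers)
toy = pd.DataFrame({
    "Outlet_Type": ["Grocery Store", "Supermarket", "Grocery Store", "Supermarket"],
    "Item_Outlet_Sales": [100.0, 400.0, 300.0, 600.0],
})
pivot = toy.pivot_table(index="Outlet_Type", values="Item_Outlet_Sales", aggfunc=np.median)
# Median of (100, 300) is 200; median of (400, 600) is 500
print(pivot["Item_Outlet_Sales"].tolist())  # [200.0, 500.0]
```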
Impact of Item_Fat_Content on Item_Outlet_Sales:-

Item_Fat_Content_pivot = Train_data.pivot_table(index='Item_Fat_Content', values="Item_Outlet_Sales", aggfunc=np.median)
Item_Fat_Content_pivot.plot(kind='bar', color='blue', figsize=(12,7))
plt.xlabel("Item_Fat_Content")
plt.ylabel("Item_Outlet_Sales")
plt.xticks(rotation=0)
plt.show()
Distribution Of Outlet_Type:-
plt.figure(figsize = (10,8))
sns.countplot(Train_data.Outlet_Type)
plt.xticks(rotation = 90)
Distribution Of Outlet_Location_Type:-
plt.figure(figsize = (10,8))
sns.countplot(Train_data.Outlet_Location_Type)
Train_data.Outlet_Location_Type.value_counts()
Chapter 3: Feature Engineering
Most of the time the given features in a dataset are not enough to give satisfactory predictions. In such cases, we have to create new features which might help in improving the model’s performance. Let’s try to create some new features for our dataset.
In order to do the feature engineering, we combined both data sets so that we could engineer the features of both sets together.
Train_data['Source'] = 'train'
Test_data['Source'] = 'test'
df = pd.concat((Train_data, Test_data), ignore_index = True)
df.shape
(14204, 13)
df.columns
Treating the missing values:-
Missing data can have a severe impact on building predictive models because the missing values might contain some vital information which could help in making better predictions. So, it becomes imperative to carry out missing data imputation. There are different methods to treat missing values based on the problem and the data. Some of the common techniques are as follows:
1. Deletion of rows: observations having missing values in any variable are deleted from the train dataset. The downside of this method is the loss of information and a drop in the predictive power of the model.
2. Mean/Median/Mode Imputation: in the case of a continuous variable, missing values can be replaced with the mean or median of all known values of that variable. For categorical variables, we can use the mode of the given values to replace the missing values.
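Both imputation techniques can be sketched on toy columns (values made up):

```python
import numpy as np
import pandas as pd

# Mean imputation on a toy continuous column
weights = pd.Series([10.0, np.nan, 14.0])
weights = weights.fillna(weights.mean())   # mean of known values = 12.0
print(weights.tolist())  # [10.0, 12.0, 14.0]

# Mode imputation on a toy categorical column
sizes = pd.Series(["Medium", None, "Medium", "Small"])
sizes = sizes.fillna(sizes.mode()[0])      # most frequent value = "Medium"
print(sizes.tolist())  # ['Medium', 'Medium', 'Medium', 'Small']
```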
#Item_Weight:-
df['Item_Weight'].mean()
df['Item_Weight'].fillna(df['Item_Weight'].mean(),inplace=True)
df.isnull().sum()
Item_Fat_Content 0
Item_Identifier 0
Item_MRP 0
Item_Outlet_Sales 5681
Item_Type 0
Item_Visibility 0
Item_Weight 0
Outlet_Establishment_Year 0
Outlet_Identifier 0
Outlet_Location_Type 0
Outlet_Size 4016
Outlet_Type 0
Source 0
dtype: int64
#Outlet_Size:-
df['Outlet_Size'].value_counts()
df['Outlet_Size'].fillna('Medium',inplace=True)
df.isnull().sum()
# We will make one more column here that will show us how old the store is; we will name it Outlet_Years:-
df['Outlet_Establishment_Year'].value_counts()
df['Outlet_Years']=2020-df['Outlet_Establishment_Year']
df['Outlet_Years'].describe()
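The Outlet_Years computation above is simple vectorised arithmetic; on a toy column (the years below are made up) it behaves like this:

```python
import pandas as pd

# Store age computed exactly as above: 2020 minus the establishment year
years = pd.Series([1999, 2009, 1985])
outlet_years = 2020 - years
print(outlet_years.tolist())  # [21, 11, 35]
```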
df['Item_Type'].value_counts()
df['Item_Identifier'].value_counts()
# 'FD'-FOOD
# 'DR'-DRINK
# 'NC'-NON-CONSUMABLE
#We will be creating 3 categories instead of the already existing 16 categories.
# Changing only the first 2 characters
df['New_Item_Type']=df['Item_Identifier'].apply(lambda x:x[0:2])
# Rename them to make categories:-
df['New_Item_Type']=df['New_Item_Type'].map({'FD':'Food','NC':'Non-Consumable','DR':'Drinks'})
df['New_Item_Type'].value_counts()
df.loc[df['New_Item_Type']=='Non-Consumable','Item_Fat_Content']='Non-Edible'
df['Item_Fat_Content'].value_counts()
Non-Edible 2686
LF 367
reg 195
low fat 134
Name: Item_Fat_Content, dtype: int64
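The prefix-slicing and mapping step above can be verified on a handful of identifiers (the IDs below are made up but follow the FD/DR/NC pattern):

```python
import pandas as pd

# The first two characters of Item_Identifier encode the broad category
ids = pd.Series(["FDA15", "DRC01", "NCD19"])
new_type = ids.str[:2].map({"FD": "Food", "DR": "Drinks", "NC": "Non-Consumable"})
print(new_type.tolist())  # ['Food', 'Drinks', 'Non-Consumable']
```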
df.head()
df.describe()
df.columns
Drawing the correlation matrix:-

Train_data.corr()

plt.figure(figsize=(35,15))
sns.heatmap(Train_data.corr(), vmax=1, square=True, annot=True, cmap='viridis')
plt.title('Correlation between different attributes')
plt.show()
corr=df.corr()
sns.heatmap(corr,annot=True,cmap='coolwarm')
Dealing with categorical variables:-
In this stage, we will convert our categorical variables into numerical ones, using two techniques: Label Encoding and One Hot Encoding.
1. Label encoding simply means converting each category in a variable to a number. It is more suitable for ordinal variables, i.e. categorical variables with some order.
2. In one hot encoding, each category of a categorical variable is converted into a new binary column (1/0).
We will use both encoding techniques.
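The two techniques can be contrasted on a toy column (pandas-only sketch; the project itself uses sklearn's LabelEncoder and pd.get_dummies):

```python
import pandas as pd

sizes = pd.Series(["Small", "Medium", "High", "Medium"])

# Label encoding: each category becomes an integer code
# (alphabetical here: High=0, Medium=1, Small=2)
codes = sizes.astype("category").cat.codes
print(codes.tolist())  # [2, 1, 0, 1]

# One hot encoding: one binary column per category
dummies = pd.get_dummies(sizes)
print(dummies.columns.tolist())  # ['High', 'Medium', 'Small']
```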
Example: suppose we have a column "height" in some data set, where 0 is the label for tall, 1 is the label for medium and 2 is the label for short.
Label encoding for the categorical variables:-

We will label encode ['Item_Fat_Content','Outlet_Location_Type','Outlet_Size','New_Item_Type','Outlet_Type','Outlet'] as these are ordinal variables.

from sklearn.preprocessing import LabelEncoder
label=LabelEncoder()
df['Outlet']=label.fit_transform(df['Outlet_Identifier'])
varib=['Item_Fat_Content','Outlet_Location_Type','Outlet_Size','New_Item_Type','Outlet_Type','Outlet']
for i in varib:
    df[i]=label.fit_transform(df[i])
df.head()
df.head()
One Hot Encoding for the categorical variables:-

# Dummy variables:-
df=pd.get_dummies(df, columns=['Item_Fat_Content','Outlet_Location_Type','Outlet_Size','New_Item_Type','Outlet_Type','Outlet'])
df.head()
df.drop(['Item_Type','Outlet_Establishment_Year'],axis=1,inplace=True)
df.columns
Chapter 4: Model Building

Splitting the Dataset:-

train=df.loc[df['Source']=='train']
train.shape
(8523, 38)
test=df.loc[df['Source']=='test']
test.shape
(5681, 38)
train.drop(['Source'],axis=1,inplace=True)
train.columns
OUTPUT:
Index(['Item_Identifier', 'Item_MRP', 'Item_Outlet_Sales', 'Item_Visibility',
       'Item_Weight', 'Outlet_Identifier', 'Outlet_Years', 'Item_Visib_avg',
       'Item_Fat_Content_0', 'Item_Fat_Content_1', 'Item_Fat_Content_2',
       'Item_Fat_Content_3', 'Item_Fat_Content_4', 'Item_Fat_Content_5',
       'Outlet_Location_Type_0', 'Outlet_Location_Type_1',
       'Outlet_Location_Type_2', 'Outlet_Size_0', 'Outlet_Size_1',
       'Outlet_Size_2', 'New_Item_Type_0', 'New_Item_Type_1',
       'New_Item_Type_2', 'Outlet_Type_0', 'Outlet_Type_1', 'Outlet_Type_2',
       'Outlet_Type_3', 'Outlet_0', 'Outlet_1', 'Outlet_2', 'Outlet_3',
       'Outlet_4', 'Outlet_5', 'Outlet_6', 'Outlet_7', 'Outlet_8', 'Outlet_9'],
      dtype='object')
test.drop(['Item_Outlet_Sales','Source'],axis=1,inplace=True)
test.columns
Labeling of Data:-
For supervised learning to work, we need a labeled set of data that the model can
learn from to make correct decisions. Data labeling typically starts by asking hu-
mans to make judgments about a given piece of unlabeled data. For example, label-
ers may be asked to tag all the images in a dataset where “does the photo contain a
bird” is true. The tagging can be as rough as a simple yes/no or as granular as iden-
tifying the specific pixels in the image associated with the bird. The machine learn-
ing model uses human-provided labels to learn the underlying patterns in a process
called "model training." The result is a trained model that can be used to make pre-
dictions on new data.
In machine learning, a properly labeled dataset that we use as the objective stan-
dard to train and assess a given model is often called “ground truth.” The accuracy
of your trained model will depend on the accuracy of your ground truth, so spend-
ing the time and resources to ensure highly accurate data labeling is essential.
X_train=train.drop(['Item_Outlet_Sales','Item_Identifier','Outlet_Identifier'],axis=1)
X_train
y_train=train['Item_Outlet_Sales']
y_train.head()
X_test=test.drop(['Item_Identifier','Outlet_Identifier'],axis=1)
X_test.head()
Chapter 5: Modelling
Finally we have arrived at the most interesting stage of the whole process: predictive modeling. We will start off with the simpler models and gradually move on to more sophisticated ones. We will build models using linear regression, decision tree regression, XGBoost regression, random forest regression and support vector regression.
Linear Regression:-
Linear regression is the simplest and most widely used statistical technique for predictive modeling. Given below is the linear regression equation:

Y = θ0 + θ1X1 + θ2X2 + … + θnXn

where X1, X2, …, Xn are the independent variables, Y is the target variable and the thetas (θ0, θ1, …, θn) are the coefficients. The magnitude of a coefficient relative to the other coefficients determines the importance of the corresponding independent variable.
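A one-variable instance of this equation can be checked numerically; the toy data below are generated (by assumption) from Y = 1 + 2X, so a least-squares fit should recover θ1 ≈ 2 and θ0 ≈ 1:

```python
import numpy as np

# Noise-free toy data from Y = 1 + 2*X (assumed for this sketch)
X = np.array([0.0, 1.0, 2.0, 3.0])
Y = 1.0 + 2.0 * X

theta1, theta0 = np.polyfit(X, Y, deg=1)   # slope, intercept
print(round(theta1, 6), round(theta0, 6))  # 2.0 1.0
```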
For a good linear regression model, the data should satisfy a few assumptions. One of these is the absence of multicollinearity, i.e. the independent variables should not be correlated with each other. However, as per the correlation plot above, we have a few highly correlated independent variables in our data. This issue of multicollinearity can be dealt with using regularization.
For the time being, let’s build our linear regression model with all the variables:-
from sklearn.linear_model import LinearRegression

regressor = LinearRegression(normalize=True)
regressor.fit(X_train,y_train)
y_test=regressor.predict(X_test)
y_test
array([1843., 1454., 1883., ..., 1798., 3582., 1264.])
print("Linear Regression Model Score:",regressor.score(X_train,y_train))
lr_accuracy=round(regressor.score(X_train,y_train)*100)
Decision Tree Regression:
Decision tree regression observes the features of an object and trains a model in the structure of a tree to predict future data, producing meaningful continuous output. Continuous output means that the output is not discrete, i.e. it is not represented by just a discrete, known set of numbers or values.
Let’s build our decision tree regression model with all the variables:-

from sklearn.tree import DecisionTreeRegressor
tree=DecisionTreeRegressor()
tree.fit(X_train,y_train)
tree_accuracy=round(tree.score(X_train,y_train)*100)
print("Decision Tree Regression Accuracy:",tree_accuracy)
XGBoost Regression:
XGBoost is a fast and efficient algorithm that has been used by the winners of many data science competitions. It is a boosting algorithm. XGBoost has many tuning parameters, which can be broadly classified into General Parameters, Booster Parameters and Task Parameters.
Let’s have a look at the parameters that we are going to use in our model.
1. eta: also known as the learning rate or the shrinkage factor. It shrinks the feature weights to make the boosting process more conservative. The range is 0 to 1. A low eta value makes the model more robust to overfitting.
2. gamma: the range is 0 to ∞. The larger the gamma, the more conservative the algorithm is.
3. max_depth: we can specify the maximum depth of a tree using this parameter.
4. subsample: the proportion of rows that the model will randomly select to grow trees.
5. colsample_bytree: the ratio of variables randomly chosen for building each tree in the model.
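The effect of eta can be sketched with a toy boosting update in plain Python (this is an illustration of shrinkage, not the actual XGBoost internals): each round adds only eta times the correction a perfect estimator would make.

```python
# Toy sketch of shrinkage: each round fits the current residual, but only
# eta times the correction is added, which is what makes small eta conservative.
def boosted_prediction(eta, rounds, target=100.0):
    pred = 0.0
    for _ in range(rounds):
        residual = target - pred   # what an ideal "tree" would predict
        pred += eta * residual     # shrinkage: take only a fraction of it
    return pred

print(round(boosted_prediction(eta=1.0, rounds=3), 2))   # 100.0 (no shrinkage)
print(round(boosted_prediction(eta=0.05, rounds=3), 2))  # 14.26 (conservative)
```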
from xgboost import XGBRegressor

model=XGBRegressor(learning_rate=0.05)
model.fit(X_train,y_train)
y_pred=model.predict(X_test)
y_pred
array([1660.4456, 1315.1495, 574.9488, ..., 1865.4355, 3743.5881, 1242.0835], dtype=float32)
model_accuracy=round(model.score(X_train,y_train)*100)
print("XGBoost Regression Accuracy:",model_accuracy)
Random Forest Regression:

from sklearn.ensemble import RandomForestRegressor

rf=RandomForestRegressor()
rf.fit(X_train,y_train)
predict_r=rf.predict(X_test)
predict_r
array([1621.5559, 1516.3595, 755.74958, ..., 1796.52814, 4939.70336, 1414.75842])
rf.score(X_train,y_train)
print("Random Forest Regression Model Score:",rf.score(X_train,y_train))
rf_accuracy=round(rf.score(X_train,y_train)*100)
print("RandomForest Regression Accuracy:",rf_accuracy)
Support Vector Regression:-
A Support Vector Machine can also be used as a regression method, maintaining all the main features that characterize the algorithm (maximal margin). Support Vector Regression (SVR) uses the same principles as the SVM for classification, with only a few minor differences. First of all, because the output is a real number, it becomes very difficult to predict exactly, as there are infinitely many possibilities. In the case of regression, a margin of tolerance (epsilon) is therefore set around the regression function, and the algorithm is correspondingly more involved to formulate. However, the main idea is always the same: to minimize error by individualizing the hyperplane which maximizes the margin, keeping in mind that part of the error is tolerated.
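The "part of the error is tolerated" idea is the epsilon-insensitive loss; a plain-Python sketch (the numbers below are made up) shows how deviations inside the epsilon tube cost nothing:

```python
# Epsilon-insensitive loss behind SVR: only the deviation beyond epsilon
# is penalised; anything inside the tube is tolerated at zero cost.
def eps_insensitive_loss(y_true, y_pred, epsilon):
    return max(0.0, abs(y_true - y_pred) - epsilon)

print(eps_insensitive_loss(100.0, 110.0, epsilon=15))  # 0.0  (inside the tube)
print(eps_insensitive_loss(100.0, 130.0, epsilon=15))  # 15.0 (30 off, 15 tolerated)
```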
from sklearn.svm import SVR
svm=SVR(epsilon=15,kernel='linear')
svm.fit(X_train,y_train)
predict_r=svm.predict(X_test)
predict_r
array([1623.10709496, 1339.00939903, 2376.77467647, ..., 1770.54684921, 3465.33724055, 1265.79536608])
svm.score(X_train,y_train)
print("Support Vector Regression Model Score:",svm.score(X_train,y_train))
svm_accuracy=round(svm.score(X_train,y_train)*100)
Result:-
After trying and testing 5 different algorithms, the Random Forest Regressor has the best score (0.9142221121648255), followed by the XGBoost Regressor with a score of 0.6781707786110868.
Future Work
No system in the world is complete, and only time can prove its incompleteness; the same is the case with this system. Since this is an academic project, there is a lot of scope for it in the future. Some of the possible enhancements include:-
1. Increasing the accuracy further by doing more feature engineering.
2. Training one of the models on our own data.
3. Adding a recommender system to this project.
References
1. Beheshti-Kashi, S., Karimi, H.R., Thoben, K.D., Lutjen, M., Teucke, M.: A survey on retail sales forecasting and prediction in fashion markets. Systems Science & Control Engineering 3(1), 154-161 (2015)
2. Bose, I., Mahapatra, R.K.: Business data mining: a machine learning perspective. Information & Management 39(3), 211-225 (2001)
3. Chu, C.W., Zhang, G.P.: A comparative study of linear and nonlinear models for aggregate retail sales forecasting. International Journal of Production Economics 86(3), 217-231 (2003)
4. Claypool, M., Gokhale, A., Miranda, T., Murnikov, P., Netes, D., Sartin, M.: Combining content-based and collaborative filters in an online newspaper (1999)
5. Das, P., Chaudhury, S.: Prediction of retail sales of footwear using feedforward and recurrent neural networks. Neural Computing and Applications 16(4-5), 491-502 (2007)
6. Domingos, P.M.: A few useful things to know about machine learning. Communications of the ACM 55(10), 78-87 (2012)
7. Langley, P., Simon, H.A.: Applications of machine learning and rule induction. Communications of the ACM 38(11), 54-64 (1995)
8. Loh, W.Y.: Classification and regression trees. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 1(1), 14-23 (2011)
9. Makridakis, S., Wheelwright, S.C., Hyndman, R.J.: Forecasting Methods and Applications. John Wiley & Sons (2008)
10. Ni, Y., Fan, F.: A two-stage dynamic sales forecasting model for the fashion retail. Expert Systems with Applications 38(3), 1529-1536 (2011)
11. Punam, K., Pamula, R., Jain, P.K.: A two-level statistical model for big mart sales prediction. In: 2018 International Conference on Computing, Power and Communication Technologies (GUCON), pp. 617-620. IEEE (2018)
12. Ribeiro, A., Seruca, I., Durão, N.: Improving organizational decision support: detection of outliers and sales prediction for a pharmaceutical distribution company. Procedia Computer Science 121, 282-290 (2017)
13. Shrivas, T.: Big Mart dataset @ONLINE (Jun 2013), https://datahack.analyticsvidhya.com/contest/practice-problem-big-mart-sales-iii/
14. Smola, A.J., Scholkopf, B.: A tutorial on support vector regression. Statistics and Computing 14(3), 199-222 (2004)