SVR (4)
Contents
1 Introduction
2 Methodology
2.1 Data Collection
2.2 Support Vector Regression
2.3 Cross Validation Method
3 Results
3.1 Summary Statistics of Rice Parameters
3.2 Rice Yield Prediction of Province of Tarlac Using Various Kernels of SVR
with Hyper Tuning Parameters
3.3 SVR with Different Kernels for Allocated Testing Data of Rice Yield
4 Discussions
5 Resources
1 Introduction
Rice is an integral part of meals in Asian countries. Many Asians eat rice
three or more times per day, and a meal is not considered complete without
rice on the table. Rice is especially essential in the Philippines: despite
the availability of other foods such as bread and noodles, rice remains the
main and preferred food of Filipinos. Rice has different names depending on
its stage, from unmilled grain to well-milled, ready-to-cook rice. In the
Philippines it is known as “Palay” (unmilled rice), “Bigas” (milled rice), and
“Kanin” (cooked rice).
Every Filipino meal includes rice, whether breakfast, lunch, dinner, or even
snacks. Filipinos have many ways of cooking rice, a result of how much they
enjoy eating it every day. Because rice is one of the primary commodities in
Filipino life and the demand for it must be sustained, rice farming is one of
the main livelihoods of the people.
In 2018, the Philippines ranked eighth in global rice production (FAOSTAT,
2020). Rice is widely grown in the Philippines, particularly in Luzon, Western
Visayas, Southern Mindanao, and Central Mindanao. Rice production has
increased over the last two decades, from 12 Mt in 1999 to 19 Mt in 2008
(FAOSTAT, 2020). The annual mean harvested rice area in the Philippines is
approximately 4.7 million ha, with an average yield of approximately 3.95 tons
per harvested hectare. The largest rice-producing regions are Central Luzon
and Cagayan Valley (FAO, 2002).
According to the Mindanao Times (2022), Region 3 (Central Luzon) ranks first
among the top 10 rice-producing regions in the Philippines. Because of its
extensive flatlands and swamps, the region has a clear natural advantage in
rice production and has remained the top rice producer for decades. In 2021,
the provinces of Aurora, Bataan, Bulacan, Nueva Ecija, Pampanga, Tarlac, and
Zambales produced 699,043.50 metric tons of palay (unmilled rice), accounting
for 15.1% of total production in the country. The province of Nueva Ecija
leads the region and the country in rice production.
Rice is the chief commodity in Asia and one of the most widely consumed goods
in the world, particularly in the Philippines. Because Filipinos love rice,
their language has many words for it: palay is the rice grain, bigas is
uncooked rice, kanin is cooked rice, tutong is the burned part, and bahaw is
cold rice. In fact, one of the UNESCO World Heritage Sites in the Philippines
is the Rice Terraces of the Philippine Cordilleras, 2,000-year-old rice fields
built by the Ifugao along the contours of the mountains. It is a living
cultural landscape that has withstood the test of time. All of this shows how
deeply rice is embedded in Filipino culture.
The increase in world population has led to a significant increase in food
demand throughout the world, so agricultural policy makers in all countries
try to estimate their annual food requirements in advance in order to provide
food security for their people. To achieve this goal, this study developed a
novel predictive model based on the energy inputs employed during the
production season. Rice supplies more than 30% of the calorie requirement of
Asian countries. In Iran, too, rice is one of the most important agricultural
products.
Chen, H., et al. (2016) investigated the relative importance of climate
factors in the yield variation of paddies in southwestern China. An SVM was
compared with multiple linear regression (MLR) and an artificial neural
network (ANN), and the models were validated using error measures such as
mean absolute error (MAE), mean relative absolute error (MRAE), root mean
square error (RMSE), relative root mean square error (RRMSE), and the
coefficient of determination. The study further suggested considering various
parameters of soil management practices to increase the precision of the
developed models.
Palanivel, K., et al. (2019) examined different machine learning techniques
for predicting crop yield and validated the findings using RMSE values. One
study used Modular Artificial Neural Networks (MANN) and SVR to estimate
Kharif crop production in Visakhapatnam, with the amount of monsoon rainfall
factored in to improve accuracy. Other researchers used SVR with an RBF kernel
to model wetland rice production under climate change in the Kalimantan
province with greater precision. Additionally, some researchers used four
machine learning algorithms (SVM, KNN, Linear Regression, and Elastic Net
Regression) to predict potato tuber yield from soil and crop properties
measured through proximal sensing, on a dataset of six fields across Atlantic
Canada with different zones for the years 2017–2018.
The rationale for choosing this research topic is to predict the rice yield of
Tarlac province. The researchers use the support vector regression (SVR)
approach to determine which model is the most efficient among different kernel
functions. The results of this study will help Tarlac province secure food for
the Tarlaqueños.
Objectives
The objective of this study was to develop a model based on artificial
intelligence for predicting rice production output. Such a model could help
farmers and policy makers. The model employed linear, polynomial, and radial
basis function (RBF) kernels for support vector regression (SVR).
Specifically, the study seeks to:
1. Determine the summary statistics of:
1.1 Yield
1.2 Area
1.3 Production
2. Determine the error of the training and testing data sets
2.1 RMSE (Root Mean Square Error)
2.2 MAE (Mean Absolute Error)
3. Evaluate the different kernel functions involved in the study and create a
graphical representation of each
3.1 Polynomial
3.2 Linear
3.3 RBF (Radial Basis Function)
4. Determine which model is the most efficient among the different kernel
functions
2 Methodology
2.1 Data Collection
The data were gathered from the online database of the Philippine Statistics
Authority (PSA). The rice data cover the years 1987 to 2022, from which we
took the annual Volume of Production and the Area Harvested in Tarlac
province. We then used these two series to compute the annual Yield
(kg/hectare) of the province.
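The yield calculation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming production is recorded in kilograms and area in hectares; the function name and the sample figures are hypothetical and are not PSA data.

```python
def yield_kg_per_ha(production_kg, area_ha):
    """Annual yield = volume of production (kg) / area harvested (ha)."""
    if area_ha <= 0:
        raise ValueError("area harvested must be positive")
    return production_kg / area_ha

# Hypothetical illustrative records, NOT actual PSA figures.
records = [
    {"year": 2001, "production_kg": 400_000_000, "area_ha": 100_000},
    {"year": 2002, "production_kg": 450_000_000, "area_ha": 120_000},
]

# Map each year to its computed yield in kg per hectare.
yields = {
    r["year"]: yield_kg_per_ha(r["production_kg"], r["area_ha"])
    for r in records
}
```

In this made-up example the yield falls from 4000 to 3750 kg/ha even though production rises, because the harvested area grows faster than the output.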
2.2 Support Vector Regression
$$ y = w x^T + b \tag{2} $$
where $x = (x_1, x_2, \ldots, x_n)$, $y = (y_1, y_2, \ldots, y_n)$, and
$w = (w_1, w_2, \ldots, w_n)$; $x, w \in \mathbb{R}$.
The formula
$$ \min \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\delta_i + \delta_i^*) \tag{3} $$
such that
$$ \begin{aligned}
y_i - w x_i^T - b &\le \epsilon + \delta_i, \\
w x_i^T + b - y_i &\le \epsilon + \delta_i^*, \\
\delta_i \ge 0, \quad \delta_i^* &\ge 0
\end{aligned} \tag{4} $$
is the optimization problem, where slack variables are added for the
observations that lie outside the margin of the regression model. The
parameter $C$ is the cost that penalizes such observations, $\epsilon$ is the
tolerance level of the constraints, and $\delta_i$ and $\delta_i^*$ are the
slack variables. The term $\sum_{i=1}^{n} (\delta_i + \delta_i^*)$ appears in
Equation (3) so that the observations outside of the margin are included.
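To make the role of the slack variables concrete, the following sketch evaluates the objective in Equation (3) for a fixed, hand-picked w and b. It is only an illustration under stated assumptions: the function names, the toy data, and the parameter values are hypothetical, and no optimization is performed.

```python
def slacks(y, pred, eps):
    """Smallest slack pair (delta, delta_star) satisfying constraints (4)."""
    delta = max(0.0, y - pred - eps)        # observation lies above the tube
    delta_star = max(0.0, pred - y - eps)   # observation lies below the tube
    return delta, delta_star

def primal_objective(w, b, X, Y, C, eps):
    """0.5 * ||w||^2 + C * sum(delta_i + delta_i*), as in Equation (3)."""
    norm_sq = sum(wj * wj for wj in w)
    total_slack = 0.0
    for x, y in zip(X, Y):
        pred = sum(wj * xj for wj, xj in zip(w, x)) + b   # w x^T + b
        d, d_star = slacks(y, pred, eps)
        total_slack += d + d_star
    return 0.5 * norm_sq + C * total_slack

# Hypothetical toy data: three observations with one feature each.
X = [[1.0], [2.0], [3.0]]
Y = [2.0, 4.5, 5.0]
value = primal_objective(w=[2.0], b=0.0, X=X, Y=Y, C=10.0, eps=0.25)
```

Here the first observation lies inside the tolerance band and contributes no slack, while the second and third lie outside it by 0.25 and 0.75, so the objective is 0.5 * 4 + 10 * 1.0 = 12.0. A larger C makes such violations more expensive relative to the size of w.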
For easier solving, the optimization problem is translated into its Lagrange
dual formulation. The non-linear SVR in this formulation is:
$$ L(a) = \min \; \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (a_i - a_i^*)(a_j - a_j^*) K(x_i, x_j) + \epsilon \sum_{i=1}^{n} (a_i + a_i^*) - \sum_{i=1}^{n} y_i (a_i - a_i^*) \tag{5} $$
where $a_i$ and $a_i^*$ are the non-negative multipliers for each observation
$x_i$, subject to the constraints
$$ \sum_{i=1}^{n} (a_i - a_i^*) = 0; \qquad 0 \le a_i, a_i^* \le C \tag{6} $$
and $K$ is the kernel function such that
$K(x_i, x_j) = \phi(x_i)\phi(x_j)^T$. Three kernel functions are used in this
research:
$$ \begin{aligned}
\text{Linear:} \quad & K(x_i, x_j) = x_i x_j^T \\
\text{Polynomial:} \quad & K(x_i, x_j) = (x_i x_j^T + 1)^d \\
\text{Radial Basis Function:} \quad & K(x_i, x_j) = e^{-\gamma \|x_i - x_j\|^2}
\end{aligned} \tag{7} $$
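The three kernels can be written directly from their definitions. The sketch below is a plain-Python illustration; the function names and default parameter values are assumptions chosen for the example. The `predict` helper uses the standard dual-form SVR prediction, f(x) = sum_i (a_i - a_i*) K(x_i, x) + b, which is not stated in the text above and is included here only as an assumption about how a fitted model would be evaluated.

```python
import math

def dot(u, v):
    """Inner product x_i x_j^T of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    return dot(u, v)

def polynomial_kernel(u, v, d=1):
    return (dot(u, v) + 1.0) ** d

def rbf_kernel(u, v, gamma=0.1):
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def predict(x, support_x, coef, b, kernel):
    """Dual-form prediction; coef[i] stands for a_i - a_i* (assumed given)."""
    return sum(c * kernel(xi, x) for c, xi in zip(coef, support_x)) + b

# Hypothetical vectors used only to exercise the kernels.
u, v = [1.0, 2.0], [3.0, 4.0]
k_lin = linear_kernel(u, v)            # 1*3 + 2*4
k_poly = polynomial_kernel(u, v, d=2)  # (11 + 1)^2
k_rbf = rbf_kernel(u, v, gamma=0.1)    # exp(-0.1 * 8)
```

With d = 1 the polynomial kernel differs from the linear kernel only by the added constant 1, which is worth keeping in mind when comparing the two.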
2.3 Cross Validation Method
$$ MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - y_i^*| \tag{9} $$
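Both error measures used in this study can be computed as in the sketch below. It assumes that y_i is the observed yield and y_i* the predicted yield, and it takes RMSE in its usual form, the square root of the mean squared error. The sample values are hypothetical.

```python
import math

def mae(observed, predicted):
    """Mean absolute error: (1/n) * sum |y_i - y_i*|."""
    n = len(observed)
    return sum(abs(y - p) for y, p in zip(observed, predicted)) / n

def rmse(observed, predicted):
    """Root mean square error: sqrt((1/n) * sum (y_i - y_i*)^2)."""
    n = len(observed)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(observed, predicted)) / n)

# Hypothetical yields in kg/ha, for illustration only.
observed = [3500.0, 3800.0, 4100.0]
predicted = [3400.0, 3900.0, 4300.0]

mae_value = mae(observed, predicted)
rmse_value = rmse(observed, predicted)
```

RMSE is never smaller than MAE on the same data, and the gap widens when a few predictions are far off, because squaring weights large errors more heavily.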
3 Results
3.1 Summary Statistics of Rice Parameters
The table shows the characteristics of the data, from Area Harvested to yield
per hectare. The dataset was gathered from the official website of the
Philippine Statistics Authority (PSA); to obtain the yield, the volume of
production in kilograms was divided by the area harvested. The data show the
means of Area Harvested (115202.52), Volume of Production in kilograms
(442875896.4), and yield (3767.52778). The standard deviations are 16740.38087
for Area Harvested, 131422562.7 for Volume of Production in kilograms, and
699.38 for yield. Area Harvested is negatively skewed, and the rest of the
variables are slightly skewed. The kurtosis values indicate leptokurtic
distributions. This suggests an increase in rice output in Tarlac province
over the recorded years because, as the population increases, the area
harvested and the volume of production contribute to the total yield.
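Summary measures of this kind can be reproduced with a short script. The sketch below assumes moment-based (population) skewness and excess kurtosis, since the text does not say which estimators were used; statistical packages may apply small-sample corrections and give slightly different values. The sample series is hypothetical.

```python
import statistics

def central_moment(values, k):
    """k-th central moment, dividing by n (population form)."""
    m = statistics.mean(values)
    return sum((v - m) ** k for v in values) / len(values)

def skewness(values):
    """Moment-based skewness: m3 / m2^1.5 (negative means a longer left tail)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (positive is leptokurtic)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

# Hypothetical series, for illustration only.
sample = [1.0, 2.0, 3.0, 4.0, 10.0]
summary = {
    "mean": statistics.mean(sample),
    "sd": statistics.stdev(sample),           # sample standard deviation (n - 1)
    "skewness": skewness(sample),
    "excess_kurtosis": excess_kurtosis(sample),
}
```

Under these definitions a distribution is called leptokurtic when its excess kurtosis is positive, and negatively skewed when its skewness is below zero.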
3.2 Rice Yield Prediction of Province of Tarlac Using Various Kernels of SVR
with Hyper Tuning Parameters
Table 2: Error analysis and cost values of training and testing datasets by
using SVR kernels for rice yield prediction.
Province  Dataset   Kernel      RMSE    MAE     C    γ     d
Tarlac    Training  Polynomial  336.9   225.8   10   0.1   1
                    Linear      336.9   225.8   10   n.a.  n.a.
                    RBF         224.69  149.86  100  0.1   n.a.
Tarlac    Testing   Polynomial  116.95  83.49   10   0.1   1
                    Linear      116.95  83.49   10   n.a.  n.a.
                    RBF         112.67  74.06   100  0.1   n.a.
Table 2 shows the RMSE, MAE, and the specified cost parameter C for the
Province of Tarlac. The linear kernel has the largest errors on both the
training and testing datasets, with a training RMSE of 336.9 and MAE of 225.8
at cost C = 10. The polynomial kernel gives exactly the same results with
C = 10, γ = 0.1, and d = 1, which is expected because a polynomial kernel of
degree 1 is itself linear in the inputs. The radial basis function kernel,
however, showed the lowest errors, with a training RMSE of 224.69 and MAE of
149.86 at C = 100 and γ = 0.1.
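The comparison can be made explicit by ranking the kernels on their testing errors. The sketch below simply transcribes the testing values reported in Table 2 and picks the kernel with the lowest RMSE; the dictionary layout and variable names are assumptions made for illustration.

```python
# Testing-set errors transcribed from Table 2.
test_results = {
    "Polynomial": {"rmse": 116.95, "mae": 83.49},
    "Linear": {"rmse": 116.95, "mae": 83.49},
    "RBF": {"rmse": 112.67, "mae": 74.06},
}

# Kernel with the lowest testing RMSE.
best_kernel = min(test_results, key=lambda name: test_results[name]["rmse"])

# Relative RMSE reduction of the best kernel compared with the linear kernel.
linear_rmse = test_results["Linear"]["rmse"]
best_rmse = test_results[best_kernel]["rmse"]
relative_reduction = (linear_rmse - best_rmse) / linear_rmse
```

On these figures the RBF kernel is selected, with a testing RMSE roughly 3.7% lower than that of the linear and polynomial kernels. Whether a difference of this size is meaningful would depend on the size of the testing set, which the table does not report.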
4 Discussions
5 Resources
FAOSTAT (2020). FAO. http://www.fao.org/faostat/en/#data (accessed June 25,
2020).
Figure 1: SVR RBF kernel for testing data.
Figure 3: SVR Linear kernels for testing data.