
TIME SERIES ANALYSIS OF SAUDI ARABIA OIL
PRODUCTION DATA
A THESIS
SUBMITTED TO THE GRADUATE SCHOOL
IN PARTIAL FULFILLMENT OF THE
REQUIREMENTS
FOR THE DEGREE
MASTER OF SCIENCE
BY
ABDULMAJEED ALBARRAK
ADVISER DR. RAHMATULLAH IMON
BALL STATE UNIVERSITY
MUNCIE, INDIANA
DECEMBER, 2013
Time Series Analysis of Saudi Arabia Oil Production Data
A THESIS
SUBMITTED TO THE GRADUATE EDUCATIONAL POLICIES COUNCIL
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
for the degree
MASTER OF SCIENCE
By
Abdulmajeed Albarrak
Committee Approval:
………………………………………………………………………………………………….
Committee Chairman Date
………………………………………………………………………………………………
Committee Member Date
…………………………………………………………………………………………………
Committee Member Date
Department Head Approval:
…………………………………………………………………………………………………
Head of Department Date
Graduate office Check:
…………………………………………………………………………………………………
Dean of Graduate School Date
Ball State University
Muncie, Indiana
November 2013
ACKNOWLEDGEMENTS
Foremost, I would like to express my sincere gratitude to my advisor, Professor Dr. Rahmatullah
Imon, for his continuous support of my thesis study and for his patience, motivation, enthusiasm, and
immense knowledge. His guidance helped me throughout my analysis and the writing of this
report. I could not have imagined having a better advisor and mentor for my thesis.
Besides my advisor, I would like to thank the rest of my thesis committee, Dr. Dale
Umbach and Dr. Munni Begum, for their encouragement, insightful comments, and patience. I am
thankful to all my classmates for their kind support. Last but not least, I would like to thank
my family, my parents and my brothers and sisters, for supporting me throughout my life.
Abdulmajeed Albarrak
November 3, 2013
ABSTRACT
THESIS PAPER: Time series analysis of Saudi Arabia oil production data
STUDENT: Abdulmajeed Albarrak
DEGREE: Master of Science
COLLEGE: Sciences and Humanities
DATE: December, 2013
PAGES: 123
Saudi Arabia is the largest petroleum producer and exporter in the world. The Saudi Arabian
economy depends heavily on the production and export of oil. This motivates us to do research on the oil
production of Saudi Arabia. The prime objective of our research is to find the most appropriate
models for analyzing Saudi Arabia oil production data. Initially we consider
autoregressive integrated moving average (ARIMA) models to fit the data. But most of the
variables under study show some kind of volatility, and for this reason we finally decide to
consider autoregressive conditional heteroscedastic (ARCH) models for them. If there is no
ARCH effect, the model automatically reduces to an ARIMA model. The existence of missing
values for almost every variable complicates the analysis, since the estimation of
parameters in an ARCH model does not converge when observations are missing. As a remedy
we estimate the missing observations first, employing the expectation maximization
(EM) algorithm. Since our data are time series data, a
simple EM algorithm is not appropriate for them. There is also evidence of the presence of
outliers in the data. Therefore we finally employ an EM algorithm based on robust least trimmed
squares (LTS) regression to estimate the missing values. After the estimation of missing values we
employ the White test to select the most appropriate ARCH models for all sixteen variables
under study. A normality test on the resulting residuals is performed for each variable to check
the validity of the fitted model.

Table of Contents
CHAPTER 1
INTRODUCTION
1.1 The History of Oil Production in Saudi Arabia
1.2 Saudi Arabia Oil Production Data
1.3 Outline of the Study
CHAPTER 2
ARCH/GARCH MODELS, OUTLIERS AND ROBUSTNESS, TESTS FOR NORMALITY AND ESTIMATION
OF MISSING VALUES IN TIME SERIES
2.1 ARCH and GARCH Models
2.2 Outliers and Robustness
2.3 Tests for Normality
2.4 Estimation of Missing Values
2.5 Computation
CHAPTER 3
OUTLIER ANALYSIS AND ESTIMATION OF MISSING VALUES BY ROBUST EM ALGORITHM FOR
SAUDI ARABIA OIL PRODUCTION DATA
3.1 Outlier Analysis
3.2 Estimation of Missing Values
CHAPTER 4
SELECTION OF ARCH MODELS FOR SAUDI ARABIA OIL PRODUCTION DATA
4.1 Crude Oil Production
4.2 Total Export of Refined Oil
4.3 Export of Crude Oil to North America
4.4 Export of Refined Oil to North America
4.5 Export of Crude Oil to South America
4.6 Export of Refined Oil to South America
4.7 Export of Crude Oil to Western Europe
4.8 Export of Refined Oil to Western Europe
4.9 Export of Crude Oil to Middle East
4.10 Export of Refined Oil to Middle East
4.11 Export of Crude Oil to Africa
4.12 Export of Refined Oil to Africa
4.13 Export of Crude Oil to Asia and Far East
4.14 Export of Refined Oil to Asia and Far East
4.15 Export of Crude Oil to Oceania
4.16 Export of Refined Oil to Oceania
4.17 Result Summary
CHAPTER 5
CONCLUSIONS AND DIRECTION OF FUTURE RESEARCH
5.1 Conclusions
5.2 Direction of Future Research
REFERENCES
APPENDIX
List of Tables
CHAPTER 2
Table 2.1 Specification of ARCH Models
CHAPTER 3
Table 3.1 Estimates of Missing Values for the Saudi Arabia Oil Production Data
CHAPTER 4
Table 4.1.1 The ACF and PACF Values for the Crude Oil Production Data
Table 4.1.2 Order of ARCH Using the White Test for the Crude Oil Production Data
Table 4.1.3 Order of ARCH Using the Breusch-Pagan Test for the Crude Oil Production Data
Table 4.1.4 Normality Test of ARCH (1) Residuals for the Crude Oil Production Data
Table 4.1.5 The ACF and PACF Values for the LTS Crude Oil Production Data
Table 4.1.6 Order of ARCH Using the LTS White Test for the Crude Oil Production Data
Table 4.1.7 Order of ARCH Using the LTS Breusch-Pagan Test for the Crude Oil Production Data
Table 4.1.8 Normality Test of ARCH (1) Residuals for the Crude Oil Production Data
Table 4.2.1 The ACF and PACF Values for the Total Export of Refined Oil Data
Table 4.2.2 Order of ARCH Using the White Test for the Total Export of Refined Oil Data
Table 4.2.3 Normality Test of ARCH (1) Residuals for the Total Export of Refined Oil Data
Table 4.3.1 The ACF and PACF Values for the Export of Crude Oil to North America Data
Table 4.3.2 Order of ARCH Using the White Test for the Export of Crude Oil to North America Data
Table 4.3.3 Normality Test of ARCH (2) Residuals for the Export of Crude Oil to North America Data
Table 4.4.1 The ACF and PACF Values for the Export of Refined Oil to North America Data
Table 4.4.2 Order of ARCH Using the White Test for the Export of Refined Oil to North America Data
Table 4.4.3 Normality Test of ARCH (1) Residuals for the Export of Refined Oil to North America Data
Table 4.5.1 The ACF and PACF Values for the Export of Crude Oil to South America Data
Table 4.5.2 Order of ARCH Using the White Test for the Export of Crude Oil to South America Data
Table 4.5.3 Normality Test of ARCH (1) Residuals for the Export of Crude Oil to South America Data
Table 4.6.1 The ACF and PACF Values for the Export of Refined Oil to South America Data
Table 4.6.2 Order of ARCH Using the White Test for the Export of Refined Oil to South America Data
Table 4.6.3 Normality Test of ARCH (1) Residuals for the Export of Refined Oil to South America Data
Table 4.7.1 The ACF and PACF Values for the Export of Crude Oil to Western Europe Data
Table 4.7.2 Order of ARCH Using the White Test for the Export of Crude Oil to Western Europe Data
Table 4.7.3 Normality Test of ARCH (1) Residuals for the Export of Crude Oil to Western Europe Data
Table 4.8.1 The ACF and PACF Values for the Export of Refined Oil to Western Europe Data
Table 4.8.2 Order of ARCH Using the White Test for the Export of Refined Oil to Western Europe Data
Table 4.8.3 Normality Test of ARCH (1) Residuals for the Export of Refined Oil to Western Europe Data
Table 4.9.1 The ACF and PACF Values for the Export of Crude Oil to Middle East Data
Table 4.9.2 Order of ARCH Using the White Test for the Export of Crude Oil to Middle East Data
Table 4.9.3 Normality Test of AR (1) Residuals for the Export of Crude Oil to Middle East Data
Table 4.10.1 The ACF and PACF Values for the Export of Refined Oil to Middle East Data
Table 4.10.2 Order of ARCH Using the White Test for the Export of Refined Oil to Middle East Data
Table 4.10.3 Normality Test of AR (1) Residuals for the Export of Refined Oil to Middle East Data
Table 4.11.1 The ACF and PACF Values for the Export of Crude Oil to Africa Data
Table 4.11.2 Order of ARCH Using the White Test for the Export of Crude Oil to Africa Data
Table 4.11.3 Normality Test of AR (1) Residuals for the Export of Crude Oil to Africa Data
Table 4.12.1 The ACF and PACF Values for the Export of Refined Oil to Africa Data
Table 4.12.2 Order of ARCH Using the White Test for the Export of Refined Oil to Africa Data
Table 4.12.3 Normality Test of AR (1) Residuals for the Export of Refined Oil to Africa Data
Table 4.13.1 The ACF and PACF Values for the Export of Crude Oil to Asia and Far East Data
Table 4.13.2 Order of ARCH Using the White Test for the Export of Crude Oil to Asia and Far East Data
Table 4.13.3 Normality Test of ARCH (2) Residuals for the Export of Crude Oil to Asia and Far East Data
Table 4.14.1 The ACF and PACF Values for the Export of Refined Oil to Asia and Far East Data
Table 4.14.2 Order of ARCH Using the White Test for the Export of Refined Oil to Asia and Far East Data
Table 4.14.3 Normality Test of ARCH (1) Residuals for the Export of Refined Oil to Asia and Far East Data
Table 4.15.1 The ACF and PACF Values for the Export of Crude Oil to Oceania Data
Table 4.15.2 Order of ARCH Using the White Test for the Export of Crude Oil to Oceania Data
Table 4.15.3 Normality Test of ARCH (2) Residuals for the Export of Crude Oil to Oceania Data
Table 4.16.1 The ACF and PACF Values for the Export of Refined Oil to Oceania Data
Table 4.16.2 Order of ARCH Using the White Test for the Export of Refined Oil to Oceania Data
Table 4.16.3 Normality Test of ARCH (1) Residuals for the Export of Refined Oil to Oceania Data
Table 4.17 Selected Models for Saudi Arabia Oil Production Data
List of Figures
CHAPTER 1
Figure 1.1 Time Series Plot of Crude Oil Production
Figure 1.2 Time Series Plot of Export of Refined Oil
Figure 1.3 Time Series Plot of Export of Crude Oil to North America
Figure 1.4 Time Series Plot of Export of Refined Oil to North America
Figure 1.5 Time Series Plot of Export of Crude Oil to South America
Figure 1.6 Time Series Plot of Export of Refined Oil to South America
Figure 1.7 Time Series Plot of Export of Crude Oil to Western Europe
Figure 1.8 Time Series Plot of Export of Refined Oil to Western Europe
Figure 1.9 Time Series Plot of Export of Crude Oil to Middle East
Figure 1.10 Time Series Plot of Export of Refined Oil to Middle East
Figure 1.11 Time Series Plot of Export of Crude Oil to Africa
Figure 1.12 Time Series Plot of Export of Refined Oil to Africa
Figure 1.13 Time Series Plot of Export of Crude Oil to Asia and Far East
Figure 1.14 Time Series Plot of Export of Refined Oil to Asia and Far East
Figure 1.15 Time Series Plot of Export of Crude Oil to Oceania
Figure 1.16 Time Series Plot of Export of Refined Oil to Oceania
CHAPTER 3
Figure 3.1 Time Series Plot of Original and Missing Value Estimated Export of Crude Oil to North America
Figure 3.2 Time Series Plot of Original and Missing Value Estimated Export of Refined Oil to North America
Figure 3.3 Time Series Plot of Original and Missing Value Estimated Export of Crude Oil to South America
Figure 3.4 Time Series Plot of Original and Missing Value Estimated Export of Refined Oil to South America
Figure 3.5 Time Series Plot of Original and Missing Value Estimated Export of Crude Oil to Western Europe
Figure 3.6 Time Series Plot of Original and Missing Value Estimated Export of Refined Oil to Western Europe
Figure 3.7 Time Series Plot of Original and Missing Value Estimated Export of Crude Oil to Middle East
Figure 3.8 Time Series Plot of Original and Missing Value Estimated Export of Refined Oil to Middle East
Figure 3.9 Time Series Plot of Original and Missing Value Estimated Export of Crude Oil to Africa
Figure 3.10 Time Series Plot of Original and Missing Value Estimated Export of Refined Oil to Africa
Figure 3.11 Time Series Plot of Original and Missing Value Estimated Export of Crude Oil to Asia and Far East
Figure 3.12 Time Series Plot of Original and Missing Value Estimated Export of Refined Oil to Asia and Far East
Figure 3.13 Time Series Plot of Original and Missing Value Estimated Export of Crude Oil to Oceania
Figure 3.14 Time Series Plot of Original and Missing Value Estimated Export of Refined Oil to Oceania
CHAPTER 4
Figure 4.1.1 The ACF and PACF Values for the Crude Oil Production Data
Figure 4.1.2 Normal Probability Plot of ARCH (1) Residuals for the Crude Oil Production Data
Figure 4.1.3 The LTS ACF and PACF Values for the Crude Oil Production Data
Figure 4.1.4 Normal Probability Plot of ARCH (1) LTS Residuals for the Crude Oil Production Data
Figure 4.2.1 The ACF and PACF Values for Total Export of Refined Oil Data
Figure 4.2.2 Normal Probability Plot of ARCH (1) Residuals for the Total Export of Refined Oil Data
Figure 4.3.1 The ACF and PACF Values for the Export of Crude Oil to North America Data
Figure 4.3.2 Normal Probability Plot of ARCH (2) Residuals for the Export of Crude Oil to North America Data
Figure 4.4.1 The ACF and PACF Values for the Export of Refined Oil to North America Data
Figure 4.4.2 Normal Probability Plot of ARCH (1) Residuals for the Export of Refined Oil to North America Data
Figure 4.5.1 The ACF and PACF Values for the Export of Crude Oil to South America Data
Figure 4.5.2 Normal Probability Plot of ARCH (1) Residuals for the Export of Crude Oil to South America Data
Figure 4.6.1 The ACF and PACF Values for the Export of Refined Oil to South America Data
Figure 4.6.2 Normal Probability Plot of ARCH (1) Residuals for the Export of Refined Oil to South America Data
Figure 4.7.1 The ACF and PACF Values for the Export of Crude Oil to Western Europe Data
Figure 4.7.2 Normal Probability Plot of ARCH (1) Residuals for the Export of Crude Oil to Western Europe Data
Figure 4.8.1 The ACF and PACF Values for the Export of Refined Oil to Western Europe Data
Figure 4.8.2 Normal Probability Plot of ARCH (1) Residuals for the Export of Refined Oil to Western Europe Data
Figure 4.9.1 The ACF and PACF Values for the Export of Crude Oil to Middle East Data
Figure 4.9.2 Normal Probability Plot of AR (1) Residuals for the Export of Crude Oil to Middle East Data
Figure 4.10.1 The ACF and PACF Values for the Export of Refined Oil to Middle East Data
Figure 4.10.2 Normal Probability Plot of AR (1) Residuals for the Export of Refined Oil to Middle East Data
Figure 4.11.1 The ACF and PACF Values for the Export of Crude Oil to Africa Data
Figure 4.11.2 Normal Probability Plot of AR (1) Residuals for the Export of Crude Oil to Africa Data
Figure 4.12.1 The ACF and PACF Values for the Export of Refined Oil to Africa Data
Figure 4.12.2 Normal Probability Plot of AR (1) Residuals for the Export of Refined Oil to Africa Data
Figure 4.13.1 The ACF and PACF Values for the Export of Crude Oil to Asia and Far East Data
Figure 4.13.2 Normal Probability Plot of ARCH (2) Residuals for the Export of Crude Oil to Asia and Far East Data
Figure 4.14.1 The ACF and PACF Values for the Export of Refined Oil to Asia and Far East Data
Figure 4.14.2 Normal Probability Plot of ARCH (1) Residuals for the Export of Refined Oil to Asia and Far East Data
Figure 4.15.1 The ACF and PACF Values for the Export of Crude Oil to Oceania Data
Figure 4.15.2 Normal Probability Plot of ARCH (2) Residuals for the Export of Crude Oil to Oceania Data
Figure 4.16.1 The ACF and PACF Values for the Export of Refined Oil to Oceania Data
Figure 4.16.2 Normal Probability Plot of ARCH (1) Residuals for the Export of Refined Oil to Oceania Data
CHAPTER 1
INTRODUCTION
Saudi Arabia is the largest petroleum producer and exporter in the world. Saudi Arabia possesses
18 per cent of the world’s proven petroleum reserves, which is over 260 billion barrels. The oil
and gas sector accounts for roughly 50 per cent of gross domestic product, and 90 per cent of
export earnings. Saudi refineries produce around 10.78 million barrels of oil per day. Oil was
first struck in Saudi Arabia in March 1938.
1.1 The History of Oil Production in Saudi Arabia
Oil exploration in the Middle East had begun before World War I, but large-scale exploration did
not start in Saudi Arabia until 1933. In that year Standard Oil of California (SOCAL, now
Chevron) was given exploration rights to part of Saudi Arabia. SOCAL set up a
subsidiary, the California Arabian Standard Oil Company (CASOC), to develop the oil
concession. SOCAL also joined forces with the Texas Oil Company; together they formed
CALTEX in 1936 to take advantage of the latter’s formidable marketing network in Africa and
Asia. When CASOC geologists surveyed the concession area, they identified a promising site
and named it Dammam No. 1. Over the next three years the drillers were unsuccessful in
making a commercial strike. They finally struck oil on March 3, 1938 in Dammam No. 7.
This discovery would turn out to be the first of many, eventually revealing the largest source of
crude oil in the world. The name of the operating company in Saudi Arabia was changed to
Arabian American Oil Company (Aramco) in January 1944. Two partners, Standard Oil
Company of New Jersey (later renamed Exxon) and Socony-Vacuum (now Mobil Oil
Company), were added in 1946 to gain investment capital and marketing outlets for the large
reserves being discovered in Saudi Arabia. These four companies were the sole owners of
Aramco until the early 1970s.
Once the existence of oil in quantity was ascertained, the advantages of a pipeline to the
Mediterranean Sea seemed obvious, saving about 3,200 kilometers of sea travel and the transit
fees of the Suez Canal. The Trans-Arabian Pipeline Company (Tapline), a wholly owned
Aramco subsidiary, was formed in 1945, and the pipeline was completed in 1950. Tax problems
with Saudi authorities and transit fees due to Jordan, Syria, and Lebanon plagued Tapline for many
years. The line was damaged and out of operation several times in the 1970s. And while
operating costs of Tapline increased, supertankers were reducing seaborne expenses. By 1975
Tapline was no longer used to export Saudi crude via Sidon. In 1982 the line was again
damaged. In late 1983, Tapline filed formal notice to cease operations in Syria and Lebanon,
although small amounts of crude would reportedly continue, albeit temporarily, to supply a
refinery in Jordan.
The General Petroleum and Mineral Organization (Petromin) was established in 1962 as a public
corporation wholly owned by the Saudi government to develop industries based on petroleum,
natural gas, and minerals by itself or in conjunction with other investors, foreign or domestic.
Although its activities predominantly centered on the country's hydrocarbon resources, Petromin
also explored for and developed other mineral resources.
After two decades of organizational change, the reshaping of the oil industry in Saudi Arabia
neared completion by the late 1980s. During the 1970s and early 1980s, the industry was
transformed from one controlled by foreign oil companies (the Aramco parent companies) to one
owned and operated by the government. Decisions made directly by the ruling family
increasingly became a feature of the industry in the late 1970s. Saudi Arabia's participation in the
Arab oil embargo in 1973 and foreign policy goals were features of this transition. In 1992 the
government had title to all mineral resources in the country (except in the former Divided Zone,
where both Kuwait and Saudi Arabia had interests in the natural resources of the whole zone).
Through the Supreme Oil Council, headed by the king, and the Ministry of Petroleum and
Mineral Resources the government initiated, funded, and implemented all investment decisions.
It also controlled daily operations related to production and pricing.
Saudi Arabia is the world’s largest producer and exporter of oil, and has one quarter of the
world’s known oil reserves – more than 260 billion barrels. Most are located in the Eastern
Province, including the largest onshore field in Ghawar and the largest offshore field at Safaniya
in the Arabian Gulf. Saudi refineries produce around 8 million barrels of oil per day, and there
are plans to increase production to around 12 million barrels per day. As the world’s largest
producer and exporter of oil, Saudi Arabia plays a unique role in the global energy industry. Its
policies on the production and export of oil, natural gas and petroleum products have a major
impact on the energy market, as well as the global economy. Mindful of this responsibility, Saudi
Arabia is committed to ensuring stability of supplies and prices.
1.2 Saudi Arabia Oil Production Data
In our study we consider various types of oil production data from Saudi Arabia.
We consider both crude oil and refined oil and look at the export of these
two types of oil to different regions of the world. The data set is taken from the official website of the Saudi
Arabian Monetary Agency (SAMA), at the following link:
http://www.sama.gov.sa/sites/samaen/ReportsStatistics/statistics/Pages/YearlyStatistics.aspx
We first present time series plots of all sixteen oil production variables of Saudi Arabia from
1962 to 2010 that we consider in our study. These are
Crude Oil Production
Export of Refined Oil
Export of Crude Oil to North America
Export of Refined Oil to North America
Export of Crude Oil to South America
Export of Refined Oil to South America
Export of Crude Oil to Western Europe
Export of Refined Oil to Western Europe
Export of Crude Oil to Middle East
Export of Refined Oil to Middle East
Export of Crude Oil to Africa
Export of Refined Oil to Africa
Export of Crude Oil to Asia and Far East
Export of Refined Oil to Asia and Far East
Export of Crude Oil to Oceania
Export of Refined Oil to Oceania
Figure 1.1 Time Series Plot of Crude Oil Production (y-axis: Crude_Production_mil_barl; x-axis: Year, 1962 to 2010)
Figure 1.2 Time Series Plot of Export of Refined Oil (y-axis: Total_refine_exp_mil_barl; x-axis: Year, 1962 to 2010)
Figure 1.3 Time Series Plot of Export of Crude Oil to North America (y-axis: Crude_North_america; x-axis: Year, 1962 to 2010)
Figure 1.4 Time Series Plot of Export of Refined Oil to North America (y-axis: Refine_North_america; x-axis: Year, 1962 to 2010)
Figure 1.5 Time Series Plot of Export of Crude Oil to South America (y-axis: Crude_South_america; x-axis: Year, 1962 to 2010)
Figure 1.6 Time Series Plot of Export of Refined Oil to South America (y-axis: Refine_South_america; x-axis: Year, 1962 to 2010)
Figure 1.7 Time Series Plot of Export of Crude Oil to Western Europe (y-axis: Crude_Western_europe; x-axis: Year, 1962 to 2010)
Figure 1.8 Time Series Plot of Export of Refined Oil to Western Europe (y-axis: Refine_Western_europe; x-axis: Year, 1962 to 2010)
Figure 1.9 Time Series Plot of Export of Crude Oil to Middle East (y-axis: Crude_Middle_east; x-axis: Year, 1962 to 2010)
Figure 1.10 Time Series Plot of Export of Refined Oil to Middle East (y-axis: Refine_Middle_east; x-axis: Year, 1962 to 2010)
Figure 1.11 Time Series Plot of Export of Crude Oil to Africa (y-axis: Crude_Africa; x-axis: Year, 1962 to 2010)
Figure 1.12 Time Series Plot of Export of Refined Oil to Africa (y-axis: Refine_Africa; x-axis: Year, 1962 to 2010)
Figure 1.13 Time Series Plot of Export of Crude Oil to Asia and Far East (y-axis: Crude_Asia_and_Far_east; x-axis: Year, 1962 to 2010)
Figure 1.14 Time Series Plot of Export of Refined Oil to Asia and Far East (y-axis: Refine_Asia_and_Far_east; x-axis: Year, 1962 to 2010)
Figure 1.15 Time Series Plot of Export of Crude Oil to Oceania (y-axis: Crude_Oceania; x-axis: Year, 1962 to 2010)
Figure 1.16 Time Series Plot of Export of Refined Oil to Oceania (y-axis: Refine_Oceania; x-axis: Year, 1962 to 2010)
The time series plots reveal several interesting features. Most of the
plots show some kind of volatility, which indicates that ARCH models are perhaps more
appropriate for these variables. Whatever model we consider, there is always a chance that
the data set contains a few outliers. It is also necessary to check the normality assumption of
the errors, which is the key to any kind of statistical inference drawn from these data. Figures 1.3 –
1.16 show that one observation, for the year 1987, is consistently missing. We searched for
the reason, but could not find any discussion, either on any authentic
website or in any article, of why the Saudi Arabian government did not publish the data for
1987. So we simply treat this case as a missing value problem. For the ‘Export of Refined Oil to
Africa’ another observation (for the year 1982) is missing, as shown in Figure 1.12. Before fitting
any kind of econometric or time series model to these data we need to estimate the missing
values.
1.3 Outline of the Study
We organize this thesis in the following way. In Chapter 2, we introduce the methodologies
we use in our research: diagnostic and robust methods of outlier detection,
estimation of missing values by the robust EM algorithm, ARCH models and the determination of the order
of ARCH in time series using graphical and analytical tests, and tests for normality of errors.
Since almost all variables that we consider in our study have missing observations, in Chapter 3
we employ the EM algorithm for estimating the missing values. It is now well known that
outliers can adversely affect the missing value estimation procedure, and for this reason we
employ the robust EM algorithm, in which the outliers are first identified by the least median of
squares (LMS) or the least trimmed squares (LTS) method before the EM algorithm is applied. In Chapter
4, in order to determine the most appropriate ARCH model for each of the sixteen Saudi Arabia oil
production variables, we employ three commonly used tests, the Breusch-Pagan, the
Goldfeld-Quandt, and the White tests, together with the Ljung-Box test based on the
autocorrelation function (ACF) and the partial autocorrelation function (PACF). We observe
that different ARCH models are adequate for different variables. In Chapter 5, we draw
conclusions on our current research and suggest appropriate ARCH models for all sixteen
Saudi Arabia oil production variables. In this chapter we also outline our directions for future
research.
CHAPTER 2
ARCH/GARCH MODELS, OUTLIERS AND ROBUSTNESS,
TESTS FOR NORMALITY AND ESTIMATION OF MISSING
VALUES IN TIME SERIES
In this chapter we discuss different aspects of data analysis techniques useful in time series
analysis. Although the prime topic of our discussion will be regression analysis, we will also
consider some other important topics that we are going to use in our study. A time series is a
chronological sequence of observations on a particular variable. A time series model accounts for
patterns of the past movement of a variable and uses that information to predict its future
movements, i.e., it is a sophisticated method of extrapolating data.
2.1 ARCH and GARCH Models

The use of regression analysis in econometrics and time series is increasing day by day.
Researchers also use this technique to forecast and to comment on the goodness of fit of their models. In
regression analysis we usually use least squares to find the contribution of each
explanatory variable to the response variable. All the assumptions of the method, including equal variance of
the error terms, are required to be met. Equality of variance of the error terms is called
homoscedasticity. There are many situations where this assumption is violated and the least squares
model does not work well. Violation of this assumption is called heteroskedasticity. In any area
of research, including econometrics, error terms may be larger at some points than at others, and
the volatility may not be explained by the explanatory variables. In such situations ordinary least
squares regression may not fit well and the variances of the estimated coefficients
may be unreasonably high. In econometric or time series data, the unequal
variation of the error terms is usually related to the preceding time points. Engle (1982) described it as
time-varying volatility clustering and proposed a new method to address this kind
of heteroskedasticity. He proposed that the variance of the current error term may be set as a
function of the error terms of previous time periods, and introduced the Autoregressive Conditional
Heteroskedasticity (ARCH) models to address such conditions.
The ARCH model allows the conditional variance of the error terms to
change over time as a function of past error terms. In an ARCH model we have the flexibility to use
any lag structure. Bollerslev (1986) discussed the risk that a totally free lag
distribution may lead to violation of the non-negativity constraints. He then proposed the
GARCH (Generalized Autoregressive Conditionally Heteroskedastic) model, which allows much
more flexibility in the lag structure. In a GARCH model the variance of the error term is a
function of past squared error terms and of its own values at previous time points.
Time series data can be modeled in different forms for different stochastic processes. To address
the variations in the process we may consider autoregressive (AR) models, integrated (I)
models and moving average (MA) models. All of these models assume
linear dependence on the previous data points. The general AR(p) model has the following form:
$$Y_t = \phi_0 + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \varepsilon_t \tag{2.1}$$
where the term $\varepsilon_t$ is the source of randomness and is called white noise. It is assumed to satisfy the
following assumptions:
$E[\varepsilon_t] = 0$,
$E[\varepsilon_t^2] = \sigma^2$ and
$E[\varepsilon_t \varepsilon_s] = 0$ for all $t \neq s$.
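As an illustration only, the following Python sketch simulates a zero-mean AR(p) series driven by Gaussian white noise and checks the white-noise assumptions numerically. The function names `simulate_ar` and `sample_acf`, the coefficient 0.6, the sample size and the seed are hypothetical choices made for this example; they are not taken from the Saudi Arabia data or from the software used in this thesis.

```python
import numpy as np

def simulate_ar(phi, n, sigma=1.0, burn_in=200, seed=0):
    """Simulate n observations of a zero-mean AR(p) process (illustrative helper).

    phi   : AR coefficients (phi_1, ..., phi_p)
    sigma : standard deviation of the white-noise term
    """
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    total = n + burn_in
    eps = rng.normal(0.0, sigma, size=total)  # white noise: mean 0, constant variance
    y = np.zeros(total)
    for t in range(p, total):
        # y[t - p:t][::-1] is (y[t-1], y[t-2], ..., y[t-p])
        y[t] = phi @ y[t - p:t][::-1] + eps[t]
    return y[burn_in:], eps[burn_in:]

def sample_acf(x, lag):
    """Sample autocorrelation of x at a positive lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.sum(x[lag:] * x[:-lag]) / np.sum(x * x))

# Assumed example: AR(1) with phi_1 = 0.6 and unit noise variance.
y, eps = simulate_ar(phi=[0.6], n=2000)
print("noise mean       :", round(float(eps.mean()), 3))
print("noise variance   :", round(float(eps.var()), 3))
print("noise ACF, lag 1 :", round(sample_acf(eps, 1), 3))
print("series ACF, lag 1:", round(sample_acf(y, 1), 3))
```

In such a simulation the noise should show a mean near zero, a variance near the chosen value and a negligible lag-1 autocorrelation, while the series itself should show a lag-1 autocorrelation close to the AR coefficient.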
In modeling time series data we need to meet the above assumptions, one of which
is that the model possesses constant error variance (white noise). To be sure that the selected time
series model is valid, we must test this assumption (i.e., $E[\varepsilon_t^2] = \sigma^2$ for the error term $\varepsilon_t$). According to Granger and
Ramanathan (1984), there is really no reason to believe that the errors are white noise without
testing. Engle (1982) wrote that under some circumstances the “error variance may change
over time and be predicted by past forecast errors.” In financial econometrics and in
the analysis of inflation, autoregressive heteroskedasticity is very common and must be
considered seriously [Engle (1982)]. In regression models where the error
variance is a function of the time lag, an autoregressive model with conditional heteroskedastic
(ARCH) error variance may be the appropriate model for that risk or volatility.
2.1.1 ARCH Models

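ARCH models have error variances that can be expressed in a simple functional form. If $Y_t$ is a
model that has a variance, $h_t$, that is conditional on the error variance at previous time periods,
that model, with its conditional variance, can be expressed as:
$$\varepsilon_t \mid \Omega_{t-1} \sim N(0, h_t) \tag{2.2}$$
and
$$h_t = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 \tag{2.3}$$
where $\varepsilon_t$ is the error term of the model for $Y_t$ and $\Omega_{t-1}$ is the information available at time $t-1$,
or when the model is of order q, that is, ARCH(q):
$$h_t = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \alpha_2 \varepsilon_{t-2}^2 + \cdots + \alpha_q \varepsilon_{t-q}^2 \tag{2.4}$$
The generalized version of the ARCH model was first used by Bollerslev (1986). In the
GARCH model the variance is a function of previous conditional variances and also of previous
innovations. The GARCH(q, p) model can be expressed as:
$$h_t = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j h_{t-j} \tag{2.5}$$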
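To make the recursion concrete, the following Python sketch simulates errors whose conditional variance follows the standard ARCH/GARCH form, that is, a constant plus weighted past squared errors plus (optionally) weighted past conditional variances. It is a minimal sketch under stated assumptions: the function `simulate_garch`, the parameter values 0.3 and 0.4, the Gaussian shocks and the seed are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_garch(alpha0, alpha, beta=(), n=5000, burn_in=500, seed=1):
    """Simulate errors e_t with conditional variance
    h_t = alpha0 + sum_i alpha[i-1] * e_{t-i}^2 + sum_j beta[j-1] * h_{t-j}.

    With beta empty this is a pure ARCH(q) process. Illustrative helper only;
    it assumes len(alpha) >= 1 and sum(alpha) + sum(beta) < 1.
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    q, p = len(alpha), len(beta)
    m = max(q, p)
    total = n + burn_in
    h0 = alpha0 / (1.0 - alpha.sum() - beta.sum())  # unconditional variance
    h = np.full(total, h0)
    e = np.zeros(total)
    z = rng.standard_normal(total)                  # standard normal shocks
    e[:m] = np.sqrt(h0) * z[:m]                     # start-up values
    for t in range(m, total):
        h[t] = alpha0 + alpha @ (e[t - q:t][::-1] ** 2)
        if p:
            h[t] += beta @ h[t - p:t][::-1]
        e[t] = np.sqrt(h[t]) * z[t]
    return e[burn_in:], h[burn_in:]

def sample_acf(x, lag):
    """Sample autocorrelation of x at a positive lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.sum(x[lag:] * x[:-lag]) / np.sum(x * x))

# Assumed example: ARCH(1) with alpha0 = 0.3 and alpha1 = 0.4.
e, h = simulate_garch(alpha0=0.3, alpha=[0.4])
print("sample variance of e  :", round(float(e.var()), 3))
print("alpha0 / (1 - alpha1) :", 0.3 / (1 - 0.4))
print("ACF of e, lag 1       :", round(sample_acf(e, 1), 3))
print("ACF of e^2, lag 1     :", round(sample_acf(e ** 2, 1), 3))
```

The errors themselves should appear nearly uncorrelated while their squares are positively autocorrelated, which is the volatility clustering that ARCH models are designed to capture. Passing a non-empty `beta`, for example `beta=[0.5]`, would give a GARCH(1, 1) series under the same assumptions.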
2.1.2 Testing for ARCH
Before fitting an ARCH model we need to know whether an ARCH effect is present in the data.
A number of detection methods are now available in the literature [see Pindyck and
Rubinfeld (1997), Greene (1997)]. They can be broadly categorized into
two types: graphical tests and analytical tests.
Graphical Tests
The simplest graphical test is to plot the data against time, which is popularly known
as the time series (TS) plot. If the ARCH effect is present, this plot is expected to show a
pattern: one is likely to find periods of high volatility followed by periods of low volatility, and so on.
The ARCH effect can also be visible if we plot residuals against time. A plot containing periods of large
residuals followed by periods of small residuals indicates the existence of the ARCH effect. Two
other graphs are available for the same purpose: a plot of the time series values against their corresponding lag
values and a plot of the residuals against the corresponding lagged time series values. For both plots, a linear
pattern will indicate the presence of ARCH.
Analytical Tests
The graphical methods are simple and very easy to understand, but they may often produce ambiguous
pictures, and analysts may come up with conflicting conclusions. That is why more formal
analytical tests are required. Several analytical methods are available to test for the ARCH effect. In each
case we wish to test the null hypothesis of no ARCH. The specific alternative hypothesis
against which the null hypothesis is tested depends on the estimation procedure that is considered
to yield the most desirable correction for ARCH.
White’s general ARCH Test
The White test is considered to be the most popular test for detecting the ARCH effect and has been in
use from the very beginning [see Engle (1982)]. Let us consider an ARCH(1) model as defined in
(2.1); the generalization to the k-variable model is straightforward:
The White test proceeds as follows:
Step 1. Given the data, we estimate the parameters given in (2.6) and compute the residuals $\hat{u}_t$.
Step 2. We then run the following (auxiliary) regression for the necessary order, i.e., for the pth order
ARCH model we fit the regression model
$\hat{u}_t^2 = \alpha_0 + \alpha_1 \hat{u}_{t-1}^2 + \alpha_2 \hat{u}_{t-2}^2 + \cdots + \alpha_p \hat{u}_{t-p}^2 + v_t$ (2.6)
In other words, the squared residuals ($\hat{u}_t^2$) from the original regression are regressed on their own
lagged values. Higher powers of regressors can also be introduced. Note that there is a constant
term in this equation even though the original regression may or may not contain one. We obtain $R^2$
from this (auxiliary) regression.
Step 3. Under the null hypothesis that there is no ARCH, it can be shown that the sample size n times the $R^2$
obtained from the auxiliary regression asymptotically follows the chi-square distribution with p degrees
of freedom, i.e.,
$n R^2 \sim \chi^2_{(p)}$ (2.7)
Step 4. If the chi-square value obtained in equation (2.7) exceeds the critical value at the chosen level
of significance, this indicates the existence of ARCH in the data.
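Steps 2 to 4 can be sketched in Python as follows. This is a minimal illustration using NumPy only, not the code used in this study; the function name `arch_lm_statistic`, the simulated ARCH(1) residuals and their parameter values are assumptions. The sample size used in $nR^2$ is taken here to be the number of observations available to the auxiliary regression.

```python
import numpy as np

def arch_lm_statistic(resid, p=1):
    """Return n * R^2 from regressing squared residuals on p of their own lags."""
    u2 = np.asarray(resid, dtype=float) ** 2
    y = u2[p:]                                        # u_t^2 for t = p, ..., n-1
    lags = [u2[p - j:len(u2) - j] for j in range(1, p + 1)]
    X = np.column_stack([np.ones(len(y))] + lags)     # constant + lagged squares
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ss_res = np.sum((y - X @ beta) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return len(y) * (1.0 - ss_res / ss_tot)

# Illustrative ARCH(1) residuals with h_t = 1 + 0.5 * u_{t-1}^2 (assumed values).
rng = np.random.default_rng(1)
u = np.zeros(2000)
for t in range(1, len(u)):
    u[t] = np.sqrt(1.0 + 0.5 * u[t - 1] ** 2) * rng.standard_normal()

lm_stat = arch_lm_statistic(u, p=1)
# Step 4: compare with the chi-square critical value with p degrees of freedom
# (3.841 at the 5% level when p = 1).
reject_no_arch = lm_stat > 3.841
```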
Goldfeld-Quandt ARCH Test
Let us consider the model (2.2) once again. Here we wish to test the null hypothesis of
homoscedasticity against the alternative hypothesis that
$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2$ (2.8)
The Goldfeld-Quandt test procedure involves the calculation of two least squares regression lines,
one using data thought to be associated with low variance errors and the other using data associated with
high variance errors. If the residual variances associated with the two regression lines are
approximately equal, the homoscedasticity assumption cannot be rejected, but if the residual
variance increases substantially, it is possible to reject the null hypothesis. The test can be carried
out in the following manner:
Step 1. Fit the two-variable model (2.6) and compute the residuals $\hat{u}_t$. Consider the squared
residuals $\hat{u}_t^2$ as the dependent variable and their one-period lag $\hat{u}_{t-1}^2$ as the independent
variable.
Step 2. Order the data by the magnitude of the independent variable $\hat{u}_{t-1}^2$, which is thought to be
related to the error variance.
Step 3. Omit the middle d observations; d might be chosen, for example, to be approximately
one-fifth of the total sample size.
Step 4. Fit two separate regressions, the first (indicated by subscript 1) for the portion of the data
associated with low values of $\hat{u}_{t-1}^2$ and the second (indicated by subscript 2) for the portion associated
with high values of $\hat{u}_{t-1}^2$. Each regression will involve (n-d)/2 pieces of data with [(n-d)/2] – 2 degrees of
freedom. The number d must be small enough to ensure that sufficient degrees of freedom are
available to allow for the proper estimation of each of the separate regressions.
Step 5. Calculate the residual sum of squares for each regression: $ESS_1$, associated with low
values of $\hat{u}_{t-1}^2$, and $ESS_2$, associated with high values of $\hat{u}_{t-1}^2$.
Step 6. Assuming that the error process is normally distributed (and no serial correlation is
present), the statistic $ESS_2 / ESS_1$ will be distributed as an F statistic with (n-d-2k)/2 degrees of
freedom in both the numerator and the denominator, where k is the number of parameters estimated in
each regression (here k = 2). We can reject the null hypothesis at a
chosen level of significance if the calculated statistic is greater than the critical value of the F
distribution.
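The six steps above can be sketched in Python as follows. This is a hedged illustration rather than the implementation used in this study: the function names, the default of omitting one-fifth of the observations and the simulated ARCH(1) residuals are all assumptions, and because of integer division the number of omitted middle observations may differ from d by one.

```python
import numpy as np

def line_rss(x, y):
    """Residual sum of squares of the least squares line y = a + b x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def goldfeld_quandt_arch(resid, drop_frac=0.2):
    """Return (F, df) for the ordered split regressions of u_t^2 on u_{t-1}^2."""
    u2 = np.asarray(resid, dtype=float) ** 2
    y, x = u2[1:], u2[:-1]                  # Step 1: dependent and lagged variable
    order = np.argsort(x)                   # Step 2: order by the lagged squares
    x, y = x[order], y[order]
    n = len(y)
    d = int(round(drop_frac * n))           # Step 3: middle observations to omit
    m = (n - d) // 2                        # Step 4: size of each sub-sample
    ess1 = line_rss(x[:m], y[:m])           # Step 5: low values of u_{t-1}^2
    ess2 = line_rss(x[n - m:], y[n - m:])   #         high values of u_{t-1}^2
    return ess2 / ess1, m - 2               # Step 6: F ratio and its df (k = 2)

# Illustrative ARCH(1) residuals with h_t = 1 + 0.5 * u_{t-1}^2 (assumed values).
rng = np.random.default_rng(2)
u = np.zeros(1000)
for t in range(1, len(u)):
    u[t] = np.sqrt(1.0 + 0.5 * u[t - 1] ** 2) * rng.standard_normal()

f_stat, df = goldfeld_quandt_arch(u)
```

The value of `f_stat` is then compared with the critical value of the F distribution with `df` degrees of freedom in both the numerator and the denominator.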
The Goldfeld-Quandt test can easily be applied to the general linear model by ordering the
observations by the magnitude of one of the independent variables. The test works because it
allows for separate regression estimation on the low-variance and high-variance portions of the data.
However, there is an important cost involved. Because no restrictions are placed on the regression
parameters (or on the error variances) in each of the two regression runs, statistical power is
lost. A more powerful test (one that has a smaller probability of Type II error) would take into account the
information that the regression parameters are identical for both sets of data and that only the
error variance has changed. The main shortcoming of this test, however, is that it can only detect whether or
not the data set is affected by ARCH and cannot say anything about its order.
Breusch-Pagan ARCH Test
The Goldfeld-Quandt test is a natural test to apply when one can order the observations in terms of the
increasing variance of the error term (or one independent variable). An alternative test, which does not
require such an ordering and is easy to apply, is the Breusch-Pagan test. Let us consider the model (2.2),
which includes a general assumption about the relationship between the true error variance and the
lagged squared errors $u_{t-1}^2, \ldots, u_{t-p}^2$:
$\sigma_t^2 = f(\alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \cdots + \alpha_p u_{t-p}^2)$ (2.9)
Equation (2.4) provides the specification of the form taken by autoregressive conditional
heteroscedasticity (ARCH) if it is indeed present; f(.) represents a general function that allows, for
example, for both non-linear and logarithmic forms. Each of $u_{t-1}^2, u_{t-2}^2, \ldots, u_{t-p}^2$ could be an
independent variable, or could represent a group of independent variables.
Step 1. To test for ARCH, we first calculate the least squares residuals $\hat{u}_t$ from the regression in
equation (2.9). We consider the squared residuals $\hat{u}_t^2$ as the dependent variable and their lagged values,
i.e., $\hat{u}_{t-1}^2, \hat{u}_{t-2}^2, \ldots, \hat{u}_{t-p}^2$, as independent variables.
Step 2. We also compute the mean of the squared residuals
$\hat{\sigma}^2 = \frac{1}{n} \sum_{t=1}^{n} \hat{u}_t^2$ (2.10)
and construct the variables $p_t$ defined as
$p_t = \hat{u}_t^2 / \hat{\sigma}^2$ (2.11)
Step 3. Now we run the following regression:
$p_t = \alpha_0 + \alpha_1 \hat{u}_{t-1}^2 + v_t$ (2.12)
Step 4. If the error term $v_t$ in equation (2.12) is normally distributed and there exists no
heteroscedasticity, then half of the regression sum of squares provides a suitable test statistic.
Specifically, under the null hypothesis of homoscedasticity,
$\Phi = RSS / 2 = (TSS - ESS) / 2 \sim \chi^2_{m-1}$ (2.13)
when there are m independent variables (including the constant term). Here RSS, TSS and ESS denote the
regression, total and error sums of squares, respectively. Therefore, if in an application the
quantity $\Phi$ exceeds the critical $\chi^2$ value at the chosen level of significance, one can reject the
hypothesis of no ARCH.
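A minimal Python sketch of Steps 1 to 4 is given below. The function name `breusch_pagan_arch` and the simulated residuals are assumptions; the sketch also allows p lags in the regression of Step 3, of which (2.12) is the case p = 1, so that the statistic has p = m - 1 degrees of freedom.

```python
import numpy as np

def breusch_pagan_arch(resid, p=1):
    """Return (Phi, df): half the regression sum of squares of p_t on lagged u^2."""
    u2 = np.asarray(resid, dtype=float) ** 2
    pt = u2 / u2.mean()                               # (2.10) and (2.11)
    y = pt[p:]
    lags = [u2[p - j:len(u2) - j] for j in range(1, p + 1)]
    X = np.column_stack([np.ones(len(y))] + lags)     # constant + lagged squares
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    reg_ss = np.sum((fitted - y.mean()) ** 2)         # regression SS = TSS - ESS
    return reg_ss / 2.0, p

# Illustrative ARCH(1) residuals with h_t = 1 + 0.5 * u_{t-1}^2 (assumed values).
rng = np.random.default_rng(5)
u = np.zeros(2000)
for t in range(1, len(u)):
    u[t] = np.sqrt(1.0 + 0.5 * u[t - 1] ** 2) * rng.standard_normal()

phi, df = breusch_pagan_arch(u, p=1)
reject_no_arch = phi > 3.841     # 5% chi-square critical value with 1 df
```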
2.1.3 Parameter Estimation in ARCH Model

In the pth-order linear case, the specification and likelihood for the ARCH model are given by:
$l = \frac{1}{n} \sum_{t=1}^{n} l_t$ (2.14)
where $x_t$ may include lagged dependent and exogenous variables and an irrelevant constant has
been omitted from the likelihood. This likelihood function can be maximized with respect to the
unknown parameters using ordinary least squares techniques.
2.1.4 Diagnostic Checking

In this section we discuss a few well-known diagnostics which are frequently used in ARCH
models, especially for the determination of the order of ARCH. A good number of them are
based on the autocorrelation function because it provides a partial description of the process for
modeling purposes. The autocorrelation function tells us how much correlation there is between
neighboring data points in the series $y_t$. We define the autocorrelation at lag k as
$\rho_k = \frac{Cov(y_t, y_{t+k})}{\sqrt{V(y_t) V(y_{t+k})}}$ (2.15)
The Sample Autocorrelation Function
In practice, we use an estimate of the autocorrelation function, called the sample autocorrelation
(SAC) function
$r_k = \frac{\sum_{t=1}^{T-k} (y_t - \bar{y})(y_{t+k} - \bar{y})}{\sum_{t=1}^{T} (y_t - \bar{y})^2}$ (2.16)
A geometrically declining pattern of the sample autocorrelations indicates the presence of an ARCH
effect in the model.
The t-test based on SAC
The standard error of $r_k$ is given by
$SE(r_k) = \begin{cases} \sqrt{1/n} & \text{if } k = 1 \\ \sqrt{\left(1 + 2 \sum_{i=1}^{k-1} r_i^2\right) / n} & \text{if } k > 1 \end{cases}$ (2.17)
The t-statistic for testing the hypothesis $H_0: \rho_k = 0$ for k > 0 is defined as
$T = r_k / SE(r_k)$ (2.18)
and this test is significant when |T| > 2.
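The sample autocorrelation (2.16), its standard error (2.17) and the t-statistic (2.18) can be computed as in the following Python sketch. The function names and the five-point example series are illustrative assumptions.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag as in (2.16)."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    denom = d @ d
    return np.array([(d[:-k] @ d[k:]) / denom for k in range(1, max_lag + 1)])

def acf_t_statistics(y, max_lag):
    """Return (r_k, SE(r_k), T_k) for k = 1, ..., max_lag as in (2.17)-(2.18)."""
    r = sample_acf(y, max_lag)
    n = len(y)
    # Running sum r_1^2 + ... + r_{k-1}^2, which is zero for k = 1.
    prev = np.concatenate([[0.0], np.cumsum(r[:-1] ** 2)])
    se = np.sqrt((1.0 + 2.0 * prev) / n)
    return r, se, r / se

r, se, t = acf_t_statistics([1, 2, 3, 4, 5], max_lag=2)
```

For this short series the deviations from the mean are -2, -1, 0, 1, 2, so $r_1 = 4/10 = 0.4$ and $r_2 = -1/10 = -0.1$; neither t-statistic exceeds 2 in magnitude.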
Box and Pierce Test and Ljung and Box Test
To test the joint hypothesis that all the autocorrelation coefficients are zero we use a test statistic
introduced by Box and Pierce. Here the null hypothesis is
$H_0: \rho_1 = \rho_2 = \cdots = \rho_k = 0$.
Box and Pierce show that the appropriate statistic for testing this null hypothesis is
$Q = n \sum_{i=1}^{k} r_i^2$ (2.19)
which is distributed as chi-square with k degrees of freedom.
A slight modification of the Box-Pierce test was suggested by Ljung and Box, which is
known as the Ljung-Box Q (LBQ) test, defined as
$Q = n(n + 2) \sum_{i=1}^{k} (n - i)^{-1} r_i^2$ (2.20)
Thus, if the calculated value of Q is greater than, say, the critical value at the 5% level, we reject
at that level the hypothesis that the true autocorrelation coefficients are all zero.
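Both Q statistics can be computed directly from the sample autocorrelations, as in the Python sketch below. The function name `q_statistics` and the example series are illustrative assumptions; the Ljung-Box weights use $(n - i)^{-1}$ for the i-th lag.

```python
import numpy as np

def q_statistics(y, k):
    """Return the Box-Pierce and Ljung-Box Q statistics based on the first k lags."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    d = y - y.mean()
    r = np.array([(d[:-i] @ d[i:]) / (d @ d) for i in range(1, k + 1)])
    lags = np.arange(1, k + 1)
    box_pierce = n * np.sum(r ** 2)                          # (2.19)
    ljung_box = n * (n + 2) * np.sum(r ** 2 / (n - lags))    # (2.20)
    return box_pierce, ljung_box

bp, lb = q_statistics([1, 2, 3, 4, 5], k=2)
```

With $r_1 = 0.4$ and $r_2 = -0.1$ this gives $Q = 5(0.16 + 0.01) = 0.85$ for Box-Pierce and $Q = 35(0.16/4 + 0.01/3) \approx 1.517$ for Ljung-Box; each is compared with the chi-square critical value with k degrees of freedom.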
The Partial Autocorrelation Function
The sample autocorrelation can indicate whether any ARCH effect is present in the
data but cannot tell much about the order of ARCH. The partial autocorrelation function is often
used to determine the order of an ARCH model. For an autoregressive process of order p, the
covariance with displacement k is determined from
$\gamma_k = E[y_{t-k} (\phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \epsilon_t)]$ (2.21)
which gives
$\gamma_0 = \phi_1 \gamma_1 + \phi_2 \gamma_2 + \cdots + \phi_p \gamma_p + \sigma_\epsilon^2$ (2.22)
$\gamma_1 = \phi_1 \gamma_0 + \phi_2 \gamma_1 + \cdots + \phi_p \gamma_{p-1}$
$\vdots$
$\gamma_p = \phi_1 \gamma_{p-1} + \phi_2 \gamma_{p-2} + \cdots + \phi_p \gamma_0$ (2.23)
The above equations also give a set of p equations, known as the Yule-Walker equations, to
determine the first p values of the autocorrelation function:
$\rho_1 = \phi_1 + \phi_2 \rho_1 + \cdots + \phi_p \rho_{p-1}$
$\vdots$
$\rho_p = \phi_1 \rho_{p-1} + \phi_2 \rho_{p-2} + \cdots + \phi_p$ (2.24)
The solution of the Yule-Walker equations requires knowledge of p. Therefore we solve
these equations for successive values of p. We begin by hypothesizing that p = 1. The Yule-Walker
equation then reduces to $\rho_1 = \phi_1$, so the sample autocorrelation $r_1$ gives the estimate $\hat{\phi}_1$. If this
value is significantly different from 0,
we know that the autoregressive process is at least of order 1. Next we consider the hypothesis that
p = 2. We solve the Yule-Walker equations for p = 2 and obtain a new set of estimates for $\phi_1$
and $\phi_2$. If $\hat{\phi}_2$ is significantly different from 0, we may conclude that the process is at least of order
2. Otherwise we conclude that the process is of order 1. We repeat this process for successive
values of p. We call the series $\hat{\phi}_1, \hat{\phi}_2, \ldots$ obtained in this way (the last coefficient estimated at each
order) the partial autocorrelation function. If the true order of the
process is p, we should observe that $\hat{\phi}_j \approx 0$ for j > p.
To test whether a particular $\phi_j$ is zero, we can use the fact that its estimate is approximately
normally distributed with mean 0 and variance 1/n. Hence we can check whether it is statistically
significant at, say, the 5% level by determining whether it exceeds $2/\sqrt{n}$ in magnitude.
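The successive solution of the Yule-Walker equations can be sketched in Python as follows. The function names and the simulated AR(1) series (coefficient 0.6) are illustrative assumptions; the sketch simply replaces the autocorrelations in (2.24) by their sample values and keeps the last coefficient at each order.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations r_0 = 1, r_1, ..., r_max_lag."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    denom = d @ d
    return np.array([1.0] + [(d[:-k] @ d[k:]) / denom for k in range(1, max_lag + 1)])

def pacf_yule_walker(y, max_lag):
    """Partial autocorrelations from Yule-Walker systems of order 1, ..., max_lag."""
    r = sample_acf(y, max_lag)
    pacf = []
    for p in range(1, max_lag + 1):
        # Coefficient matrix of (2.24): entry (i, j) is r_{|i - j|}.
        R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
        phi = np.linalg.solve(R, r[1:p + 1])
        pacf.append(phi[-1])                 # keep the last coefficient
    return np.array(pacf)

# Illustrative AR(1) series y_t = 0.6 y_{t-1} + e_t (assumed values).
rng = np.random.default_rng(3)
y = np.zeros(3000)
for t in range(1, len(y)):
    y[t] = 0.6 * y[t - 1] + rng.standard_normal()

pacf = pacf_yule_walker(y, max_lag=4)
significant = np.abs(pacf) > 2 / np.sqrt(len(y))   # the 2/sqrt(n) rule
```

For a series of order 1 one would expect only the first partial autocorrelation to be clearly away from zero, which is the pattern summarized in Table 2.1.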
Table 2.1: Specification of ARCH Models
Model       ACF                              PACF
ARCH (1)    Geometric decline from lag 1     Zero after lag 1
ARCH (2)    Geometric decline from lag 2     Zero after lag 2
…           …                                …
ARCH (p)    Geometric decline from lag p     Zero after lag p
The determination of the order of ARCH using the ACF and PACF values is summarized in the
above table.
2.1.5 GARCH Model

Autoregressive conditionally heteroscedastic (ARCH) models were introduced by Engle (1982)
and their GARCH (generalized ARCH) extension is due to Bollerslev (1986). In these models,
the key concept is the conditional variance, that is, the variance conditional on the past. In the
classical GARCH models, the conditional variance is expressed as a linear function of the
squared past values of the series.
In general, a GARCH (p, q) process satisfies the following two conditions:
i) E ( .
ii) There exist constants
(2.25)
2.1.6 Estimation of GARCH Models

The quasi-maximum likelihood (QML) method is particularly relevant for GARCH models
because it provides consistent and asymptotically normal estimators for strictly stationary
GARCH processes under mild regularity conditions, but with no moment assumptions on the
observed process. By contrast, least-squares methods require
moments of order at least 4. The QML method uses an iterative procedure for computing the
Gaussian log-likelihood, conditionally on fixed or random initial values. The likelihood is
written as if the law of the variables $\eta_t$ were Gaussian N(0, 1) (hence the term pseudo- or
quasi-likelihood), but this assumption is not necessary for the strong consistency of the estimator.
To write the likelihood of the model, a distribution must be specified for the iid variables $\eta_t$.
Here we do not make any assumption on the distribution of these variables, but we work with a
function, called the (Gaussian) quasi-likelihood, which, conditionally on some initial values,
coincides with the likelihood when the $\eta_t$ are distributed as standard Gaussian. The conditional
Gaussian quasi-likelihood is:
(2.26)
The quasi-maximum likelihood estimator of the parameter vector is defined as any measurable solution of
the maximization of (2.26). One can use the Ljung-Box and Jarque-Bera tests for diagnostic checking.
2.2 Outliers and Robustness

Outlier analysis has become an essential part of data analysis. Any surprising observation in
the data set is called an outlier. According to Barnett and Lewis (1994), ‘We shall define an
outlier in a set of data to be an observation (or subset of observations) which appears to be
inconsistent with the remainder of that set of data.’ Outliers do not inevitably ‘perplex’ or
‘mislead’; they are not necessarily ‘bad’ or ‘erroneous’, and the experimenter may be tempted in
some situations not to reject an outlier but to welcome it as an indication of some unexpectedly
useful industrial treatment or surprisingly successful agricultural variety. Outliers are an
empirical reality. Hampel et al. (1986) claim that a routine data set typically contains about
1-10% outliers, and even the highest quality data set cannot be guaranteed free of outliers.
2.2.1 Consequences of Outliers

One immediate consequence of the presence of outliers is that they may cause apparent
non-normality, and the entire classical inferential procedure might break down in their presence.
2.2.2 Sources of Outliers

There are three main sources of outliers.
Inherent Variability: A natural feature of the population that is uncontrollable.
Measurement Error: Rounding of obtained values or mistakes in recording compound the
measurement error.
Execution Error: Imperfect collection of data. We may inadvertently choose a biased sample or
include individuals not truly representative of the population we aimed to sample.
2.2.3 Identification of Outliers

An excellent review of different methods for outlier detection is available in Hadi et al. (2009).
Among them the three-sigma rule has become very popular with statisticians. If we assume a
normal distribution, a single value may be considered an outlier if it lies more than a certain
number of standard deviations from the mean.
A traditional measure of the ‘outlyingness’ of an observation $x_i$ with respect to a sample
is the ratio between its distance to the sample mean and the sample SD:
$t_i = \frac{x_i - \bar{x}}{s}$ (2.27)
Observations with $|t_i| > 3$ are traditionally deemed suspicious (the three-sigma rule), based on
the fact that they would be very unlikely under normality, since P(|t| > 3) ≈ 0.003 for a random
variable t with a standard normal distribution.
Although the three-sigma rule is very popular, it may sometimes fail to identify outliers
because the statistic (2.27) is based on the mean and standard deviation, which may be severely
contaminated in the presence of outliers. When we have multiple outliers, the three-sigma rule
usually does not work, mainly because of the masking and swamping effects of outliers. Masking
occurs when we fail to detect outliers (false negatives). Swamping occurs when observations
are incorrectly declared as outliers (false positives).
2.2.4 Robust Statistics

A simple way to handle outliers is to detect them and remove them from the data set. Deleting an
outlier, although better than doing nothing, still poses a number of problems [see Maronna et al.
(2006)]:
• When is deletion justified? Deletion requires a subjective decision. When is an
observation ‘outlying enough’ to be deleted?
• The user may think that ‘an observation is an observation’ (i.e., observations should
speak for themselves) and hence feel uneasy about deleting them. Sometimes atypical data
may be the most informative data, and deleting them may discard valuable information.
• Since there is generally some uncertainty as to whether an observation is really atypical,
there is a risk of deleting ‘good’ observations, which results in underestimating data
variability.
• Since the results depend on the user’s subjective decisions, it is difficult to determine the
statistical behavior of the complete procedure.
The word “robust” literally means something “very strong.” So robust statistics are those
statistics which do not break down easily. The term robustness signifies insensitivity to small
deviations from the assumptions. That means a robust procedure is nearly as efficient as the
classical procedure when the classical assumptions hold strictly but is considerably more efficient
overall when there is a small departure from them. One objective of robust techniques is to cope
with outliers by keeping the effects of their presence small. An analogous term used in the
literature is resistant statistics.
Here we introduce several statistics which are robust in the presence of outliers. The median and
the trimmed mean are robust measures of location. As a measure of dispersion we can use the
normalized median absolute deviation (MADN). For a set of data the median absolute deviation
(MAD) is defined as
$MAD(x) = Med\{|x - Med(x)|\}$ (2.28)
To make the MAD comparable to the SD under normality, we consider the normalized
MAD defined as
$MADN(x) = MAD(x) / 0.6745$ (2.29)
Two other well-known dispersion estimates are the range, defined as
$R = x_{(n)} - x_{(1)}$ (2.30)
and the inter-quartile range (IQR), defined as
$IQR(x) = Q_3 - Q_1$ (2.31)
Both of them are based on order statistics; the former is clearly very sensitive to outliers, while
the latter is not.
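The dispersion measures (2.28)-(2.31) can be compared on a small example with the Python sketch below. The data are made up for illustration (ten values near 10 and two gross outliers at 100), the function names are assumptions, and the quartiles follow NumPy's default interpolation rule, which is only one of several conventions.

```python
import numpy as np

def mad(x):
    """Median absolute deviation, equation (2.28)."""
    x = np.asarray(x, dtype=float)
    return np.median(np.abs(x - np.median(x)))

def madn(x):
    """Normalized MAD, equation (2.29)."""
    return mad(x) / 0.6745

def iqr(x):
    """Inter-quartile range, equation (2.31), with NumPy's default quantile rule."""
    q1, q3 = np.percentile(x, [25, 75])
    return q3 - q1

x = [9, 10, 10, 11, 10, 9, 11, 10, 10, 10, 100, 100]   # made-up data
summary = {
    "SD": np.std(x, ddof=1),
    "MADN": madn(x),
    "range": max(x) - min(x),
    "IQR": iqr(x),
}
```

For these data the SD (about 35) and the range (91) are dominated by the two outliers, while the MAD is 0.5 and the IQR is 1.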
2.2.5 Robust Outlier Detection Methods

It is now evident that the classical t-statistic used in the three-sigma rule might be very
ineffective in the identification of outliers since its components, the mean and the standard
deviation, are not outlier resistant. For this reason we need robust outlier detection methods
which are unaffected by the presence of outliers. In this section we discuss a few robust outlier
detection methods.
Robust t-like Statistic
Let us now use the robust plug-in technique to obtain a robust t-like statistic from (2.27) by
replacing the mean by the median and the SD by the normalized median absolute deviation (MADN). Thus
the modified statistic becomes
$t_i' = \frac{x_i - Median(x)}{MADN(x)}$ (2.32)
Observations with $|t_i'| > 3$ are identified as outliers.
Interquartile Range
The above-mentioned strategies for identifying outliers are probably most appropriate for
symmetric unimodal distributions. If a distribution is skewed, it is recommended to calculate the
thresholds for outliers from the interquartile range; an observation $x_i$ is not regarded as an outlier if
$Q_1 - 1.5\, IQR < x_i < Q_3 + 1.5\, IQR$ (2.33)
Hampel’s Test
In recent years Hampel’s (1984) test for outliers has become very popular in data mining and
knowledge discovery. According to this rule an observation $x_i$ is identified as an outlier if
$|x_i - median(x)| > 4.5\, MAD(x)$ (2.34)
It is interesting to note that Hampel’s test is nearly equivalent to the robust t test
described in (2.32). It is easy to show from (2.32) and (2.29) that an observation is
identified as an outlier by the robust t test if
$|x_i - median(x)| > 3\, MADN(x) = 4.4477\, MAD(x)$ (2.35)
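The classical and robust rules can be compared side by side with the Python sketch below. The function names and the made-up data (ten values near 10 and two outliers at 100) are illustrative assumptions; the robust t and Hampel rules as written here require a non-zero MAD.

```python
import numpy as np

def three_sigma(x):
    x = np.asarray(x, dtype=float)
    t = (x - x.mean()) / x.std(ddof=1)                   # (2.27)
    return np.flatnonzero(np.abs(t) > 3)

def robust_t(x):
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    madn = np.median(np.abs(x - med)) / 0.6745
    return np.flatnonzero(np.abs(x - med) / madn > 3)    # (2.32)

def iqr_rule(x):
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    lo, hi = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    return np.flatnonzero((x < lo) | (x > hi))           # outside (2.33)

def hampel(x):
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return np.flatnonzero(np.abs(x - med) > 4.5 * mad)   # (2.34)

x = [9, 10, 10, 11, 10, 9, 11, 10, 10, 10, 100, 100]    # made-up data
flags = {f.__name__: f(x).tolist() for f in (three_sigma, robust_t, iqr_rule, hampel)}
```

In this example the two values of 100 inflate the mean (25) and the SD (about 35), so their classical t values are only about 2.1 and the three-sigma rule flags nothing, which is an instance of masking. The three robust rules all flag positions 10 and 11.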
2.2.6 Detection of Outliers in ARCH/GARCH Models

We have strong reason to believe that outliers are the prime source of ARCH-influential observations in
time series data. Outliers occur in statistical data quite frequently. Hampel et al. (1986) indicate that
routine data generally contain 1-10% gross errors and even high quality data may not be guaranteed
free from them. Outliers are more critical in time series, especially in non-linear time series, because their
effects may be much more persistent and they may have a serious impact on parameter estimation. An
excellent review of the detection of outliers in time series is available in Gounder et al. (2007). Software
packages based on the work of Box and Jenkins (1976) are widely available, but unfortunately they are
restricted to the least squares approach and do not provide for handling outliers. Indeed the field of
robust time series analysis has come into existence only fairly recently and has seen most of its activity
during the last decade. This is partly because one had to wait for the development of robust regression
techniques (of which extensive use is made) and also because of the increased difficulty inherent in
dealing with dependencies between the observations. Most of the tests for outliers in time series
analysis available in the literature are designed to identify additive and/or innovation outliers for
autoregressive (AR), moving average (MA) and autoregressive moving average (ARMA) models, while
methods for detecting outliers in ARCH models were not developed until quite recently [see Franses and van Dijk
(2000)]. These methods are computationally intensive and are not readily available in
econometric/statistical packages.
Rousseeuw and Leroy (1987) gave a rough and ready suggestion to use robust regression
techniques like LMS and LTS for any time series model. Rousseeuw (1984) proposed Least Median of
Squares (LMS) regression, which is a fitting technique less sensitive to outliers than OLS. In OLS,
we estimate the parameters by minimizing the sum of squared residuals $\sum_{t=1}^{n} \hat{u}_t^2$, which is
obviously the same as minimizing the mean of squared residuals $\frac{1}{n} \sum_{t=1}^{n} \hat{u}_t^2$.
Sample means are sensitive to outliers, but medians are not. Hence to make the fit less sensitive we can
replace the mean by the median to obtain the median of squared residuals
$MSR(\hat{\beta}) = Median\{\hat{u}_t^2\}$ (2.36)
Then the LMS estimate of β is the value that minimizes MSR($\hat{\beta}$). Rousseeuw and Leroy (1987)
have shown that LMS estimates are very robust with respect to outliers and have the highest possible
breakdown point of 50%.
The least trimmed (sum of) squares (LTS) estimator was proposed by Rousseeuw (1984). In this method
we estimate β in such a way that
$LTS(\hat{\beta}) = \text{minimize} \sum_{t=1}^{h} \hat{u}_{(t)}^2$ (2.37)
Here $\hat{u}_{(t)}^2$ is the t-th ordered squared residual. For a trimming percentage of α, Rousseeuw and Leroy (1987)
suggested choosing the number of observations h on which the model is fitted as h = [n (1 – α)]
+ 1. The advantage of using LTS over LMS is that in the LMS we always fit the regression line based
on roughly 50% of the data, whereas in the LTS we can control the level of trimming. When we suspect that
the data contain nearly 10% outliers, the LTS with 10% trimming is expected to produce a better result
than the LMS. We can increase the level of trimming if we suspect there are more outliers in the data.
In a study of which robust fit does well in ARCH models, Doula et al. (2007) show that the LTS in general
performs better than the LMS for detecting outliers. We employ the LTS method to fit the time series
model and also use it to identify outliers, if any. We consider graphical and analytical tests for ARCH
with and without the points thus identified.
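The LTS criterion (2.37) has no closed-form solution, so it is usually approximated numerically. The Python sketch below is one simple way of doing so and is not the routine used in this study: it starts from random elemental subsets and applies concentration steps (refit on the h smallest squared residuals), an approach borrowed from the FAST-LTS idea of Rousseeuw and Van Driessen. The function name, the number of starts and steps, and the simulated data are all assumptions.

```python
import numpy as np

def lts_fit(X, y, alpha=0.1, n_starts=50, n_csteps=10, seed=0):
    """Approximate LTS fit: best of several random starts refined by C-steps."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    h = min(int(n * (1 - alpha)) + 1, n)        # h = [n(1 - alpha)] + 1
    best_obj, best_beta = np.inf, None
    for _ in range(n_starts):
        idx = rng.choice(n, size=k, replace=False)        # elemental start
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        for _ in range(n_csteps):
            keep = np.argsort((y - X @ beta) ** 2)[:h]    # h smallest residuals
            beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        obj = np.sort((y - X @ beta) ** 2)[:h].sum()      # trimmed sum of squares
        if obj < best_obj:
            best_obj, best_beta = obj, beta
    return best_beta, best_obj

# Made-up data: y = 2 + 3x + noise, with the last four points shifted by +50.
rng = np.random.default_rng(4)
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + 0.1 * rng.standard_normal(50)
y[-4:] += 50.0
X = np.column_stack([np.ones(50), x])

beta_lts, _ = lts_fit(X, y, alpha=0.1)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Flag outliers from the LTS residuals with a robust t-like rule.
resid = y - X @ beta_lts
scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
outliers = np.flatnonzero(np.abs(resid) > 3 * scale)
```

In this example the OLS slope is pulled away from 3 by the shifted points, while the LTS fit with 10% trimming stays close to the line followed by the bulk of the data and its residuals expose the shifted points.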
2.3 Tests for Normality

In this section we examine the three basic assumptions required for the application of conventional
statistical analysis, which are normality, data screening and randomness. First we check the
normality assumption for the data. This is the most crucial diagnostic check, as
the whole of classical statistics is based on the normality assumption of observations. At the time of
the development of classical statistics there was a general belief among statisticians that
data sets follow a normal distribution. It was observed that most of the classical data such as
height, weight, etc. followed a normal distribution. In the last hundred years, attitudes towards the
assumption of a normal distribution in statistical models have varied from one extreme to
another. To quote Pearson (1905), ‘Even towards the end of the nineteenth century not all were
convinced of the need for curves other than normal.’ By the middle of the twentieth century Geary (1947)
made this comment: ‘Normality is a myth; there never was and never will be a normal
distribution.’ It is now evident that nonnormal data are prevalent in nature. A nice review
of different tests for normality is available in Imon (2003).
2.3.1 Graphical Test for Normality

The simplest graphical display for checking normality is the normal probability plot. This
method is based on the fact that if the ordered observations are plotted against their cumulative
probabilities on normal probability paper, the resulting points should lie approximately on a
straight line.
2.3.2 Analytical Tests for Normality

Here we discuss a few analytical tests for normality. A test based on the correlation of the
observations and the expectation of normalized order statistics is known as the Shapiro–Wilk
test. A test based on the empirical distribution function is known as the Anderson–Darling test.
Jarque-Bera Test
A test based on the coefficients of skewness and kurtosis is known as the Bowman–Shenton test.
This test is popularly known as the Jarque–Bera test. If we denote the sample size by n, the
sample skewness by S and the sample kurtosis by K, then the Jarque–Bera test statistic is defined
as
$JB = \frac{n}{6} \left[ S^2 + \frac{(K - 3)^2}{4} \right]$ (2.38)
Standard theory tells us that a normal distribution has skewness 0 and kurtosis 3,
so a departure from these two values indicates non-normality, and that is how
this test statistic was developed. Under the null hypothesis of normality, the JB statistic asymptotically
follows a chi-square distribution with 2 degrees of freedom.
Rescaled Moments Test
Imon (2003) suggests a slight adjustment to the JB statistic to make it more suitable for
regression problems. The skewness and kurtosis components of the JB test are based on the
unobserved errors, but in reality we use residuals instead. Those estimates are not unbiased either.
To overcome these problems Imon (2003) proposed a statistic based on rescaled moments (RM)
of ordinary least squares residuals, defined as
$RM = \frac{n c^3}{6} \left[ S^2 + \frac{c (K - 3)^2}{4} \right]$ (2.39)
where c = n/(n – k) and k is the number of independent variables in the regression model. Both the JB
and the RM statistics follow a chi-square distribution with 2 degrees of freedom. If the values of
these statistics are greater than the critical value of the chi-square, we reject the null hypothesis
of normality.
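The two statistics (2.38) and (2.39) can be computed as in the Python sketch below. The function names are assumptions, and the skewness and kurtosis are computed from moments with divisor n, which is one common convention and may differ from the one used by a given software package.

```python
import numpy as np

def skew_kurt(x):
    """Moment-based sample skewness S and kurtosis K (divisor n, an assumption)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    return np.mean(d ** 3) / m2 ** 1.5, np.mean(d ** 4) / m2 ** 2

def jarque_bera(x):
    """JB statistic, equation (2.38)."""
    n = len(x)
    s, k = skew_kurt(x)
    return n / 6.0 * (s ** 2 + (k - 3.0) ** 2 / 4.0)

def rescaled_moments(resid, k_regressors):
    """RM statistic, equation (2.39), with c = n / (n - k)."""
    n = len(resid)
    c = n / (n - k_regressors)
    s, k = skew_kurt(resid)
    return n * c ** 3 / 6.0 * (s ** 2 + c * (k - 3.0) ** 2 / 4.0)

x = [-1, 1, -1, 1, -1, 1]          # made-up values: S = 0 and K = 1
jb = jarque_bera(x)
rm = rescaled_moments(x, 2)
```

For these six values S = 0 and K = 1, so JB = (6/6)(0 + 4/4) = 1, and with k = 2 we have c = 1.5 and RM = (6 × 3.375/6)(1.5) = 5.0625. Either value would be compared with the chi-square critical value with 2 degrees of freedom (5.991 at the 5% level).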
Robust Rescaled Moments Test
The RM test performs better than the JB in every respect, but both the JB and the RM use
least squares residuals, which can be largely affected by outliers. To overcome this problem
Rana et al. (2009) suggested a normality test whose form is exactly the same as the RM statistic
shown in (2.39), but which uses robust LMS or LTS residuals instead of the least squares residuals.
This test is known as the robust rescaled moments (RRM) test for normality.
2.4 Estimation of Missing values

Missing data are a part of almost all research and they have a negative influence on the analysis, such
as information loss and, as a result, a loss of efficiency, loss of unbiasedness of estimated
parameters and loss of power. An excellent review of different aspects of missing values is
available in Little and Rubin (2002). In this section we introduce a few commonly used missing
value estimation techniques.
2.4.1 Mean Imputation Technique

Among the different methods for handling missing values, imputation (Little and
Rubin (2002)) is one of the most widely used techniques for incomplete data problems.
Therefore, this study considers several imputation methods to determine the best method to
replace missing data.
Let us consider n observations $x_1, x_2, \ldots, x_n$ of which m values are missing, denoted
by $x_1^*, x_2^*, \ldots, x_m^*$. Thus the observed data with missing values are
$x_1, x_2, \ldots, x_{n_1}, x_1^*, x_{n_1+1}, x_{n_1+2}, \ldots, x_{n_2}, x_2^*, x_{n_2+1}, x_{n_2+2}, \ldots, x_m^*, \ldots, x_n$ (2.40)
Therefore, the first missing value occurs after $n_1$ observations, the second missing value occurs
after $n_2$ observations, and so on. Note that there might be more than one consecutive missing
observation.
Mean-before Technique
The mean-before technique is one of the most popular imputation techniques for handling missing
data. This technique consists of substituting each missing value with the mean of the available data
preceding it (back to the previous missing value). Thus for the data in (2.40), $x_1^*$ will be replaced by
$\bar{x}_1 = \frac{1}{n_1} \sum_{i=1}^{n_1} x_i$ (2.41)
and $x_2^*$ will be replaced by
$\bar{x}_2 = \frac{1}{n_2 - n_1} \sum_{i=n_1+1}^{n_2} x_i$ (2.42)
and so on.
Mean-before-after Technique
The mean-before-after technique substitutes each missing value with the mean of one datum
before the missing value and one datum after the missing value. Thus for the data in (2.40),
$x_1^*$ will be replaced by
$\bar{x}_1 = \frac{x_{n_1} + x_{n_1+1}}{2}$ (2.43)
and $x_2^*$ will be replaced by
$\bar{x}_2 = \frac{x_{n_2} + x_{n_2+1}}{2}$ (2.44)
and so on.
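Both imputation rules can be sketched in Python as follows, with missing values coded as NaN. The function names and the example series are illustrative assumptions. How consecutive missing values are handled is not specified by (2.41)-(2.44); the sketch reuses the previous fill value in the mean-before rule and uses the nearest observed neighbours in the mean-before-after rule, and both choices are assumptions.

```python
import numpy as np

def mean_before(x):
    """Replace each NaN by the mean of the observed values since the last NaN."""
    x = np.asarray(x, dtype=float)
    out = x.copy()
    segment, last_fill = [], np.nan
    for i, v in enumerate(x):
        if np.isnan(v):
            if segment:                       # (2.41) and (2.42)
                last_fill = float(np.mean(segment))
            out[i] = last_fill                # assumption for consecutive gaps
            segment = []
        else:
            segment.append(v)
    return out

def mean_before_after(x):
    """Replace each NaN by the mean of the nearest observed values on each side."""
    x = np.asarray(x, dtype=float)
    out = x.copy()
    obs = np.flatnonzero(~np.isnan(x))
    for i in np.flatnonzero(np.isnan(x)):
        before, after = obs[obs < i], obs[obs > i]
        if before.size and after.size:        # (2.43) and (2.44)
            out[i] = 0.5 * (x[before[-1]] + x[after[0]])
    return out

x = [2, 4, np.nan, 6, 8, 10, np.nan, 12]      # made-up series
filled_before = mean_before(x)
filled_before_after = mean_before_after(x)
```

For this series the mean-before rule fills the gaps with 3 (the mean of 2 and 4) and 8 (the mean of 6, 8 and 10), while the mean-before-after rule fills them with 5 and 11.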
2.4.2 Expectation Maximization Algorithm

Let y be a data vector whose density function is $f(y; \theta)$, where $\theta$ is a p-dimensional
parameter. If y were complete, the maximum likelihood estimate of $\theta$ would be based on the distribution
of y, and the log-likelihood function of y, $\ell(\theta; y) = \log f(y; \theta)$, would be
maximized. As y is incomplete we may write it as $(y_{obs}, y_{mis})$, where $y_{obs}$ is the observed
incomplete data and $y_{mis}$ is the unobserved missing data. Let us assume that the data are
missing at random; then:
(2.45)
Considering the log-likelihood function:
(2.46)
The EM algorithm focuses on maximizing $\ell(\theta; y)$ in each iteration by replacing it by its
conditional expectation given the observed data $y_{obs}$. The EM algorithm has an E-step
(expectation step) followed by an M-step (maximization step) as follows:
E-step: Compute $Q(\theta; \theta^{(t)})$ where
(2.47)
M-step: Find $\theta^{(t+1)}$ such that
(2.48)
The E-step and M-step are repeated alternately until the difference $\ell(\theta^{(t+1)}) - \ell(\theta^{(t)})$ is less than
$\delta$, where $\delta$ is a small quantity.
If the likelihood function of the complete data, that is $\ell(\theta; y)$, converges,
then the EM algorithm also converges. The rate of convergence depends
on the number of missing observations. Dempster, Laird, and Rubin (1977) show that convergence is
linear with rate proportional to the fraction of information about $\theta$ in $\ell(\theta; y)$ that is observed.
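To make the alternation of the two steps concrete, the Python sketch below applies the EM idea to a deliberately simple case that is not the model used in this study: independent normal observations with unknown mean and variance, some of which are missing at random (coded as NaN). The function name, starting values and tolerance are assumptions, and for simplicity the iteration stops when the parameter estimates change by less than the tolerance rather than when the log-likelihood does.

```python
import numpy as np

def em_normal(x, tol=1e-10, max_iter=1000):
    """EM estimates of the mean and variance of a normal sample with NaN gaps."""
    x = np.asarray(x, dtype=float)
    obs = x[~np.isnan(x)]
    n, m = len(x), len(x) - len(obs)          # m = number of missing values
    mu, s2 = 0.0, 1.0                         # arbitrary starting values
    for it in range(1, max_iter + 1):
        # E-step: expected sums of y and y^2 given the observed data;
        # each missing y contributes mu and mu^2 + s2.
        sum_y = obs.sum() + m * mu
        sum_y2 = np.sum(obs ** 2) + m * (mu ** 2 + s2)
        # M-step: maximize the expected complete-data log-likelihood.
        mu_new = sum_y / n
        s2_new = sum_y2 / n - mu_new ** 2
        done = abs(mu_new - mu) + abs(s2_new - s2) < tol
        mu, s2 = mu_new, s2_new
        if done:
            break
    return mu, s2, it

mu_hat, s2_hat, n_iter = em_normal([1.0, 2.0, 3.0, np.nan, np.nan])
```

In this simple case the iterations converge to the mean and (divisor n) variance of the observed values, here 2 and 2/3, and the error in the mean shrinks by the factor m/n = 0.4 at each iteration, which illustrates how the speed of convergence depends on the fraction of missing observations.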
2.4.3 Estimation of Missing Values in Time Series

The missing value estimation techniques discussed above are designed for independent observations. But in
regression or in time series we assume a model, and that model should be taken into consideration when we try
to estimate missing values. In time series things are even more challenging as the observations
are dependent. In this study, we consider the EM (LTS) method for estimating the missing values.
2.4.4 Performance Indicator

In this study, three performance indicators, namely the mean absolute error (MAE), the root mean square error (RMSE) and the estimated bias (EB), are considered to examine the accuracy of these imputation methods. In order to select the best method for estimating missing values, the predicted and observed data were compared.
The mean absolute error is the average absolute difference between predicted and actual data values, and is given by
$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| P_i - O_i \right| \qquad (2.49)$$
where $N$ is the number of imputations, and $P_i$ and $O_i$ are the imputed and observed data points, respectively. MAE varies from 0 to infinity and a perfect fit is obtained when MAE = 0.
The root mean squared error is one of the most commonly used measures and is computed by
$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left[ P_i - O_i \right]^2} \qquad (2.50)$$
The smaller the RMSE value, the better the performance of the model.
The estimated bias is the absolute difference between the observed and the estimated value of the respective parameter and is defined as
$$EB = \left| O_i - E_i \right| \qquad (2.51)$$
where $E_i$ is the estimated value of the parameter obtained from the imputation method.
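The three indicators in (2.49) to (2.51) translate directly into code. The Python sketch below is illustrative only; the function names are hypothetical and the inputs are assumed to be plain lists of equal length.

```python
import math


def mean_absolute_error(predicted, observed):
    """MAE as in (2.49): average absolute difference between P_i and O_i."""
    n = len(predicted)
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / n


def root_mean_squared_error(predicted, observed):
    """RMSE as in (2.50): square root of the average squared difference."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def estimated_bias(observed_value, estimated_value):
    """EB as in (2.51): absolute difference between observed and estimated value."""
    return abs(observed_value - estimated_value)


imputed = [1.0, 2.0, 3.0]
actual = [1.0, 2.0, 5.0]
print(mean_absolute_error(imputed, actual))
print(root_mean_squared_error(imputed, actual))
print(estimated_bias(5.0, 3.0))
```

Because RMSE squares the differences before averaging, a single large imputation error affects it more strongly than it affects MAE.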
2.5 Computation
We used the statistical software packages R, S-Plus and Minitab for all computations.
CHAPTER 3
OUTLIER ANALYSIS AND ESTIMATION OF MISSING
VALUES BY ROBUST EM ALGORITHM FOR SAUDI ARABIA
OIL PRODUCTION DATA
In this chapter our main objective is to estimate the missing values of the Saudi Arabia oil production data, because an ARCH model cannot be fitted to a series that contains missing values. It is now evident that outliers may have an adverse effect on the estimation of missing values [see Mamun (2013)] and on the determination of the order of ARCH [see Imon et al. (2007)]. The presence of outliers may also violate the normality assumption [see Imon (2003)], which is one of the most important assumptions required for statistical inference. We therefore apply a robust approach to missing value estimation, and for this reason we first need to know which observations are outliers for each of the variables under study.
3.1 Outlier Analysis
We first identify outliers (if any) in all sixteen variables used in our study. An excellent review of different outlier detection methods is available in Hadi et al. (2009). Since our final objective is to fit ARCH models to the data, we restrict our attention to the identification of outliers in time series and ARCH models; an excellent review of methods appropriate for this setting is available in Murugeson et al. (2007). We also wish to employ robust methods for the identification of outliers in time series data. Many choices are available, but following the suggestions of Rousseeuw and Leroy (1987), Barnett and Lewis (1993) and Imon et al. (2007) we use the least trimmed squares (LTS) method for identification. We consider the sixteen data sets one by one using S-Plus, and the results are given below:
Crude Oil Production
The first variable that we consider is the Crude Oil Production data. The attached S-Plus output shows that cases 17 – 26, i.e., the observations for the years 1978 – 1987, appear as outliers.
Outliers in Model 1:
> (lts res[res>3])
17 18 19 20 21 22 23
3.325345 3.914581 3.556345 3.899828 4.188320 3.928173 3.533848
24 25 26
3.690732 4.051608 4.296814
Export of Refined Oil
There are no outliers in this data.
Export of Crude Oil to North America
Export of Refined Oil to North America
Export of Crude Oil to South America
Export of Refined Oil to South America
Export of Crude Oil to Western Europe
Export of Refined Oil to Western Europe
Export of Crude Oil to Middle East
Export of Refined Oil to Middle East
Export of Crude Oil to Africa
Export of Refined Oil to Africa
Export of Crude Oil to Asia and Far East
Export of Refined Oil to Asia and Far East
Export of Crude Oil to Oceania
The attached S-Plus output shows that cases 16 – 26, i.e., observations for the years 1977 – 1987
are appearing as outliers.
16 17 18 19 20 21 22
3.052137 3.790098 6.167967 7.182089 7.856801 7.997653 7.064254
23 24 25 26
5.702614 5.484033 5.717978 6.331793
Export of Refined Oil to Oceania
The attached S-Plus output shows that cases 20 – 26, i.e., observations for the years 1981 – 1987
are appearing as outliers.
20 21 22 23 24 25 26
7.463973 8.052987 8.865852 7.034707 7.384796 5.695348 7.941014
These results are plausible: the Saudi Arabian pipeline for exporting oil was damaged and out of operation several times between 1975 and 1987.
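The kind of LTS screening reported above can be sketched in Python. This is a simplified illustration and not the S-Plus routine used for the results in this section. It assumes a straight-line trend in time, searches over all pairs of points as starting fits followed by concentration steps, and standardizes the residuals by 1.4826 times the median absolute residual; the function names, the coverage h = ⌊(n + 3)/2⌋ and the cutoff of 3 are choices made for the example.

```python
def _least_squares_line(t, y, subset):
    """Ordinary least squares line fitted to the points indexed by subset."""
    m = len(subset)
    mean_t = sum(t[i] for i in subset) / m
    mean_y = sum(y[i] for i in subset) / m
    sxx = sum((t[i] - mean_t) ** 2 for i in subset)
    sxy = sum((t[i] - mean_t) * (y[i] - mean_y) for i in subset)
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y - slope * mean_t, slope


def lts_line(t, y, h=None, max_c_steps=50):
    """Approximate LTS fit: minimize the sum of the h smallest squared residuals."""
    n = len(t)
    if h is None:
        h = (n + 3) // 2
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            if t[i] == t[j]:
                continue
            intercept, slope = _least_squares_line(t, y, [i, j])
            subset = None
            for _ in range(max_c_steps):
                # Concentration step: refit on the h best-fitting points.
                order = sorted(range(n),
                               key=lambda k: (y[k] - intercept - slope * t[k]) ** 2)
                new_subset = sorted(order[:h])
                if new_subset == subset:
                    break
                subset = new_subset
                intercept, slope = _least_squares_line(t, y, subset)
            squared = sorted((y[k] - intercept - slope * t[k]) ** 2 for k in range(n))
            objective = sum(squared[:h])
            if best is None or objective < best[0]:
                best = (objective, intercept, slope)
    return best[1], best[2]


def flag_outliers(t, y, cutoff=3.0):
    """Indices whose robustly standardized LTS residual exceeds the cutoff."""
    intercept, slope = lts_line(t, y)
    residuals = [y[k] - intercept - slope * t[k] for k in range(len(t))]
    abs_sorted = sorted(abs(r) for r in residuals)
    n = len(abs_sorted)
    if n % 2:
        median_abs = abs_sorted[n // 2]
    else:
        median_abs = 0.5 * (abs_sorted[n // 2 - 1] + abs_sorted[n // 2])
    scale = 1.4826 * median_abs  # robust scale estimate (an assumption of this sketch)
    if scale == 0:
        raise ValueError("robust scale is zero; residuals cannot be standardized")
    flagged = [k for k, r in enumerate(residuals) if abs(r) / scale > cutoff]
    return flagged, (intercept, slope)


t = list(range(20))
y = [1 + 2 * k + ((k * 7) % 5 - 2) * 0.1 for k in t]
for k in (8, 9, 10):
    y[k] += 30.0  # three artificial outliers
flagged, line = flag_outliers(t, y)
print(flagged, line)
```

Because the fit is determined by the best-fitting half of the data, a block of consecutive aberrant years does not pull the line toward itself, which is why such blocks show up with large standardized residuals.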
3.2 Estimation of Missing Values
In this section we estimate the missing values of the oil production variables of Saudi Arabia. We observed in section 1.2 that 14 out of 16 variables have missing observations. Information for the year 1987 is missing for all fourteen of these variables. For the ‘Export of Refined Oil to Africa’ the observation for the year 1982 is also missing. Estimation of missing values is necessary when fitting an ARCH model because the fitting process does not converge if the series contains any discontinuity. In section 2.4 we discussed different methods of estimating missing values, but since we are dealing with time series data, simple mean imputation or ordinary EM estimation is not appropriate. Moreover, some of the variables also contain outliers. For this reason we use the robust EM (LTS) method for estimating missing values using S-Plus, and the estimates are given in the following table.
Table 3.1 Estimates of Missing Values for the Saudi Arabia Oil Production Data
Variable                                      Estimate of Missing Values
                                              1982        1987
Export of Crude Oil to North America                      358.20
Export of Refined Oil to North America                    14.99
Export of Crude Oil to South America                      101.40
Export of Refined Oil to South America                    10.66
Export of Crude Oil to Western Europe                     606.30
Export of Refined Oil to Western Europe                   30.36
Export of Crude Oil to Middle East                        80.52
Export of Refined Oil to Middle East                      26.22
Export of Crude Oil to Africa                 0.22        45.30
Export of Refined Oil to Africa                           45.30
Export of Crude Oil to Asia and Far East                  840.60
Export of Refined Oil to Asia and Far East                180.80
Export of Crude Oil to Oceania                            20.55
Export of Refined Oil to Oceania                          7.05
Next we construct time series plots for all fourteen variables in which some values were missing; for comparison, they are displayed together with the corresponding time series plots of the original data.

[Figure 3.1: two time series plots of Crude_North_america against Year, 1962 – 2010 (North America)]
[Figure 3.2: two time series plots of refined oil exports against Year, 1962 – 2010 (North America)]
[Figure 3.3: two time series plots of Crude_South_america against Year, 1962 – 2010 (South America)]
[Figure 3.4: two time series plots of refined oil exports against Year, 1962 – 2010 (South America)]
[Figure 3.5: two time series plots of Crude_Western_europe against Year, 1962 – 2010 (Western Europe)]
[Figure 3.6: two time series plots of Refine_Western_europe against Year, 1962 – 2010 (Western Europe)]
[Figure 3.7: two time series plots of Crude_Middle_east against Year, 1962 – 2010 (Middle East)]
[Figure 3.8: two time series plots of Refine_Middle_east against Year, 1962 – 2010 (Middle East)]
[Figure 3.9: two time series plots of Crude_Africa against Year, 1962 – 2010 (Africa)]
[Figure 3.10: two time series plots of Refine_Africa against Year, 1962 – 2010 (Africa)]
[Figure 3.11: two time series plots of crude oil exports against Year, 1962 – 2010 (Asia and Far East)]
[Figure 3.12: two time series plots of refined oil exports against Year, 1962 – 2010 (Asia and Far East)]
[Figure 3.13: two time series plots of Crude_Oceania against Year, 1962 – 2010 (Oceania)]
[Figure 3.14: two time series plots of Refine_Oceania against Year, 1962 – 2010 (Oceania)]
Figures 3.1 – 3.14 show that the missing values estimated by the EM (LTS) method are consistent with the rest of the data in each series.
CHAPTER 4
SELECTION OF ARCH MODELS FOR SAUDI ARABIA OIL
PRODUCTION DATA
In this chapter our main objective is to find appropriate ARCH models for all sixteen oil production variables under study. We first look at the autocorrelation function (ACF) and the partial autocorrelation function (PACF) to detect whether there is any ARCH effect in the data and, if so, what its order is. The ACF and PACF can only give an indication, however, so a formal test is needed to answer this question. We therefore use the White test and the Breusch-Pagan test to confirm the order of ARCH, if any. The Goldfeld-Quandt test can only indicate whether there is any ARCH effect in the model; it cannot determine its order. After each fit we carry out a normality test. This is important when we draw inference, as all of our conventional inferential procedures rely heavily on normality assumptions.
4.1 Crude Oil Production

Our first variable under study is the total crude oil production. The ACF and PACF values for this data, together with the t statistics and the Ljung-Box χ² (LBQ) statistics, are given in Table 4.1.1 and in Figure 4.1.1.
Table 4.1.1 The ACF and PACF Values for the Crude Oil Production Data
Index  ACF Value  ACF TSTAT  LBQ  PACF Value  PACF TSTAT
1 0.883583 6.18508 40.646 0.883583 6.18508
2 0.743369 3.25133 70.028 -0.170332 -1.19232
3 0.580001 2.12028 88.303 -0.179431 -1.25602
4 0.415648 1.39671 97.897 -0.093001 -0.65101
5 0.265335 0.85810 101.896 -0.040665 -0.28465
6 0.110073 0.35075 102.600 -0.154165 -1.07915
7 -0.009263 -0.02944 102.605 0.030932 0.21653
8 -0.135424 -0.43044 103.723 -0.177907 -1.24535
9 -0.217332 -0.68819 106.674 0.061359 0.42951
10 -0.252386 -0.79157 110.755 0.085131 0.59592
11 -0.257288 -0.79682 115.109 0.014199 0.09939
12 -0.205055 -0.62699 117.949 0.150318 1.05223
[Figure: Autocorrelation Function and Partial Autocorrelation Function for Crude_Production_mil_barl, lags 1 – 12, with 5% significance limits]
Figure 4.1.1 The ACF and PACF Values for the Crude Oil Production Data
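Quantities of the kind reported in Table 4.1.1 can be computed as in the following Python sketch. It is an illustration, not the Minitab code behind the table, and the function names are hypothetical. It assumes the usual sample ACF, the Durbin-Levinson recursion for the PACF, the large-lag standard error for the ACF t statistics, and the Ljung-Box Q statistic. Under these assumptions the first rows of Table 4.1.1 appear to be consistent with a series of length n = 49.

```python
import math


def acf(x, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]


def pacf_from_acf(r):
    """Partial autocorrelations from r_1, ..., r_K by the Durbin-Levinson recursion."""
    pacf = []
    phi_prev = []
    for k in range(1, len(r) + 1):
        if k == 1:
            phi_kk = r[0]
            phi = [phi_kk]
        else:
            num = r[k - 1] - sum(phi_prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                   for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf


def acf_t_statistics(r, n):
    """t statistic of r_k using the standard error sqrt((1 + 2*sum_{j<k} r_j^2) / n)."""
    stats = []
    running = 0.0
    for r_k in r:
        stats.append(r_k / math.sqrt((1.0 + 2.0 * running) / n))
        running += r_k * r_k
    return stats


def ljung_box(r, n):
    """Cumulative Ljung-Box statistics Q_k = n(n+2) * sum_{j<=k} r_j^2 / (n - j)."""
    stats = []
    total = 0.0
    for j, r_j in enumerate(r, start=1):
        total += r_j * r_j / (n - j)
        stats.append(n * (n + 2) * total)
    return stats


series = [3.0, 4.0, 6.0, 5.0, 7.0, 9.0, 8.0, 10.0, 12.0, 11.0]
r = acf(series, 3)
print(r, pacf_from_acf(r), acf_t_statistics(r, len(series)), ljung_box(r, len(series)))
```

The PACF t statistics in the table are consistent with dividing each partial autocorrelation by 1/√n.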
The above table and figure clearly indicate an autoregressive pattern. The ACF values decline geometrically and only the first PACF value is significant, hence the possible model could be ARCH (1). To confirm this we employ both the White and Breusch-Pagan tests. Table 4.1.2 gives the significance of ARCH effects based on the White test.
Table 4.1.2 Order of ARCH Using the White Test for the Crude Oil Production Data
Model      Variable  Value   TSTAT  p-value  White Statistic  p-value
ARCH (1)   Lag 1     0.6664  6.05   0.000    21.264           0.000
ARCH (2)   Lag 1     0.5776  3.87   0.000
           Lag 2     0.1337  0.89   0.377
The above results clearly show that the data fit an ARCH (1) model: the ARCH (1) effect is highly significant but the second lag in the ARCH (2) model is not, and the White statistic for ARCH (1) is highly significant.
Next we employ the Breusch-Pagan test; the results are presented in Table 4.1.3.
Table 4.1.3 Order of ARCH Using the Breusch-Pagan Test for the Crude Oil Production Data
Model      Variable  Breusch-Pagan Statistic  p-value
ARCH (1)   Lag 1     35.378                   0.000
ARCH (2)   Lag 1     35.378                   0.000
           Lag 2     0.720                    0.401
Here we observe the same pattern: only ARCH (1) is significant for the data. Thus we can conclude that the crude oil production data fit an ARCH (1) model.
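A related and widely used check for a first-order ARCH effect is Engle's Lagrange multiplier test, which regresses the squared residuals on their first lag and refers the number of usable observations times R² to a chi-square distribution with one degree of freedom. The Python sketch below illustrates that test only; it is not the White or Breusch-Pagan computation reported in Tables 4.1.2 and 4.1.3, and the function name `arch1_lm_test` is hypothetical.

```python
import math


def arch1_lm_test(residuals):
    """Engle's LM test for ARCH(1): returns (LM statistic, p-value)."""
    squared = [e * e for e in residuals]
    x = squared[:-1]   # lagged squared residuals
    y = squared[1:]    # current squared residuals
    m = len(y)
    mean_x = sum(x) / m
    mean_y = sum(y) / m
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        # Constant squared residuals: no evidence of an ARCH effect.
        return 0.0, 1.0
    r_squared = sxy * sxy / (sxx * syy)
    lm = m * r_squared
    # Upper tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


# Artificial residuals: a calm stretch followed by a volatile stretch.
calm = [0.1 * (-1) ** k for k in range(20)]
volatile = [5.0 * (-1) ** k for k in range(20)]
print(arch1_lm_test(calm + volatile))
```

A large statistic with a small p-value indicates that the size of a residual depends on the size of the previous one, which is the defining feature of an ARCH (1) effect.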
[Figure: Probability Plot of Residual, Normal - 95% CI; Mean -4.12115E-13, StDev 334.0, N 48, AD 0.404, P-Value 0.343]
Figure 4.1.2 Normal Probability Plot of ARCH (1) Residuals for the Crude Oil Production Data
For the validity of the inference we now check the normality assumption of the errors. At
first we give a normal probability plot of the ARCH (1) residuals which is given in Figure 4.1.2.
Table 4.1.4 Normality Test of ARCH (1) Residuals for the Crude Oil Production Data
Statistic          Value    p-value
Skewness           -0.83
Kurtosis           4.81
Jarque-Bera        18.171   0.0001
Rescaled Moments   21.067   0.0000
It is not clear from the above plot whether the errors satisfy the normality assumption. Hence we employ two analytic tests, the Jarque-Bera (JB) and the rescaled moments (RM) tests, to check the normality of the errors; the results are presented in Table 4.1.4.
The above table clearly shows non-normality of the errors: both the JB and RM tests are significant. For this particular data we observed in section 3.1 that there are a few outliers, and the normal probability plot also suggests the presence of a few outliers. We believe these outliers are the main reason for this apparent non-normality. Since the ARCH (1) model fails the normality test we cannot apply it to the data as it is.
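The Jarque-Bera statistic combines the sample skewness S and kurtosis K as JB = (n/6)[S² + (K − 3)²/4] and is referred to a chi-square distribution with two degrees of freedom, whose upper tail probability is exp(−JB/2). The Python sketch below illustrates this standard form with moment estimators that divide by n; the function names are hypothetical and the rescaled moments test is not implemented here. The p-values in Table 4.1.4 appear consistent with this chi-square reference, for example exp(−18.171/2) ≈ 0.0001.

```python
import math


def chi2_2df_pvalue(statistic):
    """Upper tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


def jarque_bera(x):
    """Return (skewness, kurtosis, JB statistic, p-value) for the sample x."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    return skewness, kurtosis, jb, chi2_2df_pvalue(jb)


print(jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0]))
print(chi2_2df_pvalue(18.171))
```

Both skewness and excess kurtosis enter the statistic squared, so a few extreme residuals can make the test significant even when the bulk of the residuals looks normal.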
Now we recompute all results using the robust LTS residuals instead of the least squares
residuals. At first we look at the ACF and PACF values which are now given in Table 4.1.5 and
in Figure 4.1.3.
Table 4.1.5 The ACF and PACF Values for the LTS Crude Oil Production Data
Index  ACF Value  ACF TSTAT  LBQ  PACF Value  PACF TSTAT
1 0.869082 5.28642 30.275 0.883583 6.18508
2 0.769898 2.95559 54.713 -0.170332 -1.19232
3 0.649262 2.05423 72.604 -0.179431 -1.25602
4 0.537785 1.53540 85.250 -0.093001 -0.65101
5 0.451992 1.21534 94.463 -0.040665 -0.28465
6 0.365022 0.94451 100.665 -0.154165 -1.07915
7 0.278090 0.70282 104.385 0.030932 0.21653
8 0.184502 0.46019 106.078 -0.177907 -1.24535
9 0.097385 0.24152 106.567 0.061359 0.42951
[Figure: Autocorrelation Function and Partial Autocorrelation Function for Del_Crude_Production_mil_barl_1, lags 1 – 9]
Figure 4.1.3 The LTS ACF and PACF Values for the Crude Oil Production Data
As in the previous results, the ACF and PACF values indicate that the possible model could be ARCH (1). Tables 4.1.6 and 4.1.7 show that both the White test and the Breusch-Pagan test confirm that an ARCH (1) model fits the data. For the rest of the examples we observe that the Breusch-Pagan test consistently gives the same conclusion as the White test; since the White test is much easier and more popular, we report only the White test for brevity.
Table 4.1.6 Order of ARCH Using the LTS White Test for the Crude Oil Production Data
Model      Variable  Value   TSTAT  p-value  White Statistic  p-value
ARCH (1)   Lag 1     0.596   4.46   0.000    13.884           0.0001
ARCH (2)   Lag 1     1.001   3.31   0.003
           Lag 2     0.001   0.454  0.653
Table 4.1.7 Order of ARCH Using the LTS Breusch-Pagan Test for the Crude Oil Production Data
Model      Variable  Breusch-Pagan Statistic  p-value
ARCH (1)   Lag 1     23.311                   0.000
ARCH (2)   Lag 1     23.311                   0.000
           Lag 2     0.313                    0.580
Finally we check the normality assumption. The normal probability plot of the ARCH (1) LTS residuals shows a normal pattern, as given in Figure 4.1.4.
We use the robust rescaled moments (RRM) test, which is based on the LTS residuals. To make the results comparable with the tests in the presence of outliers, we present this result together with the JB test in Table 4.1.8.
[Figure: Probability Plot of Del_Res, Normal - 95% CI; Mean 33.99, StDev 253.0, N 36, AD 0.426, P-Value 0.299]
Figure 4.1.4 Normal Probability Plot of ARCH (1) LTS Residuals for the Crude Oil Production Data
Table 4.1.8 Normality Test of ARCH (1) Residuals for the Crude Oil Production Data
Statistic                 Skewness  Kurtosis  Value of the Test  p-value
Jarque-Bera               -0.83     4.81      18.171             0.0001
Robust Rescaled Moments   0.36      2.65      1.73               0.4210
The above results clearly show that the LTS residuals follow a normal pattern. So we can make reliable inference based on them, and the appropriate model is ARCH (1).
4.2 Total Export of Refined Oil
Next we consider the total export of refined oil. The ACF and PACF values for this data, together with the t statistics and the Ljung-Box χ² (LBQ) statistics, are given in Table 4.2.1 and in Figure 4.2.1.
Table 4.2.1 The ACF and PACF Values for the Total Export of Refined Oil Data
Index  ACF Value  ACF TSTAT  LBQ  PACF Value  PACF TSTAT
1 0.938355 6.56849 45.842 0.938355 6.56849
2 0.881212 3.71230 87.130 0.005870 0.04109
3 0.800898 2.69917 121.977 -0.223009 -1.56106
4 0.732093 2.16615 151.741 0.041358 0.28951
5 0.651713 1.76656 175.863 -0.100598 -0.70419
6 0.580703 1.48249 195.461 -0.002857 -0.02000
7 0.511764 1.25156 211.044 0.011651 0.08156
8 0.452294 1.07237 223.513 0.005160 0.03612
9 0.385564 0.89343 232.801 -0.100934 -0.70654
10 0.324113 0.73909 239.532 -0.027951 -0.19566
11 0.263789 0.59494 244.108 -0.008552 -0.05986
[Figure: Autocorrelation Function and Partial Autocorrelation Function for Total_refine_exp_mil_barl, plotted against Lag]
Figure 4.2.1 The ACF and PACF Values for Total Export of Refined Oil Data
The ACF values show a geometrically declining pattern and only the first PACF value is significant, indicating that the possible model could be ARCH (1). To confirm this we employ the White test; the results are presented in Table 4.2.2.
Table 4.2.2 Order of ARCH Using the White Test for the Total Export of Refined Oil Data
Model  Variable  Value  TSTAT  p-value  White Statistic  p-value
ARCH (1) Lag 1 0.7650 6.62 0.000 23.91 0.0000
ARCH (2) Lag 1 0.6520 4.13 0.000
Lag 2 0.1674 1.03 0.310

[Figure: Probability Plot of Residual, Normal - 95% CI; Mean 3.903635E-12, StDev 79.52, N 49, AD 0.370, P-Value 0.413]
Figure 4.2.2 Normal Probability Plot of ARCH (1) Residuals for the Total Export of Refined Oil Data
Here we compare two ARCH models. At first we fit an ARCH (1) model and observe that the lag effect is significant. Then we fit an ARCH (2) model and observe that the first lag effect is significant, but the second one is not. Thus we may conclude that ARCH (1) is the most appropriate model for this data. The White statistic for ARCH (1) is 23.91, which is highly significant.
Table 4.2.3 Normality Test of ARCH (1) Residuals for the Total Export of Refined Oil Data
For the validity of the inference we now check the normality assumption of the errors. At first we give a normal probability plot of the ARCH (1) residuals, which is shown in Figure 4.2.2. This graph suggests that the errors may satisfy the normality assumption. Finally we employ the Jarque-Bera and the rescaled moments tests for checking the normality of the errors; the results are presented in Table 4.2.3. Thus the ARCH (1) model passes the normality test and we can conclude that ARCH (1) is the most appropriate model for this data.
4.3 Export of Crude Oil to North America

Next we consider the export of crude oil to North America. The ACF and PACF values for this data, together with the t statistics and the Ljung-Box χ² (LBQ) statistics, are given in Table 4.3.1 and in Figure 4.3.1.
Table 4.3.1 The ACF and PACF Values for the Export of Crude Oil to North America Data
Index  ACF Value  ACF TSTAT  LBQ  PACF Value  PACF TSTAT
1 0.856604 5.61713 33.8059 0.856604 5.61713
2 0.650320 2.71475 53.7655 -0.313454 -2.05545
3 0.435950 1.57049 62.9592 -0.113406 -0.74365
4 0.252890 0.86287 66.1323 -0.017270 -0.11325
5 0.127777 0.42862 66.9637 0.048393 0.31734
6 0.067654 0.22598 67.2031 0.069746 0.45736
7 0.062725 0.20927 67.4145 0.065406 0.42890
8 0.049657 0.16550 67.5509 -0.135159 -0.88629
9 0.050363 0.16775 67.6952 0.069936 0.45860
10 0.098116 0.32659 68.2597 0.212922 1.39622
11 0.165971 0.55108 69.9254 0.075400 0.49443
[Figure: Autocorrelation Function and Partial Autocorrelation Function for Crude_North_america, lags 1 – 11]
Figure 4.3.1 The ACF and PACF Values for the Export of Crude Oil to North America Data
The ACF values show a geometrically declining pattern and the first two PACF values are significant, indicating that the possible model could be ARCH (2). To confirm this we employ the White test; the results are presented in Table 4.3.2.
Table 4.3.2 Order of ARCH Using the White Test for the Export of Crude Oil to North America Data
Model      Variable  Value    TSTAT  p-value  White Statistic  p-value
ARCH (1)   Lag 1      0.6535   5.46  0.000
ARCH (2)   Lag 1      0.9093   6.09  0.000    17.934           0.0001
           Lag 2     -0.3969  -2.63  0.012
ARCH (3)   Lag 1      0.9896   5.95  0.000
           Lag 2     -0.5841  -2.61  0.013
           Lag 3      0.1994   1.15  0.259

[Figure: Probability Plot of Residual, Normal - 95% CI; Mean 3.437044E-12, StDev 146.5, N 43, AD 0.606, P-Value 0.108]
Figure 4.3.2 Normal Probability Plot of ARCH (2) Residuals for the Export of Crude Oil to North America Data
Here we compare three different ARCH models. At first we fit an ARCH (1) model and observe that the lag effect is significant. Then we fit an ARCH (2) model and observe that both of the first two lag effects are significant. Next we fit an ARCH (3) model and observe that the first two lag effects are significant, but the third one is not. Thus we may conclude that ARCH (2) is the most appropriate model for this data. The White statistic for ARCH (2) is 17.934, which is highly significant as well.
For the validity of the inference we now check the normality assumption of the errors. The normal probability plot of the ARCH (2) residuals is given in Figure 4.3.2. This graph suggests that the errors may satisfy the normality assumption. Finally we employ the Jarque-Bera and the rescaled moments tests for checking the normality of the errors; the results are presented in Table 4.3.3.
Table 4.3.3 Normality Test of ARCH (2) Residuals for the Export of Crude Oil to North America Data
Thus the ARCH (2) model passes the normality test and we can conclude that ARCH (2) is the most appropriate model for this data.
4.4 Export of Refined Oil to North America

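Our next example is the export of refined oil to North America. The ACF and PACF values for this data, together with the t statistics and the Ljung-Box χ² (LBQ) statistics, are given in Table 4.4.1 and in Figure 4.4.1.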
Table 4.4.1 The ACF and PACF Values for the Export of Refined Oil to North America Data
Index  ACF Value  ACF TSTAT  LBQ  PACF Value  PACF TSTAT
1 0.751325 4.92677 26.0068 0.751325 4.92677
2 0.755553 3.39558 52.9486 0.438713 2.87683
3 0.554030 2.00885 67.7973 -0.267878 -1.75659
4 0.469284 1.56134 78.7240 -0.137922 -0.90442
5 0.311616 0.98256 83.6687 -0.068921 -0.45194
6 0.206670 0.63750 85.9024 -0.069528 -0.45592
7 0.112194 0.34285 86.5790 0.029984 0.19662
8 0.029734 0.09062 86.6279 -0.015582 -0.10218
9 -0.101216 -0.30840 87.2109 -0.260777 -1.71003
10 -0.190987 -0.58065 89.3497 -0.156969 -1.02931
11 -0.243040 -0.73318 92.9215 0.138350 0.90722
[Figure: Autocorrelation Function and Partial Autocorrelation Function for Refine_North_america, lags 1 – 11]
Figure 4.4.1 The ACF and PACF Values for the Export of Refined Oil to North America Data
Table 4.4.2 Order of ARCH Using the White Test for the Export of Refined Oil to North America Data
Model  Variable  Value  TSTAT  p-value  White Statistic  p-value
ARCH (1) Lag 1 0.5422 3.99 0.000 14.964 0.0001
ARCH (2) Lag 1 0.5417 3.83 0.000
Lag 2 0.1240 0.74 0.462
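Here we compare two ARCH models. In the ARCH (1) model the lag effect is significant; in the ARCH (2) model the first lag effect is significant, but the second one is not. Thus we may conclude that ARCH (1) is the most appropriate model for this data. The White statistic for ARCH (1) is 14.964, which is highly significant as well.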

[Figure: Probability Plot of Residual, Normal - 95% CI; Mean -0.8356, StDev 6.613, N 40, AD 0.485, P-Value 0.215]
Figure 4.4.2 Normal Probability Plot of ARCH (1) Residuals for the Export of Refined Oil to North America Data
Table 4.4.3 Normality Test of ARCH (1) Residuals for the Export of Refined Oil to North America Data
4.5 Export of Crude Oil to South America

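Our next example is the export of crude oil to South America. The ACF and PACF values for this data, together with the t statistics and the Ljung-Box χ² (LBQ) statistics, are given in Table 4.5.1 and in Figure 4.5.1.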
Table 4.5.1 The ACF and PACF Values for the Export of Crude Oil to South America Data
Index  ACF Value  ACF TSTAT  LBQ  PACF Value  PACF TSTAT
1 0.858561 5.62996 33.9605 0.858561 5.62996
2 0.643569 2.68292 53.5078 -0.355900 -2.33379
3 0.462616 1.66927 63.8607 0.082693 0.54226
4 0.281587 0.95599 67.7948 -0.221779 -1.45430
5 0.148907 0.49513 68.9239 0.138656 0.90923
6 0.057597 0.19043 69.0974 -0.106458 -0.69809
7 -0.012773 -0.04220 69.1061 0.025019 0.16406
8 -0.058325 -0.19267 69.2942 -0.051521 -0.33784
9 -0.040192 -0.13265 69.3862 0.225219 1.47686
10 -0.000809 -0.00267 69.3862 -0.103319 -0.67751
11 0.000114 0.00037 69.3862 -0.072207 -0.47349
[Figure: Autocorrelation Function and Partial Autocorrelation Function for Crude_South_america, lags 1 – 11]
Figure 4.5.1 The ACF and PACF Values for the Export of Crude Oil to South America Data
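Table 4.5.2 Order of ARCH Using the White Test for the Export of Crude Oil to South America Data
Model  Variable  Value  TSTAT  p-value  White Statistic  p-value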
ARCH (1) Lag 1 0.5780 4.52 0.000 14.534 0.0001
ARCH (2) Lag 1 0.6442 4.01 0.000
Lag 2 -0.1241 -0.78 0.442
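Here we compare two ARCH models. In the ARCH (1) model the lag effect is significant; in the ARCH (2) model the first lag effect is significant, but the second one is not. Thus we may conclude that ARCH (1) is the most appropriate model for this data. The White statistic for ARCH (1) is 14.534, which is highly significant as well.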
Although this graph shows a slight departure from normality in the middle, overall the errors perhaps satisfy the normality assumption. Finally we employ the Jarque-Bera and the rescaled moments tests to check the normality of the errors; the results are presented in Table 4.5.3.

[Figure: Probability Plot of Residual, Normal - 95% CI; Mean -6.248, StDev 21.56, N 38, AD 0.894, P-Value 0.020]
Figure 4.5.2 Normal Probability Plot of ARCH (1) Residuals for the Export of Crude Oil to South America Data
Table 4.5.3 Normality Test of ARCH (1) Residuals for the Export of Crude Oil to South America Data
4.6 Export of Refined Oil to South America

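Our next example is the export of refined oil to South America. The ACF and PACF values for this data, together with the t statistics and the Ljung-Box χ² (LBQ) statistics, are given in Table 4.6.1 and in Figure 4.6.1.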
Table 4.6.1 The ACF and PACF Values for the Export of Refined Oil to South America Data
Index  ACF Value  ACF TSTAT  LBQ  PACF Value  PACF TSTAT
1 0.707751 4.64103 23.0777 0.707751 4.64103
2 0.491994 2.28025 34.5017 -0.017866 -0.11716
3 0.408489 1.69890 42.5737 0.133692 0.87668
4 0.376987 1.47218 49.6250 0.088349 0.57934
5 0.290293 1.08048 53.9161 -0.066028 -0.43297
6 0.182896 0.66298 55.6655 -0.068661 -0.45024
7 0.061759 0.22162 55.8705 -0.131401 -0.86165
8 -0.094585 -0.33902 56.3651 -0.221497 -1.45245
9 -0.103750 -0.37088 56.9777 0.109253 0.71642
10 -0.093397 -0.33281 57.4892 -0.002826 -0.01853
11 -0.042887 -0.15243 57.6004 0.160091 1.04978
[Figure: Autocorrelation Function and Partial Autocorrelation Function for Refine_South_america, lags 1 – 11]
Figure 4.6.1 The ACF and PACF Values for the Export of Refined Oil to South America Data
Table 4.6.2 Order of ARCH Using the White Test for the Export of Refined Oil to South America Data
Model  Variable  Value  TSTAT  p-value  White Statistic  p-value
ARCH (1) Lag 1 0.3385 2.28 0.028 4.945 0.0261
ARCH (2) Lag 1 0.3754 2.33 0.025
Lag 2 -0.1154 -0.72 0.478
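Here we compare two ARCH models. In the ARCH (1) model the lag effect is significant; in the ARCH (2) model the first lag effect is significant, but the second one is not. Thus we may conclude that ARCH (1) is the most appropriate model for this data. The White statistic for ARCH (1) is 4.945, which is significant at the 5% level as the p-value is 0.0261.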

[Figure: Probability Plot of Residual, Normal - 95% CI; Mean -0.7119, StDev 4.689, N 41, AD 0.526, P-Value 0.169]
Figure 4.6.2 Normal Probability Plot of ARCH (1) Residuals for the Export of Refined Oil to South America Data
Table 4.6.3 Normality Test of ARCH (1) Residuals for the Export of Refined Oil to South America Data
4.7 Export of Crude Oil to Western Europe

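Next we consider the export of crude oil to Western Europe. The ACF and PACF values for this data, together with the t statistics and the Ljung-Box χ² (LBQ) statistics, are given in Table 4.7.1 and in Figure 4.7.1.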
Table 4.7.1 The ACF and PACF Values for the Export of Crude Oil to Western Europe Data
Index  ACF Value  ACF TSTAT  LBQ  PACF Value  PACF TSTAT
1 0.871934 6.10354 39.582 0.871934 6.10354
2 0.694381 3.06161 65.218 -0.274843 -1.92390
3 0.530913 1.99080 80.531 -0.005874 -0.04112
4 0.398117 1.38502 89.333 0.001150 0.00805
5 0.308376 1.03313 94.734 0.057432 0.40203
6 0.212397 0.69657 97.356 -0.155390 -1.08773
7 0.089732 0.29141 97.835 -0.162445 -1.13711
8 -0.089356 -0.28969 98.322 -0.324855 -2.27398
9 -0.254341 -0.82315 102.363 -0.030354 -0.21247
10 -0.338711 -1.08136 109.714 0.121842 0.85290
11 -0.366302 -1.14250 118.538 -0.008099 -0.05669
[Figure: Autocorrelation Function and Partial Autocorrelation Function for Crude_Western_europe, plotted against Lag]
Figure 4.7.1 The ACF and PACF Values for the Export of Crude Oil to Western Europe Data
Table 4.7.2 Order of ARCH Using the White Test for the Export of Crude Oil to Western Europe Data
Model  Variable  Value  TSTAT  p-value  White Statistic  p-value
ARCH (1) Lag 1 0.6420 4.33 0.000 21.456 0.0000
ARCH (2) Lag 1 0.6593 4.38 0.000
Lag 2 -0.0025 -0.02 0.987
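Here we compare two ARCH models. In the ARCH (1) model the lag effect is significant; in the ARCH (2) model the first lag effect is significant, but the second one is not. Thus we may conclude that ARCH (1) is the most appropriate model for this data. The White statistic for ARCH (1) is 21.456, which is highly significant.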

[Figure: Probability Plot of Residual, Normal - 95% CI; Mean -2.54577E-12, StDev 362.7, N 49, AD 1.301, P-Value <0.005]
Figure 4.7.2 Normal Probability Plot of ARCH (1) Residuals for the Export of Crude Oil to Western Europe Data
Table 4.7.3 Normality Test of ARCH (1) Residuals for the Export of Crude Oil to Western Europe Data
4.8 Export of Refined Oil to Western Europe
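Next we consider the export of refined oil to Western Europe. The ACF and PACF values for this data, together with the t statistics and the Ljung-Box χ² (LBQ) statistics, are given in Table 4.8.1 and in Figure 4.8.1.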
Table 4.8.1 The ACF and PACF Values for the Export of Refined Oil to Western Europe Data
Index  ACF Value  ACF TSTAT  LBQ  PACF Value  PACF TSTAT
1 0.824530 5.77171 35.395 0.824530 5.77171
2 0.729172 3.32276 63.665 0.154059 1.07842
3 0.587915 2.22435 82.442 -0.152652 -1.06857
4 0.502046 1.73257 96.439 0.045989 0.32192
5 0.377266 1.22884 104.523 -0.120816 -0.84571
6 0.308006 0.97369 110.037 0.037517 0.26262
7 0.262613 0.81458 114.140 0.102090 0.71463
8 0.207980 0.63655 116.776 -0.077526 -0.54268
9 0.138283 0.41978 117.971 -0.097598 -0.68319
10 0.145840 0.44114 119.334 0.196568 1.37598
11 0.122004 0.36758 120.313 -0.043607 -0.30525
[Figure: Autocorrelation Function and Partial Autocorrelation Function for Refine_Western_europe, plotted against Lag]
Figure 4.8.1 The ACF and PACF Values for the Export of Refined Oil to Western Europe Data
Table 4.8.2 Order of ARCH Using the White Test for the Export of Refined Oil to Western Europe Data
Model  Variable  Value  TSTAT  p-value  White Statistic  p-value
ARCH (1) Lag 1 0.3190 2.22 0.031 12.77 0.0003
ARCH (2) Lag 1 0.3185 2.22 0.032
Lag 2 0.0571 0.37 0.715
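Here we compare two ARCH models. In the ARCH (1) model the lag effect is significant at the 5% level; in the ARCH (2) model the first lag effect is significant, but the second one is not. Thus we may conclude that ARCH (1) is the most appropriate model for this data. The White statistic for ARCH (1) is 12.77, which is highly significant.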

[Figure: Probability Plot of Residual, Normal - 95% CI; Mean -0.7527, StDev 8.609, N 44, AD 1.020, P-Value 0.010]
Figure 4.8.2 Normal Probability Plot of ARCH (1) Residuals for the Export of Refined Oil to Western Europe Data
Table 4.8.3 Normality Test of ARCH (1) Residuals for the Export of Refined Oil to Western Europe Data
4.9 Export of Crude Oil to Middle East
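Next we consider the export of crude oil to the Middle East. The ACF and PACF values for this data, together with the t statistics and the Ljung-Box χ² (LBQ) statistics, are given in Table 4.9.1 and in Figure 4.9.1.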
Table 4.9.1 The ACF and PACF Values for the Export of Crude Oil to Middle East Data
Index  ACF Value  ACF TSTAT  LBQ  PACF Value  PACF TSTAT
1 0.691265 4.53293 22.0151 0.691265 4.53293
2 0.432054 2.02592 30.8250 -0.087700 -0.57509
3 0.155106 0.66646 31.9889 -0.210613 -1.38108
4 -0.052154 -0.22181 32.1238 -0.102695 -0.67342
5 -0.205124 -0.87142 34.2664 -0.103367 -0.67782
6 -0.319396 -1.33353 39.6014 -0.141250 -0.92624
7 -0.311338 -1.24925 44.8115 0.032152 0.21083
8 -0.317933 -1.23178 50.3998 -0.136476 -0.89493
9 -0.162171 -0.60725 51.8966 0.178321 1.16933
10 -0.072124 -0.26778 52.2016 -0.071152 -0.46658
11 -0.085415 -0.31660 52.6428 -0.252328 -1.65463
[Figure: Autocorrelation Function for Crude_Middle_east (left panel) and Partial Autocorrelation Function for Crude_Middle_east (right panel); vertical axis: autocorrelation, -1.0 to 1.0; horizontal axis: lag, 1 to 11]
Figure 4.9.1 The ACF and PACF Values for the Export of Crude Oil to Middle East Data
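The third and fourth columns of Table 4.9.1 can be checked by hand. The sketch below assumes that the t values use Bartlett's approximation for the standard error of a sample autocorrelation and that the cumulative statistic is the Ljung-Box Q; the series length n = 43 (annual data, 1968 to 2010) and the function name are also our assumptions, not something stated with the table.

```python
import math

def acf_t_and_ljung_box(acf, n):
    """Return (t statistics, Ljung-Box Q values) for sample autocorrelations.

    acf[k-1] is the lag-k sample autocorrelation and n is the series length.
    The lag-k standard error uses Bartlett's approximation:
    sqrt((1 + 2 * sum of squared autocorrelations at lags below k) / n).
    """
    t_stats, q_stats = [], []
    sum_sq = 0.0  # running sum of r_j^2 for lags j < k
    q = 0.0       # running Ljung-Box statistic
    for k, r in enumerate(acf, start=1):
        se = math.sqrt((1.0 + 2.0 * sum_sq) / n)
        t_stats.append(r / se)
        q += n * (n + 2) * r * r / (n - k)
        q_stats.append(q)
        sum_sq += r * r
    return t_stats, q_stats

# First two autocorrelations from Table 4.9.1; n = 43 is an assumption.
t_stats, q_stats = acf_t_and_ljung_box([0.691265, 0.432054], n=43)
print([round(t, 5) for t in t_stats])
print([round(q, 4) for q in q_stats])
```

Under these assumptions the computed values agree with the tabulated entries for the first two lags to the printed precision.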
Table 4.9.2 Order of ARCH Using the White Test for the Export of Crude Oil to Middle East Data
Model      Term    Coefficient   t       p-value   White statistic   p-value
ARCH (1)   Lag 1   0.2306        1.51    0.140     2.322             0.1275
ARCH (2)   Lag 1   0.2153        1.33    0.190
           Lag 2   0.0358        0.22    0.825
At first we fit an ARCH (1) model. Although the ACF and PACF values indicated that an ARCH (1) model could fit this data, we observe that the lag effect is insignificant. We then fit an ARCH (2) model and observe that both lag effects are insignificant. The White statistic for ARCH (1) is only 2.322 with a p-value of 0.1275, which is insignificant at both the 5% and 10% levels. Thus we may conclude that this data does not show any evidence of an ARCH effect and hence can be fitted by an AR (1) model.
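The p-value quoted for the White statistic can be reproduced approximately if we assume that the statistic is referred to a chi-square distribution whose degrees of freedom equal the ARCH order. This is our assumption; it matches most, though not all, of the p-values reported in this chapter. The helper chi2_sf below is a hypothetical name and covers only one and two degrees of freedom, where closed forms exist.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable for df = 1 or df = 2."""
    if df == 1:
        # P(Z^2 > x) = erfc(sqrt(x / 2)) for a standard normal Z
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # A chi-square with 2 df is exponential with mean 2
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only covers df = 1 and df = 2")

# White statistic for ARCH (1), export of crude oil to Middle East
stat, order = 2.322, 1
p_value = chi2_sf(stat, order)
for alpha in (0.05, 0.10):
    verdict = "significant" if p_value < alpha else "insignificant"
    print(f"alpha = {alpha:.2f}: p = {p_value:.4f} -> {verdict}")
```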
first we give a normal probability plot of the AR (1) residuals which is given in Figure 4.9.2.
[Figure: normal probability plot of the residuals (Normal - 95% CI; vertical axis: percent; horizontal axis: residual). Mean 2.015958E-14, StDev 18.05, N 43, AD 0.687, P-Value 0.068]
Figure 4.9.2 Normal Probability Plot of AR (1) Residuals for the Export of Crude Oil to Middle
East Data
Table 4.9.3 Normality Test of AR (1) Residuals for the Export of Crude Oil to Middle East Data
Thus the AR (1) model passes the normality test and we can conclude that AR (1) is the most
appropriate model for this data.
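The p-value printed beside the AD statistic on the normal probability plot can be approximated from the statistic and the sample size alone. The sketch below assumes the widely used small-sample adjustment and piecewise formulas of D'Agostino and Stephens for the Anderson-Darling test when the mean and variance are estimated; that the plotting software used exactly this approximation is our assumption.

```python
import math

def anderson_darling_p_value(ad, n):
    """Approximate p-value of the Anderson-Darling normality test.

    ad is the raw statistic and n the sample size; the mean and variance
    of the normal distribution are taken to be estimated from the data.
    """
    a = ad * (1.0 + 0.75 / n + 2.25 / n ** 2)  # small-sample adjustment
    if a >= 0.6:
        return math.exp(1.2937 - 5.709 * a + 0.0186 * a * a)
    if a > 0.34:
        return math.exp(0.9177 - 4.279 * a - 1.38 * a * a)
    if a > 0.2:
        return 1.0 - math.exp(-8.318 + 42.796 * a - 59.938 * a * a)
    return 1.0 - math.exp(-13.436 + 101.14 * a - 223.73 * a * a)

# AR (1) residuals, export of crude oil to Middle East: AD = 0.687, N = 43
p = anderson_darling_p_value(0.687, 43)
print(round(p, 3), "normality not rejected at 5%" if p > 0.05 else "normality rejected at 5%")
```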
4.10 Export of Refined Oil to Middle East

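Our next example is the export of refined oil to the Middle East. The ACF and PACF values for this data together with the Ljung-Box t and χ² tests are given in Table 4.10.1 and in Figure 4.10.1.
Table 4.10.1 The ACF and PACF Values for the Export of Refined Oil to Middle East Data
Lag   ACF   t (ACF)   Ljung-Box Q   PACF   t (PACF)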
1 0.824530 5.77171 35.395 0.902715 5.91950
2 0.729172 3.32276 63.665 -0.000142 -0.00093
3 0.587915 2.22435 82.442 0.212498 1.39344
4 0.502046 1.73257 96.439 -0.206355 -1.35316
5 0.377266 1.22884 104.523 -0.143126 -0.93854
6 0.308006 0.97369 110.037 0.035519 0.23292
7 0.262613 0.81458 114.140 -0.116550 -0.76427
8 0.207980 0.63655 116.776 0.194645 1.27637
9 0.138283 0.41978 117.971 0.114064 0.74797
10 0.145840 0.44114 119.334 -0.170948 -1.12098
11 0.122004 0.36758 120.313 -0.010193 -0.06684
[Figure: Autocorrelation Function for Refine_Western_europe (left panel, lags 1 to 12) and Partial Autocorrelation Function for Refine_Middle_east (right panel, lags 1 to 11); vertical axis: autocorrelation, -1.0 to 1.0; horizontal axis: lag]
Figure 4.10.1 The ACF and PACF Values for the Export of Refined Oil to Middle East Data
Table 4.10.2 Order of ARCH Using the White Test for the Export of Refined Oil to Middle East Data
Model      Term    Coefficient   t       p-value   White statistic   p-value
ARCH (1)   Lag 1   0.2736        1.79    0.080     3.27              0.0725
ARCH (2)   Lag 1   0.2217        1.39    0.172
           Lag 2   0.1997        1.25    0.220
At first we fit an ARCH (1) model. Although the ACF and PACF values indicated that an ARCH (1) model could fit this data, we observe that the first lag effect is significant at the 10% level, but not at the 5% level. We then fit an ARCH (2) model and observe that both lag effects are insignificant. The White statistic for ARCH (1) is 3.27 with a p-value of 0.0725, which is significant at the 10% level, but not at the 5% level. Thus we may conclude that this data does not show any strong evidence of an ARCH effect and hence can be fitted by an AR (1) model.
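For readers who wish to see how a test statistic of this kind can be obtained from residuals, the following is a minimal sketch of an Engle-type Lagrange multiplier statistic for a first-order ARCH effect: the squared residuals are regressed on their first lag and the statistic is the number of usable observations times R². Whether the White statistic reported in this chapter was computed in exactly this way is an assumption, and the residuals used below are toy values, not the thesis data.

```python
def arch_lm_statistic(residuals):
    """Return m * R^2 from regressing e_t^2 on a constant and e_{t-1}^2.

    m is the number of observations left after losing one to the lag.
    """
    sq = [e * e for e in residuals]
    y, x = sq[1:], sq[:-1]
    m = len(y)
    mean_x, mean_y = sum(x) / m, sum(y) / m
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # no variation in the squared residuals: no ARCH signal
    r_squared = sxy * sxy / (sxx * syy)
    return m * r_squared

# Toy residuals whose squares alternate between 1 and 4 (illustration only)
toy_residuals = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0]
print(arch_lm_statistic(toy_residuals))
```

A large value of the statistic relative to the chi-square distribution with one degree of freedom would point to an ARCH effect; a small value, as found for this variable, would not.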
[Figure: normal probability plot of the residuals (Normal - 95% CI; vertical axis: percent; horizontal axis: residual). Mean 6.373403E-13, StDev 10.31, N 43, AD 0.584, P-Value 0.121]
Figure 4.10.2 Normal Probability Plot of AR (1) Residuals for the Export of Refined Oil to
Middle East Data
Table 4.10.3 Normality Test of AR (1) Residuals for the Export of Refined Oil to Middle East Data
4.11 Export of Crude Oil to Africa

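Next we consider the export of crude oil to Africa. The ACF and PACF values for this data together with the Ljung-Box t and χ² tests are given in Table 4.11.1 and in Figure 4.11.1.
Table 4.11.1 The ACF and PACF Values for the Export of Crude Oil to Africa Data
Lag   ACF   t (ACF)   Ljung-Box Q   PACF   t (PACF)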
1 0.737031 4.83303 25.0267 0.737031 4.83303
2 0.583771 2.65018 41.1102 0.088787 0.58221
3 0.489464 1.92917 52.6997 0.070543 0.46258
4 0.426078 1.55050 61.7070 0.053793 0.35275
5 0.260561 0.89924 65.1641 -0.223338 -1.46453
6 0.255779 0.86659 68.5856 0.201167 1.31914
7 0.197201 0.65675 70.6758 -0.087480 -0.57364
8 0.204919 0.67571 72.9974 0.135924 0.89131
9 0.149440 0.48762 74.2683 -0.068532 -0.44939
10 0.067116 0.21780 74.5325 -0.203040 -1.33142
11 -0.041654 -0.13502 74.6374 -0.065675 -0.43066
The table and figure clearly indicate an autoregressive pattern. The ACF values show a geometrically declining pattern and only the first PACF value is significant.
[Figure: Autocorrelation Function for Crude_Africa (left panel) and Partial Autocorrelation Function for Crude_Africa (right panel); vertical axis: autocorrelation, -1.0 to 1.0; horizontal axis: lag, 1 to 11]
Figure 4.11.1 The ACF and PACF Values for the Export of Crude Oil to Africa Data
Table 4.11.2 Order of ARCH Using the White Test for the Export of Crude Oil to Africa Data
Model      Term    Coefficient   t       p-value   White statistic   p-value
ARCH (1)   Lag 1   0.1486        0.95    0.346     0.946             0.3307
ARCH (2)   Lag 1   0.1372        0.84    0.403
           Lag 2   0.0463        0.29    0.777
[Figure: normal probability plot of the residuals (Normal - 95% CI; vertical axis: percent; horizontal axis: residual). Mean 5.023372E-14, StDev 21.15, N 43, AD 0.222, P-Value 0.819]
Figure 4.11.2 Normal Probability Plot of AR (1) Residuals for the Export of Crude Oil to Africa
Data
Table 4.11.3 Normality Test of AR (1) Residuals for the Export of Crude Oil to Africa Data
4.12 Export of Refined Oil to Africa

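Next we consider the export of refined oil to Africa. The ACF and PACF values for this data together with the Ljung-Box t and χ² tests are given in Table 4.12.1 and in Figure 4.12.1.
Table 4.12.1 The ACF and PACF Values for the Export of Refined Oil to Africa Data
Lag   ACF   t (ACF)   Ljung-Box Q   PACF   t (PACF)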
1 0.829961 5.44242 31.736 0.829961 5.44242
2 0.763688 3.24769 59.261 0.240553 1.57741
3 0.690431 2.40492 82.321 0.032133 0.21071
4 0.604034 1.86771 100.423 -0.070770 -0.46407
5 0.554209 1.58954 116.064 0.049617 0.32536
6 0.502731 1.36398 129.281 0.021674 0.14212
7 0.461338 1.20080 140.721 0.021586 0.14155
8 0.400580 1.00936 149.592 -0.077138 -0.50582
9 0.349865 0.86139 156.559 -0.029822 -0.19556
10 0.302189 0.73150 161.913 -0.013596 -0.08915
11 0.253418 0.60594 165.797 -0.019238 -0.12615
The table and figure clearly indicate an autoregressive pattern. The ACF values show a geometrically declining pattern and only the first PACF value is significant, indicating that the possible model could be ARCH (1). To confirm this we employ the White test, and the results are presented in Table 4.12.2.
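The two visual cues used above, a slowly decaying ACF and a PACF that cuts off after the first lag, can also be checked numerically. The sketch below takes the values of Table 4.12.1, flags PACF lags that fall outside an approximate 95% band of ±1.96/√n, and prints the ratios of successive ACF values. The series length n = 43 and the helper name significant_lags are assumptions of ours.

```python
import math

def significant_lags(values, n, z=1.96):
    """Lags whose value lies outside the approximate band +/- z / sqrt(n)."""
    bound = z / math.sqrt(n)
    return [k for k, v in enumerate(values, start=1) if abs(v) > bound]

# ACF and PACF columns of Table 4.12.1 (export of refined oil to Africa)
acf = [0.829961, 0.763688, 0.690431, 0.604034, 0.554209, 0.502731,
       0.461338, 0.400580, 0.349865, 0.302189, 0.253418]
pacf = [0.829961, 0.240553, 0.032133, -0.070770, 0.049617, 0.021674,
        0.021586, -0.077138, -0.029822, -0.013596, -0.019238]

n = 43  # assumed series length (annual data, 1968 to 2010)
print("significant PACF lags:", significant_lags(pacf, n))

# Roughly constant ratios below one suggest geometric decay of the ACF
ratios = [acf[k] / acf[k - 1] for k in range(1, len(acf))]
print("successive ACF ratios:", [round(r, 3) for r in ratios])
```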
[Figure: Autocorrelation Function for Refine_Africa (left panel) and Partial Autocorrelation Function for Refine_Africa (right panel); vertical axis: autocorrelation, -1.0 to 1.0; horizontal axis: lag, 1 to 11]
Figure 4.12.1 The ACF and PACF Values for the Export of Refined Oil to Africa Data
Table 4.12.2 Order of ARCH Using the White Test for the Export of Refined Oil to Africa Data
Model      Term    Coefficient   t       p-value   White statistic   p-value
ARCH (1)   Lag 1   0.0981        0.63    0.533     0.43              0.5119
ARCH (2)   Lag 1   0.0836        0.52    0.605
           Lag 2   0.1536        0.96    0.345
[Figure: normal probability plot of the residuals (Normal - 95% CI; vertical axis: percent; horizontal axis: residual). Mean -0.2040, StDev 7.747, N 43, AD 0.530, P-Value 0.166]
Figure 4.12.2 Normal Probability Plot of AR (1) Residuals for the Export of Refined Oil to
Africa Data
Table 4.12.3 Normality Test of AR (1) Residuals for the Export of Refined Oil to Africa Data
4.13 Export of Crude Oil to Asia and Far East

Next we consider the export of crude oil to Asia and Far East. The ACF and PACF values for this data together with the Ljung-Box t and χ² tests are given in Table 4.13.1 and in Figure 4.13.1.
Table 4.13.1 The ACF and PACF Values for the Export of Crude Oil to Asia and Far East Data
Lag   ACF   t (ACF)   Ljung-Box Q   PACF   t (PACF)
1 0.882659 5.78798 35.8936 0.737031 4.83303
2 0.741047 3.03819 61.8109 0.088787 0.58221
3 0.571933 1.96132 77.6347 0.070543 0.46258
4 0.400371 1.26451 85.5879 0.053793 0.35275
5 0.236887 0.72181 88.4453 -0.223338 -1.46453
6 0.097322 0.29302 88.9407 0.201167 1.31914
7 0.020396 0.06129 88.9630 -0.087480 -0.57364
8 -0.033969 -0.10206 89.0268 0.135924 0.89131
9 -0.030254 -0.09088 89.0789 -0.068532 -0.44939
10 -0.015336 -0.04606 89.0927 -0.203040 -1.33142
11 0.004186 0.01257 89.0938 -0.065675 -0.43066
[Figure: Autocorrelation Function for Crude_Asia_and_Far_east (left panel) and Partial Autocorrelation Function for Crude_Africa (right panel); vertical axis: autocorrelation, -1.0 to 1.0; horizontal axis: lag, 1 to 11]
Figure 4.13.1 The ACF and PACF Values for the Export of Crude Oil to Asia and Far East Data
Table 4.13.2 Order of ARCH Using the White Test for the Export of Crude Oil to Asia and Far East Data
Model      Term    Coefficient   t       p-value   White statistic   p-value
ARCH (1)   Lag 1   1.1763        8.12    0.000
ARCH (2)   Lag 1   1.1998        8.43    0.000     31.69             0.0000
           Lag 2   -0.4802       -3.38   0.002
ARCH (3)   Lag 1   1.2290        7.33    0.000
           Lag 2   -0.5587       -2.22   0.033
           Lag 3   0.0630        0.37    0.712
[Figure: normal probability plot of the residuals (Normal - 95% CI; vertical axis: percent; horizontal axis: residual). Mean 4.896466E-12, StDev 235.1, N 43, AD 0.798, P-Value 0.036]
Figure 4.13.2 Normal Probability Plot of ARCH (2) Residuals for the Export of Crude Oil to Asia and Far East Data
Table 4.13.3 Normality Test of ARCH (2) Residuals for the Export of Crude Oil to Asia and Far East Data
results are presented in Table 4.13.3. Thus the ARCH (2) model passes the normality test and we
can conclude that ARCH (2) is the most appropriate model for this data.
4.14 Export of Refined Oil to Asia and Far East
Next we consider the export of refined oil to Asia and Far East. The ACF and PACF values for this data together with the Ljung-Box t and χ² tests are given in Table 4.14.1 and in Figure 4.14.1.
Table 4.14.1 The ACF and PACF Values for the Export of Refined Oil to Asia and Far East Data
Lag   ACF   t (ACF)   Ljung-Box Q   PACF   t (PACF)
1 0.943605 6.18763 41.022 0.943605 6.18763
2 0.872009 3.42903 76.909 -0.167698 -1.09967
3 0.809421 2.55914 108.602 0.064228 0.42117
4 0.748021 2.07058 136.364 -0.049149 -0.32229
5 0.683675 1.72801 160.165 -0.054378 -0.35658
6 0.624659 1.47944 180.571 0.018878 0.12379
7 0.556548 1.25576 197.220 -0.142537 -0.93468
8 0.484037 1.05417 210.173 -0.048348 -0.31704
9 0.402659 0.85512 219.400 -0.143790 -0.94289
10 0.308697 0.64470 224.988 -0.166646 -1.09277
11 0.205942 0.42601 227.553 -0.142391 -0.93372
[Figure: Autocorrelation Function for Refine_Asia_and_Far_east (left panel) and Partial Autocorrelation Function for Refine_Asia_and_Far_east (right panel); vertical axis: autocorrelation, -1.0 to 1.0; horizontal axis: lag, 1 to 11]
Figure 4.14.1 The ACF and PACF Values for the Export of Refined Oil to Asia and Far East
Data
Here we compare two different ARCH models. At first we fit an ARCH (1) model and observe that the lag effect is significant. We then fit an ARCH (2) model and observe that the first lag effect is significant, but the second one is not. Thus we may conclude that ARCH (1) is the most appropriate model for this data. The White statistic for ARCH (1) is 24.08, which is highly significant.
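The order selection described above follows a simple sequential pattern: the order is increased until the newly added lag is no longer significant. The sketch below encodes that pattern as we read it from this chapter; it is an illustrative reading, not a rule stated by the tests themselves, and the function name select_arch_order and the 5% default are our own choices. The p-values are those of the lag coefficients in Tables 4.14.2, 4.13.2 and 4.9.2.

```python
def select_arch_order(p_values_by_order, alpha=0.05):
    """Largest order q such that the highest lag of ARCH(q) is significant.

    p_values_by_order maps each fitted order q to the p-values of its lag
    coefficients (lag 1 first). Returns 0 if even ARCH(1) shows no
    significant lag, i.e. no ARCH effect is found.
    """
    selected = 0
    for q in sorted(p_values_by_order):
        highest_lag_p = p_values_by_order[q][-1]
        if highest_lag_p >= alpha:
            break  # the newly added lag is insignificant: stop here
        selected = q
    return selected

refined_asia = {1: [0.000], 2: [0.000, 0.190]}
crude_asia = {1: [0.000], 2: [0.000, 0.002], 3: [0.000, 0.033, 0.712]}
crude_middle_east = {1: [0.140], 2: [0.190, 0.825]}

print(select_arch_order(refined_asia))       # order for refined oil to Asia and Far East
print(select_arch_order(crude_asia))         # order for crude oil to Asia and Far East
print(select_arch_order(crude_middle_east))  # 0 means no ARCH effect
```

A result of 0 corresponds to the cases where we fall back on an AR (1) model.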
Table 4.14.2 Order of ARCH Using the White Test for the Export of Refined Oil to Asia and Far East Data
Model      Term    Coefficient   t       p-value   White statistic   p-value
ARCH (1)   Lag 1   0.8885        7.14    0.000     24.08             0.0000
ARCH (2)   Lag 1   1.0333        6.15    0.000
           Lag 2   -0.2525       -1.33   0.190
[Figure: normal probability plot of the residuals (Normal - 95% CI; vertical axis: percent; horizontal axis: residual). Mean 9.980647E-13, StDev 52.50, N 43, AD 0.575, P-Value 0.127]
Figure 4.14.2 Normal Probability Plot of ARCH (1) Residuals for the Export of Refined Oil to Asia and Far East Data
Table 4.14.3 Normality Test of ARCH (1) Residuals for the Export of Refined Oil to Asia and Far East Data
results are presented in Table 4.14.3. Thus the ARCH (1) model passes the normality test and we
can conclude that ARCH (1) is the most appropriate model for this data.
4.15 Export of Crude Oil to Oceania
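Next we consider the export of crude oil to Oceania. The ACF and PACF values for this data together with the Ljung-Box t and χ² tests are given in Table 4.15.1 and in Figure 4.15.1.
Table 4.15.1 The ACF and PACF Values for the Export of Crude Oil to Oceania Data
Lag   ACF   t (ACF)   Ljung-Box Q   PACF   t (PACF)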
1 0.832307 5.45780 31.9153 0.832307 5.45780
2 0.662498 2.81276 52.6294 -0.098406 -0.64529
3 0.497425 1.80566 64.5989 -0.086544 -0.56750
4 0.348591 1.17914 70.6280 -0.056672 -0.37163
5 0.228481 0.74901 73.2862 -0.016245 -0.10653
6 0.167607 0.54243 74.7554 0.094237 0.61795
7 0.128955 0.41451 75.6492 0.004750 0.03114
8 0.050868 0.16286 75.7923 -0.183207 -1.20137
9 0.054082 0.17304 75.9587 0.213171 1.39786
10 0.087754 0.28059 76.4103 0.096684 0.63400
11 0.153059 0.48850 77.8269 0.127901 0.83870
[Figure: Autocorrelation Function for Crude_Oceania (left panel) and Partial Autocorrelation Function for Crude_Oceania (right panel); vertical axis: autocorrelation, -1.0 to 1.0; horizontal axis: lag, 1 to 11]
Figure 4.15.1 The ACF and PACF Values for the Export of Crude Oil to Oceania Data
Table 4.15.2 Order of ARCH Using the White Test for the Export of Crude Oil to Oceania Data
Model      Term    Coefficient   t       p-value   White statistic   p-value
ARCH (1)   Lag 1   0.5530        4.19    0.000
ARCH (2)   Lag 1   0.7246        4.71    0.000     16.04             0.0003
           Lag 2   -0.3167       -2.05   0.047
ARCH (3)   Lag 1   0.7727        4.69    0.000
           Lag 2   -0.4308       -2.19   0.035
           Lag 3   0.1567        0.95    0.348

[Figure: normal probability plot of the residuals (Normal - 95% CI; vertical axis: percent; horizontal axis: residual). Mean -1.46673E-13, StDev 9.124, N 43, AD 0.929, P-Value 0.017]
Figure 4.15.2 Normal Probability Plot of ARCH (2) Residuals for the Export of Crude Oil to Oceania Data
Table 4.15.3 Normality Test of ARCH (2) Residuals for the Export of Crude Oil to Oceania Data
4.16 Export of Refined Oil to Oceania

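Next we consider the export of refined oil to Oceania. The ACF and PACF values for this data together with the Ljung-Box t and χ² tests are given in Table 4.16.1 and in Figure 4.16.1.
Table 4.16.1 The ACF and PACF Values for the Export of Refined Oil to Oceania Data
Lag   ACF   t (ACF)   Ljung-Box Q   PACF   t (PACF)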
1 0.767059 5.02995 27.1075 0.767059 5.02995
2 0.588859 2.61722 43.4727 0.001164 0.00763
3 0.515937 1.99696 56.3497 0.155191 1.01766
4 0.440116 1.56456 65.9603 -0.011934 -0.07826
5 0.314341 1.05880 70.9918 -0.123842 -0.81208
6 0.120193 0.39469 71.7473 -0.274019 -1.79686
7 0.007164 0.02344 71.7501 -0.028236 -0.18516
8 -0.074007 -0.24215 72.0529 -0.082935 -0.54384
9 -0.158297 -0.51723 73.4790 -0.039391 -0.25830
10 -0.216615 -0.70342 76.2303 0.018619 0.12209
11 -0.299238 -0.96073 81.6449 -0.121313 -0.79550
[Figure: Autocorrelation Function for Refine_Oceania (left panel) and Partial Autocorrelation Function for Refine_Oceania (right panel); vertical axis: autocorrelation, -1.0 to 1.0; horizontal axis: lag, 1 to 11]
Figure 4.16.1 The ACF and PACF Values for the Export of Refined Oil to Oceania Data
significant.
Table 4.16.2 Order of ARCH Using the White Test for the Export of Refined Oil to Oceania Data
Model      Term    Coefficient   t       p-value   White statistic   p-value
ARCH (1)   Lag 1   0.5054        3.66    0.001     10.79             0.0010
ARCH (2)   Lag 1   0.5441        3.34    0.002
           Lag 2   -0.0918       -0.56   0.576
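[Figure: normal probability plot of the residuals (Normal - 95% CI; vertical axis: percent; horizontal axis: residual). Mean 2.491031E-14, StDev 5.445, N 43, AD 0.965, P-Value 0.014]
Figure 4.16.2 Normal Probability Plot of ARCH (1) Residuals for the Export of Refined Oil to Oceania Data
Table 4.16.3 Normality Test of ARCH (1) Residuals for the Export of Refined Oil to Oceania Data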
4.17 Result Summary

In this section we summarize the results obtained in Sections 4.1 – 4.16. They are presented in the following table.
Table 4.17 Selected Models for Saudi Arabia Oil Production Data
Variable Selected Model
Total Crude Oil Production ARCH (1)
Total Export of Refined Oil ARCH (1)
Export of Crude Oil to North America ARCH (2)
Export of Refined Oil to North America ARCH (1)
Export of Crude Oil to South America ARCH (1)
Export of Refined Oil to South America ARCH (1)
Export of Crude Oil to Western Europe ARCH (1)
Export of Refined Oil to Western Europe ARCH (1)
Export of Crude Oil to Middle East AR (1)
Export of Refined Oil to Middle East AR (1)
Export of Crude Oil to Africa AR (1)
Export of Refined Oil to Africa AR (1)
Export of Crude Oil to Asia and Far East ARCH (2)
Export of Refined Oil to Asia and Far East ARCH (1)
Export of Crude Oil to Oceania ARCH (2)
Export of Refined Oil to Oceania ARCH (1)
The autocorrelation functions and partial autocorrelation functions show that all sixteen variables follow an autoregressive pattern. The White test suggests an ARCH (1) model for nine of them. Three other variables fit an ARCH (2) model. We do not find any ARCH effect in the remaining four variables, for which AR (1) is the appropriate model.
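The counts quoted in this summary can be verified directly from Table 4.17. The short tally below simply transcribes the table; the dictionary name selected_models is ours.

```python
from collections import Counter

# Selected model for each variable, transcribed from Table 4.17
selected_models = {
    "Total Crude Oil Production": "ARCH (1)",
    "Total Export of Refined Oil": "ARCH (1)",
    "Export of Crude Oil to North America": "ARCH (2)",
    "Export of Refined Oil to North America": "ARCH (1)",
    "Export of Crude Oil to South America": "ARCH (1)",
    "Export of Refined Oil to South America": "ARCH (1)",
    "Export of Crude Oil to Western Europe": "ARCH (1)",
    "Export of Refined Oil to Western Europe": "ARCH (1)",
    "Export of Crude Oil to Middle East": "AR (1)",
    "Export of Refined Oil to Middle East": "AR (1)",
    "Export of Crude Oil to Africa": "AR (1)",
    "Export of Refined Oil to Africa": "AR (1)",
    "Export of Crude Oil to Asia and Far East": "ARCH (2)",
    "Export of Refined Oil to Asia and Far East": "ARCH (1)",
    "Export of Crude Oil to Oceania": "ARCH (2)",
    "Export of Refined Oil to Oceania": "ARCH (1)",
}

counts = Counter(selected_models.values())
for model, count in sorted(counts.items()):
    print(f"{model}: {count} variables")
```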
CHAPTER 5
CONCLUSIONS AND DIRECTION OF
FUTURE RESEARCH
In this chapter we will summarize the findings of our research to draw some conclusions and
outline ideas for our future research.
5.1 Conclusions
In our research the prime objective was to find the most appropriate models for analyzing Saudi Arabia oil production data. Since these are time series data we could consider ARIMA models to fit them. But most of the variables showed some kind of volatility, and for this reason we selected ARCH models for them. If there is no ARCH effect, the model automatically reduces to an ARIMA model. The existence of missing values for almost every variable complicates the analysis, since an ARCH model does not converge when observations are missing. As a remedy to this problem we estimated the missing observations first. We intended to employ the EM algorithm for estimating the missing values, but since our data are time series a simple EM algorithm would not be appropriate for them. There is also evidence of outliers in the data, and robust regression techniques indicated that three of the sixteen variables contained multiple outliers. Hence we finally employed a robust regression (LTS) based EM algorithm to estimate the missing values.
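The remark that a model without an ARCH effect reduces to an ordinary ARIMA-type model can be made concrete with the ARCH(q) conditional variance, h_t = α0 + α1 e²(t-1) + ... + αq e²(t-q). The sketch below evaluates this recursion for illustrative error values and parameters (all of them hypothetical, not estimates from our data) and shows that setting every lag coefficient to zero leaves a constant variance.

```python
def arch_conditional_variance(errors, alpha0, alphas):
    """Conditional variances h_t = alpha0 + sum_i alphas[i-1] * e_{t-i}^2.

    Values are returned for t = q, ..., len(errors) - 1, where q = len(alphas).
    """
    q = len(alphas)
    variances = []
    for t in range(q, len(errors)):
        h = alpha0
        for i, alpha in enumerate(alphas, start=1):
            h += alpha * errors[t - i] ** 2
        variances.append(h)
    return variances

errors = [1.0, -2.0, 3.0, 0.5]  # illustrative error values

# ARCH (1) with a positive lag coefficient: the variance reacts to past shocks
print(arch_conditional_variance(errors, alpha0=2.0, alphas=[0.5]))

# No ARCH effect: the variance is constant, as in an ordinary AR or ARIMA model
print(arch_conditional_variance(errors, alpha0=2.0, alphas=[0.0]))
```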
After the estimation of missing values we employed the White test to select the most appropriate ARCH models for all sixteen variables under study. The ACF and PACF values suggest that all of them showed an autoregressive pattern. Nine of them matched an ARCH (1) model, three matched an ARCH (2) model, and the remaining four did not show any ARCH effect and matched an AR (1) model. Normality tests on the resulting residuals were performed to check the validity of the fitted models, and all of them supported the normality assumption, confirming that our conclusions are valid.
5.2 Direction of Future Research

We had to complete this research within a short period of time, so it was not possible for us to look at every aspect of the time series properties of Saudi Arabia oil production data. Because of this time constraint we could not study the interrelationships among the variables. We have selected the appropriate ARCH model for each of the variables, but we could not study their goodness of fit. We also could not study how effective these ARCH models are in forecasting. A cross-validation study, for example one based on the data splitting technique, could be used here to judge the quality of prediction for different models. However, there is evidence [see Efron and Tibshirani (1997)] that cross-validation can be improved by implementing a special type of bootstrap. We only considered ARCH models in our study; sometimes GARCH could be a better alternative for this type of variable. We would like to extend our research in this direction in the future.
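As an indication of what a data splitting study might look like, the sketch below fits an AR (1) model by least squares on the first part of a series and measures one-step-ahead forecast errors on the held-out part. It was not carried out in this thesis; the function names, the split point and the toy series are all hypothetical, and a real study would use the oil production series and the selected ARCH models.

```python
import math

def fit_ar1(series):
    """Least-squares estimates (c, phi) for y_t = c + phi * y_{t-1} + e_t."""
    x, y = series[:-1], series[1:]
    m = len(y)
    mean_x, mean_y = sum(x) / m, sum(y) / m
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    return mean_y - phi * mean_x, phi

def holdout_rmse(series, n_train):
    """Fit on the first n_train points; return the one-step RMSE on the rest."""
    c, phi = fit_ar1(series[:n_train])
    errors = [series[t] - (c + phi * series[t - 1])
              for t in range(n_train, len(series))]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Toy series following y_t = 2 + 0.5 * y_{t-1} exactly (illustration only)
toy = [10.0]
for _ in range(11):
    toy.append(2.0 + 0.5 * toy[-1])

c_hat, phi_hat = fit_ar1(toy[:8])
print(round(c_hat, 6), round(phi_hat, 6))
print(holdout_rmse(toy, n_train=8))
```

Because the observations are ordered in time, the split keeps the earliest observations for fitting and the latest for evaluation rather than sampling at random.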
REFERENCES
1. Barnett, V. and Lewis, T. (1994). Outliers in Statistical Data, 3rd ed., Wiley, New York.
2. Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity, Journal of Econometrics, 31, 307-327.
3. Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977). Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B, 39, 1-38.
4. Efron, B. and Tibshirani, R. (1997). Improvements on cross-validation: The .632+ bootstrap method, Journal of the American Statistical Association, 92, 548-560.
5. Engle, R.F. (1982). Autoregressive conditional heteroskedasticity with estimates of variance of U.K. inflation, Econometrica, 50, 987-1006.
6. Franses, P.H. and van Dijk, D. (2000). Nonlinear Time Series Models in Empirical Finance, Cambridge University Press, Cambridge.
7. Gounder, M.K., Shitan, M. and Imon, A.H.M.R. (2007). Detection of outliers in non-linear time series: A review, Festschrift in Honour of Professor Mir Masoom Ali, Department of Mathematical Sciences, Ball State University, USA, 213-224.
8. Geary, R.C. (1947). Testing for normality, Biometrika, 34, 209-242.
9. Greene, W.H. (1997). Econometric Analysis, 3rd ed., Prentice Hall, New Jersey.
10. Hadi, A.S., Imon, A.H.M.R. and Werner, M. (2009). Detection of outliers, Wiley Interdisciplinary Reviews: Computational Statistics, 1, 57-70.
11. Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J. and Stahel, W. (1986). Robust Statistics: The Approach Based on Influence Functions, Wiley, New York.
12. Imon, A.H.M.R. (2003). Regression residuals, moments, and their use in tests for normality, Communications in Statistics - Theory and Methods, 32, 1021-1034.
13. Imon, A.H.M.R., Doula, M.S. and Hamzah, N.A. (2007). On the detection of ARCH effect in time series data, Proceedings of an International Conference on Mathematical Sciences on 'Integrating Mathematical Sciences within Society', Bangi - Putrajaya, Malaysia, pp. 783-789.
14. Little, R.J.A. and Rubin, D.B. (2002). Statistical Analysis with Missing Data, 2nd ed., Wiley, New York.
15. Mamun, A.S.A. (2013). Robust Statistics in Linear Structural Relationship Model and Analysis of Missing Values, unpublished Ph.D. thesis, University of Malaya.
16. Maronna, R.A., Martin, R.D. and Yohai, V.J. (2006). Robust Statistics: Theory and Methods, Wiley, New York.
17. Pearson, K. (1905). On the general theory of skew correlation and non-linear regression, Biometrika, 4, 171-212.
18. Pindyck, R.S. and Rubinfeld, D.L. (1998). Econometric Models and Economic Forecasts, 4th ed., Irwin/McGraw-Hill, Boston.
19. Rana, M.S., Habshah, M. and Imon, A.H.M.R. (2009). A robust rescaled moments test for normality in regression, Journal of Mathematics and Statistics, 5, 54-62.
20. Rousseeuw, P.J. (1984). Least median of squares regression, Journal of the American Statistical Association, 79, 871-880.
21. Rousseeuw, P.J. and Leroy, A.M. (1987). Robust Regression and Outlier Detection, Wiley, New York.
APPENDIX
SAUDI ARABIA OIL PRODUCTION DATA
Source: Saudi Arabian Monetary Agency (SAMA)
http://www.sama.gov.sa/sites/samaen/ReportsStatistics/statistics/Pages/YearlyStatistics.aspx
Columns: Year, Total Crude Oil Production, Total Export of Refined Oil, Export of Crude Oil to North America, Export of Refined Oil to North America, Export of Crude Oil to South America, Export of Refined Oil to South America
1962 599.76 81.59
1963 651.71 88.33
1964 694.13 95.76
1965 804.94 110.43
1966 948.57 113.19
1967 1023.84 122.16
1968 1113.71 151.74 46.19 1.12 36.33 1.76
1969 1173.89 158.21 41.62 0.87 34.44 3.64
1970 1386.67 207.89 20.82 0.08 50.59 6.4
1971 1740.68 193.95 70.3 2.92 91.29 7.46
1972 2201.96 208.1 90.31 5.83 115.15 5.32
1973 2772.61 213 137.14 7.87 247.29 9.04
1974 3095.09 210.57 139.8 8.24 342.02 4.49
1975 2582.53 175.26 117 8.91 344.88 10.35
1976 3139.28 205.78 171.15 8.24 490.96 8.5
1977 3357.96 188.39 359.68 2.63 369.21 6.11
1978 3029.9 174.8 509.2 1.71 139.04 3.69
1979 3479.15 175.13 641.74 6.57 116.38 2.48
1980 3623.8 178.45 619.11 4.53 127.41 2.31
1981 3579.89 193.75 508.28 9 142.81 6.74
1982 2366.41 195.1 171.05 7.53 93.93 11.92
1983 1656.88 146.67 128.06 7.35 67.64 6.67
1984 1492.9 177.85 83.32 6.4 37.13 7.66
1985 1158.8 196.9 47.11 12.21 44.33 1.79
1986 1746.2 265.53 243.15 27.29 78.9 3.77
1987 1505.4 248.11 * * * *
1988 1890.1 417.45 359.42 54.67 67.96 5.01
1989 1848.5 398.92 380.48 30.32 35.59 4.84
1990 2340.5 478.98 481.04 50.61 59.71 6.43
1991 2963 450.23 663.89 36.37 72.35 8.02
1992 3049.4 473.88 614.84 46.02 67.67 18.2
1993 2937.4 516.05 487.75 47.54 61.51 45.52
1994 2937.9 498.18 521.41 36.65 60.36 35.1
1995 2928.54 482.38 504.02 24.1 53.3 17.29
1996 2965.45 546.07 490.66 35.58 47.15 26.73
1997 2924.28 508.42 488.73 27.29 33.16 25.54
1998 3022.27 499.66 544.24 16.81 31.42 31.75
1999 2761.1 467.08 534.2 10.43 26.95 16.44
2000 2962.6 448.24 577.17 7.7 22.47 20.56
2001 2879.46 395.13 560.06 5.76 36.76 13.55
2002 2588.98 362.64 488.8 4.91 22.08 10.87
2003 3069.74 411.94 596.92 10.61 23.84 11.43
2004 3256.3 487.07 558.38 22.25 22.32 13.77
2005 3413.94 505.67 530.93 18.55 23.79 12.12
2006 3360.9 466.31 534.5 13.23 23.78 7.23
2007 3217.77 415.66 571.78 11.04 22.34 9.36
2008 3366.34 386.27 590.66 5.82 23.03 8.97
2009 2987.27 368.06 386.12 4.67 23.01 7.48
2010 2980.43 347.06 442.24 5.32 24.43 5.61
Columns: Year, Export of Crude Oil to Western Europe, Export of Refined Oil to Western Europe, Export of Crude Oil to Middle East, Export of Refined Oil to Middle East, Export of Crude Oil to Africa, Export of Refined Oil to Africa
1962 180.92 2.03
1963 199.74 2.77
1964 247.71 1.68
1965 301.09 1.69
1966 404.54 2.85
1967 425.15 4.03
1968 472.25 2.25 64.75 2.39 28.91 6.21
1969 496.04 1.28 67.66 1.98 45.43 4.32
1970 608.09 6.71 69.84 1.67 50.64 7.58
1971 814.52 9.12 75.07 2.23 67.43 6.48
1972 1130.36 7.18 71.31 2.14 57.3 7.87
1973 1332.9 19.83 77.1 1.49 80.77 7.39
1974 1526.68 19.22 79.4 1.66 36.85 4.03
1975 1113.12 12.71 66.97 0.86 40.2 3.51
1976 1268.86 13.47 83.18 1.7 31.74 4.36
1977 1296.05 12.26 114.34 1.85 21.19 2.21
1978 1092.28 15.91 94.87 1 13.83 2.97
1979 1337.21 21.16 104.27 1.21 31.44 1.45
1980 1432.3 36.93 98.66 1.88 43.43 0.39
1981 1396.69 50.98 114.48 7.29 55.82 0.25
1982 727.72 33.89 76.95 9.93 37.59 --
1983 364.53 21.02 67.83 0.6 25.96 0.13
1984 247.87 32.48 52.07 11.38 20.36 1.14
1985 218.5 26.95 36.52 19.71 14.31 7.06
1986 458.72 44.82 81.93 16.62 1.93 6.78
1987 * * * * * *
1988 356.7 63.02 78.99 22.52 13.51 12.59
1989 320.92 48.27 69.86 25.32 4.61 11.34
1990 380.19 80.67 77.35 34.64 32.71 18.73
1991 623.1 67.82 78.8 48.39 61.44 16.08
1992 636.24 65.05 78.2 45.08 35.48 18.92
1993 628.37 76.28 74.69 60.14 33.96 17.37
1994 601.77 64.22 81.52 56.63 35.15 18.41
1995 598.37 33.54 80.54 53.19 34.96 14.88
1996 530.62 30.01 83.61 60.51 35.24 19.8
1997 591.13 29.95 77.69 67.68 38.57 29.73
1998 645.73 34.8 76.49 50.36 49.61 41.84
1999 454.33 22.83 68.97 33.75 73.66 33.2
2000 483.8 28.48 60.44 43.77 79.45 39.66
2001 405.86 29.52 57.44 44.93 64.62 37.02
2002 343.12 18.3 49.46 36.61 68.36 29.98
2003 434.86 29.76 72.69 45.93 96.34 34.25
2004 459.56 49.11 95.45 51.76 88.74 36.91
2005 440.67 55.57 112.87 56.75 86.02 41.4
2006 374.8 49.64 109.48 72.01 79.01 45.17
2007 306.04 36.28 113.06 61.04 71.76 45.31
2008 310.97 38.9 110.23 56.66 74.7 49.42
2009 228.45 31.32 104.25 71.77 60.33 40.03
2010 180.92 25.96 107.28 74.87 54.17 33.86
Columns: Year, Export of Crude Oil to Asia and Far East, Export of Refined Oil to Asia and Far East, Export of Crude Oil to Oceania, Export of Refined Oil to Oceania
1962
1963
1964
1965
1966
1967
1968 289.01 58.99 28.34 4.55
1969 306.59 63.63 28.27 4.38
1970 347.45 96.2 26.74 8.03
1971 389.01 74.75 20.57 4.41
1972 518.69 78.49 9.41 10.09
1973 670.94 69.77 14.2 5.66
1974 743.95 75.15 22.98 4.57
1975 699.69 75.95 27.53 8.42
1976 860.6 100.77 33.15 5.7
1977 938.37 96.2 43.21 3.34
1978 928.6 102.51 34.88 2.7
1979 947.43 102.47 40 1.74
1980 1008.06 100.34 46.72 2.51
1981 1024.16 100.79 49.3 0.98
1982 913.46 120.16 37.7 0.95
1983 753.49 103.24 23.57 1.03
1984 705.74 112.66 21.4 1.92
1985 410.84 122.52 9.11 4.99
1986 321.47 154.66 3.92 8.65
1987 * * * *
1988 346.72 246.56 22.19 10.64
1989 388.96 259.13 17.08 12.21
1990 593.1 274.67 18.32 10.87
1991 861.78 255.13 20.75 15.72
1992 958.19 264.53 18.36 13.47
1993 986.4 246.87 24.24 12.07
1994 957.36 281.42 17.7 3.96
1995 1006.31 323.92 18.63 15.46
1996 1031.49 350.95 17.24 22.49
1997 1010.81 311.23 17.24 17
1998 971.35 309.51 13.64 14.59
1999 921.77 335.66 7.8 14.77
2000 1044.67 292.83 14.38 15.24
2001 1067.98 252.25 10.38 12.1
2002 942.89 256.46 14.18 5.51
2003 1149.87 276.82 6.33 3.14
2004 1251.06 309.57 11.26 3.7
2005 1435.34 317.22 1.62 4.06
2006 1440.63 275.92 3.52 3.11
2007 1453.23 249.66 2.95 2.97
2008 1560.86 224.27 1.97 2.23
2009 1482.61 211.71 2.89 1.08
2010 1555.22 201.44 1.6 0