Econ1313-As2 Sample
sID S3879951
1. Overview of topic
2. Data Importation and Descriptive Statistics
a) Missing values
b) Natural logarithm
Part 2. Descriptive Statistics and initial estimation
1. Descriptive Statistics
2. Model 1
3. OLS model of Model 1
Part 3. Interpretation
1. Interpret Goodness-of-Fit - R²
2. F-test for Model 1
3. t-test for Model 1
4. Discussion
5. Testing multicollinearity
Part 4. Further Estimation
1. High DEPRATIO
2. Hypothesis t-test for HDP
3. Generate interaction term
4. Hypothesis t-test for interaction term
5. Model 4
6. Choosing the best Model
Part 5. Conclusion
1. Summarizing the findings
2. Policy recommendations
3. Limitations and suggestions
References
Appendices
1. Overview of topic
Access to healthcare is a prominent issue in many countries, as health is a fundamental factor
in the development of both individuals and the country as a whole. Improving health
conditions is a national objective for many countries and one of the Sustainable
Development Goals. Thus, it is crucial to identify the factors which determine healthcare
expenditures.
In a recent study, Awais et al. (2021) examined the determinants of healthcare expenses
in developing and transitional countries. The results showed that, among the examined
factors, Foreign Direct Investment (FDI), personal remittances (PR), urbanization, life
expectancy (LE), population age 65 and above (POP65) and unemployment are
noteworthy determinants of healthcare spending in developing and transitional countries. In
particular, FDI and PR have negative relationships with health expenditure in developing
countries, while PR has a significant positive effect among transitional countries.
Additionally, health spending in transitional countries is positively affected by
unemployment. On the other hand, urbanization has a considerable negative effect in
both developing and transitional countries. Meanwhile, LE and POP65 have positive
associations with health expenses in countries classified by income (Awais et al.
2021).
Typically, FDI has a major impact on healthcare expenditure due to its ability to raise
awareness of health-related goods and services in low-income countries, thus affecting the
stocks of said goods and services. Meanwhile, PR is found to improve the healthcare knowledge
of the populace and reduce poverty. Moreover, trade liberalization is strongly related to LE
and child fatality rates. LE is found to be one of the main determinants of health
expenditure and the child mortality rate, as well as having a major effect on FDI (Akca et al.
2017). Furthermore, urbanization helps people to have more access to healthcare. On the
other hand, some studies suggest that the unemployment rate has a negative effect on
healthcare expenditure (Abbas & Hiemenz 2011) while others find the opposite (Braendle &
Colombier 2016). As for population age, specifically population age 65 and above, it is
observed that the variable has a positive association with healthcare expenditure (Awais
et al. 2021).
2. Data Importation and Descriptive Statistics
a) Missing values
In the dataset given, there are missing values for Niger and South Sudan in 2012. To solve the
issue, the missing value for Niger is replaced by the average of the two nearest years, 2011
and 2013. For South Sudan, many values are missing and no neighbouring values are
available to fill them, so the country is removed from the dataset.
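The imputation rule described for Niger can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `impute_from_neighbours` and the numbers are hypothetical and are not the assignment's data.

```python
def impute_from_neighbours(series, year):
    """Fill a missing year with the mean of the two adjacent years."""
    before, after = series.get(year - 1), series.get(year + 1)
    if before is None or after is None:
        # Mirrors the South Sudan case: no usable neighbours, so no imputation.
        raise ValueError("both neighbouring years are needed")
    filled = dict(series)
    filled[year] = (before + after) / 2
    return filled

# Hypothetical figures for illustration only.
niger = {2011: 20.0, 2012: None, 2013: 24.0}
niger_filled = impute_from_neighbours(niger, 2012)
print(niger_filled[2012])
```

When either neighbouring year is also missing, the function raises an error instead of guessing, which corresponds to dropping the country from the sample.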
b) Natural logarithm
In the suggested model, the natural logarithms of current healthcare expenditure per capita
(PCHE), crude birth rate (per 1000 people) (CBR) and gross domestic product per capita (current
US$) (GDPPC) will be used, while net official development assistance received (current US$)
(NETODA) is kept in levels because it contains negative values (see Model 4). The main reason
is that the natural logarithm is useful for multiple linear regression models in the sense that
it makes the data less skewed and closer to normally distributed.
In addition, in the process of deriving the descriptive statistics of the variables, outliers are
found; the natural logarithm reduces the impact of these outliers on the estimates.
Another benefit of the natural logarithm is that it can reduce heteroscedasticity, under which
OLS ceases to be the minimum variance estimator and the usual F-test and t-test become
invalid. Thus, it is in the best interest of the analysis to take the natural logarithm of
these variables.
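The effect of the logarithm on skewness can be checked with a short sketch. The GDP per capita values below are hypothetical and chosen only to include one large outlier; `skewness` is a helper written for this illustration.

```python
import math

def skewness(values):
    """Population skewness: third central moment over variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical GDP per capita values (current US$) with one large outlier.
gdppc = [400.0, 650.0, 900.0, 1500.0, 2500.0, 6000.0, 40000.0]
log_gdppc = [math.log(v) for v in gdppc]

raw_skew = skewness(gdppc)
log_skew = skewness(log_gdppc)
print(round(raw_skew, 2), round(log_skew, 2))
```

In this sample the skewness of the logged series is noticeably smaller than that of the raw series, which is the behaviour the paragraph above relies on.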
Part 2. Descriptive Statistics and initial estimation
1. Descriptive Statistics
The descriptive statistics of the variables are shown below.
2. Model 1
The population multiple linear regression Model 1 is:
log(PCHE) = β0 + β1 log(CBR) + β2 URBAN + β3 DEPRATIO + β4 log(GDPPC) + β5 NETODA + u
3. OLS model of Model 1
MLR.1: Linearity. The relationship between the dependent and independent variables is linear
in the parameters. This assumption is satisfied.
MLR.2: Random sampling. The dataset given consists of 36 countries from different
continents and income classes, so this assumption is treated as met.
MLR.3: No perfect collinearity. None of the independent variables in Model 1 is an exact
linear combination of the others, so this assumption is satisfied for Model 1.
MLR.4: Zero conditional mean. The error term should have an expected value of zero given the
explanatory variables, meaning that the error carries no information about them. The mean of
the residuals is found to be nearly zero, so MLR.4 is taken as satisfied.
MLR.5: Homoscedasticity. This assumption is tested using the Breusch-Pagan test, whose null
hypothesis is homoscedasticity. The p-value (0.666) is larger than 0.05. Hence, we fail to
reject the null hypothesis: there is no evidence of heteroscedasticity in the data, and MLR.5
is met.
Since assumptions MLR.1 to MLR.5 are satisfied as discussed above, OLS can be applied to
Model 1 as below.
Part 3. Interpretation
1. Interpret Goodness-of-Fit - R²
Using the results given, Model 1 is written as below in standard regression format:
The adjusted R-squared is 0.81, which means that approximately 81% of the variation in
the natural logarithm of healthcare expenditure per capita is explained by the Model after
accounting for the number of regressors.
Because adding more variables to the Model never lowers the multiple R-squared, regardless
of whether the added variable has any real relationship with the dependent variable, the
adjusted R-squared penalizes the addition of new variables. A large gap between the multiple
and adjusted R-squared would suggest that the Model contains regressors with little
explanatory power. The multiple R-squared of Model 1 is 0.839, so the difference
between the multiple and adjusted R-squared is approximately 0.029; this small gap implies
that the variables mostly contribute to the Model.
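As a quick consistency check, the adjusted R-squared can be recomputed from the multiple R-squared with the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below assumes n = 36 observations (the 36 countries mentioned earlier) and k = 5 regressors (the five variables in the VIF table); both counts are assumptions taken from the text, not from the regression output.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for n observations and k regressors (intercept excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# R-squared as reported in the text; n and k are assumptions (see lead-in).
adj_r2 = adjusted_r_squared(0.839, n=36, k=5)
print(round(adj_r2, 3))  # 0.812
```

Under these assumptions the result rounds to the 0.81 reported above.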
2. F-test for Model 1
Null hypothesis: H0: β1 = β2 = β3 = β4 = β5 = 0, all slope coefficients are equal to 0 (the
variables are not at all useful in determining healthcare expenditure).
Alternative: H1: at least one βj ≠ 0 (at least one variable has an impact on healthcare
expenditure).
Using R, we obtain an F-statistic of 31.204 with a p-value of 5.002e-11. At the
significance level of α = 0.05, we reject the null hypothesis as the p-value is far smaller
than the significance level.
Thus, the F-test indicates that at least one variable in the Model has an impact on healthcare
expenditure.
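The overall F-statistic is tied to the multiple R-squared through F = (R²/k) / ((1 - R²)/(n - k - 1)). The sketch below recomputes it from the rounded R-squared of 0.839, again assuming n = 36 and k = 5 as read from the text; it is a plausibility check rather than a reproduction of the R output.

```python
def f_statistic(r2, n, k):
    """Overall F statistic implied by R-squared, n observations and k regressors."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# n = 36 and k = 5 are assumptions taken from the text.
f_stat = f_statistic(0.839, n=36, k=5)
print(round(f_stat, 2))  # about 31.27
```

The small difference from the reported 31.204 is consistent with the R-squared having been rounded to three decimals before being plugged in.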
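3. t-test for Model 1
For each coefficient βj:
Null hypothesis: H0: βj = 0 (the variable is not a determinant of healthcare expenditure).
Alternative: H1: βj ≠ 0 (the variable is a determinant of healthcare
expenditure).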
Interpretation:
The coefficient of log(GDPPC) = 0.673904599 indicates that if GDP per capita increases by 1%,
healthcare expenditure per capita is expected to increase by about 0.674%, holding other
factors constant.
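Because both variables are in logarithms, the coefficient is an elasticity, and reading it as "a 1% rise gives a 0.674% rise" is a first-order approximation. A minimal sketch of the exact calculation follows; `pct_effect` is a hypothetical helper name.

```python
def pct_effect(beta, pct_increase):
    """Exact % change in y implied by a log-log coefficient for a given % change in x."""
    return ((1 + pct_increase / 100) ** beta - 1) * 100

beta_gdppc = 0.673904599  # coefficient reported in the text
effect_1pct = pct_effect(beta_gdppc, 1)
effect_10pct = pct_effect(beta_gdppc, 10)
print(round(effect_1pct, 3), round(effect_10pct, 2))
```

For a 1% change the exact figure is almost identical to the coefficient itself; for larger changes such as 10% the exact effect falls slightly below ten times the coefficient.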
4. Discussion
From the analysis above and the research article in part 1, the findings are broadly as
expected. Urbanization is found to be negatively related to healthcare expenditure
in developing and transitional countries. On the other hand, in the
authors' findings, the relationships between GDP and healthcare expenditure in developing
and transitional countries are negative, with coefficients of -0.09 and -0.02 (Awais et al.
2021). However, since these coefficients are small in magnitude, the negative relationships
are weak.
5. Testing multicollinearity
Using the variance inflation factor (VIF) method, multicollinearity is tested between the
variables.
Variable      VIF
log(CBR)      7.351083
URBAN         3.446012
DEPRATIO      8.733128
log(GDPPC)    4.548462
NETODA        1.224260
Table 4. Variance inflation factor of variables
As can be seen in Table 4, the VIFs of log(CBR) and DEPRATIO are relatively high. However,
all values are below the commonly used threshold of 10, so the degree of correlation among
the regressors is acceptable and no severe multicollinearity is detected in the Model.
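A VIF is defined as 1/(1 - R²_j), where R²_j is the R-squared from regressing variable j on the other regressors. Inverting that definition shows how much of each variable is explained by the rest. The sketch below uses the values from Table 4; the threshold of 10 is the rule of thumb used in the text.

```python
# VIF values copied from Table 4.
vif = {
    "log(CBR)": 7.351083,
    "URBAN": 3.446012,
    "DEPRATIO": 8.733128,
    "log(GDPPC)": 4.548462,
    "NETODA": 1.224260,
}

# Invert VIF = 1 / (1 - R2_j) to get the auxiliary R-squared of each regressor.
implied_r2 = {name: 1 - 1 / value for name, value in vif.items()}
above_threshold = [name for name, value in vif.items() if value >= 10]

for name, r2 in implied_r2.items():
    print(f"{name}: auxiliary R-squared = {r2:.3f}")
print("Variables at or above the threshold of 10:", above_threshold)
```

For DEPRATIO the implied auxiliary R-squared is close to 0.89, which explains why its VIF is the highest in the table even though it stays under the threshold.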
Part 4. Further Estimation
1. High DEPRATIO
The median of DEPRATIO is estimated at 32.0195. A new dummy variable, High
DEPRATIO (HDP), is generated; it equals 1 if DEPRATIO is higher
than its median and 0 otherwise. The new Model 2 is created by replacing the DEPRATIO
variable with the HDP variable. The population multiple linear regression of Model 2 is
shown below.
2. Hypothesis t-test for HDP
The p-value of HDP is higher than the level of significance (0.243 > 0.05). As a result, we
fail to reject the null hypothesis, meaning that there is insufficient evidence that HDP
affects healthcare expenditure per capita.
4. Hypothesis t-test for interaction term
The p-value of the interaction term is 0.975, which is far above the significance level of
0.05. Thus, we fail to reject the null hypothesis, which indicates that there is no evidence
that the effect of GDPPC depends on HDP.
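The construction of the HDP dummy and of its interaction with log(GDPPC) can be sketched as follows. The data are hypothetical, and the cutoff is assumed to be the median of DEPRATIO, as reported at the start of this part.

```python
import math
import statistics

# Hypothetical values for five countries, for illustration only.
depratio = [28.5, 30.1, 32.0, 35.4, 41.2]
gdppc = [5200.0, 3100.0, 1800.0, 950.0, 600.0]

cutoff = statistics.median(depratio)                # assumed cutoff: the median
hdp = [1 if d > cutoff else 0 for d in depratio]    # 1 = high dependency ratio

# Interaction term: equals log(GDPPC) for HDP countries and 0 otherwise.
interaction = [h * math.log(g) for h, g in zip(hdp, gdppc)]

print(cutoff, hdp)
print([round(v, 3) for v in interaction])
```

With this coding, the coefficient on the interaction term measures how much the slope on log(GDPPC) differs for countries with a high dependency ratio, which is what the t-test above examines.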
5. Model 4
While re-working the Model, we found that the NETODA data is extremely
volatile due to a huge outlier. To reduce this problem and help secure homoscedasticity,
Model 4 could use the natural logarithm of NETODA (log(NETODA)). The estimating
equation for Model 4 would be:
However, the major problem of the dataset is the presence of negative values (there are
two). Because the natural logarithm is undefined for negative numbers, it cannot be applied
to NETODA. Hence, Model 4 is not a feasible choice.
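The obstacle described here can be detected before estimation with a simple check. The sketch below is illustrative: the NETODA figures are hypothetical (two of them negative, as in the text) and `log_blockers` is a helper name invented for this example.

```python
import math

def log_blockers(values):
    """Return the values for which the natural logarithm is undefined."""
    return [v for v in values if v <= 0]

# Hypothetical NETODA figures (current US$); two are negative, as in the text.
netoda = [1.2e9, -3.5e7, 4.8e8, -1.1e6, 2.3e9]

blockers = log_blockers(netoda)
if blockers:
    print(f"log(NETODA) is undefined for {len(blockers)} observations")
else:
    log_netoda = [math.log(v) for v in netoda]
```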
Part 5. Conclusion
2. Policy recommendations
Based on my analysis, the policy recommendation for the governments of my assigned group
of countries would be to focus on factors which induce economic growth. A more stable
economy tends to attract FDI into healthcare-related systems. Another note is that
GDP per capita strongly affects healthcare expenditure, which can be explained by the fact
that as income increases, people are more willing and able to access healthcare, which might
be expensive in some places.
References
Abbas F and Hiemenz U (2011) 'Determinants of Public Health expenditures in Pakistan',
ZEF-Discussion Papers on Development Policy No. 158, 1 November 2011, accessed 6
December 2022, <https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1967070>.
Awais M, Khan A and Ahmad SM (2021) 'Determinants of health expenditure from global
perspective: A panel data analysis', Liberal Arts & Social Sciences International Journal, 29
June 2021, 5(1):481-496, doi: https://doi.org/10.47264/idea.lassij/5.1.31, accessed 9
December 2022, <https://ideapublishers.org/index.php/lassij/article/view/306/177>.
Braendle T and Colombier C (2016) 'What drives public health care expenditure growth?
Evidence from Swiss cantons, 1970–2012', Health Policy, September 2016, 120(9):1051-1060,
doi: https://doi.org/10.1016/j.healthpol.2016.07.009, accessed 10 December 2022,
<https://www.sciencedirect.com/science/article/abs/pii/S0168851016301816?via%3Dihub>.
Appendices
Variables | Unit | Abbreviation | Dependent/Independent Variables
Current health expenditure per capita | Current US$ | PCHE | Dependent variable
Urbanization | Urban population, % of the total population | Urban | Independent variable
Share of the population that is under 15 years of age or above 65 years of age as a percentage of the population | | DepRatio | Independent variable
Net official development assistance received | Current US$ | NETODA | Independent variable
Appendix 1. Variables used in estimating multiple regression of healthcare expenditure