Professional Documents
Culture Documents
Healthcare Cost Analysis (Source Code) : Description
Healthcare Cost Analysis (Source Code) : Description
(SOURCE CODE)
DESCRIPTION
Background and Objective:
A nationwide survey of hospital costs conducted by the US Agency for Healthcare consists of hospital
records of inpatient samples. The given data is restricted to the city of Wisconsin and relates to
patients in the age group 0-17 years. The agency wants to analyze the data to research on
healthcare costs and their utilization.
Domain: Healthcare
Dataset Description:
Attribute Description
Analysis to be done:
1. To record the patient statistics, the agency wants to find the age category of people who
frequently visit the hospital and has the maximum expenditure.
2. In order of severity of the diagnosis and treatments and to find out the expensive treatments, the
agency wants to find the diagnosis-related group that has maximum hospitalization and expenditure.
3. To make sure that there is no malpractice, the agency needs to analyze if the race of the patient is
related to the hospitalization costs.
4. To properly utilize the costs, the agency has to analyze the severity of the hospital costs by age
and gender for the proper allocation of resources.
5. Since the length of stay is the crucial factor for inpatients, the agency wants to find if the length of
stay can be predicted from age, gender, and race.
6. To perform a complete analysis, the agency wants to find the variable that mainly affects hospital
costs.
rm(list = ls())
getwd()
setwd(choose.dir())
# Import Data
# 1. To record the patient statistics, the agency wants to find the age category of people who
frequently visit the hospital and has the maximum expenditure.
View(hosp_cost)
head(hosp_cost)
summary(hosp_cost)
summary(hosp_cost$AGE)
table(hosp_cost$AGE)
summary(as.factor(hosp_cost$AGE))
View(age_table)
class(age_table)
N_Max_Patient
Patient_Age_idx <- which.max(age_table$Freq)
Patient_Age_idx
Patient_Age
class(Expenditure_by_Age)
View(Expenditure_by_Age)
Max_Expenditure
Age_Max_Exp_idx
Age_Max_Exp
# Printing results
t <- table(hosp_cost$APRDRG)
d <- as.data.frame(t)
Group_Max_Hosp
Group_Max_Hosp_Freq
Expenditure_by_Group
Group_Max_Expe_idx
Group_with_Max_Expe
Group_Max_Expe
barplot(Expenditure_by_Group$TOTCHG,names.arg =Expenditure_by_Group$APRDRG,xlab =
"Diagnosis Group",ylab = "Expenditure",main = "Diagnosis Vs Expenditure")
# Printing results
#3. To make sure that there is no malpractice, the agency needs to analyze if the race of the patient
is related to the hospitalization costs
table(hosp_cost$RACE)
fit
summary(fit)
summary(fit1)
View(fit$fitted.values)
# Printing results
print("From R-squared value of linear model & p value of Analysis of Variance Model we can
conclude that race of the patient is not related to the hospitalization costs ")
#4. To properly utilize the costs, the agency has to analyze the severity of the hospital costs by age
and gender for proper allocation of resources.
table(hosp_cost$FEMALE)
summary(a)
summary(b)
# Printing results
print("From p value we can conclude that Age & gender have siginificane role in deciding cost")
#5. Since the length of stay is the crucial factor for inpatients, the agency wants to find if the length
of stay can be predicted from age, gender, and race.
table(hosp_cost$LOS)
summary(cat)
summary(cat)
print("From Variance test & linear model we can conclude that stay can not be predicted using age
gender or race")
#6. To perform a complete analysis, the agency wants to find the variable that mainly affects the
hospital costs.
aov(TOTCHG ~.,data=hosp_cost)
summary(mod)
print("From Variance test & linear model we can conclude that variables like Age,LOS & Aprdrg play
major role in the outcome of expanditure")