
1a) Descriptive Statistics and Visualizations

Descriptive Statistics: Descriptive statistics provide a summary of the main
aspects of a dataset. Common measures include:
o Central Tendency:
§ Mean: Average value.
§ Median: Middle value.
§ Mode: Most frequent value.
o Variability:
§ Range: Difference between the maximum and minimum values.
§ Variance: Average squared deviation from the mean.
§ Standard Deviation: Square root of the variance.
Visualizations: Visualizations help in understanding the data distribution.
Common visualizations include histograms, box plots, scatter plots, and
summary tables.
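
As a rough illustration of the measures above, the following base R sketch uses a small made-up vector of weekly hours. The vector `hours` and the helper `stat_mode` are hypothetical names introduced here, not part of the survey data. Base R has no built-in function for the statistical mode, so the helper tabulates the values instead.

```r
# Hypothetical sample: hours per week spent in a vehicle (illustrative only)
hours <- c(10, 5, 8, 5, 12, 7, 5, 9)

# Central tendency
mean_hours   <- mean(hours)
median_hours <- median(hours)

# Helper (assumed name): most frequent value(s) of a numeric vector
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])
}
mode_hours <- stat_mode(hours)

# Variability
range_hours <- max(hours) - min(hours)  # range as a single number
var_hours   <- var(hours)               # sample variance (divides by n - 1)
sd_hours    <- sd(hours)                # square root of var_hours

# Quick visual checks of the distribution
hist(hours, main = "Hours per week in vehicle", xlab = "Hours")
boxplot(hours, horizontal = TRUE, xlab = "Hours")
```

Note that `var()` and `sd()` compute the sample versions, which divide by n - 1 rather than n.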

1b) Hypothesis Tests for Vehicle Usage


Hypothesis Testing: Hypothesis tests use sample data to assess whether a
difference or relationship seen in the sample is likely to hold in the
population, or could plausibly be due to chance. Common tests include:
o t-test:
§ Compares means of two groups.
o ANOVA (Analysis of Variance):
§ Compares means of more than two groups.
Example: In the provided code, a t-test compares the mean number of hours
per week spent in the vehicle between genders. The null hypothesis is that
the mean hours per week in the vehicle do not differ between genders.
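
A minimal sketch of such a test is shown here on made-up numbers. The data frame `survey` and its columns `Gender` and `Hours` are hypothetical stand-ins for the real survey columns.

```r
# Hypothetical data: weekly hours in vehicle for two groups (illustrative only)
survey <- data.frame(
  Gender = rep(c("Male", "Female"), each = 6),
  Hours  = c(10, 5, 8, 12, 7, 9,   6, 4, 9, 5, 7, 8)
)

# Welch two-sample t-test (R's default; does not assume equal variances)
# H0: mean hours are equal for the two genders
result <- t.test(Hours ~ Gender, data = survey)

result$estimate  # group means
result$p.value   # reject H0 at the 5% level only if this is below 0.05
```

The formula interface of `t.test()` requires a grouping variable with exactly two levels; a factor with more levels calls for ANOVA instead.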

1c) Hypothesis Tests for Vehicle Preference


ANOVA: ANOVA is used to compare means across multiple groups. In the
example, ANOVA is applied to test whether mean age differs significantly
across vehicle types.

Example: The null hypothesis is that there is no difference in the mean
age across different vehicle types.
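
The idea can be sketched on made-up data as follows. The data frame `drivers` and its columns `Vehicle` and `Age` are hypothetical, chosen only to show the mechanics of a one-way ANOVA.

```r
# Hypothetical data: ages of drivers of three vehicle kinds (illustrative only)
drivers <- data.frame(
  Vehicle = rep(c("Car", "SUV", "Truck"), each = 5),
  Age     = c(19, 24, 31, 27, 22,   35, 41, 38, 29, 44,   31, 29, 36, 40, 33)
)

# One-way ANOVA; H0: mean age is the same for every vehicle kind
fit <- aov(Age ~ Vehicle, data = drivers)
anova_table <- summary(fit)[[1]]
anova_table

# p-value of the F test for the Vehicle factor
p_value <- anova_table[["Pr(>F)"]][1]
```

A small p-value indicates only that at least one group mean differs; a follow-up such as `TukeyHSD(fit)` would be one way to see which pairs differ.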

1d) Identifying Variables Affecting Miles Driven


Linear Regression: Linear regression models the relationship between a
dependent variable and one or more independent variables. In the example, a
linear regression model is fitted to identify variables affecting miles driven per
week.
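Example: Each coefficient in the regression model estimates the change in
miles driven per week for a one-unit increase in that variable, holding the
other variables constant. Its p-value indicates whether the effect is
statistically significant.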
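
A small sketch of reading such a coefficient table is shown here, again on made-up numbers. The data frame `trips` and its columns are hypothetical and much smaller than the real survey.

```r
# Hypothetical data (illustrative only)
trips <- data.frame(
  Miles           = c(450, 370, 500, 220, 310, 480, 150, 390),
  Hours           = c(10, 5, 12, 4, 6, 11, 3, 8),
  Miles_from_work = c(30, 22, 35, 8, 15, 28, 4, 20)
)

# Multiple linear regression of weekly miles on two predictors
fit <- lm(Miles ~ Hours + Miles_from_work, data = trips)

# One row per term: estimate, standard error, t value, and p-value
coef_table <- summary(fit)$coefficients
coef_table
```

Predictors whose `Pr(>|t|)` value falls below the chosen significance level (commonly 0.05) are the ones usually reported as significantly affecting the response.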

1e) Advertisement Recommendation


Insights from Analysis: Based on the analysis, insights are generated to
inform the ad campaign. This could include understanding which vehicle types
are preferred, how satisfaction levels impact usage, and which demographic
factors play a significant role.

Recommendations: Recommendations for the ad campaign can be tailored
based on the identified insights. For example, if people prefer driving SUVs for
longer distances, the ad campaign could focus on the comfort and features of
SUVs for long drives.
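
One simple way to check a claim like the SUV example is to compare average weekly miles by vehicle. The sketch below uses a made-up data frame `usage` with hypothetical columns `Vehicle` and `Miles`; the numbers are not from the survey.

```r
# Hypothetical data (illustrative only)
usage <- data.frame(
  Vehicle = c("SUV", "Car", "SUV", "Truck", "Car", "Truck", "SUV", "Car"),
  Miles   = c(520, 300, 480, 450, 260, 370, 500, 340)
)

# Mean weekly miles per vehicle kind, largest first
mean_miles <- aggregate(Miles ~ Vehicle, data = usage, FUN = mean)
mean_miles[order(-mean_miles$Miles), ]
```

A difference in group means like this is only suggestive; the hypothesis tests described earlier indicate whether it is statistically significant.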

The following R code addresses each part of the question:


# Load required libraries
library(tidyverse)

# Input data
# The hours column is named Hours_per_week_in_vehicle here because a name
# starting with "#" is not a valid R name, and read.table() treats "#" as
# the start of a comment.
data <- read.table(text = "
Vehicle_Driven Type Satisfaction_with_Vehicle Gender Age Hours_per_week_in_vehicle Miles_driven_per_week Number_of_Children Average_number_of_riders Miles_from_work
Truck Domestic Yes Male 31 10 450 0 1 30
Truck Domestic Yes Male 29 5 370 1 1 22
# ... (data for all 50 individuals)
Car Foreign No Female 19 5 500 0 2 4
", header = TRUE, stringsAsFactors = FALSE)

# 1a) Descriptive statistics and visualizations
summary(data)
str(data)
head(data)

# 1b) Hypothesis tests for vehicle usage across different factors
# Example: t-test for hours per week by Gender
t.test(Hours_per_week_in_vehicle ~ Gender, data = data)
# Conduct similar tests for other factors (Vehicle Type, Vehicle
# Driven, Satisfaction with Vehicle)

# 1c) Hypothesis tests for vehicle preference based on distance from
# work or age
# Example: ANOVA for Age vs Vehicle Type (the Type column in the data)
lm_model <- lm(Age ~ Type, data = data)
anova(lm_model)

# Conduct similar tests for other factors (Vehicle Driven,
# Satisfaction with Vehicle, Miles from Work)

# 1d) Identify variables significantly affecting miles driven per
# week
lm_model_miles <- lm(Miles_driven_per_week ~ ., data = data)
summary(lm_model_miles)

# 1e) Advertisement recommendation based on insights
# Generate insights based on the analysis and provide recommendations
# for the ad campaign.

# Note: The code snippets for 1b, 1c, 1d are placeholders. You need
# to replace them with appropriate tests and models based on the nature
# of your data and hypotheses.

Below is a fuller R script that reads the data from a file, computes
descriptive statistics, and runs the hypothesis tests and regression that
inform the recommendations for the ad campaign:
# Load required libraries
library(tidyverse)

# Read data from a CSV file (replace 'your_data.csv' with the actual
# file name)
data <- read.csv("your_data.csv", header = TRUE)

# 1a) Descriptive statistics and visualizations


summary(data)
str(data)
head(data)

# 1b) Hypothesis tests for vehicle usage across different factors
# Example: t-test for hours per week by Gender
# (assumes the hours column is named Hours_per_week_in_vehicle; a name
# starting with "#" cannot be used unquoted in an R formula)
t_test_gender <- t.test(Hours_per_week_in_vehicle ~ Gender, data = data)
print(t_test_gender)

# Conduct similar tests for other factors (Vehicle Type, Vehicle
# Driven, Satisfaction with Vehicle)

# 1c) Hypothesis tests for vehicle preference based on distance from
# work or age
# Example: ANOVA for Age vs Vehicle Type
# (assumes the vehicle type column is named Type, as in the inline data)
anova_age_vehicle_type <- aov(Age ~ Type, data = data)
print(summary(anova_age_vehicle_type))

# Conduct similar tests for other factors (Vehicle Driven,
# Satisfaction with Vehicle, Miles from Work)

# 1d) Identify variables significantly affecting miles driven per
# week
lm_model_miles <- lm(Miles_driven_per_week ~ ., data = data)
summary(lm_model_miles)

# 1e) Advertisement recommendation based on insights
# Generate insights based on the analysis and provide recommendations
# for the ad campaign.

# Note: The code snippets for 1b, 1c, 1d are placeholders. You need
# to replace them with appropriate tests and models based on the nature
# of your data and hypotheses.

Make sure to replace "your_data.csv" with the actual file path or name of
your data file. Additionally, customize the hypothesis tests and models based
on your specific research questions and data characteristics.
