Data Science – Fall 2023

Assignment 01
Instructions: Submit handwritten responses.
Scenario 1: Customer Churn Prediction for an E-commerce Platform
Business Problem:
An e-commerce platform named "ShopSmart" has been experiencing a decline in
customer retention over the past year. The management is concerned about losing
valuable customers and wants to implement strategies to reduce churn.
Data Science Problem Scenario:
Objective: Predict potential customer churn and identify factors influencing customer
retention.
Background:
ShopSmart has collected data on customer transactions, browsing behavior, purchase
history, and customer feedback over the last two years.
Dataset Information:
The dataset for ShopSmart's customer churn prediction scenario encompasses a
variety of data sources related to customer interactions, transactions, and feedback,
including:
• Customer ID (anonymized)
• Transaction history
• Browsing behavior data
• Purchase history
• Customer feedback and reviews
• Customer demographics (if available)
• Customer loyalty program participation
• Marketing campaign interactions
Problem Solving Questions:
• Can we build a predictive model to identify customers at risk of churning based
on their historical behavior and interactions?
• What are the key features or factors that contribute to customer churn in the
e-commerce platform?
• How can we leverage this predictive model to proactively engage with at-risk
customers and improve retention rates?
Expected Outcomes:
Implementing targeted marketing strategies and personalized incentives for customers
identified as high-risk for churn.
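As a starting point for the first question in Scenario 1, the sketch below derives recency, frequency, and spend features from a transaction log and attaches a churn flag. The sample rows, the function name `build_features`, and the 90-day inactivity threshold are all illustrative assumptions; the assignment does not define churn, so that definition has to be agreed with the business first.

```python
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("C1", date(2023, 1, 5), 40.0),
    ("C1", date(2023, 8, 20), 25.0),
    ("C2", date(2023, 2, 11), 60.0),
    ("C3", date(2023, 9, 1), 15.0),
    ("C3", date(2023, 9, 15), 30.0),
]


def build_features(rows, as_of, churn_days=90):
    """Return per-customer recency, frequency, spend, and an assumed churn flag."""
    features = {}
    for customer_id, purchased_on, amount in rows:
        entry = features.setdefault(
            customer_id, {"last": purchased_on, "frequency": 0, "spend": 0.0}
        )
        entry["last"] = max(entry["last"], purchased_on)
        entry["frequency"] += 1
        entry["spend"] += amount
    for entry in features.values():
        # Days since the most recent purchase, measured at the as_of date.
        entry["recency_days"] = (as_of - entry.pop("last")).days
        # Assumption: inactivity longer than churn_days counts as churn.
        entry["churned"] = entry["recency_days"] > churn_days
    return features


features = build_features(transactions, as_of=date(2023, 9, 30))
for customer_id, entry in sorted(features.items()):
    print(customer_id, entry)
```

In this framing, `churned` plays the role of the dependent variable, while recency, frequency, and spend are candidate independent variables for a classifier.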
Scenario 2: Healthcare Resource Optimization for a Hospital
Business Problem:
A large hospital, "HealthCare Haven," is looking to optimize its resource allocation to
enhance patient care and operational efficiency.
Data Science Problem Scenario:
Objective: Optimize resource allocation in the hospital for better patient care and
operational efficiency.
Background:
HealthCare Haven has a vast dataset containing information on patient admissions,
discharge times, medical staff schedules, equipment usage, and historical patient
outcomes.
Dataset Information:
The dataset for HealthCare Haven's resource optimization includes comprehensive data
on various aspects of hospital operations, encompassing:
• Patient admissions data
• Discharge times
• Medical staff schedules
• Equipment usage and availability
• Historical patient outcomes
• Emergency room wait times
• Specialized department utilization metrics
• Patient demographics and medical history (anonymized)
• Staff performance metrics
Problem Solving Questions:
• Can we develop a predictive model to forecast patient admission rates for
different departments?
• How can we optimize medical staff schedules based on predicted patient
admission patterns?
• Is there an opportunity to streamline the usage of medical equipment to minimize
downtime and improve resource utilization?
Expected Outcomes:
Improved patient care, reduced waiting times, and enhanced operational efficiency
through optimized resource allocation based on data-driven insights.
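One simple baseline for the admission-forecasting question in Scenario 2 is a moving average per department. The sketch below is only that baseline, not a recommended final model; the counts, the function name `moving_average_forecast`, and the window of three days are made up for illustration.

```python
from collections import defaultdict

# Hypothetical daily admission counts: (department, admissions), oldest first.
history = [
    ("cardiology", 12), ("cardiology", 15), ("cardiology", 9), ("cardiology", 12),
    ("emergency", 40), ("emergency", 44), ("emergency", 48), ("emergency", 52),
]


def moving_average_forecast(rows, window=3):
    """Forecast the next day's admissions as the mean of the last `window` days."""
    by_department = defaultdict(list)
    for department, admissions in rows:
        by_department[department].append(admissions)
    forecast = {}
    for department, counts in by_department.items():
        recent = counts[-window:]  # uses fewer days if history is shorter
        forecast[department] = sum(recent) / len(recent)
    return forecast


forecast = moving_average_forecast(history)
print(forecast)
```

A baseline like this gives a reference error level that any more elaborate model (for example one using seasonality or patient demographics) would need to beat before it informs staff scheduling.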
Scenario 3: Social Media Campaign Effectiveness Analysis
Business Problem:
A digital marketing agency, "SocialBoost," is running social media campaigns for
various clients. The agency wants to assess the effectiveness of these campaigns and
optimize future strategies.
Data Science Problem Scenario:
Objective: Analyze the effectiveness of social media campaigns and optimize future
strategies.
Background:
SocialBoost has data on campaign impressions, click-through rates, conversion rates,
and customer engagement metrics from various social media platforms for the past two
years.
Dataset Information:
The dataset comprises detailed information on SocialBoost's social media campaigns,
including:
• Campaign ID
• Impressions
• Click-through rates
• Conversion rates
• Customer engagement metrics
• Demographic information of the audience
Problem Solving Questions:
• Can we identify key performance indicators (KPIs) that correlate with successful
campaign outcomes?
• How can we segment the audience to tailor campaigns for different
demographics?
• Are there specific times or days when campaigns are more effective in terms of
engagement and conversions?
Expected Outcomes:
Data-driven insights to optimize future social media campaigns, improve targeting, and
maximize ROI for clients.
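The timing question in Scenario 3 can be explored by aggregating results by day of the week. The sketch below assumes a hypothetical log of clicks and conversions per campaign per day, and defines conversion rate as conversions divided by clicks; the listed dataset only mentions rates, so both the raw counts and that definition are assumptions.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical campaign log: (campaign_id, day, clicks, conversions).
campaign_log = [
    ("A", date(2023, 10, 2), 200, 10),   # Monday
    ("A", date(2023, 10, 7), 100, 12),   # Saturday
    ("B", date(2023, 10, 9), 300, 15),   # Monday
    ("B", date(2023, 10, 14), 100, 8),   # Saturday
]


def conversion_rate_by_weekday(rows):
    """Pool clicks and conversions per weekday, then divide once per weekday."""
    totals = defaultdict(lambda: [0, 0])  # weekday -> [clicks, conversions]
    for _campaign_id, day, clicks, conversions in rows:
        name = WEEKDAYS[day.weekday()]
        totals[name][0] += clicks
        totals[name][1] += conversions
    return {name: conv / clicks for name, (clicks, conv) in totals.items() if clicks}


rates = conversion_rate_by_weekday(campaign_log)
print(rates)
```

Pooling the counts before dividing avoids averaging rates from campaigns of very different sizes, which would overweight small campaigns.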
Based on the above business objectives and problem-solving questions, follow
the data science problem-solving cycle to formulate a proposed solution with
respect to the following parameters:
a. Understand the problem and classify as parametric or statistics
b. Design the questions from business objectives
c. Verify the questions from data and objectives
d. Is the current data appropriate for analysis?
e. Classify the sampling techniques where applicable
f. If the desired data is not available for any of the objectives mentioned above,
which data collection technique is most appropriate, and why?
g. Identify the potential variables (Independent & Dependent) for each business
objective
h. If the experiment is executed without collecting the missing data needed for a
specific business objective, which type of error may occur?
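
For part (e), stratified sampling is one technique that may apply when a scenario has clearly separated groups, such as loyalty-program members versus other customers. The sketch below is a minimal illustration with proportional allocation; the customer records, the segment labels, and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction from every stratum (at least one record each)."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))  # sampling without replacement
    return sample


# Hypothetical population: 20 loyalty members and 80 regular customers.
customers = [
    {"id": i, "segment": "loyalty" if i % 5 == 0 else "regular"}
    for i in range(100)
]
sample = stratified_sample(customers, key=lambda c: c["segment"], fraction=0.1)
print(len(sample), sum(c["segment"] == "loyalty" for c in sample))
```

Because each stratum contributes in proportion to its size, the sample keeps the 20/80 split of the population, which a small simple random sample would not guarantee.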