Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 2

The Client

XYZ is a private equity firm in US. Due to remarkable growth in the Cab Industry in last few years and multiple key players
in the market, it is planning for an investment in Cab industry.
The Engagement
You have been provided with multiple data sets that contains information on 2 cab companies. Each file (data set)
provided represents different aspects of the customer profile. XYZ is interested in using your actionable insights to help
them identify the right company to make their investment.
The outcome of your engagement will be a presentation to XYZ’s Executive team. This presentation will be judged
based on the visuals provided, the quality of your analysis and the value of your recommendations and insights.
The Data
You have been provided 4 individual data sets. Time period of data is from 31/01/2016 to 31/12/2018.
Below are the list of datasets which are provided for the analysis:
Cab_Data.csv – this file includes details of transaction for 2 cab companies
Customer_ID.csv – this is a mapping table that contains a unique identifier which links the customer’s demographic details
Transaction_ID.csv – this is a mapping table that contains transaction to customer mapping and payment mode
City.csv – this file contains list of US cities, their population and number of cab users

You should fully investigate and understand each data set.


 Review the Source Documentation  Determine which files should be joined versus
which ones should be appended
 Understand the field names and data types
 Aggregate the data into smaller files and define
 Identify relationships across the files
relationships
 Field/feature transformations
 Identify and remove duplicates
 Identify any biasness existing in the data
Be Creative
The idea is to create a hypothesis, engage with the data, think critically, and use various analytical approaches to produce
unique insights.
You are not limited to only utilizing the data you have been provided.
We encourage you to find third party data sets which correspond to the overall theme and geographical properties of the
data provided. For Example: you can leverage US weather data
Also, do research on overall cab industry in US and try to relate that with the trend in data
Analysis
Create multiple hypothesis and investigate:

Prepare a PowerBI dashboard which answers some of the following questions but not restricted to them alone. Add
appropriate filters to make the dashboard dynamic.

1. Evolution of profit margin over the given time period. Create a view to be able to drill down from Yearly to Quarterly to
Monthly.

2. i) Customer concentration - show the revenue of the Top 20 customers and their contribution to the overall revenue of the
cab company.

ii) Create a drill through view of the customer demographics upon selecting any particular customer in this view.
3. Create views to show the Revenue and Margin distribution of customers by Gender, Age and Income brackets. Would be
great to have this view dynamic upon user selection (example - a dropdown with Revenue/Margin selector. If user selects
"Revenue", the chart here shows a revenue distribution. If "Margin" is selected, the chart shows a Margin distribution).

4. Which is a more popular payment mode in each of the companies.

5. Create a geographic map view to show the Revenue and Margin distribution across different cities in which both cab
companies operate.

Generate 5-7 hypothesis initially to investigate as some will not prove what you are expecting.
For Example: “Is there any seasonality in number of customers using the cab service?”
Areas to investigate:
 Which company has maximum cab users at a particular time period?
 Does margin proportionally increase with increase in number of customers?
 What are the attributes of these customer segments?

Although not required, we encourage you to document the process and findings
 What is the business problem?
 What are the properties of the data provided?
 What steps did you take in order to create an applicable data set?
 What type of analysis is applicable to your hypothesis?
 How did you prepare and perform your analysis?
 What type of analysis did you perform?
 Why did you choose to use certain analytical techniques over others?
 What were the results?
 How do the results translate to support your hypothesis?

Remember, there are no wrong answers as long as the data supports


them.

You might also like