Professional Documents
Culture Documents
MarketLytics_DA
MarketLytics_DA
Get a list of the last 1 Million repos from Github, load data into a dataset, and run
an exploratory data analysis
source: List public repositories.
Highlight any differences and trends in the dataset by user attributes, programming
languages, and other attributes.
Note: you need to fetch at least 600,000 repos for us to consider this adequately
completed.
Deliverable:
1. Share the codebase in a notebook or python file, the dataset, any assets
created, and instructions to run it.
2. Would be assessed on:
a. Quality, testing, and reliability of the data pipeline
Bonus: Loading data into a cloud data warehouse (GCP, Azure,
AWS, etc.)
b. Visualization of results
Focus on identifying and analyzing distinct audiences in the
dataset (you can use descriptive or predictive approaches).
Objective 2:
Perform descriptive analysis on the following dataset using SQL. Please attach the
code and the output with the analysis in your response.
Dataset: https://docs.google.com/spreadsheets/d/1cvmeIwsYtIjO7plae-
VNXrpQoQEPywKd/edit?
usp=sharing&ouid=100816496332535760387&rtpof=true&sd=true
Deliverable:
Share the codebase, any assets created, and instructions to run it.
We are checking:
Quality of code and analysis
Bonus: Use a cloud platform (GCP, Azure, or AWS)
Statistical techniques used for descriptive/ predictive analysis
Data Cleaning and Preparation
Findings
Visualizations
Actionable Insights
Formulating Hypothesis
Objective 3
Current process:
A vendor contacts the marketplace to be listed. After an initial call, the marketplace
sends people to the vendor’s warehouse/shop to check if the business is legitimate.
They match the stock and SKUs promised and give a KYC (know your customer)
form to fill out. After these formalities, the vendor provides an SKU catalog (images
and price list). Furthermore, both agreements must be signed, and a vendor must
be onboarded.
Requirement:
We need to automate the process and have a system in place so there are minimal
manual steps; please assume the vendors are computer-literate.
Please take assumptions and provide anything from the list below:
1 pager on the new solution
System Architecture Diagram
Wireframes (for ex: Balsamiq/ draw.io) - partial with explicit assumptions is
also acceptable
ERD (for ex: erdplus.com) - partial with explicit assumptions is also
acceptable
We are checking:
If you can limit your imagination and focus on key areas to come up with a
complete, practical, and industry-standard solution that meets the vision of the
client, and If you possess a technical understanding of how systems work, do you
keep empathy towards people who will use your solutions and all the stakeholders
you have to onboard with your solutions.