Objective 1:

Get a list of the last 1 million repositories from GitHub, load the data into a
dataset, and run an exploratory data analysis.
Source: GitHub REST API, "List public repositories" endpoint.
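
As a starting point, the fetch step could be sketched as below. This is a minimal sketch, not a prescribed implementation: it assumes the "List public repositories" endpoint at https://api.github.com/repositories, which pages by a `since` repository ID, and the helper names `fetch_page` and `iter_repos` are hypothetical. Rate-limit handling, retries, and persistence are left out and would be needed for a real run.

```python
import json
import urllib.request
from typing import Callable, Iterator, List, Optional

API_URL = "https://api.github.com/repositories"


def fetch_page(since: int, token: Optional[str] = None) -> List[dict]:
    """Fetch one page of public repositories with an ID greater than `since`."""
    request = urllib.request.Request(
        f"{API_URL}?since={since}",
        headers={"Accept": "application/vnd.github+json"},
    )
    if token:
        # A personal access token raises the rate limit (assumed to be supplied by the caller).
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def iter_repos(
    start_id: int,
    limit: int,
    fetch: Callable[[int], List[dict]] = fetch_page,
) -> Iterator[dict]:
    """Yield up to `limit` repositories, paging forward from `start_id`."""
    since = start_id
    seen = 0
    while seen < limit:
        page = fetch(since)
        if not page:
            return  # no more repositories to list
        for repo in page:
            yield repo
            seen += 1
            if seen >= limit:
                return
        # The next page starts after the last repository ID seen.
        since = page[-1]["id"]
```

Passing `fetch` as a parameter keeps the paging logic testable without network access, which helps with the pipeline-reliability criterion.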

Highlight any differences and trends in the dataset by user attributes, programming
languages, and other attributes.
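
A first pass at such a breakdown might look like the sketch below. It assumes each fetched record carries an `owner` object with a `type` field and a boolean `fork` field; programming-language data may need a separate lookup per repository, so it is not covered here. The function name `summarize` is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def summarize(repos: Iterable[dict]) -> Dict[str, Counter]:
    """Count repositories by owner type and by fork status."""
    owner_types: Counter = Counter()
    forks: Counter = Counter()
    for repo in repos:
        owner = repo.get("owner") or {}
        # Records without an owner type are grouped under "unknown".
        owner_types[owner.get("type", "unknown")] += 1
        forks["fork" if repo.get("fork") else "source"] += 1
    return {"owner_type": owner_types, "fork": forks}
```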

Note: you need to fetch at least 600,000 repos for us to consider this adequately
completed.

Deliverable:
1. Share the codebase in a notebook or Python file, the dataset, any assets
created, and instructions to run it.
2. You will be assessed on:
a. Quality, testing, and reliability of the data pipeline
 Bonus: Loading data into a cloud data warehouse (GCP, Azure,
AWS, etc.)
b. Visualization of results
 Focus on identifying and analyzing distinct audiences in the
dataset (you can use descriptive or predictive approaches).

Objective 2:
Perform descriptive analysis on the following dataset using SQL. Please attach the
code and the output with the analysis in your response.

Dataset: https://docs.google.com/spreadsheets/d/1cvmeIwsYtIjO7plae-VNXrpQoQEPywKd/edit?usp=sharing&ouid=100816496332535760387&rtpof=true&sd=true
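
One way to keep the SQL reproducible is to load the sheet into a local database and run the queries from a script. The sketch below assumes the data has already been loaded into SQLite; the function `describe` and the table and column names passed to it are hypothetical, since the dataset's schema is not stated here. The identifiers are interpolated directly into the query, so they should come from trusted code rather than user input.

```python
import sqlite3
from typing import List, Tuple


def describe(
    conn: sqlite3.Connection, table: str, group_col: str, value_col: str
) -> List[Tuple]:
    """Return count, mean, min, and max of `value_col` for each `group_col` value."""
    query = (
        f'SELECT "{group_col}", COUNT(*), AVG("{value_col}"), '
        f'MIN("{value_col}"), MAX("{value_col}") '
        f'FROM "{table}" GROUP BY "{group_col}" ORDER BY "{group_col}"'
    )
    return conn.execute(query).fetchall()
```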

Deliverable:
Share the codebase, any assets created, and instructions to run it.
We are checking:
 Quality of code and analysis
  Bonus: Use a cloud platform (GCP, Azure, or AWS)
 Statistical techniques used for descriptive/predictive analysis
 Data Cleaning and Preparation
 Findings
 Visualizations
 Actionable Insights
 Formulating Hypothesis

Objective 3:

Consider an e-commerce marketplace similar to Daraz that has to onboard vendors.

Current process:
A vendor contacts the marketplace to be listed. After an initial call, the marketplace
sends people to the vendor's warehouse or shop to check that the business is legitimate.
They check the stock and SKUs against what was promised and give the vendor a KYC
(know your customer) form to fill out. After these formalities, the vendor provides
an SKU catalog (images and price list). Finally, the agreements are signed by both
parties, and the vendor is onboarded.

Requirement:
We need to automate this process and put a system in place so that manual steps are
minimal; please assume the vendors are computer-literate.
Please state your assumptions and provide any of the following:
 One-pager on the new solution
 System architecture diagram
 Wireframes (e.g., Balsamiq or draw.io); partial with explicit assumptions is
also acceptable
 ERD (e.g., erdplus.com); partial with explicit assumptions is also
acceptable
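
As one possible seed for the ERD, the current process can be read as an ordered set of onboarding stages attached to a vendor record. The sketch below is illustrative only: the entity names (`Vendor`, `Sku`), the stage names, and the rule that a catalog needs at least one SKU are all assumptions derived from the process description, not requirements stated in this brief.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Stage(Enum):
    """Onboarding stages, in the order of the current manual process."""
    APPLIED = 1
    VERIFIED = 2
    KYC_SUBMITTED = 3
    CATALOG_SUBMITTED = 4
    AGREEMENT_SIGNED = 5
    ONBOARDED = 6


@dataclass
class Sku:
    code: str
    price: float
    image_url: str


@dataclass
class Vendor:
    name: str
    stage: Stage = Stage.APPLIED
    catalog: List[Sku] = field(default_factory=list)

    def advance(self) -> Stage:
        """Move the vendor to the next stage, enforcing simple gate checks."""
        if self.stage is Stage.ONBOARDED:
            raise ValueError("vendor is already onboarded")
        # Assumed rule: the catalog stage requires at least one SKU.
        if self.stage is Stage.KYC_SUBMITTED and not self.catalog:
            raise ValueError("catalog must contain at least one SKU")
        self.stage = Stage(self.stage.value + 1)
        return self.stage
```

Modeling the stages explicitly makes it easy to see which steps still need a human (for example, the site visit behind `VERIFIED`) and which can be self-service.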

You can wow us by doing something else as well!


Essential Things to Remember:
Complete the solution without overcomplicating it. Remember the prototyping
principle: to make a car, first make a skateboard, cycle, motorcycle, quad bike, then
a car (here we are looking for a cycle with the vision of one day being a car).

We are checking:
Whether you can limit your imagination and focus on key areas to come up with a
complete, practical, and industry-standard solution that meets the client's vision,
and whether, alongside a technical understanding of how systems work, you keep
empathy for the people who will use your solutions and for all the stakeholders
you have to onboard.