Welcome to Scribd!

Proposal For Coding Challenge

Uploaded by

0% found this document useful (0 votes)

40 views3 pages

The data engineer is tasked with creating a proof of concept (PoC) for migrating historic data from CSV files to a new SQL database. The PoC must: 1) migrate CSV data to the database, 2) create an API to ingest new data in batches while adhering to data rules, 3) backup each table to AVRO files, and 4) restore a table from backup. The code will be published to GitHub and evaluated based on updates showing development progress. Python, Java, Go or Scala can be used.

Original Description:

Original Title

Proposal for coding challenge

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

0% found this document useful (0 votes)

40 views3 pages

Proposal For Coding Challenge

Uploaded by

Kimball

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

Jump to Page

You are on page 1of 3

Search inside document

Challenge #1

You are a data engineer at Globant and you are about to start an important project. This project
is big data migration to a new database system. You need to create a PoC to solve the next
requirements:

1. Move historic data from files in CSV format to the new database.
2. Create a Rest API service to receive new data. This service must have:
2.1. Each new transaction must fit the data dictionary rules.
2.2. Be able to insert batch transactions (1 up to 1000 rows) with one request.
2.3. Receive the data for each table in the same service.
2.4. Keep in mind the data rules for each table.
3. Create a feature to backup for each table and save it in the file system in AVRO format.
4. Create a feature to restore a certain table with its backup.

You need to publish your code in GitHub. It will be taken into account if frequent updates are
made to the repository that allow analyzing the development process.

Clarifications
● You decide the origin where the CSV files are located.
● You decide the destination database type, but it must be a SQL database.
● The CSV file is comma separated.
● "Feature" must be interpreted as "Rest API, Stored Procedure, Database functionality,
Cron job, or any other way to accomplish the requirements".

Not mandatory, but taken into account:

● Create a markdown file for the Readme.md
● Security considerations for your API service
● Use the Git workflow to create versions
● Create a Dockerfile to deploy the package
● Use cloud tools instead of local tools

You can use Python, Java, Go or Scala to solve it!

Data Rules
● Transactions that don't accomplish the rules must not be inserted but they must be
logged.
● All the fields are required.

Csv files structures

hired_employees.csv:
id INTEGER Id of the employee

name STRING Name and surname of the

employee

datetime STRING Hire datetime in ISO format

department_id INTEGER Id of the department which

the employee was hired for

job_id INTEGER Id of the job which the

employee was hired for

4535,Marcelo Gonzalez,2021-07-27T16:02:08Z,1,2
4572,Lidia Mendez,2021-07-27T19:04:09Z,1,2
Link to the file

departments.csv

id INTEGER Id of the department

department STRING Name of the department

1, Supply Chain
2, Maintenance
3, Staff
Link to the file

jobs.csv:

id INTEGER Id of the job

job STRING Name of the job

1, Recruiter
2, Manager
3, Analyst
Link to the file
Challenge #2
You need to explore the data that was inserted in the first challenge. The stakeholders ask for
some specific metrics they need. You should create an end-point for each requirement.

Requirements

● Number of employees hired for each job and department in 2021 divided by quarter. The
table must be ordered alphabetically by department and job.

Output example:

department job Q1 Q2 Q3 Q4

Staff Recruiter 3 0 7 11

Staff Manager 2 1 0 2

Supply Chain Manager 0 1 3 0

● List of ids, name and number of employees hired of each department that hired more
employees than the mean of employees hired in 2021 for all the departments, ordered
by the number of employees hired (descending).

Output example:

id department hired

Staff Staff 45

Staff Supply Chain 12

Not mandatory, but taken into account:

● Create a visual report for each requirement using your favorite tool

M5L2 Homwork
Document11 pages
M5L2 Homwork
Richard Liu
No ratings yet
PowerEdge - How To Recover From A Corrupt BIOS or Interrupted BIOS Update - Dell UK
Document8 pages
PowerEdge - How To Recover From A Corrupt BIOS or Interrupted BIOS Update - Dell UK
sts100
No ratings yet
Exercise 2 - Project Schedule and Resource Planning
Document2 pages
Exercise 2 - Project Schedule and Resource Planning
Rouda Aldosari
No ratings yet
Logical Data Modeling Guide
Document13 pages
Logical Data Modeling Guide
Boyapally Ravikanth Reddy
No ratings yet
Counteract 6 3 4 0 Installation Guide
Document82 pages
Counteract 6 3 4 0 Installation Guide
chaosmagnet
No ratings yet
Employee Management System: CS8582-Object Oriented Analysis and Design Lab
Document19 pages
Employee Management System: CS8582-Object Oriented Analysis and Design Lab
GANESHKARTHIKEYAN V
No ratings yet
Project Milestone Requirements: Behavioral/functional Non-Behavioral/extra-Functional
Document2 pages
Project Milestone Requirements: Behavioral/functional Non-Behavioral/extra-Functional
Javed Ahamed
No ratings yet
Assignment Instructions
Document5 pages
Assignment Instructions
Obuya Ochieng
No ratings yet
Problem Definition: Problem Definition: 1.: Group 1-Di200109-G05 Eproject Semester 2, Year 1-Accp I7.1 I
Document3 pages
Problem Definition: Problem Definition: 1.: Group 1-Di200109-G05 Eproject Semester 2, Year 1-Accp I7.1 I
Ngoc Buu
No ratings yet
Creating A Bill of Material Row Number in Plant 3D ReportsProcess Design, From The Outside - Process Design, From The Outside
Document5 pages
Creating A Bill of Material Row Number in Plant 3D ReportsProcess Design, From The Outside - Process Design, From The Outside
Tiago Ferreira
No ratings yet
BPD-PS-BPML-01 - Masters
Document13 pages
BPD-PS-BPML-01 - Masters
Sudhir Verma
No ratings yet
Nguyen Dao - Analytics Engineer Resume
Document3 pages
Nguyen Dao - Analytics Engineer Resume
Nguyen Dao
No ratings yet
Aneesh Palla SQL Resume
Document4 pages
Aneesh Palla SQL Resume
svinfotech
No ratings yet
DataStage Technical Design and Construction Procedures
Document93 pages
DataStage Technical Design and Construction Procedures
Nithin Aramkuni
No ratings yet
Lab 6 and 7 ITSE
Document8 pages
Lab 6 and 7 ITSE
ghazi members
No ratings yet
Naukri SheetalGoyal (8y 6m)
Document5 pages
Naukri SheetalGoyal (8y 6m)
Smriti Pal
No ratings yet
Lecture 10: Database Lisa (Ling) Liu: C# Programming in Depth
Document34 pages
Lecture 10: Database Lisa (Ling) Liu: C# Programming in Depth
Anurag SuperTramp Sharma
No ratings yet
Level 4
Document10 pages
Level 4
Esubalew
No ratings yet
A Project Report On Employee Tracking System
Document10 pages
A Project Report On Employee Tracking System
ss1sumitsatish
No ratings yet
Title: WUCT Human Resource Management Produced By: Israa Ata, Huda Osho & Raneen Supervisor: Mr. Bassam Rajabi Course: Software Engineering
Document49 pages
Title: WUCT Human Resource Management Produced By: Israa Ata, Huda Osho & Raneen Supervisor: Mr. Bassam Rajabi Course: Software Engineering
Bassam Rajabi
No ratings yet
Assignment #2: Entity - Relationship Diagram and Implementation in Access (DUE: Thursday, Feb 17, 2005)
Document3 pages
Assignment #2: Entity - Relationship Diagram and Implementation in Access (DUE: Thursday, Feb 17, 2005)
warda hassan
No ratings yet
Maintenance Program Development For New Assets Services
Document8 pages
Maintenance Program Development For New Assets Services
Elvis Diaz
No ratings yet
Australia PR Docs Information
Document7 pages
Australia PR Docs Information
rsreddy434
No ratings yet
QP SSC Q2212 Domestic Data Entry Operator
Document25 pages
QP SSC Q2212 Domestic Data Entry Operator
Abhishek Das
No ratings yet
4COSC003W.Y Computer Science Practice
Document5 pages
4COSC003W.Y Computer Science Practice
Tharusha Nethmini
No ratings yet
HH
Document48 pages
HH
bipin012
No ratings yet
CPAS 211L - Assignment 1
Document2 pages
CPAS 211L - Assignment 1
mirror
No ratings yet
SQL Guid Vs Int
Document19 pages
SQL Guid Vs Int
Francisco Carabez
No ratings yet
IE CoreE InvoiceMicroServiceDesign 270423 0628
Document20 pages
IE CoreE InvoiceMicroServiceDesign 270423 0628
ali.taruf
No ratings yet
Resume Template For Teradata
Document7 pages
Resume Template For Teradata
satheesh_240
No ratings yet
Vimal Raja V Resume
Document6 pages
Vimal Raja V Resume
raaji
No ratings yet
Atchut Talend
Document4 pages
Atchut Talend
Pranay Jain
No ratings yet
1 Overview (Bi)
Document12 pages
1 Overview (Bi)
narreddykarthik
No ratings yet
Sam
Document3 pages
Sam
Shubham Deokate
No ratings yet
Profile: Nguyễn Thị Hồng Thủy
Document4 pages
Profile: Nguyễn Thị Hồng Thủy
Tôi Là
No ratings yet
AshutoshLaxmanGaikwad (6y 0m)
Document6 pages
AshutoshLaxmanGaikwad (6y 0m)
sajema4401
No ratings yet
Project Architecture
Document7 pages
Project Architecture
Gondrala Balaji
100% (1)
SQL Guid Vs Int
Document19 pages
SQL Guid Vs Int
Francisco Carabez
No ratings yet
IT Short Form Project Document: 001HR - BPC - METADATA - EXTRACT - 04012015
Document8 pages
IT Short Form Project Document: 001HR - BPC - METADATA - EXTRACT - 04012015
anu
No ratings yet
Project PaySlip 2016 Vishal.d
Document30 pages
Project PaySlip 2016 Vishal.d
Vishal Krishna
No ratings yet
Sudesh Php-Project-Report
Document35 pages
Sudesh Php-Project-Report
atul gaikwad
No ratings yet
Sandeep Talend
Document3 pages
Sandeep Talend
ebaad chowdhry
No ratings yet
Dao Ngoc Bach Nguyen
Document3 pages
Dao Ngoc Bach Nguyen
Nguyen Dao
No ratings yet
Manoranjanbhola Application Production Support MPHASIS
Document3 pages
Manoranjanbhola Application Production Support MPHASIS
Soniya chaudhary
No ratings yet
(8 Marks) : Activity Description Optimistic Likely Pessimistic Predecessors
Document6 pages
(8 Marks) : Activity Description Optimistic Likely Pessimistic Predecessors
pavan
No ratings yet
Software Company
Document14 pages
Software Company
Hema Sundar
No ratings yet
Rahul Neekhra: Summary of Qualifications
Document4 pages
Rahul Neekhra: Summary of Qualifications
Rahul Neekkhra
No ratings yet
ERD1 - Group Consultants Group Consultants Is A Multinational IT Firm. The Company Employs
Document1 page
ERD1 - Group Consultants Group Consultants Is A Multinational IT Firm. The Company Employs
San Ese
No ratings yet
Using DAO (Data Access Objects) Code - Visual Basic 6 (VB6
Document72 pages
Using DAO (Data Access Objects) Code - Visual Basic 6 (VB6
Avijit Jaanaa
33% (3)
Bhaskar ADE - Altimetrik
Document3 pages
Bhaskar ADE - Altimetrik
raaman
No ratings yet
2022-10-21 Assignemts Normalization Implementation Solutions
Document10 pages
2022-10-21 Assignemts Normalization Implementation Solutions
Void Seeker
No ratings yet
Software Project Management: Online It Administratio N
Document26 pages
Software Project Management: Online It Administratio N
Prafulla S Chavadi
No ratings yet
Naukri sudharshanReddyG (8y 3m)
Document5 pages
Naukri sudharshanReddyG (8y 3m)
Arup Naskar
No ratings yet
Solaris SCSAI
Document24 pages
Solaris SCSAI
srinivas
No ratings yet
G.Eswar Reddy Mobile:-91+9642182661 Mail
Document4 pages
G.Eswar Reddy Mobile:-91+9642182661 Mail
SriReddy
No ratings yet
Krishna Reddy: 3+ Years of IT Experience in Which 2+ Years IT Experience in SAP BO EIM Tools SAP
Document4 pages
Krishna Reddy: 3+ Years of IT Experience in Which 2+ Years IT Experience in SAP BO EIM Tools SAP
SriReddy
No ratings yet
13 Er PDF
Document4 pages
13 Er PDF
Obaid Rehman
No ratings yet
I. Objective: Exercise #3 Creating Data Flow Diagrams Using Oracle Designer
Document4 pages
I. Objective: Exercise #3 Creating Data Flow Diagrams Using Oracle Designer
jtpml
No ratings yet
Understand WorkFlow in Detail
Document118 pages
Understand WorkFlow in Detail
Saquib Mahmood
No ratings yet
Phone: 9492294470 E-Mail:: K. Durga Srinivas Rao
Document6 pages
Phone: 9492294470 E-Mail:: K. Durga Srinivas Rao
Srinivas
No ratings yet
Naukri VinodChandrakantKhot (10y 0m)
Document5 pages
Naukri VinodChandrakantKhot (10y 0m)
Main Shayar Badnam
No ratings yet
Visual Basic 2010 Coding Briefs Data Access
From Everand
Visual Basic 2010 Coding Briefs Data Access
Kevin Hough
Rating: 5 out of 5 stars
5/5 (1)
02 Huawei Elab Service Product V1.0 Quick Reference Guide
Document2 pages
02 Huawei Elab Service Product V1.0 Quick Reference Guide
Kimball
No ratings yet
Alex Turner - Project B10
Document11 pages
Alex Turner - Project B10
Kimball
No ratings yet
ICPNA - Proyect Basic5
Document13 pages
ICPNA - Proyect Basic5
Kimball
No ratings yet
Three Popular Restaurants in Lima
Document13 pages
Three Popular Restaurants in Lima
Kimball
No ratings yet
Order of Adjectives
Document1 page
Order of Adjectives
Kimball
No ratings yet
Java Multiple Choice Qs.
Document65 pages
Java Multiple Choice Qs.
api-3757004
100% (3)
Eventlog
Document4 pages
Eventlog
Daniel Quiroga
No ratings yet
EasyUnicode5 ReadMe PDF
Document3 pages
EasyUnicode5 ReadMe PDF
michaelpequod
No ratings yet
NetXMS Presentation
Document11 pages
NetXMS Presentation
ಪ್ರತೀಕ್ಷಣ ಪ್ರೀತಿಯಿಂದ ಪ್ರಣವ
No ratings yet
Query Designer - Exclude Values, Pattern Matching - SCN
Document3 pages
Query Designer - Exclude Values, Pattern Matching - SCN
mig007
No ratings yet
Tips and Tricks Using The Preprocessor (Part Two)
Document9 pages
Tips and Tricks Using The Preprocessor (Part Two)
VindyachalTiwari
No ratings yet
!# (Oo$Z6Si) # Pubg Mobile Hack ##Newest Cheat - Free Uc Pubg Mobile Generator 2021 (IOS, ANDROID)
Document2 pages
!# (Oo$Z6Si) # Pubg Mobile Hack ##Newest Cheat - Free Uc Pubg Mobile Generator 2021 (IOS, ANDROID)
Maddy
No ratings yet
BT2 - Project 1 - 43MF750
Document4 pages
BT2 - Project 1 - 43MF750
Bichvan Ngo
No ratings yet
Abap HR Document PDF
Document83 pages
Abap HR Document PDF
David Pieschacon Rueda
No ratings yet
Core Security Introduction To Software Vulnerability Exploitation
Document74 pages
Core Security Introduction To Software Vulnerability Exploitation
mohdamer161616
No ratings yet
Lexical Analyser in C++ - ASHWATH KV - 106120017 For Full Code
Document4 pages
Lexical Analyser in C++ - ASHWATH KV - 106120017 For Full Code
lilly pushparani
No ratings yet
What Is Ramdisk - Ramdisk Demystified From The Perspective of Embedded System
Document2 pages
What Is Ramdisk - Ramdisk Demystified From The Perspective of Embedded System
chuiyewleong
No ratings yet
4 1 e A Softwaremodelingintroductionvideo
Document7 pages
4 1 e A Softwaremodelingintroductionvideo
api-291536844
No ratings yet
GeoGebra 5.04 Brochure English PDF
Document2 pages
GeoGebra 5.04 Brochure English PDF
Archana
No ratings yet
PP 2 Man
Document128 pages
PP 2 Man
Tuan Tran Thanh
No ratings yet
???? ???????? ???????
Document3 pages
???? ???????? ???????
Maharani Rani Fitria
No ratings yet
TWPP WebRTC Plugin Website
Document4 pages
TWPP WebRTC Plugin Website
Jefferson Jhon Delgado Vasquez
No ratings yet
Industry 4.0 Revolution PowerPoint Templates
Document48 pages
Industry 4.0 Revolution PowerPoint Templates
chymaula
No ratings yet
FAQs
Document54 pages
FAQs
Gabriel Aguilar
No ratings yet
AWS Toolkit For VS Code: User Guide
Document34 pages
AWS Toolkit For VS Code: User Guide
shadman
No ratings yet
How To Unprotect Excel Sheet With Password
Document11 pages
How To Unprotect Excel Sheet With Password
Sabah Salih
No ratings yet
Manual MindSphere GettingStarted en - PDF PDF
Document103 pages
Manual MindSphere GettingStarted en - PDF PDF
fasp
No ratings yet
JVC Music Play Application Troubleshooting
Document2 pages
JVC Music Play Application Troubleshooting
04147074974 Torres
No ratings yet
Office 2016 Activation
Document2 pages
Office 2016 Activation
RUSLI
No ratings yet
XDuooX3 - Main - Wiki PDF
Document1 page
XDuooX3 - Main - Wiki PDF
Mita Febriana51418406
No ratings yet
Log
Document111 pages
Log
Earlene Alva
No ratings yet
Activar Office 365
Document1 page
Activar Office 365
BrendaPaolaMartinezVasquez
0% (1)
Cad 12
Document6 pages
Cad 12
Satria Budi Utama
No ratings yet