Ankita_Kesari_Resume
Technical Skills
Big Data Technologies, PySpark, SparkSQL, Airflow, Hadoop, Hive, HDFS, MapReduce, Oozie, Python, AWS, EMR, Glue,
PL/SQL, ETL, SQL, NoSQL Databases, RDBMS, Data Analysis, Data Modeling, Data Warehousing, Data Flow Development,
TDD, CI/CD, Power BI, Tableau, Shell Scripting, Git
Experience
S&P Global Nov 2022 – Present
Data Engineer 2
• Key contributor to migrating Hadoop infrastructure to AWS using Airflow, Amazon S3, Athena, Glue, and EMR
• Designed architecture for safeguarding PII data across cloud services, ensuring compliance with industry standards.
• Automated data ingestion into 15+ Data Lake catalogs using Python, improving data discovery and access.
• Developed Python scripts to maintain consistency across 250+ Tableau dashboards and reports via the Tableau
Metadata API.
S&P Global Jul 2021 – Nov 2022
Data Engineer
• Contributed to the development of end-to-end solutions, overseeing and optimizing a portfolio of 30+ data pipelines
and workflows.
• Took ownership of data discrepancies for clients such as Carfax and Worldview and performed root cause analysis.
Education
Maharaja Agrasen Institute of Technology, GGSIPU 2017-2021
Bachelor of Computer Science Engineering (CGPA: 8.2) Delhi
Projects
Hadoop Migration | AWS, Glue, Airflow, Lambda, EMR, Athena, PySpark
• Played a key role in migrating on-prem Hadoop workflows to AWS.
• Migrated and re-engineered legacy Hadoop jobs to be compatible with AWS services, ensuring minimal disruption.
• Automated workflow orchestration with Apache Airflow, enabling seamless coordination of data tasks.
• Optimized ETL workflows using Amazon EMR, reducing data processing time by 30%.
Global Auto Demand Tracker | AWS, Python, PL/SQL Packages, Control-M, Power BI, SQL, Unix Scripting
• Collaborated with the Modeling team to create a data extraction package for new vehicle monthly data.
• Developed Python scripts that generate a flat file, convert it to Parquet format, and push it to AWS S3 buckets.
• Designed Power BI reports on the dataset for embedding into the front-end platform, making them accessible to users.
BD3 (Business Data Driven Decisions) | AWS, Glue, ETL, Lambda, Python, Athena, Power BI
• Designed and implemented data pipelines using AWS services such as S3, Glue, and Lambda
• Created and defined business-critical metrics and KPIs that align with business objectives
• Partnered with engineering, product, business, and finance teams to align on data requirements and deliver insights