
Summary

Experienced and detail-oriented Data Engineer with [X years] of expertise in designing, implementing,
and maintaining scalable data solutions. Proficient in ETL processes, data modeling, and database
management. Adept at collaborating with cross-functional teams to deliver high-quality and reliable
data pipelines.

Technical Skills
• Languages: Python, SQL
• Big Data Technologies: Apache Spark, Hadoop
• Data Warehousing: Snowflake, Amazon Redshift
• Database Technologies: MySQL, PostgreSQL, MongoDB
• ETL Tools: Apache NiFi, Talend
• Cloud Platforms: AWS, Azure, Google Cloud Platform
• Version Control: Git
• Workflow Management: Apache Airflow
• Data Modeling: ERD, Dimensional Modeling

PROFESSIONAL EXPERIENCE

Affinity Health Plan, Newark, CA


Senior Data Engineer
Responsibilities:
• Spearheaded the development of a real-time data processing pipeline using Apache Spark and Kafka, reducing data processing time by 30% (a minimal sketch follows this list).
• Designed and implemented a data lake architecture on AWS S3, optimizing data storage and retrieval for analytical purposes.
• Collaborated with data scientists and analysts to understand data requirements and design data models.
• Collaborated with cross-functional teams to gather and analyze data requirements for new projects.
• Implemented high-performance data processing pipelines with Apache Spark, Hadoop, and Kafka to ensure accurate and reliable data ingestion, transformation, and loading.
• Designed and developed custom analytics solutions using Python, PySpark, Databricks, and SQL, and created interactive dashboards with Tableau and Power BI.
• Implemented and managed version control for code and configuration files using Git.
• Implemented and optimized data pipelines for real-time and batch processing using Apache Spark and Apache NiFi.
• Ensured data quality and consistency by implementing data validation and cleansing processes.
• Led the development of a real-time analytics platform using Apache Spark, Kafka, and AWS S3.
• Implemented streaming data processing for live dashboards, providing actionable insights to stakeholders.
• Designed and optimized data schemas for efficient storage and retrieval.
• Led the integration of external data sources into the company's data ecosystem, ensuring data quality and consistency.
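
A minimal, illustrative PySpark Structured Streaming sketch of a real-time Kafka-to-S3 pipeline of the kind described in the first bullet above. The broker address, topic name, event schema, and S3 paths are hypothetical placeholders rather than values from the actual system, and the job assumes the spark-sql-kafka connector is available.

```python
# Illustrative sketch only: broker, topic, schema, and bucket names are
# hypothetical placeholders, not values from the original pipeline.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("realtime-events-pipeline").getOrCreate()

# Assumed schema for the incoming Kafka messages.
event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("payload", StringType()),
    StructField("event_time", TimestampType()),
])

# Read a stream from Kafka (bootstrap server and topic are placeholders).
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "events")
       .load())

# Parse the JSON value column into typed fields.
events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(from_json(col("json"), event_schema).alias("e"))
          .select("e.*"))

# Land the parsed stream in S3 as Parquet for downstream analytics.
query = (events.writeStream
         .format("parquet")
         .option("path", "s3a://example-data-lake/events/")
         .option("checkpointLocation", "s3a://example-data-lake/_checkpoints/events/")
         .start())
```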

Aetna Inc., Hartford, CT


Data Engineer
Responsibilities:
• Oversaw the migration of a legacy on-premises data warehouse to Snowflake on AWS, improving query performance by 40%.
• Designed and implemented complex ETL processes using Talend for ingesting, transforming, and loading large volumes of data.
• Collaborated with data scientists to deploy machine learning models into production, enabling real-time data predictions.
• Loaded structured and semi-structured data into Spark clusters using Spark SQL and the DataFrame API (a minimal sketch follows this list).
• Ensured data security and privacy compliance throughout the data lifecycle.
• Designed, developed, and maintained ETL processes for ingesting, transforming, and loading data into various data stores.
• Managed and optimized cloud-based data storage solutions such as AWS S3 and Snowflake.
• Monitored and troubleshot data pipeline issues, ensuring high availability and reliability.
• Designed and implemented ETL processes for seamless data migration, ensuring data consistency and integrity.
• Collaborated with business analysts to define data models and optimize queries for analytics.
• Developed a data integration solution for IoT devices using Apache NiFi and MongoDB.
• Implemented data cleansing and transformation processes to handle large volumes of sensor data.
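
A minimal sketch of loading structured and semi-structured data with Spark SQL and the DataFrame API, as referenced above. The file paths, column names, and join logic are hypothetical placeholders.

```python
# Illustrative sketch, not the original code: paths and columns are
# hypothetical examples of structured and semi-structured sources.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ingest-structured-semistructured").getOrCreate()

# Structured source: CSV with a header row, letting Spark infer column types.
claims = (spark.read
          .option("header", True)
          .option("inferSchema", True)
          .csv("s3a://example-bucket/raw/claims/"))

# Semi-structured source: newline-delimited JSON.
events = spark.read.json("s3a://example-bucket/raw/events/")

# Register both as temporary views so they can be joined with Spark SQL.
claims.createOrReplaceTempView("claims")
events.createOrReplaceTempView("events")

joined = spark.sql("""
    SELECT c.claim_id, c.amount, e.event_time
    FROM claims c
    JOIN events e ON c.claim_id = e.claim_id
""")

# Persist the joined result as Parquet for downstream consumers.
joined.write.mode("overwrite").parquet("s3a://example-bucket/curated/claims_events/")
```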

Capital One, McLean, VA Aug 2018 - Apr 2019


Data Engineer
Responsibilities:
• Built and maintained scalable databases on cloud-based infrastructure such as AWS, Azure, and GCP, and managed database platforms including MySQL, PostgreSQL, and MongoDB to ensure data availability, reliability, and security.
• Developed and maintained efficient ETL pipelines with Talend and Apache NiFi to ingest data from multiple sources and transform and load it into data warehouses or data lakes for further processing.
• Configured file formats such as Avro and Parquet for Hive querying and processing based on business logic.
• Utilized SequenceFiles, RCFiles, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement (a minimal sketch follows this list).
• Implemented Hive UDFs to encode business logic and performed extensive data validation using Hive.
• Automated data processing workflows with DevOps methodologies, including CI/CD pipelines.
• Kept up to date with emerging technologies and trends in the data engineering space, demonstrating a passion for continuous learning and professional development.
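
A minimal sketch of the Hive storage-layout techniques referenced above (Parquet storage, partitioning, and bucketing), written with PySpark's Hive support. The database, table, and column names are hypothetical placeholders, and the target database is assumed to exist.

```python
# Illustrative sketch of Parquet-backed, partitioned, and bucketed Hive
# storage; names are hypothetical, not from the original system.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-storage-layout")
         .enableHiveSupport()
         .getOrCreate())

df = spark.read.parquet("s3a://example-bucket/staging/transactions/")

# Partition by date to prune scans, bucket by account_id to speed up joins,
# and store as Parquet for efficient columnar reads.
(df.write
   .partitionBy("txn_date")
   .bucketBy(32, "account_id")
   .sortBy("account_id")
   .format("parquet")
   .mode("overwrite")
   .saveAsTable("analytics.transactions"))
```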

Allianz Life Insurance, US Dec 2016 - May 2018


Data Analyst
Responsibilities:
• Conducted data analysis, modeling, and testing, utilizing tools such as Tableau, Alteryx, and logistic regression to perform score mapping and visualize data (a minimal sketch follows this list).
• Proficient in SQL, Elasticsearch, and Python, leveraging query builders to extract, transform, and load data into containers and S3 buckets.
• Designed and executed micro-targeting and deck analysis for political and media clients, conducting message regression and candidate correlation analyses.
• Delivered diverse requests spanning Salesforce configuration, reporting functionality, SQL, Power BI, advanced Excel, data configuration, and storytelling.
• Conducted data analysis to assess the quality and meaning of data, and maintained databases and data systems to ensure data was organized in a readable format.
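
A minimal sketch of logistic-regression score mapping with scikit-learn, as referenced in the first bullet above. The input file, feature columns, and target column are hypothetical placeholders.

```python
# Minimal illustrative sketch of logistic-regression score mapping; the
# input file, feature columns, and target are hypothetical placeholders.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv("responses.csv")         # hypothetical survey extract
X = df[["age", "income", "engagement"]]   # assumed feature columns
y = df["responded"]                       # assumed binary target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Map predicted probabilities onto a 0-100 score for reporting.
scores = model.predict_proba(X_test)[:, 1] * 100
print(scores[:5])
```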
