Data Engineer Resume v1.1
Summary
Resourceful Data Engineer with 4+ years of experience and a diverse skill set spanning programming languages such as Python, SQL, and Java. Proficient in data tools and technologies including Snowflake, Airflow, Spark, Kafka, and AWS. Demonstrated expertise in data warehousing, ETL processing, data wrangling, and advanced statistical analysis.
Skills
Programming Languages: Python, SQL, Java, Bash/Shell script
ETL & BI Tools: Snowflake, Airflow, dbt, Informatica, Tableau, Power BI, Excel
Frameworks: Spark (Databricks), Hadoop, Kafka, pandas, NumPy, PyTorch, LLMs, RAG
Databases: MSSQL, Postgres, Cassandra, MongoDB, Redis, Firebase
Cloud: AWS (S3, Glue, Redshift), Azure (Data Factory, Synapse Analytics), GCP (BigQuery, Looker, Pub/Sub)
CI/CD: Docker, Kubernetes, Terraform, Jenkins
Certifications: AWS Certified Solutions Architect, Apache Airflow Certified
Experience
Software Engineer – Data, Ulytics May 2022 – Present
Responsibilities:
● Designed and developed Spark ETL pipelines to consume batch data from cloud-based sources such as Dynamics CRM, applied business logic to transform data (Parquet/text files), loaded it into Delta tables, and enabled self-service reporting and advanced analytics
● Developed Python scripts using the pandas library to automate metadata setup in the report log database, saving the data science and analytics teams over 100 hours while meeting evolving business requirements
● Reduced ETL job failures by 50% by identifying and resolving SQL query mismatches and schema errors in SSIS and optimizing query execution times, significantly accelerating data-driven analysis
● Implemented a medallion architecture to logically organize data in the data lake/lakehouse, incrementally and progressively improving the structure and quality of data as it flows through each layer
● Architected a diagnostic framework to detect extreme changes in data, reducing mean time to detect anomalies by 35% and ensuring data integrity and reliability
● Optimized a legacy Spark pipeline, enhancing performance through tuning techniques such as mitigating skewed joins, using broadcast joins, repartitioning, and coalescing partitions
● Orchestrated full daily ETL loads across multiple teams using Airflow with the Celery Executor, enhancing data integration by dynamically fetching configurations from our internal service registry
● Established secure access to S3 buckets by creating roles and policies across AWS accounts with the AWS SDK (Boto3), reducing service costs by 15% through optimal resource allocation for privileged document access
● Established CI/CD pipeline, integrating Terraform and AWS, for efficient deployment of ML project artifacts to Databricks,
significantly accelerating iteration cycles
● Enhanced project outcomes through meticulous A/B testing and hypothesis validation using SQL and statistical
frameworks, leading to data-driven recommendations that improved project performance by an average of 20%
● Streamlined the Linux server upgrade effort by encapsulating legacy applications in Docker containers and automating their configuration with Bash scripting
● Produced insights pivotal to shaping corporate strategy by creating predictive models, interactive data visualizations, and dashboards in Power BI, translating complex data findings into actionable recommendations for stakeholders
Projects
Distributed Data Stream Pipeline [Python, Kafka, Spark, Airflow, Cassandra] github.com/neveram/Data-Pipeline
● Developed an end-to-end ETL pipeline integrating real-time streaming from REST API endpoint
YOLO-Tennis [PyTorch, pandas, NumPy, OpenCV] github.com/neveram/YOLO-Tennis
● Developed a computer vision project to analyze tennis players' movements, speeds, and shots using object trackers across frames of a live tennis match
E-Shopify [Rails, Hotwire, Tailwind CSS, Stripe, PostgreSQL] github.com/neveram/E-Shopify
● Developed a full-stack web application for an online store with a dynamic UI and robust payment processing
Education
San Jose State University – MS in Software Engineering 2023
Vardhaman College of Engineering – B.Tech in Computer Science 2019