Rizwan Shaikh

Data Engineer
I am a highly skilled Data Engineer with 3+ years of experience, specializing in data technologies including Snowflake, AWS Glue, PySpark, and Spark. My expertise lies in data transformation and in data ingestion using Snowpipe and SnowSQL, making me proficient in data sharing, analysis, and reporting. I have a strong background in ETL/ELT processes, data processing, Agile methodologies, and debugging, delivering robust solutions. My deep understanding of Snowflake's core features, including Time Travel, zero-copy cloning, Streams & Tasks, and star and snowflake schemas, allows me to design scalable ETL/ELT processes. My problem-solving skills, strong communication, and multitasking abilities enable effective performance tuning, data governance, and translation of business needs into technical solutions. Proficient in SQL, Python, SQL Server, Oracle, and data warehouse and data pipeline infrastructure, I ensure seamless data processing and data management in the Big Data landscape, with a working understanding of big data technologies such as Hadoop and Spark.

rizwankarimshaikh@gmail.com | 7020577398 | Pune

WORK EXPERIENCE

Data Engineer
Life Tree IT SOFT PRIVATE LIMITED
05/2021 - Present

Roles & Responsibilities:
- Designed and implemented end-to-end ETL processes using PySpark, AWS Glue, and Snowpipe for seamless data ingestion, transformation, and loading into the Snowflake data warehouse (a sketch of this ingestion pattern follows the list).
- Provided clean, transformed data to be consumed by analytics and data science teams.
- Conducted error handling, debugging, and troubleshooting to identify and resolve issues in data pipelines, maintaining data quality and integrity.
- Optimized data pipelines and queries, reducing query execution time and improving overall system performance.
- Performed tasks such as writing scripts, calling APIs, and writing SQL queries.
- Stayed updated with the latest industry trends and best practices, continuously improving data engineering processes and contributing to the organization's data-driven success.
- Collaborated with cross-functional teams to understand and address data requirements, translating business needs into technical solutions.
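As an illustration of the ingestion pattern above, here is a minimal sketch, in Python with the Snowflake connector, of defining a Snowpipe so that files landing in an S3-backed stage are loaded automatically. The stage, table, and connection names are hypothetical placeholders, not actual production objects.

```python
# Minimal sketch of Snowpipe setup; assumes an existing S3-backed stage
# @RAW_STAGE and target table RAW_EVENTS (both hypothetical names).
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",   # hypothetical account identifier
    user="etl_user",        # hypothetical service user; in practice the
    password="...",         # credentials come from a secrets manager
    warehouse="ETL_WH",
    database="ANALYTICS",
    schema="RAW",
)

with conn.cursor() as cur:
    # AUTO_INGEST = TRUE lets Snowflake load new stage files as they arrive
    # (via S3 event notifications) instead of relying on manual COPY runs.
    cur.execute("""
        CREATE PIPE IF NOT EXISTS RAW_EVENTS_PIPE AUTO_INGEST = TRUE AS
        COPY INTO RAW_EVENTS
        FROM @RAW_STAGE
        FILE_FORMAT = (TYPE = 'JSON')
    """)
conn.close()
```

Downstream transformations on the loaded data would then run as PySpark or AWS Glue jobs, as in the projects below.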
Big Data Trainee
Infister Technology
01/2019 - 02/2020

Roles & Responsibilities:
- Implemented data validation, quality checks, and profiling.
- Imported data from various RDBMS into HDFS using Spark, including incremental loads to populate Hive external tables, and vice versa (a sketch of this import pattern follows the list).
- Imported data from various RDBMS into Hive tables and queried them using Spark.
- Designed both managed and external Hive tables, and defined static and dynamic partitions as per requirement for optimized performance on production datasets.
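A minimal sketch of that RDBMS-to-Hive import, assuming a hypothetical MySQL source, table, watermark column, and Hive target; the watermark would normally be read from a control table rather than hard-coded.

```python
# Minimal sketch: incremental JDBC import from an RDBMS into a dynamically
# partitioned Hive table. All connection details and names are hypothetical.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("rdbms_to_hive")
         .enableHiveSupport()
         .getOrCreate())

# Incremental load: pull only rows changed since the last recorded watermark.
last_watermark = "2020-01-01 00:00:00"  # assumed; kept in a control table
orders = (spark.read.format("jdbc")
          .option("url", "jdbc:mysql://db-host:3306/sales")
          .option("dbtable",
                  f"(SELECT * FROM orders WHERE updated_at > '{last_watermark}') t")
          .option("user", "reader")
          .option("password", "...")
          .load())

# Enable dynamic partitioning, then append into a Hive table partitioned by
# order_date (insertInto expects the partition column last in the schema).
spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")
orders.write.mode("append").insertInto("warehouse.orders_ext")
```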
SKILLS

Snowflake, AWS, AWS Glue, AWS Lambda, Python, SQL, SnowSQL, Spark, PySpark, EC2, AWS S3, AWS IAM, Big Data, Hadoop, Hive, Terraform

ACHIEVEMENTS

- Reliable Sprint Engagement: Consistently participated in all sprints, contributing to seamless project progression and fostering effective teamwork.

PROJECTS

Migration Analytics in Retail Domain
- Developed an ETL pipeline to extract data stored in AWS S3 for further processing using PySpark (a sketch of such a job follows the list).
- Data Transformation and Cleaning: Developed PySpark and AWS Glue jobs for efficient data transformation and cleansing.
- Job Monitoring and Maintenance: Actively monitored and maintained Glue jobs, including debugging and resolving issues to ensure job stability and accuracy.
- Cross-Functional Collaboration: Collaborated closely with Data Scientists, Analysts, and team members to understand data requirements, ensuring that processed data aligned with their needs.
- Thorough Documentation: Created comprehensive documentation for PySpark scripts, SQL queries, and data transformations, fostering transparency and facilitating collaboration.
- SQL Query Execution: Executed SQL queries in the Snowflake data warehouse to fulfil client requirements and support business KPIs effectively.
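A minimal sketch of one such Glue-style PySpark job, with hypothetical bucket names, paths, and columns, showing the extract-clean-write shape described above:

```python
# Minimal sketch of an AWS Glue PySpark job: read raw JSON from S3, clean it,
# write curated Parquet back to S3. Buckets and column names are hypothetical.
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

raw = spark.read.json("s3://retail-raw/sales/")        # hypothetical input
clean = (raw.dropDuplicates(["order_id"])              # drop duplicate orders
            .filter(F.col("amount") > 0)               # discard invalid rows
            .withColumn("order_date", F.to_date("order_ts")))
clean.write.mode("overwrite").parquet("s3://retail-curated/sales/")

job.commit()
```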
Data Ingestion in Media & Entertainment Domain
- Data Retrieval: Fetched API data using Python, applying data cleaning, transformation, and extraction (a sketch of the Lambda handler follows the list).
- Cloud Deployment: Utilized Terraform to deploy AWS Lambda, automating data retrieval and processing.
- Data Integration: Designed a Lambda function to fetch, process, and store data in S3 for analysis.
- Monitoring and Debugging: Set up AWS CloudWatch for real-time monitoring and issue resolution.
- Data Accessibility: Ensured processed data availability in S3, well organized for downstream use.
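A minimal sketch of such a Lambda handler, with a hypothetical API endpoint, field names, and bucket; the Terraform deployment and CloudWatch wiring described above are omitted here:

```python
# Minimal sketch of the fetch-clean-store Lambda. Endpoint, fields, and
# bucket name are hypothetical placeholders.
import json
import urllib.request
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # Fetch raw records from the source API (hypothetical endpoint).
    with urllib.request.urlopen("https://api.example.com/v1/titles") as resp:
        records = json.loads(resp.read())

    # Light cleaning/extraction: keep only the fields downstream users need.
    cleaned = [
        {"id": r["id"], "title": r.get("title", "").strip()}
        for r in records
        if r.get("id") is not None
    ]

    # Store under a date-partitioned key so the data stays organized in S3.
    key = f"media/titles/dt={datetime.now(timezone.utc):%Y-%m-%d}/titles.json"
    s3.put_object(Bucket="media-ingest-bucket", Key=key,
                  Body=json.dumps(cleaned).encode("utf-8"))
    return {"stored": len(cleaned), "s3_key": key}
```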
EDUCATION

Bachelor of Engineering
Pune University
