

Shweta Milind Sakhale
Phone: (+91)-8828396084
Email: shwetasakhale13@gmail.com

PROFESSIONAL SUMMARY

❖ Over 2.5 years of experience designing and developing Big Data applications using
HDFS, Hive, Sqoop, Spark, DSL, and AWS.

❖ Understands the complex data processing needs of big data and has experience developing
code to address those needs.

❖ Knowledge of Spark SQL optimization techniques, such as cost-based query optimization,
column pruning, and predicate pushdown, and their impact on query performance and resource
utilization.

❖ Strong understanding of Spark SQL integration with other big data technologies, such as Hadoop
and Hive, and their impact on data processing workflows and performance.

❖ Experience working with Spark SQL in production environments and implementing performance
monitoring and alerting systems to detect and resolve performance issues proactively.

❖ Proficient in processing serialized data in Spark using various formats, such as Avro, Parquet,
ORC, CSV, and text files, with an understanding of their features and limitations.

❖ Experienced in using Spark serialization libraries, such as Java serialization, to optimize data
serialization and deserialization performance.

❖ Skilled in working with binary and textual data formats in Spark, such as CSV, JSON, and XML, and
their serialization and deserialization using Spark DataFrames and RDDs.

❖ Expertise in using Spark serialization and compression techniques, such as block-level
compression, dictionary encoding, and off-heap storage, to reduce data storage and processing
overhead.

❖ Maintained and monitored Spark clusters on AWS EMR, ensuring high availability and fault
tolerance.

❖ Optimized Spark jobs and data processing workflows for scalability, performance, and cost
efficiency using techniques such as partitioning, compression, and caching.

❖ Designed and developed Spark applications to implement complex data transformations and
aggregations for batch processing jobs, leveraging Spark SQL and DataFrames.

❖ Experience in data architecture, including data ingestion and pipeline design.


❖ Experience importing and exporting data between HDFS and RDBMS using Sqoop.

❖ Introduced process simplification and optimization initiatives to make applications more
efficient.

❖ Able to collaborate with stakeholders and perform source-to-target data mapping, design and
review.

❖ Database experience with SQL Server and MySQL.

❖ Good problem-solving and analytical skills, with a readiness to innovate to improve results.

❖ Strong interpersonal and communication skills.

TECHNICAL SKILLS

Data Ecosystem : Hadoop, Sqoop, Hive, Spark and AWS

Databases : SQL Server, MySQL

Languages : Scala, Python, SQL

Operating Systems : Linux and Windows

PROFESSIONAL EXPERIENCE

PROJECT

Client: AT&T (January 2021 – Present)

Project Role: Big Data Developer

❖ Experienced in using Hive managed and external tables efficiently, according to business
requirements.

❖ Expertise in utilizing Spark RDD transformations and actions for processing large-scale
structured and unstructured datasets, including tasks like filtering, mapping, reducing, grouping,
and aggregating data.

❖ Skilled in employing Spark RDD persistence and caching mechanisms to minimize data
processing overhead and enhance query performance.

❖ Familiarity with Spark RDD lineage and fault tolerance mechanisms and their impact on the
reliability and performance of data processing.

❖ Knowledge of Spark RDD optimization techniques, such as data partitioning, shuffle tuning, and
pipelining, and their effects on query performance and resource utilization.

❖ Ability to troubleshoot common issues with Spark RDD, such as data processing errors,
performance bottlenecks, and limitations in scalability.

❖ Experience working with Spark RDD in production environments and implementing proactive
performance monitoring and alerting systems to identify and resolve performance issues.

❖ Knowledge of best practices in data engineering and data science domains for Spark RDD, such
as data preprocessing, feature engineering, model training, and inference.

❖ Proficient in developing and implementing Spark DataFrame-based data processing workflows
using Scala or Python.

❖ Experienced in optimizing Spark DataFrame performance by adjusting various configuration
settings, such as memory allocation, caching, and serialization.

❖ Expertise in using Spark DataFrame transformations and actions to process large-scale
structured and semi-structured datasets, including tasks like filtering, mapping, reducing,
grouping, and aggregating data.

❖ Skilled in employing Spark DataFrame persistence and caching mechanisms to minimize data
processing overhead and enhance query performance.

❖ Familiarity with Spark DataFrame schema and data type operations, such as adding, renaming,
and dropping columns, casting data types, and handling null values.

❖ Familiarity with Spark DataFrame APIs and SQL syntax, and the ability to write complex SQL
queries and DataFrame operations to address business problems.

❖ Experienced in optimizing Spark SQL performance by adjusting various configuration settings,
such as memory allocation, caching, and serialization.

❖ Expertise in using Spark SQL to process large-scale structured and semi-structured datasets,
including tasks like querying, filtering, mapping, reducing, grouping, and aggregating data.

❖ Skilled in employing Spark SQL persistence and caching mechanisms to minimize data
processing overhead and enhance query performance.

❖ Familiarity with Spark SQL schema and data type operations, such as creating, modifying, and
dropping tables and views, and handling null values.

Responsibilities:
Technologies: HDFS, Hive, SQL, DSL, Sqoop, Spark, Scala, Python, AWS

WORK EXPERIENCE

Accenture Pvt Ltd: January 2021 – Present

EDUCATION

Institute/College | Duration | Percentage Obtained

Veermata Jijabai Technological Institute (VJTI), Mumbai | 2021-22 | 7.56 CGPI

Jawahar Education Society Annasaheb Chudaman Patil College of Engineering, Navi Mumbai | 2015-20 | 6.59 CGPI

V.M Pilankar Junior College, Alibaug | 2014-15 | 65.38%

S.R.T High School, Alibaug | 2013 | 73.20%

I hereby declare that all the information in this document is accurate and true to the best of
my knowledge.

Yours sincerely,
(Shweta Sakhale)

Place: Navi Mumbai
