Partha's Resume


Parthasarathi Swain Data Scientist

+91-7795909536 | LinkedIn | GitHub | partha.datascientist@gmail.com

SKILLS
• Python, PySpark, Data Analysis, Machine Learning, MLOps, MySQL, SQL, Deep Learning, Computer Vision, Natural Language Processing, Tableau, NumPy,
Pandas, Statistics & Probability, Neural Networks, Snowflake, StreamSets, Databricks, Streamlit, Flask, AWS, Azure, Product Analytics, Control-M, Cassandra,
GitHub Actions, Bitbucket, Hadoop, Hive, Java

EXPERIENCE
Data Scientist Jul 2023 - Apr 2024

Vestas Bangalore

Recruited as a Data Scientist; worked on several use cases covering Machine Learning, Time Series Forecasting, AWS cloud deployment using SageMaker,
application development using Flask and Streamlit, workflow orchestration using Databricks, and data pipelines built with StreamSets as the data ingestion
tool.
Conducted exploratory data analysis (EDA) on sensor data, manufacturing logs, and maintenance records to identify key failure patterns and maintenance triggers
using NumPy, Pandas, Matplotlib, Seaborn, and SQL; implemented business logic in Python and PySpark and stored the cleaned data in AWS S3 in Parquet
format.
Deployed models following MLOps practices, building CI/CD pipelines with GitHub Actions.
Employed five different Machine Learning algorithms (Logistic Regression, Decision Tree, Random Forest, Bagging Classifier, and Boosting Classifiers such
as XGBoost) and performed 5-fold cross-validation to compute an average customized metric score on the training dataset.
Developed a predictive maintenance model using historical sensor and operational data to forecast potential failures before they occur, achieving an
F1 score of 0.79 after iterative model refinement and improving efficiency in component failure identification.

Data Scientist Jul 2021 - Jun 2023

Genpact Bangalore

Spearheaded the development of a classification model to identify potential customers, applying different aspects of Data Science across projects:
data analysis, machine learning, application development using Flask and Streamlit, Heroku cloud deployment, workflow orchestration using Databricks,
and cleaning and analyzing customer data with Snowflake, AWS SageMaker, and Python.
Conducted data preprocessing, feature engineering, and EDA on large-scale insurance datasets using NumPy, Pandas, Matplotlib, and Seaborn, covering
policy information, claim history, and external risk factors.
Implemented the end-to-end flow: built data pipelines using StreamSets for data ingestion, applied ETL operations, stored data in Avro and
Parquet formats in staging and target tables, implemented business logic in Python and PySpark, and deployed using MLOps.
Worked on KPI monitoring, KPI analysis, and capacity planning across technologies, and on an insurance management system for multiple customers,
increasing project deliveries by 15%.
Delivered Tableau reports for customer presentations and engagements across technologies, increasing new business by 30%.
Decreased claims processing time by 40% and improved customer satisfaction.

Data Scientist May 2019 - May 2021

Deloitte Bangalore

Trained in and worked on different aspects of Data Science across projects: data analysis, machine learning, application development using Flask and
Streamlit, AWS cloud deployment, workflow orchestration using Databricks, and statistical and probabilistic analysis of a customer-details
dataset.
Built StreamSets and Python competency to support automation and pipeline development, reducing manual hours by 20%.
Conducted data preprocessing and exploratory data analysis (EDA) using SQL, NumPy, Pandas, Matplotlib, and Seaborn; implemented PySpark for
distributed data processing; deployed models via MLOps on SageMaker, creating CI/CD pipelines with GitHub Actions.
Developed optimization models to streamline the supply chain and reduce costs.

Data Engineer Nov 2018 - May 2019

Intuit Bangalore

Reduced job runtime from 45 minutes to 7 minutes by optimizing the batch size in the StreamSets Data Collector tool.
Built data ingestion pipelines using StreamSets, processed big data (10 GB daily), and scheduled jobs using Control-M.
Utilized Cassandra in the staging layer, Tableau for visualization, Hive for data query and analysis, Hadoop (via Cloudera) for data storage, and
Python and PySpark for data processing and ETL operations.
Handled big-data processing by enabling auto-scaling and cluster end-time settings in Databricks.

Data Engineer Apr 2017 - Jul 2018

ITC Infotech Bangalore

Built scalable ETL pipelines using StreamSets and scheduled jobs using Control-M.
Optimized batch jobs and resolved Tungsten issues by handling DBNULL values; provided round-the-clock support for 10+ applications.
Designed, developed, and supported data processing pipelines using Python and PySpark to process customer eligibility data.
Migrated from an on-premises Cloudera environment (Hadoop, Hive, and Hue for data storage and analysis) to a cloud-based Snowflake
data warehouse.

Senior Software Engineer Sep 2014 - Dec 2016

HCL Technologies Bangalore

Worked in the RTB (Run the Bank) team providing 24/5 support for a wealth planning application.
Experienced in production support processes such as outage/incident management, current-issue triage, onsite/offshore coordination workflow, documentation,
shift handover, and escalation procedures.
Analyzed and debugged production issues and provided updates as quickly as possible.
Initiated conference/bridge calls for incidents based on severity and ensured issues were resolved as quickly as possible.
Worked with multiple stakeholders across the organization and handled their support-related queries and issues.
Worked with a tech stack of Java 1.6, J2EE, UNIX, Linux, Oracle 10g, WebSphere, PuTTY, AutoSys, and ServiceNow (SNOW) on this project.

Software Engineer Sep 2013 - Aug 2014

Aricent Group Bangalore

Led production calls for the development team: understood the infrastructure and production environment, assisted with site-availability issues by
participating on calls, and engaged developers as needed to resolve problems both short term and long term.
Specialized in production support processes such as outage/incident management. Conducted daily huddle calls with onsite & offshore teams.
Automated application checkout and ensured all applications were up and running 24/7 after weekend deployments.
Displayed custom messages in the UI using HTML5 during maintenance activities via the Log Viewer tool.
Created and exposed web APIs using Flask and performed web development using Java 6.
Integrated Spring 3.0 MVC with Struts and Hibernate using the MVC architecture.

PROJECTS
Time Series Forecasting for Wind Turbine
Led a cross-functional project to develop predictive maintenance models for critical components in wind turbines, aimed at reducing downtime and optimizing
maintenance schedules. Collaborated with engineering, operations, and IT teams to define project objectives and data requirements.
Analyzed sensor data from over 800 wind turbines to identify patterns and optimize energy output, leading to a 22% improvement in overall turbine performance.
Evaluated model performance using metrics such as precision, recall, and F1 score, achieving a 30% reduction in maintenance costs and improving turbine
uptime by 20%.
Saved $4 million annually in maintenance costs by enabling proactive maintenance and improving turbine reliability.

Insurance Management System

Developed a project to gain insight into the characteristics of genuine and fraudulent insurance requests by performing exploratory data analysis (EDA)
and statistical and probabilistic analysis on a customer-details dataset, achieving 95% accuracy with the resulting model.
Led a comprehensive project focused on developing predictive models for insurance claims, aimed at improving risk assessment and loss mitigation strategies.
Handled class imbalance with the Synthetic Minority Oversampling Technique (SMOTE) and random under-sampling; improved policy renewal rates by 18% and
reduced customer churn.
Implemented a fraud detection system using supervised and unsupervised learning methods to identify and mitigate fraudulent claims, resulting in a 15%
reduction in claims costs and improved risk selection.
Improved detection accuracy and reduced false positives, resulting in a 40% reduction in fraudulent claims payouts, saving $5 million annually.

Inventory Management System

Developed a project to estimate the delivery time quoted to customers based on what they order, where they order from, and the assigned
delivery partners, with an accuracy of 92%.
Built a project to automate the inventory management system by predicting equipment availability for a particular product based on the
customer details provided in the online application form.
Developed a project to improve the inventory management system by predicting whether an order will be approved for a particular
customer/client based on their demand.
Utilized machine learning algorithms to analyze historical sales data and market trends, improving inventory accuracy by 30%.

ACHIEVEMENTS
More than 5 years of teaching experience in Data Science as an AI & ML instructor at one of India's largest ed-tech platforms.
Completed the Oracle Certified Java Programmer (OCJP, Java 1.6) certification with a score of 88%.
Received the Best Employee and Sport awards at HCL Technologies.

EDUCATION
Scaler 2024

Specialized in Data Science & Machine Learning

BPUT 2011
BE/B.Tech in Electronics, 8.01 CGPA
Completed B.Tech in Electronics and Telecommunication Engineering (2007-2011) with an aggregate CGPA of 8.01.
