

Arram

Professional Summary:

Data Analyst with over 6 years of experience in Data Extraction, Data Screening, Data Cleaning, Data Exploration, Data Visualization and Statistical Modelling of varied datasets, both structured and unstructured, as well as in implementing large-scale Machine Learning and Deep Learning algorithms to deliver insights and inferences that significantly impact business revenue and user experience.

• Experienced in facilitating the entire lifecycle of a data science project: Data Extraction, Data Pre-Processing, Feature Engineering, Algorithm Selection & Implementation, Backtesting and Validation.
• Skilled in using the Python libraries NumPy and Pandas for Exploratory Data Analysis.
• Proficient in data transformations using log, square-root, reciprocal, cube-root, square and Box-Cox transformations, depending on the dataset.
• Adept at handling missing data by diagnosing the missingness mechanism (MAR, MCAR, MNAR), analyzing correlations and similarities, introducing dummy variables and applying various imputation methods.
• Experienced in Machine Learning techniques such as Regression and Classification models like Linear Regression, Logistic Regression, Decision Trees and Support Vector Machines using scikit-learn in Python.
• In-depth knowledge of Dimensionality Reduction (PCA, LDA), Hyper-parameter Tuning, Model Regularization (Ridge, Lasso, Elastic Net) and Grid Search techniques to optimize model performance (a minimal sketch follows this summary).
• Skilled in Python, SQL, R and Object-Oriented Programming (OOP) concepts such as Inheritance, Polymorphism, Abstraction and Encapsulation.
• Working knowledge of database creation and maintenance of physical data models with Oracle, DB2 and SQL Server databases, as well as normalizing databases up to third normal form using SQL functions.
• Experienced in web data mining with Python's Scrapy and Beautiful Soup packages, along with working knowledge of Natural Language Processing (NLP) for analyzing text patterns.
• Proficient in NLP concepts like Tokenization, Stemming, Lemmatization, Stop Words and Phrase Matching, and in libraries like spaCy and NLTK.
• Experienced in developing Supervised Deep Learning algorithms, including Artificial Neural Networks, Convolutional Neural Networks, Recurrent Neural Networks, LSTMs and GRUs, and Unsupervised Deep Learning techniques like Self-Organizing Maps (SOMs) in Keras and TensorFlow.
• Skilled at Data Visualization with Tableau, Power BI, Seaborn, Matplotlib, ggplot2 and Bokeh, and at building interactive graphs with Plotly & Cufflinks.
• Experienced in the analysis, modeling, design and development of Tableau reports and dashboards for analytics and reporting applications.
• Knowledge of cloud services like Amazon Web Services (AWS) and Microsoft Azure ML for building, training and deploying scalable models.
• Highly proficient in using T-SQL to develop complex Stored Procedures, Triggers, Tables, Views, User Functions, User Profiles and Relational Database Models, enforce data integrity, and write SQL joins and queries.
• Proficient in using PostgreSQL, Microsoft SQL Server and MySQL to extract data with multiple types of SQL queries, including CREATE, SELECT, JOIN, DROP, CASE and conditionals.
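
As a minimal sketch of the regularization and grid-search workflow referenced above (assuming scikit-learn; the dataset is synthetic and stands in for any real tabular data):

# Tune Ridge, Lasso and Elastic Net with GridSearchCV on synthetic data.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso, ElasticNet
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_regression(n_samples=500, n_features=20, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Search a small alpha grid per model and report held-out R^2.
candidates = [
    ("Ridge", Ridge(), {"alpha": [0.01, 0.1, 1.0, 10.0]}),
    ("Lasso", Lasso(), {"alpha": [0.01, 0.1, 1.0, 10.0]}),
    ("Elastic Net", ElasticNet(), {"alpha": [0.01, 0.1, 1.0], "l1_ratio": [0.2, 0.5, 0.8]}),
]
for name, model, grid in candidates:
    search = GridSearchCV(model, grid, cv=5)
    search.fit(X_train, y_train)
    print(name, search.best_params_, round(search.score(X_test, y_test), 3))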

Education:
Texas A&M University, Kingsville – Master's in Mechanical Engineering (2016), GPA 4.0
JNT University, Hyderabad – Bachelor's in Mechanical Engineering (2013), GPA 4.0
Technical Skills:

Languages: Python 2.x/3.x, R, SQL
Databases: MySQL, PostgreSQL, MongoDB, Microsoft SQL Server, Oracle
Statistics: Hypothesis Testing, Confidence Intervals, Bayes' Law, MLE, Fisher Information, Principal Component Analysis (PCA), Cross-Validation, Correlation
BI Tools: Tableau, Excel, Microsoft Power BI
Algorithms: Linear Regression, Logistic Regression, Lasso Regression, Random Forest, XGBoost, KNN, SVM, Neural Networks, K-Means
Reporting Tools: MS Office (Word/Excel/PowerPoint/Visio/Outlook), Crystal Reports XI, SSRS, Cognos 7.0/6.0
Database Design Tools and Data Modelling: MS Visio, ERWIN 4.5/4.0, Star Schema/Snowflake Schema modeling, Fact & Dimension tables, physical & logical data modeling, Normalization and De-normalization techniques, Kimball & Inmon Methodologies

Professional Experience:

SiriusXM, Irving, TX Mar ’18 – Present

Senior Data Analyst

Responsibilities:
• Utilized Python's data visualization libraries, Matplotlib and Seaborn, to communicate findings to the data science, marketing and engineering teams.
• Conducted data blending and data preparation in Python for Tableau consumption and published data sources to Tableau Server.
• Performed univariate, bivariate and multivariate analysis on BMI, age and employment features to examine how they related to one another and to the risk factor.
• Trained several machine learning models, including Logistic Regression, Random Forest and Support Vector Machines (SVM), on selected features to predict customer churn (a sketch follows this section).
• Worked on statistical methods like data-driven Hypothesis Testing and A/B Testing to draw inferences: determined significance levels, derived p-values and evaluated the impact of various risk factors.
• Worked on data cleaning and ensured data quality, consistency and integrity using Pandas and NumPy.
• Created database designs through data mapping using ER diagrams and normalization up to the 3rd normal form, and extracted relevant data as required using joins in PostgreSQL and Microsoft SQL Server.
• Created multiple custom SQL queries in MySQL Workbench to prepare datasets for Tableau dashboards, retrieving data from multiple tables with join conditions to efficiently feed Tableau workbooks.
• Implemented and tested the model on AWS EC2 and collaborated with the development team to select the best algorithms and parameters.
• Designed data-visualization dashboards with Tableau and generated complex reports, including summaries and graphs, to interpret the findings for the team.
Environment: Python (NumPy, Pandas, Matplotlib, scikit-learn), AWS, Jupyter Notebook, Tableau, SQL
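
A hedged sketch of the churn-model comparison described above; the feature matrix below is synthetic and merely stands in for the real customer data:

# Compare churn classifiers with cross-validation on synthetic stand-in data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Imbalanced two-class problem mimicking churn vs. non-churn labels.
X, y = make_classification(n_samples=1000, n_features=10, weights=[0.8, 0.2], random_state=0)

for name, clf in [("logistic regression", LogisticRegression(max_iter=1000)),
                  ("random forest", RandomForestClassifier(random_state=0)),
                  ("svm", SVC())]:
    scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean ROC-AUC = {scores.mean():.3f}")
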
Citibank, Irving, TX Feb ’17 – Mar ’18
Senior Data Analyst

Responsibilities:
• Involved in data profiling to learn about user behavior and merged data from multiple data sources.
• Designed and developed various machine learning frameworks using Python and R.
• Tackled a highly imbalanced fraud dataset using under-sampling, over-sampling with SMOTE and cost-sensitive algorithms in Python (a sketch follows this section).
• Collaborated with data engineers to implement the ETL process; wrote and optimized SQL queries to extract data from the cloud and merge it from Oracle 12c.
• Collected unstructured data from PostgreSQL and MongoDB and completed data aggregation.
• Conducted analysis assessing customer purchasing behavior and discovered customer value with RFM analysis; applied customer segmentation with clustering algorithms such as K-Means Clustering and Hierarchical Clustering.
• Used Pandas, NumPy, Seaborn, SciPy, Matplotlib, scikit-learn and NLTK (Natural Language Toolkit) in Python to develop various machine learning algorithms.
• Utilized machine learning algorithms such as Decision Trees, Linear Regression, Multivariate Regression, Naive Bayes, Random Forests, K-Means and KNN.
• Performed data integrity checks, data cleaning, exploratory analysis and feature engineering using R 3.4.
• Worked on different data formats such as JSON and XML and ran machine learning algorithms in R.
• Performed data visualizations with Tableau and generated dashboards to present the findings.
• Worked on text analytics, Naïve Bayes, sentiment analysis and word clouds, retrieving data from Twitter and other social networking platforms.
• Used Git 2.6 for version control; tracked changes in files and coordinated work on them among multiple team members.
Environment: Python, Tableau, R, MySQL, MS SQL Server, AWS, S3, EC2, RNN, ANN
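
A minimal sketch of the imbalanced-data handling described above, assuming the imbalanced-learn package; the data is synthetic, not the actual fraud dataset:

# SMOTE over-sampling plus a cost-sensitive classifier (requires imbalanced-learn).
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic data with a 2% positive class, mimicking fraud labels.
X, y = make_classification(n_samples=5000, weights=[0.98, 0.02], random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=1)

# Over-sample the minority (fraud) class on the training split only.
X_res, y_res = SMOTE(random_state=1).fit_resample(X_train, y_train)

# class_weight="balanced" additionally makes the model cost-sensitive.
clf = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_res, y_res)
print(classification_report(y_test, clf.predict(X_test)))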

Equinix, Redwood City, CA Sept ’15 – Oct ’16

Data Analyst

Responsibilities:
• Responsible for analyzing large data sets to develop multiple custom models and algorithms that drive innovative business solutions.
• Performed preliminary data analysis, handled anomalies such as missing values, duplicates and outliers, and imputed or removed irrelevant data.
• Removed outliers using proximity-distance and density-based techniques (a sketch follows this section).
• Involved in the analysis, design and implementation/translation of business user requirements.
• Used supervised, unsupervised and regression techniques in building models.
• Performed Market Basket Analysis to identify groups of assets moving together and advised the client on the associated risks.
• Implemented techniques like forward selection, backward elimination and the stepwise approach to select the most significant independent variables.
• Performed feature selection and feature extraction with dimensionality reduction methods to identify significant variables.
• Performed Exploratory Data Analysis using R and generated various graphs and charts for analyzing the data using Python libraries.
• Involved in the execution of multiple business plans and projects; ensured business needs were met and interpreted data to identify trends that carry across future data sets.
• Developed interactive dashboards and created various ad hoc reports for users in Tableau by connecting to various data sources.
Environment: Python, SQL server, Sqoop, Mahout, MLLib, MongoDB, Tableau, ETL.
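
A small sketch of density-based outlier removal using scikit-learn's Local Outlier Factor, standing in for the proximity and density techniques mentioned above; the points are randomly generated for illustration:

# Density-based outlier removal with Local Outlier Factor.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(200, 2)),    # dense inlier cluster
               rng.uniform(-8, 8, size=(10, 2))])  # scattered outliers

# fit_predict returns +1 for inliers and -1 for outliers.
mask = LocalOutlierFactor(n_neighbors=20).fit_predict(X) == 1
X_clean = X[mask]
print(f"kept {mask.sum()} of {len(X)} rows after outlier removal")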

Cloudpolitian Technologies India Pvt Limited, Hyderabad, India Aug ’13 – Jun ’15
Data Analyst

Responsibilities:
• Collaborated with data engineers to implement the ETL process; wrote and optimized SQL queries to perform data extraction and merging from Oracle.
• Performed data integrity checks, data cleaning, exploratory analysis and feature engineering using R and Python.
• Developed personalized product recommendations with machine learning algorithms, including Gradient Boosting Trees and collaborative filtering, to better meet the needs of existing customers and acquire new ones (a toy sketch follows this section).
• Used Python to implement different machine learning algorithms, including Generalized Linear Models, Random Forest, SVM, Boosting and Neural Networks.
• Worked on data cleaning, data preparation and feature engineering in Python with NumPy, SciPy, Matplotlib, Seaborn, Pandas and scikit-learn.
• Recommended and evaluated marketing approaches based on quality analytics of customer purchasing behavior.
• Performed data visualization, designed dashboards with Tableau, and provided complex reports, including charts, summaries and graphs, to interpret the findings for the team and stakeholders.
• Identified process improvements that significantly reduced workloads or improved quality.
Environment: R Studio, Python, Tableau, SQL Server and Oracle.
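
A toy sketch of item-based collaborative filtering with cosine similarity, one plausible form of the recommendation approach above; the ratings matrix is invented for illustration (rows are users, columns are products, 0 means unrated):

# Item-based collaborative filtering with cosine similarity (toy data).
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

ratings = np.array([[5, 3, 0, 1],
                    [4, 0, 0, 1],
                    [1, 1, 0, 5],
                    [0, 1, 5, 4]], dtype=float)

item_sim = cosine_similarity(ratings.T)  # item-item similarity matrix
scores = ratings @ item_sim              # predicted affinity per user/item
scores[ratings > 0] = -np.inf            # mask items a user already rated
for user, item in enumerate(scores.argmax(axis=1)):
    print(f"recommend item {item} to user {user}")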
