
Nelson MBI

Data Scientist
nelsonmbi@yahoo.com
Mobile: 256-479-2676

Results-oriented, visionary Data Scientist with 8 years of experience in machine learning and deep
learning, holding master's degrees in Data Analytics and Political Economy with a focus on statistics,
econometrics, machine learning, and political event prediction. Worked with teams integrating
data science and analytics into decision-making and into portfolios of products and services. Led teams in
implementing cutting-edge solutions, providing thought leadership, and prototyping enterprise data
science solutions.

● Led teams of 3 (onsite and offshore), providing leadership, modeling, and mentorship on more
than 15 projects.
● Strong leadership, team management, and problem-solving skills.
● Worked closely with business, data governance, SMEs, and vendors to define data requirements.
● Evaluated 3rd party data vendors and acquired data to increase model accuracy.
● Built models such as LSTMs (Keras on TensorFlow), HMMs, random forests, k-NN, logistic
regression, and time series models using packages such as ggplot2, dplyr, NumPy, scikit-learn,
pandas, and matplotlib.
● Experience building NLP models using word embeddings, bag-of-n-grams, gensim, and word2vec.
● Automated data collection by building workflows that extract data from various REST APIs and
databases, process responses, and perform data transformations in Python and R (see the sketch after this list).
● Established feedback loops, automated processes, platform integrations, optimization models,
and models to improve user experience.
● Implemented cutting-edge solutions using on-premises and AWS technologies such as S3, Spark,
MySQL, Hadoop, EMR, Aurora, Glacier, MongoDB, Cassandra, Elasticsearch, Logstash, Kibana,
APIs, EC2, Lambda, and QuickSight.
● Proposed and implemented use cases ranging from statistical analysis and testing, churn
prediction, time series forecasting, anomaly detection, and customer LTV to text mining, A/B testing
for feature selection, and dashboards.
● Reported analytical findings to C-level executives using dashboards built in Tableau, QlikView,
and R-Shiny.
● Managed teams performing data analysis on classification, forecasting, and statistical models and
risk analysis, and solved data-driven problems using SPSS, SAS E-Miner, R, SAS, Python, EViews,
Tableau, and Qlik.
● Published weekly Tableau reports to clients and presented monthly graphical summaries.
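A minimal sketch, in Python, of the REST API extraction-and-transformation workflow mentioned in the list above; the endpoint URL and field names (order_id, created_at, amount) are hypothetical placeholders, not a specific client API.

import requests
import pandas as pd

def fetch_records(url, params=None):
    """Call a JSON REST endpoint and return the payload as a DataFrame."""
    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    return pd.json_normalize(response.json())

if __name__ == "__main__":
    # Hypothetical endpoint and fields, used only for illustration.
    df = fetch_records("https://api.example.com/v1/orders", {"status": "open"})
    df["created_at"] = pd.to_datetime(df["created_at"], errors="coerce")
    daily_totals = (
        df.drop_duplicates(subset="order_id")
          .groupby(df["created_at"].dt.date)["amount"]
          .sum()
    )
    print(daily_totals.head())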

PROGRAMMING LANGUAGES: Python, R, SAS, C, MATLAB, Java, SQL, Hive, Linux, VBA
macros, HTML, CSS, JavaScript, and Bootstrap.

TOOLS & DATABASES: RStudio, Python, Spark, AWS, SPSS, SAS, Hadoop, Hive, MongoDB,
Cassandra, Zeppelin, S3, Aurora, Glacier, Elasticsearch, EC2, Lambda, QuickSight, Tableau, Qlik,
Adobe Site Catalyst, Google Analytics, MS Visual Studio, Excel, and MS PowerPoint.

LANGUAGES: Speak English and French fluently.

Data Scientist/Data Engineer

Toyota Financial Group – Dallas, TX – 10/2021 to 11/2022

• Ingested data into Snowflake tables using the Python Snowflake SDK, providing end-to-end
automation (a brief sketch follows this list).
• Built code architecture and developed applications fulfilling business needs.
• Drew data insights through visualization.
• Implemented drift detection on input data and ML models.
• Deployed models on Azure ML Kubernetes for inference.
• Created machine learning and predictive models involving various algorithms and frameworks.
• Documented and communicated database schemas using accepted
notations.
• Built an end-to-end pipeline with Python and the groundwork environment for the Data Scientists.
• Solved different machine learning problems to meet business needs.
• Automated data pipelines from Redshift and S3 using the Python SDK.
• Worked with different data formats such as JSON and XML and ran machine learning
algorithms in R.
• Enhanced statistical models (linear mixed models) for predicting the best products for
commercialization using machine learning: linear regression, KNN, and k-means clustering algorithms.
• Participated in all phases of data mining (data collection, data cleaning, model development,
validation, and visualization) and performed gap analysis.

• Designed and implemented effective database solutions and models to store and retrieve data.
• Built databases and table structures for web applications.
• Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity,
and verifying pipeline stability.
• Developed, implemented, and maintained data analytics protocols, standards,
and documentation.
• Explained data results and discussed how best to use data to support project
objectives.
• Designed data models for complex analysis needs.
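A minimal sketch of the Snowflake ingestion step mentioned in the first bullet above, assuming the snowflake-connector-python package; the account credentials, warehouse, schema, and table name are hypothetical placeholders.

import pandas as pd
import snowflake.connector
from snowflake.connector.pandas_tools import write_pandas

def load_to_snowflake(df, table):
    """Bulk-load a DataFrame into an existing Snowflake table."""
    conn = snowflake.connector.connect(
        account="my_account",      # placeholder credentials
        user="my_user",
        password="my_password",
        warehouse="ANALYTICS_WH",  # placeholder warehouse/database/schema
        database="ANALYTICS_DB",
        schema="PUBLIC",
    )
    try:
        success, _, nrows, _ = write_pandas(conn, df, table)
        return nrows if success else 0
    finally:
        conn.close()

if __name__ == "__main__":
    sample = pd.DataFrame({"ID": [1, 2], "AMOUNT": [10.5, 20.0]})
    print(load_to_snowflake(sample, "SAMPLE_PAYMENTS"))  # hypothetical table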

• Environment: Oracle, SAS, R, Python, R-Shiny, Tableau, SQL, Apache Sqoop, Apache ZooKeeper,
Apache Oozie, RStudio, Theano, MS Excel 2016, Windows/Linux platform, Power BI, Snowflake.

Data Scientist
General Electric - Schenectady, NY
November 2018 – October 2021

More than two years of statistical programming and analytics using SAS, HDFS, Spark, R, and Python,
covering Bayesian methods (naïve Bayes and belief networks), predictive forecasting, prescriptive
linear programming models, and clustering; expert in text mining, market basket/sequence
analytics, and many other machine learning techniques.

Expert-level skills in R, Python, SAS, Tableau, Power BI, Elasticsearch, and other tools for developing,
visualizing, and deploying machine learning and statistical algorithms.

Data Scientist
Save Mart Supermarkets – New York
February 2017 to November 2018

SCM - Supply Chain Management

Goal: Reduce safety stock inventory levels from 8 weeks to 4 weeks, saving approximately
$10 million.

• Extracted inventory flow and stock level data across various nodes (hubs, stores, etc.) by joining
tables from more than 10 databases.
• Assessed the scope and scale of the project based on current and future requirements.
• Implemented hybrid architecture using Hive, PrestoDB, MariaDB, AWS Aurora, S3, Glacier,
Spark on EMR, and QuickSight.
• Established data engineering pipeline by extracting data from Hive, PrestoDB to local MariaDB
and AWS Aurora database.
• Forecasted average weekly demand from historical demand data and calculated safety stock, cycle
stock, and max stock across all nodes (hubs, stores, etc.) based on the predictions (a sketch follows this list).
• Built MinMax, LSTM, ARIMA, and naive forecasting models and sent results back to the local
databases.
• Performed data munging and preparation using Spark.
• Built Tableau dashboards for ad-hoc analysis and to compare results.
• Architected infrastructure based on the scope and scale of the project.
• Created the definition of a faulty device based on analysis results and discussions with the
business (no standard definition existed).
• Extracted data to identify device flows across various nodes and defined flow direction by
creating business rules.
• Computed parameters based on customer flows of each device.
• Built machine learning models using SVM, logistic regression, and random forest algorithms
to predict devices likely to be lemons (faulty devices).
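A minimal sketch of the weekly demand forecasting and safety-stock calculation referenced above, using an ARIMA model from statsmodels and the common z-score safety-stock rule; the demand series, service level, and lead time are illustrative placeholders.

import numpy as np
import pandas as pd
from scipy.stats import norm
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical weekly demand history for one node (store or hub).
demand = pd.Series(
    [120, 135, 128, 140, 150, 145, 160, 155, 162, 170, 168, 175],
    index=pd.date_range("2017-01-01", periods=12, freq="W"),
)

# Forecast the next four weeks with a simple ARIMA model.
fit = ARIMA(demand, order=(1, 1, 1)).fit()
forecast = fit.forecast(steps=4)

# Safety stock = z * demand std * sqrt(lead time), a common rule of thumb.
service_level = 0.95
lead_time_weeks = 2
z = norm.ppf(service_level)
safety_stock = z * demand.std() * np.sqrt(lead_time_weeks)
max_stock = forecast.mean() * lead_time_weeks + safety_stock

print(forecast.round(1))
print(f"safety stock ~ {safety_stock:.0f}, max stock ~ {max_stock:.0f}")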

Tools - AWS Aurora, S3, Glacier, EMR, Python, Spark, Rest API, Linux, Hive, PrestoDB, MariaDB,
SQL Server, Tableau.
Data Scientist
OneMain Financial Group, LLC - NY
September 2015 to February 2017
Built and managed a new team designing cost-effective A/B tests to identify high-performing
marketing campaigns, contributing to a 20% increase in sales and a 35% reduction in promotional
costs (an illustrative A/B-test sketch follows the list below).

● Established automation to extract huge corpora of text from blogs, news feeds, tweets, and other
data sources.
● Used the ggplot2, dplyr, lm, e1071, rpart, randomForest, nnet, and tree packages in R to build
predictive models for insurance and banking clients and successfully incorporated the models into
workflows.
● Built customer churn and "next product to buy" predictive models using logistic regression and
neural nets, respectively, in R, Python, SPSS, and SAS.
● Managed many analytical projects in parallel such as building predictive models, optimization
models, unstructured data analysis, and data graphs.
● Extracted and crunched social media data and built word clouds, data graphs, and storyboards using
SAS E-Miner; provided in-depth story analysis and recommendations.
● Designed web marketing campaigns, planted and analyzed JavaScript tags, and tracked performance
using Google Analytics.
● Helped companies with social media analysis, performed text mining, NLP, and sentiment
analysis, and presented the results using Link Graphs in SAS E-Miner.
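A minimal sketch of how such an A/B marketing test might be evaluated with a two-proportion z-test in Python (statsmodels); the conversion counts below are illustrative placeholders, not actual campaign results.

from statsmodels.stats.proportion import proportions_ztest

conversions = [310, 260]   # variant A, variant B (hypothetical)
impressions = [5000, 5000]

stat, p_value = proportions_ztest(count=conversions, nobs=impressions)
lift = conversions[0] / impressions[0] - conversions[1] / impressions[1]

print(f"z = {stat:.2f}, p = {p_value:.4f}, absolute lift = {lift:.2%}")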

Tools: Python, R, SAS, SAS E-Miner, R-Shiny, Tableau, Google Analytics, Linux, Hive, Hadoop,
Elasticsearch, Logstash, Kibana (ELK)

Jr. Data Scientist
USAID - Washington, DC
January 2015 to August 2015
• Used pandas, NumPy, Seaborn, SciPy, Matplotlib, and scikit-learn in Python to develop various
machine learning models, applying algorithms such as linear regression, decision trees, neural
networks, and random forests for data analysis.
• Validated the machine learning classifiers using accuracy, AUC, ROC curves, and lift charts (illustrated after this list).
• Trained random forests and analyzed training and testing errors with respect to sample size.
• Trained neural networks with backpropagation and used 10-fold cross-validation to select the best
parameters for the Ebola network; tuning parameters included the number of layers, the convergence
rate, and the range of initial random weights.
• Compared model accuracies before and after applying PCA.
• Created Tableau scorecards and dashboards using stacked bars, bar graphs, scatter plots,
geographical maps, and Gantt charts with the Show Me functionality. Created dashboards giving a
clear view of descriptive statistics for all variables, Ebola country-wise trend analysis, and
predicted vs. actual response rates for each country. Worked extensively with advanced analysis
features: Actions, Calculations, Parameters, Background Images, and Maps. Effectively used the data
blending feature in Tableau.
• Worked with different data formats such as JSON and XML and ran machine learning
algorithms in R.
• Worked with Data Architects and IT Architects to understand the movement and storage of data,
using ER/Studio 9.7.
• Processed huge datasets (over a billion data points, more than 1 TB) for data association pairing
and provided insights into meaningful associations and trends.
• Developed cross-validation pipelines for testing the accuracy of predictions.
• Enhanced statistical models (linear mixed models) for predicting the best products for
commercialization using machine learning: linear regression, KNN, and k-means clustering algorithms.
• Participated in all phases of data mining (data collection, data cleaning, model development,
validation, and visualization) and performed gap analysis.
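A minimal sketch, on synthetic data, of the kind of cross-validated classifier evaluation (accuracy and ROC AUC) described in the bullets above; the dataset and model settings are illustrative only.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

# Synthetic binary classification data standing in for the real dataset.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_validate(clf, X, y, cv=10, scoring=["accuracy", "roc_auc"])

print("mean accuracy:", round(scores["test_accuracy"].mean(), 3))
print("mean ROC AUC :", round(scores["test_roc_auc"].mean(), 3))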

Environment: Oracle, Python, R-Shiny, Tableau, SQL, Apache Sqoop, Apache ZooKeeper,
Apache Oozie, RStudio, Theano, MS Excel 2016, Windows/Linux platform

Data Analyst
United Nations (UN), New York, USA
January 2010 to January 2015
• Designed Excel, Power BI, and Tableau dashboards for senior management reports and other data
visualization reports per user needs to measure KPIs; frequently prepared ad-hoc reports using
Microsoft Access and Excel.
• Performed data analysis using QlikView and Tableau to analyze financial data.
• Created BRDs and PRDs in collaboration with the project manager, QA, and systems analyst;
performed data analysis with SQL queries against DB2 and Oracle databases, OBIEE, Excel, and
Access.
• Developed UML diagrams showing a high-level map of business interactions.
• Worked with project management and collaboration tools: MS Project, MS PowerPoint, MS Excel,
Atlassian Confluence, and JIRA.
• Worked with the business on UAT; acted as an SME and worked with the testing team.
• Interpreted and broke down accumulated data in a clear, concise manner to facilitate efficient use
by end users; reconciled the accuracy of data in the system.
• Provided technical support for EDI configuration during and after implementation.
• Effectively utilized CRM systems such as SAP CRM and Salesforce CRM, supplier-related systems
such as SAP SCM and SRM, and SAP ERP, with experience in core ERP modules including HR
(Human Resources), MM (Materials Management), and FI-CO (Financial Accounting and
Controlling), as well as SAP HCM and SAP Basis.
• Used Microsoft SharePoint and Atlassian Confluence to collaborate with other team members and
clients.
• Combined structured and unstructured information for data warehousing and analyzed claims data
to help identify cost and recovery; worked with formats such as CSV, XML, and Excel.
• Wrote SQL queries in Microsoft Access based on management's requirements/expectations.

Environment: R, Python, Microsoft Excel, Access, Query Analyzer, SharePoint, Office 365,
Tableau, MySQL, SQL Server 2005/2008, PL/SQL, Oracle, UNIX, Windows

Education

Master's in Plastic Engineering
Penn State University

Post Graduate Program in Data Science and Data Management Systems
The University of Texas at Austin
August 2022

Certification in Data Science
Northeastern University - Boston, 2018

Bachelor's Degree in Computer Science
University of Buea, Cameroon
August 2005 to December 2008

Skills
SQL, MS SQL Server, MySQL, Hadoop, Linux, JavaScript, Python, R, SAS, Tableau, Business
Intelligence, Unix Administration, Adobe, Coding, Apache, C++, CMS, Database Development,
Database Administration, Data Entry, Data Analysis, Data Mining, Big Data, MongoDB, AWS,
Deep Learning, Machine Learning, Project Management, Data Management, Data Warehousing,
BI, Excel, Essbase, FISMA, Hyperion
