Nelson - Mbi - Data Scientist - A
Data Scientist
nelsonmbi@yahoo.com
Mobile: 256-479-2676
Results-oriented, visionary Data Scientist with 8 years of experience in machine learning and deep
learning and master’s degrees in Data Analytics and Political Economy with a focus on Statistics,
Econometrics, Machine Learning, and Political events predictions. Worked with teams integrating
data science and analytics into decision-making and portfolio of products and services. Led teams in
implementing cutting-edge solutions, providing thought leadership, and prototyping enterprise data
science solutions.
● Led teams of 3 people (onsite and offshore), providing leadership, modeling, and mentorship
on more than 15 projects.
● Strong leadership, team management, and problem-solving skills.
● Worked closely with business, data governance, SMEs, and vendors to define data requirements.
● Evaluated 3rd party data vendors and acquired data to increase model accuracy.
● Built models such as LSTM, Keras on TensorFlow, HMM, Random Forests, k-NN, logistic
regressions, and time series models using packages such as ggplot, dplyr, NumPy, scikit-learn,
pandas, matplotlib, etc.
● Experience building NLP models using word embeddings, bag of n-grams, gensim, and word2vec.
● Built automated workflows to extract data from various REST APIs and databases,
process responses, and transform data in Python and R.
● Established feedback loops, automated processes, platform integrations, optimization models,
and models to improve user experience.
● Implemented cutting-edge solutions using on-premise and AWS solutions such as S3, Spark,
MySQL, Hadoop, EMR, Aurora, Glacier, MongoDB, Cassandra, Elasticsearch, Logstash, Kibana,
APIs, EC2, Lambda, QuickSight.
● Implemented and proposed use cases ranging from statistical analysis and testing, churn
prediction, Time Series forecasting, anomaly detection, Customer LTV, text mining, A/B testing for
feature selection, and dashboards.
● Reported analytical findings to C-level executives using dashboards built in Tableau, QlikView,
and R-Shiny.
● Managed teams performing data analysis on classification and forecast models, statistical models,
and risk analysis, and solved data-driven problems using SPSS, SAS E-Miner, R, SAS, Python,
EViews, Tableau, and Qlik.
● Published Tableau reports to clients on a weekly basis and presented a monthly graphical
summary to clients.
PROGRAMMING LANGUAGES: Python, R, SAS, C, MATLAB, Java, SQL, Hive, Linux, VBA
Macro, HTML, CSS, JavaScript, and Bootstrap.
TOOLS & DATABASES: RStudio, Python, Spark, AWS, SPSS, SAS, Hadoop, Hive, MongoDB,
Cassandra, Zeppelin, S3, Aurora, Glacier, Elasticsearch, EC2, Lambda, QuickSight, Tableau, Qlik,
Adobe SiteCatalyst, Google Analytics, MS Visual Studio, Excel, MS PowerPoint.
• Ingesting data into Snowflake tables using the Snowflake Python SDK, providing
end-to-end automation.
• Building code architecture and developing applications that fulfill the business
needs.
• Drawing data insights through visualization.
• Implemented Drift Detection on input data and ML models.
• Deploying models on Azure ML Kubernetes for inference.
• Create Machine Learning and predictive models involving various algorithms
and frameworks.
• Documented and communicated database schemas using accepted
notations.
• Building an end-to-end pipeline with Python and the groundwork environment
for the Data Scientists.
• Solving different Machine Learning problems to meet business needs.
• Automated the data pipeline from Redshift and S3 using the Python SDK (B
• Worked with data formats such as JSON and XML and applied
machine learning algorithms in R.
• Enhanced statistical models (linear mixed models) for predicting the best
products for commercialization using linear regression models,
KNN, and K-means clustering algorithms.
• Participated in all phases of data mining, data collection, data cleaning,
developing models, validation, visualization, and performed Gap Analysis.
Data Scientist
General Electric - Schenectady, NY
November 2018 – October 2021
More than two years in statistical programming and analytics using SAS, HDFS, Spark, R, Python,
Bayesian methods (naïve Bayes and belief networks), predictive forecasting, linear programming models
(prescriptive), clustering, expert-level text mining techniques, market basket analysis/sequence
analytics, and many other machine learning techniques.
Expert level in R, Python, SAS, Tableau, Power BI, Elasticsearch, and other tools for developing,
visualizing, and deploying machine learning & statistical algorithms.
Data Scientist
Save Mart Supermarkets – New York
February 2017 to November 2018
• Extracted inventory flow and stock level data across various nodes (hubs, stores, etc.) by joining
tables from more than 10 databases.
• Assessed the scope and scale of the project based on current and future requirements.
• Implemented hybrid architecture using Hive, PrestoDB, MariaDB, AWS Aurora, S3, Glacier,
Spark on EMR, and QuickSight.
• Established a data engineering pipeline extracting data from Hive and PrestoDB into local MariaDB
and AWS Aurora databases.
• Forecasted average weekly demand using historical demand data and calculated safety stock, cycle
stock, and max stock across all nodes (hubs, stores, etc.) based on predictions.
• Built MinMax, LSTM, ARIMA, and Naive machine learning models and sent results back to local
databases.
• Performed data munging and preparation using Spark.
• Built Tableau dashboards for ad-hoc analysis and to compare results.
• Architected infrastructure by gathering the scope and scale of the project.
• Created the definition of a faulty device based on analysis results and discussion with the
business (no standard definition existed).
• Extracted data to identify device flows across various nodes and defined flow direction by
creating business rules.
• Computed parameters based on customer flows of each device.
• Built machine learning models using SVM, logistic regression, and random forest algorithms
to predict devices likely to be lemons (faulty devices).
Tools: AWS Aurora, S3, Glacier, EMR, Python, Spark, REST API, Linux, Hive, PrestoDB, MariaDB,
SQL Server, Tableau.
Data Scientist
OneMain Financial Group, LLC - NY
September 2015 to February 2017
Built and managed a new team designing cost-effective A/B tests to identify high-performing
marketing campaigns, contributing to a 20% increase in sales and a 35% reduction in promotional
costs.
● Established automation to extract huge corpora of text from blogs, news feeds, tweets, and other
data sources.
● Used ggplot2, dplyr, lm, e1071, rpart, randomForest, nnet, and tree packages in R to build
predictive models for insurance and bank clients and successfully incorporated models into
workflows.
● Built customer churn models and 'Next product to buy' models using
logistic regression and neural nets, respectively, in R, Python, SPSS, and SAS.
● Managed many analytical projects in parallel such as building predictive models, optimization
models, unstructured data analysis, and data graphs.
● Extracted and processed social media data and built word clouds, data graphs, and storyboards using
SAS E-Miner. Provided in-depth story analysis and recommendations.
● Designed web marketing, placed and analyzed JavaScript tags, and tracked performance using Google
Analytics.
● Helped companies with social media analysis, performed text mining, NLP, and sentiment
analysis, and presented the results using Link Graphs in SAS E-Miner.
Tools: Python, R, SAS, SAS E-Miner, R-Shiny, Tableau, Google Analytics, Linux, Hive, Hadoop,
Elasticsearch, Logstash, Kibana (ELK)
Environment: Oracle, Python, R-Shiny, Tableau, SQL, Apache Sqoop, Apache ZooKeeper,
Apache Oozie, RStudio, Theano, MS Excel 2016,
Windows/Linux platform
Data Analyst
United Nations (UN), New York, USA
January 2010 to January 2015
• Designed Excel, Power BI, and Tableau dashboards for senior management reporting, along with
other data visualization reports built to user needs for measuring KPIs. Frequently prepared
ad-hoc reports using Microsoft Access and Excel.
• Analyzed financial data using QlikView and Tableau.
• Created BRDs and PRDs in collaboration with the project manager, QA, and systems
analyst. Performed data analysis with SQL queries against DB2, Oracle DB, OBIEE, Excel, and
Access DB.
• Developed UML diagrams showing a high-level map of business interactions.
• Work experience with Project management & Collaboration tools: MS Project, MS
PowerPoint, MS Excel, Atlassian Confluence, JIRA.
• Worked with the business to conduct UAT. Acted as an SME and worked with the testing team.
• Interpreted and broke down accumulated data in a clear and concise manner to facilitate
efficient use by end users. Reconciled the accuracy of data in the system.
• Provided efficient technical support during and after implementation of EDI configuration.
• Effectively utilized CRM systems like SAP and Salesforce CRM and supplier-related systems
like SAP SCM and SRM. SAP ERP experience with core modules such as HR (Human
Resource), MM (Material Management), and FI-CO (Financial Accounting and Controlling),
among others, as well as SAP HCM and SAP Basis.
• Experience with Microsoft SharePoint and Atlassian Confluence to collaborate with
other team members and clients.
• Combined structured and unstructured information for data warehousing and analyzed claims
data to help identify cost and recovery. Worked with formats like CSV, XML, and Excel.
• Wrote SQL queries in Microsoft Access based on management's requirements/expectations.
Environment: R, Python, Microsoft Excel, Access, Query Analyzer, SharePoint, Office 365,
Tableau, MySQL, SQL Server 2005/2008, PL/SQL, Oracle, UNIX, Windows
Education
Master in Plastic Engineering
August 2022
Certification in Data Science
Northeastern University, Boston, 2018
Skills
SQL, MS SQL Server, MySQL, Hadoop, Linux, JavaScript,
Python, R, SAS, Tableau, Business Intelligence, Unix Administration, Adobe, Coding,
Apache, C++, CMS, Database Development, Database Administration, Data Entry, Data
Analysis, Data Mining, Big Data, MongoDB, AWS, Deep Learning, Machine Learning, Project
Management, Data Management, Data Warehousing, Excel, Essbase, FISMA, Hyperion