Professional Documents
Culture Documents
Data Science Content
Data Science Content
Data Science Content
Prerequisite:
No statistical background is necessary
Data Analysis and Reporting background is necessary
This is a 100% hands on training. Every participant should have access to a computer.
Tools:
R , Python and tensorflow
Course Contents
Week 1:- Statistics Theory:
1. Introduction to Statistics
2. Graphical and Tabular Descriptive Statistics
3. Probability
4. Probability Distribution
5. Hypothesis Testing
6. Statistical Tests (Z-Test, Chi-Square, T-Tests, etc)
b. Introduction to R programming
2. Data Handling in R
a. Importing data
b. Sampling
c. Data Exploration
c. Measures of dispersion
2. Data exploration
3. Data validation
5. Outliers identification
6. Data Cleaning
1. Regression Analysis
a. Correlation
c. R-Square
d. Multiple regression
e. Multi collinearity
2. Logistic Regression
a. Need of logistic Regression
f. Confusion Matrix
1. Decision Trees
a. Segmentation
b. Entropy
c. Building Decision Trees
d. Validation of Trees
e. Fine tuning and Prediction using Trees
h. Cross validation
i. Boot strapping
2. Model building-1
3. Model building-2
4. Model validation
5. Variable selection
6. Model calibration
B. SVM
a. Introduction
d. SVM algorithm
g. Conclusion
a. Introduction
b. Ensembling learning
c. Bagging
e. Boosting concept
f. XG boosting
g. Conclusion
PCA Analysis
Cluster Analysis
1. Poster algorithm
2. Prior algorithm
3. Choosing probability values
4. Conclusion
1. Objective
2. ML Model-1
3. ML Model-2
Python Introduction
a. What is Python & History?
e. Python packages
f. Loops
h. If-then-else statement
f. Data Merging
b. Descriptive statistics
c. Central Tendency
f. Box Plots
g. Graphs
b. Data exploration
c. Data validation
e. Outliers identification
f. Data Cleaning
a. Correlation
c. R-Square
d. Multiple regressions
e. Multi collinearity
2. Logistic Regression
a. Need of logistic Regression
f. Confusion Matrix
3. Decision Trees
a. Segmentation
b. Entropy
d. Validation of Trees
c. Types of data
d. Types of errors
h. Cross validation
i. Boot strapping
B. SVM
a. Introduction
d. SVM algorithm
g. Conclusion
a. Introduction
d. SVM algorithm
g. Conclusion
2. ML Model-1
3. ML Model-2
Data exploration
Model building
Variable selection
Future reengineering
Final Submission