Data Science with Python Training


Duration: 60 hours
Course Code: SSDN-C250
Overview:
A data science platform is a cohesive software application that offers a mixture of basic building blocks essential for
creating many kinds of data science solutions and for incorporating those solutions into business processes,
surrounding infrastructure, and products. These building blocks cover tasks relating to data access and ingestion, data
preparation, interactive exploration and visualization, feature engineering, advanced modelling, testing, training,
deployment, and performance engineering. Python is used for production development and is also well suited to data
analysis. Data science is a broad field, and Python is the programming language most widely used in it.

What you’ll learn:

After completing the course, you will have mastered the concepts of data analysis using data science and Python
programming.

 Introduction to Data Science with statistical analysis and business applications.
 What Python is, and mathematical computation with Python.
 Scientific computation with Python.
 Data manipulation and machine learning with Python.
 Data visualization in Python with Matplotlib, and data science with Python web scraping.

Target Audience:

 Developers aspiring to be data scientists or machine learning engineers.
 Analytics managers who are leading a team of analysts.
 Business analysts who want to understand data science techniques, and information architects who want to gain
expertise.

Prerequisite Knowledge:

 Familiarity with the fundamentals of Python programming.
 Understanding of the basics of statistics and data science.

You can reach us:


SSDN Technologies, M-50, OLD DLF, Sec-14, Gurgaon, Haryana, 122001. www.ssdntech.com
Contact Us: +91-9999.111.686, +91-9999.50.9970, +91-9999.10.9937. bdt@ssdntech.com

SSDN Technologies +91-9999-111-686


Course Content
Module 1: Data Science Overview
 Data Science
 Data Scientists
 Examples of Data Science
 Python for Data Science

Module 2: Data Analytics Overview


 Introduction to Data Visualization
 Processes in Data Science
 Data Wrangling, Data Exploration, and Model Selection
 Exploratory Data Analysis or EDA
 Data Visualization
 Plotting
 Hypothesis Building and Testing

Module 3: Statistical Analysis and Business Applications


 Introduction to Statistics
 Statistical and Non-Statistical Analysis
 Some Common Terms Used in Statistics
 Data Distribution: Central Tendency, Percentiles, Dispersion
 Histogram
 Bell Curve
 Hypothesis Testing
 Chi-Square Test
 Correlation Matrix
 Inferential Statistics

Module 4: Python: Environment Setup and Essentials


 Introduction to Anaconda
 Installation of Anaconda Python Distribution - For Windows, Mac OS, and Linux
 Jupyter Notebook Installation
 Jupyter Notebook Introduction
 Control Flow

Module 5: Mathematical Computing with Python (NumPy)


 NumPy Overview
 Properties, Purpose, and Types of ndarray
 Class and Attributes of ndarray Object
 Basic Operations: Concept and Examples
 Accessing Array Elements: Indexing, Slicing, Iteration, Indexing with Boolean Arrays
 Copy and Views
 Universal Functions (ufunc)
 Shape Manipulation
 Broadcasting
 Linear Algebra

Module 6: Scientific Computing with Python (SciPy)


 SciPy and its Characteristics
 SciPy sub-packages
 SciPy sub-packages – Integration
 SciPy sub-packages – Optimize
 Linear Algebra
 SciPy sub-packages – Statistics
 SciPy sub-packages – Weave

Module 7: Data Manipulation with Python (Pandas)


 Introduction to Pandas
 Data Structures
 Series
 DataFrame
 Missing Values
 Data Operations
 Data Standardization
 Pandas File Read and Write Support
 SQL Operation

Module 8: Machine Learning with Python (Scikit-Learn)


 Introduction to Machine Learning
 Machine Learning Approach
 How Supervised and Unsupervised Learning Models Work
 Scikit-Learn
 Supervised Learning Models - Linea
 Unsupervised Learning Models: Dimensionality Reduction
 Pipeline
 Model Persistence
 Model Evaluation - Metric Functions

Module 9: Natural Language Processing with Scikit-Learn


 NLP Overview
 NLP Approach for Text Data
 NLP Environment Setup
 NLP Sentence analysis
 NLP Applications
 Major NLP Libraries
 Scikit-Learn Approach
 Scikit-Learn Approach: Built-in Modules
 Scikit-Learn Approach: Feature Extraction
 Bag of Words
 Extraction Considerations
 Scikit-Learn Approach: Model Training
 Scikit-Learn Grid Search and Multiple Parameters
 Pipeline

Module 10: Data Visualization in Python using Matplotlib


 Introduction to Data Visualization
 Python Libraries
 Plots
 Matplotlib Features:
o Line Properties Plot with (x, y)
o Controlling Line Patterns and Colors
o Set Axis, Labels, and Legend Properties
o Alpha and Annotation
o Multiple Plots
o Subplots
 Types of Plots and Seaborn

Module 11: Data Science with Python Web Scraping


 Web Scraping
 Common Data/Page Formats on the Web
 The Parser
 Importance of Objects
 Understanding the Tree
 Searching the Tree
 Navigating options
 Modifying the Tree
 Parsing Only Part of the Document
 Printing and Formatting
 Encoding

Module 12: Python Integration with Hadoop, MapReduce, and Spark


 Need for Integrating Python with Hadoop
 Big Data Hadoop Architecture
 MapReduce
 Cloudera QuickStart VM Set Up
 Apache Spark
 Resilient Distributed Datasets (RDD)
 PySpark
 Spark Tools
 PySpark Integration with Jupyter Notebook
