Program: B.Tech Specialization Course Code: BCSE3092 Course Name: Data Science
Course Outcomes:
• CO1 Acquire a sound introductory knowledge of the statistical fundamentals used in data science. (K1)
• CO2 Develop the ability to apply algorithmic principles and programming knowledge to data science using Python and R. (K2, K3)
• CO3 Develop the ability to visualise data for analysis. (K4)
• CO4 Apply and implement ML principles using probability and statistics. (K5)
• CO5 Understand and recommend statistics and machine learning solutions. (K6)
• CO6 Gain research insights and learn the latest solutions provided by researchers. (K6)
School of Computing Science and Engineering
Course Prerequisites
• PYTHON BASICS
• STATISTICS
• LINEAR ALGEBRA
Syllabus
Recommended Books
Text books
Reference Book
• Fast: Our brains are great at identifying patterns, but only when data is presented
in a tangible format. With visualization, we can spot trends and outliers quickly.
• Insightful: Exploring graph data interactively allows users to gain more in-depth
knowledge, understand the context and ask more questions, compared to static
visualization or raw data.
Principles of Data Visualization
• The role of data visualization in
communicating the complex insights hidden
inside data is vital.
• Illustrating movement
• Proportion
• Proper rhythm
• Variety
• Theme
The first step in representing information is to understand
the data that is being visualized.
1. Overview first
2. Zoom and filter
3. Details on demand
Layout and design: communicative elements
• All visual representations begin with a blank
dimensional space that will eventually hold the
information which will be communicated.
• Histograms
Histograms display the distribution of a continuous
variable by dividing the range of scores into a
specified number of bins on the x-axis and displaying
the frequency of scores in each bin on the y-axis.
A histogram takes only one variable from the dataset
and shows how often its values fall into each bin. A
simple dataset is enough to see how a histogram helps
in understanding the data.
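The binning described above can be sketched in Python. This is a minimal illustration rather than the course's own code: the `scores` list is invented for the example, and `histogram_counts` and `plot_histogram` are hypothetical helper names. The sketch assumes equal-width bins spanning the minimum to the maximum value, with the maximum counted in the last bin.

```python
def histogram_counts(values, num_bins):
    """Split [min, max] into equal-width bins and count the values in each bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value is counted in the last bin rather than a bin of its own.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts


def plot_histogram(values, num_bins=4):
    """Draw the histogram: bins on the x-axis, frequency on the y-axis."""
    # Imported here so the counting function works without matplotlib installed.
    import matplotlib.pyplot as plt

    plt.hist(values, bins=num_bins, edgecolor="black")
    plt.xlabel("Score")
    plt.ylabel("Frequency")
    plt.title("Distribution of scores")
    plt.show()


# Hypothetical exam scores, invented for illustration only.
scores = [52, 55, 61, 64, 66, 68, 70, 71, 73, 75, 78, 80, 84, 88, 92]

edges, counts = histogram_counts(scores, num_bins=4)
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {count}")
```

Calling `plot_histogram(scores)` draws the same four bins as a chart. Changing `num_bins` shows how the choice of bin count alters the apparent shape of the distribution.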
Box plots
• A box-and-whiskers plot describes the distribution of a
continuous variable by plotting its five-number summary.
• Parallel box plots can be used to compare groups.
• Box plots can be created for individual variables or for
variables by group.
• A box plot shows the distribution of the data in more
detail. It makes outliers easier to see, along with the
minimum, first quartile (Q1), median, third quartile (Q3),
maximum, and interquartile range (IQR).
Code - Box plots
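A minimal sketch of the five-number summary and a grouped box plot follows. The data and the helper names (`five_number_summary`, `find_outliers`, `plot_box_plots`) are hypothetical. Two assumptions are not stated in the slides: quartiles are computed by linear interpolation (the "inclusive" method), and outliers are flagged with the common 1.5 × IQR rule.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    # Assumption: "inclusive" quartiles, i.e. linear interpolation between data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def find_outliers(values):
    """Flag values more than 1.5 * IQR beyond the quartiles (assumed convention)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]


def plot_box_plots(groups, names):
    """Draw one box per group, side by side, to compare groups."""
    # Imported here so the summary functions work without matplotlib installed.
    import matplotlib.pyplot as plt

    plt.boxplot(groups)
    plt.xticks(range(1, len(groups) + 1), names)
    plt.ylabel("Value")
    plt.show()


# Hypothetical measurements, invented for illustration only.
group_a = [12, 15, 17, 19, 20, 22, 23, 25, 28, 60]
group_b = [18, 21, 24, 26, 27, 29, 31, 33, 35, 38]

summary = five_number_summary(group_a)
outliers = find_outliers(group_a)
print("five-number summary:", summary)
print("outliers:", outliers)
```

Calling `plot_box_plots([group_a, group_b], ["A", "B"])` draws the two groups as parallel box plots, with the value 60 in group A shown as an individual outlier point.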
Notched box plots
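A notched box plot adds a notch around the median that approximates a confidence interval for it. When the notches of two boxes do not overlap, the medians are likely to differ. The sketch below is illustrative only: the data and helper names are hypothetical, and the notch width of 1.57 × IQR / √n is the commonly used approximation, assumed here rather than taken from the slides.

```python
import math
import statistics


def notch_interval(values):
    """Approximate interval around the median: median +/- 1.57 * IQR / sqrt(n)."""
    # Assumption: "inclusive" quartiles, i.e. linear interpolation between data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = 1.57 * (q3 - q1) / math.sqrt(len(values))
    return median - half_width, median + half_width


def plot_notched_box_plots(groups, names):
    """Draw side-by-side box plots with notches around each median."""
    # Imported here so notch_interval works without matplotlib installed.
    import matplotlib.pyplot as plt

    plt.boxplot(groups, notch=True)
    plt.xticks(range(1, len(groups) + 1), names)
    plt.ylabel("Value")
    plt.show()


# Hypothetical measurements, invented for illustration only.
group_a = [12, 15, 17, 19, 20, 22, 23, 25, 28, 30]
group_b = [30, 32, 35, 36, 38, 40, 41, 43, 45, 48]

low_a, high_a = notch_interval(group_a)
low_b, high_b = notch_interval(group_b)
print(f"group A notch: {low_a:.2f} to {high_a:.2f}")
print(f"group B notch: {low_b:.2f} to {high_b:.2f}")
```

Calling `plot_notched_box_plots([group_a, group_b], ["A", "B"])` draws both boxes with their notches for visual comparison.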