
Vishal Mishra (IBS, Hyderabad)

BASICS
Analytics, Data Analytics, Business Analytics

According to the Merriam-Webster dictionary, analytics “is the method of logical analysis”.

According to the Cambridge dictionary, analytics “is a process in which a computer examines information using mathematical methods in order to find useful patterns”.

According to INFORMS (an international association of operations research and analytics professionals), analytics is “the scientific process of transforming data into insights for the purpose of making better decisions”.

• Elements, Variables, and Observations

• Scales of Measurement

• Types of Data
- Qualitative and Quantitative Data
- Cross-Sectional and Time Series Data
- Primary and Secondary Data
- Sample and Population Data

• Scales of Measurement

Nominal
Ordinal
Interval
Ratio

• Statistics: Data Collection, Organization, Analysis, Interpretation and Presentation

• Descriptive Statistics

• Inferential Statistics

Descriptive Statistics

Graphical, Tabular, Numeric representation of Data

- Qualitative

- Quantitative

Graphical, Tabular, Numeric representation of Data

Qualitative Data

Bar Graph, Pie Chart

Frequency Distribution
Relative Frequency Distribution
Cross-tabulation

Graphical, Tabular, Numeric representation of Data

Quantitative Data

Dot Plot, Histogram, Ogive, Scatter Diagram

Frequency Distribution
Relative Frequency Distribution
Cumulative Frequency Distribution
Stem and Leaf Display
Cross-tabulation

Measures of location, Variability


Quantitative Data: Numeric Representation

Measures of location: Mean, Median, Mode

Measures of Variability:
Range
Inter-quartile Range
Standard Deviation
Variance
Coefficient of Variation

Qualitative Data: Example

S. No.  Colour    S. No.  Colour
1       Blue      16      Blue
2       Red       17      Blue
3       Blue      18      Green
4       Green     19      Red
5       Red       20      Blue
6       Blue      21      Red
7       Red       22      Red
8       Red       23      Green
9       Red       24      Green
10      Blue      25      Red
11      Red       26      Red
12      Red       27      Blue
13      Green     28      Blue
14      Green     29      Red
15      Blue      30      Red

Qualitative Data: Bar Graph

[Figure: bar graph “Colour Preference” showing the frequency of Blue, Red and Green]



Qualitative Data: Pie Chart

[Figure: pie chart “Colour Preference”]

Qualitative Data: Frequency Distribution

Colour   Frequency
Blue     10
Red      14
Green    6

Qualitative Data: Relative Frequency Distribution

Relative Frequency = Frequency / Total

Colour   Frequency   Relative Frequency
Blue     10          0.3333
Red      14          0.4667
Green    6           0.2000
Total    30          1.0000
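
The frequency and relative frequency columns can be reproduced with a short script. What follows is only a minimal sketch in Python, assuming the 30 colour responses from the example are typed in as a list; the names `colours`, `freq` and `rel_freq` are illustrative and do not come from the slides.

```python
from collections import Counter

# Colour responses for S. No. 1-30, copied from the example table.
colours = [
    "Blue", "Red", "Blue", "Green", "Red", "Blue", "Red", "Red", "Red", "Blue",
    "Red", "Red", "Green", "Green", "Blue", "Blue", "Blue", "Green", "Red", "Blue",
    "Red", "Red", "Green", "Green", "Red", "Red", "Blue", "Blue", "Red", "Red",
]

# Frequency: how many times each colour occurs.
freq = Counter(colours)
total = sum(freq.values())

# Relative frequency: frequency divided by the total number of observations.
rel_freq = {colour: count / total for colour, count in freq.items()}

for colour in ("Blue", "Red", "Green"):
    print(f"{colour:<6} {freq[colour]:>3} {rel_freq[colour]:.4f}")
```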

Qualitative Data: Example

S. No.  Job Type       Satisfaction    S. No.  Job Type       Satisfaction
1       S/W Engineer   Y               16      S/W Engineer   N
2       Carpenter      Y               17      S/W Engineer   N
3       S/W Engineer   N               18      Bank Cashier   N
4       Bank Cashier   Y               19      Carpenter      N
5       Carpenter      Y               20      S/W Engineer   Y
6       S/W Engineer   N               21      Carpenter      N
7       Carpenter      Y               22      Carpenter      Y
8       Carpenter      Y               23      Bank Cashier   N
9       Carpenter      Y               24      Bank Cashier   N
10      S/W Engineer   N               25      Carpenter      Y
11      Carpenter      N               26      Carpenter      Y
12      Carpenter      N               27      S/W Engineer   Y
13      Bank Cashier   Y               28      S/W Engineer   Y
14      Bank Cashier   Y               29      Carpenter      Y
15      S/W Engineer   N               30      Carpenter      Y

Qualitative Data: Cross-Tabulation

                  SATISFACTION
JOB               NO     YES
S/W Engineer      6      4
Bank Cashier      3      3
Carpenter         4      10
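
A cross-tabulation is a count of observations for every combination of the two variables. The sketch below, in Python, rebuilds the table from the 30 (job type, satisfaction) pairs of the example. The short codes `SW`, `BC` and `CP` and the variable names are assumptions made here for compactness, not notation from the slides.

```python
from collections import Counter

# Job type and satisfaction for S. No. 1-30, copied from the example table.
# SW = S/W Engineer, BC = Bank Cashier, CP = Carpenter (shorthand used only here).
jobs = ("SW CP SW BC CP SW CP CP CP SW CP CP BC BC SW "
        "SW SW BC CP SW CP CP BC BC CP CP SW SW CP CP").split()
satisfied = ("Y Y N Y Y N Y Y Y N N N Y Y N "
             "N N N N Y N Y N N Y Y Y Y Y Y").split()

# Count each (job, satisfaction) combination.
crosstab = Counter(zip(jobs, satisfied))

labels = {"SW": "S/W Engineer", "BC": "Bank Cashier", "CP": "Carpenter"}
print(f"{'JOB':<14} {'NO':>3} {'YES':>4}")
for code, name in labels.items():
    print(f"{name:<14} {crosstab[(code, 'N')]:>3} {crosstab[(code, 'Y')]:>4}")
```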

Quantitative Data: Example

S. No.  Marks    S. No.  Marks
1       19       16      47
2       33       17      10
3       22       18      35
4       32       19      12
5       33       20      15
6       34       21      27
7       38       22      19
8       27       23      45
9       27       24      32
10      26       25      14
11      34       26      40
12      35       27      31
13      25       28      17
14      44       29      20
15      26       30      36

Quantitative Data: Frequency Distribution

Number of classes: chosen by judgement, given the number of observations, e.g. for 30 observations we can decide on 5 classes.

Class Width = (Max. Value – Min. Value) / (No. of classes)

Here, (47 – 10)/5 = 7.4, which is rounded up to the next integer, 8.

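Quantitative Data: Frequency Distribution

Class Interval   Frequency
10 to 18         5
18 to 26         5
26 to 34         10
34 to 42         7
42 to 50         3

(Each class includes its lower limit and excludes its upper limit, so a mark of 26 is counted in the class 26 to 34.)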
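
The class-width calculation and the resulting frequency distribution can be checked with a few lines of code. This is a minimal Python sketch under two assumptions that match the table: the first class starts at the minimum value, and each class includes its lower limit but not its upper limit. The variable names are illustrative.

```python
import math

# Marks for S. No. 1-30, copied from the example table.
marks = [19, 33, 22, 32, 33, 34, 38, 27, 27, 26, 34, 35, 25, 44, 26,
         47, 10, 35, 12, 15, 27, 19, 45, 32, 14, 40, 31, 17, 20, 36]

num_classes = 5  # chosen by judgement, as in the text

# Class width = (max - min) / number of classes, rounded up to an integer.
width = math.ceil((max(marks) - min(marks)) / num_classes)

# Count the marks falling in each class [lo, hi).
lower = min(marks)
frequency = {}
for k in range(num_classes):
    lo, hi = lower + k * width, lower + (k + 1) * width
    frequency[(lo, hi)] = sum(lo <= m < hi for m in marks)

for (lo, hi), count in frequency.items():
    print(f"{lo} to {hi}: {count}")
```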

Quantitative Data: Relative Frequency Distribution

Class Interval   Frequency   Relative Frequency   Cumulative Relative Frequency
10 to 18         5           0.1667               (<18): 0.1667
18 to 26         5           0.1667               (<26): 0.3333
26 to 34         10          0.3333               (<34): 0.6667
34 to 42         7           0.2333               (<42): 0.9000
42 to 50         3           0.1000               (<50): 1.0000
Total            30          1.0000
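
Relative and cumulative relative frequencies follow directly from the class frequencies. A small Python sketch, assuming the five class frequencies are taken as given from the table (names are illustrative):

```python
from itertools import accumulate

class_limits = [(10, 18), (18, 26), (26, 34), (34, 42), (42, 50)]
frequency = [5, 5, 10, 7, 3]

total = sum(frequency)

# Relative frequency of a class = class frequency / total.
relative = [f / total for f in frequency]

# Cumulative relative frequency = running sum of the relative frequencies.
cumulative = list(accumulate(relative))

for (lo, hi), f, r, c in zip(class_limits, frequency, relative, cumulative):
    print(f"{lo} to {hi}: {f:>2}  {r:.4f}  (<{hi}): {c:.4f}")
```

The cumulative column is what the ogive plots against the upper class limits.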

Quantitative Data: Histogram

[Figure: histogram of Frequency by class interval (Less than 10, 10 to 18, 18 to 26, 26 to 34, 34 to 42, 42 to 50)]

Quantitative Data: Ogive

[Figure: ogive of Cumulative Frequency against the class upper limits (Less than 10, Less than 18, Less than 26, Less than 34, Less than 42, Less than 50)]

Quantitative Data: Numeric Summary

Measures of location OR central tendency: Mean, Median, Mode

Measures of Variability: Range, IQR, Standard Deviation (S.D), Variance, Coefficient of variation

Quantitative Data: Measures of Location OR Central Tendency

e.g. Two measures of central tendency for the dataset:


Mean: 28.5 (Symbol for population: µ; Symbol for sample: x̄)
Mode: 27
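
Both values can be verified with Python's standard `statistics` module. This is a quick check on the 30 marks of the example, not part of the original slides; `marks_mean` and `marks_mode` are illustrative names.

```python
from statistics import mean, mode

# Marks for S. No. 1-30, copied from the example table.
marks = [19, 33, 22, 32, 33, 34, 38, 27, 27, 26, 34, 35, 25, 44, 26,
         47, 10, 35, 12, 15, 27, 19, 45, 32, 14, 40, 31, 17, 20, 36]

marks_mean = mean(marks)   # sum of the values divided by their count
marks_mode = mode(marks)   # most frequently occurring value

print(marks_mean, marks_mode)
```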

Quartiles/Percentiles:

i = (p/100)*n

If i is an integer, then (with the data arranged in ascending order) the pth percentile is the mean of the values in the ith and (i+1)th positions. If i is a fraction, then the pth percentile is the value in the position obtained by rounding i up to the next integer.

The median is the 50th percentile and is denoted by Q2.



Quantitative Data: Ascending Order

S. No.  Marks    S. No.  Marks
1       10       16      31
2       12       17      32
3       14       18      32
4       15       19      33
5       17       20      33
6       19       21      34
7       19       22      34
8       20       23      35
9       22       24      35
10      25       25      36
11      26       26      38
12      26       27      40
13      27       28      44
14      27       29      45
15      27       30      47

Quantitative Data: Measures of Location OR Central Tendency

For Q2, the 50th percentile or median, i = (50/100)*30 = 15

Since i is an integer, Q2 is the mean of the observations in the ith and (i+1)th positions, i.e. the 15th and 16th positions (when the data is arranged in ascending order).

i.e. Q2 = (27+31)/2

Thus Q2 or Median is 29
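
The position rule i = (p/100)*n translates into a small function. The sketch below is one way to write it in Python, assuming 0 < p < 100 and data already sorted in ascending order; the function name `percentile` is a label chosen here, not a library routine.

```python
import math

def percentile(sorted_values, p):
    """pth percentile by the position rule i = (p/100) * n (positions count from 1)."""
    n = len(sorted_values)
    i = p * n / 100
    if i == int(i):
        i = int(i)
        # i is an integer: mean of the values in positions i and i + 1.
        return (sorted_values[i - 1] + sorted_values[i]) / 2
    # i is a fraction: round up and take the value in that position.
    return sorted_values[math.ceil(i) - 1]

# The 30 marks in ascending order, copied from the table.
marks_sorted = [10, 12, 14, 15, 17, 19, 19, 20, 22, 25, 26, 26, 27, 27, 27,
                31, 32, 32, 33, 33, 34, 34, 35, 35, 36, 38, 40, 44, 45, 47]

q2 = percentile(marks_sorted, 50)
print(q2)
```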

Quantitative Data: Measures of Variability

Range: Maximum Value – Minimum Value = ?

Inter Quartile Range: IQR = Q3 – Q1

Where Q1 is the first quartile (or 25th Percentile) and Q3 is the third quartile (or 75th Percentile)

Here Q1 = ? and Q3 = ?
Thus IQR = ?

Quantitative Data: Measures of Variability

Range: Maximum Value – Minimum Value = 47 – 10 = 37

Inter Quartile Range: IQR = Q3 – Q1

Where Q1 is the first quartile (or 25th Percentile) and Q3 is the third quartile (or 75th Percentile)

For Q1, i = (25/100)*30 = 7.5, rounded up to 8; for Q3, i = (75/100)*30 = 22.5, rounded up to 23.

Here Q1 (8th position) = 20 and Q3 (23rd position) = 35

Thus IQR = 35 – 20 = 15
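
The same arithmetic in Python, as a sketch. It handles only the case that arises here, where i = (p/100)*n is fractional for both quartiles and is therefore rounded up; variable names are illustrative.

```python
import math

# The 30 marks in ascending order, copied from the table.
marks_sorted = [10, 12, 14, 15, 17, 19, 19, 20, 22, 25, 26, 26, 27, 27, 27,
                31, 32, 32, 33, 33, 34, 34, 35, 35, 36, 38, 40, 44, 45, 47]
n = len(marks_sorted)

# Range = maximum value - minimum value.
data_range = marks_sorted[-1] - marks_sorted[0]

# Positions of the quartiles: i = (p/100) * n, rounded up because i is fractional.
pos_q1 = math.ceil(25 * n / 100)   # 7.5 rounds up to 8
pos_q3 = math.ceil(75 * n / 100)   # 22.5 rounds up to 23

q1 = marks_sorted[pos_q1 - 1]      # list indices start at 0
q3 = marks_sorted[pos_q3 - 1]
iqr = q3 - q1

print(data_range, q1, q3, iqr)
```

Library routines such as `statistics.quantiles` use interpolation conventions that can give slightly different quartiles from this position rule.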

Quantitative Data: Measures of Variability

Population Variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$ = ?

Population S.D = σ = ?

Sample Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$ = ?

Sample S.D = s = ?

Quantitative Data: Measures of Variability

(Note: data arranged in ascending order)

Population: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$

Sample: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

S. No.  Marks   Mean    (Marks - Mean)^2
1       10      28.5    342.25
2       12      28.5    272.25
3       14      28.5    210.25
4       15      28.5    182.25
5       17      28.5    132.25
6       19      28.5    90.25
7       19      28.5    90.25
8       20      28.5    72.25
9       22      28.5    42.25
10      25      28.5    12.25
11      26      28.5    6.25
12      26      28.5    6.25
…       …       …       …

Quantitative Data: Measures of Variability

Population Variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{2815.5}{30} = 93.85$

(Numerator, Sum of Squared Differences from Mean: $\sum (x_i - \mu)^2 = 2815.5$)

Population S.D, σ = 9.69

Sample Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{2815.5}{29} = 97.09$

Sample S.D, s = 9.85
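
These figures can be reproduced with Python's standard `statistics` module, which provides separate functions for the population (divide by N) and sample (divide by n – 1) versions. A minimal sketch on the 30 marks; the variable names are illustrative.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

# Marks for S. No. 1-30, copied from the example table.
marks = [19, 33, 22, 32, 33, 34, 38, 27, 27, 26, 34, 35, 25, 44, 26,
         47, 10, 35, 12, 15, 27, 19, 45, 32, 14, 40, 31, 17, 20, 36]

mu = mean(marks)

# Numerator shared by both formulas: sum of squared differences from the mean.
sum_sq = sum((x - mu) ** 2 for x in marks)

pop_var = pvariance(marks)      # sum_sq / N
pop_sd = pstdev(marks)          # square root of the population variance
sample_var = variance(marks)    # sum_sq / (n - 1)
sample_sd = stdev(marks)        # square root of the sample variance

print(sum_sq, round(pop_var, 2), round(pop_sd, 2),
      round(sample_var, 2), round(sample_sd, 2))
```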


Quantitative Data: Measures of Variability

Limitations of Variance, S.D

Two populations: Population 1, Population 2

S.D: 4 kg, 16 kg respectively


Quantitative Data: Measures of Variability

Concept of Coefficient of variation: Limitations of Variance, S.D

Coefficient of variation, Population = (σ/µ) * 100

Coefficient of variation, Sample = (s/x̄) * 100


Quantitative Data: Measures of Variability

Example: Samples from two populations: Human Beings, Blue Whales


Characteristic: Body Weight

Sample S.D: 4 kg, 16 kg respectively

Sample Mean: 65 kg, 100000 kg respectively

Sample Coefficient of variation = (s/x̄) * 100

Sample 1 C.V: (4/65)*100 = 6.15%


Sample 2 C.V: (16/100000)*100 = 0.016%
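
The comparison is easy to script. A short Python sketch of the coefficient-of-variation formula applied to the two samples; the function and variable names are illustrative.

```python
def coefficient_of_variation(sd, mean):
    """Coefficient of variation as a percentage: (sd / mean) * 100."""
    return sd / mean * 100

# Body weight samples from the example: (sample S.D, sample mean), both in kg.
cv_humans = coefficient_of_variation(4, 65)
cv_whales = coefficient_of_variation(16, 100_000)

print(f"Human beings: {cv_humans:.2f}%")
print(f"Blue whales:  {cv_whales:.3f}%")
```

Although the whale sample has the larger standard deviation in absolute terms, its variability relative to the mean is far smaller.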

Relationship Between Variables

Qualitative Data: Cross-tabulation (Tabular)

Quantitative Data:

Scatter Plot (Graphical)

Cross-tabulation (Tabular)

Covariance & Correlation (Numeric)



Relationship Between Variables

Quantitative Data: Scatter Plot (Graphical)

S. No.  Hours Studied  Marks Scored
1       4              4
2       8              9
3       6              8
4       2              6
5       9              10
6       7              6
7       15             20
8       3              5
9       12             18
10      1              3

Relationship Between Variables

Quantitative Data: Scatter Plot (Graphical)

[Figure: scatter plot of Marks Scored (vertical axis, 0 to 25) against Hours Studied (horizontal axis, 0 to 16)]

Relationship Between Variables

Quantitative Data: Cross-tabulation

S. No.  Marks Scored  Hours Spent on Social Media
1       9             8
2       10            8
3       11            7
4       11            6
5       17            6
6       19            7
7       19            2
8       20            5
9       22            5
10      25            4
11      26            5
12      26            6
13      27            4
14      28            2.5
15      29            4
16      31            2.5
17      32            2.5
18      32            2
19      33            2
20      33            2

Relationship Between Variables

Quantitative Data: Cross-tabulation

                  Hours on Social Media
Marks Scored      0 to 3    3 to 6    6 to 9
0 to 12           0         0         4
12 to 24          1         2         2
24 to 36          6         4         1
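
For quantitative data, each variable is first grouped into classes and the class pairs are then counted. The Python sketch below reproduces the table from the 20 observations, assuming each class includes its lower limit and excludes its upper limit (so 6 hours falls in 6 to 9). The helper `bin_label` is a name introduced here for illustration.

```python
from collections import Counter

# (marks scored, hours on social media) for S. No. 1-20, copied from the table.
marks = [9, 10, 11, 11, 17, 19, 19, 20, 22, 25,
         26, 26, 27, 28, 29, 31, 32, 32, 33, 33]
hours = [8, 8, 7, 6, 6, 7, 2, 5, 5, 4,
         5, 6, 4, 2.5, 4, 2.5, 2.5, 2, 2, 2]

def bin_label(value, width):
    """Label of the class [lo, lo + width) that contains value."""
    lo = int(value // width) * width
    return f"{lo} to {lo + width}"

# Count observations for every (marks class, hours class) pair.
table = Counter((bin_label(m, 12), bin_label(h, 3)) for m, h in zip(marks, hours))

row_labels = ["0 to 12", "12 to 24", "24 to 36"]
col_labels = ["0 to 3", "3 to 6", "6 to 9"]
for r in row_labels:
    print(r.ljust(10), *(str(table[(r, c)]).rjust(3) for c in col_labels))
```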

Relationship Between Variables

Quantitative Data:

Covariance: Degree of linear association

Sample: $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

Population: $\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$

Relationship Between Variables

Quantitative Data: Covariance

S. No.  Hours Studied (X)  Marks Scored (Y)  Xi - MeanX  Yi - MeanY  (Xi - MeanX) * (Yi - MeanY)
1       4                  4                 -2.7        -4.9        13.23
2       8                  9                 1.3         0.1         0.13
3       6                  8                 -0.7        -0.9        0.63
4       2                  6                 -4.7        -2.9        13.63
5       9                  10                2.3         1.1         2.53
6       7                  6                 0.3         -2.9        -0.87
7       15                 20                8.3         11.1        92.13
8       3                  5                 -3.7        -3.9        14.43
9       12                 18                5.3         9.1         48.23
10      1                  3                 -5.7        -5.9        33.63

Mean    6.7                8.9


Relationship Between Variables

Quantitative Data: Covariance

Numerator: Sum of the products of differences from the means: 217.7

Sample Covariance: 217.7/9 = 24.19

Population Covariance = 217.7/10 = 21.77
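
The covariance calculation can be checked step by step in Python. This is a minimal sketch on the ten (hours studied, marks scored) pairs; the variable names are illustrative.

```python
# Hours studied (X) and marks scored (Y) for S. No. 1-10, copied from the table.
hours = [4, 8, 6, 2, 9, 7, 15, 3, 12, 1]
marks = [4, 9, 8, 6, 10, 6, 20, 5, 18, 3]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(marks) / n

# Numerator: sum of the products of the differences from the two means.
sum_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, marks))

sample_cov = sum_products / (n - 1)   # divide by n - 1 for a sample
population_cov = sum_products / n     # divide by N for a population

print(round(sum_products, 2), round(sample_cov, 2), round(population_cov, 2))
```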


Relationship Between Variables

Quantitative Data:
Drawback of Covariance
Correlation - Degree of linear association

Sample: $r_{xy} = \frac{s_{xy}}{s_x s_y}$

Population: $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$

Relationship Between Variables

Quantitative Data:
Correlation - Degree of linear association
Population Variance of X = 18.01; Sample Variance of X = 20.01
Population Variance of Y = 29.89; Sample Variance of Y = 33.21

Sample: $r_{xy} = \frac{s_{xy}}{s_x s_y}$

Population: $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$
