
Name: Albert Davis

Roll No: 2027916

1) Nominal variable
Malaria dataset: number of cases in different countries, 2017

Country        No. of cases
Afghanistan    161778
Brazil         189503
Colombia       52805
Somalia        35138
Thailand       11440
Yemen          114004
Zimbabwe       315624
Source: https://www.kaggle.com/imdevskp/malaria-dataset

[Figure: "Malaria Dataset No. of cases from different countries of 2017"; categories: Afghanistan, Brazil, Colombia, Somalia, Thailand, Yemen, Zimbabwe]

Pie Chart
Bar Graph
The data show that, among the seven countries listed, Zimbabwe had the highest number of cases in 2017 (315624) and Thailand the lowest (11440). Possible reasons for the high count include increased rainfall, declining use and reduced insecticidal activity of long-lasting insecticide-treated nets, and drug shortages.
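As a rough illustration of what the pie chart encodes, the sketch below computes each country's share of the seven-country total and picks out the highest and lowest counts. The numbers are copied from the table above; the variable names (`cases`, `shares`) are only illustrative.

```python
# Malaria case counts for 2017, copied from the table in this section.
cases = {
    "Afghanistan": 161778,
    "Brazil": 189503,
    "Colombia": 52805,
    "Somalia": 35138,
    "Thailand": 11440,
    "Yemen": 114004,
    "Zimbabwe": 315624,
}

total = sum(cases.values())

# Each pie slice is a country's share of the total across the listed countries.
shares = {country: 100 * n / total for country, n in cases.items()}

# Nominal categories have no order, so we can only compare their counts.
highest = max(cases, key=cases.get)
lowest = min(cases, key=cases.get)

for country, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{country:<12} {cases[country]:>7} {share:5.1f}%")
print("Highest:", highest)
print("Lowest:", lowest)
```

The shares are relative to these seven countries only, not to worldwide malaria cases.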
2) Ordinal Data
Happiness report from 2015
Country        Extremely Happy   Happy   Neutral   Unhappy   Extremely Unhappy
Switzerland    45%               23%     4%        15%       13%
Iceland        56%               22%     8%        11%       3%
Denmark        65%               27%     0%        4%        4%
Norway         51%               24%     5%        18%       2%
Canada         27%               45%     5%        8%        15%
Finland        49%               19%     23%       0%        9%
Netherlands    59%               24%     5%        6%        6%
Sweden         35%               31%     24%       7%        3%
New Zealand    39%               45%     5%        6%        5%
Australia      48%               32%     10%       2%        8%
Source: https://www.kaggle.com/mathurinache/world-happiness-report
Bar Graph
From the bar graph we can see that Denmark has the highest percentage of Extremely Happy people (65%) and Norway has the highest percentage of Unhappy people (18%). Canada and New Zealand have the highest percentage of Happy people (45% each). Canada has the highest percentage of Extremely Unhappy people (15%), while Norway (2%), Sweden and Iceland (3% each) have the lowest. The bar graph therefore shows the happiness scale of the different countries and helps in comparing their levels of happiness.
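The comparisons read off the bar graph can be checked with a short sketch like the one below. It assumes the table columns run in the order Extremely Happy, Happy, Neutral, Unhappy, Extremely Unhappy; the helper name `top_countries` is hypothetical.

```python
# Response categories in their natural order (assumed column order of the table).
categories = ["Extremely Happy", "Happy", "Neutral", "Unhappy", "Extremely Unhappy"]

# Percentages copied from the table, one row per country.
survey = {
    "Switzerland": [45, 23, 4, 15, 13],
    "Iceland": [56, 22, 8, 11, 3],
    "Denmark": [65, 27, 0, 4, 4],
    "Norway": [51, 24, 5, 18, 2],
    "Canada": [27, 45, 5, 8, 15],
    "Finland": [49, 19, 23, 0, 9],
    "Netherlands": [59, 24, 5, 6, 6],
    "Sweden": [35, 31, 24, 7, 3],
    "New Zealand": [39, 45, 5, 6, 5],
    "Australia": [48, 32, 10, 2, 8],
}


def top_countries(category):
    """Return (highest percentage, countries reaching it) for one category."""
    i = categories.index(category)
    best = max(row[i] for row in survey.values())
    return best, [country for country, row in survey.items() if row[i] == best]


for category in categories:
    best, countries = top_countries(category)
    print(f"{category:<18} {best:>3}%  {', '.join(countries)}")
```

Returning a list of countries rather than a single name keeps ties visible, which matters here because two countries share the top value in one category.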
3) Interval Data
Students Performance in Maths Exams
Math Score (raw data, 26 students):
72, 69, 90, 47, 76, 71, 88, 40, 64, 38, 58, 40, 65,
78, 50, 69, 88, 18, 46, 54, 66, 65, 44, 69, 74, 73

Bin (upper limits): 10, 20, 30, 40, 50, 60, 70, 80, 90

Math Score    Frequency
0-10          0
10-20         1
20-30         0
30-40         3
40-50         4
50-60         2
60-70         7
70-80         6
80-90         3
More          0
Total         26
Source: https://www.kaggle.com/spscientist/dataset/?select=StudentsPerformance.csv

[Figure: "Distribution of Marks"; x-axis: Math Marks in bins 0-10 through 80-90 and More; y-axis ticks from 0 to 8]
Histogram

From the histogram above we can see that the largest number of students (7) scored in the 60-70 range, only one student scored below 30, and 3 students scored above 80 in the math exam. A histogram thus provides a visual interpretation of numerical data by showing the number of data points that fall within each specified range of values.
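The frequency table behind the histogram can be reproduced with a small sketch. It assumes each bin includes its upper limit (so a score of 40 falls in 30-40), which is the convention that matches the frequencies listed above; the function name `bin_frequencies` is hypothetical.

```python
from bisect import bisect_left

# The 26 math scores and the bin upper limits, copied from this section.
scores = [72, 69, 90, 47, 76, 71, 88, 40, 64, 38, 58, 40, 65,
          78, 50, 69, 88, 18, 46, 54, 66, 65, 44, 69, 74, 73]
upper_limits = [10, 20, 30, 40, 50, 60, 70, 80, 90]


def bin_frequencies(values, limits):
    """Count values per bin; each bin includes its upper limit.

    The extra last slot is the "More" bin for values above the final limit.
    """
    counts = [0] * (len(limits) + 1)
    for v in values:
        # bisect_left finds the first limit that is >= v.
        counts[bisect_left(limits, v)] += 1
    return counts


frequencies = bin_frequencies(scores, upper_limits)
labels = [f"{lo}-{hi}" for lo, hi in zip([0] + upper_limits, upper_limits)] + ["More"]

for label, count in zip(labels, frequencies):
    print(f"{label:<6} {count}")
print("Total ", sum(frequencies))
```

With the opposite convention (bins that include their lower limit instead), the two scores of 40 and the score of 50 would each move up one bin, so the choice should be stated alongside any histogram.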
4) Ratio Data
80 Cereals: nutrition data on 80 cereal products

Name of cereals                            calories   weight
Basic 4                                    130        1.33
Fruit & Fibre Dates; Walnuts; and Oats     120        1.25
Fruitful Bran                              110        1.33
Just Right Fruit & Nut                     140        1.3
Mueslix Crispy Blend                       160        1.5
Nutri-Grain Almond-Raisin                  140        1.33
Oatmeal Raisin Crisp                       130        1.25
Post Nat. Raisin Bran                      120        1.33
Puffed Rice                                50         0.5
Puffed Wheat                               60         0.5
Raisin Bran                                120        1.33
Shredded Wheat                             80         0.83
Total Raisin Bran                          140        1.5
Source: https://www.kaggle.com/crawford/80-cereals?select=cereal.csv

Scatter Diagram
From the above data we can see that the cereals differ in calories and weight. Mueslix Crispy Blend has the highest calories (160) and, together with Total Raisin Bran, the highest weight (1.5). Puffed Rice has the lowest calories (50) and, together with Puffed Wheat, the lowest weight (0.5). Scatter plots are used to observe relationships between variables: they provide a visual image of the data plotted as points, which helps show any patterns in the data.
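One way to put a number on the pattern a scatter diagram shows is the Pearson correlation coefficient. The sketch below computes it for the 13 cereals in the table; it is an added illustration rather than part of the original analysis, and the function name `pearson_r` is hypothetical.

```python
from math import sqrt

# (name, calories, weight) rows copied from the table in this section.
cereals = [
    ("Basic 4", 130, 1.33),
    ("Fruit & Fibre Dates; Walnuts; and Oats", 120, 1.25),
    ("Fruitful Bran", 110, 1.33),
    ("Just Right Fruit & Nut", 140, 1.3),
    ("Mueslix Crispy Blend", 160, 1.5),
    ("Nutri-Grain Almond-Raisin", 140, 1.33),
    ("Oatmeal Raisin Crisp", 130, 1.25),
    ("Post Nat. Raisin Bran", 120, 1.33),
    ("Puffed Rice", 50, 0.5),
    ("Puffed Wheat", 60, 0.5),
    ("Raisin Bran", 120, 1.33),
    ("Shredded Wheat", 80, 0.83),
    ("Total Raisin Bran", 140, 1.5),
]


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


calories = [c for _, c, _ in cereals]
weights = [w for _, _, w in cereals]
r = pearson_r(calories, weights)
print(f"Pearson r between calories and weight: {r:.2f}")
```

A value near +1 means the points lie close to an upward-sloping line, near -1 a downward-sloping line, and near 0 no linear pattern. Because both variables are ratio data, ratios such as calories per unit of weight are also meaningful.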

Conclusion
The diagrams and charts above show that we use many data visualization techniques to represent data both horizontally and vertically with plotted points, and each technique serves a specific purpose. A scatter plot contains points that are scattered all over the plotting area. These points are not connected by a line and do not represent any bar. Scatter plots allow us to show the relationship of one variable with another.
A variable has one of four different levels of measurement: Nominal, Ordinal, Interval, or Ratio. (Interval and Ratio levels of measurement are sometimes called Continuous or Scale.) It is important for the researcher to understand the different levels of measurement because these levels, together with how the research question is phrased, dictate which statistical analysis is appropriate for the values that were assigned.
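As a minimal sketch of how the level of measurement constrains the analysis, the hypothetical helper below picks a measure of central tendency by level: the mode for nominal data, the median for ordinal data, and the mean for interval or ratio data. This mapping is a common textbook convention, not a rule stated in the datasets above, and the sample values are made up for illustration.

```python
from statistics import mean, median_low, mode


def central_tendency(values, level, order=None):
    """Hypothetical helper: choose a summary that suits the measurement level."""
    if level == "nominal":
        # Categories can only be counted, so report the most frequent one.
        return mode(values)
    if level == "ordinal":
        # Categories can be ranked; `order` lists them from lowest to highest.
        ranks = [order.index(v) for v in values]
        # median_low always returns a rank that actually occurs in the data.
        return order[median_low(ranks)]
    if level in ("interval", "ratio"):
        # Differences are meaningful, so the arithmetic mean is defined.
        return mean(values)
    raise ValueError(f"unknown level of measurement: {level}")


# Illustrative values only, not taken from the datasets above.
print(central_tendency(["Brazil", "Yemen", "Brazil"], "nominal"))
print(central_tendency(["Happy", "Neutral", "Happy", "Unhappy"], "ordinal",
                       order=["Unhappy", "Neutral", "Happy"]))
print(central_tendency([72, 69, 90], "interval"))
```

The ordinal branch needs the explicit `order` argument because the labels themselves carry no numeric value; only their ranking is known.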
This research therefore helped me to identify and transform data at the four levels of measurement (Nominal, Ordinal, Interval and Ratio) and to apply data visualization using a bar chart, pie chart, histogram and scatter diagram.
