Data visualization
Introduction: Introduction, Importance, Applications
Data characteristics
Introduction
Visualization: the transformation of data or information into pictures. It engages the primary human sensory apparatus, vision.
Data visualization: the presentation of data in a pictorial or graphical format. Graphical presentation may entail manipulation of graphical entities (points, lines, shapes, images, text) and attributes (colour, size, position, shape).
A picture is worth more than a thousand words (a Chinese proverb).
A picture is worth more than a thousand numbers.
Importance
Third eye vision: Visualization helps people see things that were not obvious to them before. Even when the data volume is very large, patterns can be spotted quickly and easily.
Easy to share ideas: It lets people ask others, "Do you see what I see?" It can even answer questions like "What would happen if we made an adjustment to that area?"
Comparison: It facilitates the comparison of two or more datasets or frequency distributions.
A spreadsheet of data, such as an Excel or CSV file, should always be visualized. If not, it is a meal only partially cooked. The data is there, but it is not communicating anything. It is not engaging. CEOs and executives worldwide have limited time and limited attention. The presentation of data needs to be factual, simplified, and highly visual. Insights should pop out.
Simple Example:
[Figure: chart with axis labels "Value (m)" and "Staff"; tick marks run from 0 to 6 and from 0% to 40%]
Applications
Data visualization has broad application in statistical analysis, presentations, business intelligence and similar fields. Some specific areas where it is used are:
-- Exploratory data analysis
-- Data mining techniques
-- Business intelligence
-- Dashboards
-- Advertising campaigns
-- Political polling
-- Annual reports
-- Sports statistics
-- Real estate maps
-- Forecasting
-- Big data visualizations
Graphical information
Introduction
With the help of data visualization, we can infer various kinds of information from the data. A viewer should know the characteristics of both the data and the graph to make full use of data visualization techniques. This section lists some of the data characteristics used for data visualization. A few basic concepts can help us generate the best visuals for displaying our data:
-- Understand the data you are trying to visualize, including its size and cardinality (the uniqueness of data values in a column).
-- Determine what you are trying to visualize and what kind of information you want to communicate.
-- Know your audience and understand how it processes visual information.
-- Use a visual that conveys the information in the best and simplest form for your audience.
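Size and cardinality can be checked before choosing a chart. The following is a minimal sketch in base R; the data frame `survey` and its column names are made up for illustration and are not the table used in these slides.

```r
# Hypothetical survey-like data; column names and values are assumptions.
survey <- data.frame(
  gender = c("M", "F", "F", "M", "F", "M"),
  age    = c(21, 34, 27, 21, 45, 30),
  height = c(70, 64, 66, 72, 61, 68)
)

# Size: number of rows and columns.
n_rows <- nrow(survey)
n_cols <- ncol(survey)

# Cardinality: number of distinct values in each column.
cardinality <- sapply(survey, function(col) length(unique(col)))
print(cardinality)
```

A low-cardinality column such as `gender` suits a bar or pie chart, while a high-cardinality numeric column such as `height` is usually better served by a histogram or box plot.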
Data characteristics
Averages: mean, median, mode
Dispersion: how dispersed the data is from the average value
Skewness: lack of symmetry
Kurtosis: a measure of the peakedness (and tail weight) of a distribution
Discrete or continuous
Dense or sparse
Disjoint or Overlapping
Outliers
Dimension: 1-D, 2-D, 3-D or n-D
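Several of these characteristics can be computed directly. The sketch below uses base R only and a made-up sample; skewness and kurtosis are computed from central moments without bias correction, which is one common convention among several.

```r
x <- c(2, 3, 3, 4, 5, 6, 9, 25)   # made-up sample; 25 is a deliberately extreme value

avg <- mean(x)
med <- median(x)

# Base R has no statistical mode function, so take the most frequent value.
freq   <- table(x)
mode_x <- as.numeric(names(freq)[which.max(freq)])

spread <- sd(x)                    # dispersion around the mean

# Moment-based skewness and kurtosis (no bias correction).
m2 <- mean((x - avg)^2)
m3 <- mean((x - avg)^3)
m4 <- mean((x - avg)^4)
skewness <- m3 / m2^1.5            # > 0 means a longer right tail
kurtosis <- m4 / m2^2              # about 3 for a normal distribution
```

Here the mean is pulled well above the median by the extreme value, which is the kind of asymmetry a positive skewness indicates.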
Example: Generate and plot 1000 normal random variables with mean 3 and variance 2.25, both in Excel and R.
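One possible way to do the R part of this example is sketched below. Note that `rnorm()` takes the standard deviation, so a variance of 2.25 becomes `sd = 1.5`. The seed, the number of bins and the colours are arbitrary choices. In Excel, a formula such as `=NORM.INV(RAND(), 3, 1.5)` filled down 1000 rows is one way to obtain comparable draws.

```r
set.seed(1)                                  # arbitrary seed, only for reproducibility
x <- rnorm(1000, mean = 3, sd = sqrt(2.25))  # sd = 1.5

# Histogram on the density scale, with the estimated and theoretical densities overlaid.
hist(x, breaks = 30, freq = FALSE,
     main = "1000 draws from N(3, 2.25)", xlab = "x")
lines(density(x), col = "blue")              # kernel density estimate
curve(dnorm(x, mean = 3, sd = 1.5), add = TRUE, col = "red", lty = 2)
```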
Curves
This is the simplest and most widely used technique of data visualization. The data points are connected with straight lines or smooth curves (line charts or smooth curves).
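A minimal base R sketch of both variants follows, using made-up values. The smooth curve here is an interpolating spline, which is only one of several ways to draw a smooth line through points.

```r
x <- 1:10
y <- c(3, 5, 4, 6, 8, 7, 9, 12, 11, 14)   # made-up values

# Line chart: points joined by straight line segments.
plot(x, y, type = "b", main = "Line chart", xlab = "x", ylab = "y")

# Smooth curve: an interpolating spline evaluated at 100 points.
smooth <- spline(x, y, n = 100)
lines(smooth, col = "blue")
```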
Bar chart
A graphical representation of a frequency distribution using bars of different heights. It is generally used for categorical variables.
Example: Create a frequency distribution of gender in the table survey and make a bar chart as well as a line chart of the distribution in Excel. Import the table into R and perform the same activity in R.
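The R part of this example might look like the sketch below. The survey table itself is not reproduced in these slides, so a small made-up `gender` vector stands in for it; with the real table one would first import it, for example with `read.csv()` on an exported CSV file (the file name would be specific to your setup).

```r
# Stand-in for the gender column of the survey table (hypothetical values).
gender <- c("Male", "Female", "Female", "Male", "Female", "Male", "Female")

# Frequency distribution.
gender_freq <- table(gender)
print(gender_freq)

# Bar chart of the distribution.
barplot(gender_freq, main = "Gender distribution",
        xlab = "Gender", ylab = "Frequency")

# Line chart of the same frequencies, with category labels on the x-axis.
plot(as.numeric(gender_freq), type = "b", xaxt = "n",
     main = "Gender distribution", xlab = "Gender", ylab = "Frequency")
axis(1, at = seq_along(gender_freq), labels = names(gender_freq))
```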
Histogram
A diagram consisting of rectangles whose area is proportional to the frequency of a continuous variable and whose width is equal to the class interval (bin). It helps to assess:
-- Symmetry or skewness
-- Unimodality, bimodality or multimodality
-- Presence of outliers
The main difference between a bar graph and a histogram is that the width of a histogram bar represents a range of values, so the bars need not all be the same width, whereas the bars of a bar graph must all have the same width. For small data sets, histograms can be misleading: small changes in the data or in the choice of bins can change the picture noticeably. For large data sets, histograms can be quite effective at illustrating general properties of the distribution.
Example: Create a frequency distribution of gender by age group. Take the age groups as 16-25, 26-35. Create a histogram in Excel and R.
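A possible R sketch for this example is shown below, with made-up ages and genders standing in for the survey table. The break points 15, 25, 35 are an assumption chosen so that the right-closed intervals (15, 25] and (25, 35] correspond to the groups 16-25 and 26-35 for whole-number ages.

```r
# Hypothetical values standing in for the survey columns.
age    <- c(17, 19, 22, 24, 25, 26, 28, 31, 33, 35)
gender <- c("M", "F", "F", "M", "F", "M", "M", "F", "M", "F")

# Bin ages into the two groups; intervals are (15, 25] and (25, 35].
breaks    <- c(15, 25, 35)
age_group <- cut(age, breaks = breaks, labels = c("16-25", "26-35"))

# Frequency distribution of gender by age group.
gender_by_age <- table(gender, age_group)
print(gender_by_age)

# Histogram of age using the same class intervals.
h <- hist(age, breaks = breaks, main = "Age distribution", xlab = "Age")
```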
Pie chart
A special chart that uses "pie slices" to show relative sizes of data. It uses different colours and shades to show different variables.
Example: Construct an Excel pie chart of the frequency distribution of height, taking a class interval of 5 as 56-60, 61-65, 66-70, 71-75, 76-80, 81-85. Also label the percentage contribution of each group to the total.
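The example asks for Excel, but the same grouping and percentage labelling can be sketched in R for comparison. The heights below are made up, and the break points 55, 60, ..., 85 are an assumption that maps whole-number heights onto the stated class intervals.

```r
# Hypothetical heights standing in for the survey column.
height <- c(58, 62, 63, 66, 67, 68, 69, 70, 72, 74, 78, 83)

# Class intervals of width 5: (55, 60], (60, 65], ..., (80, 85].
breaks <- seq(55, 85, by = 5)
groups <- c("56-60", "61-65", "66-70", "71-75", "76-80", "81-85")
height_group <- cut(height, breaks = breaks, labels = groups)

# Frequency of each group and its percentage share of the total.
freq <- table(height_group)
pct  <- round(100 * freq / sum(freq), 1)

# Pie chart with each slice labelled by group and percentage.
pie(freq, labels = paste0(names(freq), " (", pct, "%)"),
    main = "Height distribution")
```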
Scatter plot
A graph of plotted points that shows the relationship between two sets of data, e.g. y-axis = response, x-axis = suspected indicator. It is useful to answer:
-- Are x and y related? (linear, quadratic, other)
-- Are outliers present?
Example: Create a flag (1 for male, 0 for female) and construct a scatter plot between:
1. Gender and height
2. Age and play
Try to interpret the result. Tools: Excel and R.
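The first of the two plots might be sketched in R as follows; the second follows the same pattern with the other pair of columns. The data frame and its values are made up, since the survey table is not reproduced here.

```r
# Hypothetical stand-in for the survey table.
survey <- data.frame(
  gender = c("M", "F", "M", "F", "M", "F"),
  height = c(71, 63, 69, 65, 73, 62)
)

# Flag: 1 for male, 0 for female.
survey$flag <- ifelse(survey$gender == "M", 1, 0)

# Scatter plot of height against the gender flag.
plot(survey$flag, survey$height,
     main = "Height by gender flag",
     xlab = "Gender flag (1 = male, 0 = female)", ylab = "Height")

# Correlation between the flag and height as a numeric summary of the plot.
r <- cor(survey$flag, survey$height)
```

Because the flag takes only two values, the points fall in two vertical strips; a positive `r` in this made-up data simply reflects that the male heights are larger on average.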
Box and whisker plot
A box and whisker plot shows the five-number summary (Min, Q1, Q2, Q3, Max) of the data using a box and whiskers.
By default, the box plot in R treats data points outside the range (Q1 - 1.5*IQR, Q3 + 1.5*IQR) as outliers. Outliers are drawn as individual points beyond the whiskers.
Example: Create a box plot for the variables age, height and shoe in Excel and R.
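The outlier rule can be checked by hand and compared with what `boxplot()` reports. The sketch below uses made-up ages for one variable only; the other variables would be handled the same way. One caveat worth knowing: `boxplot()` builds its box from the hinges returned by `fivenum()`, which can differ slightly from the quartiles returned by `quantile()`.

```r
age <- c(18, 19, 20, 21, 21, 22, 23, 24, 25, 60)   # made-up ages; 60 is deliberately extreme

# Five-number summary: min, lower hinge, median, upper hinge, max.
five <- fivenum(age)

# Outlier fences computed by hand from the quartiles.
q     <- unname(quantile(age, c(0.25, 0.75)))
iqr   <- q[2] - q[1]
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr
outliers <- age[age < lower | age > upper]

# Box plot; the returned list records the points drawn as outliers in $out.
bp <- boxplot(age, main = "Age", ylab = "Age")
```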
Some R code for plotting
par()   # set the graphical parameters; see help(par)
plot(density(variable), main = title, sub = subtitle,
     xlab = x_axis_title, ylab = y_axis_title,
     col = color_density, col.main = color1, col.sub = color, ...)   # density plot; see help(plot)
barplot(table(variable), names.arg = c(variable_name, ...))   # bar plot; see help(barplot)
boxplot(data, ...)   # box plot; see help(boxplot)
Charts in Excel
Thermometer chart: A thermometer chart shows how much of a goal has been achieved. To create a thermometer chart, execute the following steps.
1. Select a cell (here B16). The selected cell should not be connected to other cells.
2. On the Insert tab, click Column and then click Clustered Column.
The result will be:
Trend-line chart: We can create a two-axis chart on any data. Go to the Insert tab and then select the Line chart option. Let us take the following data set:
Suppose we want to show the percentage of total sales on each day. This can be done in the following manner.
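The underlying calculation is each day's sales divided by the total. The sketch below shows it in R with made-up daily figures, since the data set is not reproduced here. In Excel the equivalent would be a formula of the form `=B2/SUM($B$2:$B$6)` filled down, where the cell range is an assumption about the sheet layout.

```r
# Made-up daily sales; names and values are assumptions for illustration.
sales <- c(Mon = 120, Tue = 80, Wed = 100, Thu = 60, Fri = 40)

# Percentage of total sales contributed by each day.
pct_of_total <- 100 * sales / sum(sales)
print(pct_of_total)

# Line chart of the percentages, with day names on the x-axis.
plot(pct_of_total, type = "b", xaxt = "n",
     main = "Share of total sales by day", xlab = "Day", ylab = "% of total sales")
axis(1, at = seq_along(pct_of_total), labels = names(pct_of_total))
```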
Selecting the data: Once the chart is inserted, right-click on the chart and then choose the Select Data option. We can select the data in the following manner.
The result will be:
Two-axis chart: If, for the same data, we also want to show the amount of sales on each day, we may use a bar chart. The result will be:
To create an exploded pie chart, go to the Insert tab, select the Pie chart option and then choose the Exploded Pie chart.
Let us take the following data set that shows the percentage of expenditure incurred by a person on different items like food, clothing, etc.
ITEM: Food, Clothing, Electricity, Rent
30%, 10%, 20%