Data Analyst

© Copyright FHM.AVI – Data 1


Table of Contents

- Data Analyst
  - How to Become a Data Analyst?
  - Data analysis
  - Data Visualization
  - Data Mining
  - Data Insight
  - Tableau Tutorial

Data Analyst
How to Become a Data Analyst?

Who is a Data Analyst?


Nowadays, companies receive a tremendous amount of information every
day that can be used to optimize their strategies. To get insights from the
massive data collected, they need a highly qualified professional: The Data
Analyst.
The task of a Data Analyst is to process the varied data concerning the
customers, the products, or the performance of the company, in order to
produce indicators useful to decision-makers. The information provided by
the data analyst thus enables companies to define the products to offer
customers according to their needs, the marketing strategy to adopt, or
the improvements to be made to the production process.

Data Analyst Qualifications


Becoming a data analyst requires both academic qualifications and
skills. Let us look at these categories in detail below.


Academic Qualifications
It is recommended that you graduate from a data analysis program with a high
GPA, as this makes it easier to land an entry-level data analysis job. Even if
you don't have a specialization in data analysis, a degree in mathematics,
statistics, or economics from a well-reputed university can also land you an
entry-level data analysis job.
Most entry-level data analyst jobs require at least a bachelor's degree.
Higher-level data analyst jobs usually pay more and may require a master's or
doctoral degree; a master's degree in Data Science or Business Analytics is
particularly helpful.
Apart from a degree, you can also enroll in online courses; the path you take
to become qualified can vary.

Skills
Skills of the candidate can be further classified into three categories, let us discuss
them in detail:

Technical Skills
Programming Languages: As a data analyst, you should be proficient in at least
one programming language. However, the more languages you are proficient in, the
better it is. Popular programming languages that can be used to manipulate data are
R, Python, C++, Java, MATLAB, PHP, and more.
Data Management and Manipulation: As a data analyst, you should be familiar
with languages, such as R, HIVE, SQL, and more. Building queries to extract the
desired data is an essential aspect of data analysis. Once you have analyzed the data,
you would have to create accurate reports. Some standard tools for doing the same
are SAS, Oracle Visual Analyzer, Microsoft Power BI, Cognos, Tableau, and more.
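As a sketch of the query-building step described above, the following uses Python's built-in sqlite3 module; the sales table, its columns, and the figures are invented for illustration.

```python
# Build and run an aggregation query against a small in-memory database.
# The "sales" table and its data are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("North", 120.0), ("South", 80.0), ("North", 200.0)])

# Extract the desired data: total sales per region, largest first
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY 2 DESC"
).fetchall()
print(rows)  # [('North', 320.0), ('South', 80.0)]
```

The same pattern applies to any SQL backend: the analyst expresses the desired slice of data as a query rather than filtering it by hand.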


Soft Skills
Domain Knowledge and Excellent Communication Skills: A data analyst’s job is
to provide detailed and accurate information to the decision-makers. Hence, data
analysts must understand the specific user requirements, along with having a deep
understanding of the data. Excellent communication skills are essential for
collaborating with clients, executives, and IT specialists to ensure that the
data aligns well with the business objectives. Ultimately, the analysis done by
a data analyst modifies or improves business processes.

Practical Skills
High Level of Mathematical Ability: Knowledge of statistics and comfort with
formulae are required for analyzing data to provide real-world value. As a
data analyst, you should have a good grasp of mathematics and be able to
solve common business problems, for example, calculating compound interest,
depreciation, and statistical measures (such as mean, median, and mode). You
should also know how to use tables, charts, graphs, and more, and be
comfortable with college-level algebra, which makes visualization of data
more appealing. Knowing linear algebra and multivariate calculus is also very
helpful for data analysts, as both are used extensively in data analysis.
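The everyday calculations mentioned above can be sketched with Python's standard library; the figures are invented.

```python
# Compound interest and basic statistical measures, using only the stdlib.
from statistics import mean, median, mode

# Compound interest: value after n years at annual rate r
principal, rate, years = 1000.0, 0.05, 3
amount = principal * (1 + rate) ** years
print(round(amount, 2))  # ≈ 1157.63

# Mean, median, and mode of a small invented sales series
monthly_sales = [12, 15, 15, 20, 38]
print(mean(monthly_sales), median(monthly_sales), mode(monthly_sales))  # 20 15 15
```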
Microsoft Excel: Organizing data and calculating numbers are among the main
tasks of data analysts. Hence it is beneficial if you are comfortable with using Excel.
There are many great online sources where you can learn how to use Excel to its full
potential.

Data Analyst Career Path


Skilled data analysts are in demand in almost every sector, so it is no
surprise that demand for data analysts is predicted to grow by 19% over the
next seven years. Data analysis is considered a crucial skill, and learning it
can help professionals in many fields excel in their careers. Some industries
where the demand for data analysts is quite high are as follows:


• Market Research: 72% of marketers consider data analysis to be vital for
thriving in the present marketing landscape. The success of marketing
campaigns can be assessed using data analysis, and companies can also use it
for market research before launching a new product or service.

• Finance and Investments: Financial institutions generally require entry-level
data analysts as well as expert ones. At many financial institutions, such as
investment banks, the most common career path for data analysts leads into
management. If you stand out among your peer group, senior management may
consider you for promotion as someone who could manage new hires well.

• Sales: Companies hold a great deal of data related to sales of products and
services; analyzing it helps increase sales and customer satisfaction and
identify potential sales barriers. Hence, a requirement for data analysts
arises in this sector as well.
An entry-level data analyst makes a handsome salary, and the range of the
salary depends on his/her expertise and skill set. The skills required of a
fresher may vary across the industry.
For example, the typical job of a Data Analyst is to run queries against the available
data for finding the important trends and processing the data that might be of use to
Data Scientists. Data Analysts are generally very good at database query
languages, for example, SQL. They may also write scripts and produce visuals
from the data available to them for better understanding.
A Data Scientist, on the other hand, builds models using Machine Learning. These
models are used to make several predictions and can also explain the future of the
organization. Data Scientists work closely with Data Analysts while preparing the
data to be used for the machine learning models. However, the salaries of Data
Scientists are much higher than those of Data Analysts because of very high demand
and low supply.
Many Data Analysts gain relevant skills and go on to become Data Scientists;
the transition is not very difficult since they already have much of the
relevant background.


The designation of a Data Analyst depends on the company he/she works for.
However, generally, the technical work of the Data Analysts keeps on decreasing,
and the managerial work keeps on increasing as they climb up the corporate ladder.
After a certain point, the promotion starts to depend on the leadership and managerial
skills. Hence, Data Analysts need to work on their soft skills as well.

How to Become a Data Analyst?


To become a data analyst, you must first earn a Bachelor’s degree, which is a
requirement for most of the entry-level data analyst positions. The relevant
disciplines include Finance, Economics, Mathematics, Statistics, Computer Science,
and Information Management.
Considering that you don’t have any prior work experience as a data analyst, the
most important task is to gain relevant work experience. As with a majority of
professions, work experience is invaluable for a data analyst too. Fortunately,
because of the massive demand for data analysts, there are many data analysis
internship opportunities. You can work as an intern, which would help you gain the
relevant work experience and also add a star to your resume.
Data analysis deals with understanding changing trends and technologies, which
makes it essential for a data analyst to commit himself/herself to lifelong learning.
You can take up MOOCs to ensure that you keep learning new things relevant to
data analysis, which helps you stay ahead of the curve.

How to Become a Data Analyst with No Experience?
If you plan to switch to being a data analyst but have no experience in the
industry, you can start with an online course or degree in data analysis. The
course would build a strong foundation in the subject while also allowing you
to work on practical projects and develop your skills. Moving on, you can take
an internship or pick up some freelance work to gain experience and add to
your profile; this way you will stand out and have an edge when you start
looking for a high-profile job as a data analyst.


Data analysis
Introduction
Data analysis is a process of inspecting, cleansing, transforming and modeling data
with the goal of discovering useful information, informing conclusions, and
supporting decision-making. Data analysis has multiple facets and approaches,
encompassing diverse techniques under a variety of names, and is used in different
business, science, and social science domains. In today's business world, data
analysis plays a role in making decisions more scientific and helping businesses
operate more effectively.
Data mining is a particular data analysis technique that focuses on statistical
modeling and knowledge discovery for predictive rather than purely descriptive
purposes, while business intelligence covers data analysis that relies heavily on
aggregation, focusing mainly on business information. In statistical applications,
data analysis can be divided into descriptive statistics, exploratory data analysis
(EDA), and confirmatory data analysis (CDA). EDA focuses on discovering new
features in the data while CDA focuses on confirming or falsifying existing
hypotheses. Predictive analytics focuses on the application of statistical models for
predictive forecasting or classification, while text analytics applies statistical,
linguistic, and structural techniques to extract and classify information from textual
sources, a species of unstructured data. All of the above are varieties of data analysis.

Data integration is a precursor to data analysis, and data analysis is closely linked to
data visualization and data dissemination.
The process of data analysis


[Figure: Data science process flowchart from Doing Data Science, by Schutt & O'Neil (2013)]
Analysis refers to dividing a whole into its separate components for individual
examination. Data analysis is a process for obtaining raw data and subsequently
converting it into information useful for decision-making by users. Data is
collected and analyzed to answer questions, test hypotheses, or disprove
theories.
There are several phases that can be distinguished, described below. The phases are
iterative, in that feedback from later phases may result in additional work in earlier
phases. The CRISP-DM framework, used in data mining, has similar steps.

Data requirements
The data necessary as inputs to the analysis are specified based upon the
requirements of those directing the analysis or customers (who will use the
finished product of the analysis). The general type of entity upon which the
data will be collected is referred to as an experimental unit (e.g., a person
or population of people). Specific variables regarding a population (e.g., age
and income) may be specified and obtained. Data may be numerical or
categorical (i.e., a text label for numbers).


Data collection
Data is collected from a variety of sources. The requirements may be communicated
by analysts to custodians of the data, such as Information Technology personnel
within an organization. The data may also be collected from sensors in the
environment, including traffic cameras, satellites, recording devices, etc. It may also
be obtained through interviews, downloads from online sources, or reading
documentation.

Data processing
The phases of the intelligence cycle used to convert raw information into actionable
intelligence or knowledge are conceptually similar to the phases in data analysis.
Data, when initially obtained, must be processed or organized for analysis. For
instance, this may involve placing data into rows and columns in a table format
(known as structured data) for further analysis, often through the use of spreadsheet
or statistical software.
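The structuring step above can be sketched with Python's built-in csv module; the file contents are invented.

```python
# Organize raw delimited text into rows and columns (structured data).
import csv
import io

raw = "age,income\n34,52000\n29,48000\n41,61000\n"
rows = list(csv.DictReader(io.StringIO(raw)))
print(rows[0])  # {'age': '34', 'income': '52000'}

# Convert the text fields to numbers for further analysis
incomes = [int(r["income"]) for r in rows]
print(incomes)  # [52000, 48000, 61000]
```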

Data cleaning
Once processed and organized, the data may be incomplete, contain duplicates, or
contain errors. The need for data cleaning will arise from problems in the way that
the data are entered and stored. Data cleaning is the process of preventing and
correcting these errors. Common tasks include record matching, identifying
inaccuracy of data, overall quality of existing data, deduplication, and column
segmentation. Such data problems can also be identified through a variety of
analytical techniques. For example, with financial information, the totals for
particular variables may be compared against separately published numbers that are
believed to be reliable. Unusual amounts, above or below predetermined thresholds,
may also be reviewed. There are several types of data cleaning, depending upon
the type of data in the set; this could be phone numbers, email addresses,
employers, or other values. Quantitative data methods for outlier detection can
be used to remove data that appear to have a higher likelihood of being input
incorrectly. Spell checkers can be used on textual data to lessen the amount of
mistyped words. However, it is harder to tell if the words themselves are
correct.
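Two of the cleaning tasks above, deduplication and threshold review, can be sketched as follows; the records and the threshold are invented.

```python
# Deduplicate records and flag amounts outside a predetermined threshold.
records = [
    ("alice@example.com", 120.0),
    ("bob@example.com", 95.0),
    ("alice@example.com", 120.0),    # exact duplicate
    ("carol@example.com", 9999.0),   # suspiciously large amount
]

# Remove exact duplicates while preserving order
deduped = list(dict.fromkeys(records))

# Flag unusual amounts above a predetermined threshold for manual review
flagged = [r for r in deduped if r[1] > 1000.0]
print(len(deduped), flagged)  # 3 [('carol@example.com', 9999.0)]
```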


Exploratory data analysis


Once the datasets are cleaned, they can be analyzed. Analysts may apply a variety
of techniques, referred to as exploratory data analysis, to begin understanding the
messages contained within the obtained data. The process of data exploration may
result in additional data cleaning or additional requests for data; thus, the
initialization of the iterative phases mentioned in the lead paragraph of this section.
Descriptive statistics, such as the average or median, can be generated to aid in
understanding the data. Data visualization is also a technique used, in which the
analyst is able to examine the data in a graphical format in order to obtain additional
insights, regarding the messages within the data.

Modeling and algorithms


Mathematical formulas or models (known as algorithms) may be applied to the data
in order to identify relationships among the variables; for example, using correlation
or causation. In general terms, models may be developed to evaluate a specific
variable based on other variable(s) contained within the dataset, with some residual
error depending on the implemented model's accuracy (e.g., Data = Model + Error).
Inferential statistics includes utilizing techniques that measure the relationships
between particular variables. For example, regression analysis may be used to model
whether a change in advertising (independent variable X) provides an explanation
for the variation in sales (dependent variable Y). In mathematical terms, Y (sales) is
a function of X (advertising). It may be described as (Y = aX + b + error), where the
model is designed such that (a) and (b) minimize the error when the model predicts
Y for a given range of values of X. Analysts may also attempt to build models that
are descriptive of the data, in an aim to simplify analysis and communicate results.
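The regression example above (Y = aX + b + error) can be sketched by computing the least-squares estimates of a and b directly; the advertising and sales figures are invented.

```python
# Fit Y = aX + b by ordinary least squares, choosing a and b to minimize error.
from statistics import mean

x = [1.0, 2.0, 3.0, 4.0]   # advertising spend (hypothetical)
y = [3.1, 4.9, 7.2, 8.8]   # sales (hypothetical)

xbar, ybar = mean(x), mean(y)
num = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
den = sum((xi - xbar) ** 2 for xi in x)
a = num / den              # slope: change in Y per unit of X
b = ybar - a * xbar        # intercept
print(round(a, 2), round(b, 2))  # 1.94 1.15
```

With the fitted a and b, predicted sales for any advertising level X are simply a*X + b; the residuals y - (a*x + b) are the model's error term.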

Data product
A data product is a computer application that takes data inputs and generates outputs,
feeding them back into the environment. It may be based on a model or algorithm.
For instance, an application that analyzes data about customer purchase history
and uses the results to recommend other purchases the customer might enjoy is a
data product.


Communication
Data visualization is used to help understand the results after data is analyzed. Once
data is analyzed, it may be reported in many formats to the users of the analysis to
support their requirements. The users may have feedback, which results in additional
analysis. As such, much of the analytical cycle is iterative.
When determining how to communicate the results, the analyst may consider
implementing a variety of data visualization techniques to help communicate the
message more clearly and efficiently to the audience. Data visualization uses
information displays (graphics such as tables and charts) to help communicate
key messages contained in the data. Tables are a valuable tool because they
enable a user to query and focus on specific numbers, while charts (e.g., bar
charts or line charts) may help explain the quantitative messages contained in
the data.

Quantitative messages
There are eight types of quantitative messages that users may attempt to
understand or communicate from a set of data, each with an associated graph
that helps communicate the message during the course of analysis.
1. Time-series: A single variable is captured over a period of time, such as the
unemployment rate over a 10-year period. A line chart may be used to
demonstrate the trend.
2. Ranking: Categorical subdivisions are ranked in ascending or descending
order, such as a ranking of sales performance (the measure) by salespersons
(the category, with each salesperson a categorical subdivision) during a single
period. A bar chart may be used to show the comparison across the
salespersons.
3. Part-to-whole: Categorical subdivisions are measured as a ratio to the whole
(i.e., a percentage out of 100%). A pie chart or bar chart can show the
comparison of ratios, such as the market share represented by competitors in
a market.
4. Deviation: Categorical subdivisions are compared against a reference, such
as a comparison of actual vs. budget expenses for several departments of a

business for a given time period. A bar chart can show the comparison of the
actual versus the reference amount.
5. Frequency distribution: Shows the number of observations of a particular
variable for a given interval, such as the number of years in which the stock
market return is between intervals such as 0–10%, 11–20%, etc. A histogram,
a type of bar chart, may be used for this analysis.
6. Correlation: Comparison between observations represented by two variables
(X, Y) to determine if they tend to move in the same or opposite directions.
For example, plotting unemployment (X) and inflation (Y) for a sample of
months. A scatter plot is typically used for this message.
7. Nominal comparison: Comparing categorical subdivisions in no particular
order, such as the sales volume by product code. A bar chart may be used for
this comparison.
8. Geographic or geospatial: Comparison of a variable across a map or layout,
such as the unemployment rate by state or the number of persons on the
various floors of a building. A cartogram is a typical graphic used.
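As an illustration of message type 5 (frequency distribution), the following buckets yearly stock-market returns into 10-point intervals, as in the example above; the return figures are invented.

```python
# Count observations per interval, the tabular form behind a histogram.
from collections import Counter

returns_pct = [3, 7, 12, 18, 22, 8, 15, 31, 5, 11]  # hypothetical yearly returns

# Bucket each value into a 10-point interval: 0-9, 10-19, 20-29, ...
bins = Counter((r // 10) * 10 for r in returns_pct)
for low in sorted(bins):
    print(f"{low}-{low + 9}%: {bins[low]}")
# 0-9%: 4
# 10-19%: 4
# 20-29%: 1
# 30-39%: 1
```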

Techniques for analyzing quantitative data


A recommended series of best practices for understanding quantitative data
includes:
- Check raw data for anomalies prior to performing an analysis;
- Re-perform important calculations, such as verifying columns of data that are
formula driven;
- Confirm main totals are the sum of subtotals;
- Check relationships between numbers that should be related in a predictable
way, such as ratios over time;
- Normalize numbers to make comparisons easier, such as analyzing amounts
per person or relative to GDP or as an index value relative to a base year;
- Break problems into component parts by analyzing factors that led to the
results, such as DuPont analysis of return on equity.
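The normalization practice above can be sketched as follows: expressing a series as an index relative to a base year and as an amount per person. The revenue and population figures are invented.

```python
# Normalize a revenue series two ways: index vs. base year, and per person.
revenue = {2019: 50.0, 2020: 55.0, 2021: 66.0}     # hypothetical, in millions
population = {2019: 1.0, 2020: 1.1, 2021: 1.2}     # hypothetical, in millions

base = revenue[2019]
index = {yr: round(100 * v / base, 1) for yr, v in revenue.items()}
per_person = {yr: round(revenue[yr] / population[yr], 1) for yr in revenue}
print(index)       # {2019: 100.0, 2020: 110.0, 2021: 132.0}
print(per_person)  # {2019: 50.0, 2020: 50.0, 2021: 55.0}
```

The index shows that raw revenue grew 32% from the base year, while the per-person view shows most of the apparent 2020 growth was population growth.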
For the variables under examination, analysts typically obtain descriptive statistics
for them, such as the mean (average), median, and standard deviation. They may
also analyze the distribution of the key variables to see how the individual values
cluster around the mean.
The consultants at McKinsey and Company named a technique for breaking a
quantitative problem down into its component parts called the MECE principle. Each
layer can be broken down into its components; each of the sub-components must be
mutually exclusive of each other and collectively add up to the layer above them.
The relationship is referred to as "Mutually Exclusive and Collectively Exhaustive"
or MECE. For example, profit by definition can be broken down into total revenue
and total cost. In turn, total revenue can be analyzed by its components, such as the
revenue of divisions A, B, and C (which are mutually exclusive of each other) and
should add to the total revenue (collectively exhaustive).
Analysts may use robust statistical measurements to solve certain analytical
problems. Hypothesis testing is used when a particular hypothesis about the true state
of affairs is made by the analyst and data is gathered to determine whether that state
of affairs is true or false. For example, the hypothesis might be that "Unemployment
has no effect on inflation", which relates to an economics concept called the Phillips
Curve. Hypothesis testing involves considering the likelihood of Type I and Type II
errors, which relate to whether the data supports accepting or rejecting the
hypothesis.
Regression analysis may be used when the analyst is trying to determine the extent
to which independent variable X affects dependent variable Y (e.g., "To what extent
do changes in the unemployment rate (X) affect the inflation rate (Y)?"). This is an
attempt to model or fit an equation line or curve to the data, such that Y is a function
of X.
Necessary condition analysis (NCA) may be used when the analyst is trying to
determine the extent to which independent variable X allows variable Y (e.g., "To
what extent is a certain unemployment rate (X) necessary for a certain inflation rate
(Y)?"). Whereas (multiple) regression analysis uses additive logic where each X-
variable can produce the outcome and the X's can compensate for each other (they
are sufficient but not necessary), necessary condition analysis (NCA) uses necessity
logic, where one or more X-variables allow the outcome to exist, but may not
produce it (they are necessary but not sufficient). Each single necessary condition
must be present and compensation is not possible.


Analytical activities of data users


Users may have particular data points of interest within a data set, as opposed to the
general messaging outlined above. Such low-level user analytic activities are
presented in the following table. The taxonomy can also be organized by three poles
of activities: retrieving values, finding data points, and arranging data points.

Barriers to effective analysis


Barriers to effective analysis may exist among the analysts performing the data
analysis or among the audience. Distinguishing fact from opinion, cognitive biases,
and innumeracy are all challenges to sound data analysis.

Confusing fact and opinion


Effective analysis requires obtaining relevant facts to answer questions, support a
conclusion or formal opinion, or test hypotheses. Facts by definition are irrefutable,
meaning that any person involved in the analysis should be able to agree upon them.
For example, in August 2010, the Congressional Budget Office (CBO) estimated
that extending the Bush tax cuts of 2001 and 2003 for the 2011–2020 time period
would add approximately $3.3 trillion to the national debt. Everyone should be able
to agree that indeed this is what CBO reported; they can all examine the report. This
makes it a fact. Whether persons agree or disagree with the CBO is their own
opinion.
As another example, the auditor of a public company must arrive at a formal opinion
on whether financial statements of publicly traded corporations are "fairly stated, in
all material respects". This requires extensive analysis of factual data and evidence
to support their opinion. When making the leap from facts to opinions, there is
always the possibility that the opinion is erroneous.

Cognitive biases
There are a variety of cognitive biases that can adversely affect analysis. For
example, confirmation bias is the tendency to search for or interpret information in
a way that confirms one's preconceptions. In addition, individuals may discredit
information that does not support their views.


Innumeracy
Effective analysts are generally adept with a variety of numerical techniques.
However, audiences may not have such literacy with numbers or numeracy; they are
said to be innumerate. Persons communicating the data may also be attempting to
mislead or misinform, deliberately using bad numerical techniques.
For example, whether a number is rising or falling may not be the key factor. More
important may be the number relative to another number, such as the size of
government revenue or spending relative to the size of the economy (GDP) or the
amount of cost relative to revenue in corporate financial statements. This numerical
technique is referred to as normalization or common-sizing. There are many such
techniques employed by analysts, whether adjusting for inflation (i.e., comparing
real vs. nominal data) or considering population increases, demographics, etc.
Analysts apply a variety of techniques to address the various quantitative messages
described in the section above.
Analysts may also analyze data under different assumptions or scenarios. For
example, when analysts perform financial statement analysis, they will often recast
the financial statements under different assumptions to help arrive at an estimate of
future cash flow, which they then discount to present value based on some interest
rate, to determine the valuation of the company or its stock. Similarly, the CBO
analyzes the effects of various policy options on the government's revenue, outlays
and deficits, creating alternative future scenarios for key measures.

Considerations for the Analysis Process


Initial data analysis
The most important distinction between the initial data analysis phase and the
main analysis phase is that during initial data analysis one refrains from any
analysis aimed at answering the original research question. The initial data
analysis phase is guided by the following four questions:


Quality of data
The quality of the data should be checked as early as possible. Data quality
can be assessed in several ways, using different types of analysis: frequency
counts, descriptive statistics (mean, standard deviation, median), normality
(skewness, kurtosis, frequency histograms), and analysis of missing
observations to determine whether imputation is needed.
- Analysis of extreme observations: outlying observations in the data are
analyzed to see if they seem to disturb the distribution.
- Comparison and correction of differences in coding schemes: variables are
compared with coding schemes of variables external to the data set, and
possibly corrected if coding schemes are not comparable.
- Test for common-method variance.
The choice of analyses to assess the data quality during the initial data analysis phase
depends on the analyses that will be conducted in the main analysis phase.

Quality of measurements
The quality of the measurement instruments should only be checked during the
initial data analysis phase when this is not the focus or research question of
the study. One should check whether the structure of the measurement
instruments corresponds to the structure reported in the literature.
There are two ways to assess measurement quality:
- Confirmatory factor analysis
- Analysis of homogeneity (internal consistency), which gives an indication of
the reliability of a measurement instrument. During this analysis, one inspects
the variances of the items and the scales, the Cronbach's α of the scales, and
the change in the Cronbach's alpha when an item would be deleted from a
scale
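The internal-consistency check above can be sketched with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of totals); the item scores are invented.

```python
# Cronbach's alpha for a 3-item scale, from population variances.
from statistics import pvariance

items = [      # one row per respondent, one column per item (hypothetical)
    [4, 5, 4],
    [3, 4, 3],
    [5, 5, 4],
    [2, 3, 2],
]
k = len(items[0])
item_vars = [pvariance(col) for col in zip(*items)]      # variance of each item
total_var = pvariance([sum(row) for row in items])       # variance of scale totals
alpha = k / (k - 1) * (1 - sum(item_vars) / total_var)
print(round(alpha, 3))  # 0.975
```

Repeating the calculation with one item left out shows how the alpha would change if that item were deleted from the scale.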

Initial transformations
After assessing the quality of the data and of the measurements, one might decide to
impute missing data, or to perform initial transformations of one or more variables,
although this can also be done during the main analysis phase.


Possible transformations of variables are:


- Square root transformation (if the distribution differs moderately from
normal)
- Log-transformation (if the distribution differs substantially from normal)
- Inverse transformation (if the distribution differs severely from normal)
- Make categorical (ordinal / dichotomous) (if the distribution differs severely
from normal, and no transformations help)
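The three transformations listed above can be sketched with Python's math module; the values are invented and right-skewed for illustration.

```python
# Square root, log, and inverse transformations of a skewed variable.
import math

x = [1.0, 2.0, 4.0, 8.0, 64.0]           # hypothetical right-skewed values

sqrt_x = [math.sqrt(v) for v in x]       # for moderate departure from normal
log_x = [math.log(v) for v in x]         # for substantial departure
inv_x = [1.0 / v for v in x]             # for severe departure

print([round(v, 2) for v in log_x])  # [0.0, 0.69, 1.39, 2.08, 4.16]
```

Note that the log and inverse transformations require strictly positive values; zeros or negatives must be handled before transforming.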

Intentions of the research design


One should check the success of the randomization procedure, for instance by
checking whether background and substantive variables are equally distributed
within and across groups. If the study did not need or use a randomization procedure,
one should check the success of the non-random sampling, for instance by checking
whether all subgroups of the population of interest are represented in the sample.
Other possible data distortions that should be checked are:
- Dropout (this should be identified during the initial data analysis phase)
- Item non-response (whether this is random or not should be assessed during
the initial data analysis phase)
- Treatment quality (using manipulation checks).

Characteristics of data sample


In any report or article, the structure of the sample must be accurately described. It
is especially important to exactly determine the structure of the sample (and
specifically the size of the subgroups) when subgroup analyses will be performed
during the main analysis phase.
The characteristics of the data sample can be assessed by looking at:
- Basic statistics of important variables
- Scatter plots
- Correlations and associations
- Cross-tabulations


Final stage of the initial data analysis


During the final stage, the findings of the initial data analysis are documented, and
necessary, preferable, and possible corrective actions are taken. Also, the original
plan for the main data analyses can and should be specified in more detail or
rewritten.
In order to do this, several decisions about the main data analyses can and should be
made:
- In the case of non-normals: should one transform variables; make variables
categorical (ordinal/dichotomous); adapt the analysis method?
- In the case of missing data: should one neglect or impute the missing data;
which imputation technique should be used?
- In the case of outliers: should one use robust analysis techniques?
- In case items do not fit the scale: should one adapt the measurement
instrument by omitting items, or rather ensure comparability with other (uses
of the) measurement instrument(s)?
- In the case of (too) small subgroups: should one drop the hypothesis about
inter-group differences, or use small sample techniques, like exact tests or
bootstrapping?
- In case the randomization procedure seems to be defective: can and should
one calculate propensity scores and include them as covariates in the main
analyses?

Analysis
Several analyses can be used during the initial data analysis phase:
- Univariate statistics (single variable)
- Bivariate associations (correlations)
- Graphical techniques (scatter plots)
It is important to take the measurement levels of the variables into account for the
analyses, as special statistical techniques are available for each level:
• Nominal and ordinal variables


o Frequency counts (numbers and percentages)


o Associations
▪ cross-tabulations
▪ hierarchical loglinear analysis (restricted to a maximum of 8 variables)
▪ loglinear analysis (to identify relevant/important variables and possible
confounders)
o Exact tests or bootstrapping (in case subgroups are small)
o Computation of new variables
• Continuous variables
o Distribution
▪ Statistics (M, SD, variance, skewness, kurtosis)
▪ Stem-and-leaf displays
▪ Box plots
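A minimal Python sketch of these univariate statistics, using only the standard library and made-up data, covering frequency counts for a nominal variable and distribution statistics for a continuous one:

```python
from collections import Counter
import statistics

# Hypothetical survey data: a nominal variable and a continuous one
colors = ["red", "blue", "red", "green", "blue", "red"]
ages = [23, 27, 31, 35, 29, 41, 38]

# Frequency counts: numbers and percentages
counts = Counter(colors)
percentages = {k: 100 * v / len(colors) for k, v in counts.items()}

# Distribution statistics (M, SD, variance) for a continuous variable
mean = statistics.mean(ages)
sd = statistics.stdev(ages)
variance = statistics.variance(ages)
```

Skewness and kurtosis, as well as stem-and-leaf displays and box plots, would typically come from a statistics or plotting library rather than being hand-rolled.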

Nonlinear analysis
Nonlinear analysis is often necessary when the data is recorded from a nonlinear
system. Nonlinear systems can exhibit complex dynamic effects including
bifurcations, chaos, harmonics and subharmonics that cannot be analyzed using
simple linear methods. Nonlinear data analysis is closely related to nonlinear system
identification.

Main data analysis


In the main analysis phase, analyses aimed at answering the research question are
performed as well as any other relevant analysis needed to write the first draft of the
research report.

Exploratory and confirmatory approaches


In the main analysis phase, either an exploratory or confirmatory approach can be
adopted. Usually the approach is decided before data is collected. In an exploratory
analysis no clear hypothesis is stated before analyzing the data, and the data is


searched for models that describe the data well. In a confirmatory analysis clear
hypotheses about the data are tested.
Exploratory data analysis should be interpreted carefully. When testing multiple
models at once, there is a high chance of finding at least one of them to be significant,
but this can be due to a type 1 error. It is important to always adjust the significance
level when testing multiple models with, for example, a Bonferroni correction. Also,
one should not follow up an exploratory analysis with a confirmatory analysis in the
same dataset. An exploratory analysis is used to find ideas for a theory, but not to
test that theory as well. When a model is found through exploratory analysis in a dataset, then
following up that analysis with a confirmatory analysis in the same dataset could
simply mean that the results of the confirmatory analysis are due to the same type 1
error that resulted in the exploratory model in the first place. The confirmatory
analysis therefore will not be more informative than the original exploratory
analysis.
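The multiple-testing risk and the Bonferroni correction mentioned above can be made concrete with a short, hypothetical calculation:

```python
# Testing 10 candidate models at alpha = 0.05 gives a high chance of at
# least one false positive (a type 1 error), assuming independent tests:
alpha = 0.05
n_tests = 10
p_any_false_positive = 1 - (1 - alpha) ** n_tests  # roughly 0.40

# Bonferroni correction: divide the significance level by the number of tests
alpha_corrected = alpha / n_tests
```

With the corrected level of 0.005 per test, the family-wise error rate stays at or below the original 5%.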

Stability of results
It is important to obtain some indication about how generalizable the results are.
While this is often difficult to check, one can look at the stability of the results. Are
the results reliable and reproducible? There are two main ways of doing that.
- Cross-validation. By splitting the data into multiple parts, we can check if an
analysis (like a fitted model) based on one part of the data generalizes to
another part of the data as well. Cross-validation is generally inappropriate,
though, if there are correlations within the data, e.g. with panel data. Hence
other methods of validation sometimes need to be used. For more on this
topic, see statistical model validation.
- Sensitivity analysis. A procedure to study the behavior of a system or model
when global parameters are (systematically) varied. One way to do that is via
bootstrapping.
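Bootstrapping can be sketched in plain Python; the sample values below are invented for illustration. Resampling with replacement and recomputing the statistic each time shows how stable the result is:

```python
import random
import statistics

random.seed(42)  # reproducible resampling

# Hypothetical observations
sample = [12, 15, 14, 10, 18, 20, 11, 16, 13, 17]

# Resample with replacement many times, recomputing the statistic of
# interest (here the mean) each time
boot_means = []
for _ in range(1000):
    resample = [random.choice(sample) for _ in sample]
    boot_means.append(statistics.mean(resample))

boot_sorted = sorted(boot_means)
lo, hi = boot_sorted[25], boot_sorted[975]  # rough 95% interval
```

A narrow interval suggests the result is stable; a wide one suggests the conclusion is sensitive to the particular sample drawn.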


Possible Career Path

To forge a career in data analytics, it’s important to think about the bigger picture.
What happens once you qualify as a data analyst? What is the typical career path you
can expect to follow? Is there one? Some of the most common data analyst career
paths are mapped out as shown in the diagram above with description following this
section.
Your data analyst career path starts with learning the necessary skills. If you’re a
complete beginner coming from an unrelated background, you’ll need to get to grips
with the entire data analysis process—from preparing and analyzing raw data, to
creating visualizations and sharing your insights. You’ll also need to develop
database querying skills with SQL, learn the fundamentals of Python (the go-to
language used by analysts), and grasp key concepts such as data mining and ethics.
At the same time, you’ll need to be proficient in the essential industry tools such as
Excel and Tableau.
Once you’ve acquired the necessary skills, you can think about applying for your
first data analyst job. At this stage, it’s important to market yourself as a data


analyst—updating your online profiles, writing a resumé geared towards data
analytics roles, and building a professional data analyst portfolio.
This is a process that requires a good amount of time and commitment—especially
if you’re starting from scratch. For a structured, guided approach to learning all the
necessary skills, consider a dedicated course. A data analytics certification is a great
(and highly respectable) alternative to a university degree, and will tell employers
that you’ve gone through rigorous training.

Data Analyst
As a newly qualified analyst, you can expect to start in a very hands-on role—as a
junior analyst or, quite simply, a data analyst. You’ll be responsible for extracting
data, cleaning it, performing all the analyses, and sharing your findings. You’ll work
closely with business stakeholders and use your insights to guide them in their
decisions.
So what determines whether you start out as a junior analyst or go straight in for the
data analyst job title? It all depends on your previous experience and the company
hiring you. Generally speaking, you can expect to start off with a junior role if you
don’t have any prior experience using analytical skills. If you’ve got some
transferable experience from your previous career or studies, you’ll likely be
considered for a data analyst position. There’s no hard-and-fast rule on this one,
though; it varies greatly across industries and organizations.
The great thing about data analytics is that it relies on a broad range of skills which
are often transferable from other professions—such as good communication and an
aptitude for problem solving. Even if you’ve never worked as a data analyst, you’ll
likely see some of your existing skills and qualities reflected in data analyst job
descriptions. To get a feel for the kinds of jobs you’d be qualified for after
completing your data analyst program, browse sites like Indeed and LinkedIn for
both junior data analyst and data analyst roles, and see what the general requirements
are. Another option is to consider an internship. We show you how to land a data
analyst internship here.
Whichever job title you end up with, your first role should give you plenty of hands-
on experience with all aspects of the data analysis process.


Climbing the ladder to a mid-level or senior data analyst position
As with many professions, the typical next step in the data analyst career path is to
progress to a more senior position. How quickly you climb the ladder will vary
depending on the size of the company and whether you’re progressing within your
current organization or applying for a new role. It’s important to remember that there
is no one-size-fits-all when it comes to the data analyst career path—we can map out
the typical route, but different sectors and organizations will offer different
opportunities.
Still, once you’ve gained one or two years’ experience as a data analyst, you can
start to think about your next move. Typically, more experienced analysts will work
as senior data analysts or analytics managers. Such roles will see you taking
ownership of the data processes within your organization, and potentially managing
a team of analysts.
Your next steps will also depend on your interests and the industry you choose to
work in. Instead of going down the management route, you may choose to specialize
as an analyst in a certain field. We’ll take a look at specialist data analyst career paths
next.

Specialist data analyst career paths


Some data analysts will progress to senior management positions, stepping away
from the frontline to focus more on overseeing the company’s overall data strategy
and managing other analysts. Others will take the specialist route, honing their
expertise in a specific field—such as healthcare, finance, or machine learning. Data
analysts are in demand across a whole host of industries, so you can follow a career
path that combines your analytical skills with a particular area of interest. If you do,
you could end up with a specialist job title, such as:

• Financial analyst
• Healthcare analyst
• Machine learning analyst


• Social data analyst


• Insurance underwriting analyst
• Digital marketing analyst
• Systems analyst
Another popular route for data analysts is to eventually move into a data scientist
role. We’ll consider how you can make this transition in the next section.

From data analyst to data scientist


Although the terms are often used interchangeably, data analytics and data science
constitute two distinct career paths. While data analysts seek to address specific
questions and challenges, often looking at static data from the past, data scientists
focus on optimizing the overall functioning of the business, using data to predict
future outcomes. This is a very pared-down comparison; for a full explanation of the
differences between a data analyst and a data scientist, take a look at this guide.
The transition from data analyst to data scientist is not strictly linear, but if you do
like the idea of moving into a data science role, your data analysis skills will serve
as a good foundation. Typically, data analysts looking to become data scientists will
focus on expanding their skillset to include more complex concepts such as data
modeling, machine learning, building algorithms, and more advanced knowledge of
programming languages such as Python and R.
Just like data analysts, data scientists work across a whole range of industries. If your
career path takes you down the science route, you could eventually end up working
as a senior data scientist, a machine learning engineer, or even occupying a C-suite
role such as chief data officer.

Working as a data analytics consultant


After many years in the industry — at least six or seven — many data analysts will
go on to become data analytics consultants. A data analytics consultant essentially
carries out the same work as a data analyst, but for a variety of different clients rather
than one company. They can work for consulting firms, but many opt for the self-
employed route. So, if you’re wondering whether your data analyst career path could
eventually lead you to a more flexible career, the answer is yes! However, this is


something you can realistically consider much further down the line; for the first few
years of your career, it’s important to gather as much hands-on experience as
possible and to hone your skills in a number of different roles. That way, you’ll be
better equipped to work with a variety of clients within different contexts.

Key takeaways
There is no one-size-fits-all approach when it comes to forging your data analytics
career path. You can choose to specialize and continue adding more complex skills
to your repertoire, or you can become a business and strategy all-star—or a
combination of the two! Once you’ve mastered the fundamentals of data analysis,
you can design a career that speaks to both your interests and your talents. Still, every
data analyst career path starts in the same place: learning the key tools, skills, and
processes, and building a professional portfolio.

Types of data analysis

1. Descriptive analytics: What happened?


Descriptive analytics describes what has happened. The aim is solely to provide an
easily digestible snapshot.
Take the COVID-19 statistics page on Google, for example: the line graph is simply a
summary of the cases and deaths, a presentation and description of how many people
in a particular country have been infected by the virus.


There are two main techniques used in descriptive analytics: Data aggregation and
data mining. Data aggregation is the process of gathering data and presenting it in a
summarized format. Let’s imagine an ecommerce company collects all kinds of data
relating to their customers and people who visit their website. The aggregate data, or
summarized data, would provide an overview of this wider dataset—such as the
average customer age, for example, or the average number of purchases made.
Data mining is the analysis part. This is when the analyst explores the data in order
to uncover any patterns or trends. The outcome of descriptive analysis is a visual
representation of the data—as a bar graph, for example, or a pie chart.
So: Descriptive analytics condenses large volumes of data into a clear, simple
overview of what has happened. This is often the starting point for more in-depth
analysis.
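Data aggregation as described above might look like the following sketch; the customer records are made up, and a real project would more likely use a tool such as SQL or pandas:

```python
# Hypothetical e-commerce customer records
customers = [
    {"age": 25, "purchases": 3},
    {"age": 34, "purchases": 5},
    {"age": 29, "purchases": 2},
    {"age": 42, "purchases": 6},
]

# Aggregate the wider dataset into a summarized overview
avg_age = sum(c["age"] for c in customers) / len(customers)
avg_purchases = sum(c["purchases"] for c in customers) / len(customers)
```

These two averages are exactly the kind of "snapshot" figures a descriptive dashboard would display.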

2. Diagnostic analytics: Why did it happen?


Diagnostic analytics seeks to delve deeper in order to understand why something
happened. For example: If your descriptive analysis shows that there was a 20% drop
in sales for the month of March, you’ll want to find out why. The next logical step
is to perform a diagnostic analysis.
In order to get to the root cause, the analyst will start by identifying any additional
data sources that might offer further insight into why the drop in sales occurred. They
might drill down to find that, despite a healthy volume of website visitors and


a good number of “add to cart” actions, very few customers proceeded to actually
check out and make a purchase. Upon further inspection, it comes to light that the
majority of customers abandoned ship at the point of filling out their delivery
address. With a little bit of digging, you’re closer to finding an explanation for your
data anomaly.
When running diagnostic analytics, there are a number of different techniques that
you might employ, such as probability theory, regression analysis, filtering, and
time-series analysis.
So: While descriptive analytics looks at what happened, diagnostic analytics
explores why it happened.
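The drill-down described above can be sketched as a simple funnel calculation; the step names and counts here are hypothetical:

```python
# Hypothetical funnel counts for the month of March
funnel = {
    "visits": 10000,
    "add_to_cart": 2400,
    "checkout_started": 2100,
    "address_completed": 300,
    "purchase": 280,
}

# Drop-off rate between each consecutive step
steps = list(funnel)
drop_off = {}
for prev, cur in zip(steps, steps[1:]):
    drop_off[cur] = 1 - funnel[cur] / funnel[prev]

# The largest drop-off points at the step to investigate further
worst_step = max(drop_off, key=drop_off.get)
```

In this invented dataset the worst drop-off sits at the delivery-address step, mirroring the scenario in the text.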

3. Predictive analytics: What is likely to happen in the future?
Predictive analytics seeks to predict what is likely to happen in the future. Based on
past patterns and trends, data analysts can devise predictive models which estimate
the likelihood of a future event or outcome. This is especially useful as it enables
businesses to plan ahead.
Predictive models use the relationship between a set of variables to make
predictions; for example, you might use the correlation between seasonality and sales
figures to predict when sales are likely to drop. If your predictive model tells you
that sales are likely to go down in summer, you might use this information to come
up with a summer-related promotional campaign, or to decrease expenditure
elsewhere to make up for the seasonal dip. Perhaps you own a restaurant and want
to predict how many takeaway orders you’re likely to get on a typical Saturday night.
Based on what your predictive model tells you, you might decide to get an extra
delivery driver on hand.
A credit card company might use a predictive model, and specifically logistic
regression, to predict whether or not a given customer will default on their
payments—in other words, to classify them in one of two categories: “will default”
or “will not default”. Based on these predictions of what category the customer will
fall into, the company can quickly assess who might be a good candidate for a credit
card.
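The classification step could be sketched as follows. The coefficients are invented for illustration and stand in for a logistic model fitted elsewhere on historical payment data:

```python
import math

# Hypothetical coefficients from a previously fitted logistic model;
# features: credit utilization (0-1) and number of late payments
intercept, b_util, b_late = -3.0, 2.5, 0.8

def default_probability(utilization, late_payments):
    z = intercept + b_util * utilization + b_late * late_payments
    return 1 / (1 + math.exp(-z))  # sigmoid maps the score into (0, 1)

def classify(utilization, late_payments, threshold=0.5):
    p = default_probability(utilization, late_payments)
    return "will default" if p >= threshold else "will not default"
```

A customer with low utilization and no late payments lands in the "will not default" category, while a heavily utilized account with several late payments crosses the threshold.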


Machine learning is a branch of predictive analytics.


So: Predictive analytics builds on what happened in the past and why to predict what
is likely to happen in the future.

4. Prescriptive analytics: What’s the best course of action?
Prescriptive analytics looks at what has happened, why it happened, and what might
happen in order to determine what should be done next. In other words, prescriptive
analytics shows you how you can best take advantage of the future outcomes that
have been predicted. What steps can you take to avoid a future problem? What can
you do to capitalize on an emerging trend?
Prescriptive analytics is, without doubt, the most complex type of analysis, involving
algorithms, machine learning, statistical methods, and computational modeling
procedures. Essentially, a prescriptive model considers all the possible decision
patterns or pathways a company might take, and their likely outcomes. This enables
you to see how each combination of conditions and decisions might impact the
future, and allows you to measure the impact a certain decision might have. Based
on all the possible scenarios and potential outcomes, the company can decide what
is the best “route” or action to take.
An oft-cited example of prescriptive analytics in action is maps and traffic apps.
When figuring out the best way to get you from A to B, Google Maps will consider
all the possible modes of transport (e.g. bus, walking, or driving), the current traffic
conditions and possible roadworks in order to calculate the best route. In much the
same way, prescriptive models are used to calculate all the possible “routes” a
company might take to reach their goals in order to determine the best possible
option. Knowing what actions to take for the best chances of success is a major
advantage for any type of organization, so it’s no wonder that prescriptive analytics
has a huge role to play in business.
As another example, in healthcare services, you can better supervise the patient
population using prescriptive analytics to estimate the number of patients who are
suffering from obesity. Then, with the help of filtering factors such as diabetes,


you can make up your mind on where to focus the treatment. A similar prescriptive
pattern could be applied to almost any target group.
So: Prescriptive analytics looks at what has happened, why it happened, and what
might happen in order to determine the best course of action for the future.

5. Time Series Analysis


In practically every scientific area, measurements are conducted over time. These
observations lead to an organized collection of data known as a time series.
A well-known example of a time series is the daily value of a stock market
index. This information can be used to create prediction models that are
able to predict future changes.
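A common first step with a time series is smoothing it with a moving average to reveal the trend behind day-to-day noise; the closing values below are invented:

```python
# Hypothetical daily closing values of a stock market index
closes = [100, 102, 101, 105, 107, 106, 110]

# A 3-day moving average smooths short-term noise
window = 3
moving_avg = [
    sum(closes[i - window + 1 : i + 1]) / window
    for i in range(window - 1, len(closes))
]
```

Forecasting models (e.g. exponential smoothing or ARIMA) build on exactly this kind of structure over time.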

6. Exploratory Data Analysis


EDA explores data and finds relationships between variables which were
previously unknown.
There is a whole data science process with several stages.


Once a set of data is cleaned, it can be analyzed. Analysts may apply exploratory
data analysis as one of many techniques for analyzing data. They do it to better
understand the messages enclosed in the data. The procedure of exploration may
result in additional data cleaning or further requests for data, so this process may be
repetitive in nature.
One example of EDA on climate change is taking the rise in temperature over the
years, say 1950 to 2020, alongside the increase in human activity and
industrialization, and forming relationships from the data: e.g. an increasing
number of factories, cars on the road, and airplane flights correlates with rising
temperatures.
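Forming such a relationship usually starts with a correlation coefficient. Here is a small sketch computing Pearson's r on made-up figures:

```python
# Hypothetical yearly figures: factories (thousands) vs mean temperature (°C)
factories = [1.0, 1.4, 2.1, 2.9, 3.8, 5.0]
temperature = [13.8, 13.9, 14.1, 14.3, 14.6, 14.8]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

r = pearson(factories, temperature)  # strong positive correlation here
```

Note that a strong correlation found this way is a lead for further investigation, not proof of causation.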

7. Inferential Analysis
It uses a small sample of data to infer about a larger population.
The idea of inferring about the population at large from a smaller sample is quite
intuitive; many statistics you see in the media and on the internet are inferential, a
prediction of an event based on a small sample. To give an example, consider a
psychology study on the benefits of sleep with a total of 500 participants: when
followed up with, the candidates reported better overall attention and well-being
with 7–9 hours of sleep, while those with less or more sleep suffered reduced
attention and energy. This report from 500 people is just a tiny portion of the
roughly 7 billion people in the world, and is thus an inference about the larger population.
Thus, the inferential analysis is used to make inferences from data to more general
conditions.
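A standard inferential tool is the confidence interval, which quantifies how far the sample mean might plausibly sit from the population mean. A sketch on invented scores, shortened to a few values for illustration:

```python
import statistics

# Hypothetical attention scores from a sample of study participants
scores = [72, 68, 75, 80, 66, 74, 77, 71, 69, 73]

n = len(scores)
mean = statistics.mean(scores)
se = statistics.stdev(scores) / n ** 0.5  # standard error of the mean

# Approximate 95% confidence interval for the population mean
ci = (mean - 1.96 * se, mean + 1.96 * se)
```

For a sample this small, a t-distribution multiplier would be more accurate than 1.96; the structure of the calculation is the same.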
Analysis of Big Data
Big data analytics describes the process of uncovering trends, patterns, and
correlations in large amounts of raw data to help make data-informed decisions.
These processes use familiar statistical analysis techniques and apply them to more
extensive datasets with the help of newer tools.
Big data analysis is similar to the standard data analysis above, but relies more
heavily on technology to process big data at high speed.
Here are some of the key big data analytics tools:
• Hadoop - helps in storing and analyzing data


• MongoDB - used on datasets that change frequently


• Talend - used for data integration and management
• Cassandra - a distributed database used to handle chunks of data
• Spark - used for real-time processing and analyzing large amounts of data
• STORM - an open-source real-time computational system
• Kafka - a distributed streaming platform that is used for fault-tolerant storage
The ability to analyze more data at a faster rate can provide big benefits to an
organization, allowing it to more efficiently use data to answer important questions.
Big data analytics is important because it lets organizations use colossal amounts of
data in multiple formats from multiple sources to identify opportunities and risks.
Some benefits of big data analytics include:
• Cost savings. Helping organizations identify ways to do business more
efficiently
• Product development. Providing a better understanding of customer needs
• Market insights. Tracking purchase behavior and market trends
Here are some of the sectors where Big Data is actively used:
• Ecommerce - Predicting customer trends and optimizing prices are a few of
the ways e-commerce uses Big Data analytics.
• Healthcare - With the help of a patient’s medical history, Big Data analytics
is used to predict how likely they are to have health issues.
• Media and entertainment - Used to understand the demand of shows, movies,
songs, and more to deliver a personalized recommendation list to its users.

Required Set of Skills for a Data Analyst


Data Cleaning and Preparation
Data cleaning and preparation accounts for around 80% of the work of data
professionals. This makes it perhaps the key skill for anyone serious about getting a
job in data.


Commonly, a data analyst will need to retrieve data from one or more sources and
prepare the data so it is ready for numerical and categorical analysis. Data cleaning
also involves handling missing and inconsistent data that may affect your analysis.
Data cleaning isn’t always considered “sexy”, but preparing data can actually be a
lot of fun when treated as a problem-solving exercise. In any case, it's where most
data projects start, so it's a key skill you'll need if you're going to become a data
analyst.
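A toy example of the cleaning steps described above; the records, field names, and country mapping are all hypothetical, and larger projects would typically do this with pandas:

```python
# Hypothetical raw records with missing and inconsistent entries
raw = [
    {"name": "Alice", "age": "34", "country": "US"},
    {"name": "Bob", "age": None, "country": "usa"},
    {"name": "Carol", "age": "29 ", "country": "U.S."},
]

COUNTRY_MAP = {"us": "US", "usa": "US", "u.s.": "US"}

# Normalize inconsistent values and convert types
cleaned = []
for row in raw:
    country = COUNTRY_MAP.get(row["country"].strip().lower(), row["country"])
    age = int(row["age"].strip()) if row["age"] is not None else None
    cleaned.append({"name": row["name"], "age": age, "country": country})

# One simple strategy for missing values: impute with the mean
known = [r["age"] for r in cleaned if r["age"] is not None]
mean_age = sum(known) / len(known)
for r in cleaned:
    if r["age"] is None:
        r["age"] = mean_age
```

Whether mean imputation is appropriate depends on the analysis; the point is that missing and inconsistent data must be handled deliberately before analysis begins.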

Data Analysis and Exploration


It might sound funny to list “data analysis” in a list of required data analyst skills.
But analysis itself is a specific skill that needs to be mastered.
At its core, data analysis means taking a business question or need and turning it into
a data question. Then, you'll need to transform and analyze data to extract an answer
to that question.
Another form of data analysis is exploration. Data exploration is looking to find
interesting trends or relationships in the data that could bring value to a business.
Exploration might be guided by an original business question, but it also might be
relatively unguided. By looking to find patterns and blips in the data, you may
stumble across an opportunity for the business to decrease costs or increase growth!

Statistical Knowledge
A strong foundation in probability and statistics is an important data analyst skill.
This knowledge will help guide your analysis and exploration and help you
understand the data that you're working with.
Additionally, understanding stats will help you make sure your analysis is valid and
will help you avoid common fallacies and logical errors.
The exact level of statistical knowledge required will vary depending on the
demands of your particular role and the data you're working with. For example, if
your company relies on probabilistic analysis, you'll need a much more rigorous
understanding of those areas than you would otherwise.


Creating Data Visualizations


Data visualizations make trends and patterns in data easier to understand. Humans
are visual creatures, and most people aren’t going to be able to get meaningful insight
by looking at a giant spreadsheet of numbers. As a data analyst, you'll need to be
able to create plots and charts to help communicate your data and findings visually.
This means creating clean, visually compelling charts that will help others
understand the data. It also means avoiding things that are either difficult to interpret
(like pie charts) or can be misleading (like manipulating axis values).
Visualizations can also be an important part of data exploration. Sometimes there
are things that you can see visually in the data that can hide when you just look at
the numbers.
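Real projects would use a charting tool such as Tableau or a plotting library, but even a toy text chart shows why a visual beats scanning raw numbers; the sales figures here are invented:

```python
# Hypothetical monthly sales; a quick text bar chart conveys the trend
# far faster than reading the raw values
sales = {"Jan": 120, "Feb": 150, "Mar": 90, "Apr": 180}

scale = 10  # one block per 10 units
chart_lines = [f"{month} {'#' * (value // scale)}" for month, value in sales.items()]
chart = "\n".join(chart_lines)
print(chart)
```

Even in this crude form, the March dip and April peak jump out immediately, which is exactly what a well-chosen bar chart does for stakeholders.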

Creating Dashboards and/or Reports


As a data analyst, you'll need to empower others within your organization to use data
to make key decisions. By building dashboards and reports, you’ll be giving others
access to important data by removing technical barriers.
This might take the form of a simple chart and table with date filters, all the way up
to a large dashboard containing hundreds of data points that are interactive and
update automatically.
Job requirements can vary a lot from position to position, but almost every data
analyst job is going to involve producing reports on your findings and/or building
dashboards to showcase them.

Writing and Communication Skills


The ability to communicate in multiple formats is a key data analyst skill. Writing,
speaking, explaining, listening— strong communication skills across all of these
areas will help you succeed.
Communication is key in collaborating with your colleagues. For example, in a
kickoff meeting with business stakeholders, careful listening skills are needed to
understand the analyses they require. Similarly, during your project, you may need
to be able to explain a complex topic to non-technical teammates.


Written communication is also incredibly important — you'll almost certainly need
to write up your analysis and recommendations.
Being clear, direct, and easily understood is a skill that will advance your career in
data. It may be a “soft” skill, but don’t underestimate it — the best analytical skills
in the world won’t be worth much unless you can explain what they mean and
convince your colleagues to act on your findings.

Domain Knowledge
Domain knowledge is understanding things that are specific to the particular industry
and company that you work for. For example, if you're working for a company with
an online store, you might need to understand the nuances of e-commerce. In
contrast, if you're analyzing data about mechanical systems, you might need to
understand those systems and how they work.
Domain knowledge changes from industry to industry, so you may find yourself
needing to research and learn quickly. No matter where you work, if you don't
understand what you're analyzing it's going to be difficult to do it effectively, making
domain knowledge a key data analyst skill.
This is certainly something that you can learn on the job, but if you know a specific
industry or area you’d like to work in, building as much understanding as you can
up front will make you a more attractive job applicant and a more effective employee
once you do get the job.

Problem-Solving
As a data analyst, you're going to run up against problems, bugs, and roadblocks
every day. Being able to problem-solve your way out of them is a key skill.
You might need to research a quirk of some software or coding language that you're
using. Your company might have resource constraints that force you to be innovative
in how you approach a problem. The data you're using might be incomplete. Or you
might need to perform some “good enough” analysis to meet a looming deadline.
Whatever the circumstances, strong problem-solving skills are going to be an
incredible asset for any data analyst.


Top DA Techniques to Apply


There are different techniques for Data Analysis depending upon the question at
hand, the type of data, and the amount of data gathered. Each focuses on taking in
new data, mining insights, and drilling down into the information to transform
facts and figures into decision-making parameters. Accordingly, the different
techniques of data analysis can be categorized as follows:

1. Techniques based on Mathematics and Statistics


Descriptive Analysis: Descriptive Analysis considers the historical data, Key
Performance Indicators and describes the performance based on a chosen
benchmark. It takes into account past trends and how they might influence future
performance.
Dispersion Analysis: Dispersion is the extent to which a data set is spread. This
technique allows data analysts to determine the variability of the factors under study.
Regression Analysis: This technique works by modeling the relationship between a
dependent variable and one or more independent variables. A regression model can
be linear, multiple, logistic, ridge, non-linear, life data, and more.
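For the simplest case, a linear regression with one independent variable, the least-squares estimates can be computed directly; the ad-spend data below is made up:

```python
# Hypothetical data: ad spend (x, in $k) vs sales (y, in $k)
x = [1, 2, 3, 4, 5]
y = [2.1, 4.2, 5.9, 8.1, 9.8]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Least-squares estimates for the model y = intercept + slope * x
slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))
intercept = mean_y - slope * mean_x
```

The fitted slope answers "how much do sales change per extra unit of spend?", which is the modeled relationship between the dependent and independent variable that the text describes.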
Factor Analysis: This technique helps to determine if there exists any relationship
between a set of variables. This process reveals other factors or variables that
describe the patterns in the relationship among the original variables. Factor
Analysis serves as a stepping stone to useful clustering and classification procedures.
Discriminant Analysis: It is a classification technique in data mining. It assigns
data points to different groups based on variable measurements. In simple
terms, it identifies what makes two groups different from one another; this helps to
identify new items.
Time Series Analysis: In this kind of analysis, measurements are spanned across time,
which gives us a collection of organized data known as time series.
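A common first step in time series analysis is smoothing with a moving average; a minimal sketch using hypothetical monthly order counts:

```python
def moving_average(series, window):
    """Smooth a time series with a simple trailing moving average."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]

# Hypothetical monthly order counts
orders = [100, 120, 110, 130, 150, 140]
print(moving_average(orders, 3))  # → [110.0, 120.0, 130.0, 140.0]
```

Smoothing dampens month-to-month noise so the underlying trend across the organized time series becomes easier to see.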

2. Techniques based on Artificial Intelligence and Machine Learning

Artificial Neural Networks: A neural network is a biologically inspired programming
paradigm that uses the brain as a metaphor for processing information. An Artificial
Neural Network (ANN) is a system that changes its structure based on the information
that flows through the network. ANNs can accept noisy data and are highly accurate,
and they can be considered highly dependable in business classification and
forecasting applications.
Decision Trees: As the name stands, it is a tree-shaped model representing a
classification or regression model. It divides a data set into smaller subsets,
simultaneously developing into a related decision tree.
Evolutionary Programming: This technique combines different types of data
analysis using evolutionary algorithms. It is a domain-independent technique that
can explore a large search space and manage attribute interactions very efficiently.
Fuzzy Logic: This is a probability-based data analysis technique that helps handle
the uncertainties in data mining.

3. Techniques based on Visualization and Graphs


Column Chart, Bar Chart: Both these charts are used to present numerical differences
between categories. The column chart uses the height of the columns to reflect
the differences; in the bar chart, the axes are interchanged.
Line Chart: This chart represents the change of data over a continuous interval of
time.
Area Chart: This concept is based on the line chart. It also fills the area between the
polyline and the axis with color, representing better trend information.
Pie Chart: It is used to represent the proportion of different classifications. It is
suitable for only one series of data; however, it can be made multi-layered to
represent the proportion of data in different categories.
Funnel Chart: This chart represents the proportion of each stage and reflects the size
of each module. It helps in comparing rankings.

Word Cloud Chart: It is a visual representation of text data. It requires a large amount
of data, and the degree of discrimination needs to be high for users to perceive the
most prominent one. It is not a very accurate analytical technique.
Gantt Chart: It shows the actual timing and the progress of the activity compared to
the requirements.
Radar Chart: It is used to compare multiple quantitative variables. It shows which
variables in the data have higher values and which have lower ones. A radar chart
is useful for comparing classifications and series along with proportional
representation.
Scatter Plot: It shows the distribution of variables in points over a rectangular
coordinate system. The distribution in the data points can reveal the correlation
between the variables.
Bubble Chart: It is a variation of the scatter plot. Here, in addition to the x and y
coordinates, the bubble area represents the 3rd value.
Gauge: This is a dial-style chart. The scale represents the metric, and the pointer
represents the dimension. It is a suitable technique for representing interval
comparisons.
Frame Diagram: It is a visual representation of a hierarchy in an inverted tree
structure.
Rectangular Tree Diagram: This technique also represents hierarchical
relationships, but within the same level. It makes efficient use of space, with each
rectangular area representing its proportion of the whole.
Regional Map: It uses color to represent value distribution over a map partition.

Point Map: It represents the geographical distribution of data as points on a
geographical background. When the points are all the same size, a single point
conveys little on its own, but if the points are drawn as bubbles, they also represent
the size of the data in each region.
Flow Map: It represents the relationship between an inflow area and an outflow area
with a line connecting the geometric centers of gravity of the spatial elements. The
use of dynamic flow lines helps reduce visual clutter.

Heat Map: This represents the weight of each point in a geographic area. The color
here represents the density.
Let us now read about a few tools used in data analysis in research.

Data Analysis Tools


There are several data analysis tools available in the market, each with its own set of
functions. The selection of tools should always be based on the type of analysis
performed and the type of data worked. Here is a list of a few compelling tools for
Data Analysis.

1. Excel
It has various compelling features, and with additional plugins installed, it can
handle a massive amount of data. So, if your data does not approach big-data scale,
Excel can be a versatile tool for data analysis.
Looking to learn Excel? Data Analysis with Excel Pivot Tables course is the highest-
rated Excel course on Udemy.

2. Tableau
It falls under the BI tool category, made for the sole purpose of data analysis. The
essence of Tableau is the pivot table and pivot chart, and it works toward
representing data in the most user-friendly way. It additionally has a data cleaning
feature along with brilliant analytical functions.
If you want to learn Tableau, Udemy's online course Hands-On Tableau Training
For Data Science can be a great asset for you.

3. Power BI
It initially started as a plugin for Excel, but later detached from it to develop into
one of the most popular data analytics tools. It comes in three versions: Free, Pro, and
Premium. Its PowerPivot and DAX language can implement sophisticated advanced
analytics similar to writing Excel formulas.

4. Fine Report
Fine Report comes with a straightforward drag-and-drop interface, which helps
design various reports and build a data decision analysis system. It can directly
connect to all kinds of databases, and its format is similar to that of Excel.
Additionally, it provides a variety of dashboard templates and several self-developed
visual plug-in libraries.

5. R & Python
These are programming languages that are very powerful and flexible. R is best at
statistical analysis, such as normal distribution, cluster classification algorithms, and
regression analysis. It also performs individual predictive analyses, such as a
customer's behavior, spending, and preferred items based on browsing history. It
also involves concepts of machine learning and artificial intelligence.

6. SAS
It is a programming language for data analytics and data manipulation, which can
easily access data from any source. SAS has introduced a broad set of customer
profiling products for web, social media, and marketing analytics. It can predict
customer behaviors, and manage and optimize communications.

Challenges for Data Analysts and Concerns about Data Analysis
Collecting meaningful and real-time data:
Analysts need to access the right sources and extract meaningful data. They should
collaborate with the data engineering team and business users to consolidate a
data-profiling table that contains all the information about the tables and fields to
be ingested.
Real-time data in particular needs an effective presentation so that it delivers the
most value and users can make decisions immediately.
Visual representation of data:

To be understood and impactful, data often needs to be visually presented in graphs
or charts. While these tools are incredibly useful, it’s difficult to build them
manually. Taking the time to pull information from multiple areas and put it into a
reporting tool is frustrating and time-consuming.
Data from multiple sources: The next issue is trying to analyze data across multiple,
disjointed sources. Different pieces of data are often housed in different systems.
Employees may not always realize this, leading to incomplete or inaccurate analysis.
Manually combining data is time-consuming and can limit insights to what is easily
viewed.
With a comprehensive and centralized system, employees will have access to all
types of information in one location. Not only does this free up time spent accessing
multiple sources, it allows cross-comparisons and ensures data is complete.
Strong data systems enable report building at the click of a button. Employees and
decision-makers will have access to the real-time information they need in an
appealing and educational format.
Data governance: All sensitive information needs to be hashed or otherwise
protected, and database access must run effectively while avoiding any
security-related issues.
Data quality and quantity also need attention:
- Make sure data quality is good by cooperating with the different sources and
with master data, if any. Check whether any enrichment is needed to conform
the data.
- Data quantity checks and an acceptance percentage are must-haves in the
pipeline for further enhancing the system.
Confusion or anxiety:
Users may feel confused or anxious about switching from traditional data analysis
methods, even if they understand the benefits of automation. Nobody likes change,
especially when they are comfortable and familiar with the way things are done.
To overcome this HR problem, it’s important to illustrate how changes to analytics
will actually streamline the role and make it more meaningful and fulfilling. With
comprehensive data analytics, employees can eliminate redundant tasks like data
collection and report building and spend time acting on insights instead.
Shortage of skills
Some organizations struggle with analysis due to a lack of talent. This is especially
true in those without formal risk departments. Employees may not have the
knowledge or capability to run in-depth data analysis.
This challenge is mitigated in two ways: by addressing analytical competency in the
hiring process and having an analysis system that is easy to use. The first solution
ensures skills are on hand, while the second will simplify the analysis process for
everyone. Everyone can utilize this type of system, regardless of skill level.
Pressure from the top
As risk management becomes more popular in organizations, CFOs and other
executives demand more results from risk managers. They expect higher returns and
a large number of reports on all kinds of data.
With a comprehensive analysis system, risk managers can go above and beyond
expectations and easily deliver any desired analysis. They’ll also have more time to
act on insights and further the value of the department to the organization.

Top Certificates for Data Analysts


Data analyst certificates are increasingly popular, and nearly every platform offers
its own: Tableau, Power BI, SAS, and more.
What I can recommend is to deeply understand one tool and the general data
analysis techniques mentioned in this document, and to become an expert in one
tool before taking on a new one. The techniques are the same but applicable to
every tool, so don't limit yourself to a single tool.
As a starting point, I recommend the Power BI tools and the Data Analyst (DA-100)
certification. In that course you will gain a lot of information and knowledge,
covering not only the fundamentals but also the more advanced features of the tools.

Key Takeaways and Summary

• To absorb the knowledge and challenges of a data analyst, make sure you go
through and understand what you should do to prepare for success in a data
analysis job.
• We summarized what a data analyst does on a daily basis.
• We covered the types of data and the analysis skills you need to have on hand;
for each kind of data there is a proper solution for analyzing it and quickly
getting insights to support decisions.
• We also mentioned the skill set a data analyst should have and must have.
Some of it overlaps with data engineering and data science, but that obviously
makes sense because we all work with the same “DATA”. We also highlighted
what you NEED to know, what is NICE to know, and what you MUST know.
• To build a successful data analyst skill set, you need to improve your data and
domain skills; we listed the required skills, recommended tools, and
certificates you can earn, all of which make the path to getting the job done
more effective.
• Last but not least, knowing the challenges you will deal with on the job makes
them easier to resolve and less surprising.
• Finally, be proactive about learning and keep asking yourself questions.
Reference:

https://hackr.io/blog/what-is-data-analysis-methods-techniques-tools
https://www.clearrisk.com/risk-management-blog/challenges-of-data-analytics-0

Data Visualization
What is Data Visualization?
With so much information being collected through data analysis in the business
world today, we must have a way to paint a picture of that data so we can interpret
it. Data visualization gives us a clear idea of what the information means by giving
it visual context through maps or graphs. This makes the data more natural for the
human mind to comprehend and therefore makes it easier to identify trends, patterns,
and outliers within large data sets.

Why is Data Visualization Important?


No matter what business or career you’ve chosen, data visualization can help by
delivering data in the most efficient way possible. As one of the essential steps in the
business intelligence process, data visualization takes the raw data, models it, and
delivers the data so that conclusions can be reached. In advanced analytics, data
scientists are creating machine learning algorithms to better compile essential data
into visualizations that are easier to understand and interpret.
Specifically, data visualization uses visual data to communicate information in a
manner that is universal, fast, and effective. This practice can help companies
identify which areas need to be improved, which factors affect customer satisfaction
and dissatisfaction, and what to do with specific products (where should they go and
who should they be sold to). Visualized data gives stakeholders, business owners,
and decision-makers a better prediction of sales volumes and future growth.

What Are The Benefits of Data Visualization?
Data visualization positively affects an organization’s decision-making process with
interactive visual representations of data. Businesses can now recognize patterns
more quickly because they can interpret data in graphical or pictorial forms. Here
are some more specific ways that data visualization can benefit an organization:
• Correlations in Relationships: Without data visualization, it is challenging
to identify the correlations between the relationship of independent variables.
By making sense of those independent variables, we can make better business
decisions.
• Trends Over Time: While this seems like an obvious use of data
visualization, it is also one of the most valuable applications. It’s impossible
to make predictions without having the necessary information from the past
and present. Trends over time tell us where we were and where we can
potentially go.
• Frequency: Closely related to trends over time is frequency. Examining how
often customers purchase, and when they buy, gives us a better feel for how
potential new customers might act and react to different marketing and
customer acquisition strategies.
• Examining the Market: Data visualization takes the information from
different markets to give you insights into which audiences to focus your
attention on and which ones to stay away from. We get a clearer picture of the
opportunities within those markets by displaying this data on various charts
and graphs.
• Risk and Reward: Looking at value and risk metrics requires expertise
because, without data visualization, we must interpret complicated
spreadsheets and numbers. Once information is visualized, we can then
pinpoint areas that may or may not require action.
• Reacting to the Market: The ability to obtain information quickly and easily
with data displayed clearly on a functional dashboard allows businesses to act
and respond to findings swiftly and helps to avoid making mistakes.

Which Data Visualization Techniques are Used?
There are many different methods of putting together information in a way that the
data can be visualized. Depending on the data being modeled, and what its intended
purpose is, a variety of different graphs and tables may be utilized to create an easy
to interpret dashboard. Some visualizations are manually created, while others are
automated. Either way, there are many types to meet your visualization needs.

• Infographics: Unlike a single data visualization, infographics take an
extensive collection of information and give you a comprehensive visual
representation. An infographic is excellent for exploring complex and highly
subjective topics.
• Heatmap Visualization: This method uses a graph with numerical data
points highlighted in light or warm colors to indicate whether the data is a
high-value or a low-value point. Psychologically, this data visualization
method helps the viewer to identify the information because studies have
shown that humans interpret colors much better than numbers and letters.
• Fever Charts: A fever chart shows changing data over a period of time. As
a marketing tool, we could compare the current year's performance to the
prior year's to get an accurate projection for next year. This can help
decision-makers easily interpret wide and varying data sources.
• Area Chart (or Graph): Area charts are excellent for visualizing the data’s
time-series relationship. Whether you’re looking at the earnings for individual
departments on a month to month basis or the popularity of a product since
the 1980s, area charts can visualize this relationship.
• Histogram: Rather than looking at the trends over time, histograms are
measuring frequencies instead. These graphs show the distribution of
numerical data using an automated data visualization formula to display a
range of values that can be easily interpreted.

Who Uses Data Visualization?


Data visualization is used across all industries to increase sales with existing
customers and target new markets and demographics for potential customers. The
World Advertising and Research Center (WARC) predicts that in 2020 half of the
world’s advertising dollars will be spent online, which means companies everywhere
have discovered the importance of web data. As a crucial step in data analytics, data
visualization gives companies critical insights into untapped information and
messages that would otherwise be lost. The days of scouring through thousands of
rows of spreadsheets are over, as now we have a visual summary of data to identify
trends and patterns.
It’s hard to think of a professional industry that doesn’t benefit from making data
more understandable. Every STEM field benefits from understanding data—and so
do fields in government, finance, marketing, history, consumer goods, service
industries, education, sports, and so on.
Common use cases for data visualization include the following:

Sales and marketing:

Research from the media agency Magna predicts that half of all global advertising
dollars will be spent online by 2020. As a result, marketing teams must pay close
attention to their sources of web traffic and how their web properties generate
revenue. Data visualization makes it easy to see traffic trends over time as a result
of marketing efforts.

Politics:

A common use of data visualization in politics is a geographic map that displays the
party each state or district voted for.

Healthcare:

Healthcare professionals frequently use choropleth maps to visualize important
health data. A choropleth map displays divided geographical areas or regions that
are assigned a certain color in relation to a numeric variable. Choropleth maps allow
professionals to see how a variable, such as the mortality rate of heart disease,
changes across specific territories.

Scientists:

Scientific visualization, sometimes referred to in shorthand as SciVis, allows
scientists and researchers to gain greater insight from their experimental data than
ever before.

Finance:

Finance professionals must track the performance of their investment decisions when
choosing to buy or sell an asset. Candlestick charts are used as trading tools and help
finance professionals analyze price movements over time, displaying important
information, such as securities, derivatives, currencies, stocks, bonds and
commodities. By analyzing how the price has changed over time, data analysts and
finance professionals can detect trends.

Logistics:

Shipping companies can use visualization tools to determine the best global shipping
routes.

Data scientists and researchers:

Visualizations built by data scientists are typically for the scientist's own use, or for
presenting the information to a select audience. The visual representations are built
using visualization libraries of the chosen programming languages and tools. Data
scientists and researchers frequently use open source programming languages -- such
as Python -- or proprietary tools designed for complex data analysis. The data
visualization performed by these data scientists and researchers helps them
understand data sets and identify patterns and trends that would have otherwise gone
unnoticed.

Uncovering the Benefits of Data Visualization Tools
• Solve data inefficiencies and instantly absorb vast amounts of data presented
in visual formats.
• Increase the speed of decision-making.
• Visualization helps identify errors and inaccuracies in data quickly.
• Understand and act on real-time data.

• Promotes storytelling in the most compelling way.


• Assists in exploring business insights to achieve business goals in the right
direction.
• Reveal any differences in the trends and patterns
• Create templates of tailor-made reports, which significantly cuts down on
employee time.

Data Visualization Types


Data visualization – like the underlying data itself – comes in many shapes and
forms, but you’re probably already familiar with some of the most commonly used
types of visualizations, even if you’re not familiar with their names.

The Most Common Types of Data Visualizations: Charts and Graphs
Below are some commonly used types of data visualizations. All of them are charts,
which use symbols including icons, lines, bars, points, areas, colors and connections
to bring meaning to data.

Line Plot
The simplest technique, a line plot, is used to plot the relationship or dependence of
one variable on another.

Bar Chart
Bar charts are used for comparing the quantities of different categories or groups.
Values of a category are represented with the help of bars, and they can be
configured with vertical or horizontal bars, with the length or height of each bar
representing the value.

Pie and Donut Charts


They are used to compare the parts of a whole and are most effective when there are
limited components and when text and percentages are included to describe the
content. However, they can be difficult to interpret because the human eye has a hard
time estimating areas and comparing visual angles.

Histogram Plot
A histogram, representing the distribution of a continuous variable over a given
interval or period, is one of the most frequently used data visualization techniques.
It plots the data by chunking it into intervals called bins. It is used to inspect the
underlying frequency distribution, outliers, skewness, and so on.
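The binning step behind a histogram can be sketched in a few lines of pure Python; the observation values below are invented:

```python
def histogram(values, bin_count):
    """Count how many values fall into each of bin_count equal-width bins."""
    low, high = min(values), max(values)
    width = (high - low) / bin_count
    counts = [0] * bin_count
    for v in values:
        # Clamp so the maximum value lands in the last bin
        idx = min(int((v - low) / width), bin_count - 1)
        counts[idx] += 1
    return counts

# Invented observations
print(histogram([1, 2, 2, 3, 5, 6, 7, 7, 8, 9], 3))  # → [4, 2, 4]
```

Plotting the bin counts as bars gives the histogram; skewness and outliers show up as lopsided or isolated bins.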

Scatter Plot
Another common visualization technique is a scatter plot that is a two-dimensional
plot representing the joint variation of two data items. Each marker (symbols such
as dots, squares and plus signs) represents an observation. The marker position
indicates the value for each observation. When you assign more than two measures,
a scatter plot matrix is produced that is a series of scatter plots displaying every
possible pairing of the measures that are assigned to the visualization. Scatter plots
are used for examining the relationship, or correlations, between X and Y variables.

Variations and Other Types of Data Visualizations


There are so many other types of data visualizations and variations that it would be
difficult to list them all, but here are some other examples:
• Histogram. A histogram is a type of vertical bar chart that shows the
frequency of distribution. The bars correspond to a range of values and the
height shows the number of data points that fall within that range. Unlike a
bar chart, in which the bars can be rearranged, a histogram cannot be
rearranged.
• Area chart. An area chart or area graph is similar to a line chart, except the
area between the line and the axis is filled in. Like a line chart, area charts are
great for showing trends over time, and the addition of the filled-in area
emphasizes differences between the data points.

• Maps. Maps can be used to display data. Geographic areas can be shaded, with
the opacity or color of the shading corresponding to a range or value. Data
can often be displayed in different ways; for example, the same data can be
shown both on a graph and with a map.
• Treemap. A treemap uses shapes, usually rectangles, with the size
corresponding to a value and each rectangle representing a data point. The
rectangles can be nested to create categories and subcategories.

• Infographic. Infographics cover pretty much any visual representation of
information. This might include multiple graphs, charts or other information.

Here are some other types of visualizations


• Cartogram
• Heat Map
• Matrix
• Streamgraph
• Box plot
• Dot plot
• Sankey diagram
• Radar chart
• Chord diagram
• Network graph

Visualizing Big Data


Today, organizations generate and collect data each minute. The huge amount of
generated data, known as Big Data, brings new challenges to visualization because
of the speed, size and diversity of information that must be considered. New and
more sophisticated visualization techniques based on core fundamentals of data
analysis consider not only the cardinality, but also the structure and the origin of
such data.
Kernel Density Estimation for Non-Parametric Data
If we have no knowledge about the population and the underlying distribution of
the data, the data is called non-parametric and is best visualized with the help of a
kernel density function, which represents the probability density function of a
random variable. It is used when a parametric distribution of the data doesn’t make
much sense and you want to avoid making assumptions about the data.
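As a pure-Python sketch with an invented sample, a Gaussian kernel density estimate at a point x averages a Gaussian bump centered on each observation:

```python
import math

def kde(sample, x, bandwidth):
    """Gaussian kernel density estimate of the sample's density at point x."""
    total = 0.0
    for xi in sample:
        u = (x - xi) / bandwidth
        total += math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)
    return total / (len(sample) * bandwidth)

# Invented sample drawn from an unknown distribution
sample = [1.0, 1.5, 2.0, 2.5, 3.0]
density_near_center = kde(sample, 2.0, bandwidth=0.5)
density_far_out = kde(sample, 5.0, bandwidth=0.5)
```

Evaluating the estimate over a grid of x values and plotting the result produces the smooth density curve, with no distributional assumptions beyond the choice of bandwidth.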

Box and Whisker Plot for Large Data


A binned box plot with whiskers shows the distribution of large data sets and makes
outliers easy to see. In essence, it is a graphical display of five statistics (the
minimum, lower quartile, median, upper quartile and maximum) that summarizes
the distribution of a set of data. The lower quartile (25th percentile) is represented
by the lower edge of
the box, and the upper quartile (75th percentile) is represented by the upper edge of
the box. The median (50th percentile) is represented by a central line that dividesthe
box into sections. Extreme values are represented by whiskers that extend out from
the edges of the box. Box plots are often used to understand the outliers in the data.
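The five statistics behind a box plot can be computed directly; this sketch uses linear interpolation between order statistics (one of several common quantile conventions), with invented values:

```python
def five_number_summary(values):
    """Min, lower quartile, median, upper quartile, max (linear interpolation)."""
    s = sorted(values)
    n = len(s)

    def quantile(q):
        pos = q * (n - 1)          # fractional index into the sorted data
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        return s[lo] + (s[hi] - s[lo]) * (pos - lo)

    return {
        "min": s[0],
        "q1": quantile(0.25),
        "median": quantile(0.5),
        "q3": quantile(0.75),
        "max": s[-1],
    }

# Invented response times
summary = five_number_summary([7, 15, 36, 39, 40, 41])
```

The box spans q1 to q3 with a line at the median; the whiskers extend toward the min and max, and points far beyond them are drawn individually as outliers.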

Word Clouds and Network Diagrams for Unstructured Data

The variety of big data brings challenges because semi structured and unstructured
data require new visualization techniques. A word cloud visual represents the
frequency of a word within a body of text with its relative size in the cloud. This
technique is used on unstructured data to display high- or low-frequency words.
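Under the hood, a word cloud starts from a simple word-frequency count; a minimal sketch over an invented snippet of text:

```python
import re
from collections import Counter

def word_frequencies(text, top_n=3):
    """Count word occurrences in unstructured text - the basis of a word cloud."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)

# Invented snippet of unstructured text
text = "Data drives decisions. Data analysts visualize data to support decisions."
top_words = word_frequencies(text)  # e.g. [('data', 3), ('decisions', 2), ...]
```

A word cloud renderer then scales each word's font size in proportion to its count, so the highest-frequency terms dominate the image.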

Another visualization technique that can be used for semi structured or unstructured
data is the network diagram. Network diagrams represent relationships as nodes
(individual actors within the network) and ties (relationships between the
individuals). They are used in many applications, for example for analysis of social
networks or mapping product sales across geographic areas.

Correlation Matrices

A correlation matrix allows quick identification of relationships between variables
by combining big data and fast response times. Basically, a correlation matrix is a
table showing correlation coefficients between variables: Each cell in the table
represents the relationship between two variables. Correlation matrices are used to
summarize data, as an input into a more advanced analysis, and as a diagnostic for
advanced analyses.
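A minimal sketch of building such a table: computing Pearson correlation coefficients for every pair of columns (the column names and values below are invented):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlation_matrix(columns):
    """Pairwise correlations for a dict mapping column name -> list of values."""
    names = list(columns)
    return {
        a: {b: round(pearson(columns[a], columns[b]), 3) for b in names}
        for a in names
    }

# Invented columns: ad spend, revenue, and open support tickets
data = {
    "spend":   [1, 2, 3, 4, 5],
    "revenue": [2, 4, 6, 8, 10],
    "tickets": [5, 4, 3, 2, 1],
}
matrix = correlation_matrix(data)
```

Each cell ranges from -1 (perfect inverse relationship) through 0 (no linear relationship) to +1 (perfect positive relationship), which is exactly what the visualized matrix color-codes.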

With the many techniques available, it is easy to end up presenting the information
using the wrong tool. To choose the most appropriate visualization technique you
need to understand:
• The data, its type and composition.
• What information you are trying to convey to your audience, and how viewers
process visual information.
Remember: sometimes, a simple line plot can do the task saving time and effort spent
on trying to plot the data using advanced Big Data techniques.
The infographic below can help you find out what type of chart you should use. All
that you need is to have information about your variables. After that, you should
answer the given questions. These questions will serve as a framework that will help
you to tell a compelling story to the customer.

Data Visualization Tools Comparison


Tableau (and Tableau Public)
Tableau has a variety of options available, including a desktop app, server and hosted
online versions, and a free public option. There are hundreds of data import options
available, from CSV files to Google Ads and Analytics data to Salesforce data.
Output options include multiple chart formats as well as mapping capability. That
means designers can create color-coded maps that showcase geographically
important data in a format that’s much easier to digest than a table or chart could
ever be.
The public version of Tableau is free to use for anyone looking for a powerful way
to create data visualizations that can be used in a variety of settings. They have an
extensive gallery of infographics and visualizations that have been created with the
public version to serve as inspiration for those who are interested in creating their
own.

Pros
• Hundreds of data import options
• Mapping capability
• Free public version available
• Lots of video tutorials to walk you through how to use Tableau

Cons
• Non-free versions are expensive ($70/month/user for the Tableau Creator
software)
• Public version doesn’t allow you to keep data analyses private

Data Visualization Examples:

A data visualization of unique words used by three central characters in the Game
of Thrones book series.

Data visualizations can make public safety data easier to digest.

An interactive visualization of the highest-grossing actors of all time.

Infogram
Infogram is a fully-featured drag-and-drop visualization tool that allows even non-
designers to create effective visualizations of data for marketing reports,
infographics, social media posts, maps, dashboards, and more.
Finished visualizations can be exported into a number of formats: .PNG, .JPG, .GIF,
.PDF, and .HTML. Interactive visualizations are also possible, perfect for
embedding into websites or apps. Infogram also offers a WordPress plugin that
makes embedding visualizations even easier for WordPress users.

Pros
• Tiered pricing, including a free plan with basic features
• Includes 35+ chart types and 550+ map types
• Drag and drop editor
• API for importing additional data sources

Cons
• Significantly fewer built-in data sources than some other apps

Examples

Visualizations can make complex topics easy to understand.

Charts make data easier to compare, year-to-year.

Maps are an excellent way to give a snapshot of worldwide data.

ChartBlocks
ChartBlocks claims that data can be imported from “anywhere” using their API,
including from live feeds. While they say that importing data from any source can
be done in “just a few clicks,” it’s bound to be more complex than other apps that
have automated modules or extensions for specific data sources.
The app allows for extensive customization of the final visualization created, and the
chart building wizard helps users pick exactly the right data for their charts before
importing the data.
Designers can create virtually any kind of chart, and the output is responsive—a big
advantage for data visualization designers who want to embed charts into websites
that are likely to be viewed on a variety of devices.

Pros
• Free and reasonably priced paid plans are available
• Easy to use wizard for importing the necessary data

Cons
• Unclear how robust their API is
• Doesn’t appear to have any mapping capability

Examples

Stacked graph charts are an effective way to compare and contrast data.

Scatter plots are a simple way to represent data trends.

Line charts are effective at showing trends and comparisons.

Datawrapper
Datawrapper was created specifically for adding charts and maps to news stories.
The charts and maps created are interactive and made for embedding on news

websites. Their data sources are limited, though, with the primary method being
copying and pasting data into the tool.
Once data is imported, charts can be created with a single click. Their visualization
types include column, line, and bar charts, election donuts, area charts, scatter plots,
choropleth and symbol maps, and locator maps, among others. The finished
visualizations are reminiscent of those seen on sites like the New York Times or
Boston Globe. In fact, their charts are used by publications like Mother Jones,
Fortune, and The Times.
The free plan is perfect for embedding graphics on smaller sites with limited traffic,
but paid plans are on the expensive side, starting at $39/month.

Pros
• Specifically designed for newsroom data visualization
• Free plan is a good fit for smaller sites
• Tool includes a built-in color blindness checker

Cons
• Limited data sources
• Paid plans are on the expensive side

Example

Scatter plots can show a multitude of data, especially when color-coded to show
more points.

D3.js
D3.js is a JavaScript library for manipulating documents using data. D3.js requires
at least some JS knowledge, though there are apps out there that allow non-
programming users to utilize the library.
Those apps include NVD3, which offers reusable charts for D3.js; Plotly’s Chart
Studio, which also allows designers to create WebGL and other charts; and Ember
Charts, which is built on the Ember.js framework.

Pros
• Very powerful and customizable
• Huge number of chart types possible
• A focus on web standards
• Tools available to let non-programmers create visualizations

• Free and open source

Cons
• Requires programming knowledge to use alone
• Less support available than with paid tools

Examples

Chord diagrams show relationships between groups of entries.

Showing geographic data is best done with data maps.

Voronoi maps are an interesting way to show geographic data.

Google Charts
Google Charts is a powerful, free data visualization tool that is specifically for
creating interactive charts for embedding online. It works with dynamic data and the
outputs are based purely on HTML5 and SVG, so they work in browsers without the
use of additional plugins. Data sources include Google Spreadsheets, Google Fusion
Tables, Salesforce, and other SQL databases.
There are a variety of chart types, including maps, scatter charts, column and bar
charts, histograms, area charts, pie charts, treemaps, timelines, gauges, and many
others. These charts can be customized completely, via simple CSS editing.

Pros
• Free
• Wide variety of chart formats available
• Cross-browser compatible since it uses HTML5/SVG

• Works with dynamic data

Cons
• Beyond the tutorials and forum available, there’s limited support

Examples

Combo charts show trends and comparisons.

GeoCharts are just one method of visualizing data with Google Charts.

Annotations make charts and graphs easier to understand.

FusionCharts
FusionCharts is another JavaScript-based option for creating web and mobile
dashboards. It includes over 150 chart types and 1,000 map types. It can integrate
with popular JS frameworks (including React, jQuery, Ember, and Angular) as well
as with server-side languages and frameworks (including PHP, Java, Django, and
Ruby on Rails).
FusionCharts gives ready-to-use code for all of the chart and map variations, making
it easier to embed in websites even for those designers with limited programming
knowledge. Because FusionCharts is aimed at creating dashboards rather than just
straightforward data visualizations, it’s one of the most expensive options included
in this article. But it’s also one of the most powerful.

Pros
• Huge number of chart and map format options
• More features than most other visualization tools

• Integrates with a number of different frameworks and programming languages

Cons
• Expensive (starts at almost $500 for one developer license)
• Overkill for simple visualizations outside of a dashboard environment

Examples

FusionCharts is designed for creating data visualization dashboards.

Dashboards can showcase numerous data visualizations side by side.

Managing business operations is done best with data visualization dashboards.

Chart.js
Chart.js is a simple but flexible JavaScript charting library. It’s open source,
provides a good variety of chart types (eight total), and allows for animation and
interaction.
Chart.js uses HTML5 Canvas for output, so it renders charts well across all modern
browsers. Charts created are also responsive, so it’s great for creating visualizations
that are mobile-friendly.

Pros
• Free and open source
• Responsive and cross-browser compatible output

Cons
• Very limited chart types compared to other tools
• Limited support outside of the official documentation

Examples

Bubble charts can showcase numerous data points simultaneously.

Multi-axis line charts are better when they’re annotated (this one uses tooltips when
hovering over points on the lines).

Stacked area line charts are visually striking visualizations.

Grafana
Grafana is open-source visualization software that lets users create dynamic
dashboards and other visualizations. It supports mixed data sources, annotations, and

customizable alert functions, and it can be extended via hundreds of available
plugins. That makes it one of the most powerful visualization tools available.
Export functions allow designers to share snapshots of dashboards as well as invite
other users to collaborate. Grafana supports over 50 data sources via plugins. It’s
free to download, or there’s a cloud-hosted version for $49/month. (There’s also a
very limited free hosted version.) The downloadable version also has support plans
available, something a lot of other open-source tools don’t offer.

Pros
• Open source, with free and paid options available
• Large selection of data sources available
• Variety of chart types available
• Makes creating dynamic dashboards simple
• Can work with mixed data feeds

Cons
• Overkill for creating simple visualizations
• Doesn’t offer as many visual customization options as some other tools
• Not the best option for creating visualization images
• Not able to embed dashboards in websites, though possible for individual
panels

Examples

Grafana is a powerful data visualization dashboard tool.

Chartist.js
Chartist.js is a free, open-source JavaScript library that allows for creating simple
responsive charts that are highly customizable and cross-browser compatible. The
entire JavaScript library is only 10KB when GZIPped. Charts created with Chartist.js
can also be animated, and plugins allow it to be extended.

Pros
• Free and open source
• Tiny file size
• Charts can be animated

Cons
• Not the widest selection of chart types available
• No mapping capabilities
• Limited support outside of developer community

Examples

Chartist.js offers a number of basic graph types.

Sigmajs
Sigmajs is a single-purpose visualization tool for creating network graphs. It’s highly
customizable but does require some basic JavaScript knowledge in order to use.
Graphs created are embeddable, interactive, and responsive.

Pros
• Highly customizable and extensible
• Free and open source
• Easy to embed graphs in websites and apps

Cons
• Only creates one type of visualization: network graphs
• Requires JS knowledge to customize and implement

Examples

Sigmajs creates network graphs exclusively.

Polymaps
Polymaps is a dedicated JavaScript library for mapping. The outputs are dynamic,
responsive maps in a variety of styles, from image overlays to symbol maps to
density maps. It uses SVG to create the images, so designers can use CSS to
customize the visuals of their maps.

Pros
• Free and open source
• Built specifically for mapping
• Easy to embed maps in websites and apps

Cons
• Only creates one type of visualization
• Requires some coding knowledge to customize and implement

Examples

In this case, the data represented is a photoset from NASA’s Earth Observatory.

A representation of Flickr geotagged photos.

How to create good data visualizations?


The typology described in this article is simple. By answering just two questions,
you can set yourself up to succeed.

The Two Questions


To start thinking visually, consider the nature and purpose of your visualization by
asking two questions:
Is the information conceptual or data-driven?
Am I declaring something or exploring something?
CONCEPTUAL
FOCUS: Ideas
GOALS: Simplify, teach (“Here’s how our organization is structured.”)
DATA-DRIVEN
FOCUS: Statistics
GOALS: Inform, enlighten (“Here are our revenues for the past two years.”)

If you know the answers to these questions, you can plan what resources and tools
you’ll need and begin to discern what type of visualization will help you achieve
your goals most effectively.
The first question is the simpler of the two, and the answer is usually obvious. Either
you’re visualizing qualitative information or you’re plotting quantitative
information: ideas or statistics. But notice that the question is about the information
itself, not the forms you might ultimately use to show it. For example, the classic
Gartner Hype Cycle uses a traditionally data-driven form—a line chart—but no
actual data. It’s a concept.

If the first question identifies what you have, the second elicits what you’re doing:
either communicating information (declarative) or trying to figure something out
(exploratory).
DECLARATIVE
FOCUS: Documenting, designing
GOALS: Affirm (“Here is our budget by department.”)
EXPLORATORY
FOCUS: Prototyping, iterating, interacting, automating
GOALS: Confirm (“Let’s see if marketing investments contributed to rising
profits.”) and discover (“What would we see if we visualized customer purchases by
gender, location, and purchase amount in real time?”)
Managers most often work with declarative visualizations, which make a statement,
usually to an audience in a formal setting. If you have a spreadsheet workbook full

of sales data and you’re using it to show quarterly sales in a presentation, your
purpose is declarative.
But let’s say your boss wants to understand why the sales team’s performance has
lagged lately. You suspect that seasonal cycles have caused the dip, but you’re not
sure. Now your purpose is exploratory, and you’ll use the same data to create visuals
that will confirm or refute your hypothesis. The audience is usually yourself or a
small team. If your hypothesis is confirmed, you may well show your boss a
declarative visualization, saying, “Here’s what’s happening to sales.”
Exploratory visualizations are actually of two kinds. In the example above, you were
testing a hypothesis. But suppose you don’t have an idea about why performance is
lagging—you don’t know what you’re looking for. You want to mine your
workbook to see what patterns, trends, and anomalies emerge. What will you see,
for example, when you measure sales performance in relation to the size of the region
a salesperson manages? What happens if you compare seasonal trends in various
geographies? How does weather affect sales? Such data brainstorming can deliver
fresh insights. Big strategic questions—Why are revenues falling? Where can we
find efficiencies? How do customers interact with us? —can benefit from a
discovery-focused exploratory visualization.

The Four Types


The nature and purpose questions combine in a classic 2×2 to define four types of
visual communication: idea illustration, idea generation, visual discovery, and
everyday dataviz.

Idea Illustration
We might call this quadrant the “consultants’ corner.” Consultants can’t resist
process diagrams, cycle diagrams, and the like. At their best, idea illustrations clarify
complex ideas by drawing on our ability to understand metaphors (trees, bridges)
and simple design conventions (circles, hierarchies). Org charts and decision trees
are classic examples of idea illustration. So is the 2×2 that frames this article.
Idea illustration demands clear and simple design, but its reliance on metaphor
invites unnecessary adornment. Because the discipline and boundaries of data sets
aren’t built into idea illustration, they must be imposed. The focus should be on clear
communication, structure, and the logic of the ideas. The most useful skills here are
similar to what a text editor brings to a manuscript—the ability to pare things down
to their essence. Some design skills will be useful too, whether they’re your own or
hired.

Idea Illustration
• INFO TYPE: Process, framework
• TYPICAL SETTING: Presentations, teaching
• PRIMARY SKILLS: Design, editing
• GOALS: Learning, simplifying, explaining


Suppose a company engages consultants to help its R&D group find inspiration in
other industries. The consultants use a technique called the pyramid search—a way
to get information from experts in other fields close to your own, who point you to
the top experts in their fields, who point you to experts in still other fields, who then
help you find the experts in those fields, and so on.
It’s actually tricky to explain, so the consultants may use visualization to help. How
does a pyramid search work? It looks something like this:

The axes use conventions that we can grasp immediately: industries plotted near to
far and expertise mapped low to high. The pyramid shape itself shows the relative
rarity of top experts compared with lower-level ones. Words in the title—“climbing”
and “pyramids”—help us grasp the idea quickly. Finally, the designer didn’t
succumb to a temptation to decorate: The pyramids aren’t literal, three-dimensional,
sandstone-colored objects.
Too often, idea illustration doesn’t go that well, and you end up with something like
this:

Here the color gradient, the drop shadows, and the 3-D pyramids distract us from the
idea. The arrows don’t actually demonstrate how a pyramid search works. And
experts and top experts are placed on the same plane instead of at different heights
to convey relative status.

Idea Generation
Managers may not think of visualization as a tool to support idea generation, but
they use it to brainstorm all the time—on whiteboards, on butcher paper, or,
classically, on the back of a napkin. Like idea illustration, idea generation relies on
conceptual metaphors, but it takes place in more-informal settings, such as off-sites,
strategy sessions, and early-phase innovation projects. It’s used to find new ways of
seeing how the business works and to answer complex managerial challenges:
restructuring an organization, coming up with a new business process, codifying a
system for making decisions.
IDEA GENERATION
• INFO TYPE: Complex, undefined
• TYPICAL SETTING: Working session, brainstorming
• PRIMARY SKILLS: Team-building, facilitation
• GOALS: Problem solving, discovery, innovation
Although idea generation can be done alone, it benefits from collaboration and
borrows from design thinking—gathering as many diverse points of view and visual
approaches as possible before homing in on one and refining it. Managers who are
good at leading teams, facilitating brainstorming sessions, and encouraging and then
capturing creative thinking will do well in this quadrant. Design skills and editing
are less important here, and sometimes counterproductive. When you’re seeking
breakthroughs, editing is the opposite of what you need, and you should think in
rapid sketches; refined designs will just slow you down.
Suppose a marketing team is holding an off-site. The team members need to come
up with a way to show executives their proposed strategy for going upmarket. An
hourlong whiteboard session yields several approaches and ideas (none of which are
erased) for presenting the strategy. Ultimately, one approach gains purchase with the
team, which thinks it best captures the key point: Get fewer customers to spend much
more. The whiteboard looks something like this:

Of course, visuals that emerge from idea generation often lead to more formally
designed and presented idea illustrations.

Visual Discovery
This is the most complicated quadrant, because in truth it holds two categories.
Recall that we originally separated exploratory purposes into two kinds: testing a
hypothesis and mining for patterns, trends, and anomalies. The former is focused,
whereas the latter is more flexible. The bigger and more complex the data, and the
less you know going in, the more open-ended the work.
VISUAL DISCOVERY
• INFO TYPE: Big data, complex, dynamic
• TYPICAL SETTING: Working sessions, testing, analysis
• PRIMARY SKILLS: Business intelligence, programming, paired analysis
• GOALS: Trend spotting, sense making, deep analysis
Visual confirmation. You’re answering one of two questions with this kind of
project: Is what I suspect actually true? or What are some other ways of depicting
this idea?
The scope of the data tends to be manageable, and the chart types you’re likely to
use are common—although when trying to depict things in new ways, you may
venture into some less-common types. Confirmation usually doesn’t happen in a
formal setting; it’s the work you do to find the charts you want to create for
presentations. That means your time will shift away from design and toward
prototyping that allows you to rapidly iterate on the dataviz. Some skill at
manipulating spreadsheets and knowledge of programs or sites that enable swift
prototyping are useful here.

Suppose a marketing manager believes that at certain times of the day more
customers shop his site on mobile devices than on desktops, but his marketing
programs aren’t designed to take advantage of that. He loads some data into an online
tool (called Datawrapper) to see if he’s right (1 above).
He can’t yet confirm or refute his hypothesis. He can’t tell much of anything, but
he’s prototyping and using a tool that makes it easy to try different views into the
data. He works fast; design is not a concern. He tries a line chart instead of a bar
chart (2).
Now he’s seeing something, but working with three variables still doesn’t quite get
at the mobile-versus-desktop view he wants, so he tries again with two variables (3).
Each time he iterates, he evaluates whether he can confirm his original hypothesis:
At certain times of day more customers are shopping on mobile devices than on
desktops.
On the fourth try he zooms in and confirms his hypothesis (4).
New software tools mean this type of visualization is easier than ever before: They’re
making data analysts of us all.
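The same confirm-or-refute loop can be reproduced outside any charting tool by aggregating the raw data directly. The session log below is invented for illustration:

```python
from collections import defaultdict

# Hypothetical session log: (hour_of_day, device)
sessions = [
    (9, "desktop"), (9, "desktop"), (9, "mobile"),
    (20, "mobile"), (20, "mobile"), (20, "desktop"),
    (21, "mobile"), (21, "mobile"),
]

# Count sessions per hour and device.
counts = defaultdict(lambda: {"desktop": 0, "mobile": 0})
for hour, device in sessions:
    counts[hour][device] += 1

# Hours where the hypothesis holds: mobile sessions exceed desktop sessions.
mobile_hours = sorted(h for h, c in counts.items() if c["mobile"] > c["desktop"])
print(mobile_hours)  # [20, 21]
```

Plotting `counts` per hour as a line chart would then produce the declarative visual shown to the boss.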

Visual exploration. Open-ended data-driven visualizations tend to be the province
of data scientists and business intelligence analysts, although new tools have begun
to engage general managers in visual exploration. It’s exciting to try, because it often
produces insights that can’t be gleaned any other way.
Because we don’t know what we’re looking for, these visuals tend to plot data more
inclusively. In extreme cases, this kind of project may combine multiple data sets or
load dynamic, real-time data into a system that updates automatically. Statistical
modeling benefits from visual exploration.

10 Essential Data Visualization Techniques, Concepts & Methods to Improve Your
Business

“By visualizing information, we turn it into a landscape that you can explore
with your eyes. A sort of information map. And when you’re lost in information,
an information map is kind of useful.” – David McCandless

Did you know? 90% of the information transmitted to the brain is visual.

Concerning professional growth, development, and evolution, using data-driven
insights to formulate actionable strategies and implement valuable initiatives is
essential. Digital data not only provides astute insights into critical elements of your
business but, if presented in an inspiring, digestible, and logical format, it can tell a
tale that everyone within the organization can get behind.
Data visualization methods refer to the creation of graphical representations of
information. Visualization plays an important role in data analytics, helping to
interpret big data in real time by giving structure to complex sets of numerical or
factual figures.
With the seemingly infinite streams of data readily available to today's businesses
across industries, the challenge lies in interpreting that data and determining which
insights are most valuable to the individual organization as well as its aims, goals,
and long-term objectives.
That's where data visualization comes in.
Due to the way the human brain processes information, presenting insights in charts
or graphs to visualize significant amounts of complex data is more accessible than
relying on spreadsheets or reports.
Visualizations offer a swift, intuitive, and simpler way of conveying critical concepts
universally – and it's possible to experiment with different scenarios by making tiny
adjustments.
Recent studies discovered that the use of visualizations in data analytics could
shorten business meetings by 24%. Moreover, a business intelligence
strategy with visualization capabilities boasts an ROI of $13.01 back on every
dollar spent.
Therefore, the visualization of data is critical to the sustained success of your
business. To help you yield the most value from this tried and tested means of
analyzing and presenting vital information, here are 10 essential data
visualization techniques you should know.

1. Know Your Audience


This is one of the most overlooked yet vital concepts around.

In the grand scheme of things, the World Wide Web and information technology
are still in their infancy - and data visualization is an even younger branch of digital
evolution.
That said, some of the most accomplished entrepreneurs and executives find it
difficult to digest more than a pie chart, bar chart, or a neatly presented visual, and
they rarely have the time to delve deep into the data. Therefore, ensuring that your
content is both inspiring and tailored to your audience is one of the most essential
data visualization techniques imaginable.
Some stakeholders within your organization, or clients and partners, will be happy
with a simple pie chart, but others will be looking to you to delve deeper into the
insights you’ve gathered. For maximum impact and success, you should always
research those you’re presenting to before a meeting, and collate your report so that
your visuals and level of detail meet their needs exactly.

2. Set Your Goals


Like any business pursuit, from brand storytelling through to digital selling and
beyond, your data visualization efforts are only as effective as the strategy behind
them.
To structure your visualization efforts, create a logical narrative, and drill down into
the insights that matter most, it’s important to set a clear-cut list of aims,
objectives, and goals before building your management reports, graphs, charts, and
additional visuals.
Once you have established your aims for a specific campaign or pursuit, sit down in
a collaborative environment with others invested in the project and agree on your
ultimate aims, along with the kind of data that will help you achieve them.
One of the most effective ways to guide your efforts is by using a predetermined set
of relevant KPIs for your project, campaigns, or ongoing commercial efforts and
using these insights to craft your visualizations.

3. Choose The Right Chart Type


This is one of the most effective data visualization methods on our list: to succeed
in presenting your data effectively, you must select the right charts for your specific
project, audience, and purpose.
For instance, if you are demonstrating a change over a set of time periods with more
than a small handful of insights, a line graph is an effective means of visualization.
Moreover, lines make it simple to plot multiple series together.

An example of a line chart used to present monthly sales trends for a one-year
period in a clear and glanceable format.
Here are four other effective chart types for different data visualization concepts:
a) Number charts

Real-time number charts are particularly effective when you’re looking to showcase
an immediate and interactive overview of a particular key performance indicator,
whether it’s a sales KPI, site visits, engagement levels, or a growth percentage.
b) Maps

First of all, maps look great, which means they will inspire engagement in a board
meeting or presentation. Secondly, a map is a quick, easy, and digestible way to
present large or complex sets of geographical information for a number of purposes.

c) Pie charts

While pie charts have received a bad rep in recent years, we feel that they form a
useful visualization tool that serves up important metrics in an easy-to-follow
format. Pie charts prove particularly useful when demonstrating the proportional
composition of a certain variable over a static timeframe. As such, pie charts
will make a valuable part of your visualization arsenal.
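Because a pie chart only makes sense when the slices represent shares of a whole, it is worth computing the proportional composition before plotting. The sales figures below are invented for this sketch:

```python
sales_by_region = {"North": 120, "South": 80, "East": 60, "West": 140}
total = sum(sales_by_region.values())

# Each slice as a percentage of the whole; the shares sum to 100.
shares = {region: round(100 * value / total, 1)
          for region, value in sales_by_region.items()}
print(shares)  # {'North': 30.0, 'South': 20.0, 'East': 15.0, 'West': 35.0}
```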

d) Gauge charts

This example shows the operating expense ratio, strongly related to the profit and
loss area of your finance department’s key activities, and this color-coded health
gauge helps you gain access to the information you need, even at a quick glance.
Gauge charts can be effectively used with a single value or data point. Whether
they're used in financial or executive dashboard reports to display progress against
key performance indicators, gauge charts are an excellent way to showcase an
immediate trend indication.
To find out more, and expand your data visualization techniques knowledgebase,
you can explore our selected data visualization types, a simple guide on how and
when to use them.
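The color-coded bands behind a health gauge boil down to a threshold rule on the KPI's progress against its target. The band boundaries below are arbitrary assumptions for the sketch:

```python
def gauge_band(value: float, target: float) -> str:
    """Classify a KPI reading into a gauge color band (illustrative thresholds)."""
    ratio = value / target
    if ratio >= 1.0:
        return "green"   # target met or exceeded
    if ratio >= 0.75:
        return "amber"   # approaching target
    return "red"         # well below target

print(gauge_band(105, 100), gauge_band(80, 100), gauge_band(50, 100))  # green amber red
```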

4. Take Advantage of Color Theory


The most straightforward of our selected data visualization techniques - selecting the
right color scheme for your presentational assets will help enhance your efforts
significantly.

The principles of color theory will have a notable impact on the overall success
of your visualization model. That said, you should always try to keep your color

scheme consistent throughout your data visualizations, using clear contrasts to
distinguish between elements (e.g. positive trends in green and negative trends in
red).
As a guide, most people respond best to red, green, blue, and yellow, as these colors
can be recognized and deciphered with ease.
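Consistency is easiest to enforce by centralizing the color convention in one place and deriving every visual from it. The palette below is an assumption made for this sketch, not a standard:

```python
PALETTE = {"positive": "green", "negative": "red", "neutral": "blue"}

def trend_color(change: float) -> str:
    """Map a metric's change to a single, consistent palette color."""
    if change > 0:
        return PALETTE["positive"]
    if change < 0:
        return PALETTE["negative"]
    return PALETTE["neutral"]

print(trend_color(4.2), trend_color(-1.3))  # green red
```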

5. Handle Your Big Data


With an overwhelming level of data and insights available in today’s digital world -
with roughly 1.7 megabytes of data to be generated per second for every human
being on the planet by the year 2020 - handling, interpreting and presenting this rich
wealth of insight does prove to be a real challenge.
To help you handle your big data and break it down for the most focused, logical,
and digestible visualizations possible, here are some essential tips:
• Discover which data is available to you and your organization, decide which
is the most valuable, and label each branch of information clearly to make it
easy to separate, analyze, and decipher.
• Ensure that all of your colleagues, staff, and team members understand where
your data comes from and how to access it to ensure the smooth handling of
insights across departments.
• Keep your data protected and your data handling systems simple, digestible,
and updated to make the visualization process as straightforward and intuitive
as humanly possible.
• Ensure that you use business dashboards that present your most valuable
insights in one easy-to-access, interactive space - accelerating the
visualization process while also squeezing the maximum value from your
information.

6. Use Ordering, Layout, And Hierarchy to Prioritize


Following on our previous point, once you've categorized your data and broken it
down to the branches of information that you deem to be most valuable to your
organization, you should dig deeper, creating a clearly labelled hierarchy of your
data, prioritizing it by using a system that suits you (color-coded, numeric, etc.)
while assigning each data set a visualization model or chart type that will showcase
it to the best of its ability.
Of course, your hierarchy, ordering, and layout will be in a state of constant
evolution, but by putting a system in place, you will make your visualization efforts
speedier, simpler, and more successful.
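A numeric priority system like the one described can be sketched in a few lines of Python; the data-set names, priorities, and chart assignments below are hypothetical:

```python
# Each branch of information gets a numeric priority and an assigned chart
# type; sorting by priority fixes the order in which data is visualized.
datasets = [
    {"name": "web traffic", "priority": 3, "chart": "line"},
    {"name": "monthly revenue", "priority": 1, "chart": "bar"},
    {"name": "customer churn", "priority": 2, "chart": "gauge"},
]

visualization_order = sorted(datasets, key=lambda d: d["priority"])
for d in visualization_order:
    print(f"{d['priority']}. {d['name']} -> {d['chart']} chart")
```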

7. Utilize Word Clouds and Network Diagrams

To handle semi-structured or decidedly unstructured sets of data efficiently, you
should turn to network diagrams or word clouds.
A network diagram is often utilized to draw a graphical chart of a network. This style
of layout is useful for network engineers, designers, and data analysts while
compiling comprehensive network documentation.
Akin to network diagrams, word clouds offer a digestible means of presenting
complex sets of unstructured information. But, as opposed to graphical assets, a word
cloud is an image built from the words of a particular text or subject, in which the
size of each word indicates its frequency or importance within the context of the
information.
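Underneath every word cloud sits a word-frequency table: the more often a word occurs, the larger it is drawn. A minimal standard-library sketch of that frequency-to-size mapping follows; the sample sentence and the 10-40pt size range are arbitrary choices, and a real word cloud tool would also drop stop-words:

```python
import re
from collections import Counter

text = "data drives decisions and data drives growth and growth drives data"
words = re.findall(r"[a-z]+", text.lower())
frequencies = Counter(words)  # e.g. "data" appears 3 times

# Scale each word's frequency to a font size between 10pt and 40pt.
max_count = max(frequencies.values())
sizes = {w: 10 + 30 * n / max_count for w, n in frequencies.items()}
```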

8. Include Comparisons
This may be the briefest of our data visualization methods, but it's important
nonetheless: when you're presenting your information and insights, you should
include as many tangible comparisons as possible. By presenting two graphs, charts,
or diagrams together, each showing contrasting versions of the same information over
a particular timeframe, such as monthly sales records for 2016 and 2017 presented
next to one another, you will provide a clear-cut guide on the impact of your data,
highlighting strengths, weaknesses, trends, peaks, and troughs that everyone can
ponder and act upon.
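The side-by-side presentation described above can be sketched with two Matplotlib panels on a shared y-axis, so peaks and troughs line up visually; the yearly sales figures are invented for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend
import matplotlib.pyplot as plt

quarters = ["Q1", "Q2", "Q3", "Q4"]
sales_2016 = [120, 135, 110, 160]  # hypothetical figures
sales_2017 = [130, 150, 125, 190]

# sharey=True puts both panels on the same scale, so the two years
# can be compared at a glance.
fig, (left, right) = plt.subplots(1, 2, sharey=True)
left.bar(quarters, sales_2016)
left.set_title("Sales 2016")
right.bar(quarters, sales_2017)
right.set_title("Sales 2017")
fig.savefig("comparison.png")
```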

9. Tell Your Tale


Similar to content marketing, when you're presenting your data in a visual format
with the aim of communicating an important message or goal, telling your story will
engage your audience and make it easy for people to understand with minimal effort.

Scientific studies confirm that humans, by and large, respond better to a well-told
story, and by taking this approach to your visualization pursuits, you will not only
dazzle your colleagues, partners, and clients with your reports and presentations, but
you will increase your chances of conveying your most critical messages, getting the
buy-in and response you need to make the kind of changes that will result in long-
term growth, evolution and success.
To do so, you should collate your information and think like a writer: establish a
clear-cut beginning, middle, and end, as well as a conflict and resolution, building
tension during your narrative to add maximum impact to your various
visualizations.

10. Apply Visualization Tools for The Digital Age


We live in a fast-paced, hyper-connected digital age, far removed from the
pen-and-paper, or even copy-and-paste, mentality of yesteryear. As such, to make
your visualizations a roaring success, you should use the digital tools that will help
you make the best possible decisions while gathering your data in the most efficient,
effective way.
A task-specific, interactive dashboard or tool offers a digestible, intuitive,
comprehensive, and interactive means of collecting, collating, arranging, and
presenting data with ease - ensuring that your techniques have the most possible
impact while taking up a minimal amount of your time.


We hope these data visualization concepts served to help propel your efforts to new
successful heights. To enhance your ongoing activities, explore our cutting-edge
business intelligence and data visualization tool.

How to Learn Data Visualization


First, you need to be able to understand data and how to present it without
misinterpreting that data and misleading those you’re trying to reach. For example,
you’ll want to know the difference between a median and an average and how an
average can be misleading if there are outliers in your data.
There’s also the option of pursuing undergraduate degrees in data science or business
analytics, though many professionals who work with data visualization every day
don’t have a degree in those fields. There is still room for degrees, especially for
people who want to dive deep into data as their career.
Becoming active in online communities and blogs can be a great way to learn more
about data visualization. Some data visualization tools maintain blogs with
inspiration, guides and examples.
Without pursuing a formal degree, you’ll need to set aside time to learn on your own,
but there’s a range of courses, books, tutorials and other online resources for getting
started in data visualization. Here are a few:

● Coursera (free and paid options)
● Data Stories podcast
● Tableau eLearning (free)
● Towards Data Science (blog)
● Visualising Data (blog) and training
● Codecademy (for Python and JavaScript) (free and paid)
● Makeover Monday webinars (free)
● LinkedIn Learning (free trial, then $29.99 per month)
● Reddit (r/dataisbeautiful)
● Kaggle (free)


Tools for Data Visualization


No coding:
Excel
The Point of Excel Spreadsheet Visualization:

Excel spreadsheets aren’t as cool as they used to be. As useful as they are for data
entry and calculating, all those cells and formulas can be overwhelming. And despite
the volume of important information, executives still need to have the story told to
them. And you don’t want to hand someone a stack of spreadsheets because you risk
not getting the story across.
Most clients approach us with spreadsheets full of data and we use the same analysis
process for many of them in order to tell their data story, visually. We collaborate
with clients from data to visualized product, but sometimes you don’t have time to
engage with a vendor to get the job done.
Where do you go when you need to visualize your dense data story using Excel as
your data source?

Tableau


Tableau Public: This is right at the top because it's essentially the same platform as
our Editors' Choice-winning self-service BI tool, Tableau Desktop. The company
chose not to make its free version feature-poor. Instead, this is the full version of
Tableau that's available for free download, with only one caveat: everything you
create with it is public, which means you'll automatically be making it available on
the web via Tableau's visualization gallery.

Tableau Gallery: Tableau's gallery is cool enough to warrant a mention all its own
because you don't need to download the tool nor use it to benefit from the gallery.
Every visualization here can be downloaded into documents and email, or embedded
into webpages with code snippets provided by Tableau. Other folks have done
tremendous work on some truly impressive data visualizations and Tableau has
curated that content and made it available for download. This is a great resource, not
only for business people but also for researchers, students, and journalists looking
for ways not just to flesh out and beautify their content but to keep it current, too.

Some coding:
R
It's free, supported by tons of ongoing development adding useful packages on top
of the base language, and there are great free resources to learn it. It has useful
packages ranging from ggplot2 (a very popular visualization tool) all the way to
adding interactivity, publishing to the web via Shiny, and storytelling with data.

I ♥ code:
D3.js
D3.js is a JavaScript library to create dynamic, interactive visualizations for modern
web browsers. D3 helps you bring data to life using HTML, SVG, and CSS. D3’s
emphasis on web standards gives you the full capabilities of modern browsers
without tying yourself to a proprietary framework, combining powerful visualization
components and a data-driven approach to DOM manipulation.


D3.js transforms your arbitrary data tables into stunning visual graphics with the full
power and flexibility of standard web languages like HTML5, CSS3, and JavaScript.
It is open-source and free to use.

Python
Data visualization in Python is perhaps one of the most utilized features of data
science with Python today. The libraries in Python come with lots of different
features that enable users to make highly customized, elegant, and interactive plots.
In this article, we will cover the usage of Matplotlib and Seaborn, as well as an
introduction to other alternative packages that can be used for Python visualization.
Within Matplotlib and Seaborn, we will be covering a few of the most commonly
used plots in the data science world for easy visualization.
To get a little overview here are a few popular plotting libraries:
• Matplotlib: low level, provides lots of freedom
• Pandas Visualization: easy to use interface, built on Matplotlib
• Seaborn: high-level interface, great default styles
• ggplot: based on R’s ggplot2, uses Grammar of Graphics
• Plotly: can create interactive plots


In the examples, I will use pandas to manipulate the data and use it to drive the
visualization. In most cases these tools can be used without pandas, but I think the
combination of pandas + visualization tools is so common that it is the best place to
start.
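A minimal sketch of that pandas-plus-visualization combination: pandas does the aggregation, and the Series .plot() method hands the drawing off to Matplotlib. The regions and sales figures are made up for the example:

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend
import pandas as pd

df = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "sales": [250, 180, 310, 220],
})

# pandas aggregates; .plot() delegates the drawing to Matplotlib.
totals = df.groupby("region")["sales"].sum()
ax = totals.plot(kind="bar", title="Sales by region")
ax.figure.savefig("sales_by_region.png")
```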

Visualization best practices


Choose a dashboard design type

You’ve probably read that there are 3 main types of dashboard design. In fact, this
is only true when looking at Business Intelligence dashboards. The three main types
of BI dashboard are:

● Operational dashboards – these dashboards help the user see what's
happening right now
● Analytical dashboards – these dashboards give the user a clear view of
performance trends and potential problems
● Strategic dashboards – this type of dashboard lets the user track their main
strategic goals via KPIs
Here are the main advantages of the different BI dashboard types, and when to use
them.


Operational Dashboards
Operational dashboards are used to show your user their current status in your app.
Use operational dashboards to display critical information that’s time relevant. For
example, in a web analytics application, the operational dashboard could include
information like: active users on site, top social referrals and pageviews per minute.

Operational dashboards are great because they let your user check their status at a
glance. You should structure them so that the most important data is visible at top
left (for left-right reading languages), helping your user get a snapshot as soon as
they open the dashboard. They can contain a few graphs but shouldn’t reflect detailed
data views.

Analytical Dashboards
Analytical dashboards are used to present key data sets to the user, always reflected
against previous performance. They should be data-centric, and show as many
relevant data views as is feasible.


Analytical dashboards should lead with key account data front and center, and should
minimize graphical elements. They serve as a barometer of the user’s status in your
application, and make it easier for a user to spot problems.

Strategic Dashboards
Strategic dashboards are used to indicate performance against a set of key
performance indicators (KPIs). As seen in this great example from Cascade, a
strategic dashboard should reflect how your user is performing against their strategic
goals… and not much else.


Other dashboard types


Outside of the business intelligence realm, dashboards can come in all sorts of shapes
and sizes, depending on your application type. Schedules, profile pages, your movie
library… if it’s a single page with the most important information or actions the user
might want to access, it’s a dashboard.

Platform dashboards
Platform dashboards are used to give a user access to controls, tools and analytics
related to their account on a social platform. The YouTube Studio Dashboard is a
good example. A simple initial view shows the user’s latest videos, with some stats
for each.
There are also cards for channel analytics, comments and news and tips. In the
sidebar, the user can access a large number of tools and account controls, including
the video manager, channel status and more. YouTube keeps things simple, while
giving the user complete control.


Time-in-app dashboards
A popular new trend in mobile apps is to give the user information about how much
time they’re spending in each app. If you’re working on a new social app, you’ll
want to include this kind of dashboard. The idea is to help users to see if they’re
spending too much time on their device.

How to build Data Reports to improve Business Performance
While they have always played a pivotal role in business success, the terms ‘data
report’ or ‘business report’ haven’t exactly been synonymous with creativity or
innovation. Data reporting is often seen as a necessary evil created by analysts and
consultants to offer functional operational insights. As such, the term usually
conjures up images of static PDFs, old-school PowerPoint slides, and big tables.
Usually created with past data without any room for generating real-time or
predictive insights, data reports were deemed obsolete, consisting of numerous
external and internal files, without proper data management processes at hand.


But in the digital age, it doesn’t have to be this way. In fact, the business intelligence
industry has evolved enormously over the past decade, and data reports are riding
the crest of this incredible technological wave.
The rise of innovative report tools means you can create report data that people are
compelled to read and that will offer a wealth of business-boosting value. If you
utilize business intelligence correctly, not only will you be able to connect your data
dots, but you will also take control of your data across the company and improve
your bottom line.
Here, we will consider the question, ‘what is a data report?’, explore how to arrange
report data, and provide the best possible data reports examples, all created with
modern software. Without further ado, read on to see why data reports matter and
our top data reporting tips.

What Is a Data Report?


A data report is an evaluation tool used to assess past, present, and future business
information while keeping track of the overall performance of a company. It
combines various business data and is usually used on both an operational and a
strategic level of decision-making.
As mentioned, these reports used to be static presentations of data, manually written
or calculated, but with the introduction of modern processes such as dashboard
reporting, they have developed into an invaluable resource for successfully
managing your sales processes, marketing data, even robust manufacturing
analytics, and the numerous other business processes needed to stay on top of the pack.
But let’s get into the basics in more detail, and afterward, we will explore data
reporting examples that you can use for your own internal processes and more.

Data Reporting Basics


We’ve explored the data reports definition – now, it’s time to look at the essential
reports and data fundamentals: the building blocks of business intelligence success.
• Purpose: Data analytics is the art of curating and analyzing raw data with the
purpose of transforming metrics into actionable insight. Data reports present
metrics, analyses, conclusions, and recommendations in an accessible,
digestible visual format so everyone in an organization can make informed
data-driven decisions.
• Data types: Business data reports cover a variety of topics and organizational
functions. As such, all data report types vary greatly in length, content, and
format. It’s possible to present reporting data as an annual report, monthly
sales report, accounting report, reports requested by management exploring a
specific issue, reports requested by the government showing a company’s
compliance with regulations, progress reports, feasibility studies, and more.
The all-encompassing nature of data-centric reports means that it’s possible
to work with a mix of historic, predictive, and real-time insights to paint a
panoramic picture of your organization's functions, processes, and overall
progress.
• Accessibility: Historically, creating business data reports was time- and
resource-intensive. Data pull requests were the exclusive duties of the IT
department, with a significant amount of time spent analyzing, formatting,
and then presenting the data. Because this task was so resource-heavy, data
analysis was an occasional luxury. Also, by the time the data was presented,
it was generally out of date. The emergence of real-time cloud-based BI
reporting tools has changed the data reporting game. Now a wider range of
business users can act as analysts, even performing advanced analytics. The
right BI platform can blend multiple data sources into one report and analysis:
enhancing business insights and better-informed decision making. These
cloud-based tools allow organizations to collaborate on a report, bringing
various subject matter experts (SME) to the same table. Modern business
dashboard tools allow a wider audience to comprehend and disseminate the
report findings. Users can also easily export these dashboards and data
visualizations into visually stunning reports that can be shared via multiple
options such as automating emails or providing a secure viewer area, even
embedding reports into your own application, for example.
• Flexibility: In addition to the fact that data report software offers a wealth of
visually-accessible KPI-driven insight, business intelligence dashboards are
also completely customizable to suit individual business goals or needs.
Moreover, data dashboards are optimized for mobile devices, meaning that
it’s possible for users to access a wealth of business-boosting information
from a central dashboard, 24/7, without restrictions or limits. You can
leverage business intelligence at any time of day or night, from anywhere in
the world.
Now that you understand the superior analytical capabilities of modern business data
reporting, we’re going to look at a mix of tips and ideas designed to help you build
and create data reports that will save time and costs while driving innovation across
the business.
"The future belongs to those who see possibilities before they become obvious."
- John Sculley, early adopter, and advocate of data intelligence

5 Principles of Report Design



In my previous experience as a supervisory accounting professional and my current
role as an accounting professor, I've reviewed numerous financial and other reports
and have observed that accounting students and young professionals alike often
prepare reports that are lacking in a variety of ways. These draft reports have
presented issues that tend to fall into five major categories: accuracy, consistency,
appearance, efficiency, and usability, with occasional overlap between them. While
I’ve noticed these issues mostly in the work of younger and inexperienced report
preparers, even seasoned professionals could benefit by keeping these principles in
mind when preparing reports.


For illustrative purposes, consider an organization (we'll call it ABC Organization)
with an internal service center tasked with supplying paper to various departments.
Suppose that the internal service center intends to bill the other departments for their
paper consumption and therefore needs to report types and quantities of paper by
consumer.
Figure 1 provides a sample report for this purpose, generated using Microsoft Excel.
This report has a variety of design issues, perhaps representing a first draft that might
be prepared. Before you continue reading, it may be insightful to take a few minutes
to closely examine this sample report and try to identify necessary changes.
Determining that this report has design problems is a critical step (and may not be
particularly difficult), but understanding the specific nature of the reporting
problems and how to correct them is key to producing higher-quality reports later.


Figure 1
It’s important when reviewing reports to allow some flexibility in presentation as
long as the end result is still appropriately professional. In my experience,
individuals often develop their own reporting style (such as selecting particular fonts
and other reporting attributes that they favor); my primary emphasis is in
encouraging the use of a consistent and professional reporting style compliant with
these five principles. Evaluation of these principles is likely to be more beneficial
for reports that are prepared manually, but they’re also worthy of consideration
related to system-generated reports that can be negatively affected by the setup and
maintenance of system components (the so-called “garbage in, garbage out”
scenario).

1. Accuracy
The accuracy principle simply means that the content of a report represents what it
claims it does. It involves, for example, ensuring that the written components and
titles in the report are free from spelling and grammatical errors and that the data
presented is associated with the time period(s) indicated. It should also require that
the titles and descriptions in the report are consistent with the actual amounts
included in the report.
For example, an amount reported as wages expense on a report should, in fact, be
the amount of wages expense for the period and not some other expense. And, of
course, any amounts included in the report need to be accurate, even if amounts
(particularly in accounting) sometimes require judgment and estimation. When the
true amounts may not be known for certain until some future date, as is common
with accounting data, the estimated amounts should at least be verifiable against
standard data sources to be considered accurate.
In my professional experience, reports often included nonfinancial data such as
employee full-time equivalents (FTEs), inventory quantities, and even pairs of safety
shoes. As financial systems don’t always track such measures, utilizing established
scales to convert financial data and to reasonably estimate nonfinancial measures is
sometimes necessary. Further, I learned to be particularly cautious when reports
include any labels that are presented in all caps because such labels usually are
excluded from spell-checking functions.

Figure 2
The sample report is lacking in accuracy in at least three regards. First, both the title
(“Organziation”) and one consumer name (“Marekting”) have spelling errors.
Misspellings can occur when reports are prepared in a hurry and involve manual data
entry. The misspellings can usually be easily detected and corrected by using the
spell-checking function available in most software packages, including Microsoft
Excel, though these tools aren’t always sufficient. All reports should be proofread
carefully. Also, the total for "Std 8½×14" of 38 reams is clearly incorrect, as the
chief legal officer alone consumed 40 reams. These accuracy problems are noted in
Figure 2. Other accuracy issues, such as specific amounts not corresponding to the
label or period indicated, could be less immediately apparent. The process for
determining the paper quantities for each consumer also could need to be validated.
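A totals error like the one above (a reported 38 reams when the chief legal officer alone consumed 40) is exactly the kind of accuracy problem a small automated check can catch before the report goes out. A Python sketch follows, in which every quantity except the stated 40 reams is invented:

```python
# Reams consumed per department. Only the 40 for the chief legal officer
# comes from the sample report; the other quantities are made up.
consumption = {
    "Chief Legal Officer": 40,
    "Marketing": 12,
    "Manufacturing": 8,
}
reported_total = 38  # the manually entered total from the draft report

# Recompute the total from the detail rows and compare.
computed_total = sum(consumption.values())
if computed_total != reported_total:
    print(f"Accuracy check failed: reported {reported_total}, "
          f"computed {computed_total}")
```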

2. Consistency


The consistency principle requires that the format and layout of a report are similar
to prior issuances of the same report and/or other reports issued by the same
department. In many organizations, selected individuals or departments (such as the
CFO or the board of directors) will receive many different reports each period.
Depending on how well such a recipient organizes reports, whether electronic or in
paper form, having a consistent (and, in certain cases, distinctive) format or “feel”
for each report or for each issuing department will allow the recipient to quickly
identify a needed report for a specific related decision. Having a consistent format
can also provide a brand style for the source department or individual preparer.
In my own reporting, for example, I developed a preference for using the Garamond
font in the title bar (also centered and bolded) and the Book Antiqua font in the body
of my reports. Consistency also involves ensuring that titles and descriptions remain
the same from one period to the next so that recipients know that the same
information is being reported. (To the contrary, changing a column or row label,
even slightly, could lead a recipient to question whether something different is now
being reported.) Fonts and other format attributes also should be consistent to be in
compliance with the principle.
A number of consistency issues exist in the sample report. For one, although
reporting styles will vary, the title for this report isn’t consistent with what most
people would use. More often, the title would be centered and would perhaps involve
bolded or other emphasized text. The sample report also includes some inconsistent
consumer descriptors. Most of the noted consumers represent groups or departments,
but two of them (Chief Legal Officer and Manufacturing Manager) represent
individuals. This is an example of a reporting issue that could result from
inconsistent structure within the accounting system.


Figure 3
Last, although I find that many students in particular have difficulty noticing such a
problem, one row in the report (for the Manufacturing Manager) is presented using
a different font. This particular problem often arises when an additional row must be
added to a pre-existing report. Figure 3 highlights the consistency issues associated
with the sample report.

3. Appearance
The appearance principle means that the report is aesthetically pleasing and also
professional-looking. (After all, this is similar to but not quite the same as creating
artwork.) Aesthetically pleasing reports should include proper alignments and
should make appropriate use of white space, borders, shading, and color. The
purpose of most reports is to support decision making, and improving the appearance
of the report can often help to draw the attention of the decision maker to the most
relevant data items (and can avoid distracting the recipient).
For example, inserting a blank row above and/or underlining a very important
financial statement amount naturally attracts the gaze of the reader. I’m an admitted
direct communicator, being very specific when I provide positive or negative
feedback to report preparers (and now students). During my professional career, I
became notorious for occasionally remarking “this report hurts my eyes” when I was
particularly frustrated with the design of a draft report that I was reviewing. In short,
the appearance principle is intended to not “hurt the eyes” of report recipients.


Figure 4
The sample report is lacking in various appearance attributes. The report includes no
lines or bolding that would help draw the attention of recipients to important data. It
wouldn’t be uncommon, for example, to include lines beneath the column headers
and above the column totals so that the titles and totals appear separated from the
central content of the report. Additional white space, borders, shading, and color
could be applied in this sample report to better direct attention and to support
decision making. Also related to appearance, amounts included in reports should be
right-aligned rather than left-aligned. Use of the “accounting” cell format in
Microsoft Excel will replace the zeros with dashes, which often results in a more
pleasing report appearance. The needed corrections to the sample report related to
appearance are shown in Figure 4.

4. Efficiency
The efficiency principle involves ensuring that a standard report can be prepared as
quickly and easily as possible. This often means utilizing automated or formulaic
fields where possible. This will help to minimize the data entry and computations
necessary for the preparation of reports. If possible, building reports to extract data
directly from the underlying accounting system, both for labels and amounts, can
create the greatest efficiency. In one of my prior professional positions, where a
legacy, homegrown accounting system with poor reporting capabilities was used, we
created higher-quality reports with many automated fields using Microsoft Access.
The reports then extracted data from SQL tables that were created in a
nightly download from the accounting system.
In one case, that allowed us to essentially automate about two-thirds of the
organization’s annual reporting, which had previously been prepared by
continuously updating numerous large Excel files (a process that was slow,
labor-intensive, and ripe for errors). And efficiency should relate not only to the
preparation of reports but also to their use: reports should help ensure that relevant
decisions are supported efficiently.

Figure 5
Again, the sample report reveals a variety of efficiency problems. First, because it
was observed earlier that the total for the legal stock was incorrect, it would seem
that formulas aren’t being used for the total row of the report. Microsoft Excel offers
a variety of powerful formulas and functions that can improve reporting efficiency,
the most basic of which is to include a summing formula in the total row. (More
complicated formulas, lookups, and pivot functions can add tremendous efficiency
for reports generated using Microsoft Excel.)
Related to efficiency of use, it isn’t at all clear in the sample report what decision
process it’s intended to support. Inclusion of some additional information related to
the totals might make the report more efficient to use. For example, it was noted
earlier that the purpose of the report was for the internal service center to request
reimbursement for the paper provided to each consumer group. That purpose could
be better demonstrated in the report itself to avoid recipients making either no use
or inefficient use of the report. Figure 5 illustrates the efficiency issues described for
this sample report.

5. Usability
The usability principle relates very specifically to decision support for the report
recipients. It involves considering how the report will be disseminated. In that
regard, report data should be organized to allow for easy extraction by recipients. It
should also be easily understandable given the specific background(s) of the
recipients. My professional experience included working for a scientific
organization, a manufacturing organization, and a healthcare/education
organization. Each of those industries involved specialized vernacular that may not
have been understandable to the general public. As such, it was always important
that I considered the backgrounds of the specific recipient(s) for each report that I
was preparing or reviewing and ensured that the format and terminology would be
understandable.

Reports should also be formatted for duplication, where appropriate, or for posting
to websites (to be accessible using a variety of electronic devices and a variety of
internet browsers). For example, whereas use of shading and color may be deemed
worthwhile to improve the appearance of a report, it might make the report less
suited for duplication.
The sample report is again lacking in terms of usability. It was noted earlier that the
report was intended to request reimbursement for paper supplied by the service
center, and including a column for the amount requested would serve that purpose.
But the report would be more usable if it also somehow indicated the basis for
determining the amount requested from each consumer department, perhaps by
indicating the fee or reimbursement rate per ream of each type of paper.


Figure 6
Rearranging the order of the rows to be alphabetical might also allow each recipient
to more quickly identify the amount owed by his or her department. The report might
be considered more usable or understandable if the column labels were changed to
reflect common descriptions for the types of paper rather than paper weights and
sizes and to include some notation of the quantity included in a ream of each type of
paper (which often varies). The usability concerns for the sample report are noted in
Figure 6.
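The usability fixes described above, an explicit reimbursement rate and alphabetical row ordering, can be sketched as follows. The departments, ream counts, and the fee per ream here are hypothetical assumptions, not figures from the sample report:

```python
# Hypothetical reams supplied per department; the fee per ream is an
# assumed figure, not taken from the sample report.
reams_supplied = {"Marketing": 18, "Accounting": 12, "Legal": 30}
RATE_PER_REAM = 4.50

# Amount requested = reams supplied x fee per ream, listed alphabetically
# so each recipient can quickly find the amount owed by their department.
for dept in sorted(reams_supplied):
    amount = reams_supplied[dept] * RATE_PER_REAM
    print(f"{dept:<12}{reams_supplied[dept]:>6} reams  ${amount:>8.2f}")
```

Showing the rate alongside the amount lets each recipient verify the basis of the request, which is the usability gap the sample report left open.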

The Solution
While reporting style preferences will vary and reports won’t always be perfect,
greater attention to these principles should allow accounting professionals and
students alike to improve the quality of reporting. In addressing the problems related
to the five principles, the revised sample report presented in Figure 7 is substantially
improved.


Figure 7
The specific style of this revised report may not align with the preferences of other
preparers, but the generalities of the five principles of report design should be
universal. Reporting quality improves when preparers give proper consideration to
accuracy, consistency, appearance, efficiency, and usability, and this article should
serve as a useful guide, particularly for students and young professionals developing
their reporting skills.


How to build a Data Report

We’ve been through the basics, so now, it’s time to look at how to create data reports
from a practical perspective. That’s where our documenting data in business reports
tips come into play.
Each type of report has its own set of rules and best practices.
We will mention below the most popular ones, but our main focus is on business
data reports that will, ultimately, provide you with a roadmap on how you can make
your reports more productive. Let’s get started.

1. Define The Type of the Data Report


What types of data reporting do you need to present? Having this definition ahead
of time will help set parameters you can easily stick to. Here are the most common
data report types:
1. Informational vs. analytical: First determine if this report is just providing
factual information. Informational reports are usually smaller in size, the
writing structure is not strict, and the sole purpose is to inform about facts
without adding any analysis. On the other hand, if it is providing any analysis,
demonstrates relationships or recommendations, it is an analytical report.
2. Recommendation/justification report: Presents an idea and makes
suggestions to management or other important decision-makers. As the
name suggests, it provides recommendations for changes in business
procedures and justifies courses of action with the goal of improving
business performance.
3. Investigative report: Helps determine the risks involved with a specific
course of action. Here, reporting data is based on documenting specific
information objectively with the purpose of presenting enough information to
stakeholders. They will ultimately decide if further actions are needed. An
example would be a report created for legal purposes.
4. Compliance report: Shows accountability by providing compliance
information, for example, to a governing body. This is particularly important
as accurate, well-presented compliance data will avoid costly mistakes or red
tape issues.
5. Feasibility report: An exploratory report to determine whether an idea will
work. Data-driven insights that could potentially save thousands of pounds by
helping businesses avoid redundant processes or developments.
6. Research studies report: Presents in-depth research and insights on a
specific issue or problem. Research is pivotal to growth and evolution and
having the visual data to back up your decisions will set you apart from the
pack.
7. Periodic report: Improves policies, products or processes via consistent
monitoring at fixed intervals, such as weekly, monthly, quarterly, etc. These
types of reports help foster incremental growth as well as consistency across
the board.
8. KPI report: Monitors and measures key performance indicators (KPIs) to
assess if your operations deliver the expected results. The best dashboards for
benchmarking progress in a number of areas, both internal and external.
9. Yardstick report: Weighs several potential solutions for a given situation.
An invaluable tool that you can adapt to your specific goals, aims, needs, and
situations. A solution-centric tool that every modern business should
embrace.

2. Know Your Target Audience


Knowing your audience will help determine what data you present, the
recommendations you make and how you present the data. Your audience may be


upper, middle or line management, other departments in the company, coworkers,
the client, potential clients, the government, or another company in the same market.
Knowing your audience helps determine what type of information to include in the
report. If a report is internal facing, branding such as colors, font, and logo aren’t as
crucial. If it is a one-time live presentation, formatting for printing isn’t key.
Determine ahead of time if your audience needs persuasion or education. If your
audience is C-suite level or the board, you may want to present mostly high-level
data with specific call outs and action items.
If the report is more exploratory in nature, you may want to include more granular
data and options to interact with the data. Ramon Ray, tech evangelist and founder
of Smart Hustle Magazine, wrote about how to best present your data to a wide
audience. He focused on keeping text simple, using visualizations whenever possible,
including video and animation when appropriate, and making your
reports/presentations interactive. Knowing your audience before you start your
analysis – and even more importantly before you put together the report – will keep
your reports and data focused and impactful.

3. Have A Detailed Plan and Select Your KPIs


We are going to sound like a broken record here, but have a report plan before you
start your analysis. What information does the management need for its effective
decision making? What data and insights do your shareholders require? Understand
the scope of data required and think about how you will want to use that data.
Utilize as many data sources as possible. But don’t go data crazy and get bogged
down in unnecessary information. Of course, you have to remain agile and may have
to adapt the plan, but a robust plan is crucial. Remaining purpose-driven will focus
your work, save you time in the long run and improve your business reporting
outcomes.
When creating your plan, it is crucial to select the right key performance indicators
(KPIs). You don’t need dozens of metrics that will answer all your business
questions at once, but pick a few that will tell a comprehensive data story (more on
that later), and enable you to take proper action (more on that later, too).


Depending on your department or industry, reports will vary as KPIs also vary, but
choose the ones that will help you put your data into proper context and always keep
in mind the audience you’re addressing. If you understand your audience on a deep
level and set clear-cut strategic objectives, you will find the KPI selection process
easier and more valuable.
Choose KPIs that align directly with your specific business aims, and you will
benefit from a cohesive mix of visual benchmarks that will help you track your
progress accurately while spotting trends that will help you streamline your business
for success.

4. Be Objective, When Possible


A good business data report describes the past, present, or possible future situation
in an objective and neutral way. Objective means the report states facts, not an
opinion. Keep the opinions minimal. It helps to combine them in one section,
possibly titled “Suggested Actions.” Also, using a passive voice in a report will help
keep the report formal and objective. For example:
Active: The managers need to make changes in their management style.
Passive: Changes in management style need to be made.
If you’re too subjective or biased with any data report format, you’re essentially
moving away from your goal of uncovering factual insights that will give you a
competitive edge. Collect data from reliable sources, record your insights with
pinpoint accuracy, and you will connect with objective insights that will push your
business to the next level.

5. Be Visually Stunning
Numerous types of data visualization have proven to be extremely powerful.
Analytics presented visually make it easier for decision-makers to grasp difficult
concepts or identify new patterns. Data presented visually are easier for humans to
perceive and digest. Reports should include data visualizations over text whenever
possible. Just make sure you are choosing the most appropriate data visualization to
tell your data story and that you are following BI dashboard best practices. With the
right data reporting tool, anyone can create meaningful visuals and share them with


their team, customers, and other shareholders. All this can be accomplished without
involving a data scientist.
Also, make sure your report remains visually stunning, no matter how it is shared
and disseminated. Your report should look good on a computer, tablet, PDF, or even
on a mobile screen. That’s why utilizing a dashboard can be the most cost-effective
solution that will provide you with not only stunning visuals but interactivity as well.

6. Have Content Sharply Written


While the focus should be on visuals, some data report types also need text. Make
sure your reports use persuasive and even-toned business writing. Use concise,
active, and engaging language. Use bullet points versus long paragraphs. Use
headers and provide legends and supplementary text for your visualizations. Also,
you should always proofread!
To optimize your data analytics presentation and content, our guide to digital
dashboard creation and best practices offers practical insights that will help you
format your reports for success.

7. Make Sure the Report Is Actionable


Prescriptive, descriptive, and predictive analytics have become increasingly popular
in recent years. Each brings new insights needed to make better business decisions
and increase ROI – insights from the past, future, and prescribing possible outcomes.
That being said, make sure your report has a conclusion. When necessary, provide
recommendations.
Reports should be objective but the best ones are also actionable. Intended audiences
should walk away with the next steps or greater insights. By doing so, you will
enable a data-driven business environment and foster a more efficient collaboration.
To help make your data-centric reports more actionable, you must ensure that your
KPIs and insights work together to paint a comprehensive picture of a particular
process, strategy, or function.
For instance, if you’re looking to analyze your customer service success, adding
metrics relating to both staff performance and consumer satisfaction will give you a
balanced mix of insights that will help you take decisive action. Naturally, all of your


KPIs will offer invaluable standalone information, but if they all complement one
another, you will accelerate your business success in a number of key areas.

8. Keep It Simple and Don’t Be Misleading


While data should be objective, formatting, filtering, and manipulation can be easily
part of misleading statistics. Make sure you are being consistent and reliable with
your reporting. Also, keep it simple. The boom of data visualization and reporting
tools has led to the creation of visualizations that don’t tell a data story.
You shouldn’t need 3-D glasses to read a report. Sometimes, a simple chart is all you
need. You also don’t need to go nuts with colors and formats. You can easily
overwhelm your audience this way. Choose a couple of colors that are easy on the
eyes. Keep to one font. Don’t go crazy with highlighted, bold or italicized text. You
don’t have to create a “piece of art” for your report to be visually stunning and
impactful.
The key takeaway here is: Keep your eyes on the prize and always remember the
goal or primary objective when developing your reports. Remaining true to your
objectives while prioritizing making your dashboards universally accessible will
ensure you keep your reports simple, transparent, and accurate at all times.

9. Don’t Forget to Tell a Complete Story


To successfully report data, you must take into account the logic of your story. The
report should be able to provide a clear narrative that will not confuse the recipient
but enable him/her to derive the most important findings.
Consider creating a dashboard presentation. That way you will have your data on a
single screen with the possibility to interact with numerous charts and graphs while
your story will stay focused and effective. By utilizing interactive visualizations, you
not only have a strong backbone on how to build a data report but also ensure that
your audience is well-informed and digests data easily and quickly.
Human beings absorb and engage with narratives better than other formats. If you
tell a tale with your data, you will skyrocket your business success, improve your
chances of executive buy-in, and foster innovation across the organization.


Our definitive guide to dashboard presentation and storytelling will tell you all you
need to know to get started.

10. Use Professional Data Report Software


Last but not least, utilizing modern visual analytics software will ensure you design
your reports based on the decisions you need to make, filtering the ever-present noise
in reporting processes and making sure you don’t get lost in the details. Oftentimes,
reports are piled with large volumes of spreadsheets and presentation slides that can
create an obscure view of the presented data, and increase the possibility of
(unintentional) errors. The software can eliminate the tedious manual task of searching
through rows and columns and provide the necessary real-time view, alongside the
possibility to look into the past and anticipate how the data will behave in the future.
No matter if you’re an analyst working with databases and need a strong MySQL
reporting tool or a marketing professional looking to consolidate all your channels
under the same data-umbrella, the software will enable you to clear the clutter and
automate your reports based on your specific time intervals. They will update the
data automatically, and you will not need more than small refinements to make sure
the data you present is the one your audience needs.
We have expounded on the data report definition and seen the top 10 best practices to
create your own; now we will continue our focus on data report examples from
a few industries that present these practices in action.

Data Reports Examples and Templates


To put everything we’ve discussed so far into perspective, let’s move on to data
reporting examples. To be able to create reports that drive action and provide added
value to your company’s business efforts, here are some examples that put the
reporting creation and presentation in perspective. These examples are created with
the help of a professional dashboard designer that empowers everyone in the line of
business to build their own reports. Let’s start with the finance department.

1. Financial KPI dashboard


Finance is the beating heart of any business and creating a financial report is the basis
for sustainable development. Companies need to keep a close eye on how their
monetary operations perform and make sure their financial data is 100% accurate.
The example focuses on KPIs that are meticulously chosen to depict the general
financial health of a company. The displayed information is presented in a logical
order, connecting various financial KPIs that make a complete data story, without
the need to overcrowd the screen or complicate the report.

Primary KPIs:

• Working Capital
• Quick Ratio / Acid Test
• Cash Conversion Cycle
• Vendor Payment Error Rate
• Budget Variance
What data reporting does in this case is quite simple: presenting the most
important information in a clear financial narrative that will drive action. We can see


in this financial dashboard that the company managed to decrease the cash cycle, but
the vendor payment rate had a spike in September last year. It might make sense to
take action and see in more detail what happened so that the processes can be
adjusted accordingly.
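Several of the primary KPIs listed above reduce to simple arithmetic. The sketch below uses the standard textbook formulas with purely illustrative figures, not data from the dashboard itself:

```python
# Illustrative balance-sheet and budget figures (in thousands); real
# values would come from the company's ledger, not from this example.
current_assets = 500.0
inventory = 120.0
current_liabilities = 250.0
budget = 200.0
actual_spend = 230.0

# Working Capital: the short-term liquidity cushion.
working_capital = current_assets - current_liabilities

# Quick Ratio (acid test): liquidity with inventory excluded.
quick_ratio = (current_assets - inventory) / current_liabilities

# Budget Variance: deviation of actual spend from plan, as a percentage.
budget_variance_pct = (actual_spend - budget) / budget * 100

print(working_capital, round(quick_ratio, 2), round(budget_variance_pct, 1))
# → 250.0 1.52 15.0
```

Presenting such derived ratios directly on the dashboard spares the audience from doing the calculations themselves, in line with the audience principle discussed earlier.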

2. Retail KPI dashboard


Retailers must be extra careful in picking the right KPIs and presenting their data in
a clear order, without cluttering the report or confusing the people that need to read
it and act accordingly.

Primary KPIs:

• Back Order Rate
• Rate of Return
• Customer Retention
• Total Volume of Sales


A retail dashboard such as the one presented above focuses on the perspective of
orders, which is one of the crucial points in this cutthroat business.
Gaining access to these business touchpoints will equip you with the best possible
ingredients to stay competitive on the market. Utilizing KPIs such as the rate of
return (also by category), the customer retention rate, and the number of new and
returning customers will enable you to access in-depth information on your order
processes and ensure your actions stay focused on developing your business on a
sustainable level. For example, you can keep an eye on the rate of return and make
sure it stays as low as possible. That way, your costs will be significantly lower and,
ultimately, customers more satisfied.
The retail analytics processes don’t need to foster complex reports, but with an
example such as the one presented above, you can see that reporting with dynamic
visualizations empowers you to make better business decisions.

3. Talent management dashboard


The next of our data report examples is our HR dashboard focused on talent
management. Talent retention and development is an ongoing challenge for HR
managers. This data-centric reporting tool is designed to keep your top-performing
staff engaged and motivated on a consistent basis.


Primary KPIs:

• Talent Satisfaction
• Talent Rating
• Talent Turnover Rate
• Dismissal Rate
With a wealth of at-a-glance insights that are essential to successful talent
management strategies and HR KPIs focused on the likes of rising talent as well as
dismissal and turnover rates, this invaluable tool will prove vital to the health and
growth of your organization. Moreover, your HR analytics efforts will prove to
enhance hiring processes, enabling you to attract the best possible talent, automate
tasks, and create a satisfying workforce environment.
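The turnover and dismissal rates named above are commonly computed against average headcount. Here is a short sketch with hypothetical figures for one reporting year:

```python
# Hypothetical headcount figures for one reporting year.
employees_start = 120
employees_end = 110
leavers = 18      # all separations during the period
dismissals = 4    # involuntary separations only

average_headcount = (employees_start + employees_end) / 2  # 115.0

# Talent Turnover Rate: all separations relative to average headcount.
turnover_rate = leavers / average_headcount * 100

# Dismissal Rate: involuntary separations relative to average headcount.
dismissal_rate = dismissals / average_headcount * 100

print(round(turnover_rate, 1), round(dismissal_rate, 1))
# → 15.7 3.5
```

Tracking both rates side by side, as the dashboard does, separates healthy attrition from involuntary departures.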

4. Procurement quality dashboard


The procurement dashboard is designed to streamline and fortify the relationship
between you, your vendors, and your suppliers.


Primary KPIs:

• Supplier Quality Rating
• Vendor Rejection Rate & Costs
• Emergency Purchase Ratio
• Purchases In Time & Budget
• Spend Under Management
Cohesive procurement is vital to the financial and operational success of any modern
organization, regardless of industry or sector. This interactive procurement report
will help you quality-check your suppliers while digging deeper into metrics
surrounding emergency purchases, rejection rates, costs, budgetary constraints, and
more. A business-boosting tool that will form the backbone of your business.

5. Sales opportunity dashboard


Sales are integral to the success of most businesses. Our sales dashboard will help
you identify revenue-boosting sources with ease while prioritizing them in order of
prospective value.

Primary KPIs:

• Number of Sales Opportunities
• Sales Opportunity Score
• Average Purchase Value
This will allow you to streamline your sales strategy for maximum income,
efficiency, and sustainability. This visually-balanced performance dashboard is easy
to understand and will help you take direct action at the times when it matters most
– a priceless business intelligence tool for any forward-thinking organization.

Start Building Your Data Reports Now!


Now that you know how to build a data report, it’s time to embrace the power of
modern BI solutions and data analytics.


Reporting, analytics, and smart informational processing can have a
transformational impact on an organization if approached the right way.
Fortunately, the mind-numbing task of manually creating daily or weekly reports is
a thing of the past. With the right plan and proper business reporting software, you
can easily analyze your data and also create eye-catching and remarkable reports.
We are living in the age of information – a time where anything is possible. By
embracing data-centric reports and forming the right foundations, you will accelerate
the success of your organization in ways you never thought possible, pushing you
ahead of the pack in the process.

Introduction to Data Dashboards (Meaning,


Definition & Industry Examples)
Data is all around us, and it has changed our lives in many ways - including in the
fast-paced world of business. Digital data empowers organizations across sectors to
improve their processes, initiatives, and innovations using the power of insight. But
with so many stats, facts, and figures in today’s hyper-connected age, knowing which
information to work with can seem like a minefield.
To help you understand the business-boosting power of dashboards, we’re going to
explore a definitive data dashboard definition, explain the importance of dashboard
data, and look at a selection of real-world data dashboard examples.
To gain a working understanding of the concepts outlined in this guide, we must first
ask the question, “What is a data dashboard?” So, let’s start with an official
definition:

What Is a Data Dashboard?


A data dashboard is a tool that provides a centralized, interactive means of
monitoring, measuring, analyzing, and extracting relevant business insights from
different datasets in key areas while displaying information in an interactive,
intuitive, and visual way.


They offer users a comprehensive overview of their company’s various internal
departments, goals, initiatives, processes, or projects. These are measured through
key performance indicators (KPIs), which provide insights that help to foster growth
and improvement.

Online dashboards provide immediate navigable access to actionable analytics
that has the power to boost your bottom line through continual commercial
evolution.
To properly define dashboards, you need to consider the fact that, without the
existence of dashboards and dashboard reporting practices, businesses would
need to sift through colossal stacks of unstructured data, which is both inefficient
and time-consuming. Alternatively, a business would have to ‘shoot in the dark’
concerning its most critical processes, projects, and internal insights, which is far
from ideal in today’s world.
Now that you understand a clearly defined dashboard meaning, let’s move onto one
of the primary functions of data dashboards: answering critical business questions.

What Is the Purpose of a Data Dashboard?


As mentioned earlier, a data dashboard has the ability to answer a host of business-
related questions based on your specific goals, aims, and strategies.
By taking raw data from a number of sources and consolidating it before presenting
it in a tailored, customized visual way, data dashboards can help make sense of your
company’s most valuable data and empower you to find actionable answers to your
most burning business questions.
Through linking with specific KPIs that align with your business goals, you can drill
down into specific pockets of information, creating benchmarks, and measuring your
success on a continual basis.
In doing so, your business will be data-driven, and as a direct result – more
successful. To find out more about dashboards and key performance indicators,
explore our ever-expanding collection of various business-boosting KPI
examples and templates.


How to Design a Dashboard - 20 Dashboard Design


Principles & Best Practices to Enhance Your Data
Analysis
In the digital age, there’s little need for a department of IT technicians, plus a
qualified graphic designer, to create a dazzling data dashboard. However, if you want
to enjoy optimal success, gaining a firm grasp of logical judgment and strategic
thinking is essential – especially regarding dashboard design principles.
At this point, you have already tackled the biggest chunk of the work – collecting
data, cleaning it, consolidating different data sources, and creating a mix of useful
metrics. Now, it’s time for the fun part.
Here, you can get carried away by your creativity and design a pretty, dazzling,
colorful dashboard. To take a look at 80+ great designs that will inspire you, we
suggest you check out our live dashboard page, where we created a selection of
real-time visuals based on industry, function, and platform.
Unfortunately, you can’t play around with designs like the next Picasso. There are
certain dashboard design best practices you should follow to display your data in the
best way, making it easy to analyze and actionable.

Your business dashboard should be user-friendly and constitute a basic aid in
the decision-making process. To help you on your journey to data-driven success,
we’ll delve into 20 dashboard design principles that will ensure you develop the most
comprehensive dashboard for your personal business needs.


These 20 definitive dashboard design best practices will bestow you with all of the
knowledge you need to create striking, results-driven data dashboards on a
sustainable basis.
Dashboard design principles are most effective as part of a structured process. Here,
we’ll go over these dashboard design guidelines to ensure you don’t miss out on any
vital steps.

1. Consider your audience


Concerning dashboard best practices in design, your audience is one of the most
important principles you have to take into account. You need to know who's going
to use the dashboard.
To do so successfully, you need to put yourself in your audience’s shoes. The context
and device on which users will regularly access their dashboards will have direct
consequences on the style in which the information is displayed. Will the dashboard
be viewed on-the-go, in silence at the office desk or will it be displayed as a
presentation in front of a large audience?
That said, you should never lose sight of the purpose of designing a dashboard. You
do it because you want to present data in a clear and approachable way that facilitates
the decision-making process for a specific audience in mind. If the audience is more
traditional, we suggest you adhere to a less 'fancy' design and find something that
would resonate better.


Additionally, if you make the charts look too complex, the users will spend even
more time on data analysis than they would without the dashboard. Data analysis
displayed on a dashboard should provide additional value. For example, a user
shouldn’t need to do further calculations on their own to get to the information
they are looking for, because everything they need will be clearly displayed on the
charts. Always try to put yourself in the audience's position.

As we can see in the example above, a sales dashboard provides the audience with data
at their fingertips, which is mostly of interest to high-level executives and VPs.
Keep in mind: what data will the user be looking for? What information would help
him/her to better understand the current situation? If you have two relative values,
why not add a ratio to show either an evolution or a proportion, to make it even
clearer? An important point is also to add the possibility for the user to compare your
number with a previous period. You can’t expect all users to remember what were
the results for last year’s sales, or last quarter’s retention rate. Adding an evolution
ratio and a trend indicator, will add a lot of value to your metrics, whether logistics
KPIs or procurement and make the audience like you.


2. Don’t try to place all the information on the same page


The next in our rundown of dashboard design tips is a question of information. This
most golden of dashboard design principles refers to both precision and the right
audience targeting.
That said, you should never create one-size-fits-all dashboards or cram all
the information onto the same page. Think about your audience as a group of
individuals who have different needs – a sales manager doesn’t need to see the same
data as a marketing specialist, HR department, or professionals in logistics
analytics. If you really want to put all the data on a single dashboard, you can use
tabs to split the information per theme or subject, making it easier for users to find
information. For example, you can split a marketing dashboard into sections
referring to different parts of the website like product pages, blog, terms of use, etc.
However, instead of using different tabs, filters, selectors, and drill-down lists and
making the user endlessly click around, it’s better to simply create one dashboard
for each job position. Dashboard creator software will help you to do just
that.
This may sound like a lot of work, but it’s actually easier than trying to cram all of
the data that could be of interest to everyone onto a single display. When each role
is provided with its own dashboard, the need for filters, tabs, selectors, extensive
drill-downs is minimized, and it becomes much easier to instantly find
a significant piece of information.

3. Choose relevant KPIs

For a truly effective KPI dashboard design, selecting the right key performance
indicators (KPIs) for your business needs is a must.
Your KPIs will help to shape the direction of your dashboards as these metrics will
display visual representations of relevant insights based on specific areas of the
business.
Once you’ve determined your ultimate goals and considered your target audience,
you will be able to select the best KPIs to feature in your dashboard.


To help you with your decision, we have selected over 250 KPI examples in our
rich library for the most important functions within a business, industry, and
platform. One example comes from the retail industry:

This retail KPI shows the total volume of sales and the average basket size during
a period of time. The metric is extremely important for retailers to identify when the
demand for their products or services is higher or lower. That way it is much
easier to recognize areas that aren't performing well and adjust accordingly (create
promotions, A/B testing, discounts, etc.).

4. Select the right type of dashboard


Remember to build responsive dashboards that will fit all types of screens, whether
it’s a smartphone, a PC, or a tablet, but we will cover this in more detail later. If your
dashboard will be displayed as a presentation or printed, make sure it’s possible to
contain all key information within one page.

For reference, here are the 4 primary types of dashboards for each main branch
business-based activity:


Strategic: A dashboard focused on monitoring long-term company strategies by
analyzing and benchmarking a wide range of critical trend-based information.
Operational: A business intelligence tool that exists to monitor, measure, and
manage processes or operations with a shorter or more immediate time scale.
Analytical: These particular dashboards contain large streams of comprehensive
data that allow analysts to drill down and extract insights to help the company to
progress at an executive level.
Tactical: These information-rich dashboards are best suited to mid-management and
help in formulating growth strategies based on trends, strengths, and weaknesses
across departments, such as in the example below:

Each dashboard should be designed for a particular user group with the specific aim
of assisting recipients in the business decision-making process. Information is
valuable only when it is directly actionable. The receiving user must be able to
employ the information in their own business strategies and goals. As a dashboard
designer who uses only the best dashboard design principles, make sure you are


able to identify the key information and separate it from the inessential to
enhance users’ productivity.

5. Provide context
Without providing context, how will you know whether those numbers are good or
bad, or if they are typical or unusual? Without comparison values, numbers on a
dashboard are meaningless for the users. And more importantly, they won’t know
whether any action is required. For example, a management dashboard design will
focus on high-level metrics that are easy to compare and, subsequently, offer a visual
story.
Always try to provide maximum information; even if some of it seems obvious to
you, your audience might find it perplexing. Name all the axes and add titles to
all charts. Remember to provide comparison values. The rule of thumb here is to use
comparisons that are most common, for example, comparison against a set target,
against a preceding period or against a projected value. This is an effective
dashboard design tip that you should always consider.

6. Use the right type of chart

We can’t stress enough the importance of choosing the right data visualization
types. You can destroy all of your efforts with a missing or incorrect chart type. It’s
important to understand what type of information you want to convey and choose a
data visualization that is suited to the task.
Line charts are great when it comes to displaying patterns of change across a
continuum. They are compact, clear, and precise. The line chart format is common
and familiar to most people, so the charts can easily be analyzed at a glance.
Choose bar charts if you want to quickly compare items in the same category, for
example, page views by country. Again such charts are easy to understand, clear and
compact.
Pie charts aren’t the perfect choice. They rank low in precision because users find
it difficult to accurately compare the sizes of the pie slices. Although such charts can
be instantly scanned and users will notice the biggest slice immediately, there can


be a problem in terms of scale resulting in the smallest slices being so small that they
even cannot be displayed.
Sparklines usually don’t have a scale which means that users will not be able to
notice individual values. However, they work well when you have a lot of metrics
and you want to show only the trends. They are rapidly scannable and very compact.
It’s also not that easy to decipher scatterplots. They rank lower in precision and
clarity, and the relationships between two quantitative measures that they show
don’t change very frequently, which limits their value on a dashboard. Still, they
can be used in an interactive presentation for knowledgeable users.
Most experts agree that bubble charts are not fit for dashboards. They require too
much mental effort from their users even when it comes to reading simple
information in a context. Due to their lack of precision and clarity, they are not very
common and users are not familiar with them.
In summary, dashboard-centric charts and visualizations fall into four primary
categories: relationship, distribution, composition, and comparison.
Depending on what you want to communicate or show, there is a chart type to suit
your goals. Placing your aims into one of the 4 primary categories above will help
you make an informed decision on chart type:
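The four goal categories above can be captured in a small, shared lookup that a team might agree on. The mapping below is only an illustrative summary of the guidance in this section, not an exhaustive rule set:

```python
# Illustrative mapping from communication goal to chart types,
# summarizing the four categories discussed above
CHART_GUIDE = {
    "comparison":   ["bar chart", "line chart"],        # compare items / change over time
    "relationship": ["scatterplot"],                    # two quantitative measures
    "distribution": ["histogram", "scatterplot"],       # spread of values
    "composition":  ["stacked bar chart", "pie chart"], # parts of a whole (use pies sparingly)
}

def suggest_chart(goal):
    """Return chart-type suggestions for a given goal category."""
    return CHART_GUIDE.get(goal.lower(), ["line chart"])  # default: a trend view

print(suggest_chart("Comparison"))  # ['bar chart', 'line chart']
```

Encoding conventions like this keeps chart choices consistent across dashboards built by different people.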


7. Choose your layout carefully


Dashboard best practices in design concern more than just good metrics and well-
thought-out charts. The next step is the placement of charts on a dashboard. If your
dashboard is visually organized, users will easily find the information they need.
Poor layout forces users to think more before they grasp the point, and nobody likes
to look for data in a jungle of charts and numbers. The general rule is that the key
information should be displayed first – at the top of the screen, upper left-hand
corner. There is some scientific wisdom behind this placement – most cultures read
their written language from left to right and top to bottom, which means that people
intuitively look at the upper-left part of a page first, no matter if you're developing
an enterprise dashboard design or a smaller-scale one within a department – the rule
is the same.
Another useful dashboard layout principle is to start with the big picture. The major
trend should be visible at a glance. After this revealing first overview, you can
proceed with more detailed charts. Remember to group the charts by theme with the


comparable metrics placed next to each other. This way, users don’t have to change
their mental gears while looking at the dashboard by, for example, jumping from
sales data to marketing data, and then again to sales data. This analytics dashboard
best practice will enable you to present your data in the way that is most meaningful
and clear to the end user.

8. Prioritize simplicity
One of the best practices for dashboard design focuses on simplicity. Nowadays, we
can play with a lot of options in the chart creation and it’s tempting to use them all
at once. However, try to use those frills sparingly. Frames, backgrounds, effects,
gridlines… Yes, these options might be useful sometimes, but only when there is a
reason for applying them.
Moreover, be careful with your labels or legend and pay attention to the font, size,
and color. It shouldn’t hide your chart, but also be big enough to be readable. Don’t
waste space on useless decorations, like for example a lot of pictures. You can check
out our example on how to create a market research report, where we focused
on simplicity and the most important findings presented on 3 different reporting
dashboard designs.
Additionally, applying shadows can be quite effective, since it highlights some areas
of the dashboard and gives more depth. Since the point is to keep it simple, don't
overdo it; use it only when you really need it. Designing a dashboard should be a
well-thought-out process, but the end user should see a simple data story with the
main points highlighted, and those points should be immediately clear. If this is not
respected, more
questions will arise about the dashboard itself rather than discussing the points that
you're trying to make and the story you're trying to present. This leads us to our next
point.

9. Round your numbers


Continuing on simplicity, rounding the numbers on your dashboard should also be
one of your priorities, since you don't want your audience to be flooded with
numerous decimal places. Yes, you want to present details but, sometimes, too many
details give the wrong impression. If your conversion rate runs to 5 or more decimal
places, it makes sense to round the number and avoid too many


number-specific factors. Or, if you want to present your revenue, you don't need to
do so by going into cents. $850K looks simpler and more visually effective than
$850,010.25. This is especially true if you want to implement executive dashboard
best practices, where strategic information doesn't need to represent every
operational detail of a certain number.
The latter may exaggerate minor elements, in this case, cents, which, for an effective
data story, isn't really necessary in your dashboard design process.
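Rounding of this kind is easy to automate. A minimal Python sketch (the thresholds and K/M/B suffixes follow a common convention, not a fixed rule):

```python
def humanize(value):
    """Round a KPI to a compact, dashboard-friendly label."""
    for threshold, suffix in ((1_000_000_000, "B"), (1_000_000, "M"), (1_000, "K")):
        if abs(value) >= threshold:
            return f"{value / threshold:.0f}{suffix}"
    return f"{value:.0f}"

print(humanize(850_010.25))  # 850K
print(humanize(2_000_000))   # 2M
```

For an executive view, `humanize(850_010.25)` yields the "850K" label discussed above instead of a cent-level figure.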

10. Be careful with colors - choose a few and stick to them


Without a shadow of a doubt, this is one of the most important of all dashboard
design best practices.
This particular point may seem incongruous with what we have said up to now, but
there are options to personalize and customize your creations to your preferences.
The interactive nature of data dashboards means that you can let go of PowerPoint-
style presentations from the 90s. The modern dashboard is minimalist and clean. Flat
design is really trendy nowadays.
Now, when it comes to color, you can choose to stay true to your company identity
(same colors, logo, fonts) or go for a totally different color palette. The important
thing here is to stay consistent and not use too many different colors – an essential
consideration when learning how to design a dashboard.

You can choose two to three colors, and then play with gradients. A common
mistake is using highly saturated colors too frequently. Intense colors can instantly
draw users’ attention to a certain piece of data, but if a dashboard contains only
highly saturated colors, users may feel overwhelmed and lost – they wouldn’t know
what to look at first. It’s always better to tone most colors down. Dashboard design
best practices always stress consistency when it comes to your choice of colors.
With this in mind, you should use the same color for matching items across all charts.
Doing so will minimize the mental effort required from a users’ perspective, making
dashboards more comprehensible as a result. Moreover, if you’re looking to display
items in a sequence or a group, you shouldn’t aim for random colors: if a
relationship between categories exists (e.g., lead progression, grade levels, etc.),


you should use the same color for all items, graduating the saturation for easy
identification.
Thanks to this, your users will only have to note that higher-intensity colors
symbolize variable displays of a particular quality, item, or element, which is far
easier than memorizing multiple sets of random colors. Again, creating a dashboard
that users can understand at a glance is your main aim here.
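Graduating the saturation of a single hue, as described above, can be sketched with Python's standard colorsys module. The hue and brightness values here are arbitrary examples:

```python
import colorsys

def saturation_ramp(hue, steps):
    """Build a sequence of hex colors of one hue with rising saturation."""
    colors = []
    for i in range(1, steps + 1):
        sat = i / steps  # 1/steps .. 1.0, pale to intense
        r, g, b = colorsys.hsv_to_rgb(hue, sat, 0.9)
        colors.append("#{:02x}{:02x}{:02x}".format(
            int(r * 255), int(g * 255), int(b * 255)))
    return colors

# Four shades of one blue-ish hue, e.g. for lead-progression stages
print(saturation_ramp(0.6, 4))
```

Higher-intensity shades then signal "more" of the same quality, rather than forcing users to memorize unrelated colors.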

In the example above, manufacturing analytics are presented in a neat
production dashboard, where a 'dark' theme was chosen after careful consideration
of a few colors.
Our final suggestion concerning colors is to be mindful when using “traffic light”
colors. For most people, red means “stop” or “bad” and green represents “good” or
“go.” This distinction can prove very useful when designing dashboards – but only
when you use these colors accordingly.

11. Don’t go over the top with real-time data


The next entry on our list of good dashboard design tips refers to insight: don’t
overuse real-time data. In some cases, information displayed in too much detail only serves to
lead to distraction. Unless you’re tracking some live results, most dashboards don’t
need to be updated continually. Real-time data serves to paint a picture of a general
situation or a trend. Most project management dashboards must only be
updated periodically – on a weekly, daily, or hourly basis. After all, it is the right
data that counts the most.
Moreover, you can implement smart alarms so that the dashboard itself notifies you
if any business anomalies occur. That way, your refresh interval, and intelligent
alarms will work hand-in-hand, making them one of the dashboard design guidelines
that will ensure you save countless working hours.
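A "smart alarm" of the kind mentioned above can be as simple as a threshold check run at each refresh interval. The metric names and bounds below are invented for illustration:

```python
# Hypothetical alarm rules: metric -> (lower bound, upper bound)
ALARM_RULES = {
    "bounce_rate": (None, 0.60),      # alert if above 60 %
    "daily_revenue": (10_000, None),  # alert if below 10K
}

def check_alarms(metrics):
    """Return alert messages for metrics breaching their bounds."""
    alerts = []
    for name, value in metrics.items():
        low, high = ALARM_RULES.get(name, (None, None))
        if low is not None and value < low:
            alerts.append(f"{name} below {low}: {value}")
        if high is not None and value > high:
            alerts.append(f"{name} above {high}: {value}")
    return alerts

print(check_alarms({"bounce_rate": 0.72, "daily_revenue": 12_500}))
# ['bounce_rate above 0.6: 0.72']
```

Running a check like this at each periodic refresh means the dashboard only demands attention when something is actually anomalous.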

12. Be consistent with labeling and data formatting


Number 12 on our list of tips on how to design a dashboard is focused on clarity and
consistency. Above all else, in terms of functionality, the main aim of a data
dashboard is gaining the ability to extract important insights at a swift glance. It’s
critical to make sure that your labeling and formatting is consistent across KPIs,
tools, and metrics. If your formatting or labeling for related metrics or KPIs is wildly
different, it will cause confusion, slow down your data analysis activities, and
increase your chances of making mistakes. Being 100% consistent across the board
is paramount to designing dashboards that work.
We will go into more detail with white labeling and embedding in some other points,
but here it's important to keep in mind that the dashboard design methodology should
be detailed and well-prepared in order to generate the most effective visuals. That
includes clear formatting and labeling.

13. Use interactive elements


Any comprehensive dashboard worth its salt will allow you to dig deep into certain
trends, metrics, or insights with ease. When considering what makes a good
dashboard, factoring drill-downs, click-to-filter, and time interval widgets into your
design is vital.
Drill-down is a smart interactive feature that allows the user to drill down into more
comprehensive dashboard information related to a particular element, variable, or


key performance indicator without overcrowding the overall design. They are neat,
interactive, and give you the choice of viewing or hiding key insights when you want
rather than wading through muddied piles of digital information:

Another interactive element, crucial in dissecting data, is the click-to-filter option.
This feature enables users to utilize the dimensions of the charts and graphs within
a dashboard as temporary filter values. In practice, that means that this filter will
apply data to the whole dashboard just by clicking on a specific place of interest, like
in the example below:

This example shows how we filtered data just for Australia, for the month of
February.
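Conceptually, a click-to-filter action just applies the clicked dimension values to every record feeding the dashboard. A minimal sketch over plain Python records (the field names and values are invented for illustration):

```python
# Toy sales records feeding a hypothetical dashboard
records = [
    {"country": "Australia", "month": "February", "sales": 120},
    {"country": "Australia", "month": "March",    "sales": 95},
    {"country": "Germany",   "month": "February", "sales": 210},
]

def click_to_filter(rows, **clicked):
    """Keep only rows matching every clicked dimension value."""
    return [r for r in rows
            if all(r.get(dim) == val for dim, val in clicked.items())]

# Simulate clicking on 'Australia' and then 'February'
view = click_to_filter(records, country="Australia", month="February")
print(view)  # [{'country': 'Australia', 'month': 'February', 'sales': 120}]
```

Every chart on the dashboard would then be redrawn from the filtered `view` instead of the full record set.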


Looking at data over time is another crucial element to consider when designing a
dashboard. The time interval widget will enable you to do just that. It's a neat feature
that allows you to adjust the time scale of individual charts, meaning you can easily
look at your data across days, weeks, months, or years, as in the following
example:

These elements are of the utmost importance in dashboard design since they help to
keep the dashboard unburdened by too many elements, while the interactivity
ensures that all the needed data is still available on demand.

14. Additionally, use animation options


Animation options are dashboard elements that add a neat visual touch: you select
the appearance of a specific element on the dashboard and assign it an animation
option. The result is a simple yet effective automated movement based on the desired
speed (e.g., slow, medium, or fast) and type, such as linear, swing, ease-in, or
ease-out.


Moreover, modern dashboard tools include this feature since it gives you an
additional way to catch the attention of the viewer. In essence, each time you open
or refresh a dashboard tab, the animation will trigger and start. Simple.

15. Double up your margins


One of the most subtle yet essential dashboard guidelines, this principle boils down
to balance. White space – also referred to as negative space – is the area of blankness
between elements featured on a dashboard design.
Users aren’t typically aware of the pivotal role that space plays in visual
composition, but designers pay a great deal of attention to it because when metrics,
stats, and insights are unbalanced, they are difficult to digest. You should always
double the margins surrounding the main elements of your dashboard to ensure each
is framed with a balanced area of white space, making the information easier
to absorb.

16. Optimize for multiple devices


Optimization for mobile or tablet is another critical point in the dashboard
development process. By offering remote access to your most important insights,
you can answer critical business questions on-the-go, without the need for a special
office meeting. Benefits such as swift decision-making and instant access ensure
everyone has the possibility to look at the data on-the-fly.


Here it makes sense to keep in mind that the mobile dashboard layout is not the same
as on desktop. A mobile dashboard has a smaller screen and, therefore, the placement
of the elements will differ. Additionally, the level of analysis in comparison to the
desktop version will not be as deep since this kind of dashboard needs to focus on
the most critical visuals that fit the screen, oftentimes high-level.
To create such a design, we suggest you trim all the surplus that is not relevant and
test across devices. Additionally, keep in mind that the dashboard design process
should also account for 'bigger fingers': not everyone has small hands, and
buttons should be well optimized for hands of all shapes and sizes. Moreover, and we
can't stress this enough, keep only the most important metrics and information on
the screen, so that they're easily scannable and immediately visible.

17. Consider the use in terms of exports vs. digital


In the process of dashboard designing, you also need to think about exports. You can
use the dashboard itself and share it, but if you plan on regularly using exports, you
might want to consider optimizing for printing bounds, fewer colors, and
different types of line styles to make sure everything is readable even on a black-
and-white printout. Hence, when you plan your data dashboard design, you also need
to look into the future uses and how to optimize towards different exporting options
or simply sharing the dashboard itself with all its features and options.
Additionally, by assigning a viewer area, you can specify the number of features you
openly allow, including the number of filters, and all the bits and details of specific
permissions. That way, you have full control over your digital presentation and the
amount of analysis you want to share. In this digital case, you don't need to take into
account print, but it would certainly help if you ever want to create one.

18. White label and embed if you need to


Another critical point when considering your workflow dashboard design is the
opportunity to white label and embed the dashboard into your own application or
intranet, for example. With these options in mind, you can use your own
company's logo, color styles, and overall brand requirements and completely adjust
the dashboard as if it were your own product. Embedded business intelligence
ensures that access to the analytical processes and data manipulation


is done completely within users' existing systems and applications. Many users
prefer this option, so when you consider what kind of dashboard techniques you
want to implement in your design, embedding and white labeling are two more
options you need to take into account.

An embedded dashboard will look like your own product, as mentioned, and the
point is that you don't need to invest in the development process at all; you simply
take over a product and use it as your own. When you're in the process of creating
your business dashboard design, these features should be taken into account as well.

19. Avoid common data visualization mistakes


Data visualization has evolved from simple static presentations to modern interactive
software that takes visual perception to the next level. It has also enabled
average business users and advanced analysts to create stunning visuals that tell a
clear data story to any potential audience profile, from beginners in a field to
seasoned analysts and strategists.
But this positive development has also brought some negative side effects, such as
the mistakes you can see across various media. Online data visualization
is not just about creating visuals for the sake of it; the visuals need to be clear and
communicate effectively. That said, avoid these common mistakes:
Failed calculations: The numbers should add up to 100%. For example, if you conduct
a survey where people have the option to choose more than one answer, you will
probably need some other form of visual than a pie chart, since the numbers won't
add up and the viewers might get confused.
The wrong choice of visualizations: We have mentioned how important it is to
choose the right type of chart and dashboard, so if you want to present a relationship
between the data, a scatter plot might be the best solution.
Too much data: Another point you need to keep in mind, and we have discussed in
detail, don't put too much data on a single chart because the viewer will not recognize
the point.
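The "failed calculations" pitfall above can be caught with a quick sanity check before committing to a pie chart. A small sketch (the tolerance value is an arbitrary choice):

```python
def pie_chart_safe(percentages, tolerance=0.5):
    """True if the slices sum to ~100 %, i.e. a pie chart is meaningful."""
    return abs(sum(percentages) - 100) <= tolerance

# Single-answer survey: slices are mutually exclusive
print(pie_chart_safe([40, 35, 25]))  # True
# Multi-answer survey: respondents picked several options
print(pie_chart_safe([60, 55, 30]))  # False
```

When the check fails, a bar chart of the individual options is usually the safer choice.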


Besides, you can also familiarize yourself with general design mistakes that you can
avoid by following the rules of simplicity and color theory, no matter if you
need to create an executive or an operational dashboard design.

20. Never stop evolving


Last but certainly not least in our collection of principles of effective dashboards –
the ability to tweak and evolve your designs in response to the changes around you
will ensure ongoing analytical success.
When designing dashboards, asking for feedback is essential. By requesting regular
input from your team and asking the right questions, you’ll be able to improve the
layout, functionality, look, feel, and balance of KPIs to ensure optimum value at all
times. Asking for feedback on a regular basis will ensure that both you and the
customer (or team) are on the same page. As we mentioned many times, your
audience is your number one consideration, and you need to know how to adjust the
visuals to generate value.

For example, if you need to present an HR dashboard, it would make sense to
ask the team, executives, or relevant stakeholders to provide you with feedback on
the dashboard, whether it's focused on employee performance, recruiting, or talent
management. That way, you can be sure to respect the best practices for dashboard
design and deliver outstanding visuals.
The digital world is ever-evolving. Change is constant, and the principles of effective
dashboards are dictated by a willingness to improve and enhance your design efforts
continuously. A failure to do so will only hinder the success of your efforts.
So, never stop evolving.
“There are three responses to a piece of design – yes, no, and WOW! Wow is
the one to aim for.” – Milton Glaser, world-renowned graphic designer
So, what makes a good dashboard? An effective data dashboard should be striking
yet visually balanced, savvy yet straightforward, accessible, user-friendly, and
tailored to your goals as well as your audience. All of the above dashboard design
tips form a water-tight process that will help you produce visualizations that will
enhance your data analysis efforts exponentially. Moreover, dashboard design
should be the cherry on top of your business intelligence (BI) project.


Every dashboard you create should exist for a focused user group with the specific
aim of helping users tap into business decision-making processes and transform
digital insights into positive strategic actions.
Information is only valuable when it is directly actionable. Based on this principle,
it’s critical that the end-user can employ the information served up by a dashboard
to enhance their personal goals, roles, and activities within the business.
By only using the best and most balanced dashboard design principles, you’ll ensure
that everyone within your organization can identify key information with ease, which
will accelerate the growth, development, and evolution of your business. That means
a bigger audience, a greater reach, and more profits – the key ingredients of a
successful business. So, if you're wondering how many steps to follow when
creating an effective dashboard, the answer lies within this article. Stick
to these 20 steps, and your dashboards will impress your audience, but also make
your data analysis life much easier.

Top 20 Data Dashboard Examples & Templates


Now it’s time to delve deeper into 20 prime real-world digital dashboards tailored
for a host of different sectors and industries.

For a more detailed glance, you can check out 80 or more business dashboard
examples suited to an even wider range of business functions (marketing, sales,
finance, management, etc.) and industries (healthcare, retail, logistics,
manufacturing, etc.).
For now, let’s explore our personal 20. Prepare to be inspired...

1) Management KPI Dashboard

Our first data dashboard template is a management dashboard. It is a good
example of a “higher-level” dashboard for a C-level executive. You’ll notice that this
example is focused on important management KPIs like:

• Number of new customers compared to targets
• The average revenue per customer


• Customer acquisition cost
• Gross revenue, target revenue, and last year’s revenue

The example above does a great job of keeping things focused and avoids clutter
through the liberal use of white space.

2) Financial KPI Dashboard


Maintaining your financial health and efficiency is pivotal to the continual growth
and evolution of your business. Without keeping your fiscal processes water-tight,
your organization will essentially become a leaky tap, draining your budgets dry
right under your nose.

As one of our most widely used financial data analytics tools, our financial
KPI dashboard will tell you all you need to know about your business’s ongoing
fiscal health by offering digestible visualizations on every key financial area of the
business.
Here, you can analyze cash conversion cycle efficiency and vendor payment error
trends while examining budget variance trends and breaking down existing assets as
well as liabilities.
Through careful monitoring of this most insightful of examples, you will gain the
power to spot emerging trends, troubleshoot any financial issues before they harm
the business, and enhance your supplier relationships.


When you can access your working capital calculations ‘at a glance,’ monitoring
your company’s overall financial health on any given day, week, or month will
become a swift, simple process, giving you the time and space to develop initiatives
to grow the business.

3) Sales Cycle Length Dashboard

The sales dashboard below is a sales manager’s dream. This dashboard example
breaks down how long customers are taking to move through your funnel, on
average. It expands on this by showing how different sales managers are performing
compared to one another.

Finally, it makes things even more nitty-gritty by breaking down these sales KPIs
into how many people are at each stage of the funnel and how long each stage lasts
on average.

4) IT Project Management Dashboard


In many ways, your IT department is the backbone of your business - it’s what keeps
your business operational.
Most IT operatives are overstretched and under pressure, but by working with the
right data visualizations, you can give your department the tools to become smarter,
more efficient, and more impactful.
Our IT project management dashboard offers a wealth of at-a-glance insights geared
towards reducing project turnaround times while enhancing efficiency in key areas,
including total tickets vs. open tickets, projects delivered on budget, and average
handling times.
All KPIs complement each other, shedding light on real-time workloads, overdue
tasks, budgetary trends, upcoming deadlines, and more. This perfectly balanced mix
of metrics will help everyone within your IT department balance their personal
workloads while gaining the insight required to make strategic enhancements and
deliver projects with consistent success.
By consolidating every key piece of information in one central place, every member
of your IT team can get on the same page, working cohesively to resolve emerging


issues with maximum efficiency while helping one another tackle the various stages
of every vital IT project with complete confidence.

5) Procurement Data Analytics Dashboard


Procurement, a strategic function that connects the needs of a company with
contractors, suppliers, freelancers, or agencies, for example, serves as a critical
component that requires up-to-date data, easy monitoring processes, and advanced
automation capabilities that will optimize the procurement department and save
countless working hours.
It's imperative to track supplier management processes in the most efficient way;
otherwise, companies risk losing precious information and facing difficulties in
short-term as well as long-term business operations. That's why a procurement
dashboard helps professionals to easily understand large volumes of data and
extract meaningful insights within minutes.

This example focuses on supplier delivery management and stats about supplier
performance. At the top, we can see a quick overview of the defect rate, on-time


supplies, supplier availability, and lead time in days. The defect rate should be as
low as possible, as this is one of the procurement KPIs that measures how many
products received are not compliant with product specifications but also don't meet
quality standards. After all, the final quality of the product is an essential indicator
of the suppliers' reliability since higher amounts of defects would mean more
bottlenecks in the entire process.
The details of each supplier's defect rate and defect type are visualized on the bottom
left, where you can see how well the top 6 suppliers have been performing. Supplier
number 1 is considered the most reliable, since its defect rate is quite low and its
defect type has no impact on the product itself, pointing instead to issues in delivery
processes or payments.
You can also compare the delivery times between different suppliers and see whether there are additional possibilities to negotiate faster deliveries. In this case, you can
see that supplier number 4 has never been late, although it has fewer early deliveries
in comparison to supplier numbers 1 and 2.
With the help of detailed monitoring processes, each industry professional or
manager can build and automate a comprehensive procurement report that will
optimize the supplier delivery mechanisms and increase the productivity levels of
everyone involved.

6) Web Analytics Dashboard

If you use digital marketing heavily in your business, this marketing


dashboard below will soon become your best friend. Focused on web analytics,
this example will provide you with substantial visualizations that you can dig deeper
into and examine all web-related marketing data.

This data dashboard template gives you a number of useful marketing KPIs at
a glance, giving you the answers to questions like:

• How many people are visiting your website?


• How many pages are they looking at?
• How long are they staying?
• How many people are converting?
This example goes even further by breaking down your top converting channels,
campaigns, and pages, and showing how much of your traffic is coming from each
channel.
This kind of information makes it really easy to know where you should prioritize
your time and energy.
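For reference, the conversion figures behind questions like these are typically computed as converting sessions divided by total sessions, broken down per channel. A minimal sketch with hypothetical traffic data (channel names and numbers are invented for illustration):

```python
def conversion_rate(conversions, sessions):
    """Conversion rate = converting sessions / total sessions."""
    return conversions / sessions

# Hypothetical monthly figures per channel: (sessions, conversions)
channels = {
    "Organic search": (12_000, 240),
    "Paid ads": (4_000, 120),
    "Email": (1_500, 75),
}

for channel, (sessions, conversions) in channels.items():
    # e.g. Organic search -> 240 / 12,000 = 2.0%
    print(f"{channel}: {conversion_rate(conversions, sessions):.1%}")
```

Comparing these rates side by side is exactly what tells you which channel deserves more of your time and budget.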

7) Human Resources Data Dashboard


HR is becoming more data-driven and modern software solutions have entered the
human resources realm in order to make recruiting and talent management

operations more effective and focused on unbiased quality. This matters especially today, when attracting and keeping the right talent has become a difficult challenge for companies faced with scarce talent and a high level of competition, particularly in the IT industry. In fact, 43% of human resources professionals say they struggle with
the hiring process because of competition from other employers. Even when they
hire the right candidate, there is often a scenario where talents leave because of low
satisfaction levels or a better offer from the competition.

A modern HR dashboard, focused on talent management, as we can see above,


helps professionals in identifying issues and potential negative impacts on the
business by analyzing important metrics focused on talents. For example, the talent
turnover rate is an HR KPI that will show you which department has difficulties in
retaining top talent. In our example above, you can see that the financial department
deals with the highest voluntary turnover so it might make sense to examine why.
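For reference, the talent turnover rate mentioned above is commonly calculated as the number of separations in a period divided by the average headcount over that period. A quick sketch with hypothetical quarterly figures (the departments and counts are invented for illustration):

```python
def turnover_rate(separations, avg_headcount):
    """Turnover rate = separations in the period / average headcount."""
    return separations / avg_headcount

# Hypothetical quarterly figures: (voluntary leavers, average headcount)
departments = {
    "Finance": (9, 60),
    "Marketing": (3, 45),
    "IT": (5, 80),
}

for dept, (leavers, headcount) in departments.items():
    print(f"{dept}: {turnover_rate(leavers, headcount):.1%}")
```

Tracking only voluntary separations, as the dashboard does, isolates the leavers a company could realistically have retained.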

Talent satisfaction is expressed through the net promoter score (NPS), and it's evident that the highest satisfaction is achieved after a 5-year employment period, although the first year comes in a close second.
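Though the source doesn't spell out the formula, NPS is conventionally computed from 0-10 survey scores as the percentage of promoters (9-10) minus the percentage of detractors (0-6). A minimal sketch with hypothetical employee responses:

```python
def nps(scores):
    """Net promoter score: % promoters (scores 9-10) minus % detractors (0-6)."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

# Hypothetical 0-10 survey responses from employees
responses = [10, 9, 9, 8, 7, 7, 6, 5, 10, 9]
print(nps(responses))  # 5 promoters, 2 detractors over 10 responses -> 30.0
```

Scores of 7-8 (passives) count toward the total but neither add to nor subtract from the score.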
The talent rating is expressed by the employment period and category. It's critical to
conduct regular feedback and meetings in order to develop communication as well
as talent capabilities. If there is a particular lack of skills present in the assessment
process, it would make sense to invest in employees and provide additional
educational opportunities. That way, the company can profit, too.
You can also take a look at a quick overview of the hiring stats that include the time
to fill, training costs, new hires, and cost per hire. The total amount of employees
and monthly salary, as well as vacancies, will provide you with an at-a-glance
overview of the talent management stats within the first quarter of the year.
This dashboard description includes all-important information essential to
developing a modern HR report for professionals, managers, and VPs that need
to compete to attract the best possible candidates and keep them in the long run.
One of our next data dashboard examples shows processes in the manufacturing industry.

8) Manufacturing Production Dashboard

If your company is in the manufacturing business, a production dashboard


will prove invaluable, as seen in this example:

With manufacturing KPIs for your total production volume, your sales
revenue, your units ordered, and your top-performing machines, you have your
finger on the pulse of your factory.
Finally, the refunded items by reason graph serves as an “early warning sign” for
manufacturing defects.

9) Marketing Performance Dashboard

A data analytics dashboard packed with insight, this marketing-centric innovation is


an effective way of ensuring a consistently healthy return on investment (ROI).
To maximize the value of your promotional campaigns and initiatives, engaging your
audience through the right touchpoints, channels, and mediums at exactly the right
times is vital. If you fail to do so, you’re merely marketing to the void and throwing
your budget into the abyss.

Our digital marketing report example provides a panoramic breakdown of


your campaigns’ performance by allowing you to drill down into essential metrics,
including click-through rates (CTR), cost per click (CPC), and cost per acquisition
(CPA).
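As a reference for the metrics just named, CTR, CPC, and CPA are conventionally defined as clicks over impressions, spend over clicks, and spend over acquisitions, respectively. A quick sketch with hypothetical campaign figures:

```python
def ctr(clicks, impressions):
    """Click-through rate = clicks / impressions."""
    return clicks / impressions

def cpc(spend, clicks):
    """Cost per click = total spend / clicks."""
    return spend / clicks

def cpa(spend, acquisitions):
    """Cost per acquisition = total spend / conversions."""
    return spend / acquisitions

# Hypothetical campaign figures
spend, impressions, clicks, conversions = 500.0, 40_000, 800, 40

print(f"CTR: {ctr(clicks, impressions):.1%}")  # 800 / 40,000 = 2.0%
print(f"CPC: ${cpc(spend, clicks):.2f}")       # 500 / 800
print(f"CPA: ${cpa(spend, conversions):.2f}")  # 500 / 40 = $12.50
```

Comparing these three numbers across campaigns is what lets you spot a campaign that attracts cheap clicks but expensive conversions.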
By analyzing these insightful KPIs and comparing the performance of various
campaigns from one central location, you can pinpoint any inefficiencies while
capitalizing on your marketing strengths. This deep level of visual insight will help
you understand how your customers engage with your marketing materials while
guiding you towards the content and strategies that earn the best results. And when
you do that, you will see a significant boost in revenue as well as audience growth.

10) Logistics Transportation Dashboard


This particular dashboard example shows how big data and data analytics can impact
the logistics industry. When it comes to logistics, every moment matters, and you
want as many deliveries as possible to be on time.

This logistics dashboard makes it easy to adjust the fleet performance and
overall transportation performance since the delivery status, fleet efficiency, average
loading time, and other logistics KPIs will enable you to generate actionable
insights, examine various details, and identify trends in your transportation
management processes.

11) Cash Management Dashboard


Naturally, your cash flow is integral to the ongoing health of your business. By
managing your income, expenditure, liquidity, and financial relationships
effectively, you’ll be able to meet your monetary targets, gain greater access to
credit, and ensure your business is fortified against any unforeseen circumstances.

This dynamic financial dashboard features all of the functions and KPIs to keep
every element of your finances afloat, helping you to optimize your financial activity
for ongoing growth and success. An essential data dashboard-type for organizations
across industries.

12) Customer Support KPI Dashboard

Niche or sector aside, offering your clients or customers an exceptional level of


customer support is no longer an added luxury - it’s essential. No exceptions. No
compromises. Without providing a valuable personal customer experience, your
business will suffer. And harnessing the right data is the most effective way to avoid
falling behind your competitors.

Our dynamic customer service dashboard serves up a perfect storm of visual


information that helps everyone in a consumer-facing role (from support agents and
team leaders to senior managers, VPs, and beyond) perform their role better.
With KPIs including service level, support costs vs. revenue, and customer
satisfaction rates all on the menu, this well-rounded data analytics dashboard
provides every metric required to reduce service costs, monitor support trends, and
tackle any glaring customer-facing issues head-on.
By tracking this KPI dashboard regularly, you will reduce the time your agents take
to solve service issues, streamline your departmental processes, and ultimately meet
the needs of your customers with confidence and consistency. In turn, you will boost

your brand reputation, increase your revenue, and improve customer loyalty - three
key ingredients of commercial success.

13) Retail Data Dashboard Example


What dashboards in the retail industry provide is, in short, an indispensable means
of reporting and centralizing massive volumes of information. No matter the size of
a company, a retail dashboard specifically designed for online retailers has
immense importance in optimizing online processes and providing a centralized
point of access for all retail-related data.

Achieving success in online retail highly depends on data collection, management,


monitoring, and optimizing processes in order to increase productivity and,
consequently, revenue. Moreover, customers are more informed and demanding than
ever before, and that's why having all the data in a central place is critical. That will
enable you to act swiftly and ensure high levels of customer satisfaction.

The example starts with the return reasons, one of the most important retail KPIs
that are closely connected with the rate of return and affect the quality assessment.
The perfect order rate and the number of total orders dig deeper into the stats over a
longer period of time. That way, you can adjust your retail analytics strategies
more carefully and increase your efficiency.
Top sellers by orders will tell you if you need to adjust your offers or invest more
into marketing the best performing pieces in your inventory.

14) Salesforce Data Analytics Dashboard


In today's competitive environment, Salesforce is one of the tools critical for
managing and optimizing customer relationship processes. With the help of a
professional Salesforce dashboard, monitoring and optimizing the sales
lifecycle, connecting with other data sources, and automating the reporting processes
is done quicker than ever before.

Cold calling is still an effective strategy to gain new customers, and the outbound calls dashboard helps to determine whether the team is doing a great job,

needs additional help, or could develop a more productive sales environment. The


dashboard shows an overview of the number of outbound calls, demos, contracts
closed as well as the contract value. On the right side, you can see an outline of 5
agents and their performance in the last month, which can help you create detailed Salesforce reports and optimize further.

At the bottom, you can see the calls and contact rate per weekday and notice how
the mid-week contact rate rises. That way, you can adjust your future planning and
cold calling operations more effectively.

15) Hospital KPI Dashboard


Hospitals are the beating heart of the health sector – without them, where would we
be? With so much activity to analyze and so many resources to consider,
implementing healthcare analytics software that includes data dashboard
technology in hospitals is essential in this day and age.

Covering treatment costs, financial efficiency, and various critical aspects of patient
care, this hospital dashboard plays a pivotal part in ironing out any operational

inefficiencies on a daily basis. Moreover, with the insights gained from this dashboard, hospitals stand to reduce mortality rates, make wiser investments, and ensure cohesive communication throughout the institution.

16) Employee Performance Dashboard


It doesn't matter what you sell or offer – if your employees aren’t happy, motivated,
and performing to the best of their abilities, your entire organization will suffer.
That's why we're bringing you another HR-focused example created with a
professional HR analytics software.

A comprehensive staff-centric innovation, the employee performance dashboard


drills down into every major aspect of personnel management, from absenteeism
rates and overtime hours to productivity levels and training costs. The insights here
help offer support and motivation where needed, boosting staff engagement, and
improving your management strategies.

17) Energy Management Dashboard

Many businesses overlook the financial impact of powering their everyday


operations. But the average business uses between 100 and 150 kWh every
year. And for organizations with multiple sites or branches, this figure is likely to
at least quadruple.
Driving down your energy consumption will not only save you a significant level of
annual costs, but it will also make your business greener and more sustainable.
An invaluable dashboard for any scaling business, our energy management platform
offers a clear-cut visualization of your energy sources and consumption while
allowing you to compare your usage to other sectors.

Our energy dashboard also displays visual data and trends based on any power
cuts you’ve experienced over a certain period. This insightful dashboard example
will give you a greater understanding of how your business consumes energy while
giving you the insight you need to nip any inefficiencies in the bud. You will save
money and keep your operations flowing in the process.

This is an essential data analysis dashboard for cementing sustainable financial


growth while consistently eliminating unnecessary wastage.

18) IT Data Analytics Dashboard


In IT, a dashboard delivers extreme value: it's one of the tools needed to ensure projects are delivered on time, tickets are managed successfully, and costs are kept under control. This is especially true for chief technology officers, who need an overview of strategic management and must lead the technological resources in alignment with business goals and needs.

That said, an IT dashboard that centralizes multiple touchpoints has immense


value for leaders and professionals in the IT advancements of a company. Let's take
a closer look at how to read our example focused on IT leadership.

The visual is organized into 4 main sections that modern CTOs need to monitor and evaluate in order to achieve and sustain positive business development: learning, internal, finance/customers, and users. Let's take a quick look at each.

The learning section is divided into IT KPIs critical for managing tickets and bugs
as well as the team's attrition rate. You can immediately spot where issues are concentrated; in this case, the restore success rate has slightly worsened this month in comparison to the last. Other metrics, such as reopened tickets, have developed better, but it also pays to look at critical bugs over a longer timeframe.
The internal part has a bit more issues in metrics such as the meantime to repair,
availability, and accuracy of estimates. For the latter, that means that the team didn't
estimate their workload correctly but it would make sense to examine why and what
exactly happened. The other 2 metrics, mean time between failures and the
downtime due to security have shown positive development in comparison to last
month. This shows you which parts of your strategy you need to focus on more in
order to bring success in every aspect of the technological development and
management of a company.
On the bottom part, you can see the financial and customer aspects as well as the
development of users through a 12-month period. These aspects are essential in IT
analytics in order to reach sustainable development since costs are oftentimes an
issue in IT projects as well as support expenses.

To learn more about project management, you can explore our guide on project
management dashboards.
19) Content Quality Control Dashboard

In an increasingly digital world, consumers expect a great deal of value from brands
and businesses. To earn consumer loyalty and trust, creating content that not only
solves your audience’s pain points but also inspires and engages will push you ahead
of the pack.
An informational innovation, our content quality control dashboard offers a detailed
breakdown of your article and content creation processes. Here, you can analyze
your best-performing content, explore how long each phase of your story creation
process takes, and gain an understanding of how easy your content is to read.
By working with every informational aspect of this content-centric dashboard, you
will know exactly what types of content engage your audience and earn the best
results. As a result, you can tweak and enhance your content marketing strategy and
article creation processes for ongoing brand awareness, commercial growth, and
customer retention growth.
Content is one of the most vital components of any business strategy. Without
knowing what works and what doesn’t, you’re unlikely to ever reach your full
potential. In terms of content creation, this dashboard, created with professional media analytics software, will help you get where you need to be and beyond.
20) FMCG KPI Dashboard

Managing your fast-moving consumer goods (FMCG) processes with pinpoint


accuracy will make your supply chain stronger, smoother, and more efficient than
ever before. Conversely, if you allow your FMCG initiatives to dwindle, your supply
chain will fall apart, and you will lose money.

Our FMCG dashboard will help you get a firm grip on every fast-moving part
of your supply chain, which you probably know is a real challenge.
By using these interactive visualizations to your advantage, you will gain a working
insight into how swiftly you’re selling your goods within their freshness date, check
FMCG stock levels in real-time, examine your inventory turnover over specific
timeframes, and analyze your average time to sell specific categories of items.
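As an illustration of the inventory turnover metric mentioned above, a common convention divides the cost of goods sold by the average inventory value; the related days-sales-of-inventory figure follows from it. A sketch with hypothetical annual figures:

```python
def inventory_turnover(cogs, avg_inventory):
    """Inventory turnover = cost of goods sold / average inventory value."""
    return cogs / avg_inventory

def days_sales_of_inventory(cogs, avg_inventory, period_days=365):
    """Average number of days an item sits in stock before being sold."""
    return period_days / inventory_turnover(cogs, avg_inventory)

# Hypothetical annual figures
cogs, avg_inventory = 1_200_000, 150_000
print(inventory_turnover(cogs, avg_inventory))       # 8.0 turns per year
print(days_sales_of_inventory(cogs, avg_inventory))  # 365 / 8 = 45.625 days
```

For FMCG in particular, the days-in-stock figure matters because it can be compared directly against product freshness dates.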
With access to every essential FMCG insight in one central location, you will be able
to weed out any glaring fast-moving supply chain inefficiencies while managing your
stock and fulfillment strategies in a way that consistently saves time and money.
Fast-moving goods need to be handled with care at every stage of the supply chain
while being delivered to the right parties at the right time. This highly intuitive tool
will help you steer clear of any brand-damaging fulfillment calamities while keeping

your supply chain fluent from end to end. In turn, you will see a boost to your
business’s bottom line.

The Top 10 Benefits of Data Dashboards

Now that you have at least a basic understanding of what is a data dashboard and
how it can benefit your company, let’s dig deeper into some of its uses.

1. They Are Customizable


One of the biggest advantages data dashboards have over more traditional
spreadsheets is that they are almost infinitely customizable and flexible. This makes
sense operationally, as the same data is used in very different ways by different
people in your company.
For example, let’s say you have all of your sales data from the past quarter in a
spreadsheet. While you can dig through the spreadsheet for details, you have to look
at all the information at once. This can affect your ability to focus. But using an
effective sales report created with a dashboard, you could easily have one layout
optimized for a specific sales representative, showing their:

• Deal closing rates


• Average order size
• Lifetime value (LTV) of customers closed

A sales manager might see each of these data sets for each of their sales reps, and a
C-level executive won’t see any of this. They’ll see averages of LTV, order sizes,
and overall closing rates, along with other “bigger picture data.”

2. They Are Interactive


Let’s say our sales executive wanted to dig deeper into the data around a promising
new hire. If she were using dashboard software, she could simply tap a few buttons
on her dashboard and get to a view similar to that of the sales manager. Conversely,
if the sales manager (or even the rep) wanted to understand how their job role fits
into the company as a whole, they could “zoom out” as needed. This kind of “focus
adjustment” just isn’t possible with spreadsheets.
When presenting data and communicating insights, it is important to create a dialogue – no one likes being preached at throughout a whole presentation. Letting your audience be part of it thanks to interactive dashboard features will convey your message more efficiently. By using real-time data, manipulating it live, and encouraging your listeners to ask questions, drill down into the information, and explore on their own, you will keep them much more engaged in the discussion.
The interactivity is especially interesting in dashboarding with a diverse audience:
newcomers are onboarded easily while experts can dig deeper into the data for more
insights.

3. They allow for real-time monitoring


The business world changes quickly.
If you’re still running analytical reports by sending them to your IT department and
then waiting to get them back, your company is missing out on situations where
agility wins. If you have to wait days or weeks to get data on a situation, it might be
too late to make necessary changes.
For example, with real-time monitoring, you can see how a phone outreach
campaign is going for a big product launch you have coming up. If a week into the
campaign, you notice that you’re not getting any results via your dashboard, you can
make a quick intervention and switch to digital ads, direct mail, or some other
strategy.
You’re saving yourself a big opportunity cost here.

4. All of your data is in one place


When you use dashboards, you have one centralized location where all of your users
can access your data.
This is in contrast to the myriad of spreadsheets, software, and databases that represent the legacy approach, where all of your PPC data is on Facebook, all of your customer data is in your CRM software, and your income/loss statements are in your mainframe (which is running on 1980s software and requires an IT staff member to operate). Not pretty.

5. They Are Intuitive


Let’s be honest – most human beings find it easier to understand a graph than a long,
tedious Excel spreadsheet full of numbers. Dashboards allow you to “tease out”
patterns in your data that you might not see in a purely numerical format.
Additionally, people have different levels of “tech-savvy.” Some users may find
even an Excel document to be quite difficult to work with. Dashboards make
everything accessible.

6. They Get Everyone On the Same Page


In order to have a dashboard culture, everyone on your team needs to be on board.
That can’t happen if some employees don’t have the same access to data that other
employees do, whether due to technical comfort levels or simply due to data being
spread out.
When everyone on your team is looking at the same set of numbers, they can
understand the team’s mission in a much more tangible way. You can point to sales
numbers and really connect them with future goals and projections.
This kind of connection is priceless.

7. They force you to focus


Perhaps the biggest reason to use dashboards for understanding your data is that they
force you to focus.
One of the key points to remember when creating a stunning dashboard is that you
shouldn’t overwhelm it with too many KPIs. From a visual and aesthetic perspective,

things get cluttered if you put too much on the screen. From an effectiveness
standpoint, your mind starts to wonder what’s important.
Therefore, if you’re using dashboards the right way, your brain always knows what
KPIs are important – the ones on the screen. You can even make more important
KPIs bigger, further reinforcing their status in your mind.

As Tony Robbins says, “Where focus goes, energy flows.” In other words, you
take action on what you focus on. Dashboards make it easy to focus on what’s truly
important.

8. They help you to multitask


Multitasking is a useful skill, but in business, trying to absorb too much information
or tackle too many tasks at once can prove to be detrimental, with projects becoming
convoluted and mistakes being made.
However, when it comes to data analytics, a modern dashboard consolidates all
critical insights from various data sources through data connectors and presents them
in a dynamic visual format. By being able to view a multitude of organized, visually
digestible information in one central space, you can quickly extract several pieces of
information at once, empowering you to complete projects swiftly with a greater
level of success and accuracy.

9. They Are Predictive


Dashboards aren’t psychic, but they can help you predict the direction your business
is going based on current trends, metrics, and insights. If you don’t like the way
particular elements of your business are going, digital dashboards will give you the
intelligence you’ll need to make enhancements that will help to cement your future
success.
By working with a mix of past insights, real-time data, and detailed patterns,
dashboards provide a panoramic overview of specific areas of your business. If you
feel something is going to cost you unnecessary investment in the future, you’ll
be able to improve your financial strategy; if you see something worth
capitalizing on, you can create a campaign or initiative around it.

Moreover, as modern data dashboards securely store everything on the cloud, access
to predictive insights is available literally 24/7, 365 days a year. All you need is a web
connection and you can start formulating strategies wherever you are in the world,
at any time.

10. They help you to tell stories with your data


When it comes to sharing your data with others, prompting any form of progressive
action requires others to understand the message you’re trying to convey.
To help other departments within your organization, or external partners, get on
board with your data-driven insights, you need to tell a story with your data. An
interactive digital dashboard will help you do just that.
Through seamless data visualization, a dashboard can help you create a narrative
with the information delivered through your KPIs. In gaining the ability to do so
effectively, you’ll be able to work more cohesively towards your goals and
accelerate the success of your business exponentially.
We’ll now discuss the primary features or benefits of BI-powered dashboards and
take a glimpse at legacy solutions.

How Data Dashboards are used in BI


Now that we’ve asked the question, ‘what is a dashboard?’ and looked at the primary
functions of these powerful tools, let’s examine them in a business intelligence
context.
When it comes to business intelligence, data dashboards play a pivotal role.
Business intelligence (BI) is a term that relates to the applications, infrastructure,
practices, and tools that empower businesses to access a broad range of analytical
data for improvement, campaign optimization, and enhanced decision-making that
maximizes performance.
A data dashboard is the vessel – or tool – that materializes BI practices, converting,
visualizing, and communicating complex business data into meaningful, actionable
insights.

BI dashboard tools bestow business users with the ability to drill down even deeper
into analytical data to capitalize on strengths, spot weaknesses, and make changes
that will benefit the future of their organization.
Digital dashboards have essentially quashed the necessity to sift through multiple
reporting tools; instead, users access dynamic insights – often in real time. With such
dashboards, users can also customize settings, functionality, and KPIs to optimize
their dashboards to suit their specific needs.
To summarize, in the context of BI, data dashboards are used for:

• Deep-level insight: Drilling down deeper into key aspects of your business’s
daily, weekly, and monthly operations to create initiatives for increased
efficiency.
• Information sharing: To facilitate the online data visualization of your most
valuable data so that you can share key insights with other stakeholders both
internally and outside the organization.
• Performance measurement: By working with interactive KPI dashboards,
you can set specific business tasks, measuring your performance while
working towards clearly defined milestones for enhanced business success.
• Forecasting: As dashboards are equipped with predictive analytics, it’s
possible to spot trends and patterns that will help you develop initiatives and
make preparations for future business success.
A data dashboard assists in 3 key business elements: strategy, planning, and analytics. Having expounded on the data dashboard definition and shown you real-life examples, we will now focus on legacy data solutions.

Legacy Data Solutions


Before dashboards, it was much harder to get an intuitive grasp of your company’s
performance through data-driven intelligence.
Because a huge amount of data existed in a company’s mainframe computer
(particularly data related to profits, costs, revenue, etc.), you would often need IT
professionals to prepare data reports for you.

Why? Well, many of these mainframe computers are still running legacy software designed in the 1970s or 1980s. These systems offer no swipeable screens or easy-to-use interfaces.
These are more “command line” types of machines, and other issues include:

• The time delay associated with requesting reports


• Data is spread out amongst many databases
• Lack of different data visualization types
Due to all of these problems, dashboards were developed. They help businesses
answer the question “How can we use all the data we have about our company and
our customers in order to make better informed, data-driven decisions that lead to
more revenue and profits?”
Behind the scenes, with the help of a dashboard builder, you can integrate all of the
information you need from all of the different data sources your company has.
However, what you (the end-user) see are simple tables, charts, and graphs – all in
real-time.

Dashboard Storytelling: From A Powerful to an


Unforgettable Presentation
Plato famously quipped that “those who tell stories rule society.” This statement is
as true today as it was in ancient Greece – perhaps even more so in modern times.
In the contemporary world of business, the age-old art of storytelling is far from
forgotten: rather than speeches on the Senate floor, businesses rely on striking
data visualizations to convey information, drive engagement, and persuade
audiences.
By combining the art of storytelling with the technological capabilities of
dashboard software, it’s possible to develop powerful, meaningful, data-
backed presentations that not only move people but also inspire them to take action
or make informed, data-driven decisions that will benefit your business.

As far back as anyone can remember, narratives have helped us make sense of the
sometimes complicated world around us. Rather than listing facts, figures, and

statistics alone, people used gripping, imaginative timelines, bestowing raw data
with real context and interpretation. In turn, this gripped listeners, immersing them
in the narrative, thereby offering a platform to absorb a series of events in their
mind’s eye precisely the way they unfolded.

Here, we explore data-driven, live dashboard storytelling in depth, looking at


storytelling with KPIs and the dynamics of a data storytelling presentation while
offering real-world storytelling presentation examples.
First, we’ll delve into the power of data storytelling as well as the general dynamics
of a storytelling dashboard and what you can do with your data to deliver a great
story to your audience. Moreover, we will offer dashboard storytelling tips and tricks
that will help you make your data-driven narrative-building efforts as potent as
possible, driving your business into exciting new dimensions. But let's start with a
simple definition.

"You’re never going to kill storytelling, because it’s built in the human plan.
We come with it.” – Margaret Atwood

What Is Dashboard Storytelling?


Dashboard storytelling is the process of presenting data in effective visualizations
that depict the whole narrative of key performance indicators, business strategies and
processes in the form of an interactive dashboard on a single screen, and in real-time.
Storytelling is indeed a powerful force, and in the age of information, it’s possible
to use the wealth of insights available at your fingertips to communicate your
message in a way that is more powerful than you could ever have imagined. So, stay
tuned for the top tips and tricks we will now explain, so you can successfully create
your own story with a few clicks.

4 Tricks to Get Started with Dashboard Storytelling


Big data commands big stories.

Forward-thinking business people turn to online data analysis and data


visualizations to display colossal volumes of content in a few well-designed charts.
But these condensed business insights may remain hidden if they aren’t
communicated with words in a way that is effective and rewarding to

follow. Without language, business people often fail to push their message through
to their audience, and as such, fail to make any real impact.
Marketers, salespeople, and entrepreneurs are today’s storytellers – they are wholly
responsible for their data story. People in these roles are often the bridge between
their data and the forum of decision-makers they’re looking to encourage to take the
desired action.
Effective dashboard storytelling with data in a business context must be focused on
tailoring the timeline to the audience and choosing one of the right data
visualization types to complement or even enhance the narrative.
To demonstrate this notion, let’s look at some practical tips on how to prepare the
best story to accompany your data.

1. Start with data visualization


This may sound repetitive, but when it comes to a dashboard presentation, or
dashboard storytelling presentation, it will form the foundation of your success: you
must choose your visualization carefully.
Different views answer different questions, so it’s vital to take care when choosing
how to visualize your story. To help you in this initiative, you will need a robust
data visualization tool. These intuitive aids to dashboard storytelling are now
ubiquitous and provide a wide array of options to choose from, including line charts,
bar charts, maps, scatter plots, spider webs, and many more. Such interactive tools
are rightly recognized as a more comprehensive option than PowerPoint presentations
or endless Excel files.
These tools help both in exploring the data and visualizing it, enabling you to
communicate key insights in a persuasive fashion that results in buy-in from your
audience.
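To make the idea that different views answer different questions concrete, here is a minimal, hypothetical helper that maps the type of question being asked to a commonly recommended chart type. The mapping reflects general visualization guidance, not any specific tool's API:

```python
# Hypothetical mapping from analytical question to a suitable chart type;
# the pairings follow common data-visualization guidance.
CHART_FOR_QUESTION = {
    "trend over time": "line chart",
    "comparison between categories": "bar chart",
    "relationship between two measures": "scatter plot",
    "geographic distribution": "map",
    "part-to-whole breakdown": "stacked bar chart",
}

def suggest_chart(question_type):
    """Suggest a chart for a question type; fall back to a plain table."""
    return CHART_FOR_QUESTION.get(question_type, "table")

print(suggest_chart("trend over time"))  # line chart
```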
But for optimum effectiveness, we still need more than a computer algorithm – here,
we need a human to present the data in a way that will make it meaningful and
valuable. Moreover, this person doesn’t need to be a common presenter or a teacher-
like figure. According to research carried out by Stanford University,
there are two types of storytelling: author- and reader-driven storytelling.

An author-driven narrative is static and authoritative because it dictates the analysis


process to the reader or listener. It’s like analyzing a chart printed in a newspaper.
On the other hand, reader-driven storytelling allows the audience to structure the
analysis on their own. Here, the audience can choose the data visualizations that they
deem meaningful and interact with them on their own by drilling down to more
details or choosing from various KPI examples they want to see visualized. Thus,
they can reach out for insights that are crucial to them and make sense out of data
independently.

2. Put your audience first


Storytelling for a dashboard presentation should always begin with stating your
purpose. What is the main takeaway from your data story? It’s clear that your
purpose will be to motivate the audience to take a certain action.
Instead of thinking about your business goals, try to envisage what your listeners are
seeking. Each member of your audience – be that a potential customer, future
business partner, or stakeholder – has come to listen to your data storytelling
presentation to gain something for themselves. To better meet your audience’s
expectations and gain their trust (and money), put their goals first and let them
determine the line of your story.
Needless to say, before your dashboard presentation, try to learn as much as you can
about your listeners. Put yourself in their shoes: Who are they? What do they do
on a daily basis? What are their needs? What value can they draw from your
data for themselves?
The better you understand your audience, the more they will trust you and follow
your idea.

3. Don’t fill up your data storytelling with empty words


Storytelling with data, rather than presenting data visualizations alone, brings the
best results. That said, there are certain enemies of your story that make it more
complicated than enlightening and turn your efforts into a waste of time.

The first of these bugbears are the various technology buzzwords that are
devoid of any defined meaning. These words don’t create a clear picture in your

listeners’ heads and are useless as a storytelling aid. In addition to under-informing
your audience, buzzwords are a sign of lazy thinking and a signal that you don’t
have anything unique or meaningful to say. Try to add clarity to your story by using
more precise and descriptive narratives that truly communicate your purpose.
Another trap is using your industry jargon to sound more professional. The problem
here is that it may not be the jargon of your listeners’ industry – thus, they may not
comprehend your narrative. Moreover, some jargon phrases have different meanings
depending on the context they are used in – they mean one thing in the business field
and something else in everyday life. They reduce clarity and can also convey the
opposite meaning of what you intend to communicate in your data storytelling.
Don’t make your story too long; focus on explaining the meaning of the data rather
than on the ornateness of your language or the humor of your anecdotes. Avoid
overusing buzzwords and industry jargon, and try to figure out what insights your
listeners want to draw from the data you show them.

4. Utilize the power of storytelling


Before we continue our journey into data-powered storytelling, we'd like to further
illustrate the unrivaled power of offering your audience, staff, or partners inspiring
narratives by sharing these must-know insights:

• Recent studies suggest that 80% of today’s consumers want brands to tell
stories about their business or products.

• The average person processes 100,500 digital words every day. By taking
your data and transforming it into a focused, value-driven narrative, you stand
a far better chance of your message resonating with your audience and
yielding the results you desire.

• Human beings absorb information 60 times faster with visuals than linear
text-based content alone. By harnessing the power of data visualization to
form a narrative, you’re likely to earn an exponentially greater level of success
from your internal or external presentations.

How to Present a Dashboard - 6 Tips for the Perfect Dashboard Storytelling Presentation

Now that we’ve covered the data-driven storytelling essentials, it’s time to dig
deeper into ways that you can make maximum impact with your storytelling
dashboard presentations.

Business dashboards are now driving forces for visualization in the field of
business intelligence. Unlike their predecessors, a state-of-the-art dashboard
builder gives presenters the ability to engage audiences with real-time data and
offer a more dynamic approach to presenting data compared to the rigid, linear nature
of, say, PowerPoint.

With the extra creative freedom data dashboards offer, the art of storytelling is
making a reemergence in the boardroom. The question now is: What is great
dashboarding?
Without further ado, here are our top six tips on how to transform your presentation
into a story and rule your own company through dashboard storytelling.

1. Set up your plan

Start at square one on how to present a dashboard: outline your presentation. Like
all good stories, the plot should be clear, problems should be presented, and an
outcome foreshadowed. You have to ask yourself the right data analysis
questions when it comes to exploring the data to get insights, but you also need to
ask yourself the right questions when it comes to presenting such data to a certain
audience. Which information do they need to know or want to see? Make sure you
have a concise storyboard when you present so you can take the audience along with
you as you show off your data. Try to be purpose-driven to get the best dashboarding
outcomes, but don’t entangle yourself in a rigid format that is unchangeable.

2. Don’t be afraid to show some emotion

Stephen Few, a leading design consultant, explains on his blog that “when we
appeal to people’s emotions strictly to help them personally connect with
information and care about it, and do so in a way that draws them into reasoned
consideration of the information, not just feeling, we create a path to a brighter, saner
future”. Emotions stick around much longer in a person’s psyche than facts and
charts. Even the most analytical thinkers out there will be more likely to remember
your presentation if you can weave in elements of human life and emotion. How to
present a dashboard with emotion? By adding some anecdotes, personal life
experiences that everyone can relate to, or culturally shared moments and jokes.
However, do not rely just on emotions to make your point. Your conclusions and
ideas need to be backed by data, science, and facts – otherwise, and especially in
business, you might not be taken seriously. You’d also miss an opportunity to help
people learn to make better decisions by using reason and would only tap into a
“lesser-evolved” part of humanity. Instead, emotionally appeal to your audience to
drive home your point.

3. Make your story accessible to people outside your sector


Combining complicated jargon, millions of data points, and advanced math concepts
into a story that people can understand is not an easy task. Opt for simplicity
and clear visualizations to increase the level of audience engagement.
Your entire audience should be able to understand the points that you are driving
home. Jeff Bladt, the director of Data Products & Analytics at DoSomething.org,

offered a pioneering case study on accessibility through data. When


commenting on how he goes from 350 million data points to organizational change,
he shared: “By presenting the data visually, the entire staff was able to quickly grasp
and contribute to the conversation. Everyone was able to see areas of high and low
engagement. That led to a big insight: Someone outside the analytics team noticed
that members in Texas border towns were much more engaged than members in
Northwest coastal cities.”
Making your presentation accessible to laypeople opens up more opportunities for
your findings to be put to good use.

4. Create an interactive dialogue


No one likes being told what to do. Instead of preaching to your audience, enable
them to be a part of the presentation through interactive dashboard features.
By using real-time data, manipulating data points in front of the audience, and
encouraging questions during the presentation, you will ensure your audiences are
more engaged as you empower them to explore the data on their own. At the same
time, you will also provide a deeper context. The interactivity is especially
interesting in dashboarding when you have a broad target audience: it onboards
newcomers easily while letting the ‘experts’ dig deeper into the data for more
insights.

5. Experiment
Don’t be afraid to experiment with different approaches to storytelling with data.
Create a dashboard storytelling plan that allows you to experiment, test different
options, and learn what will build the engagement among your listeners and make
sure you fortify your data storytelling with KPIs. Even when an attempt falls flat and
your listeners drift off or check their email, you will learn from it and gather
information on how to improve your dashboarding and data storytelling techniques,
presentation after presentation.

6. Balance your words and visuals wisely

Last but certainly not least is a tip that encompasses all of the above advice but also
offers a means of keeping it consistent, accessible, and impactful from start to finish:
balance your words and visuals wisely.
What we mean here is that in data-driven storytelling, consistency is key if you want
to grip your audience and drive your message home. Our eyes and brains focus on
what stands out. The best data storytellers leverage this principle by building charts
and graphs with a single message that can be effortlessly understood, highlighting
both visually and with words the strings of information that they want their audience
to remember the most.
With this in mind, you should keep your language clear, concise, and simple from
start to finish, using the best visualizations to enhance each segment of your story,
placing a real emphasis on any graph, chart, or sentence that you want your audience
to take away with them.

Every single element of your dashboard design is essential, but by emphasizing


the areas that really count, you’ll make your narrative all the more memorable,
giving yourself the best possible chance of enjoying the results you deserve.

The Best Dashboard Storytelling Examples


Now that we’ve explored the ways in which you can improve your data-centric
storytelling and make the most of your presentations, it’s time for some inspiring
storytelling presentation examples. Let’s start with a storytelling dashboard that
relates to the retail sector.

1. A retailer’s store dashboard with KPIs


The retail industry is an interesting one, as it has been particularly disrupted by the
advent of online retailing. Collecting data analytics is extremely important for this
sector, which can take excellent advantage of analytics because of its data-driven
nature. As such, data storytelling with KPIs is a particularly effective method to
communicate trends, discoveries, and results.

[Dashboard: The Retail Store Dashboard]

The first of our storytelling presentation examples serves up information related
to customers’ behavior and helps in identifying patterns in the data collected. The
specific retail KPIs tracked here are focused on sales: by division, by item,
by city, and by out-of-stock items. It tells us the current trends in customers’
purchasing habits and allows us to break down this data by city or by gender/age
for enhanced analysis. We can also anticipate any stock-out to avoid losing money,
and visualize stock-out tendencies over time to spot any problems in the supply chain.

This most excellent of data storytelling examples, presented in a retail dashboard,
will help you tell deeper, more intricate stories through your marketing campaigns.
These analytics enable us to adapt those campaigns per channel: by breaking down
the Sales Volume by division, we can see that women are the primary source of
revenue. Per city, New York adds up to almost 30% of the sales.
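As an illustration of the kind of breakdown behind that "almost 30%" figure, the sketch below computes one city's share of total sales. The figures are invented to mirror the example, not taken from the dashboard:

```python
# Hypothetical sales totals per city (invented figures)
sales_by_city = {"New York": 2940, "Chicago": 1800, "Boston": 1500, "Austin": 3760}

def share_of_sales(sales, city):
    """Return one city's share of total sales, as a percentage."""
    total = sum(sales.values())
    return round(100 * sales[city] / total, 1)

print(share_of_sales(sales_by_city, "New York"))  # 29.4 -> "almost 30%"
```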

All this information is important for customer retention, as it is less expensive to
retain the customers we already have than to acquire new ones.

2. A hospital’s management dashboard with KPIs


This second of our data storytelling examples delivers the tale of a busy working
hospital. That might sound a little fancier than it is, but it’s of paramount importance
– all the more so when it comes to public healthcare, a sector very new to data
collection and analytics that has a lot to gain from it in many ways.

[Dashboard: The Hospital KPI Dashboard]

For a hospital, a centralized dashboard is a great ally in the everyday management


of the facility. The one we have here gives us the big picture of a complex
establishment, tracking several healthcare KPIs.

From the total admissions to the total patients treated and the average waiting time
in the ER – overall or broken down per division – the story told by this healthcare
dashboard is essential. The top management of this facility has a holistic view
to run the operations more easily and efficiently and can try to implement diverse
measures if they see abnormal figures. For instance, an average waiting time for a
certain division that is way higher than the others can shed light on some problems
this division might be facing: lack of staff training, lack of equipment, understaffed
unit, etc.
All this is vital for the patient’s satisfaction as well as the safety and wellness of the
hospital staff that deals with life and death every day.
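The waiting-time comparison described above can be sketched in a few lines: using invented figures, the code flags any division whose average ER wait is well above the hospital-wide average (the threshold factor is an arbitrary choice for illustration):

```python
from statistics import mean

# Hypothetical ER waiting times (minutes), broken down per division
waiting_times = {
    "Cardiology": [30, 35, 40],
    "Neurology": [25, 30, 35],
    "Orthopedics": [80, 95, 110],
}

def flag_slow_divisions(times, threshold_factor=1.5):
    """Flag divisions whose average wait is well above the overall average."""
    overall = mean(t for waits in times.values() for t in waits)
    return [div for div, waits in times.items()
            if mean(waits) > threshold_factor * overall]

print(flag_slow_divisions(waiting_times))  # ['Orthopedics']
```

A flagged division is then a prompt for further investigation: staff training, equipment, or understaffing.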

3. A human resources (HR) recruitment dashboard with KPIs


The third of our data storytelling examples relates to human resources. This
particular storytelling dashboard focuses on one of the most essential responsibilities
of any modern HR department – the recruitment of new talent.

[Dashboard: The HR Recruitment Dashboard]

In today’s world, digital natives are looking to work with a company that not only
shares their beliefs and values but offers opportunities to learn, progress, and grow
as an individual. Finding the right fit for your organization is essential if you want
to improve internal engagement and reduce employee turnover.
The HR KPIs (human resources key performance indicators) related to this
storytelling dashboard are designed to enhance every aspect of the recruitment
journey, helping to drive down hiring costs and improve the quality of
hires significantly.

Here, the art of storytelling with KPIs is made easy. As you can see, this HR
dashboard offers a clear snapshot into important aspects of HR recruitment,
including the cost per hire, recruiting conversion or success rates, and the time to fill
a vacancy from initial contact to official offer.
With this most intuitive of data storytelling examples, building a valuable narrative
that resonates with your audience is made easy, and as such, it’s possible to share
your recruitment insights in a way that fosters real change and business growth.
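The recruitment KPIs named here reduce to simple arithmetic. A minimal sketch with invented quarterly figures:

```python
# Hypothetical recruitment figures for one quarter
recruiting_spend = 45000                      # sourcing, ads, agency fees
hires = 9
applicants = 300
days_to_fill = [32, 41, 28, 35, 47, 30, 38, 44, 29]

cost_per_hire = recruiting_spend / hires                  # total spend per new hire
conversion_rate = 100 * hires / applicants                # % of applicants hired
avg_time_to_fill = sum(days_to_fill) / len(days_to_fill)  # average days per vacancy

print(cost_per_hire, conversion_rate, avg_time_to_fill)   # 5000.0 3.0 36.0
```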

Start Your Dashboard Storytelling Now!


“I'll tell you a secret. Old storytellers never die. They disappear into their own
story.” – Vera Nazarian
One of the major advantages of working with dashboards is the improvement they
have made to data visualization. Don’t let this feature go to waste with your own
presentations. Place emphasis on making visuals clear and appealing to get the most
from your dashboarding efforts.
With the abundance of ways to visually present data, make sure you choose the one
that works best for great storytelling with data. For more tips, you can check
out how to create data reports people love to read, and also find inspiration on how
to create a dashboard with ease.
Transform your presentations from static, lifeless work products into compelling
stories by weaving an interesting and interactive plot line into them.

Ready to Get Started with Data?


We’ve asked the question, ‘What is a data dashboard?’ and delved deep into the
dynamics of data dashboard software, and one thing’s for sure: if you work with
your business’s insights the right way, you will propel your business to new
heights.
Now that you understand the unrivaled power of data dashboards, we’re sure you’re
keen to get started. The good news is, harnessing the power of data dashboards is
only a few clicks away.

Dashboard design best practices


Dashboards should only show relevant information
We don’t expect a car dashboard to display information about the news, local places
of interest, or upcoming appointments because, however interesting this information
is, it’s not relevant.
For most application dashboards, users will expect to see information about their
current status, as well as any urgent information, warnings or alerts that they need to
deal with.
A great rule of thumb for data disclosure in dashboards is that you should always
start with a high-level overview and provide easy paths for your users to increase the
level of granularity.
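The "overview first, then drill down" rule can be sketched with plain aggregation: one function returns the high-level view, another returns the granular view for whichever slice the user clicks. All records here are hypothetical:

```python
# Hypothetical sales records; the dashboard leads with a high-level overview
# and lets the user drill down into one region for more granularity.
records = [
    {"region": "East", "city": "Boston", "revenue": 500},
    {"region": "East", "city": "New York", "revenue": 900},
    {"region": "West", "city": "Seattle", "revenue": 700},
]

def overview(rows):
    """High-level view: total revenue per region."""
    totals = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0) + r["revenue"]
    return totals

def drill_down(rows, region):
    """Granular view: revenue per city within one region."""
    return {r["city"]: r["revenue"] for r in rows if r["region"] == region}

print(overview(records))            # shown first
print(drill_down(records, "East"))  # one click deeper
```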

Responsive dashboards hand power to the user


Adding a responsive design to a dashboard allows the user to decide which data they
want to focus on. The key to a good responsive design is a clear, easily understood
UI that allows the user to control exactly which data needs to be front and center
in the dashboard.

Great dashboards lead with key data


We love dashboards that cut out the bunk and lead with big, bold numbers. A
dashboard like this communicates confidence and decisiveness. There’s lots of white
space and clear, bold takeaway data. Presenting data like this helps the user see
what’s important in an instant, doing what a dashboard should always do: save the
user time.

Use information architecture for great dashboard design


When you design your dashboard, consider the principles of information architecture
and hierarchy when you decide which cards to show and in which positions.
Remember the F and Z patterns that reflect how a user’s eye scans a page? Apply
that logic to the structure and order of elements on your dashboard. Lead with the
absolute top must-have takeaway, and let your dashboard flow from there.

Use different views to keep things light


We like dashboards that use different views to keep the main view as simple as
possible. Take a look at this dashboard for a restaurant management web-app.
Note how the user can filter data by date, switch between restaurants, and access
information about reservations, outgoing payments, employees, shifts and external
providers, all while maintaining a clean and simple look. Imagine trying to include
all that info in one screen.

Use consistent design language and color scheme


Just because dashboards often include data views, that doesn’t mean they can’t look
drop dead gorgeous. By keeping things light and airy, with prudent use of color, this
publisher dashboard delivers clear visibility, simple navigation and striking good
looks.

Your dashboard should provide the relevant information in about 5 seconds.
Your dashboard should be able to answer your most frequently asked business
questions at a glance. This means that if you’re scanning the information for minutes,
this could indicate a problem with your dashboard’s visual layout.
When designing a dashboard, try to follow the five-second rule — this is the amount
of time you or the relevant stakeholder should need to find the information you’re
looking for upon examining the dashboard. Of course, ad hoc investigation will
obviously take longer; but the most important metrics, the ones that are most
frequently needed for the dashboard user during her workday, should immediately
pop from the screen.

Logical Layout: The Inverted Pyramid


Display the most significant insights on the top part of the dashboard, trends in the
middle, and granular details in the bottom.

This concept originated from the world of journalism, and basically divides the
contents of a news report into three, in order of diminishing significance: the most
important and substantial information is at the top, followed by the significant details
that help you understand the overview above them; and at the bottom you have
general and background information, which will contain much more detail and allow
the reader or viewer to dive deeper (think of the headline, subheading, and body of
a news story).
How does a journalistic technique relate to dashboard design? Well, business
intelligence dashboards, like news items, are all about telling a story. The story your
dashboard tells should follow the same internal logic: keep the most significant and
high-level insights at the top, the trends, which give context to these insights,
underneath them, and the higher granularity details that you can then drill into and
explore further at the bottom.

Awesome examples of dashboard design


CleanMac
This dashboard has a simple background with pops of color that attract the eye to
certain key pieces of data.

The dashboard design features a vertical navigation bar, which helps users see
different aspects of their computer and separates the information logically. The
information architecture helps us make sense of all the data while making sure the
user isn’t overwhelmed. Wonderful!

H-care Medical App


This dashboard helps emergency rooms run faster and more efficiently. The idea
itself is great – an extra aid that all staff can rely on for key information about the
department. The design that brings this idea to life didn’t disappoint!

What we love the most about this dashboard design is that while users can dive
deeper using the navigation bar to the left, this particular screen is a snapshot of the
E.R. at that moment in time. It gives a global overview of the people in care, as well
as key information on the resources of the department.

Product Analytics System Dashboard


This dashboard is interesting because the dark background contrasts with the lighter
color of some key data such as the total number of sessions.

We like that even though this dashboard design doesn’t give users as many pieces of
information as some of the other examples on this post, the data doesn’t leave any
room for confused users. The information is presented either in clear graphs or we
are given a delta of the change in metrics. We love that this dashboard puts data into
perspective for users!

Data Mining
1. What is Data Mining?
Data mining, also known as knowledge discovery in databases (KDD), is the
process of uncovering patterns and other valuable information from large data sets.

Data mining is the act of automatically searching large stores of information to
find trends and patterns that go beyond simple analysis procedures. Data mining
utilizes complex mathematical algorithms to segment data and evaluate the
probability of future events.

2. Data Mining in Data Science


1. Data mining in the map of data disciplines
Data mining has improved organizational decision-making through insightful data
analyses. The data mining techniques that underpin these analyses can be divided
into two main purposes; they can either describe the target dataset or they can
predict outcomes through the use of machine learning algorithms. These methods
are used to organize and filter data, surfacing the most interesting information, from
fraud detection to user behaviors, bottlenecks, and even security breaches.
When combined with data analytics and visualization tools, like Apache Spark,
delving into the world of data mining has never been easier and extracting relevant
insights has never been faster. Advances within artificial intelligence only continue
to expedite adoption across industries.
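As a toy example of the descriptive side of data mining, the sketch below counts which pairs of products appear together in hypothetical transactions – a stripped-down version of the market-basket analysis behind many pattern-mining techniques:

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (invented data)
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "bread"},
    {"milk", "butter"},
]

# Count how often each pair of items is bought together
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

print(pair_counts[("bread", "milk")])  # 2 - bought together in two baskets
```

Real pattern-mining algorithms scale this same counting idea to millions of transactions while pruning rare combinations.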

By outsourcing data mining, all the work can be done faster with low operation costs.
Specialized firms can also use new technologies to collect data that is impossible to
locate manually. There is a wealth of information available on various platforms, but

very little knowledge is accessible. The biggest challenge is to analyze the data to
extract important information that can be used to solve a problem or support company
development. There are many powerful instruments and techniques available to mine
data and draw better insights from it.

2. Data Mining vs. Data Science

Data Science is a pool of data operations that also involves Data Mining.
A Data Scientist is responsible for developing data products for the industry. On the
other hand, data mining is responsible for extracting useful data out of other
unnecessary information.
While Data Science is a quantitative field, Data Mining is limited to only business
roles that require specific information to be mined.
A Data Scientist is required to perform multiple operations like analysis of data,
development of predictive models, discovering hidden patterns, etc. On the contrary,
Data Mining involves statistical modeling to unearth useful information.
A Data Scientist has to deal with both structured as well as unstructured data. On the
other hand, Data Mining only deals with structured information.

3. Skills needed to become a Data Mining Specialist?
A data mining specialist needs a unique combination of technological, business, and
interpersonal skills. The technical skills that a data mining specialist must master
include the following:

• Familiarity with data analysis tools, especially SQL, NoSQL, SAS, and
Hadoop
• Strength with the programming languages of Java, Python, and Perl
• Experience with operating systems, especially Linux
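To give a flavor of the SQL skills listed above, here is a self-contained example using Python's built-in sqlite3 module (the table and figures are invented) that aggregates spend per customer – a typical first step before mining for patterns:

```python
import sqlite3

# A small in-memory SQLite example of the kind of SQL a data mining
# specialist works with daily (table and figures are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 120.0), ("bob", 80.0), ("alice", 60.0)],
)

# Aggregate spend per customer, highest first
rows = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY SUM(amount) DESC"
).fetchall()
print(rows)  # [('alice', 180.0), ('bob', 80.0)]
conn.close()
```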
In order to make use of the patterns that a data mining specialist finds in an
organization’s data, he or she must have keen business sense. Data analysis is
nothing without a clear view of the business’s model and aims for the future. Data
mining specialists thus must understand their own organization’s goals, as well as
have knowledge of industry trends and best practices.
The data mining specialist must then be able to translate technical findings into
presentations that non-technical colleagues can understand. Therefore, the data
mining specialist should have strong public-speaking skills and the ability to
communicate results to internal and external stakeholders.

4. Four steps to launching a successful data mining specialist career
Step 1: Earn your Undergraduate Degree
Data mining specialists need a strong background in data science, as well as business
administration. Relevant undergraduate degrees include computer science, data
science, information systems, statistics, and business administration, or any related
fields. You’ll need to understand how to use statistical methods to analyze data, and
you’ll want to be able to develop predictive models. Data mining specialists must be


able to apply data analysis to real-world business issues, and thus coursework in
developing business intelligence is excellent preparation.

Step 2: Gain Employment as a Data Analyst


The world of data science offers many avenues into more advanced positions. If
possible, you’ll want to get an entry-level job while still in college, especially in IT.
Once out of college, look for positions as a data analyst. This will let you hone your
technical skills further and develop an all-round understanding of the process of data
extraction, transformation, and loading. You’ll want to have a firm grasp on database
design as well.

Step 3: Pursue an Advanced Degree in Data Science


While not all universities offer specific coursework in data science, there are now
many cutting-edge master's and doctoral programs in data science. Obtaining an
advanced degree will likely have a positive effect on your salary, as well as keep you
at the forefront of new technologies. Regardless of the degree you hold, you’ll need
to continue pursuing classes in data science advancements for the entirety of your
career.

Step 4: Get Hired as a Data Mining Specialist


You can find positions as a data mining specialist in many different industries. You
may want to begin your career as a data mining specialist with a company that
provides opportunities to contribute to a team working at the forefront of data
science. Software corporations and computer manufacturers are example industries
where you are likely to find this type of opportunity.


5. Differences between Data Analytics and Data Mining
1. Team Size
Data mining can be undertaken by a single specialist with excellent technological
skills. With the right software, they are able to collect the data ready for further
analysis. At this stage, a larger team simply isn’t required. From here, a data mining
specialist will usually report their findings to the client, leaving the next steps in
someone else’s hands.

However, when it comes to data analytics, a team of specialists may be needed.


They need to assess the data, figure out patterns, and draw conclusions. They may
use machine learning or prognostication analytics to help with the processing, but
this still has a human element involved.
Data analytics teams need to know the right questions to ask – for example, if they’re
working for a telephony company, they may want to know the answer to ‘how is
VoIP used in business’. A data mining specialist can provide evidence of where it’s
used and how often, but data analytics uncovers the how and the why.
Their goal is to work together to uncover information and figure out how the
gathered data can be used to answer questions and solve problems for the business.
Artificial Intelligence advances are likely to bring serious changes to the analytics
process. An AI system can analyze hundreds of data sets and predict various


outcomes, offering information about customer preferences, product development,
and marketing avenues.
AI-powered systems will soon be able to complete menial tasks for data analytics
teams, freeing up their time to take on more important work. It has the capability to
dramatically improve the productivity of data scientists by helping to automate
elements of the data analytics process.

2. Data Structure
When it comes to data mining, studies are conducted mostly on structured data. A
specialist will use data analysis programs to research and mine data. They report
their findings to the client through graphs and spreadsheets. This is often a very
visual explanation, due to the complicated nature of the data. Clients are not typically
data mining experts, and they don’t claim to be!

So, data needs to be fairly simply interpreted into graphics or bar charts. As with the
earlier phone company example, if the client needs to know the data behind how
many people click the link to ‘what is a VoIP number’ on their website, this should
be displayed in easy to read charts, not complicated documents.
A data mining specialist builds algorithms to identify a structure within the
data, which can then be interpreted. It’s based on mathematical and scientific
concepts, making it excellent for businesses to gather clear and accurate data.


This is in contrast to data analytics, which can be done on structured, semi-structured,
or unstructured data. They're also not responsible for creating
algorithms like a data mining specialist. Instead, they are tasked with spotting
patterns within the data and using them to brief the client on their next steps.
This can then be applied to a company’s business model. The marketing team may
want to see their customer and industry data laid out in front of them. If they can
understand the behaviors of a competitor’s consumer, they can then apply it to their
own strategies.

3. Data Quality
The way that the data needs to be presented for data mining compared to data
analytics varies. While data mining is used to collect data and search for patterns,
data analytics tests a hypothesis and translates findings into accessible
information. This means the quality of data they work with can differ.
A data mining specialist will use big data sets and extract the most useful data
from them. Therefore, because they’re using vast and sometimes free data sets, the
quality of the data they are working with isn’t always going to be top-notch. Their
job is to mine the most useful data from this, and report their findings in ways
businesses will understand.
However, data analytics involves collecting data and checking for data quality.
Typically, a data analytics team member will be working with good quality raw data
that is as clean as possible. When the data quality is poor, it can negatively impact
the results, even if the process is the same as with clean data. This is a vital step in
data analytics, so the team must check that the data quality is good enough to start
with.

4. Hypothesis Testing in Data Analytics and Data Mining
A hypothesis is effectively a starting point that requires further investigation, like
the idea that cloud-native databases are the way forward. The idea is constructed
from limited evidence and then investigated further.


A key difference between data analytics and data mining is that data mining does
not require any preconceived hypothesis or notions before tackling the data. It simply
compiles it into useful formats. However, data analysis does need a hypothesis to
test, as it is looking for answers to particular questions.

Data mining is about identifying and discovering patterns. A specialist will build a
mathematical or statistical model based on what they derive from the data. Because
they don’t lead with a hypothesis, a data mining specialist typically works with large
data sets to cast the widest net of possibly useful data. This gives them the
opportunity to whittle down the data, ensuring the data they’re left with at the end of
the process is usable and reliable. This process works much like a funnel, starting
with large data sets and filtering them down into more valuable data.
In contrast, data analytics tests a hypothesis, extracting meaningful insights as part
of their research. It helps in proving the hypothesis, and it may use the data mining
discoveries in the process. For example, a business may start with a hypothesis such
as, ‘Having a free sample link at checkout will lead to an improved conversion rate
of 15%’. This can then be implemented and tested on the website.
The data analytics team will work to test the hypothesis statement by analyzing each
visit to the website. They may even conduct A/B split tests for link placement,
where ‘A’ places the sample link at the top of the page and ‘B’ at the bottom. This


gives an even closer insight into consumer behavior when purchasing items, and
lets the business know the best place to position the free sample link.

5. Forecasting
One of the tasks of a data mining specialist is forecasting what may be interpreted
from the data. They find data patterns and note what these could lead to, making
reasonable predictions about the future.
Understanding how the market may react to certain products or technologies can be
important for brands and businesses across many sectors. Implementing a new
technology such as a TCPA dialer brings both risks and benefits, and data can help
a business decide if it’s the right solution for them.
Therefore, the work undertaken during the data mining process can prove to be
essential for businesses that rely on forecasting trends.
As well as this, a data mining specialist will make sense of data by analyzing:
• Clustering – researching and recording groups of data, which is then analyzed
based on similarities.
• Deviations – detecting anomalies within the data, and how and why this could
have happened.
• Correlations – studying the closeness of two or more variables, determining how
they are associated with one another.
• Classification – looking for new patterns in the data.
This all aids businesses in making smart decisions, based on genuine data from their
consumers and the market they operate in.
On the other hand, data analytics is more about drawing conclusions from the data.
It works partly in conjunction with data mining forecasts by helping to apply the
techniques from its findings. Forecasting is not part of the data analytics process
because it focuses more on the data at hand. They collect, manipulate and analyze
the data. They can then prepare detailed reports drawing their own conclusions.
Forecasting is not to be confused with predictive analysis, which factors in a variety
of inputs to then predict future behavior. It gives an overall view of past, present and


future consumer behavior. So, it can even be applied to events that have already
happened.
Predictive analytics focuses more on statistics to predict outcomes. This could be
beneficial for businesses who want to optimize marketing campaigns, though it does
not give an insight into the market beyond this.
This differs from forecasting, which concentrates on predicting future trends in the
market for years to come.

6. Responsibilities
The expectations of data analytics and data mining findings vary because both have
different responsibilities.
While data mining is responsible for discovering and extracting patterns and
structure within the data, data analytics develops models and tests the hypothesis
using analytical methods.
Data mining specialists will work with three types of data: metadata, transactional,
and non-operational. This is reflective of their responsibilities within the data
analysis process. Transactional data is produced on a daily basis per ‘transaction’,
hence the name. This includes data from customer clicks on a website. For instance,
if you’re a software company, you may track how many customers click through
from searches like ‘best UCaaS providers’.
Non-operational data refers to data produced by a sector that can be utilized to a
company’s advantage. This involves investigating the data for insights and then
forecasting for the future. What’s more, metadata refers to the database design, and
how it holds the other data. This includes breaking the data down into categories,
such as field names, length, type etc. Because it’s organized in this way, specialists
find it easier to retrieve, interpret or use this information.
A data mining specialist’s responsibilities are often concerned with the way the data
is collected and presented; metadata, for example, is used to organize information
so that it can be presented and retrieved efficiently.


However, in data analytics, the team’s responsibilities are less about algorithms and
more about interpretation. They predict yields and interpret the underlying frequency
distribution for continuous data. This is so they can then report on relevant data when
completing their tasks.
Companies usually look to data analytics teams to assist them in making important
strategic decisions. This is one of their biggest responsibilities. Here are the different
types of data the team may analyze:

• Social media content engagement and social network activity


• Customer feedback from emails, surveys and focus groups
• Page visits and internet clickstream data
The findings from these investigations can lead to new revenue opportunities as well
as improved efficiency within the business. Their responsibility is to ensure that they
produce consistent results that can be used as guidance for the future.


7. Area of Expertise
If you’re considering a career in data mining or data analytics, you need to be aware
of the different areas of expertise required to take on the job.
Data mining is a combination of machine learning, statistics and databases. Data
mining specialists need to master:
• Experience with operating systems such as LINUX
• Public-speaking skills
• Programming languages such as JavaScript and Python
• Data analysis tools such as NoSQL and SAS
• Knowledge of industry trends
• Machine learning
This unique combination of technical, personal, and business skills is what makes a
data mining specialist sought after within the industry.
Data analytics requires a different set of skills, namely in computer science,
mathematics, machine learning, and statistics. Those who desire a data analytics
career need to have:
• Strong industry knowledge
• Good communication skills
• Data analysis tools such as NoSQL and SAS, as well as machine learning
• Mathematical skills for numerical data processing
• Critical thinking skills
By using those with a skill set as described above, teams should be able to collect
and analyze data, and provide a detailed report using project planning tools for the
process. Putting together a team of people all with strong data analytics skills can
take time, due to the specific requirements.


6. Data Mining Process


The whole process of data mining cannot be completed in a single step. In other
words, you cannot extract the required information from large volumes of data as
simply as that. It is a more complex process than we might think, involving a number
of steps. The steps, including data cleaning, data integration, data selection, data
transformation, data mining, pattern evaluation, and knowledge representation, are
to be completed in the given order.

1. Data Cleaning
This is, in short, the most crucial and most time-consuming part. Doing it well allows
data users to interpret and analyze the data accurately; conversely, handling it badly
or incompletely can lead to misinterpretation and, ultimately, terrible business
decisions made upon the information provided.
Data cleaning is the step where the data gets cleaned. Data in the real world is
normally incomplete, noisy and inconsistent. The data available in data sources
might be lacking attribute values, data of interest etc. For example, you want the
demographic data of customers and what if the available data does not include
attributes for the gender or age of the customers? Then the data is of course


incomplete. Sometimes the data might contain errors or outliers. An example is an
age attribute with value 200. It is obvious that the age value is wrong in this case.
The data could also be inconsistent. For example, the name of an employee might
be stored differently in different data tables or documents. Here, the data is
inconsistent. If the data is not clean, the data mining results would be neither reliable
nor accurate.
Data cleaning involves a number of techniques, including filling in the missing values
manually, combined computer and human inspection, etc. The output of the data
cleaning step is adequately cleaned data.
Data Cleaning involves:
• Data Quality
• The workflow
• Inspection
• Cleaning
• Verifying
• Reporting

Data Quality

High-quality data needs to pass a set of quality criteria. Those include:


Validity

The degree to which the data conform to defined business rules or constraints.
Data-Type Constraints: values in a particular column must be of a particular
datatype, e.g., boolean, numeric, date, etc.
Range Constraints: typically, numbers or dates should fall within a certain range.
Mandatory Constraints: certain columns cannot be empty.
Unique Constraints: a field, or a combination of fields, must be unique across a
dataset.
Set-Membership constraints: values of a column come from a set of discrete
values, e.g. enum values. For example, a person’s gender may be male or female.


Foreign-key constraints: as in relational databases, a foreign key column can’t have
a value that does not exist in the referenced primary key.
Regular expression patterns: text fields that have to be in a certain pattern. For
example, phone numbers may be required to have the pattern (999) 999–9999.
Cross-field validation: certain conditions that span across multiple fields must hold.
For example, a patient’s date of discharge from the hospital cannot be earlier than
the date of admission.
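Each constraint type above can be expressed as a small executable check. The following Python sketch is illustrative only; the record fields and rules are hypothetical, and ISO date strings are compared lexicographically for the cross-field rule:

```python
import re

# One checker per constraint type; the rules are illustrative, not from a real system.
RULES = {
    "age": lambda r: isinstance(r["age"], int) and 0 <= r["age"] <= 120,  # type + range
    "gender": lambda r: r["gender"] in {"male", "female"},                # set-membership
    "phone": lambda r: re.fullmatch(r"\(\d{3}\) \d{3}-\d{4}", r["phone"]) is not None,
    "name": lambda r: bool(r["name"]),                                    # mandatory
    # Cross-field: discharge can't precede admission (ISO dates compare as strings).
    "dates": lambda r: r["discharged"] >= r["admitted"],
}

def violations(record):
    """Return the name of every rule this record breaks."""
    return [name for name, check in RULES.items() if not check(record)]

record = {"name": "Ann", "age": 200, "gender": "female",
          "phone": "(555) 123-4567",
          "admitted": "2023-01-10", "discharged": "2023-01-03"}
print(violations(record))  # the impossible age and the date contradiction are flagged
```

In practice, such checks would run over every row, with the per-rule violation counts feeding the report discussed later.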

Accuracy

The degree to which the data is close to the true values.


While defining all possible valid values allows invalid values to be easily spotted, it
does not mean that they are accurate.
A valid street address might not actually exist. A person’s recorded eye colour, say
blue, might be valid, but still not true (it doesn’t represent reality).
Another thing to note is the difference between accuracy and precision. Saying that
you live on the earth is actually true, but not precise. Where on the earth? Saying
that you live at a particular street address is more precise.

Completeness

The degree to which all required data is known.


Missing data is going to happen for various reasons. One can mitigate this problem
by questioning the original source if possible, say re-interviewing the subject.
Chances are, the subject is either going to give a different answer or will be hard to
reach again.

Consistency

The degree to which the data is consistent, within the same data set or across
multiple data sets.
Inconsistency occurs when two values in the data set contradict each other.
A valid age, say 10, mightn’t match with the marital status, say divorced. A customer
is recorded in two different tables with two different addresses.
Which one is true?


Uniformity

The degree to which the data is specified using the same unit of measure.
The weight may be recorded either in pounds or kilos. The date might follow the
USA format or European format. The currency is sometimes in USD and sometimes
in YEN.
And so data must be converted to a single measure unit.

The Workflow
The workflow is a sequence of four steps aiming at producing high-quality data and
taking into account all the criteria we’ve talked about.
1. Inspection: Detect unexpected, incorrect, and inconsistent data.
2. Cleaning: Fix or remove the anomalies discovered.
3. Verifying: After cleaning, the results are inspected to verify correctness.
4. Reporting: A report about the changes made and the quality of the currently
stored data is recorded.
What you see as a sequential process is, in fact, an iterative, endless process. One
can go from verifying to inspection when new flaws are detected.

Inspection
Inspecting the data is time-consuming and requires using many methods for
exploring the underlying data for error detection. Here are some of them:

Data profiling

Summary statistics about the data, called data profiling, are really helpful for giving
a general idea about the quality of the data.
For example, check whether a particular column conforms to particular standards or
patterns. Is the data column recorded as a string or a number?
How many values are missing? How many unique values are in a column, and what
is their distribution? Is this data set linked to, or does it have a relationship with,
another?
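A minimal data-profiling pass along these lines can be sketched in Python; the gender column below is invented for illustration:

```python
from collections import Counter

def profile(column):
    """Summary statistics for one column: missing count, unique values, distribution, types."""
    present = [v for v in column if v is not None]
    return {
        "missing": len(column) - len(present),
        "unique": len(set(present)),
        "distribution": Counter(present),
        "types": {type(v).__name__ for v in present},  # mixed types hint at recording errors
    }

gender = ["male", "female", None, "female", "male", "male", None]
report = profile(gender)
print(report["missing"], report["unique"], dict(report["distribution"]))
```

Running such a profile per column gives a quick first view of where cleaning effort should go.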


Visualizations

By analyzing and visualizing the data using statistical methods such as the mean,
standard deviation, range, or quantiles, one can find values that are unexpected and
thus erroneous.
For example, by visualizing the average income across countries, one might see that
there are some outliers. Some countries have people who earn much more than
anyone else. Those outliers are worth investigating and are not necessarily incorrect
data.

Software packages

Several software packages or libraries available for your language will let you specify
constraints and check the data for violations of these constraints.
Moreover, they can not only generate a report of which rules were violated and how
many times, but also create a graph of which columns are associated with which rules.

Cleaning
Data cleaning involves different techniques chosen based on the problem and the
data type. Different methods can be applied, each with its own trade-offs.
Overall, incorrect data is either removed, corrected, or imputed.


Irrelevant data

Irrelevant data are those that are not actually needed and don’t fit under the context
of the problem we’re trying to solve.
For example, if we were analyzing data about the general health of the population,
the phone number wouldn’t be necessary (column-wise).
Similarly, if you were interested in only one particular country, you wouldn’t want
to include all other countries. Or, if we were studying only those patients who went
to surgery, we wouldn’t include everyone (row-wise).
Only if you are sure that a piece of data is unimportant may you drop it. Otherwise,
explore the correlation matrix between feature variables.
And even if you notice no correlation, you should ask someone who is a domain
expert. You never know! A feature that seems irrelevant could be very relevant
from a domain perspective, such as a clinical perspective!
Duplicates

Duplicates are data points that are repeated in your dataset.


It often happens when, for example:
• Data are combined from different sources
• The user may hit submit button twice thinking the form wasn’t actually
submitted.
• An online booking request was submitted twice to correct wrong
information that was accidentally entered the first time.
A common symptom is when two users have the same identity number, or the same
article was scraped twice.
And therefore, such records simply should be removed.
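Dropping such duplicates can be sketched as keeping only the first record seen per identity key; the booking records below are invented for illustration:

```python
def drop_duplicates(records, key):
    """Keep only the first record seen for each identity key."""
    seen, unique = set(), []
    for rec in records:
        k = rec[key]
        if k not in seen:  # later submissions of the same identity are dropped
            seen.add(k)
            unique.append(rec)
    return unique

bookings = [
    {"id": 101, "name": "Ann"},
    {"id": 102, "name": "Bob"},
    {"id": 101, "name": "Ann"},  # e.g. the submit button was hit twice
]
print(drop_duplicates(bookings, "id"))  # only the first record with id 101 survives
```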
Type conversion
Make sure numbers are stored as numerical data types. A date should be stored as a
date object, or a Unix timestamp (number of seconds), and so on.
Categorical values can be converted into and from numbers if needed.


This can be spotted quickly by taking a peek at the data types of each column in the
summary (discussed above).
A word of caution: values that can’t be converted to the specified type should be
converted to an NA value (or similar), with a warning being displayed. This
indicates the value is incorrect and must be fixed.
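A cautious conversion helper along these lines might look like the following sketch; the to_number name and the raw values are hypothetical:

```python
import warnings

def to_number(value):
    """Convert a raw value to float; unconvertible values become None with a warning."""
    try:
        return float(value)
    except (TypeError, ValueError):
        warnings.warn(f"could not convert {value!r}; storing NA instead")
        return None

raw = ["3.14", "42", "n/a", ""]
converted = [to_number(v) for v in raw]
print(converted)  # [3.14, 42.0, None, None]
```

The warnings leave a trail of exactly which values need manual fixing, instead of silently discarding them.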

Syntax errors

Remove white spaces: Extra white spaces at the beginning or the end of a string
should be removed.
" hello world " => "hello world"
Pad strings: Strings can be padded with spaces or other characters to a certain width.
For example, some numerical codes are often represented with prepending zeros to
ensure they always have the same number of digits.
313 => 000313 (6 digits)
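In Python, for instance, both fixes are one-liners:

```python
# Remove leading/trailing white space.
s = "  hello world  "
print(s.strip())  # "hello world"

# Pad a numerical code with zeros to a fixed width of 6 digits.
code = 313
print(str(code).zfill(6))  # "000313"
```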

Fix typos: Strings can be entered in many different ways, and no wonder, can have
mistakes.
Gender
m
Male
fem.
FemalE
Femle

This categorical variable is considered to have 5 different classes, and not the 2
expected (male and female), since each value is different.
Therefore, our duty is to recognize from the above data whether each value is male
or female. How can we do that?
The first solution is to manually map each value to either “male” or “female”.
The second solution is to use pattern match. For example, we can look for the
occurrence of m or M in the gender at the beginning of the string.


The third solution is to use fuzzy matching: an algorithm that identifies the
distance between the expected string(s) and each of the given ones. Its basic
implementation counts how many operations are needed to turn one string into
another.
A bar plot is useful to visualize all the unique values.
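The pattern-match solution can be sketched as follows; the rule of matching on the leading letter is an illustrative assumption, and genuinely unrecognizable values are left for manual review:

```python
def normalize_gender(value):
    """Map a messy gender string onto 'male' or 'female' by its leading letter."""
    v = value.strip().lower()
    if v.startswith("f"):
        return "female"
    if v.startswith("m"):
        return "male"
    return None  # leave unrecognizable values for manual review

raw = ["m", "Male", "fem.", "FemalE", "Femle"]
print([normalize_gender(v) for v in raw])
# ['male', 'male', 'female', 'female', 'female']
```

Even the misspelled "Femle" collapses onto the right class here; a fuzzy matcher would be needed when typos also affect the leading characters.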

Standardize

Our duty is to not only recognize the typos but also put each value in the same
standardized format.
For strings, make sure all values are either in lower or upper case.
For numerical values, make sure all values have a certain measurement unit.
The height, for example, can be recorded in both meters and centimeters, so a
difference of 1 does not mean the same thing in every record. The task here is to
convert all the heights to one single unit.
For dates, the USA version is not the same as the European version. Recording the
date as a timestamp (a number of milliseconds) is not the same as recording the date
as a date object.
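A sketch of such unit standardization for heights; the threshold is an assumption chosen for illustration (values above 3 are treated as centimeters, since no adult is 3 meters tall):

```python
def standardize_height(value):
    """Bring mixed-unit heights to meters; values over 3 are assumed to be centimeters."""
    return value / 100 if value > 3 else value

heights = [1.82, 175, 1.6, 168]  # a mix of meters and centimeters
print([standardize_height(h) for h in heights])  # [1.82, 1.75, 1.6, 1.68]
```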

Scaling / Transformation

Scaling means to transform your data so that it fits within a specific scale, such as
0–100 or 0–1.


For example, exam scores of a student can be re-scaled to be percentages (0–100)
instead of GPA (0–5).

It can also help in making certain types of data easier to plot. For example,
we might want to reduce skewness to assist in plotting (when there are many
outliers). The most commonly used functions are log, square root, and inverse.
Scaling can also take place on data that has different measurement units.
Student scores on different exams, say the SAT and ACT, can’t be compared directly
since these two exams are on different scales: a difference of 1 SAT point is not
equivalent to a difference of 1 ACT point. In this case, we need to re-scale the SAT
and ACT scores to a common range, say, between 0–1.
By scaling, we can plot and compare different scores.
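Min-max scaling is one simple way to bring such scores onto the common 0–1 range; the SAT and ACT values below are invented for illustration:

```python
def min_max_scale(values):
    """Linearly map values onto the 0-1 range so differently scaled scores become comparable."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

sat = [1000, 1200, 1600]   # SAT scale
act = [20, 28, 36]         # ACT scale
print(min_max_scale(sat))  # [0.0, 0.333..., 1.0]
print(min_max_scale(act))  # [0.0, 0.5, 1.0]
```

After scaling, a student's relative standing on the two exams can be compared directly.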

Normalization

While normalization also rescales the values into a range of 0–1, the intention here
is to transform the data so that it is normally distributed. Why?
In most cases, we normalize the data if we’re going to be using statistical methods
that rely on normally distributed data. How?

One can use the log function, or another normalizing transformation.
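As a sketch of the log approach, using an invented right-skewed sample:

```python
import math

# A right-skewed sample: a handful of large values dominate the range.
skewed = [1, 2, 3, 10, 100, 1000]

# The log transform compresses the long tail, pulling the
# distribution toward a more symmetric, plot-friendly shape.
logged = [math.log(v) for v in skewed]
print([round(v, 2) for v in logged])
```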


Missing values

The fact that missing values are unavoidable leaves us with the question of what to
do when we encounter them. Ignoring missing data is the same as digging holes in
a boat; it will sink.
There are three, or perhaps more, ways to deal with them.

— One. Drop.
If the missing values in a column rarely happen and occur at random, then the easiest
and most forward solution is to drop observations (rows) that have missing values.
If most of the column’s values are missing, and occur at random, then a typical
decision is to drop the whole column.
This is particularly useful when doing statistical analysis, since filling in the missing
values may yield unexpected or biased results.

— Two. Impute.
It means to calculate the missing value based on other observations. There are quite
a lot of methods to do that.
— The first is using statistical values like the mean or median. However, none of these
guarantees unbiased data, especially if there are many missing values.

Mean is most useful when the original data is not skewed, while the median is
more robust, not sensitive to outliers, and thus used when data is skewed.
In normally distributed data, almost all values lie within 2 standard deviations of
the mean. One can therefore fill in the missing values by generating random
numbers between (mean − 2 * std) and (mean + 2 * std).
— Second. Using a linear regression. Based on the existing data, one can calculate
the best fit line between two variables, say, house price vs. size m².
It is worth mentioning that linear regression models are sensitive to outliers.
— Third. Hot-deck: Copying values from other similar records. This is only useful
if you have enough available data. And, it can be applied to numerical and
categorical data.


One can take the random approach, where we fill in the missing value with a
random value. Taking this approach one step further, one can first divide the dataset
into two groups (strata) based on some characteristic, say gender, and then fill in
the missing values for different genders separately, at random.
In sequential hot-deck imputation, the column containing missing values is sorted
according to auxiliary variable(s) so that records that have similar auxiliaries
occur sequentially. Next, each missing value is filled in with the value of the first
following available record.
What is more interesting is that k-nearest-neighbor imputation, which groups
similar records together, can also be utilized. A missing value is then
filled in by first finding the k records closest to the record with the missing value.
Next, a value is chosen from (or computed out of) those k nearest neighbors. In the
case of computing, statistical methods like the mean (as discussed before) can be used.
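The mean-imputation and random-within-two-standard-deviations approaches described above can be sketched in Python; the ages and the fixed random seed are illustrative choices:

```python
import random
import statistics

def impute_mean(values):
    """Replace None with the mean of the observed values."""
    present = [v for v in values if v is not None]
    mean = statistics.mean(present)
    return [mean if v is None else v for v in values]

def impute_random(values, seed=0):
    """Replace None with a random draw from (mean - 2*std, mean + 2*std)."""
    present = [v for v in values if v is not None]
    mean, std = statistics.mean(present), statistics.stdev(present)
    rng = random.Random(seed)  # fixed seed for reproducibility
    return [rng.uniform(mean - 2 * std, mean + 2 * std) if v is None else v
            for v in values]

ages = [23, None, 31, 27, None, 25]
print(impute_mean(ages))    # each None becomes the mean, 26.5
print(impute_random(ages))  # each None becomes a plausible random age
```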

— Three. Flag.
Some argue that filling in the missing values leads to a loss of information, no matter
what imputation method we use.
That’s because saying that the data is missing is informative in itself, and the
algorithm should know about it. Otherwise, we’re just reinforcing the patterns that
already exist in other features.
This is particularly important when the missing data doesn’t happen at random. Take
for example a conducted survey where most people from a specific race refuse to
answer a certain question.
Missing numeric data can be filled in with, say, 0, but these zeros must be ignored
when calculating any statistical value or plotting the distribution.
Categorical data, meanwhile, can be filled in with, say, “Missing”: a new category
which tells us that this piece of data is missing.

— Take into consideration …


Missing values are not the same as default values. For instance, zero can be
interpreted as either missing or default, but not both.


Missing values are not “unknown”. In a survey where some people didn’t remember
whether or not they had been bullied at school, such answers should be treated and
labelled as unknown, not missing.
Every time we drop or impute values we are losing information. So, flagging might
come to the rescue.

Outliers

They are values that are significantly different from all other observations. Any data
value that lies more than 1.5 * IQR below the Q1 quartile or above the Q3 quartile
is considered an outlier.
Outliers are innocent until proven guilty. That being said, they should not be
removed unless there is a good reason to.
For example, one may notice some weird, suspicious values that are unlikely to
happen, and so decide to remove them. They are, though, worth investigating before
removal.
It is also worth mentioning that some models, like linear regression, are very
sensitive to outliers. In other words, outliers might throw the model off from where
most of the data lie.
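The 1.5 × IQR rule can be sketched with the standard library's `statistics.quantiles` (the helper name `iqr_outliers` is our own):

```python
import statistics

def iqr_outliers(data):
    """Return values lying more than 1.5 * IQR outside [Q1, Q3]."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < lower or x > upper]

print(iqr_outliers([10, 12, 12, 13, 12, 11, 14, 13, 15, 102]))  # → [102]
```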

Verifying
When done, one should verify correctness by re-inspecting the data and making sure
its rules and constraints still hold.
For example, the values filled in for missing data might themselves violate some of
the rules and constraints.
It might involve some manual correction if not possible otherwise.

Reporting
Reporting how healthy the data is, is equally important to cleaning.
As mentioned before, software packages or libraries can generate reports of the
changes made, which rules were violated, and how many times.


In addition to logging the violations, the causes of these errors should be considered.
Why did they happen in the first place?

2. Data Integration

Data integration is the process where data from different data sources are integrated
into one. Data lies in different formats in different locations. Data could be stored in
databases, text files, spreadsheets, documents, data cubes, Internet and so on. Data
integration is a really complex and tricky task because data from different sources
does not normally match. Suppose a table A contains an entity named customer_id
whereas another table B contains an entity named number. It is really difficult to
determine whether both these entities refer to the same value or not. Metadata can
be used effectively to reduce errors in the data integration step. Another issue faced
is data redundancy. The same data might be available in different tables in the same
database or even in different data sources. Data integration tries to reduce
redundancy to the maximum possible level without affecting the reliability of data.
Strategies:

ETL
ETL (Extraction, Transformation and Loading) refers to the data integration
approach where data is extracted from the Origin Systems, then goes through the
transformation process and ends up loading into the Destination System.
The typical ETL scenario is a Data Warehouse system.
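A toy sketch of the Extraction, Transformation and Loading flow may help; every function and field name here is invented for illustration:

```python
def extract():
    """Pull raw rows from a source system (hard-coded here for illustration)."""
    return [{"id": 1, "amount": "10.50"}, {"id": 2, "amount": "3.00"}]

def transform(rows):
    """Cast amounts to floats and add a derived amount-with-tax column."""
    return [
        {"id": r["id"], "amount": float(r["amount"]),
         "amount_with_tax": round(float(r["amount"]) * 1.2, 2)}
        for r in rows
    ]

def load(rows, warehouse):
    """Append the cleaned rows to the destination store."""
    warehouse.extend(rows)

warehouse = []
load(transform(extract()), warehouse)  # ETL: transform happens *before* loading
```

In the ELT variant, `load` would instead run on the raw extracted rows and the transformation would execute inside the destination system.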


ELT
ELT (Extraction, Loading and Transformation) refers to the data integration
approach where data is extracted from the source systems and then loaded into the
destination system without transformations. The data transformation is performed
later in the target system. The ELT scenario is typical of Big Data/Hadoop systems.

Batch vs Real Time in Data Integration

BATCH
• In batch processing, a large group of transactions is collected and the data is
processed during a single execution.
• Due to the large volume of data, the process must be executed when resources
are less busy (this step is usually done at night).
• Batch processing delays access to data, requires close monitoring, and data may
not be available for a period of time.
• Due to the delay in data access, knowledge is lost until processing is complete.
• Problems that occur during batch processing can delay the entire process, so you
need staff support to monitor it while it runs.

REAL-TIME
• Real-time processing handles small groups of transactions on demand.
• The advantage of real-time processing is that it provides instant access to data,
runs with fewer resources, and improves uptime.
• With real-time data integration, you know your business as transactions occur.
• If an error occurs, it can be handled immediately.
• Real-time processing design is more complex.
• Although real-time processing systems require more effort to design and
implement, the benefit to the business can be enormous.

3. Data Selection
The data mining step requires large volumes of historical data for analysis, so the
data repository with integrated data usually contains much more data than is actually
required. From the available data, the data of interest needs to be selected and stored.
Data selection is the step where the data relevant to the analysis is retrieved from the
database.
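A minimal sketch of data selection on a toy record set (all field names are invented):

```python
records = [
    {"customer_id": 1, "age": 34, "city": "Lyon", "churned": True},
    {"customer_id": 2, "age": 51, "city": "Paris", "churned": False},
    {"customer_id": 3, "age": 29, "city": "Paris", "churned": True},
]

# Keep only the attributes relevant to the analysis, for Paris customers only.
relevant = ["age", "churned"]
selected = [
    {k: r[k] for k in relevant}
    for r in records
    if r["city"] == "Paris"
]
```

The same selection in SQL would be a `SELECT age, churned FROM records WHERE city = 'Paris'`.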

4. Data Mining
Data mining is the core step, where a number of complex and intelligent methods are
applied to extract patterns from data. The data mining step includes a number of tasks
such as association, classification, prediction, clustering, time series analysis, and
so on.

5. Pattern Evaluation
The pattern evaluation identifies the truly interesting patterns representing
knowledge based on different types of interestingness measures. A pattern is
considered to be interesting if it is potentially useful, easily understandable by
humans, validates some hypothesis that someone wants to confirm or valid on new
data with some degree of certainty.

6. Knowledge Representation
The information mined from the data needs to be presented to the user in an
appealing way. Different knowledge representation and visualization techniques are
applied to provide the output of data mining to the users.


7. Data Mining Process for Enterprise (CRISP-DM)
Many different sectors are taking advantage of data mining to boost their business
efficiency, including manufacturing, chemicals, marketing, aerospace, etc. Therefore,
the need for a standard, effective data mining process grew. Data mining
techniques must be reliable and repeatable by company individuals with little or no
knowledge of the data mining context. As a result, the Cross-Industry Standard
Process for Data Mining (CRISP-DM) was first introduced in 1996, after going
through many workshops and contributions from more than 300 organizations.
CRISP-DM comprises six phases designed as a cyclical method, as in the given
figure:

Let's examine the implementation process for data mining in detail:


1. Business understanding
It focuses on understanding the project goals and requirements from a business point
of view, then converting this information into a data mining problem, and afterward
designing a preliminary plan to accomplish the target.
Tasks:
• Determine business objectives
• Assess situation
• Determine data mining goals
• Produce a project plan
Determine business objectives:
• It understands the project targets and prerequisites from a business point of
view.
• Thoroughly understand what the customer wants to achieve.
• Reveal significant factors that, from the start, can impact the result of the
project.
Assess situation:


• It requires a more detailed analysis of facts about all the resources, constraints,
assumptions, and others that ought to be considered.
Determine data mining goals:
• A business goal states the target in business terminology. For example,
increase catalog sales to existing customers.
• A data mining goal describes the project objectives in technical terms. For
example, predict how many items a customer will buy, given their
demographic details (age, salary, and city) and the price of the item over the
past three years.
Produce a project plan:
• It states the targeted plan to accomplish the business and data mining plan.
• The project plan should define the expected set of steps to be performed
during the rest of the project, including the initial selection of techniques
and tools.

2. Data Understanding
Data understanding starts with the original data collection and proceeds with
operations to get familiar with the data, identify data quality issues, find better
insight in the data, or detect interesting subsets to form hypotheses about concealed
information.
Tasks:
• Collects initial data
• Describe data
• Explore data
• Verify data quality
Collect initial data:
• It acquires the information mentioned in the project resources.
• It includes data loading if needed for data understanding.
• It may lead to original data preparation steps.
• If various information sources are acquired then integration is an extra issue,
either here or at the subsequent stage of data preparation.


Describe data:
• It examines the "gross" or "surface" characteristics of the information
obtained.
• It reports on the outcomes.
Explore data:
• Addressing data mining issues that can be resolved by querying, visualizing,
and reporting, including:
o Distribution of important characteristics, results of simple aggregation.
o Establish the relationship between the small number of attributes.
o Characteristics of important sub-populations, simple statistical analysis.
• It may refine the data mining objectives.
• It may contribute or refine the information description, and quality reports.
• It may feed into the transformation and other necessary information
preparation.
Verify data quality:
• It examines the quality of the data and addresses data quality questions.

3. Data Preparation
• It usually takes the largest share of the project time.
• It covers all operations to build the final data set from the original raw
information.
• Data preparation is probable to be done several times and not in any
prescribed order.
Tasks:
• Select data
• Clean data
• Construct data
• Integrate data


• Format data
Select data:
• It decides which information is to be used for evaluation.
• Data selection criteria include relevance to the data mining objectives,
quality, and technical limitations such as data volume boundaries or data types.
• It covers the selection of attributes (columns) as well as the choice of
records (rows) in the table.
Clean data:
• It may involve the selection of clean subsets of data, inserting appropriate
defaults or more ambitious methods, such as estimating missing information
by modeling.
Construct data:
• It comprises constructive data preparation operations, such as generating
derived characteristics, complete new records, or transformed values of
current characteristics.
Integrate data:
• Integrate data refers to the methods whereby data is combined from various
tables, or documents to create new documents or values.
Format data:
• Formatting data refers mainly to syntactic changes made to the data that do
not alter its meaning but may be required by the modeling tool.

4. Modeling
In modeling, various modeling methods are selected and applied, and their
parameters are calibrated to optimal values. Some methods have particular
requirements on the form of the data. Therefore, stepping back to the data
preparation phase may be necessary.
Tasks:
• Select modeling technique
• Generate test design


• Build model
• Assess model
Select modeling technique:
• It selects the real modeling method that is to be used. For example, decision
tree, neural network.
• If various methods are applied, then it performs this task individually for each
method.
Generate test Design:
• Generate a procedure or mechanism for testing the validity and quality of the
model before constructing a model. For example, in classification, error rates
are commonly used as quality measures for data mining models. Therefore,
typically separate the data set into train and test set, build the model on the
train set and assess its quality on the separate test set.
Build model:
• To create one or more models, we need to run the modeling tool on the
prepared data set.
Assess model:
• It interprets the models according to domain expertise, the data mining
success criteria, and the required design.
• It assesses the success of the application of modeling and discovers methods
more technically.
• It contacts business analysts and domain specialists later to discuss the
outcomes of data mining in the business context.

5. Evaluation
• It evaluates the model efficiency and reviews the steps executed to build the
model, to ensure that the business objectives are properly achieved.
• The main objective of the evaluation is to determine whether some significant
business issue has not been regarded adequately.
• At the end of this phase, a decision on the use of the data mining results
should be reached.
Tasks:
• Evaluate results
• Review process
• Determine next steps
Evaluate results:
• It assesses the degree to which the model meets the organization's business
objectives.
• It tests the model on test applications in the actual implementation when
time and budget limitations permit, and also assesses other data mining
results produced.
• It unveils additional difficulties, suggestions, or information for future
instructions.
Review process:
• The review process does a more detailed evaluation of the data mining
engagement to determine whether there is a significant factor or task that has
somehow been ignored.
• It reviews quality assurance problems.
Determine next steps:
• It decides how to proceed at this stage.
• It decides whether to complete the project and move on to deployment when
necessary, or whether to initiate further iterations or set up new data mining
initiatives. It includes the analysis of resources and budget that influence the
decisions.

6. Deployment
• Deployment refers to how the outcomes need to be utilized.


Deploy data mining results by:
• It includes scoring a database, utilizing results as company guidelines, or
interactive internet scoring.
• The information acquired needs to be organized and presented in a way that
the client can use. Depending on the demands, the deployment phase may be
as simple as generating a report or as complicated as applying a repeatable
data mining method across the organization.
Tasks:
• Plan deployment
• Plan monitoring and maintenance
• Produce final report
• Review project
Plan deployment:
• To deploy the data mining outcomes into the business, it takes the assessment
results and concludes a strategy for deployment.
• It refers to documentation of the process for later deployment.
Plan monitoring and maintenance:
• It is important when the data mining results become part of the day-to-day
business and its environment.
• It helps to avoid unnecessarily long periods of misuse of data mining results.
• It needs a detailed analysis of the monitoring process.
Produce final report:
• A final report can be drawn up by the project leader and his team.
• It may only be a summary of the project and its experience.
• It may be a final and comprehensive presentation of data mining.
Review project:


• The project review evaluates what went right and what went wrong, what
was done well, and what needs to be improved.

8. Types of data that can be mined


1. Data stored in the database
A database system is also called a database management system or DBMS. Every
DBMS stores data that are related to each other in one way or another. It also has a set of
software programs that are used to manage data and provide easy access to it. These
software programs serve a lot of purposes, including defining structure for database,
making sure that the stored information remains secured and consistent, and
managing different types of data access, such as shared, distributed, and concurrent.
A relational database has tables that have different names, attributes, and can store
rows or records of large data sets. Every record stored in a table has a unique key.
Entity-relationship model is created to provide a representation of a relational
database that features entities and the relationships that exist between them.

2. Data warehouse
A data warehouse is a single data storage location that collects data from multiple
sources and then stores it under a unified schema. When data is stored in a data
warehouse, it undergoes cleaning, integration, loading, and refreshing. Data stored
in a data warehouse is organized in several parts. If you want information on data
that was stored 6 or 12 months back, you will get it in the form of a summary.

3. Transactional data
A transactional database stores records that are captured as transactions. These
transactions include flight booking, customer purchase, click on a website, and
others. Every transaction record has a unique ID. It also lists all those items that
made it a transaction.


4. Other types of data


We have a lot of other types of data known for their structure, semantic meanings,
and versatility. They are used in a lot of applications. Here are a few of those data
types: data streams, engineering design data, sequence data, graph data, spatial data,
multimedia data, and more.

9. Data Mining Techniques

1. Association
It is one of the most used data mining techniques out of all the others. In this
technique, a transaction and the relationship between its items are used to identify a
pattern. This is the reason this technique is also referred to as a relation technique. It
is used to conduct market basket analysis, which is done to find out all those products
that customers buy together on a regular basis.


This technique is very helpful for retailers who can use it to study the buying habits
of different customers. Retailers can study sales data of the past and then lookout for
products that customers buy together. Then they can put those products in close
proximity of each other in their retail stores to help customers save their time and to
increase their sales.
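The pair counting at the heart of market basket analysis can be sketched with the standard library; this is a toy stand-in for full association-rule algorithms such as Apriori, and the transactions are invented:

```python
from collections import Counter
from itertools import combinations

transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "coffee"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

print(pair_counts.most_common(2))  # the most frequently co-purchased pairs
```

Pairs with high counts are candidates for being shelved together or bundled in promotions.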

2. Clustering
This technique creates meaningful object clusters that share the same characteristics.
People often confuse it with classification, but if they properly understand how both
these techniques work, they won’t have any issue. Unlike classification that puts
objects into predefined classes, clustering puts objects in classes that are defined by
it.
Let us take an example. A library is full of books on different topics. Now the
challenge is to organize those books in a way that readers don’t have any problem in
finding out books on a particular topic. We can use clustering to keep books with
similarities in one shelf and then give those shelves a meaningful name. Readers
looking for books on a particular topic can go straight to that shelf. They won’t be
required to roam the entire library to find their book.
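A bare-bones one-dimensional k-means, written from scratch here purely for illustration of the clustering idea (real work would use a library implementation):

```python
def kmeans_1d(points, centroids, iterations=10):
    """Tiny 1-D k-means: assign each point to its nearest centroid,
    then recompute each centroid as the mean of its cluster."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Two obvious groups; the algorithm discovers them without predefined labels.
centroids, clusters = kmeans_1d([1.0, 1.5, 2.0, 10.0, 10.5, 11.0], [0.0, 5.0])
```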


3. Classification
This technique finds its origins in machine learning. It classifies items or variables
in a data set into predefined groups or classes. It uses linear programming, statistics,
decision trees, and artificial neural networks in data mining, amongst other
techniques. Classification is used to develop software that can be modelled in a way
that it becomes capable of classifying items in a data set into different classes.
For instance, we can use it to classify all the candidates who attended an interview
into two groups – the first group is the list of those candidates who were selected
and the second is the list that features candidates that were rejected. Data mining
software can be used to perform this classification job.
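As a toy illustration of classification, the interview example above can be expressed as a simple nearest-neighbour rule; the feature choices and data are invented, and this is one of many possible classification techniques:

```python
def classify_1nn(training, query):
    """Assign the query the label of its closest training example.

    `training` is a list of ((feature1, feature2), label) pairs.
    """
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(training, key=lambda item: distance(item[0], query))
    return label

# Toy interview data: (years_experience, test_score) -> selected / rejected
train = [((1, 40), "rejected"), ((2, 45), "rejected"),
         ((5, 80), "selected"), ((6, 85), "selected")]
print(classify_1nn(train, (5, 75)))  # → selected
```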

4. Prediction
This technique predicts the relationship that exists between independent and
dependent variables as well as independent variables alone. It can be used to predict
future profit depending on the sale. Let us assume that profit and sale are dependent


and independent variables, respectively. Now, based on what the past sales data says,
we can make a profit prediction of the future using a regression curve.
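The profit-from-sales prediction described above amounts to fitting a regression line. A minimal ordinary-least-squares fit in plain Python, with invented data chosen to lie exactly on the line profit = 0.2 × sales + 1:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

sales  = [10, 20, 30, 40]          # past sales (independent variable)
profit = [3, 5, 7, 9]              # past profit (dependent variable)
slope, intercept = fit_line(sales, profit)
forecast = slope * 50 + intercept  # predicted profit for sales of 50
```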

5. Sequential patterns
This technique aims to use transaction data, and then identify similar trends, patterns,
and events in it over a period of time. The historical sales data can be used to discover
items that buyers bought together at different times of the year. Businesses can make
sense of this information by recommending that customers buy those products at times
when the historical data doesn’t suggest they would. Businesses can use lucrative
deals and discounts to push through this recommendation.

6. Regression
Regression analysis in the data mining process is used to identify and analyze the
relationship between variables because of the presence of the other factor. It is used
to define the probability of a specific variable. Regression is primarily a form of
planning and modeling. For example, we might use it to project certain costs,
depending on other factors such as availability, consumer demand, and competition.
Primarily it gives the exact relationship between two or more variables in the given
data set.


7. Outlier (Anomaly) detection


This type of data mining technique relates to the observation of data items in the data
set, which do not match an expected pattern or expected behavior. This technique
may be used in various domains like intrusion, detection, fraud detection, etc. It is
also known as Outlier Analysis or Outlier mining. The outlier is a data point that
diverges too much from the rest of the dataset. The majority of the real-world
datasets have an outlier. Outlier detection plays a significant role in the data mining
field. Outlier detection is valuable in numerous fields like network interruption
identification, credit or debit card fraud detection, detecting outlying in wireless
sensor network data, etc.
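One simple outlier-mining heuristic is the z-score test, sketched below with the standard library; the threshold of 2.0 is an arbitrary choice for this tiny, invented sample (3.0 is a common default on larger data):

```python
import statistics

def zscore_anomalies(data, threshold=2.0):
    """Flag values whose z-score magnitude exceeds the threshold."""
    mean = statistics.mean(data)
    stdev = statistics.stdev(data)  # sample standard deviation
    return [x for x in data if abs((x - mean) / stdev) > threshold]

print(zscore_anomalies([10, 11, 10, 12, 11, 10, 11, 12, 10, 50]))  # → [50]
```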


10. Data Mining Applications


1. Financial Analysis
The banking and finance industry relies on high-quality, reliable data. In loan
markets, financial and user data can be used for a variety of purposes, like predicting
loan payments and determining credit ratings. And data mining methods make such
tasks more manageable.
Classification techniques facilitate separation of crucial factors that influence
customers’ banking decisions from the irrelevant ones. Further, multidimensional
clustering techniques allow identification of customers with similar loan payment
behaviors. Data analysis and mining can also help detect money laundering and other
financial crimes.

2. Telecommunication Industry
The telecommunication industry is expanding and growing at a fast pace, especially
with the advent of the internet. Data mining can enable key industry players to
improve their service quality to stay ahead in the game.
Pattern analysis of spatiotemporal databases can play a huge role in mobile
telecommunication, mobile computing, and also web and information services. And
techniques like outlier analysis can detect fraudulent users. Also, OLAP and
visualization tools can help compare information, such as user group behavior,
profit, data traffic, system overloads, etc.

3. Intrusion Detection
Global connectivity in today’s technology-driven economy has presented security
challenges for network administration. Network resources can face threats and
actions that intrude on their confidentiality or integrity. Therefore, detection of
intrusion has emerged as a crucial data mining practice.
It encompasses association and correlation analysis, aggregation techniques,
visualization, and query tools, which can effectively detect any anomalies or
deviations from normal behaviour.


4. Retail Industry
The organized retail sector holds sizable quantities of data points covering sales,
purchasing history, delivery of goods, consumption, and customer service. The
databases have become even larger with the arrival of e-commerce marketplaces.
In modern-day retail, data warehouses are being designed and constructed to get the
full benefits of data mining. Multidimensional data analysis helps deal with data
related to different types of customers, products, regions, and time zones. Online
retailers can also recommend products to drive more sales revenue and analyze the
effectiveness of their promotional campaigns. So, from noticing buying patterns to
improving customer service and satisfaction, data mining opens many doors in this
sector.

5. Higher Education
As the demand for higher education goes up worldwide, institutions are looking for
innovative solutions to cater to the rising needs. Institutions can use data mining to
predict which students would enroll in a particular program and who would require
additional assistance to graduate, refining enrollment management overall.
Moreover, the prognosis of students’ career paths and presentation of data would
become more comfortable with effective analytics. In this manner, data mining
techniques can help uncover the hidden patterns in massive databases in the field of
higher education.

6. Energy Industry
Big Data is available even in the energy sector nowadays, which points to the need
for appropriate data mining techniques. Decision tree models and support vector
machine learning are among the most popular approaches in the industry, providing
feasible solutions for decision-making and management. Additionally, data mining
can also achieve productive gains by predicting power outputs and the clearing price
of electricity.


7. Spatial Data Mining


Geographic Information Systems (GIS) and several other navigation applications
make use of data mining to secure vital information and understand its implications.
This new trend includes the extraction of geographical, environmental, and
astronomical data, including images from outer space. Typically, spatial data mining
can reveal aspects like topology and distance.

8. Biological Data Analysis


Biological data mining practices are common in genomics, proteomics, and
biomedical research. From characterizing patients’ behavior and predicting office
visits to identifying medical therapies for their illnesses, data science techniques
provide multiple advantages.
Some of the data mining applications in the Bioinformatics field are:
• Semantic integration of heterogeneous and distributed databases
• Association and path analysis
• Use of visualization tools
• Structural pattern discovery
• Analysis of genetic networks and protein pathways

9. Other Scientific Applications


Fast numerical simulations in scientific fields like chemical engineering, fluid
dynamics, climate, and ecosystem modeling generate vast datasets. Data mining
brings capabilities like data warehouses, data preprocessing, visualization, graph-
based mining, etc.

10. Manufacturing Engineering


System-level designing makes use of data mining to extract relationships between
portfolios and product architectures. Moreover, the methods also come in handy for
predicting product costs and span time for development.


11. Criminal Investigation


Data mining activities are also used in Criminology, which is a study of crime
characteristics. First, text-based crime reports need to be converted into word
processing files. Then, the identification and crime-matching process takes place by
discovering patterns in massive stores of data.

12. Counter-Terrorism
Sophisticated mathematical algorithms can indicate which intelligence unit should
play the headliner in counter-terrorism activities. Data mining can even help with
police administration tasks, like determining where to deploy the workforce and
denoting the searches at border crossings.


11. Advantages and Disadvantages of Data


Mining
1. Advantages of Data Mining
Marketing / Retail
Data mining helps marketing companies build models based on historical data to
predict who will respond to the new marketing campaigns such as direct mail, online
marketing campaign…etc. Through the results, marketers will have an appropriate
approach to selling profitable products to targeted customers.
Data mining brings a lot of benefits to retail companies in the same way as
marketing. Through market basket analysis, a store can arrange its products so that
customers can conveniently buy frequently purchased products together. In addition,
it also helps retail companies offer discounts on particular products that will attract
more customers.

Finance / Banking
Data mining gives financial institutions information about loan information and
credit reporting. By building a model from historical customer data, the bank, and
financial institution can determine good and bad loans. In addition, data mining helps
banks detect fraudulent credit card transactions to protect the credit card’s owner.

Manufacturing
By applying data mining to operational engineering data, manufacturers can detect
faulty equipment and determine optimal control parameters. For example,
semiconductor manufacturers face the challenge that, even when the manufacturing
conditions at different wafer production plants are similar, the quality of the wafers
varies, and some wafers even have defects for unknown reasons. Data mining has
been applied to determine the ranges of control parameters that lead to the
production of the golden wafer. Those optimal control parameters are then used to
manufacture wafers with the desired quality.


Governments
Data mining helps government agencies by digging into and analyzing records of
financial transactions to build patterns that can detect money laundering or criminal
activities.

2. Disadvantages of data mining


Privacy Issues
Concerns about personal privacy have been increasing enormously recently,
especially as the internet booms with social networks, e-commerce, forums, blogs,
and so on. Because of privacy issues, people are afraid that their personal
information will be collected and used in an unethical way, potentially causing them
a lot of trouble. Businesses collect information about their customers in many ways
to understand their purchasing behavior trends. However, businesses don't last
forever; some day they may be acquired by others or be gone. At that point, the
personal information they own may be sold to others or leaked.

Security issues
Security is a big issue. Businesses own information about their employees and
customers, including social security numbers, birthdays, payroll, etc. However, how
properly this information is taken care of is still in question. There have been a lot
of cases where hackers accessed and stole big customer data from large corporations
such as Ford Motor Credit Company and Sony. With so much personal and financial
information available, stolen credit cards and identity theft have become a big
problem.

Misuse of information/inaccurate information


Information collected through data mining for ethical purposes can be misused. This
information may be exploited by unethical people or businesses to take advantage of
vulnerable people or to discriminate against a group of people.
In addition, data mining techniques are not perfectly accurate. Therefore, if
inaccurate information is used for decision-making, it can cause serious
consequences.


12. Data Mining Tools


There are many tools in the market, both open source and proprietary, with varying
levels of sophistication. At the root, each tool helps with implementing a data mining
strategy, but the difference lies in the level of sophistication you, the customer of
the software, need. There are tools that do well in a specific domain, such as the
financial domain or the scientific domain.

1. Rapid Miner
A data science software platform providing an integrated environment for various
stages of data modelling, including data preparation, data cleansing, exploratory data
analysis, visualization and more. The techniques the software helps with are
machine learning, deep learning, text mining and predictive analytics. It offers
easy-to-use GUI tools that take you through the modelling process. Written entirely
in Java, this open-source framework is widely popular in the data mining world.

2. Oracle Data Mining


Oracle, the world leader in database software, combines its prowess in database
technologies with analytical tools in Oracle Advanced Analytics, part of the Oracle
Database Enterprise Edition. It features several data mining algorithms for
classification, regression, prediction, anomaly detection and more. This is
proprietary software, and Oracle technical staff support your business in building a
robust data mining infrastructure at enterprise scale.
The algorithms integrate directly with the Oracle database kernel and operate
natively on data stored in the database, eliminating the need to extract data into
standalone analytics servers. Oracle Data Miner provides GUI tools that take the
user through the process of creating, testing and applying data models.


3. IBM SPSS Modeler


IBM is again a big name in the data space when it comes to large enterprises. It
combines well with leading technologies to implement a robust enterprise-wide
solution. IBM SPSS Modeler is a visual data science and machine learning solution,
helping in shortening the time to value by speeding up operational tasks for data
scientists. IBM SPSS Modeler will have you covered from drag and drop data
exploration to machine learning.
The software is used in leading enterprises for data preparation, discovery, predictive
analytics, model management and deployment. The tool helps organizations to tap
into their data assets and applications easily. One of the advantages of proprietary
software is its ability to meet robust governance and security requirements of an
organization at the enterprise level, and this reflects in every tool that IBM offers on
the data mining front.

4. KNIME
The Konstanz Information Miner (KNIME) is an open-source data analysis platform
that helps you build, deploy and scale in no time. The tool aims to make predictive
intelligence accessible to inexperienced users through step-by-step, guide-based
GUI tools. The product markets itself as an end-to-end data science platform that
helps you create and productionize data science in a single, easy and intuitive
environment.

5. Python
Python is a freely available, open-source language known for its quick learning
curve. Its strength as a general-purpose language, combined with a large library of
packages for building data models from scratch, makes Python a great tool for
organizations that want the software they use to be custom-built to their
specifications.


With Python, you won’t get the fancy features that proprietary software offers, but
the functionality is there for anybody to pick up and create their own environment
with graphical interfaces of their liking. Python is also supported by a large online
community of package developers who help ensure the packages on offer are robust
and secure. One feature Python is known for in this field is the powerful on-the-fly
visualization it offers.
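As a hedged sketch of the "data models from scratch" idea mentioned above, here is an ordinary least-squares line fit using only the standard library; the `fit_line` function name and the spend/sales figures are invented for the example:

```python
# Fit y = a + b*x by ordinary least squares, standard library only.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance(x, y) divided by variance(x)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x  # intercept
    return a, b

# Hypothetical data: ad spend vs. sales
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(spend, sales)
print(f"sales ~ {a:.2f} + {b:.2f} * spend")
```

In practice a library such as scikit-learn would be used instead, but the point stands: with Python, nothing stops you from building the model yourself.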

6. Orange
Orange is a machine learning and data science suite that combines Python scripting
with visual programming, featuring interactive data analysis and component-based
assembly of data mining workflows. Orange offers a broader range of features than
most other Python-based data mining and machine learning tools, and has over
15 years of active development and use behind it. It also provides a visual
programming platform with a GUI for interactive data visualization.

7. Kaggle
Kaggle is the largest community of data scientists and machine learning
professionals. Although it started as a platform for machine learning competitions,
Kaggle is now extending its footprint into the public cloud-based data science
platform arena, offering code and data for your data science implementations. There
are over 50k public datasets and 400k public notebooks that you can use to ramp up
your data mining efforts, and the huge online community that Kaggle enjoys is your
safety net for implementation-specific challenges.

8. Rattle
Rattle is an R-based GUI tool for data mining. The tool is free and open source and
can be used to get statistical and visual summaries of data, transform data for data
models, build supervised and unsupervised machine learning models, and compare
model performance graphically.


9. Weka
Waikato Environment for Knowledge Analysis (Weka) is a suite of machine
learning tools written in Java. It provides a collection of GUI-based visualization
tools for predictive modelling, helping you build and test your data models and
observe their performance graphically.

10. Teradata
A cloud data analytics platform marketing its no-code tools in a comprehensive
package offering enterprise-scale solutions. With Vantage Analyst, you don’t need
to be a programmer to apply complex machine learning algorithms; its simple
GUI-based system enables quick enterprise-wide adoption.

11. MonkeyLearn
MonkeyLearn is a machine learning platform that specializes in text mining.
Available in a user-friendly interface, you can easily integrate MonkeyLearn with
your existing tools to perform data mining in real-time.
MonkeyLearn supports various data mining tasks, from detecting topics, sentiment,
and intent, to extracting keywords and named entities.
With MonkeyLearn, you can also connect your analyzed data to MonkeyLearn
Studio, a customizable data visualization dashboard that makes it even easier to
detect trends and patterns in your data.


13. Data Mining Books


1. Introduction to Data Mining
by Tan, Steinbach & Kumar
This is a very good introductory book for data mining. It
discusses all the main topics of data mining: clustering,
classification, pattern mining, and outlier detection, and
it contains two very good chapters on clustering.
The book gives both theoretical and practical knowledge
of all data mining topics, with many integrated examples
and figures. Every important topic is presented in two chapters,
beginning with basic concepts that provide the necessary background for learning
each data mining technique, then covering more advanced concepts and algorithms.

2. An Introduction to Statistical Learning: with Applications in R
by Gareth James, Daniela Witten, Trevor Hastie & Robert Tibshirani
An overview of statistical learning based on large datasets, with
exploratory techniques discussed using the R programming
language. The book presents important prediction and modeling
techniques along with relevant applications, covering topics such
as linear regression, classification, clustering, shrinkage
approaches, resampling methods, tree-based methods, and
support vector machines. Color graphics and real-world examples illustrate the
methods presented.


3. Data Science for Business: What You Need to Know about
Data Mining and Data-Analytic Thinking
by Foster Provost & Tom Fawcett
An introduction to data science principles and theory that
explains the analytical thinking needed to approach business
problems, and discusses various data mining techniques for
exploring information. You will learn to view business problems
data-analytically, using the data mining process to collect
good data in the appropriate way. The book will help you
understand general concepts for gaining knowledge from data.

4. Modeling with Data


by Ben Klemens
This book focuses on processes for solving analytical
problems applied to data. In particular, it explains the
theory behind creating tools for exploring big datasets.
The book also offers a narrative covering the necessary points
of statistics, while acknowledging that it is incomplete
relative to the encyclopedic texts.

5. Mining the Social Web


by Matthew A. Russell
Data Mining Facebook, Twitter, LinkedIn, Google+,
GitHub, and More.
This book explains the exploration of social web data:
capturing data from social media apps, manipulating it,
and applying visualization tools to the results.


The book provides an accurate synopsis of the social web landscape, shows how to
use Docker to smoothly run each chapter’s example code (packaged as Jupyter
notebooks), and explains how to adapt and contribute to the code’s open-source
GitHub repository.

14. References:
1. http://www.saedsayad.com/data_mining_map.htm
2. https://www.javatpoint.com/data-mining
3. https://www.ibm.com/cloud/learn/data-mining
4. https://www.discoverdatascience.org/career-information/data-mining-specialist
5. https://www.codemotion.com/magazine/dev-hub/big-data-analyst/data-analytics-data-mining
6. https://www.wideskills.com/data-mining-tutorial/data-mining-processes
7. https://www.upgrad.com/blog/data-mining-techniques
8. https://www.zentut.com/data-mining/advantages-and-disadvantages-of-data-mining


Data Insight
What is Data Insight?
Data insights refer to the process of collecting, analyzing, and acting on data related
to your company and its clients. The goal is simple: make better decisions. A good
data management and analysis strategy helps companies monitor the health of critical
systems, streamline processes, and improve profitability.
When people talk about data insights, they are usually referring to three core
components:
• Data: think about what happens when you go on Facebook. When you log
into the app, you allow it to collect information about where you live, articles
you read, places you’ve been, where you went to college, and other sensitive
information. It can even send you push notifications each time you get a
comment or reaction. Every time you use Facebook, it sends this unstructured
data to a warehouse or database in the form of numbers and text.
• Analytics: So what does Facebook do with all this information? With the
help of data insights tools, the social media giant will break your information
down into smaller groups to create a detailed buyer persona of you. Facebook
can then understand how and why you behave the way you do. They can tell
from the data that you’re getting married or planning to buy a house based on
your actions in the app.
• Insights: Now that Facebook better understands you, they can send
personalized offers to you and other customers similar to you. The platform
can now send targeted messages about new homes to buy in your area, or
offers for low mortgage rates. Actionable insights are what you learn
from data collected, and how your business can use it to improve an objective.

Why are data insights important?


74% of companies say they want to be “data-driven,” according to a Forrester report,
but only 29% are actually successful at connecting analytics to insight. Data will be
the largest area of spending for companies as they attempt to become more
data-driven. If your business is going to survive, you need a strategy for the future.
You have marketing, social media, web analytics, sales, and support; staying up to
date with reliable information on your growth is not an easy task. Data insights can
give you a clear overview of what’s happening across your business, and data
visualization tools let you see everything together.
With easy access and visibility to data, it’s easier to process the information and
make smarter decisions, faster. Teams can see key metrics into how your company
is performing, how successful campaigns are, where your best customers are coming
from, and so much more. A smart data strategy can help companies of all sizes
improve their bottom line, especially small businesses that need to do more with less.

What is required to draw data insights?


Analytics
One of the most widely used tools to derive actionable insights from your data is
analytics.
Let us understand what analytics is: the practice of capturing and managing raw
data and turning it into meaningful information and insights.
For example, it is imperative for digital marketers to understand the behavior
of their target audience and what area or region they come from in order to get more
reach and impressions/views. They need a clear footprint of views and clicks;
let’s say 50% of the people who came to my website are from south India while the
rest are from nearby areas. This means that demand for what I am selling through
my website exists in the south as well.
Analytics is such a hot topic because the world is evolving digitally, generating
data in the billions, and the speed at which change is taking place in the 21st
century is remarkable. As a result, businesses follow a three-fold approach:

• Visualizing the Data


• Deriving Insights from the data


• Using that data for forecasts
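The three-fold approach above can be sketched in a few lines of Python; the monthly sales figures and variable names here are invented for illustration:

```python
# Hypothetical monthly sales figures
sales = [120, 132, 128, 140, 151, 149, 160, 172]

# 1. Visualize the data: a crude text bar chart
for month, value in enumerate(sales, start=1):
    print(f"month {month:2d} | {'#' * (value // 10)} {value}")

# 2. Derive an insight: overall growth over the period
growth = (sales[-1] - sales[0]) / sales[0] * 100
print(f"sales grew {growth:.1f}% over the period")

# 3. Forecast: naive next-month estimate from a 3-month moving average
forecast = sum(sales[-3:]) / 3
print(f"naive forecast for next month: {forecast:.1f}")
```

Real analytics tooling does all of this at scale, but the three steps are conceptually the same.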
Google Analytics, known in the industry as GA, is a free service that provides
statistics and basic analytical tools for search engine optimization and marketing
purposes. It has features that reduce the work required to put Google Analytics
data into Google Docs, Sites or Spreadsheets.
In Google Analytics, you can choose one of the many reports Google creates, or
build your own customized report using the drag-and-drop interface.

Observation
What is implied here is nothing but simple observation. Observations merely report
what was witnessed when reviewing consumer data: sales patterns over a period of
time, changes in trend, use of the internet over a certain time period, and so on.
Observation helps us understand human behavior not just in one narrowly focused
aspect but across a wide spectrum of areas.
Observations make generalized assessments of data that apply to many people.
Here is an example of an observation: “Smartphone users in India on average spend
70 minutes per day watching videos.” (Source) That is an observation. An insight
drawn from it might be: people in India who use smartphones are digitally active
and prefer to watch videos because of low tariff rates and the easy availability of
data and regional content, fueled by aggressive pricing of mobile data and of online
video and movie streaming platforms.
For digital marketers, observing other people on social media, known as “social
listening”, is a useful observation technique that helps marketers understand their
target audience more vividly.

Past experiences (Trends/Graphs)


In continuation of analytics & observation, every insight drawn from either of the
two, individually or combined, is an experience for the business. For example, in the
past, it was seen that the demand for 40-inch LEDs was higher than for 30-inch
LEDs, but more 30-inch LEDs were produced. When the results arrived from the
market, it was clear the company had wrongly predicted demand: customers were
inclined more towards 40 inches than 30.
From past experiences, examine different time ranges, such as week over week,
month over month, or summer over summer.
Learn to involve others: one individual can’t see everything, so invite others to
delve into the data. It’s vital to have several pairs of eyes looking for actionable
insights. Past experiences can also help in understanding consumer insights, as
these are typically derived by analyzing data to see consumers in a different light;
an insight then inspires a unique business action to meet the consumer more
effectively.
Lastly, the best insights often come from looking not at singular data points but at
trends, especially when they change direction. So keep looking out for trends that
change direction or change a customer's course of action.
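A week-over-week comparison of the kind described above can be sketched in a few lines; the `this_week` and `last_week` visit counts are hypothetical:

```python
# Hypothetical daily visit counts for the same metric over two ranges
this_week = {"Mon": 520, "Tue": 480, "Wed": 610, "Thu": 590, "Fri": 700}
last_week = {"Mon": 500, "Tue": 510, "Wed": 540, "Thu": 600, "Fri": 620}

# Week-over-week percentage change per day
for day in this_week:
    change = (this_week[day] - last_week[day]) / last_week[day] * 100
    direction = "up" if change >= 0 else "down"
    print(f"{day}: {direction} {abs(change):.1f}% week over week")
```

Days where the sign of the change flips are exactly the "trends that change direction" worth a closer look.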

Secondary Research
The most prominent and practical approach for a newcomer to a market is to derive
insights from research already carried out by others in the same area. From
secondary research data, one can not only derive subjective insights as required but
also reuse the insights already created by the researcher.
This activity helps in understanding what was and what can be. It provides the
opportunity to focus on the future and saves time that would otherwise have gone
into collecting data single-handedly.

Customer Journey Maps


Using in-depth consumer data to understand who consumers are, what motivates
them, what their priorities are in life, and what daily challenges they face will prove
invaluable in unearthing consumer insights.
Start by drafting genuine buyer personas: pen pictures of your buyers that bring
your demographics to life and give your data some context.


This sort of information also empowers you to map the numerous buyer journeys
you want to follow, and every touchpoint involved, showing how your audience
interacts with your brand at every step.
There is the customer journey that maps the purchase journey, but there is also the
day-to-day life of the consumer that influences this journey. It is imperative not
only to use these maps to derive insights but also to understand consumers better:
their behavior in certain situations and scenarios.
Below is a diagram showing a customer journey map.
Nevertheless, data is the oil driving industries across the globe. Analytics, research,
observation, and the like are simply some important techniques for deriving
insights. With the COVID-19 pandemic in mind, however, the post-COVID
economy will be something of a game-changer, and these techniques will of course
be supported by tools and technology that give insights no one would have
imagined.

How to Drive Business Growth With Analytics
1. It helps you set realistic goals.
Setting goals for your business will involve guesswork without the right information.
You don't want your business goals to be a moving target, shifting from time to time.
This is where analytics comes into play.
With analytics, you'll be able to assemble data from historic trends and previous
activities. You'll have a clear idea of what your goals should and can be right from
the start. This ensures that you don't miss opportunities to help your business grow.
It also makes it very clear when certain goals aren't possible. Such data analysis will
also show you both weaknesses and strengths that you can improve on to grow your
business.

2. It supports decision-making.
A decision-maker's intuition and experience are valuable, but decisions that affect a
business ought to rely on data. You can't make good decisions about inventory
management, pricing strategies and other business factors without data. Even when
it comes to hiring people, actionable data can help you assess how many people to
hire and how best to deploy them.
Businesses that use data are three times as likely to report that their decision-making
has improved. By incorporating data analytics into your decisions, you'll reduce
risks. You'll also feel more confident that you're improving your efficiency and
business profits.
3. It helps you find your ideal demographic.
There are many ways to identify your ideal demographic with analytics. Data from
your existing customer base and social media are valuable sources of information.
You can study your competitors' audience as well. There's also publicly available
data that can be helpful. By using analytics to study your audience, you'll get rich
and insightful information.
There are many tools that make it easy to gather data about your audience. You can
add Google Analytics to your website. Facebook and other social media platforms
also offer solutions for businesses. You can also add analytics plugins to your site to
understand user behavior. You'll know how people interact with your products and
the type of content that creates the most engagement.
Once you've determined your ideal demographic, you can provide tailored content
and solutions. In this way, leveraging analytics can lead to more conversions.
4. You can segment your audience.
You can divide your audience into distinct groups with analytics. This provides great
value for your business, helping you avoid forcing irrelevant content on your
audience.
You'll be able to create better products and make your communications more
personalized. Tools like Google Analytics will help you learn more about your
audience. You can use it to analyze search queries that lead to your site. This helps
uncover your users' intent. Once you know what your audience is looking for, you'll
be in a better position to meet those needs.

5. It helps you create mass personalization.


When you've segmented your audience with the help of analytics tools, you can
create mass personalization. You can set up tools to automatically and effectively
personalize email marketing content. You can automate and personalize ad content
to target large groups of people and create personalization at the same time. This
boosts your business's reach and impact, driving up conversions.
Personalization is powerful – 74% of marketers agree that targeted personalization
drives up customer engagement. By using data to drive your marketing efforts, you
can set up relevant experiences across your customer's purchase journey.
6. You can increase your revenue and lower your costs.
Analytics plays a major role in decreasing business costs and increasing revenue. It's
vital to ensure that you are using important resources as effectively as possible. One
study shows that companies that adopt data-driven marketing strategies can increase
revenue by 20% and reduce costs by 30%.
Analytics is useful for monitoring e-commerce activities, ad campaigns and
multi-channel funnels. It allows you to measure their performance and effectiveness,
making it easy to see what works and what doesn't. You can stop campaigns that aren't
bringing in results, figure out which keywords are driving traffic, and put resources
into better keywords and campaigns.

7. You can boost your memberships.


You can leverage data analytics to boost membership rates on your site. Analytics
provides insights that can help you optimize your membership campaigns. You can
pinpoint what's working and put more resources there.
Analytics helps websites built on membership platforms by telling them more about
their members. Online educational sites, hobby platforms, communities, and forums
all need data to manage and maintain their websites.
You can use membership site analytics to get a deeper understanding of a community
site. It's easy to measure historic growth and current trends. You can then project
your site's membership growth more accurately instead of guessing. You'll get
rich data on membership trends, numbers and behaviors. This tells you what content
gets the most traffic and why membership growth rates stall.
8. It helps you monitor social media.


Analytics tools can also provide in-depth information on what people are saying
about your brand online. You can track your brand mentions and hashtags on social
media. Social media websites provide analytics to measure the effectiveness of ad
campaigns on their platforms. You'll get data on where your audience is from and
what their interests are. These platforms also let you know what devices your
audience is using and provide important demographic information. You can use this
information to optimize your website and manage content.
If your brand is trending online, you can gather and analyze information that will
help you leverage this in the future. With billions of people active on social media,
analytics can help you reach out to the right audience. This will help you boost traffic
and drive conversions to help your business grow.

Techniques to discover patterns in data.


The search for insight can be thought of as the effort to understand how something
complicated really works by analyzing its data.

Hypothesis and Prediction


Before you explore the data, write down a short list of what you expect to see in the
data: the distribution of key variables, the relationships between important pairs of
them, and so on. Such a list is essentially a prediction based on your current
understanding of the business.

Formulation of a question
Now analyze the data. Make plots, do summaries, whatever is needed to see if it
matches your expectations. Is there anything that doesn’t match? Anything that
makes you go “That’s odd” or “That doesn’t make any sense.”? Terms commonly
associated with statistical hypotheses are null hypothesis and alternative hypothesis.
A null hypothesis is the conjecture that the statistical hypothesis is false; for example,
that the new drug does nothing and that any cure is caused by chance. Researchers
normally want to show that the null hypothesis is false. The alternativehypothesis is
the desired outcome, that the drug does better than chance.


Testing
Perform data visualization, exploratory data analysis, or statistical analysis to test
your hypothesis and prediction.
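A formal statistical test is the usual tool here; as a minimal, standard-library-only illustration, the sketch below runs a two-sample permutation test of the null hypothesis that two groups have the same mean. The samples `a` and `b` are invented for the example:

```python
import random

# Hypothetical samples: a metric measured for variants A and B
a = [12.1, 11.8, 13.0, 12.5, 11.9, 12.7, 12.2, 12.9]
b = [11.2, 11.5, 10.9, 11.8, 11.1, 11.4, 11.6, 11.0]

observed = sum(a) / len(a) - sum(b) / len(b)  # observed mean difference

# Null hypothesis: group labels don't matter, so shuffling them should
# produce a difference at least this large reasonably often.
random.seed(42)
pooled = a + b
count = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    diff = sum(pooled[:len(a)]) / len(a) - sum(pooled[len(a):]) / len(b)
    if abs(diff) >= abs(observed):
        count += 1

p_value = count / trials
print(f"observed difference: {observed:.3f}, p ~ {p_value:.4f}")
```

A small p-value means the observed difference is unlikely under the null hypothesis, so the null can be rejected in favor of the alternative.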

Analysis
Analyze the result of your test and decide on the next actions to take. The
predictions of the hypothesis are compared to those of the null hypothesis, to
determine which is better able to explain the data. If the evidence has falsified the
hypothesis, a new hypothesis is required; if the experiment supports the hypothesis
but the evidence is not strong enough for high confidence, other predictions from the
hypothesis must be tested. Zoom in and try to understand what in your business is
making that weird thing show up in the data. This is the critical step.
For example, let’s assume that we are looking at transaction data from a large B2C
retailer. One of the fields in the dataset was ‘transaction amount’.
What did we expect to see? Well, we expected that most amounts would be around
the average, with some smaller amounts and some larger amounts. In other words,
the expected histogram of the field would be roughly bell-shaped, centred on the
average.


But when we checked the data, the histogram showed a second, separate cluster of
unusually large transaction amounts, well away from the main bell-shaped mass.
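The original shows the two histograms as images. A hypothetical simulation of the pattern described, a bell-shaped mass of typical purchases plus a small cluster of very large reseller orders, can be binned and printed as a text histogram; all the amounts below are invented:

```python
import random
from collections import Counter

random.seed(0)

# Hypothetical data: typical shoppers around $60, plus a small
# cluster of very large reseller orders around $1,500.
amounts = [random.gauss(60, 15) for _ in range(950)]
amounts += [random.gauss(1500, 200) for _ in range(50)]

# Bin into $100 buckets and print a text histogram
bins = Counter(int(max(a, 0) // 100) * 100 for a in amounts)
for lo in sorted(bins):
    print(f"${lo:>5}-{lo + 99:<5} | {'#' * (bins[lo] // 10)} ({bins[lo]})")
```

The output shows one tall bar near zero and a detached group of bars far to the right, which is the "hmm" moment the story turns on.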


We investigated the ‘hmm’.


These transactions weren’t made by their typical shopper — young moms shopping
for their kids. They were made by people who would travel to the US from abroad
once a year, walk into a store, buy lots of items, take them back to their country and
sell them in their own stores. They were resellers who had no special relationship
with our retailer.
This retailer didn’t have a physical presence outside North America at that time nor
did they ship to those locations from their e-commerce site. But there was enough
demand abroad that local entrepreneurs had sprung up to fill the gap.
This modest “discovery” set off a chain reaction of interesting questions on what
sorts of products these resellers were buying, what promotional campaigns may be
best suited for them, and even how this data can be used to inform global expansion
plans.


Further analysis could be done by looking at the details of these transactions and
performing a clustering analysis. How many clusters can be made from these
transactions? Does each cluster tell us a different story about each type of customer?
Let’s pick the top ten items bought, based on the transaction data. Can we build a
simple random forest model to see which factors lead a customer to purchase such
items? How important is each of these factors?
All from a simple histogram.
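The clustering step mentioned above can be illustrated with a tiny k-means implementation; the `kmeans` helper and the two-feature transactions (amount spent, items in basket) are invented for the example:

```python
import random

def kmeans(points, k, iters=20, seed=1):
    """Minimal k-means: returns final centroids and one label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return centroids, labels

# Hypothetical transactions: (amount spent, items in basket)
transactions = [(55, 3), (60, 4), (48, 2), (70, 5), (52, 3),
                (1450, 80), (1600, 95), (1520, 88)]

centroids, labels = kmeans(transactions, k=2)
print("centroids:", centroids)
print("labels:   ", labels)
```

On data like this the algorithm separates the ordinary shoppers from the reseller-sized orders, giving each cluster its own story. In practice a library implementation (e.g. scikit-learn's KMeans) would be used, but the mechanics are the same.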
Note that working back from the data to the “root cause” in the business takes time,
effort, and patience. If you have a good network of contacts in the business who can
answer your questions, you will be that much more productive. Also, what’s an
oddity to you may be obvious to them (since their understanding of the business
may be better than yours), which can save you time.

Data Insights in action: the value they offer to businesses, and how
drawing inaccurate insights can affect businesses negatively

There are a number of ways that customer insights, delivered through dashboard
solutions, can drive value for your organisation. These were touched upon already
and are summarised here.


Enable real-time customer insights to frontline staff


Drive engaging and personalised customer experiences with every interaction.
Through real-time visibility into customer insights such as previous sentiment, churn
risk, product preferences and transaction history, customer service operations can
make more informed decisions when interacting with customers increasing customer
satisfaction.
Energy Central reports that 72% of customers expect customer service
representatives (CSRs) to be informed about their product and service history when
they contact the provider. Under such circumstances, CSRs have only a few seconds
to solve a customer’s issue.
Increase customer growth opportunities
Increase customer growth opportunities by understanding customer needs and
creating highly personalised solutions to satisfy individual customer needs.
Understanding customer sentiment towards a product or service, supporting
customers by quick issue resolution and identifying the next best action thanks to
real-time analytics.
In a recent survey, EY outlined how, during the accelerated digital transformation
induced by the COVID-19 pandemic, personalised solutions that satisfy customer
needs can be a real differentiator for financial institutions. Real-time, data-driven
insights, underpinned by solid architectural design, allow businesses to continually
adapt, evolve, and respond to customer needs.
Protect existing revenue and increase customer lifetime value (CLV)


Put preventive churn measures into place and proactively engage "at risk" customers.
Customer acquisition is pointless without customer commitment and long-term
lifetime value.
In the Qualtrics Banking Report, 550 banking customers were interviewed; 56%
reported that their banks made no effort to keep them when they announced they
were churning.
Decrease acquisition costs
Lower costs by creating more personalised solutions, instead of targeting the entire
customer base with campaigns that do not apply to certain audiences. Understand
customers on an individual level, and offer the right solutions.
McKinsey talks about personalisation in the context of disruption, and points out
how retailers can use personalized marketing to drive growth and reduce acquisition
cost. “Leading retailers create a 360-degree view of their customers—where they
shop (online or in-store), what they shop for, when they shop, how much they spend,
what they view or click on—and target them with highly personalized offers based
on that data. These retailers use personalization engines powered by machine
learning to improve e-commerce websites and their marketing campaigns across
channels.”

Data Insight Tools / Programming Languages


R Programming
R is the leading analytics tool in the industry and is widely used for statistics and
data modeling. It can easily manipulate data and present it in different ways. It has
exceeded SAS in many respects, such as data capacity, performance and outcomes.
R compiles and runs on a wide variety of platforms, including UNIX, Windows and
macOS. It has 11,556 packages, which can be browsed by category, and provides
tools to install packages automatically as required; it also combines well with big
data.

Tableau Public
Tableau Public is free software that connects to any data source, be it a corporate
data warehouse, Microsoft Excel, or web-based data, and creates data visualizations,

maps, dashboards, etc., with real-time updates, presented on the web. These can also
be shared through social media or with a client, and the tool allows you to download
the file in different formats. To see the real power of Tableau, you need a very good
data source. Tableau's big data capabilities make it important, and you can analyze
and visualize data better than with any other data visualization software on the
market.

Python
Python is an object-oriented scripting language which is easy to read, write, and
maintain, and it is a free open-source tool. It was developed by Guido van Rossum in
the late 1980s and supports both functional and structured programming methods.
Python is easy to learn as it is very similar to JavaScript, Ruby, and PHP. Also,
Python has very good machine learning libraries, viz. scikit-learn, Theano,
TensorFlow, and Keras. Another important feature of Python is that it can work with
almost any platform, such as SQL Server, a MongoDB database, or JSON. Python
can also handle text data very well.
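As a small taste of Python's data handling, the standard library alone can parse tabular data and summarise it, with no third-party packages needed. The CSV content below is invented sample data.

```python
# Parse CSV text and summarise a numeric column using only the
# Python standard library (no pandas required).
import csv
import io
import statistics

raw = """region,sales
North,120
South,95
North,130
West,80
"""

rows = list(csv.DictReader(io.StringIO(raw)))
sales = [int(r["sales"]) for r in rows]

print(statistics.mean(sales))    # 106.25
print(statistics.median(sales))  # 107.5
```

In practice the same pattern reads a file with `open(path)` instead of `io.StringIO`, and libraries like pandas or scikit-learn build on exactly this kind of row-and-column access.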

SAS
SAS is a programming environment and language for data manipulation and a leader
in analytics. Its development began in 1966, and it was developed further by SAS
Institute through the 1980s and 1990s. SAS is easily accessible and manageable and
can analyze data from any source. In 2011, SAS introduced a large set of products for
customer intelligence, along with numerous modules for web, social media, and
marketing analytics that are widely used for profiling customers and prospects. It can
also predict customer behavior and manage and optimize communications.

Apache Spark
The University of California, Berkeley's AMPLab developed Apache Spark in 2009.
Apache Spark is a fast, large-scale data processing engine that executes applications
in Hadoop clusters up to 100 times faster in memory and 10 times faster on disk.
Spark is built with data science in mind, and its design makes data science effortless.
Spark is also popular for developing data pipelines and machine learning models.

Spark also includes a library, MLlib, that provides a progressive set of machine
learning algorithms for common data science techniques: classification, regression,
collaborative filtering, clustering, etc.
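To give a flavour of the iterative algorithms MLlib packages, the sketch below runs a toy one-dimensional k-means clustering in plain Python. Real Spark/MLlib code would distribute the assignment and update steps across a cluster via DataFrames or RDDs; this illustrates only the idea, not Spark's API.

```python
# Toy 1-D k-means: alternate between assigning points to the nearest
# center and recomputing each center as its cluster mean.

def kmeans_1d(points, k=2, iters=10):
    centers = points[:k]                       # naive initialisation
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                       # assignment step
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        centers = [sum(c) / len(c) if c else centers[i]   # update step
                   for i, c in enumerate(clusters)]
    return sorted(centers)

data = [1.0, 1.5, 2.0, 10.0, 11.0, 12.0]
print(kmeans_1d(data))  # [1.5, 11.0]
```

MLlib's `KMeans` follows the same assign/update loop, but parallelises the per-point work, which is what makes it viable on hundreds of millions of rows.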

Excel
Excel is a basic, popular, and widely used analytical tool in almost all industries.
Whether you are an expert in SAS, R, or Tableau, you will still need to use Excel.
Excel becomes important when analytics is required on the client's internal data. It
handles complex tasks and summarizes data with pivot-table previews that help
filter the data as per client requirements. Excel also has advanced business analytics
options that support modeling, with prebuilt features such as automatic relationship
detection, creation of DAX measures, and time grouping.
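The pivot-table aggregation described above can be mimicked in a few lines of code. This is a conceptual sketch of what a pivot table computes (a grouped aggregate per cell), not an Excel automation example; the records are invented sample data.

```python
# Pivot-table-style aggregation: total sales per (region, product) cell.
from collections import defaultdict

records = [
    {"region": "North", "product": "A", "sales": 100},
    {"region": "North", "product": "B", "sales": 50},
    {"region": "South", "product": "A", "sales": 70},
    {"region": "North", "product": "A", "sales": 30},
]

pivot = defaultdict(int)
for r in records:
    pivot[(r["region"], r["product"])] += r["sales"]

print(pivot[("North", "A")])  # 130
print(pivot[("South", "A")])  # 70
```

Swapping `int` and `+=` for a list-and-append gives other pivot aggregates (count, average) in the same pattern.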

RapidMiner
RapidMiner is a powerful integrated data science platform, developed by the
company of the same name, that performs predictive analysis and other advanced
analytics like data mining, text analytics, machine learning, and visual analytics
without any programming. RapidMiner can work with any data source type,
including Access, Excel, Microsoft SQL Server, Teradata, Oracle, Sybase, IBM DB2,
Ingres, MySQL, IBM SPSS, dBase, etc. The tool is powerful enough to generate
analytics based on real-life data transformation settings, i.e. you can control the
formats and data sets for predictive analysis.

KNIME
KNIME was developed in January 2004 by a team of software engineers at the
University of Konstanz. KNIME is a leading open-source reporting and integrated
analytics tool that allows you to analyze and model data through visual
programming. It integrates various components for data mining and machine
learning via its modular data-pipelining concept.

QlikView
QlikView has many unique features, such as patented technology and in-memory
data processing, which delivers results to end users very fast and stores the data in
the report itself. Data associations in QlikView are maintained automatically, and
the data can be compressed to almost 10% of its original size. Data relationships are
visualized using colors: a specific color is given to related data and another color to
non-related data.

Splunk
Splunk is a tool that searches and analyzes machine-generated data. Splunk pulls in
all text-based log data and provides a simple way to search through it; a user can pull
in all kinds of data, perform all sorts of interesting statistical analysis on it, and
present it in different formats.
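What Splunk does with log data can be illustrated in miniature: filter text log lines and compute simple statistics on the matches. The log lines here are invented sample data, and this sketch stands in for Splunk's search language conceptually, not syntactically.

```python
# Miniature "log search": filter lines by level, then count the most
# frequent error message (standard library only).
from collections import Counter

logs = [
    "2024-01-01 10:00 ERROR db timeout",
    "2024-01-01 10:01 INFO request ok",
    "2024-01-01 10:02 ERROR disk full",
    "2024-01-01 10:03 WARN slow query",
    "2024-01-01 10:04 ERROR db timeout",
]

errors = [line for line in logs if "ERROR" in line]
# split into date, time, level, message; keep only the message part
by_message = Counter(line.split(maxsplit=3)[3] for line in errors)

print(len(errors))                # 3
print(by_message.most_common(1))  # [('db timeout', 2)]
```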

Tableau Tutorial
1. Tableau Tutorial
In this Tableau tutorial for beginners, we will learn the Tableau basics: what Tableau
is and its history. Tableau is a tool used for visualization and simplification of
complex data. It was designed to help users create visuals and graphics without the
help of a programmer or any prior knowledge of programming.
Furthermore, we will see the advantages and disadvantages of Tableau. To
understand why Tableau is needed, we should learn what Tableau data visualization
is. At last, we discuss the different products by Tableau Software, to understand
Tableau better.
So, let us start with our first Tableau tutorial.

Learn Tableau Data Visualization

2. What is Tableau?
Tableau is a tool used for visualization and simplification of complex data. It was
designed to help users create visuals and graphics without the help of a programmer
or any prior knowledge of programming.
Tableau was designed with the aim of creating business software that is amazingly
responsive and easy. It is an intelligent platform that helps businesses move faster
and be easier for clients and consumers to comprehend. It is a highly scalable, easily
deployable, and efficient business framework.
What does Tableau's visualization software do, and why is it so special that many
businesses have come to include it in their daily operations? A strong business
intelligence tool for visual analytics, it casts results into various vivid forms to give
better insight. Tableau has both Desktop and Online versions, which allow access to
data in the cloud or on premises.
The solution's functionality isn't restricted to graphical data representation; there is
also a lot of work beneath the surface. Indeed, the application sends requests to cloud
and relational databases, spreadsheets, and OLAP cubes for the required statistical
data, and then parses, categorizes, and correlates it to come up with a comprehensive
analytical report.
If your company seeks to enhance analytics and get the most out of what data
rendering offers, you should set your sights on Tableau's data visualization software.
However, before falling for its exceptional visualization capabilities, check the
solution's limitations as they pertain to your business, so you can apply it reasonably
and effectively.

3. History of Tableau
Tableau software was developed by an American software company situated in
Seattle, WA, USA. Professor Pat Hanrahan and Ph.D. student Chris Stolte, who
specialized in visualization techniques, created the software in the department of
computer science at Stanford with the aim of exploring and analyzing relational
databases and data cubes. They led research in which they studied and analyzed the
use of table-based displays to browse multi-dimensional relational databases.

These founders combined the structured query language (SQL) with graphics and
invented a database visualization language called VizQL, i.e. Visual Query
Language. VizQL is the basic foundation of the Polaris system, an interface for
exploring large multi-dimensional data sets.
In 2003, Chris Stolte appointed his former business partner and friend, Christian
Chabot, as the CEO of Tableau, and the company was spun out of Stanford.
Tableau converts relational databases, cubes, cloud databases, and spreadsheets
into dashboards and shares them over the internet. Tableau's revenue in 2010 was
reported to be $34.2 million; it grew to $64 million in the year 2011.

4. Different Products by Tableau


In this Tableau Tutorial, we will study three products of Tableau.

Different Products by Tableau

a. Tableau Server
It is a business intelligence application that provides browser-based analysis which
anyone can use. It is the alternative many prefer because of its fast pace compared
to traditional business software. No scripting is required for
Tableau, which makes it very user-friendly, and one can become a business analyst,
grow a deployment, or even train for free.

b. Tableau Online
Tableau Online is a hosted version of Tableau Server; it makes business analysis
super fast. It lets the user share dashboards across multiple platforms in minutes,
allows users and companies to share views live, and provides a secure, hosted
environment. There is also no need to buy, set up, or manage any infrastructure, and
it can scale up as much as you want.

c. Tableau Public
Tableau Public was designed for anyone who wants to share and tell stories with
data through interactive graphics on the web. You can be up and running overnight,
and with it you can create and publish data visualizations without the help of
programmers or IT.
The premium version is for organizations that want to scale up their websites and
keep the underlying data hidden.

5. Tableau Architecture
Tableau has a highly scalable, n-tier client-server architecture that serves mobile
clients, web clients and desktop-installed software. Tableau Desktop is the authoring
and publishing tool that is used to create shared views on Tableau Server.

Tableau provides a scalable solution for creation and delivery of web, mobile and
desktop analytics.

Tableau Server is an enterprise-class business analytics platform that can scale up to


hundreds of thousands of users. It offers powerful mobile and browser-based
analytics and works with a company’s existing data strategy and security protocols.
Tableau Server:
• Scales up: Is multi-threaded
• Scales out: Is multi-process enabled
• provides integrated clustering
• Supports High Availability
• Is secure
• Runs on both physical and Virtual Machines

Tableau Server architecture supports fast and flexible deployments


Data Layer: A key strength of Tableau is that it supports your choice of data
architecture. Tableau does not require your data to be stored in any single system,
proprietary or otherwise. Most organizations have a heterogeneous data
environment: data warehouses live alongside databases and cubes, and flat files like
Excel are still very much in use. Tableau can work with all of these simultaneously,
and if your existing data platforms are fast and scalable, Tableau lets you take
advantage of that investment.

Data Connectors: Tableau includes some optimized data connectors for databases
such as Microsoft Excel, SQL Server, Oracle, Teradata, Vertica, Cloudera, Hadoop,
and much more. There is also a generic ODBC connector for any systems without a
native connector.
Tableau provides two modes for interacting with data:
Live connection: Tableau’s data connectors leverage your existing data
infrastructure by sending dynamic SQL or MDX statements directly to the source
database rather than importing all the data.
In-memory: Tableau offers a fast, in-memory Data Engine that is optimized for
analytics. You can connect to your data and then, with one click, extract it to bring
it in-memory in Tableau. Tableau's Data Engine fully utilizes your entire system to
achieve fast query responses on hundreds of millions of rows of data on commodity
hardware. Because the Data Engine can access disk storage as well as RAM and
cache, it can work with data sets far larger than available memory.
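The difference between the two modes can be illustrated with sqlite3 as a stand-in data source: "live" issues a fresh query to the source per request, while "extract" copies the rows into memory once and answers from the copy. This is an analogy only, not Tableau's actual engine or API; the table and data are invented.

```python
# sqlite3 stand-in for a source database with a small orders table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount INT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("North", 100), ("South", 60), ("North", 40)])

# "Live" mode: dynamic SQL is sent to the source for each request.
live_total = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE region = 'North'").fetchone()[0]

# "Extract" mode: copy the rows in-memory once, then compute locally.
extract = conn.execute("SELECT region, amount FROM orders").fetchall()
extract_total = sum(a for r, a in extract if r == "North")

print(live_total, extract_total)  # 140 140
```

Both modes produce the same answer; the trade-off is freshness (live always reflects the source) versus speed and source load (the extract answers without touching the source again).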

6. Why Use Tableau?


In this Tableau tutorial, we study various reasons to use Tableau. It is very easy to
use, as you don't need to know any sort of programming; the only requirement is to
have some kind of data from which to create enchanting reports. Its drag-and-drop
feature is revolutionary: it helps you create a story or report using your imagination
and a mouse. Because Tableau has VizQL, all these features are possible.
The Tableau data engine is a revolutionary breakthrough in in-memory analytics,
designed to overcome the limitations of existing databases and data silos. Tableau is
capable of running on ordinary computers; it puts data into the hands of everyone,
and no fixed data model is required.
The other analytics software available in the market promises a lot of fancy
features but fails as soon as it comes to memory, when the user needs to deal with

a large amount of data; Tableau comes as a saviour here and is capable of delivering
efficient results even with a large amount of data.

7. Tableau Advantages and Disadvantages


In this section, we study the advantages and disadvantages of Tableau.

Advantages of Tableau
Although Tableau's key advantage, the unmatched quality of its interactive data
visualization, overshadows all its other strengths, the list of benefits the tool brings
to businesses is quite long.


Remarkable Visualization Capabilities

Of course, the application's data visualization quality is superior to what Tableau's
competitors provide. Even the products of traditional business intelligence vendors,
like Oracle Data Visualization or IBM's products for data rendition, cannot compete
with the presentation and design quality that Tableau provides.
It converts unstructured statistical information into comprehensible, logical results
in the form of fully functional, interactive, and appealing dashboards. They are
available in many styles of graphics and are straightforward to use in business
affairs.

Ease of Use
The tool's intuitive manner of creating graphics and its simple interface allow non-
developer users to utilize the basic app's functionality to the fullest. Users organize
data into catchy diagrams in a drag-and-drop fashion, which facilitates analyzing
information and eliminates the need for an IT department's help in building patterns.
Lay users can enjoy the capabilities that Tableau offers for statistics parsing, like
dashboard development, without in-depth training. However, to get deep into the
solution's capabilities, deeper knowledge is a must. Also, the close

involvement of IT specialists is a necessity if a corporation seeks to expand the
solution's functionality.

High Performance
Apart from Tableau's high visualization functionality, users rate its overall
performance as strong and reliable. The tool also operates quickly, even on big data.

Multiple Data Source Connections

The software supports establishing connections with many data sources, like
Hadoop, SAP, and DB Technologies, which improves data analytics quality and
enables the creation of a unified, informative dashboard. Such a dashboard grants
any user access to the required information.

Thriving Community and Forum

The number of Tableau fans who invest their expertise and skills in the community
increases steadily. Business users can boost their knowledge of data parsing and
reporting and get many helpful insights in this community. Forum visitors are also
able to help settle any user issues and to share their experience.

Mobile-Friendly
There is a mobile app available for iOS and Android, which adds mobility for
Tableau users and permits them to keep statistics at their fingertips. The app supports
the full functionality that the Desktop and Online versions have.

Disadvantages of Tableau
Despite its superior visualizing and design capabilities and its other benefits, the
Tableau software has a number of shortcomings that ought to be taken into
consideration.


High Cost
Tableau isn't the most expensive visualization software, especially compared to such
business intelligence giants as Oracle's and IBM's solutions. All the same, the
license is quite expensive for many small to medium companies.
Also, the software needs proper deployment, implementation, maintenance, and
staff training, which come at a sizeable price. Therefore, its high cost makes Tableau
the choice of primarily large businesses.

Inflexible Pricing
Tableau's sales team isn't flexible enough to provide an individual approach for its
customers. Ignoring the fact that every company has its own distinctive requirements
for a visualization tool, the Tableau sales model requires clients to purchase the
extended license from the start.
As a result, plenty of companies that use Tableau reach the conclusion that they
don't need all of their licensed features. They would prefer buying a set of required
ones and scaling up if necessary.

Poor After-Sales Support

On multiple message boards, users complain that the Tableau software lacks proper
after-sales maintenance. If a client has a software performance problem, the support
team doesn't settle the matter by investigating the problem's root cause and
eliminating it. The best they do is advise purchasing a feature which may make up
for the software's disadvantage.

Security Problems
Since visualizing solutions manipulate confidential information, vendors pay special
attention to security improvements. Despite Tableau's deep concern for information
safety, it fails to provide centralized data-level security.
It only permits establishing row-level security, which stipulates that each user has
his/her own account. A great number of accounts increases the chances that the
system may be hacked.

IT Help for Correct Use

Although the software allows a certain ease in routine use, Tableau still requires
significant involvement of an IT department for further configuration and expansion
of its basic functionality.
Several operations require the creation of SQL queries, which is not possible without
the services of a skilled developer. Although untrained business users may leverage
the solution, they will not get the best out of it without the help of IT.

Poor BI Capabilities
As previously mentioned, the tool provides best-in-class visual interpretation of
information. However, it lacks functionality required of a full-fledged business
intelligence tool, like large-scale reporting, the building of data tables, and static
layouts.
Also, the solution has restricted capability for sharing results. Its notification
functionality is quite basic, and only an admin, not end users, can configure
scheduled email subscriptions. A bit of Python code permits setting up robust
trigger-based notifications; however, the vendor doesn't support that option.

Poor Versioning

Only recent Tableau versions support revision history; for the older ones, rolling the
software back is not possible.

Embedding Problems
Although the vendor claims that its tool can be easily embedded into any business
IT landscape, in reality the solution's capabilities don't allow for a smooth
embedding. Seamless integration of Tableau into a company's product is a real
challenge from both the financial and technical points of view.

Time- and Resource-Intensive Staff Training

Basic use of the application doesn't demand hyper-focused knowledge of Tableau.
However, the tool's visualization potential is almost unlimited, while the learning
curve is incredibly steep for non-analyst users. Getting to know all of the tool's
capabilities without comprehensive employee training is almost impossible.
The learning phase alone, on both the development and consumption side, may take
weeks or months before one can make the best use of the tool's functionality and
gain a huge benefit from it. At the same time, it increases the cost of ownership
considerably.

8. Features of Tableau
Now, we are going to find out some of the very important and interesting features of
Tableau. It is this bundle of unique features that make Tableau a popular and widely
accepted Business Intelligence tool. So, get a better understanding of Tableau and
its potential by going through the list of features provided to you in the next section.

Tableau is considered as one of the best Business Intelligence and data visualization
tools and has managed to top the charts quite a few times since its launch. The most
important quality of this tool is that it makes organizing, managing, visualizing and
understanding data extremely easy for its users.
Data can be as complex and mysterious as we can imagine and requires proper tools
to extract meaning from it. Such tools enable us to dig deep into the data so that we
can discover patterns and get meaningful insights. Tableau provides us with a set
of tools that equip us to do data discovery, data visualization and insight sharing at
a detailed level.
One interesting aspect of Tableau that has been a key player in making it everyone’s
favorite BI tool, is its easy drag-and-drop functionality. You do not need to come
from a technical background or know a lot of coding to be able to work on Tableau.
You can easily master this tool by understanding and learning its UI-based features
and functionalities to create dashboards and analyze reports.
Given below are the top features of Tableau:

Top 10 Features of Tableau


Tableau Dashboard
Tableau Dashboards provide a wholesome view of your data by means of
visualizations, visual objects, text, etc. Dashboards are very informative as they can

present data in the form of stories, enable the addition of multiple views and objects,
provide a variety of layouts and formats, and enable users to apply suitable filters.
You even have the option to copy a dashboard or its specific elements from one
workbook to another easily.

Collaboration and Sharing


Tableau provides convenient options to collaborate with other users and instantly
share data in the form of visualizations, sheets, dashboards, etc. in real-time. It allows
you to securely share data from various data sources such as on-premise, on-cloud,
hybrid, etc. Instant and easy collaboration and data sharing help in getting quick
reviews or feedback on the data leading to a better overall analysis of it.

Live and In-memory Data


Tableau supports connectivity both to live data sources and to data extracted from
external sources as in-memory data. This gives the user the flexibility to use
data from more than one type of data source without any restrictions. You can use
data directly from the data source by establishing live data connections or keep
that data in-memory by extracting data from a data source as per their requirement.
Tableau provides additional features to support data connectivity such as automatic
extract refreshes, notifying the user upon a live connection fail, etc.

Data Sources in Tableau


Tableau offers a myriad of data source options you can connect to and fetch data
from. Data sources ranging from on-premise files, spreadsheets, relational databases,
non-relational databases, data warehouses, big data, to on-cloud data are all available
on Tableau. One can easily establish a secure connection to any of the data sources
from Tableau and use that data along with data from other sources to create a
combinatorial view of data in the form of visualizations. Tableau also supports
different kinds of data connectors such as Presto, MemSQL, Google Analytics,
Google Sheets, Cloudera, Hadoop, Amazon Athena, Salesforce, SQL Server,
Dropbox and many more.

Advanced Visualizations (Chart Types)

One of the key features of Tableau and the one that got its popularity is its wide
range of visualizations. In Tableau, you can make visualizations as basic as a:
• Bar chart
• Pie chart
and as advanced as a:
• Histogram
• Gantt chart
• Bullet chart
• Motion chart
• Treemap
• Boxplot
and many more. You can select and create any kind of visualization easily by
selecting the visualization type from the Show Me tab.

Maps
Yet another important feature of Tableau is the map. Tableau has a lot of pre-
installed information on maps such as cities, postal codes, administrative boundaries,
etc. This makes the maps created on Tableau very detailed and informative. You can
add different layers of geographic detail to the map as per your requirements and create
informative maps in Tableau with your data. The different kinds of maps available
in Tableau are Heat map, Flow map, Choropleth maps, Point distribution map, etc.

Robust Security
Tableau takes special care of data and user security. It has a fool-proof security
system based on authentication and permission systems for data connections and
user access. Tableau also gives you the freedom to integrate with other security
protocols such as Active Directory, Kerberos, etc. An important point to note here is
that Tableau practices row-level filtering which helps in keeping the data secure.
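Row-level filtering of the kind mentioned above can be sketched as follows. The entitlement table and field names are invented for the illustration and do not reflect Tableau's internal implementation; the idea is simply that the same data set yields a different row subset per user.

```python
# Row-level security sketch: each user sees only the rows their
# account is entitled to.

rows = [
    {"region": "North", "revenue": 500},
    {"region": "South", "revenue": 300},
    {"region": "West",  "revenue": 200},
]

# Per-user entitlements, as an admin might configure them.
entitlements = {"alice": {"North", "West"}, "bob": {"South"}}

def visible_rows(user):
    allowed = entitlements.get(user, set())
    return [r for r in rows if r["region"] in allowed]

print([r["region"] for r in visible_rows("alice")])  # ['North', 'West']
print(visible_rows("carol"))                         # [] (no entitlements)
```

Because the filter is applied before any aggregation, every chart a restricted user builds is automatically computed over their subset only.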

Mobile View
Tableau acknowledges the importance of mobile phones in today’s world and
provides a mobile version of the Tableau app. One can create their dashboards and

reports in such a manner that it is also compatible with mobile. Tableau has the
option of creating customized mobile layouts for your dashboard specific to your
mobile device. The customization option gives the option for adding new phone
layouts, interactive offline previews, etc. Hence, the mobile view gives Tableau
users a lot of flexibility and convenience in handling their data on the go.

Ask Data
The Ask data feature of Tableau makes it even more favored by the users globally.
This feature makes playing with data just a matter of simple searches as we do on
Google. You just need to type a query about your data in natural language and
Tableau will present you with the most relevant answers. The answers are not only
in the form of text but also as visuals. For instance, if what you searched for is already
present in a bar graph, the Ask Data option will search for and open the bar graph
for you instantly. Such features make data more accessible to users, who can easily
dig deep into data and find new insights and patterns.
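The matching idea behind such a feature can be illustrated with naive keyword overlap between a question and chart metadata. Tableau's real Ask Data parses natural language far more thoroughly, so this toy, with invented chart names and keywords, is only a conceptual sketch.

```python
# Toy "ask data": pick the chart whose keyword set overlaps the
# question's words the most.

charts = {
    "Sales by Region (bar)":    {"sales", "region", "bar"},
    "Profit over Time (line)":  {"profit", "time", "line", "trend"},
    "Orders by Category (pie)": {"orders", "category", "pie"},
}

def best_chart(question):
    words = set(question.lower().replace("?", "").split())
    return max(charts, key=lambda name: len(words & charts[name]))

print(best_chart("show me sales by region"))  # Sales by Region (bar)
```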

Trend Lines and Predictive Analysis


Another extremely useful feature of Tableau is the use of time series and forecasting.
Easy creation of trend lines and forecasting is possible due to Tableau’s powerful
backend and dynamic front end. You can easily get data predictions such as a
forecast or a trend line by simply selecting a few options and performing drag-and-
drop operations with the fields concerned.
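A straight trend line of the kind described above is ordinary least squares, which has a closed form that can be computed directly. The months/sales numbers below are invented sample data.

```python
# Ordinary least squares for a straight trend line y = m*x + b.

def trend_line(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

months = [1, 2, 3, 4]
sales = [10.0, 12.0, 14.0, 16.0]
m, b = trend_line(months, sales)
print(m, b)       # 2.0 8.0
print(m * 5 + b)  # naive forecast for month 5 -> 18.0
```

Extrapolating the fitted line, as in the last line, is the simplest form of forecasting; Tableau's forecast feature uses richer models (e.g. exponential smoothing), but the trend line itself is exactly this regression.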

Miscellaneous Features of Tableau


Along with the list of key features that we just covered, Tableau is loaded with a lot
of other important as well as useful features listed below:
• Cross database join
• Nested sorting
• Drag-and-drop integration
• Data connectors
• Prep conductor
• Text editor

• Revision history
• Licensing views
• ETL refresh
• Web Data connector
• External service integration
• Split function

Features of Latest Tableau Version


The latest Tableau version brings with it some new features. Have a look at the list
of features included in the new version.
• Improved tables (Tables up to 50 columns, horizontal scrolling, sorting by
dimensions and measures)
• Webhooks support
• View recommendations
• Sandboxed extensions
• LinkedIn sales navigator connector
• Improved maps, activity feed, Ask data, SAP HANA connectors
• Tooltip editing in the browser

Tableau Features (Quick view)


Given below is a quick overview of Tableau features:
• API
• Access Control
• Active Directory Integration
• Activity Dashboard
• Ad hoc Analysis
• Ad hoc Query
• Ad hoc Reporting

• Categorization
• Collaboration Tools
• Content Library
• Custom Fields
• Customizable Reporting
• Dashboard Creation
• Data filtering, import/export, mapping, storage management, visualization.
• Database Integration
• Drag & Drop Interface
• Email Integration
• Email Notifications
• Filtered Views
• Gantt Charts
• Geographic Maps
• Geolocation
• Interactive Content
• Metadata Management
• Multiple Projects
• Offline Access
• Permission Management, Role-Based Permissions
• Real-Time Analytics, Real-Time Data
• Self Service Portal
• Task Management
• Third-Party Integration
• Trend Analysis
• Usage Tracking
• User Management

• Visual Analytics
Tableau is a very useful tool loaded with user-friendly features and functionalities
which helps us extract valuable information from raw data and analyze it using
visualizations.

9. Introduction to Tableau Desktop Software


Tableau Desktop Workspace
In the start screen, go to File > New to open a Tableau Workspace
The Tableau Desktop Workspace consists of various elements as given in the figure:

The Tableau workspace consists of menus, a toolbar, the Data window, cards that
contain shelves and legends, and one or more sheets. Sheets can be worksheets or
dashboards.
Worksheets contain shelves, which are where you drag data fields to build views.
You can change the default layout of the shelves and cards to suit your needs,
including resizing, moving, and hiding them.

Dashboards contain views, legends, and quick filters. When you first create a
dashboard, the Dashboard is empty and all of the worksheets in the workbook are
shown in the Dashboard window.

Data Window
Data fields appear on the left side of the workspace in the Data window. You can
hide and show the Data window by selecting Window > Show Data Window. You
can also click the minimize button in the upper right corner of the Data window.

Toolbar Icon
Toolbar icon present below the menu bar can be used to edit the workbook using
different features such as undo, redo, new data source, slideshow and so on.
Tableau’s toolbar also contains commands such as Connect to data, New Sheet, and
Save. In addition, the toolbar contains analysis and navigation tools such as Sort,
Group, and Highlight.
Toolbar Description
Undo: undoes the last task you completed.
Redo: repeats the last task you canceled with the Undo button.
Save: saves the changes made to the workbook.
Connect to Data: opens the data page where you can create a new connection or select
one from your repository.
New Sheet: creates a new blank worksheet.
Duplicate Sheet: creates a new worksheet containing the exact same view as the
current sheet.
Clear: clears the current worksheet. Use the drop-down list to clear specific parts of
the view such as filters, formatting, and sizing.
Automatic Updates: controls whether Tableau automatically updates the view
when changes are made. Use the drop-down list to automatically update the entire
sheet or just quick filters.
Run Update: runs a manual query of the data to update the view with changes when
automatic updates is turned off. Use the drop-down list to update the entire sheet or
just quick filters.
Swap: moves the fields on the Rows shelf to the Columns shelf and vice versa. The
Hide Empty Rows and Hide Empty Columns settings are always swapped with this
button.
Sort Ascending: applies a sort in ascending order of a selected field based on the
measures in the view.


Sort Descending: applies a sort in descending order of a selected field based on the
measures in the view.
Group Members: creates a group by combining selected values.
Show Mark Labels: toggles between showing and hiding mark labels for the
current sheet.
Presentation Mode: toggles between showing and hiding everything but the view.
View Cards: shows and hides the specified cards in a worksheet. Select the cards
you want to hide or show from the drop-down list.
Fit Selector: specifies how the view should be sized within the application window.
Select either a Normal fit, Fit Width, Fit Height or Entire View.
Fix Axes: toggles between locking the axes to a specific range and showing all of
the data in the view.
Highlight: turns on highlighting for the selected sheet. Use the options on the
drop-down list to define how values will be highlighted.
Show Me!: displays alternative views of the data, in addition to the best view
according to best practices. The options available depend on the selected data fields.

Tooltips
Tooltips are additional data details that display when you rest the pointer over one
or more marks in the view. Tooltips also offer convenient tools to quickly filter or
remove marks or view underlying data. Tooltips consist of a body, action links, and
commands.

Status Bar
The status bar is located at the bottom of the Tableau workbook. It displays
descriptions of menu items as well as information about the current view. For
example, the status bar below shows that the view has 131 marks shown in 3 rows
and 11 columns. It also shows that the SUM(Profit) for all the marks is $785,604.


Cards and Shelves: Every worksheet contains a variety of different cards that you
can show or hide. Cards are containers for shelves, legends, and other controls. For
example, the Marks card contains the mark selector, the size slider, the mark
transparency control, and the shape, text, color, size, angle, and level of detail
shelves.

Cards can be shown and hidden as well as rearranged around the worksheet.
The following list describes each card and its contents.
• Columns Shelf - contains the Columns shelf where you can drag fields to add
columns to the view.
• Rows Shelf - contains the Rows shelf where you can drag fields to add rows
to the view.
• Pages Shelf – contains the Pages shelf where you can create several different
pages with respect to the members in a dimension or the values in a measure.
• Filters Shelf – contains the Filters shelf; use this shelf to specify the values to
include in the view.
• Measure Names/Values Shelf – contains the Measure Names shelf; use this
shelf to use multiple measures along a single axis.


• Color Legend – contains the legend for the color encodings in the view and
is only available when there is a field on the Color shelf.
• Shape Legend – contains the legend for the shape encodings in the view and
is only available when there is a field on the Shape shelf.
• Size Legend – contains the legend for the size encodings in the view and is
only available when there is a field on the Size shelf.
• Map Legend - contains the legend for the symbols and patterns on a map.
The map legend is not available for all map providers.
• Quick Filters – a separate quick filter card is available for every field in the
view. Use these cards to easily include and exclude values from the view
without having to open the Filter dialog box.
• Marks – contains a mark selector where you can specify the mark type as
well as the Path, Shape, Text, Color, Size, Angle, and Level of Detail shelves.
The availability of these shelves is dependent on the fields in the view.
• Title – contains the title for the view. Double-click this card to modify the
title.
• Caption – contains a caption that describes the view. Double-click this card
to modify the caption.
• Summary – contains a summary of each of the measures in the view
including the Min, Max, Sum, and Average.
• Map Options - allows you to modify the various labels and boundaries shown
in the online maps. Also, you can use this card to overlay metro statistical
area information.
• Current Page – contains the playback controls for the Pages shelf and
indicates the current page that is displayed. This card is only available when
there is a field on the Pages shelf.
Each card has a menu that contains common controls that apply to the contents of
the card. For example, you can use the card menu to show and hide the card. Access
the card menu by clicking on the arrow in the upper right corner of the card.

Menu Bar


It consists of menu options such as File, Data, Worksheet, Dashboard, Story,
Analysis, Map, Format, Server, and Window. The options in the menu bar include
features such as file saving, data source connection, file export, table calculation
options, and design features for creating a worksheet, dashboard, and storyboard.

Dimension Shelf
The dimensions present in the data source can be viewed in the dimension shelf.

Measure Shelf
The measures present in the data source can be viewed on the measure shelf.

Sets and Parameters Shelf


The user-defined sets and parameters can be viewed in the sets and parameter shelf.
It can also be used to edit the existing sets and parameters.

Page Shelf
The Pages shelf can be used to step through the visualization page by page, like an
animation, by placing the relevant field on the Pages shelf.

Filter Shelf
The filters that can control the visualization can be placed on the filter shelf, and the
required dimensions or measures can be filtered in.

Marks Card
Marks card can be used to design the visualization. The data components of the
visualization such as color, size, shape, path, label, and tooltip used in the
visualizations can be modified in the marks card.

Worksheet
The worksheet is the place where the actual visualization can be viewed in the
workbook. The design and functionalities of the visual can be viewed in the
worksheet.

Tableau Repository


The Tableau repository is used to store all the files related to Tableau Desktop. It includes
various folders such as Bookmarks, Connectors, Datasources, Extensions, Logs,
Mapsources, Services, Shapes, TabOnlineSyncClient and Workbooks. My Tableau
repository is usually located in the file path C:\Users\User\Documents\My Tableau
Repository.

Tableau Navigation
The navigation of a workbook is explained below.

Data Source
The addition of a new data source or modification of an existing data source can be
done using the ‘Data Source’ tab present at the bottom of the Tableau Desktop window.

Current Sheet


The current sheet can be identified by the name on its tab. All the sheets, dashboards,
and storyboards present in the workbook can be viewed here.

New Sheet
The new sheet icon present in the tab can be used to create a new worksheet in the
Tableau Workbook.

New Dashboard
The new dashboard icon present in the tab can be used to create a new dashboard in
the Tableau Workbook.

New Storyboard
The new storyboard icon present in the tab can be used to create a new storyboard in
the Tableau Workbook.

Data Window Features and Functions


The Data window has many features and functions to help you organize your data
fields, find specific fields, and hide others.
• Organize the Data Window
• Find Fields
• Rename Fields
• Hide or Unhide Fields
• Add Fields to the Data Window

Organize the Data Window


You can reorganize the Data window from its default layout by selecting from
a variety of sorting options. These Sort by options are located in the Data window
menu.


You can sort by one of the following options:


• Name – lists the dimensions and measures in alphabetical order according to
their field aliases.
• Data source order – lists the dimensions and measures in the order they are
listed in the underlying data source.
You can also select to Group by Table, which is a command that toggles on and off.
When you select this option, the dimensions and measures are grouped according to
the database table they belong to. This is especially useful when you have several
joined tables.

Find Fields
You can search for fields in the Data window. If there are many fields in your data
source it can be difficult to find a specific one like “Date” or “Customer” or “Profit.”
To search for a field, click the Find Field icon at the top of the Data window (Ctrl +
F) and type the name of the field you want to search for. Valid field names that fit
the description appear in a drop-down list. Select the field you want and press enter
on your keyboard to highlight the field in the Data window.

Rename Fields
You can assign an alternate name for a field that displays in the Data window as well
as in the view. For instance, a field called Customer Segment in the data source could
be aliased to appear as Business Segment in Tableau. You can rename both


dimensions and measures. Renaming a field does not change the name of the field
in the underlying data source; rather, the field is given an alternate name that appears
only in Tableau workbooks. The changed field name is saved with the workbook as
well as when you export the connection.
Renaming a Field
1. Right-click the field name in the Data window you want to rename and select
Rename.
2. Type the new name in the subsequent dialog box and click OK.
The field displays with the new name in the Data window.

Hide or Unhide Fields


You can selectively hide or show fields in the Data window. To hide a field, right-
click the field you want to hide and select Hide.

Add Fields to the Data Window


You can create calculated fields that appear in the Data window. These new
computed fields can be used like any other field. Select Create Calculated Field on
the Data window menu. Alternatively, select Analysis > Create Calculated Field.

Editing Field Properties


When you drag fields to shelves, the data is represented as marks in the view. You
can specify settings for how the marks from each field will be displayed by setting
mark properties. For example, when you place a dimension on the color shelf the
marks will be colored by the values within that dimension. You can set the Color
property so that anytime you use that dimension on the color shelf your chosen colors
are used. Using field properties, you can set aliases, colors, shapes, the default
aggregation, and so on.
• Comments
• Aliases
• Colors
• Shapes
• Formats


• Sort
• Aggregation
• Measure Names

Comments
Fields can have comments that describe them. The comments display in a tooltip in
the Data window and in the Calculated Field dialog box. Field comments are a good
way to give more context to the data in your data source. Comments are especially
useful when you are building a workbook for others to use.

Aliases
Aliases are alternate names for specific values within a dimension. Aliases can be
created for the members of most dimensions in the Data window. You cannot,
however, define aliases for continuous dimensions and dates and they do not apply
to measures.
The method for creating aliases depends on the type of data source you are using.

Colors
When you use a dimension to color encode the view, default colors are assigned to
the field’s values. Color encodings are shared across multiple worksheets that use
the same data source to help you create consistent displays of your data. For
example, if you define the Western region to be green, it will automatically be green
in all other views in the workbook. You can set the default color encodings for a
field by right-clicking the field in the Data window and selecting Field Properties >
Color.

Shapes
When you use a dimension to shape encode the view, default shapes are assigned to
the field’s values. Shape encodings are shared across multiple worksheets that use
the same data source to help you create consistent displays of your data. For example,
if you define that Furniture products are represented with a square mark, they will
automatically be shown with a square mark in all other views in the workbook.


You can set the default shape encodings for a field by right-clicking the field in the
Data window and selecting Field Properties > Shape.

Formats
You can set the default text format for date and number fields. For example, you
may want to always show the Sales values as currency using the U.S. dollar sign and
two decimal points. On the other hand, you may want to always show Discount as a
percentage. You can set the default formats by right-clicking a date or numeric field
and selecting an option on the Field Properties menu. A dialog box opens where
you can specify a default format.

Date Functions in Tableau


The following are the Date functions in tableau:
1. DAY
Syntax – DAY(Date)

If you want to know the day of the month from a date, then this function is used.


2. MONTH
Syntax – MONTH(Date)

If you want to know the month from the date given to Tableau, then this function is
used.
3. YEAR
Syntax – YEAR(Date)

If you want to know the year of the date given to Tableau, then this function is used.
4. DATEDIFF
Syntax – DATEDIFF (date_part, date1, date2, [start_of_week (optional)])

If you want to know the difference between the input dates in Tableau, then this
function is used.

5. DATEPART
Syntax – DATEPART (date_part, date, [start_of_week (optional)])

This function is helpful if you want to get a specified part of a date as an integer.


6. DATEADD
Syntax – DATEADD (date_part, interval, date)

If you want to add a specified interval to a date given to Tableau, the DATEADD
function is used.
7. DATETRUNC
Syntax – DATETRUNC (date_part, date, [start_of_week (optional)])

If you need to truncate a date to a given level of detail, such as the start of its month
or quarter, the DATETRUNC function is used. It gives a new date as an output.
8. DATENAME
Syntax – DATENAME (date_part, date, [start_of_week (optional)])

If you need to know a specified part of a date, such as the month or weekday name,
this function is used. It gives the output in the form of a string.
9. MAKEDATE
Syntax – MAKEDATE (Year, Month, Day)

This function returns a date value constructed from the given year, month, and day.
10. MAKETIME
Syntax – MAKETIME (Hour, Minute, Second)

This function returns a time value built from the provided hours, minutes, and
seconds.
11. NOW
Syntax – NOW ()

If you want to know the present date and time, then this function is used.
12. TODAY
Syntax – TODAY ()

When you use this function, you will get the current date as an output.
13. MAX


Syntax – MAX (date1, date2)

This function is basically used for comparing numeric expressions, but it is valid for
dates as well. It returns the larger of the two values after comparing them. If either
argument is null, the function returns null.
14. MIN
Syntax – MIN (date1, date2)

This function is basically used for comparing numeric expressions, but it is valid for
dates as well. It returns the smaller of the two values after comparing them. If either
argument is null, the function returns null.
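The semantics of several of these date functions can be sketched with Python's standard datetime module. These are illustrative equivalents only, with made-up example dates; Tableau evaluates these as calculated-field expressions, not Python.

```python
# Illustrative Python equivalents for several Tableau date functions.
# These are NOT Tableau expressions; they only demonstrate the semantics.
from datetime import date, time, timedelta

d1 = date(2024, 3, 15)  # made-up example dates
d2 = date(2024, 1, 1)

print(d1.day)    # DAY(d1)   -> 15
print(d1.month)  # MONTH(d1) -> 3
print(d1.year)   # YEAR(d1)  -> 2024

# DATEDIFF('day', d2, d1): difference between two dates, here in days
print((d1 - d2).days)           # -> 74

# DATEADD('day', 10, d1): shift a date by an interval
print(d1 + timedelta(days=10))  # -> 2024-03-25

# DATETRUNC('month', d1): truncate a date to the start of its month
print(d1.replace(day=1))        # -> 2024-03-01

# MAKEDATE(2024, 3, 15) and MAKETIME(10, 30, 0)
print(date(2024, 3, 15), time(10, 30, 0))

# MAX/MIN on dates reduce to ordinary comparison
print(max(d1, d2), min(d1, d2))  # -> 2024-03-15 2024-01-01
```

In Tableau itself, you would type the corresponding expression, such as DATEDIFF('day', [Order Date], [Ship Date]), into a calculated field.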

Sort
You can set a default sort order for the values within a categorical field so that every
time you use the field in the view, the values will be sorted correctly. For example,
let’s say you have an Order Priority field that contains the values High, Medium, and
Low. When you place these in the view, by default they will be listed as High, Low,
Medium because they are shown in alphabetical order. You can set a default sort so
that these values are always listed correctly. To set the default sort order, right-click a
dimension and select Field Properties > Sort. Then use the sort dialog box to specify
a sort order.
Note:
The default sort order also controls how the field values are listed in a quick filter.
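The Order Priority example above can be sketched in Python: a plain alphabetical sort lists the values in the wrong order, while a custom key restores the intended High, Medium, Low order. This is illustrative only; in Tableau you set this through Field Properties > Sort.

```python
# Alphabetical vs. custom ("default") sort order for an Order Priority field
values = ["Medium", "Low", "High", "Low", "High"]

# Alphabetical order lists High, Low, Medium, which misorders Medium and Low
print(sorted(set(values)))  # -> ['High', 'Low', 'Medium']

# A manual sort key plays the role of Tableau's default sort property
priority = {"High": 0, "Medium": 1, "Low": 2}
print(sorted(set(values), key=priority.get))  # -> ['High', 'Medium', 'Low']
```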

Aggregation
You can also specify a default aggregation for any measure. The default aggregation
will be used automatically when the measure is first totaled in the view.
To specify a default aggregation:
1. Right-click any measure in the Data window and select Field Properties >
Aggregation.
2. On the Aggregation list, select an aggregation.


Whether you are specifying the aggregation for a field on a shelf or the default
aggregation in the Data window, you can select from the following options:
Default
For Microsoft Analysis Services data sources, this option computes the aggregation
on the server.
For Essbase data sources, this option computes the total using the default
aggregation determined by the data type (typically SUM).
SUM
Displays the sum of all shown values.
Average
Displays the average of all shown values.
Minimum
Displays the smallest shown value.
Maximum
Displays the largest shown value.
Server


Computes the aggregation on the server.
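The standard aggregations above can be illustrated with plain Python over a small, made-up Profit column:

```python
# Tableau's standard aggregations, computed over a made-up Profit column
profit = [120.0, -35.5, 480.25, 60.0]

print(sum(profit))                # SUM     -> 624.75
print(sum(profit) / len(profit))  # Average -> 156.1875
print(min(profit))                # Minimum -> -35.5
print(max(profit))                # Maximum -> 480.25
```

Setting a default aggregation simply tells Tableau which of these computations to apply when the measure first appears in a view.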

Measure Names
There are times that you will want to show multiple measures in a view and so you
will use the Measure Values and the Measure Names fields. When you use Measure
Names all of the measure names appear as row or column headers in the view.
However, the headers include both the measure name and the aggregation label. So
if you are showing the summation of profit the header displays as SUM(Profit). You
can change the names so that they do not include the aggregation label by editing the
member aliases of the Measure Names field. This feature becomes particularly
useful when you are working with a text table that shows multiple measures.

Build Data Views


You can build data views by dragging fields from the Data window and dropping
them onto the shelves that are part of every Tableau worksheet.

Nested Table
You will modify the basic Tableau view to show quarters in addition to years.
Drill down on the Year (Order Date) field by clicking the plus button on the right
side of the field.


Drag the Order Date field from the Data window and drop it on the Columns shelf
to the right of the Year (Order Date) field.

The new dimension divides the view into separate panes for each year. Each pane
has columns for the quarters of the given year. This view is called a nested table
because it displays multiple headers, with quarters nested within years.

Building Views (Automatically)


Rather than building views by dragging and dropping fields, you can use Show Me!
to create views automatically.
In the Show Me! dialog box, select the type of view you want to create and click
OK.


Save your Work


After you have created all the desired views of your data, you should save the results
in a Tableau Workbook.
Saving a Tableau workbook allows you to save all your worksheets for later use. It
also allows you to share your results using a convenient file.
Follow the steps below to save your workbook.
1. Select File > Save or press Ctrl + S on your keyboard.
2. Browse to a file location to save the workbook. By default, Tableau saves
workbooks in the Workbooks directory in the Tableau Repository.
3. Specify a file name for the workbook.
4. Specify a file type. You can select from the following options:
• Tableau Workbook (.twb) – saves all the sheets and their connection
information in a workbook file. The data is not included.
• Tableau Packaged Workbook (.twbx) – saves all the sheets, their connection
information and any local resources (e.g., local file data sources, background
images, custom geocoding, etc.).
5. When finished, click Save.
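One reason a packaged workbook is convenient to share is that a .twbx file is essentially a zip archive bundling the .twb workbook XML with any local resources. The sketch below uses only the Python standard library; the file names are made up for illustration, not a real Tableau workbook.

```python
# A .twbx packaged workbook is essentially a zip archive bundling the
# .twb workbook XML with local resources. This sketch fakes a minimal
# package in memory to show the structure.
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as twbx:
    twbx.writestr("Sales Analysis.twb", "<workbook/>")         # workbook XML
    twbx.writestr("Data/orders.csv", "order_id,profit\n1,42")  # local data

with zipfile.ZipFile(buf) as twbx:
    print(twbx.namelist())  # -> ['Sales Analysis.twb', 'Data/orders.csv']
```

In practice, you could pass a real .twbx path to zipfile.ZipFile to inspect what a packaged workbook contains.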


Inspecting Data
Once you have created a view, Tableau offers a selection of tools that help you isolate
the data of interest and then continue to explore and analyze it. For example, if you
have a dense data view, you can focus on a particular region, select a group of
outliers, view the underlying data source rows for each mark, and then view a
summary of the selected marks including the average, minimum, and maximum values.
• Select
• Zoom Controls
• Pan
• Undo and Redo
• Drop Lines
• Summary Card
• View Data
• Describing the View

Select
Selecting marks is useful when you want to identify a subset of the data view
visually or when you want to run an action.
You can select any individual mark by clicking on it. You can select multiple marks
by holding down the Ctrl key. You can also drag the cursor to draw a box around
the marks you want to select. Finally, you can combine these methods to select all
the marks of interest quickly.

Zoom Controls
Tableau has a set of zoom controls that display in the upper left corner of the view.
By default, these controls only display when you hover over a map view. You can
control when the zoom controls display by selecting View > Zoom Controls and
then select one of the following options:
• Automatic– displays when you hover the mouse over map views.


• Hide– never displays.


• Show on hover – displays when you hover the mouse over all views.
These settings also apply to the view when it is opened in Tableau Reader or Tableau
Server. You must specify a setting for each worksheet.
The zoom controls allow you to zoom in and out, zoom to a specific area, and fix or
reset the axes. Each control is described below.

Zoom In and Out


Zooming is useful when you have a lot of data in a view, and you want to focus on
a specific part of the view without excluding the rest. Click the plus button to zoom
in on the view and the minus button to zoom out. If the zoom controls are hidden,
double-click the view to zoom in and hold down SHIFT and double-click to zoom
out.

Area Zoom
Rather than zooming in and out on the entire view, you can select a specific area to
zoom to. When you zoom in on an area, the view is enlarged so that the selected area
fills the window. Select the Area Zoom button and then click and drag in the view
to select the area to zoom. If the zoom controls are hidden, hold down CTRL +
SHIFT and then drag the mouse to select the area you want to zoom to.

Reset Axes
When you zoom in or out the axes in the view are locked to a specific range. You
can quickly reset the view back to the automatic axis range by clicking the Reset
Axes button in the zoom controls. This button is also available on the toolbar.

Pan
You can move your view of a table up and down as well as left and right with the
pan tool. There are two uses of panning. The first is when you have zoomed in on a
view, particularly a map, and want to move the map around to see other marks of
interest. The second is when your data view contains many panes, and you want to
move quickly from pane to pane.
Use the Pan tool by holding SHIFT and then dragging the cursor across the view.


Undo and Redo


You can perform unlimited undo and redo of your actions. You can undo almost all
actions in Tableau by pressing the Undo button on the toolbar. Likewise, you can
redo almost all actions by pressing the Redo button on the toolbar.
In this regard, every workbook behaves like a web browser. You can quickly return
to a previous view. Or you can browse all the views of a data source that you have
created. Tableau saves the undo/redo history across all worksheets until you exit.
The history is not saved between sessions.

Drop Lines
Drop lines are most useful for distinguishing marks and calling out their position in
the view. For example, in a view that is dense with scatter marks, you can turn on
drop lines to show the position of a particular data point. When you add drop lines,
a line is extended from the marks to one of the axes. You can choose to show drop
lines all the time or only when a mark is selected.

To add drop lines to the view:


• Right-click on the pane and select Drop Lines.
By default, drop lines are set only to show when the mark is selected. You can
change this setting and specify other options in the Drop Lines dialog box.

To edit drop lines


Right-click on the pane and select Edit Drop Lines to open the Drop Lines dialog
box.


In the Drop Lines dialog box select an axis to draw the line to, whether to always
show the drop lines, and whether to show labels.
When finished click OK.

Summary Card
The summary card is a really quick way to view information about a selection or the
entire data source. The card shows the SUM, MIN, MAX, and Average for each
measure in the view. You can hide or show the Summary Card by selecting it on the
View Cards toolbar menu. You can also select View > Cards > Summary.
Consider this example, the view below is a scatter plot of profit vs. sales for three
different product categories. You can see that the technology category contains high
profit and high sales products (the green marks). When you select these marks, the
summary card quickly shows you that these products account for $4,334,791 in sales
with a minimum sale of $465,729.

View Data
The View Data command lets you display the values for each row in the data source
that compose the marks. It also shows you the summary data based on the
aggregations in the view. You might want to view the data to verify the aggregated
value associated with a mark, or to isolate and export the individual rows associated
with data of interest, such as outliers.
You can view data for a selection of marks, the fields in the Data window, and when
you’re connecting to data.
The view shown below shows the average order quantity for two product dimensions
as a bar chart. Suppose you want to view the data for the largest marks in each pane.
To do this, select the marks of interest, right-click in the table, and select View


Data on the context menu. Alternatively, you can select the Analysis > View Data
menu item.

Note:
Viewing data may not return any records if you are using a field that contains floating
point values as a dimension. This is due to the precision of the data source.

Describing the View


Occasionally you may want to summarize an analysis you have completed on a
worksheet succinctly. You might then want to remind yourself of what it shows (the
filters that are applied, etc.), and finally, you may want to share a summary of the
analysis with someone else.
When you choose View > Describe Sheet, you can view a description of the
workbook, data source, fields and layout of the current worksheet. This summary
includes the Caption in the first line but expounds on other important summary
information. This information can be copied and exported to other applications using
the clipboard.
Note:
If you have Trend Lines turned on, the Describe Sheet dialog box includes
information about the trend line model, including an ANOVA table. Refer to the
documentation to learn more about the terms used to describe the model.


10. How to Create a Tableau Dashboard


Tableau is one of the best visualization tools on the market and has been named a
leader in the Gartner Magic Quadrant for Analytics and Business Intelligence
Platforms for seven years in a row. Tableau is a flexible, secure, end-to-end analytics
and visualization tool. It works on simple drag-and-drop functionality and is easy to
use. You do not require any coding background to perform analytics and create
visualizations of your data. In this tutorial, we are going to learn how to create a
Tableau dashboard in 9 simple steps.
When you view your data in a traditional Excel spreadsheet, it is very difficult to
come up with possible insights. Only data analysts can surface insights, and only by
performing a lot of work on the data, using macros, and more. But with Tableau,
anyone can come up with possible solutions in a quicker way, provided the
individual knows how to work in Tableau.

Features of Tableau – Makes Interactive Dashboards


The following are the incredible features of Tableau:
• Drag & drop functionality
• Almost connects to all the data sources available


• Automatic updates
• Can create no-code queries
• Can ask questions in natural language to get possible solutions
• Creates a wide range of interactive visuals to create impeccable dashboards
• Most secured as security permissions are asked at all levels
• Mobile compatible dashboards
For more detail about Tableau features, follow the section: Top 10 Features of
Tableau.
Dashboards can be created in Tableau Desktop or Tableau Public. If you have a
licensed version of Tableau, dashboards can be created in Tableau Desktop; for
practice, you can create dashboards in Tableau Public by just signing in.

Step by Step Create a Tableau Dashboard


The following is a step-by-step explanation of how to create an interactive Tableau
dashboard. Here we will create a dashboard in Tableau Desktop.

Step 1: Click the New Dashboard Button to Create a New, Empty Tableau Dashboard
• Click on the new dashboard button highlighted in red at the bottom of the bar
to open a new dashboard or
• Click on Dashboard on the menu bar and select a new dashboard
• You can change the name of this dashboard as per your choice

Step 2: Select the First Sheet That Will Be Part of Your First Tableau Dashboard
• Now drag the sheet you want to include in your dashboard and drop it into
the dashboard workspace.
• If you are working with sales data, drag and drop the sales map sheet tab.
• You can select your own data for creating a dashboard


Step 3: Drag the Second Sheet You Want to Use into the Tableau Dashboard Workspace
• Now select the second sheet you wish to include in your dashboard.
• Suppose now you can add the sales vs profit datasheet
• The order of the sheets is not important
• You can add the sheets in any order you want

Step 4: Now release the mouse button to drop the required chart at
the desired location
• Now it’s time to drop the chart
• When you release the mouse button, the chart is dropped at your desired
location

Step 5: Customize the Tableau Dashboard Size by Making a Selection from the Size Menu
• You can customize your dashboard as per your wish
• You can resize the chart size
• You can add more sheets in your dashboard
• You can add filters as per your choice
• You can make the sheets fit properly in the dashboard by clicking on the fit
option from the toolbar

Step 6: You now have an interactive Tableau dashboard
• After all the customization is done, your dashboard is finally ready
• You can create many such polished, interactive dashboards

Step 7: Show the dashboard by pressing F7
• Once your dashboard is ready, you can view it in presentation mode
• To enable presentation mode, click the presentation button at the top or press F7


Step 8: Present the dashboard
• This opens your dashboard in presentation mode
• Until now, you were working on your dashboard in edit mode

Step 9: Share the Tableau dashboard with your team members
• Now it's time to share your dashboard with other team members
• You can also publish your dashboard to the server, where it can be accessed by everyone
That was the step-by-step process of creating a Tableau dashboard. To publish your dashboard to the server, you need the appropriate credentials. A Tableau dashboard is easy to create once you have a clear vision of it in mind.

11. Embed Tableau Dashboard/Reports on the Web Page

The point of creating reports or dashboards in Tableau is to share them with the individuals who can benefit from them. There are many ways to share, but the most interactive is to embed the Tableau dashboard in a webpage.
There are many possible reasons to embed a Tableau dashboard in a webpage. Here we will discuss the two most common ones:
Reasons to Embed a Tableau Dashboard in a Webpage
1. An interactive webpage experience
When users come to your website, they are not always sure how to reach a decision. Users drift away from a website quickly if they do not find anything appealing or exciting.
If you embed an interactive Tableau dashboard on your webpage, users get more meaningful insights to reach a conclusion and find the webpage appealing enough to dig for more insights.


2. Give clients everything in one spot
Suppose you are offering your clients a service around a particular product, and you want full branding control as well. If you make your clients visit a third-party solution like Tableau, they will get irritated.
Instead, embed Tableau in your product so that your clients feel no discomfort.
Pre-Requisites to Embed a Tableau Dashboard in a Webpage
The following are the two pre-requisites for embedding a Tableau dashboard in a webpage:
1. Tableau Online / Tableau Server
• You should have a Tableau Online or Tableau Server account. If you do not have one, you need to subscribe first
• The Tableau content that you will be embedding should be published to Tableau Online or Tableau Server
2. Python
Python 3 should be installed to complete the process.
To check whether Python 3 is installed, open a command prompt and type python3 or python.
Methods to Embed Tableau Dashboard in Web Page
The following are the three methods to follow to embed Tableau Dashboard in a
Webpage:
1. Iframe
2. Tableau Embed Code
3. Tableau JavaScript API
Let us discuss the steps in detail.
Let Us Get Started
The first step is to run your Python HTTP server. To do so, follow the steps below:
• Create an empty directory for the HTML files the Python HTTP server will serve
• On the command line, change to your home directory
• Create an empty folder and name it tableau_embedded
• Change the directory to the newly created folder
• Spin up a Python HTTP server (for example, python3 -m http.server 8000) to serve the files
• Open a browser and go to http://localhost:8000/
• Create an HTML file in the directory using the following code:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Embedded Analytics with Tableau</title>
</head>
<body>
This is where the dashboard will go.
</body>
</html>
• Save it as index.html
• Refresh the browser
• The template for embedding Tableau is all set
Method 1 – iframe
This is the simplest method. After completing the setup steps above, follow the steps below:
• Open Tableau Online / Tableau Server
• Open the content you want to embed
• Click on the share button
• A dialogue box will open named share view
• Click on the Copy link button
• Now edit the index.html file by placing an iframe tag inside the body
• Provide the copied link as the source, along with the height and width of the dashboard you want to embed
• Below is the reference code:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Embedded Analytics with Tableau</title>
</head>
<body>
<iframe width="1335px" height="894px" src="<tableau_url>"></iframe>
</body>
</html>
• Now refresh your browser to view the embedded Tableau dashboard.
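If the page needs to embed different views or sizes, the iframe markup can also be assembled in script rather than hard-coded. The helper below is only an illustrative sketch (buildTableauIframe is a hypothetical name, not part of any Tableau API); it produces the same tag shown above from a copied share link:

```javascript
// Hypothetical helper: builds the Method 1 iframe tag from a share link
// copied out of Tableau's "Share View" dialog.
function buildTableauIframe(tableauUrl, width, height) {
  return '<iframe width="' + width + '" height="' + height +
         '" src="' + tableauUrl + '"></iframe>';
}

// Example with a placeholder URL (replace with your own copied link):
console.log(buildTableauIframe(
  "https://example.online.tableau.com/t/mysite/views/Sales",
  "1335px", "894px"));
```

The returned string can then be written into the page, for example via element.innerHTML.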
Method 2 – Tableau Embed Code
To apply this method, follow the below steps:
• Open the content you want to embed from Tableau Online / Tableau Server
• Click on the share button
• A dialogue box will open named share view
• Select </> Embed Code
• Edit index.html again with the following code, replacing the iframe tag
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Embedded Analytics with Tableau Online</title>
</head>
<body>
<script type='text/javascript' src='https://us-east-1.online.tableau.com/javascripts/api/viz_v1.js'></script>
<div class='tableauPlaceholder' style='width: 1488px; height: 706px;'>
<object class='tableauViz' width='1488' height='706' style='display:none;'>
<param name='host_url' value='https%3A%2F%2Fus-east-1.online.tableau.com%2F' />
<param name='embed_code_version' value='3' />
<param name='site_root' value='&#47;t&#47;zuar' />
<param name='name' value='Regional&#47;GlobalTemperatures' />
<param name='tabs' value='yes' />
<param name='toolbar' value='yes' />
<param name='showAppBanner' value='false' />
<param name='filter' value='iframeSizedToWindow=true' />
</object>
</div>
</body>
</html>
• Refresh to view your Tableau dashboard in the browser
Method 3 – Tableau JavaScript API
This is the most powerful method because it allows the Tableau dashboard and the webpage to interact with each other. Follow the steps below:
• Open index.html
• Update it with the given below code:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Embedded Analytics with Tableau</title>
</head>
<body>
<div id="vizContainer"></div>
<script src="https://us-east-1.online.tableau.com/javascripts/api/tableau-2.min.js"></script>
<script>
var viz;
function initViz() {
  var containerDiv = document.getElementById("vizContainer"),
      url = "https://us-east-1.online.tableau.com/t/zuar/views/Regional/GlobalTemperatures";
  viz = new tableau.Viz(containerDiv, url);
}
initViz();
</script>
</body>
</html>
Refresh your browser to view the result.
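Once the viz has loaded, the JavaScript API lets the page drive the dashboard, which is the interactivity that sets this method apart. Below is a minimal sketch assuming the viz variable created in the code above: applyRegionFilter is a hypothetical helper name, while getWorkbook, getActiveSheet, and applyFilterAsync are Tableau JavaScript API (v2) calls, and "Region" is an assumed field name in the published view.

```javascript
// Sketch: filter the embedded viz from the hosting page (Tableau JS API v2).
// "replace" is the string value behind tableau.FilterUpdateType.REPLACE.
function applyRegionFilter(viz, region) {
  // applyFilterAsync returns a promise that resolves once the filter applies.
  return viz.getWorkbook()
            .getActiveSheet()
            .applyFilterAsync("Region", region, "replace");
}

// In the page, this could be wired to a button click, for example:
//   document.getElementById("southButton")
//     .addEventListener("click", function () { applyRegionFilter(viz, "South"); });
```

This is what an iframe or static embed code cannot do: the surrounding page and the dashboard can respond to each other.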


12. 8 Basic Steps to Complete a Project (Tableau Desktop - Version: 2021.3)

Learn how to connect to data, create data visualizations, present your findings, and share your insights with others.
This tutorial walks you through the features and functions of Tableau Desktop
version 2021.3. As you work through this tutorial, you will create multiple views in
a Tableau workbook. The steps you'll take and the workbook you'll work in are based
on a story about an employee who works at headquarters for a large retail chain. The
story unfolds as you step through asking questions about your business and its
performance.
You'll learn how to connect to data in Tableau Desktop; build, present, and share
some useful views; and apply key features along the way. Budget between one and
three hours to complete the steps.

Here's the story ...


Suppose you are an employee for a large retail chain. Your manager just got the
quarterly sales report, and noticed that sales seem better for some products than for
others and profit in some areas is not doing as well as she had expected. Your boss
is interested in the bottom line: It's your job to look at overall sales and profitability
to see if you can find out what's driving these numbers.
She has also asked you to identify areas for improvement and present your findings
to the team. The team can explore your results and take action to improve sales and
profitability for the company's product lines.
You'll use Tableau Desktop to build a simple view of your product data, map product
sales and profitability by region, build a dashboard of your findings, and then create
a story to present. Then, you will share your findings on the web so that remote team
members can take a look.


Step 1: Connect to your data


Your manager has asked you to look into the overall sales and profitability for the company and to identify key areas for improvement. You have a bunch of data, but you aren't sure where to start. With a free trial of Tableau Desktop, you decide to begin there.

Open Tableau Desktop and begin

The first thing you see after you open Tableau Desktop is the Start page. Here, you select the connector (how you will connect to your data) that you want to use.

The start page gives you several options to choose from:

1. Tableau icon. Click in the upper left corner of any page to toggle between
the start page and the authoring workspace.

2. Connect pane. Under Connect, you can:
• Connect to data that is stored in a file, such as Microsoft Excel, PDF, Spatial
files, and more.
• Connect to data that is stored on Tableau Server, Microsoft SQL Server,
Google Analytics, or another server.
• Connect to a data source that you’ve connected to before.
Tableau supports the ability to connect to a wide variety of data stored in a wide
variety of places.
3. Under Sample Workbooks, view sample dashboards and worksheets that come
with Tableau Desktop.
4. Under Open, you can open workbooks that you've already created.
5. Under Discover, find additional resources like video tutorials, forums, or the “Viz
of the week” to get ideas about what you can build.
In the Connect pane, under Saved Data Sources, click Sample - Superstore to
connect to the sample data set.

After you select Sample - Superstore, your screen will look something like this:


The Sample - Superstore data set comes with Tableau. It contains information about
products, sales, profits, and so on that you can use to identify key areas for
improvement within this fictitious company.

Step 2: Drag and drop to take a first look


Create a view
You set out to identify key areas for improvement, but where to start? With four
years' worth of data, you decide to drill into the overall sales data to see what you
find. Start by creating a simple chart.
1. From the Data pane, drag Order Date to the Columns shelf.
Note: When you drag Order Date to the Columns shelf, Tableau creates a column
for each year in your data set. Under each column is an Abc indicator. This indicates
that you can drag text or numerical data here, like what you might see in an Excel
spreadsheet. If you were to drag Sales to this area, Tableau creates a crosstab (like a
spreadsheet) and displays the sales totals for each year.
2. From the Data pane, drag Sales to the Rows shelf.
Tableau generates the following chart with sales rolled up as a sum (aggregated).
You can see total aggregated sales for each year by order date.

When you first create a view that includes time (in this case Order Date), Tableau
automatically generates a line chart.
This line chart shows that sales look pretty good and seem to be increasing over time.
This is good information, but it doesn't really tell you much about which products
have the strongest sales and if there are some products that might be performing
better than others. Since you just got started, you decide to explore further and see
what else you can find out.


Refine your view


To gain more insight into which products drive overall sales, try adding more data.
Start by adding the product categories to look at sales totals in a different way.
1. From the Data pane, drag Category to the Columns shelf and place it to the
right of YEAR(Order Date).
Your view updates to a bar chart. By adding a second discrete dimension to the view
you can categorize your data into discrete chunks instead of looking at your data
continuously over time. This creates a bar chart and shows you overall sales for each
product category by year.


Your view is doing a great job showing sales by category—furniture, office supplies,
and technology. An interesting insight is revealed!
From this view, you can see that sales of furniture are growing faster than sales of office supplies, even though Office Supplies had a really good year in 2021. Perhaps
you can recommend that your company focus sales efforts on furniture instead of
office supplies? Your company sells a lot of different products in those categories,
so you'll need more information before you can make a recommendation.
To help answer that question, you decide to look at products by sub-category to see
which items are the big sellers. For example, for the Furniture category, you want to
see details about bookcases, chairs, furnishings, and tables. Looking at this data
might help you gain insights into sales and later on, overall profitability, so add sub-
categories to your bar chart.
2. Double-click or drag Sub-Category to the Columns shelf.
Note: You can drag and drop or double-click a field to add it to your view, but be
careful. Tableau makes assumptions about where to add that data, and it might not
be placed where you expect. You can always click Undo to remove the field, or drag
it off the area where Tableau placed it to start over.


Sub-Category is another discrete field. It creates another header at the bottom of the
view, and shows a bar for each sub-category (68 marks) broken down by category
and year.

Now you are getting somewhere, but this is a lot of data to visually sort through. In
the next section, you will learn how you can add color, filters, and more to focus on
specific results.


Step summary
This step was all about getting to know your data and starting to ask questions about
your data to gain insights. You learned how to:
• Create a chart in a view that works for you.
• Add fields to get the right level of detail in your view.
Now you're ready to begin focusing on your results to identify more specific areas
of concern. In the next section, you will learn how to use filters and colors to help
you explore your data visually.

Step 3: Focus your results


You've created a view of product sales broken down by category and sub-category.
You are starting to get somewhere, but that is a lot of data to sort through. You need
to easily find the interesting data points and focus on specific results. Well, Tableau
has some great options for that!
Filters and colors are ways you can add more focus to the details that interest you.
After you add focus to your data, you can begin to use other Tableau Desktop
features to interact with that data.

Add filters to your view


You can use filters to include or exclude values in your view. In this example, you
decide to add two simple filters to your worksheet to make it easier to look at product
sales by sub-category for a specific year.
1. In the Data pane, right-click Order Date and select Show Filter.
2. Repeat the step above for the Sub-Category field.
The filters are added to the right side of your view in the order that you selected
them. Filters are card types and can be moved around on the canvas by clicking on
the filter and dragging it to another location in the view. As you drag the filter, a line
appears that shows you where you can drop the filter to move it.
Note: The Get Started tutorial uses the default position of the filter cards.

Add color to your view


Adding filters helps you to sort through all of this data—but wow, that’s a lot of
blue! It's time to do something about that.
Currently, you are looking at sales totals for your various products. You can see that
some products have consistently low sales, and some products might be good
candidates for reducing sales efforts for those product lines. But what does overall
profitability look like for your different products? Drag Profit to color to see what
happens.
From the Data pane, drag Profit to Color on the Marks card.
By dragging profit to color, you now see that you have negative profit in Tables,
Bookcases, and even Machines. Another insight is revealed!

Note: Tableau automatically added a color legend and assigned a diverging color
palette because your data includes both negative and positive values.


Find key insights


As you've learned, you can explore your data as you build views with Tableau
Desktop. Adding filters and colors helps you better visualize your data and identify problems right away.
The next step is to interact with your view so that you can begin drawing
conclusions.
Looking at your view, you saw that you had some unprofitable products, but now
you want to see if these products have been unprofitable year over year.
It's time to use your filters to take a closer look.
1. In the view, in the Sub-Category filter card, clear all of the check boxes
except Bookcases, Machines, and Tables.


Now you can see that, in some years, Bookcases and Machines were profitable.
However, recently Machines are unprofitable. While you've made an important
discovery, you want to gather more information before proposing any action items
to your boss.
On a hunch, you decide to break up your view by region:
2. Select All in the Sub-Category filter card to show all sub-categories again.
3. From the Data pane, drag Region to the Rows shelf and place it to the left of
Sum(Sales).
Tableau creates a view with multiple axes broken down by region.


Now you see sales and profitability by product for each region. By adding region to
the view and filtering the Sub-Category for Machines only, you notice that machines
in the South are reporting a higher negative profit overall than in your other regions.
You've discovered a hidden insight!


This view best encapsulates your work so far. Select All in the Sub-Category filter
card (if you changed your filter) to show all sub-categories again, name the
worksheet, and add a title.
4. At the bottom-left of the workspace, double-click Sheet 1 and type Sales by
Product/Region.
You choose to focus your analysis on the South, but you don't want to lose the view
you've created. In Tableau Desktop, you can duplicate your worksheet to continue
where you left off.
5. In your workbook, right-click the Sales by Product/Region sheet and select
Duplicate.
6. Rename the duplicated sheet to Sales in the South.
7. In your new worksheet, from the Data pane, drag Region to the Filters shelf
to add it as a filter in the view.
8. In the Filter Region dialog box, clear all check boxes except South and then
click OK.


Your view updates to look like the image below.

Now you can focus on sales and profit in the South. You immediately see that
machine sales had negative profit in 2018 and again in 2021. This is definitely
something to investigate!
9. Save your work by selecting File > Save As. Give your workbook a name,
such as Regional Sales and Profits.


Step summary
In this step you used filter and color to make working with your data a bit easier.
You also learned about a few fun features that Tableau offers to help you answer key
questions about your data. You learned how to:
• Apply filters and color to make it easier to focus on the areas of your data that
interest you the most.
• Interact with your chart using the tools that Tableau provides.
• Duplicate worksheets and save your changes to continue exploring your data
in different ways without losing your work.
Having explored your data in Tableau Desktop, you've identified areas of concern.
You know sales and profit in the South are a problem, but you don't yet have a
solution. You want to look at other factors that might be contributing to these results.
Next, you'll leverage another key feature in Tableau to work with geographic data.


Step 4: Explore your data geographically


You've built a great view that allows you to review sales and profits by product over
several years. And after looking at product sales and profitability in the South, you
decide to look for trends or patterns in that region.
Because you're looking at geographic data (the Region field), you have the option to
build a map view. Map views are great for displaying and analyzing this kind of
information. Plus, they're just cool!
For this example, Tableau has already assigned the proper geographic roles to the
Country, State, City, and Postal Code fields. That's because it recognized that each
of those fields contained geographic data. You can get to work creating your map
view right away.

Build a map view


Start fresh with a new worksheet.
1. Click the New worksheet icon at the bottom of the workspace.

Tableau keeps your previous worksheet and creates a new one so that you can
continue exploring your data without losing your work.
2. In the Data pane, double-click State to add it to Detail on the Marks card.
Now you’ve got a map view!


Because Tableau already knows that state names are geographic data and because
the State dimension is assigned the State/Province geographic role, Tableau
automatically creates a map view.
There is a mark for each of the 48 contiguous states in your data source. (Sadly,
Alaska and Hawaii aren't included in your data source, so they are not mapped.)
Notice that the Country field is also added to the view. This happens because the
geographic fields in Sample - Superstore are part of a hierarchy. Each level in the
hierarchy is added as a level of detail.
Additionally, Latitude and Longitude fields are added to the Columns and Rows
shelves. You can think of these as X and Y fields. They're essential any time you
want to create a map view, because each location in your data is assigned a latitudinal
and longitudinal value. Sometimes the Latitude and Longitude fields are generated
by Tableau. Other times, you might have to manually include them in your data. You
can find resources to learn more about this in the Learning Library.
Now, having a cool map focused on 48 states is one thing, but you wanted to see
what was happening in the South, remember?
3. Drag Region to the Filters shelf, and then filter down to the South only. The
map view zooms in to the South region, and there is a mark for each state (11
total).


Now you want to see more detailed data for this region, so you start to drag other
fields to the Marks card:
4. Drag the Sales measure to Color on the Marks card.

The view automatically updates to a filled map, and colors each state based on its
total sales. Because you're exploring product sales, you want your sales to appear in
USD. Click the Sum(Sales) field on the Columns shelf, and select Format. For
Numbers, select Currency.
Any time you add a continuous measure that contains positive numbers (like Sales)
to Color on the Marks card, your filled map is colored blue. Negative values are
assigned orange.
Sometimes you might not want your map to be blue. Maybe you prefer green, or
your data isn’t something that should be represented with the color blue, like
wildfires or traffic jams. That would just be confusing!
No need to worry, you can change the color palette just like you did before.
5. Click Color on the Marks card and select Edit Colors.
For this example, you want to see which states are doing well, and which states are
doing poorly in sales.
6. In the Palette drop-down list, select Red-Green Diverging and click OK. This
allows you to see quickly the low performers and the high performers.
Your view updates to look like this:


But wait. Everything just went red! What happened?


The data is accurate, and technically you can compare low performers with high
performers, but is that really the whole story?
Are sales in some of those states really that terrible, or are there just more people in
Florida who want to buy your products? Maybe you have smaller or fewer stores in
the states that appear red. Or maybe there’s a higher population density in the states
that appear green, so there are just more people to buy your stuff.
Either way, there’s no way you want to show this view to your boss because you
aren't confident the data is telling a useful story.

7. Click the Undo icon in the toolbar to return to that nice, blue view.
There’s still a color problem. Everything looks dandy—that’s the problem!

At first glance, it appears that Florida is performing the best. Hovering over its mark
reveals a total of 89,474 USD in sales, as compared to South Carolina, for example,
which has only 8,482 USD in sales. However, have any of the states in the South
been profitable?


8. Drag Profit to Color on the Marks card to see if you can answer this question.

Now that’s better! Because profit often consists of both positive and negative values,
Tableau automatically selects the Orange-Blue Diverging color palette to quickly
show the states with negative profit and the states with positive profit.
It’s now clear that Tennessee, North Carolina, and Florida have negative profit, even
though it appeared they were doing okay—even great—in Sales. But why? You'll
answer that in the next step.


Step 5: Drill down into the details


In the last step you discovered that Tennessee, North Carolina, and Florida have
negative profit. To find out why, you decide to drill down even further and focus on
what's happening in those three states alone.

Pick up where your map view left off


As you saw in the last step, maps are great for visualizing your data broadly. A bar
chart will help you get into the nitty-gritty. To do this, you create another worksheet.
1. Double-click Sheet 3 and name the worksheet Profit Map.
2. Right-click Profit Map at the bottom of the workspace and select Duplicate.
Name the new sheet Negative Profit Bar Chart.
3. In the Negative Profit Bar Chart worksheet, click Show Me, and then
select horizontal bars.
Show Me highlights different chart types based on the data you've added to your
view.
Note: At any time, you can click Show Me again to collapse it.


You now have a bar chart again—just like that.


4. To select multiple bars on the left, click and drag your cursor across the bars
between Tennessee, North Carolina, and Florida. On the tooltip that appears,
select Keep Only to focus on those three states.
Note: You can also right-click one of the highlighted bars, and select Keep Only.
Notice that an Inclusions field for State is added to the Filters shelf to indicate that
certain states are filtered from the view. The icon with two circles on the field
indicates that this field is a set. You can edit this field by right-clicking the field on
the Filters shelf and selecting, Edit Filter.
Now you want to look at the data for the cities in these states.
5. On the Rows shelf, click the plus icon on the State field to drill down to the
City level of detail.
There’s almost too much information here, so you decide to filter the view down to
the cities with the most negative profit by using a Top N Filter.


Create a Top N Filter


You can use a Top N Filter in Tableau Desktop to limit the number of marks
displayed in your view. In this case, you want to use the Top N Filter to hone in on
poor performers.
1. From the Data pane, drag City to the Filters shelf.
2. In the Filter dialog box, select the Top tab, and then do the following:
a. Click By field.
b. Click the Top drop-down and select Bottom to reveal the poorest
performers.
c. Type 5 in the text box to show the bottom 5 performers in your data set.
Tableau Desktop has already selected a field (Profit) and aggregation (Sum) for the
Top N Filter based on the fields in your view. These settings ensure that your view
will display only the five poorest performing cities by sum of profit.

d. Click OK.
What happened to the bar chart, and why is it blank? That's a great question, and a
great opportunity to introduce the Tableau Order of Operations.
The Tableau Order of Operations, also known as the query pipeline, is the order that
Tableau performs various actions, such as the order in which it applies your filters
to the view.
Tableau applies filters in the following order:
1. Extract Filters
2. Data Source Filters
3. Context Filters
4. Top N Filters
5. Dimension Filters
6. Measure Filters
The order that you create filters in, or arrange them on the Filters shelf, doesn't
change the order in which Tableau applies those filters to your view.
The good news is you can tell Tableau to change this order when you notice
something strange happening with the filters in your view. In this example, the Top
N Filter is applied to the five poorest performing cities by sum of profit for the whole
map, but none of those cities are in the South, so the chart is blank.
To fix the chart, add a filter to context. This tells Tableau to filter that field first,
regardless of where it falls on the order of operations.
But which field do you add to context? There are three fields on the Filters shelf:
Region (a dimension filter), City (a top N filter), and Inclusions (Country, State)
(Country, State) (a set).
If you look at the order of operations again, you know that the set and the top N filter
are being applied before the dimension filter. But do you know if the top N filter or
the set filter is being applied first? Let's find out.
3. On the Filters shelf, right-click the City field and select Add to Context.
The City field turns gray and moves to the top of the Filters shelf, but nothing
changes in the view. So even though you're forcing Tableau to filter City first, the
issue isn't resolved.
4. Click Undo.
5. On the Filters shelf, right-click the Inclusions (Country, State) (Country,
State) set and select Add to Context.
The Inclusions (Country, State) (Country, State) set turns gray and moves to the top
of the Filters shelf. And bars have returned to your view!
You're on to something! But there are six cities in the view, including Jacksonville,
North Carolina, which has a positive profit. Why would a city with a positive profit
show up in the view when you created a filter that was supposed to filter out
profitable cities?


Jacksonville, North Carolina is included because City is the lowest level of detail
shown in the view. For Tableau Desktop to know the difference between
Jacksonville, North Carolina, and Jacksonville, Florida, you need to drill down to
the next level of detail in the location hierarchy, which, in this case, is Postal Code.
After you add Postal Code, you can exclude Jacksonville in North Carolina without
also excluding Jacksonville in Florida.
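The same ambiguity appears in any tool that aggregates by city name alone. A small pandas sketch with hypothetical rows (cities, codes, and profits invented for illustration):

```python
import pandas as pd

# Hypothetical rows for two distinct cities that share one name.
sales = pd.DataFrame({
    "City":  ["Jacksonville", "Jacksonville", "Jacksonville"],
    "State": ["Florida", "Florida", "North Carolina"],
    "Postal Code": ["32099", "32099", "28540"],
    "Profit": [-500, -300, 200],
})

# Grouping by City alone conflates the two places into one total.
print(sales.groupby("City")["Profit"].sum())  # single misleading row

# Drilling down -- adding State (or Postal Code) -- separates them,
by_place = sales.groupby(["City", "State"])["Profit"].sum()
print(by_place)

# so one postal code can be excluded without touching the other city.
still_shown = sales[sales["Postal Code"] != "28540"]
```

That is exactly what the drill-down to Postal Code accomplishes in the view.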
6. On the Rows shelf, click the plus icon on City to drill down to the Postal Code
level of detail.
7. Right-click the postal code for Jacksonville, North Carolina, 28540, and then
select Exclude.
Postal Code is added to the Filters shelf to indicate that certain members in the Postal
Code field have been filtered from the view. Even when you remove the Postal Code
field from the view, the filter remains.
8. Drag Postal Code off the Rows shelf.
Your view updates to look like this:


Now that you've focused your view to the least profitable cities, you can investigate
further to identify the products responsible.

Identify the troublemakers


You decide to break up the view by Sub-Category to identify the products dragging
profit down. You know that the Sub-Category field contains information about
products sold by location, so you start there.
1. Drag Sub-Category to the Rows shelf, and place it to the right of City.
2. Drag Profit to Color on the Marks card to make it easier to see which
products have negative profit.
3. In the Data pane, right-click Order Date and select Show Filter.
You can now explore negative profits for each year if you want, and quickly spot the
products that are losing money.
Machines, tables, and binders don’t seem to be doing well. So what if you stop
selling those items in Jacksonville, Concord, Burlington, Knoxville, and Memphis?
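If you were doing this breakdown in code instead, a pivot of profit by city and sub-category surfaces the same troublemakers. A pandas sketch with invented numbers:

```python
import pandas as pd

# Hypothetical line items for two of the low-profit cities.
items = pd.DataFrame({
    "City":         ["Burlington", "Burlington", "Knoxville", "Knoxville"],
    "Sub-Category": ["Machines", "Paper", "Tables", "Paper"],
    "Profit":       [-3900, 150, -700, 90],
})

# Pivot profit by city and sub-category; the negative cells are the
# products dragging each city down (NaN = no sales of that item).
pivot = items.pivot_table(index="City", columns="Sub-Category",
                          values="Profit", aggfunc="sum")
print(pivot)

# Or list only the losing (city, sub-category) pairs.
losses = items.groupby(["City", "Sub-Category"])["Profit"].sum()
print(losses[losses < 0])
```

The color encoding on the Marks card plays the same role as scanning for negative cells here.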


Verify your findings


Will eliminating binders, machines, and tables improve profits in Florida, North
Carolina, and Tennessee? To find out, you can filter out the problem products to see
what happens.
1. Go back to your map view by clicking the Profit Map sheet tab.
2. In the Data pane, right-click Sub-Category and select Show Filter.
A filter card for all of the products you offer appears next to the map view. You'll
use this filter later.
3. From the Data pane, drag Profit and Profit Ratio to Label on the Marks card.
To format the Profit Ratio as a percentage, right-click Profit Ratio, and select
Format. Then, for Default Numbers, choose Percentage and set the number
of decimal places you want displayed on the map. For this map, we'll choose
zero decimal places.
Now you can see the exact profit of each state without having to hover your cursor
over them.
4. In the Data pane, right-click Order Date and select Show Filter to provide
some context for the view.


A filter card for YEAR(Order Date) appears in the view. You can now view profit
for all years or for a combination of years. This might be useful for your presentation.
5. Clear Binders, Machines, and Tables from the list on the Sub-Category
filter card in the view.
Recall that adding filters to your view lets you include and exclude values to
highlight certain parts of your data.
As you clear each member, the profit for Tennessee, North Carolina, and Florida
improves, until finally each has a positive profit.
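The Profit Ratio label you formatted in step 3 is simply profit divided by sales. A pandas sketch of the same calculation and zero-decimal percentage formatting, using hypothetical state totals rather than the real Superstore figures:

```python
import pandas as pd

# Hypothetical state totals -- invented for illustration.
states = pd.DataFrame({
    "State":  ["Florida", "Tennessee"],
    "Sales":  [89474, 30662],
    "Profit": [-3399, -5342],
})

# Profit ratio = profit / sales, labeled as a percentage with
# zero decimal places, matching the map's label format.
states["Profit Ratio"] = states["Profit"] / states["Sales"]
states["Label"] = states["Profit Ratio"].map(lambda r: f"{r:.0%}")
print(states[["State", "Label"]])
```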

Hey, you made an interesting discovery!


Binders, machines, and tables are definitely responsible for the losses in Tennessee,
North Carolina, and Florida, but not for the rest of the South. Do you notice how
profit actually decreases for some of the other states as you clear items from the filter
card? For example, if you toggle Binders on the Sub-Category filter card, profit
drops by four percent in Arkansas. You can deduce that Binders are actually
profitable in Arkansas.
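This what-if can be reproduced in code by recomputing each state's profit with and without a sub-category. A pandas sketch with invented figures, where binders happen to be profitable in one state:

```python
import pandas as pd

# Hypothetical line items across two Southern states.
items = pd.DataFrame({
    "State":        ["Tennessee", "Tennessee", "Arkansas", "Arkansas"],
    "Sub-Category": ["Machines", "Paper", "Binders", "Paper"],
    "Profit":       [-2000, 500, 400, 600],
})

before = items.groupby("State")["Profit"].sum()
after = (items[items["Sub-Category"] != "Binders"]
         .groupby("State")["Profit"].sum())

# Tennessee is unchanged, but Arkansas falls from 1000 to 600:
# binders are profitable there, so clearing them hurts that state.
print(before)
print(after)
```

Comparing the before and after totals state by state is exactly what toggling the filter card does visually.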
You want to share this discovery with the team by walking them through the same
steps you took.


6. Select (All) on the Sub-Category filter card to include all products again.

Now you know that machines, tables, and binders are problematic products for your
company. In focusing on the South, you see that these products have varying impacts
on profit. This might be a worthwhile conversation to have with your boss.
Next, you'll assemble the work you've done so far in a dashboard so that you can
clearly present your findings.
More on working with maps and geographic roles in the Learning Library (in the top
menu).

Step 6: Build a dashboard to show your insights


You’ve created four worksheets, and they're communicating important information
that your boss needs to know. Now you need a way to show the negative profits in
Tennessee, North Carolina, and Florida and explain some of the reasons why profits
are low.
To do this, you can use dashboards to display multiple worksheets at once, and—if
you want—make them interact with one another.


Set up your dashboard


You want to emphasize that certain items sold in certain places are doing poorly.
Your bar graph view of profit and your map view demonstrate this point nicely.
1. Click the New dashboard button.

2. In the Dashboard pane on the left, you'll see the sheets that you created.
Drag Sales in the South to your empty dashboard.
3. Drag Profit Map to your dashboard, and drop it on top of the Sales in the
South view.
Your view will update to look like this:

Now you can see both views at once!


But sadly, the bar chart is a bit squished, which isn’t helping your boss understand
your data.


Arrange your dashboard


It's not easy to see details for each item under Sub-Category from your Sales in the
South bar chart. Also, because we have the map in view, we probably don't need the
South region column in Sales in the South, either.
Resolving these issues will give you more room to communicate the information
you need.
1. On Sales in the South, right-click in the column area under the
Region column header, and clear Show header.

2. Repeat this process for the Category row header.


You've now hidden unnecessary columns and rows from your dashboard while
preserving the breakdown of your data. The extra space makes it easier to see data
on your dashboard, but let's freshen things up even more.
3. Right-click the Profit Map title and select Hide Title.
The title Profit Map is hidden from the dashboard and even more space is created.
4. Repeat this step for the Sales in the South view title.
5. Select the first Sub-Category filter card on the right side of your view, and
at the top of the card, click the Remove icon.
6. Repeat this step for the second Sub-Category filter card and one of the Year
of Order Date filter cards.
7. Click on the Profit color legend and drag it from the right to below Sales in
the South.
8. Finally, select the remaining Year of Order Date filter, click its drop-down
arrow, and then select Floating. Move it to the white space in the map view.
In this example, it is placed just off the East Coast, in the Atlantic Ocean.
Try selecting different years on the Year of Order Date filter. Your data is quickly
filtered to show that state performance varies year by year. That's nice, but it could
be made even easier to compare.
9. Click the drop-down arrow at the top of the Year of Order Date filter, and
select Single Value (Slider).
Your view updates to look like this:


Your dashboard is looking really good! Now you can easily compare profit and
sales by year. But that's not so different from a couple of pictures in a
presentation, and you're using Tableau! Let's make your dashboard more engaging.


Add interactivity
Wouldn't it be great if you could view which sub-categories are profitable in specific
states?

1. Select Profit Map in the dashboard, and click the Use as filter icon in the
upper right corner.
2. Select a state within the Southern region of the map.
The Sales in the South bar chart automatically updates to show just the sub-category
sales in the selected state. You can quickly see which sub-categories are profitable.
3. Click an area of the map other than the colored Southern states to clear your
selection.
You also want viewers to be able to see the change in profits based on the order date.
4. Select the Year of Order Date filter, click its drop-down arrow, and
select Apply to Worksheets > Selected Worksheets.


5. In the Apply Filter to Worksheets dialog box, select All in dashboard, and
then click OK.
This option tells Tableau to apply the filter to all worksheets in the dashboard that
use this same data source.
Explore state performance by year with your new, interactive dashboard!
Here, we filter Sales in the South to only items sold in North Carolina, and then
explore year by year profit.
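Under the hood, "Use as filter" simply applies your map selection as an extra predicate on the bar chart's data, alongside the shared year filter. A minimal pandas sketch of that behavior, with hypothetical orders:

```python
import pandas as pd

# Hypothetical orders; states, items, and profits are made up.
orders = pd.DataFrame({
    "State":        ["North Carolina", "North Carolina", "Florida"],
    "Sub-Category": ["Machines", "Binders", "Tables"],
    "Year":         [2021, 2020, 2021],
    "Profit":       [-3900, 250, -880],
})

def bar_chart(selected_state=None, year=None):
    """Recompute the bar chart under the map selection and year
    slider, the way 'Use as filter' and the shared filter do."""
    view = orders
    if selected_state is not None:
        view = view[view["State"] == selected_state]
    if year is not None:
        view = view[view["Year"] == year]
    return view.groupby("Sub-Category")["Profit"].sum()

print(bar_chart())                        # all states, all years
print(bar_chart("North Carolina", 2021))  # just the one losing item
```

Every click on the map amounts to calling the view again with a new predicate.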

Rename and go
You show your boss your dashboard, and she loves it. She's named it "Regional Sales
and Profit," and you do the same by double-clicking the Dashboard 1 tab and typing
Regional Sales and Profit.
In her investigations, your boss also finds that the decision to introduce machines in
the North Carolina market in 2021 was a bad idea.


Your boss is glad she has this dashboard to explore, but she also wants you to present
a clear action plan to the larger team. She asks you to create a presentation with your
findings.
Good thing you know about stories in Tableau.

Step 7: Build a story to present


You want to share your findings with the larger team. Together, your team might
reevaluate selling machines in North Carolina.
Instead of having to guess which key insights your team is interested in and including
them in a presentation, you decide to create a story in Tableau. This way, you can
walk viewers through your data discovery process, and you have the option to
interactively explore your data to answer any questions that come up during your
presentation.

Create your first story point


For the presentation, you'll start with an overview.
1. Click the New story button.

You're presented with a blank workspace that reads, "Drag a sheet here." This is
where you'll create your first story point.
Blank stories look a lot like blank dashboards. And like a dashboard, you can drag
worksheets over to present them. You can also drag dashboards over to present them
in your story.
2. From the Story pane on the left, drag the Sales in the South worksheet onto
your view.
3. Add a caption—maybe "Sales and profit by year"—by editing the text in the
gray box above the worksheet.


This story point is a useful way to acquaint viewers with your data.
But you want to tell a story about selling machines in North Carolina, so let's focus
on that data.

Highlight machine sales


To bring machines into the picture, you can leverage the Sub-Category filter
included in your Sales in the South bar chart.
1. In the Story pane, click Duplicate to duplicate the first caption.
Continue working where you left off, but know that your first story point will be
exactly as you left it.
2. Since you know you’re telling a story about machines, on the Sub-Category
filter, clear the selection for (All), then select Machines.
Now your viewers can quickly identify the sales and profit of machines by year.


3. Add a caption to underscore what your viewers see, for example, "Machine
sales and profit by year."

You've successfully shifted the focus to machines, but you realize that something
seems odd: in this view, you can't single out which state is contributing to the loss.
You'll address this in your next story point by introducing your map.


Make your point


The bottom line is that machines in North Carolina lose money for your company.
You discovered that in the dashboard you created. Looking at overall sales and profit
by year doesn't demonstrate this point alone, but regional profit can.
1. In the Story pane, select Blank. Then, drag your dashboard Regional Sales
and Profit onto the canvas.
This gives viewers a new perspective on your data: Negative profit catches the eye.


2. Add a caption like, "Underperforming items in the South."


To narrow your results to just North Carolina, start with a duplicate story point.
1. Select Duplicate to create another story point with your Regional Profit
dashboard.
2. Select North Carolina on the map and notice that the bar chart automatically
updates.
3. Select All on the Year of Order Date filter card.
4. Add a caption, for example, "Profit in North Carolina, 2018-2021."
Now you can walk viewers through profit changes by year in North Carolina. To do
this, you will create four story points:
1. Select Duplicate to begin with your Regional Profit dashboard focused on
North Carolina.
2. On the Year of Order Date filter, click the right arrow button so
that 2018 appears.


3. Add a caption, for example, "Profit in North Carolina, 2018," and then
click Duplicate.
4. Repeat steps 2 and 3 for years 2019, 2020, and 2021.
Now viewers will have an idea of which products were introduced to the North
Carolina market when, and how poorly they performed.

Finishing touches
On this story point that focuses on data from 2021, you want to describe your
findings. Let's add more detail than just a caption.
1. In the left pane, select Drag to add text and drag it onto your view.
2. Enter a description for your dashboard that emphasizes the poor performance
of machines in North Carolina, for example, "Introducing machines to the
North Carolina market in 2021 resulted in losing a significant amount of
money."
For dramatic effect, you can hover over Machines on the Sales in the South bar chart
while presenting to show a useful tooltip: the loss of nearly $4,000.


And now, for the final slide, you drill down into the details.
3. In the Story pane, click Blank.
4. From the Story pane, drag Negative Profit Bar Chart to the view.
5. In the Year of Order Date filter card, narrow the view down to 2021 only.
You can now easily see that the loss of machine profits was solely from Burlington,
North Carolina.
6. In the view, right-click the Burlington mark (the bar) and
select Annotate > Mark.
7. In the Edit Annotation dialog box that appears, delete the filler text and type:
"Machines in Burlington lost nearly $4,000 in 2021."
8. Click OK.
9. In the view, click the annotation and drag it to adjust where it appears.
10.Give this story point the caption: "Where are we losing machine profits in
North Carolina?"
11. Double-click the Story 1 tab and rename your story to "Improve Profits in the
South".


12. Review your story by selecting Window > Presentation mode.

After you present


Your presentation went very well. The team is convinced that there is work to be
done to increase profit in Burlington, North Carolina. And, they're curious to know
why machines did so poorly in the first place. Your boss is thrilled—not only have
you identified a way to address negative profit, you've got the team asking questions
about their data.
To keep the lessons fresh in their minds, your boss asks you to email your team a
document with your findings. It's a good thing that you know about sharing your
visualizations with Tableau Server and Tableau Public.

Step 8: Share your findings


You've done a bunch of work—great work—to learn that Burlington, North Carolina
needs some fine tuning. Let's share this information with your teammates.
Before you continue, select an option below:


• If you or your company does not use Tableau Server, or if you want to learn
about a free, alternative sharing option, jump to Use Tableau Public.
• If you or your company uses Tableau Server, and you are familiar with what
permissions are assigned to you, jump to Use Tableau Server.
Use Tableau Public
Your story was a hit. You're going to publish it to Tableau Public so that your team
can view it online.
Note: When you publish to Tableau Public, as the name suggests, these views are
publicly accessible. This means that you share your views as well as your underlying
data with anyone with access to the internet. When sharing confidential information,
consider Tableau Server or Tableau Online.
1. Select Server > Tableau Public > Save to Tableau Public.
2. Enter your Tableau Public credentials in the dialog box.

If you don't have a Tableau Public profile, click Create one now for free and follow
the prompts.


3. If you see this dialog box, open the Data Source page. Then in the top-right
corner, change the Connection type from Live to Extract.

4. For the second (and last) time, select Server > Tableau Public > Save to
Tableau Public.
5. When your browser opens, review your embedded story. It will look like this:

6. Click Edit Details to update the title of your viz, add a description, and more.
7. Click Save.
Your story is now live on the web.
8. To share with colleagues, click Share at the bottom of your viz.


9. How do you want to share your story?


a. Embed on your website: Copy the Embed Code and paste it in your web
page HTML.
b. Send a link: Copy the Link and send the link to your colleagues.
c. Send an email using your default email client by clicking the email icon.
d. Share on Twitter or Facebook by clicking the appropriate icon.
Use Tableau Server
Your story was a hit. You're going to publish it to Tableau Server so that your team
can view it online.

Publish to Tableau Server

1. Select Server > Publish Workbook or click Share on the toolbar.


2. Enter the name of the server (or IP address) that you want to connect to in the
dialog box and click Connect.

3. In the Name field, enter Improve Profits in the South.


4. If you want, enter a description for reference, for example "Take a look at the
story I built in Tableau Desktop!"
5. Under Sheets, click Edit, and then clear all sheets except Improve Profits in
the South.

6. Click Publish.
Tableau Server opens in your internet browser. If prompted, enter your server
credentials.
The Publishing Complete dialog box lets you know that your story is ready to view.

Great work! You've successfully published your story using Tableau Server.


Send a link to your work

Let's share your work with your teammates so that they can interact with your story
online.
1. In Tableau Server, navigate to the Improve Profits in the South story that you
published. You will see a screen like this:

If you had published additional sheets from your workbook, they would be listed
alongside Improve Profits in the South.
2. Click Improve Profits in the South.
Your screen will update to look like this:


Awesome! This is your interactive, embedded story.


3. From the menu, select Share.

4. How do you want to share your story?


a. Embed on your website by copying the Embed Code and pasting it in
your web page HTML.
b. Send a link by copying the Link and sending the link to your
colleagues.
c. Send an email by using your default email client: Click the email icon.


Congratulations, you did it!

You used Tableau Desktop to create a view of your product data, map the product
sales and profitability by region, build a dashboard around your findings, tell a story
to present, and share your findings on the web so that remote team members can take
a look.
You're a data rockstar.
Well done! You successfully practiced the Tableau "Data Discovery" method:
• Ask a question
• Gather data
• Structure the data
• Explore the data
• Share insights
That's the basic workflow you'll follow when you work in Tableau, although you
might find yourself doing a lot more revising in each stage than you did here. For
example, it might take a few revisions to refine your initial question from something
general (what's going on with sales?) to something specific (which city in the South
is responsible for negative profit?).
And your revisions might take you in unexpected directions. That’s great, and that's
what we hope will happen—we hope that you'll discover opportunities you didn't
know existed when you first looked at your data.
If you're ready to jump in and start working with your data, then go forth and explore
your data in Tableau! But if you want more information first, check out the Learning
Library.
