Asking good questions…

Every day there are millions of opportunities to improve people's lives by making better use of data. Are you interested
in disease research, education patterns, industrial efficiency, patient care, or government spending? The opportunities
are endless!

To begin making better use of data, consider an important aspect of data literacy: exploration. The key to successful data
exploration is formulating good questions.

Asking why…
When you begin to explore the answer to any question, you will likely find yourself asking follow-up questions. Usually it
is not enough to stop after answering your original question. In the example above, if it turns out that health outcomes
among people with chronic illnesses in the United States vary with dog ownership, the next question to ask is "why."

The "5 Whys" technique, originally developed by Sakichi Toyoda, founder of Toyota Industries, proposes asking "why" of a
problem that has been identified, and then continuing to ask "why" of each answer or explanation given. While the
primary goal of the technique is to determine the root cause of a defect so that it can be fixed, it can be used to dig
down into the causes of any outcome.

The prominent information technologist Stephen Few identified a list of traits that help
people to work effectively with data, traits that he calls "aptitudes and attitudes."

Interest, Curiosity, and Imagination –

Are you genuinely passionate about a topic? Does this passion engage your mind? These are examples of INTEREST.

Do you enjoy figuring things out, wonder how things work, and crave information? These are examples of CURIOSITY.

Do you have a knack for coming up with new things to try, over and over, until you achieve your goal? This is an example
of IMAGINATION.

Self-Motivation –

Are you driven to explore and to understand? Do you pursue new questions without hesitation? Do you go beyond the
scope of what's expected without waiting to be told what to do? These are examples of SELF-MOTIVATION.

Open-Mindedness and Flexibility –

Are you willing to accept what you find? Are you willing to admit when you're wrong? These are examples of OPEN-
MINDEDNESS AND FLEXIBILITY.

Awareness of What's Worthwhile, Pattern Spotting, and Healthy Skepticism –

Do you know how to prioritize questions so that you can tell a worthwhile pursuit from one that
would require a great deal of time but yield few results? This is an example of an AWARENESS of what's worthwhile.

Can you spot patterns that are meaningful and ignore ones that are not? This is an example of PATTERN SPOTTING.

Even when you're confident about an answer or a result, do you look at it from a different perspective to possibly learn
something new? This is an example of having HEALTHY SKEPTICISM.

Methodicalness –

Do you learn an efficient set of steps and then repeat them according to a proven method? This is an example of being
METHODICAL.

Ability to Analyze and Ability to Synthesize –


Are you able to examine something complex, and recognize its many parts and how these parts interact to form a
whole? That's an example of an ABILITY TO ANALYZE.

Are you able to see "the big picture" by putting a collection of parts together to form a whole? That's an example of an
ABILITY TO SYNTHESIZE.

Familiarity with the data –

Even if you're new to working with data, do you make sure you become familiar with the facts so you don't jump to
conclusions that may be inaccurate? That's an example of developing FAMILIARITY WITH THE DATA.

Skill in The Practices of Data Analysis –

The skills of data analysis can be developed through training, experience, and lots of practice.

Data fundamentals…
You know that data literacy is the ability to explore, understand, and communicate with data. But what exactly is "data"?

A collection of data is a collection of facts. More specifically, consider this expanded definition, which Jeffrey Leek, a data
scientist working as a professor at the Johns Hopkins Bloomberg School of Public Health, adapted
from Wikipedia:

"Data is comprised of values of qualitative or quantitative variables, belonging to a set of items."

Set of items –

Sometimes called the population, this is the group of objects you are interested in.

Variable –

A measurement, property, or characteristic of an item that may vary or change. (This is opposed to a constant
measurement, such as pi, that does not vary.)

Qualitative variable –

A qualitative variable describes qualities or characteristics, such as country of origin, gender, name, or hair color.

Quantitative variable –

A quantitative variable addresses measurable characteristics, such as height, weight, or temperature.
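
The four terms above can be made concrete with a small sketch. The records and the helper classify_variables below are hypothetical and are not part of Leek's definition. The sketch assumes that a variable whose values are all numbers is quantitative and that anything else is qualitative. That is a simplification, since numeric codes such as postal codes are really qualitative.

```python
# Hypothetical set of items (the "population"): three people.
people = [
    {"name": "Ana", "country": "Brazil", "height_cm": 162.0, "weight_kg": 58.5},
    {"name": "Ben", "country": "Canada", "height_cm": 180.5, "weight_kg": 82.0},
    {"name": "Chen", "country": "China", "height_cm": 171.0, "weight_kg": 69.3},
]


def classify_variables(items):
    """Label each variable as quantitative (all values numeric) or qualitative.

    Assumption for illustration only: numeric type implies a measurable quantity.
    """
    kinds = {}
    for variable in items[0]:
        values = [item[variable] for item in items]
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        kinds[variable] = "quantitative" if is_numeric else "qualitative"
    return kinds


variable_kinds = classify_variables(people)
for variable, kind in variable_kinds.items():
    print(f"{variable}: {kind}")
```

Here each dictionary is one item, each key is a variable, and the values are the data.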

How is data collected?


Data can be collected in a variety of ways, including questionnaires, interviews, observations, analysis of documents,
web scraping, and machine measurements. Data as it is received or collected is called raw data. Raw data, also
known as source data or primary data, has not been processed in any way. This means it has not been run through any
software, has had no variables manipulated or values removed, and has not been summarized in any
way. Raw data often allows the fullest possible range of analysis, since nothing has been removed or summarized.
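
The difference between raw and processed data can be sketched in a few lines. The readings, the cutoff value, and all variable names below are made up for illustration. The point is only that removing or summarizing values loses information, so the raw data is kept untouched and the processing is done on a copy.

```python
from statistics import mean

# Hypothetical raw data: temperature readings in degrees Celsius, as collected.
# A tuple is used so the raw values cannot be modified by accident.
raw_readings = (21.0, 22.0, 20.0, 35.0, 21.0)

# Processing step 1: remove readings above an assumed cutoff.
CUTOFF_C = 30.0
cleaned = [reading for reading in raw_readings if reading <= CUTOFF_C]

# Processing step 2: summarize what is left.
summary = {"count": len(cleaned), "mean_c": mean(cleaned)}

print(summary)            # the processed view of the data
print(len(raw_readings))  # the raw data still holds every reading
```

The summary alone cannot tell you that a 35.0 reading was ever recorded. Only the raw data can answer a later question such as whether that reading was an error or a real event.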
