
MUHAMMAD Osama Afzal

31681
Assignment 3

To identify what kind of data exists and how to retrieve it, the following steps should be
taken.
1. Understanding the objective
Ask what the problems are in the given industry and competitive market. Identify and understand
them thoroughly. Establishing this foundational knowledge will help to make better inferences
with data later on.

Before collecting data, we should start by identifying the business questions that need to be
answered to achieve organizational goals. By determining the precise questions we need to know
to inform strategy, we will be able to streamline the data collection process and avoid wasting
resources.

2. Identifying data sources.


Next, we put together the sources from which we will extract our data. We might be coordinating
information from different databases, web-driven feedback forms, and even social media.
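
As a minimal sketch of coordinating two sources, the Python example below matches records from a database table with rows from a feedback-form export. The table name, column names, and sample values are all hypothetical, and an in-memory SQLite database stands in for whatever database the organization actually uses.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a customer table in a database
# (in-memory SQLite is used here only as a stand-in).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "North"), (2, "South"), (3, "East")],
)

# Hypothetical source 2: a CSV export from a web feedback form.
feedback_csv = io.StringIO(
    "customer_id,rating\n"
    "1,4\n"
    "2,5\n"
    "4,2\n"
)

# Index the database rows by customer_id so they can be matched to feedback.
regions = dict(conn.execute("SELECT customer_id, region FROM customers"))

combined = []   # feedback rows enriched with the customer's region
unmatched = []  # feedback whose customer_id is not in the database
for row in csv.DictReader(feedback_csv):
    customer_id = int(row["customer_id"])
    if customer_id in regions:
        combined.append({
            "customer_id": customer_id,
            "region": regions[customer_id],
            "rating": int(row["rating"]),
        })
    else:
        # Kept aside for review instead of being silently dropped.
        unmatched.append(customer_id)

conn.close()
print(combined)
print(unmatched)
```

Keeping the unmatched identifiers in a separate list makes gaps between the sources visible early, before they distort later analysis.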

3. Cleaning and organizing data.

A commonly cited estimate is that about 80 percent of a data analyst’s time is devoted to cleaning
and organizing data, and only 20 percent is spent actually performing analysis. This so-called
“80/20 rule” illustrates the importance of having clean, orderly information before we can attempt
to interpret what it might mean for the organization.
The term “data cleaning” refers to the process of preparing raw data for analysis by removing or
correcting data that is incorrect, incomplete, or irrelevant. To do so, we will start by building
tables to organize and catalog what we have found. We will then create a table that catalogs each
of our variables and translates them into what they mean in the context of this particular project.
This information could include data type and other processing factors, as well.
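
The cleaning step and the variable catalog described above can be sketched as follows. This is only an illustration under stated assumptions: the records, column names, and cleaning rules (drop exact duplicates, treat empty or negative ages as missing, normalize region names) are hypothetical and would differ from project to project.

```python
# Hypothetical raw records; an empty string marks a missing value.
raw_rows = [
    {"customer_id": "1", "age": "34", "region": " North "},
    {"customer_id": "2", "age": "", "region": "south"},
    {"customer_id": "2", "age": "", "region": "south"},   # exact duplicate
    {"customer_id": "3", "age": "-5", "region": "East"},  # impossible age
]

# Variable catalog: what each column means in this (hypothetical) project.
data_dictionary = {
    "customer_id": {"type": "integer", "meaning": "Unique customer identifier"},
    "age": {"type": "integer or missing", "meaning": "Customer age in years"},
    "region": {"type": "text", "meaning": "Sales region, title case"},
}


def clean(rows):
    """Remove duplicates, correct formatting, and mark bad values as missing."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # irrelevant: an exact duplicate of an earlier record
        seen.add(key)

        age_text = row["age"].strip()
        age = int(age_text) if age_text else None  # incomplete: no age given
        if age is not None and age < 0:
            age = None  # incorrect: an age cannot be negative

        cleaned.append({
            "customer_id": int(row["customer_id"]),
            "age": age,
            "region": row["region"].strip().title(),
        })
    return cleaned


clean_rows = clean(raw_rows)
for name, info in data_dictionary.items():
    print(f"{name}: {info['type']} - {info['meaning']}")
print(clean_rows)
```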
4. Performing statistical analysis.
Once we have thoroughly cleaned the data, we can begin to analyze the information using
statistical models. We will start to build models to test our data and answer the business questions
we identified earlier in the process. Testing different models such as linear regressions, decision
trees, random forest modeling, and others can help us determine which method is best suited to
our data set.
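
To illustrate what testing different models can look like, the sketch below fits two simple models to the same training data and compares their error on held-out points. The data values are made up, and the "decision stump" is a deliberately simplified, depth-one stand-in for a decision tree written by hand for this example; a real project would normally rely on an established modeling library and would also try methods such as random forests.

```python
from statistics import mean

# Hypothetical (x, y) observations; the values are invented for illustration.
data = [(1, 3.1), (2, 4.9), (3, 7.2), (4, 9.0),
        (5, 10.8), (6, 13.1), (7, 15.0), (8, 17.1)]
test = [p for p in data if p[0] in (3, 6)]  # held out for evaluation
train = [p for p in data if p not in test]


def fit_linear(points):
    """Ordinary least squares with a single predictor."""
    x_bar = mean(x for x, _ in points)
    y_bar = mean(y for _, y in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def fit_stump(points):
    """Depth-one regression tree: one threshold, one mean on each side."""
    xs = sorted({x for x, _ in points})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in points if x <= threshold]
        right = [y for x, y in points if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        # Sum of squared errors on the training data for this split.
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def mse(model, points):
    """Mean squared error of a model on a set of points."""
    return mean((y - model(x)) ** 2 for x, y in points)


models = {
    "linear regression": fit_linear(train),
    "decision stump": fit_stump(train),
}
scores = {name: mse(model, test) for name, model in models.items()}
best_model = min(scores, key=scores.get)
print(scores)
print("Lowest held-out error:", best_model)
```

Scoring on points that were not used for fitting is the design choice that matters here: a model that merely memorizes the training data would otherwise look better than it is.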
Here, we will also need to decide how to present the information in order to answer the question
at hand. There are three different ways to demonstrate the findings:
Descriptive Information: Just the facts
Inferential Information: The facts, plus an interpretation of what those facts indicate in the context
of a particular project.
Predictive Information: An inference based upon facts and advice for further action based on our
reasoning.
Clarifying how the information will be most effectively presented will help us remain organized
when it comes time to interpret the data.
5. Drawing conclusions.

The last step in data-driven decision making is coming to a conclusion. Ask, “What new
information did we learn from the collection of statistics?” Despite pressure to discover
something entirely new, a great place to start is by asking ourselves questions to which we
already know—or think we know—the answer.
Many companies make frequent assumptions about their products or market. For example, they
might believe, “A market for this product exists,” or, “This is what our customers want.” But
before seeking out new information, first put existing assumptions to the test. Proving these
assumptions are correct will give us a foundation to work from. Alternatively, disproving these
assumptions will allow us to eliminate any false claims that have, perhaps unknowingly, been
negatively impacting the company. Keep in mind that an exceptional data-driven decision
usually generates more questions than answers.
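
One way to put an assumption such as “This is what our customers want” to the test is a simple hypothesis test on survey results. The sketch below is an assumption-laden illustration, not a prescribed method: the survey counts are invented, the claim is read as "more than half of customers want the feature", and a one-sided z-test for a proportion (normal approximation) is used with a 5 percent significance level.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical survey: 124 of 200 respondents say they want the feature.
respondents = 200
wants_feature = 124

# Assumption to test: more than half of customers want the feature.
# Null hypothesis: the true share is exactly 0.5.
null_share = 0.5
observed_share = wants_feature / respondents

# One-sided z-test for a proportion (normal approximation).
standard_error = sqrt(null_share * (1 - null_share) / respondents)
z = (observed_share - null_share) / standard_error
p_value = 1 - NormalDist().cdf(z)

significance_level = 0.05  # conventional choice, assumed for this sketch
assumption_supported = p_value < significance_level

print(f"observed share = {observed_share:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")
print("Assumption supported by the data:", assumption_supported)
```

A result that fails to support the assumption is just as useful, since it flags a belief the company may have been acting on without evidence.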
The conclusions drawn from the analysis will ultimately help the organization make more
informed decisions and drive strategy moving forward. It is important to remember, though, that
these findings can be virtually useless if they are not presented effectively. Thus, data analysts
must become skilled in the art of data storytelling to communicate their findings to key
stakeholders as clearly as possible.
