
Data Collection Planning

9skillsfactory.com/change-leadership/data-collection-planning

Why We Plan Data Collection the Way We Do


2 Jun 2020

The Data Worksheet Explains Why We Plan Data Collection the Way We Do

THE END RESULT

The end result of any data collection activity, assuming we do it correctly, is a consolidation of data that
we can study and analyse to find relationships between variables.

The layout of that data collection must match the way analysis packages such as SigmaXL and Minitab
view the data; otherwise the analysis cannot be undertaken.

Those packages will simply not be able to recognise the different data types in the worksheet.
Matching the needs of those packages is quite simple: use the top row to name the variables in the
data collection, i.e. the names of the Y and the Xs, and then place the data directly underneath those
headings.

An effectively laid out worksheet looks like this.

The Y variable (your primary metric) is in column A.

All of the different Xs are contained in columns B to I inclusive and are a mix of numerical and categorical
variables.
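
As a rough illustration of the layout described above, here is a minimal Python sketch using only the standard library. The variable names and values are hypothetical and are not taken from the original worksheet; the example also uses fewer X columns than the B to I range mentioned above, purely to keep it short.

```python
import csv
import io

# Hypothetical variable names: the Y (primary metric) first, then the Xs.
HEADER = ["cycle_time_min", "shift", "operator", "queue_length"]

# Illustrative values only; each row is one data point.
ROWS = [
    [12.5, "day", "A", 4],
    [15.0, "night", "B", 7],
    [11.0, "day", "B", 3],
]


def write_worksheet(header, rows):
    """Return CSV text with variable names in the top row and data underneath."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def read_worksheet(text):
    """Parse the CSV text into a list of dicts keyed by the column headings."""
    return list(csv.DictReader(io.StringIO(text)))


worksheet_text = write_worksheet(HEADER, ROWS)
records = read_worksheet(worksheet_text)
```

A plain CSV file laid out this way, one heading per column and one data point per row, is the general shape the article describes; how a particular package imports it is outside the scope of this sketch.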

THE DATA COLLECTION PLAN

The data collection plan that matches this assembly of data looks like this.

You'll notice the categorical Xs are listed as stratification variables because that's exactly what we will do
to study their relationship with the primary metric: stratify the data and compare results from each
grouping.
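
Stratification can be sketched in a few lines of Python. This is only an illustration under assumptions: the variable names (`cycle_time_min`, `shift`) and the values are hypothetical, and comparing group means is just one simple way to compare results from each grouping.

```python
from collections import defaultdict
from statistics import mean


def stratify(records, y_name, x_name):
    """Group the Y values by each level of one categorical X."""
    groups = defaultdict(list)
    for row in records:
        groups[row[x_name]].append(float(row[y_name]))
    return dict(groups)


def group_means(records, y_name, x_name):
    """Return the mean of Y for each level of the categorical X."""
    return {
        level: mean(values)
        for level, values in stratify(records, y_name, x_name).items()
    }


# Hypothetical data points: one Y and one categorical X per row.
records = [
    {"cycle_time_min": 12.5, "shift": "day"},
    {"cycle_time_min": 15.0, "shift": "night"},
    {"cycle_time_min": 11.0, "shift": "day"},
    {"cycle_time_min": 16.0, "shift": "night"},
]

means = group_means(records, "cycle_time_min", "shift")
```

Here `means` holds one average per level of the stratification variable, which is the side-by-side comparison the plan sets up.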

The sampling plan guides us in the number of rows of data (i.e. data points) we collect and assemble in
the data file.

Numerical variables are listed as secondary metrics, which we study in a different way from categorical
variables.

In most cases a correlation analysis is the primary strategy for looking at their relationship with the
primary metric.
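
For a numerical X, a correlation analysis might be sketched as follows. This assumes the Pearson correlation coefficient is the measure of interest, which the article does not specify; the variable names and values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical columns from the worksheet: a numerical X and the Y.
queue_length = [3, 4, 7, 8]
cycle_time_min = [11.0, 12.5, 15.0, 16.0]

r = pearson_r(queue_length, cycle_time_min)
```

A value of `r` close to +1 or -1 suggests a strong linear relationship with the primary metric, while a value near 0 suggests little linear relationship.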

KEY POINTS ABOUT DATA COLLECTION PLANNING

The key points are these:

(A) Our data collection (DC) plan is there to help us design the elements of the data worksheet.

(B) The list of variables in the DC plan - the primary metric, the categorical Xs and the numerical Xs -
determines the column headings in the data worksheet.

(C) Every time we collect a data point for the primary metric, we also collect one data point for every other
variable.

(D) The sampling plan guides us in how we collect the data and how many rows we collect.

(E) Because there can be a lot of variation in how people collect the numerical variables, we need to
operationally define what those variables are and how they must be collected.

(F) Categorical variables don't need the same level of definition as the numerical variables, because they
are observed data, which makes it easy for data collectors to be consistent in what they record.
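
Points (B), (C) and (D) lend themselves to a simple automated check. The sketch below is one possible way to do it, not a prescribed method: the function name `check_worksheet`, the variable names and the row counts are all hypothetical.

```python
def check_worksheet(plan_variables, header, rows, planned_rows):
    """Return a list of problems; an empty list means the layout matches the plan."""
    problems = []

    # (B) The plan's variable list determines the column headings.
    if list(header) != list(plan_variables):
        problems.append("column headings do not match the plan variables")

    # (C) Every data point for the Y has one value for every other variable.
    for number, row in enumerate(rows, start=1):
        if len(row) != len(header):
            problems.append(
                f"row {number}: expected {len(header)} values, found {len(row)}"
            )
        elif any(value is None or str(value).strip() == "" for value in row):
            problems.append(f"row {number}: missing value")

    # (D) The sampling plan sets how many rows are collected.
    if len(rows) < planned_rows:
        problems.append(f"only {len(rows)} of {planned_rows} planned rows collected")

    return problems


# Hypothetical plan and data.
plan = ["cycle_time_min", "shift", "queue_length"]
complete_rows = [[12.5, "day", 4], [15.0, "night", 7]]
incomplete_rows = [[12.5, "day", 4], [15.0, "", 7]]

ok = check_worksheet(plan, plan, complete_rows, planned_rows=2)
issues = check_worksheet(plan, plan, incomplete_rows, planned_rows=3)
```

Running a check like this before loading the file into an analysis package can catch missing values and short samples early.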

For more information, check the data collection planning section in Process Mastery with Lean Six Sigma
2nd Edition.


© 2019 Soarent Publishing - All Rights Reserved | PO Box 267, Ravenshoe, Qld. Australia 4888 | ABN:
89699416331 | Contact Us: support@9skillsfactory.com
