
HR Data Collection

Data collection is the process of gathering and measuring information on
targeted variables in an established system, which then enables one to
answer relevant questions and evaluate outcomes.

HR departments have a tradition of collecting vast amounts of HR data.
Unfortunately, this data often remains unused. As soon as organizations
start to analyze their people problems by using this data, they are
engaged in HR analytics.

By using HR analytics you don’t have to rely on gut feeling anymore.
Analytics enables HR professionals to make data-driven decisions.
Furthermore, analytics helps to test the effectiveness of HR policies and
different interventions.

Broadly, the data required by an HR analytics tool is classified into
internal and external data. One of the biggest challenges in data
collection is gathering data that is both relevant and of good quality.

Internal data

Internal data specifically refers to data obtained from the HR department
of an organization. The core HR system contains several data points that
can be used for an HR analytics tool. Some of the metrics that an HRIS
(human resources information system) contains include:

• Employee tenure
• Employee compensation
• Employee training records
• Performance appraisal data
• Reporting structure
• Details on high-value, high-potential employees
• Details on any disciplinary action taken against an employee

The main challenge here is that this data is sometimes disconnected and
so may not serve as a reliable measure. This is where a data scientist
can play a meaningful role: they can organize the scattered data and
create buckets of relevant data points, which can then be used for the
analytics tool.
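
As a minimal sketch of what organizing scattered data can look like, the
Python below merges rows from three disconnected sources into one profile
per employee and reports which fields are still missing. The record
layouts, field names, and employee IDs are hypothetical and chosen only
for illustration; a real HRIS export will differ.

```python
from collections import defaultdict

# Hypothetical extracts from three disconnected HR sources, keyed by employee ID.
tenure_records = [
    {"employee_id": "E001", "tenure_years": 4.5},
    {"employee_id": "E002", "tenure_years": 1.0},
]
compensation_records = [
    {"employee_id": "E001", "base_salary": 72000},
    {"employee_id": "E003", "base_salary": 58000},
]
training_records = [
    {"employee_id": "E001", "course": "Safety basics"},
    {"employee_id": "E001", "course": "Leadership 101"},
    {"employee_id": "E002", "course": "Safety basics"},
]


def consolidate(tenure, compensation, training):
    """Merge per-source rows into one profile ("bucket") per employee ID."""
    profiles = defaultdict(
        lambda: {"tenure_years": None, "base_salary": None, "courses": []}
    )
    for row in tenure:
        profiles[row["employee_id"]]["tenure_years"] = row["tenure_years"]
    for row in compensation:
        profiles[row["employee_id"]]["base_salary"] = row["base_salary"]
    for row in training:
        profiles[row["employee_id"]]["courses"].append(row["course"])
    return dict(profiles)


def missing_fields(profiles):
    """List, per employee, the single-valued fields that no source supplied."""
    gaps = {}
    for emp_id, profile in profiles.items():
        absent = [k for k in ("tenure_years", "base_salary") if profile[k] is None]
        if absent:
            gaps[emp_id] = absent
    return gaps


profiles = consolidate(tenure_records, compensation_records, training_records)
gaps = missing_fields(profiles)
```

With this sample data, `gaps` flags that E002 has no compensation row and
E003 has no tenure row, which is the kind of disconnect that makes the raw
data an unreliable measure until it is reconciled.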

External data

External data is obtained by establishing working relationships with other
departments of the organization. Data from outside the organization is
also essential, as it offers a global perspective that working with data
from within the organization cannot.

• Financial data: Organization-wide financial data is key in any HR
analysis to calculate, for instance, the revenue per employee or the
cost of hire.
• Organization-specific data: Depending on the type of organization
and its core offering (product or service), the type of data that HR
needs to supplement analytics will vary.
• Passive data from employees: Employees continually provide data
that is stored in the HRIS from the moment they are approached for
a job. Additionally, data from their social media posts and shares
and from feedback surveys can be used to guide HR data analysis.
• Historical data: Several global economic, political, or environmental
events determine patterns in employee behavior. Such data can
offer insights that limited internal data cannot.
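
The two financial metrics named above can be sketched in a few lines of
Python. The text does not define either formula, so the definitions here
are assumptions: revenue per employee as total revenue divided by
headcount, and cost of hire as internal plus external recruiting costs
divided by the number of hires, which is one common convention. All
figures are made up.

```python
def revenue_per_employee(total_revenue, headcount):
    """Total revenue divided by the number of employees."""
    if headcount <= 0:
        raise ValueError("headcount must be positive")
    return total_revenue / headcount


def cost_per_hire(internal_costs, external_costs, hires):
    """Assumed definition: all recruiting costs divided by hires made."""
    if hires <= 0:
        raise ValueError("hires must be positive")
    return (internal_costs + external_costs) / hires


# Hypothetical annual figures.
rpe = revenue_per_employee(total_revenue=12_000_000, headcount=150)  # 80000.0
cph = cost_per_hire(internal_costs=90_000, external_costs=60_000, hires=25)  # 6000.0
```

Both functions need inputs from outside HR (revenue and recruiting spend
from finance), which is why the financial data has to be obtained from
other departments.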

Data Sources

HR professionals gather data points across the organization from sources
like:

• Employee surveys
• Attendance records
• Employee reviews
• Salary and promotion history
• Employee work history
• Demographic data
• Personality data
• Recruitment process
• Employee databases

Data Collection Plans

A data collection plan is a guide that identifies goals, objectives, and
special focus areas, and lays out timelines, procedures, and best practices
for collecting data. You will need to follow a series of steps to ensure
that the data collection process is stable and reliable:

• Formulate a clear statement of the problem
• Define and list the characteristics to be measured
• Select the right measurement technique
• Construct a clear and simple data collection form
• Arrange the sampling method
• Determine who will collect the data, who will analyze and interpret
the data, and who will report the results
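
One way to keep track of these steps is to record the plan as a small
structure and check which parts are still empty. The sketch below is only
an illustration: the class name, field names, and sample values are
hypothetical and are not part of any standard.

```python
from dataclasses import dataclass, field


@dataclass
class DataCollectionPlan:
    # One field per step in the list above; empty means "not decided yet".
    problem_statement: str = ""
    characteristics: list = field(default_factory=list)
    measurement_technique: str = ""
    collection_form: str = ""
    sampling_method: str = ""
    collector: str = ""
    analyst: str = ""
    reporter: str = ""

    def missing_steps(self):
        """Return the names of plan fields that are still empty."""
        return [name for name, value in vars(self).items() if not value]


# Hypothetical, partially completed plan.
plan = DataCollectionPlan(
    problem_statement="Why did first-year attrition rise?",
    characteristics=["tenure at exit", "exit reason"],
    measurement_technique="exit survey",
    sampling_method="all leavers in the last 12 months",
    collector="HR operations",
)
todo = plan.missing_steps()
```

Here `todo` lists the form, the analyst, and the reporter as still
undecided, so collection should not start yet.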

Data Collection Methods


Some common data collection methods include:

• Check sheets – A check sheet is a structured, prepared form for
collecting and analyzing data, consisting of a list of items and some
indication of how often each item occurs. There are several types:
confirmation check sheets, for confirming whether all steps in a
process have been completed; process check sheets, for recording the
frequency of observations within a range of measurement; defect check
sheets, for recording the observed frequency of defects; and
stratified check sheets, for recording the observed frequency of
defects by defect type and one other criterion. A check sheet is easy
to use, provides a choice of observations, and is good for determining
frequency over time. It should be used to collect observable data from
a process when the collection is managed by the same person or at the
same location.
• Coded data – Coding is used when many digits must be recorded in
small blocks on a form, when long sequences of digits are captured
from a single observation, or when rounding errors appear while
recording numbers with many digits. It is also used when numeric data
represents attribute data, or when there is too little data for the
sample to be statistically significant. The types of coded data
collection are:
  • Truncation coding – Storing only 3, 2, or 9 for 1.0003, 1.0002, and
    1.0009.
  • Substitution coding – Storing fractional observations as integers,
    for example expressing 32-3/8 inches as a whole number of 1/8-inch
    units.
  • Category coding – Using a code for a category, like “S” for scratch.
  • Adding/subtracting a constant or multiplying/dividing by a factor –
    Usually used for encoding or decoding.
• Automatic measurements – A computer or other electronic equipment
gathers data without human intervention, for example monitoring the
radioactivity level in a nuclear reactor. The equipment observes and
records data for analysis and action. Technological tools for
automated data collection include video recording, self-recording
test equipment, computers with verification and cross-checking, bar
codes, magnetic strips, scanning devices, and radio-frequency
identification (RFID).
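
The check sheet and the coding schemes above can be sketched in Python as
follows. This is an illustration under stated assumptions, not a
prescribed procedure: the observations are made up, the category codes
other than "S" are hypothetical, and the substitution example assumes the
code is the total number of 1/8-inch units in the measurement
(32 × 8 + 3 = 259), which is one possible reading of the text.

```python
from collections import Counter
from decimal import Decimal
from fractions import Fraction

# Defect check sheet: tally how often each observed item occurs.
# The observations are made-up illustration data.
observations = ["scratch", "dent", "scratch", "crack", "scratch", "dent"]
check_sheet = Counter(observations)


def truncation_code(reading, base="1", places=4):
    """Keep only the digits that vary, e.g. "1.0003" -> 3 for base 1."""
    # Decimal avoids binary floating-point error in the subtraction.
    return int((Decimal(reading) - Decimal(base)) * 10 ** places)


def substitution_code(measurement, base_unit):
    """Express a fractional measurement as a whole number of base units."""
    units = Fraction(measurement) / Fraction(base_unit)
    if units.denominator != 1:
        raise ValueError("measurement is not a whole multiple of the base unit")
    return units.numerator


# Category coding: "S" for scratch comes from the text; "D" and "C" are
# hypothetical codes added for the example.
CATEGORY_CODES = {"scratch": "S", "dent": "D", "crack": "C"}

truncated = [truncation_code(r) for r in ("1.0003", "1.0002", "1.0009")]
eighths = substitution_code(32 + Fraction(3, 8), Fraction(1, 8))
coded = [CATEGORY_CODES[item] for item in observations]
```

With these inputs `truncated` is `[3, 2, 9]`, `eighths` is `259`, and
`check_sheet["scratch"]` is `3`. Readings are passed as strings so that
the truncation step works on the digits exactly as recorded.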

Guidelines before data collection

• Is there a genuine benefit to collecting this data? – There is always
the temptation to collect data just in case you need it later, or
because it may have some minor value to the company. When making
informed HR data decisions, it’s best to limit data collection to what
is truly valuable and necessary for the business to run successfully.
• Could the intended purpose of the data collection have any negative
ramifications for employees? – Consider whether employees would
be okay with this information being collected, and whether the data
could be used to negatively impact their job or opportunities at
work.
• How could this data be misused? – Many problems with collecting
employees’ personal information relate to misuse and abuse.
• Is HR allowed to collect and process this data in the locations
where employees work? – You can have the best idea to
improve an HR practice, but if the data collection is not allowed
where employees are working, it’s a non-starter. If you aren’t sure
whether you can collect a certain piece of data, check each
country’s data collection guidelines.

To be effective, data collection must yield data that is valid, reliable,
and free of bias. Only data with these characteristics is useful and
holds up to scrutiny during data analysis. Three key terms that refer to
accuracy in data collection are reliability, validity, and margin of
error.

• Reliability: Reliability refers to the consistency of the data
collection method. In general, the larger the sample is in relation to
the population, the more reliable the results are.
• Validity: Validity refers to the accuracy of the data collection
effort. The aim is to check that the chosen data collection method
truly measures what it is meant to measure; if it does, it can be
considered valid.
• Margin of error: Margin of error applies to surveys because they are
subject to some uncertainty about how well a sample represents a
population, and about the validity and reliability of the testing tool.
It is therefore important to make every effort to keep the data free of
errors, as errors may affect both reliability and validity. The error
should not be so large that it prevents valid conclusions from being
reached.
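
The text does not give a formula for margin of error. As a hedged
illustration, the sketch below uses the standard textbook formula for a
survey proportion under simple random sampling, z * sqrt(p(1 - p) / n),
with an optional finite population correction. The function name and
parameters are hypothetical; z = 1.96 corresponds to 95% confidence, and
p = 0.5 is the most conservative assumption.

```python
import math


def margin_of_error(sample_size, population_size=None, proportion=0.5, z=1.96):
    """Margin of error for a proportion estimated from a simple random sample."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if population_size is not None:
        if sample_size > population_size:
            raise ValueError("sample cannot exceed the population")
        if sample_size == population_size:
            return 0.0  # a full census has no sampling uncertainty
    moe = z * math.sqrt(proportion * (1 - proportion) / sample_size)
    if population_size is not None:
        # Finite population correction: shrinks the margin as the sample
        # becomes a larger share of the population.
        moe *= math.sqrt((population_size - sample_size) / (population_size - 1))
    return moe


# Hypothetical survey of 400 employees.
moe_large_population = margin_of_error(400)        # about 0.049
moe_small_company = margin_of_error(400, 2000)     # smaller, due to the correction
```

A sample of 400 gives a margin of roughly 4.9 percentage points, and the
same sample drawn from a company of 2,000 gives a smaller margin, which
matches the point that a larger sample relative to the population is more
reliable.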

There are two main types of error: sampling error and non-sampling error.

• Sampling error: Sampling error is statistical in nature. It arises
because only a portion of the population is surveyed rather than the
entire population, so the sample may not be fully representative.
• Non-sampling error: Non-sampling error is not statistical in nature
and is caused by human error, such as mistakes in recording or
entering data.
