Data Preparation

According to the Burke way, many small errors arise from the inconsistent quality of
interviewing, so it is important to prepare the data before analyzing it.
Data preparation consists of the following steps:
The first step is checking the questionnaires for completeness. These checks are usually made
while the fieldwork is still underway, but they can also be made independently. The checks are
important because a questionnaire might be incomplete, the respondent might not have understood
the questions, or it might have been answered by someone who does not qualify as a participant.
Editing is a review of the questionnaires to increase accuracy and precision. Responses may be
illegible if they have not been recorded properly. Unsatisfactory responses can be handled by
returning to the field to get better data.
Unacceptable responses can be treated in one of three ways. The first is returning the
questionnaires to the field. This approach works well for business surveys in which the sample
sizes are small and the respondents can be identified easily. If returning the questionnaires is
not feasible, the second possibility is to assign missing values to the unacceptable responses.
This approach can be used if the number of unacceptable responses is small and the variables
involved are not key variables of the research. The third approach is discarding the
unacceptable respondents. Under this approach, respondents with substandard responses are simply
dropped. It can be used when the sample size is large and the discarded respondents do not
differ in their demographic features from the respondents who gave satisfactory answers.
Coding is the assignment of a code to represent a specific response to a specific question. For
an unstructured questionnaire, codes are assigned after the questionnaire is returned, whereas
for a structured questionnaire precoding can be done. Coding of structured questions is
comparatively easy. For example, for a closed-ended question such as a simple yes-or-no
question, a single column is sufficient for coding. The coding of unstructured or open-ended
questions is comparatively complex. For such questions, the categories should be mutually
exclusive and collectively exhaustive, and the data should be coded in as much detail as
possible. A fixed-field code is a code in which the number of records for each respondent is the
same and the same data appear in the same columns for all respondents. Similarly, a codebook
contains the coding instructions and the necessary information about a set of variables.
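As a rough illustration of how a codebook drives fixed-field coding, the sketch below encodes two closed-ended questions into one record per respondent. The variable names (owns_car, satisfaction), the column positions, and the missing-value code 9 are all assumptions made up for this example, not part of any standard.

```python
# Hypothetical codebook: variable name -> column position and answer codes.
CODEBOOK = {
    "owns_car": {"column": 1, "codes": {"yes": 1, "no": 2}},
    "satisfaction": {"column": 2, "codes": {"1": 1, "2": 2, "3": 3, "4": 4, "5": 5}},
}
MISSING_CODE = 9  # assumed code for a blank or unrecognized answer


def encode_record(answers):
    """Turn one respondent's raw answers into a fixed-field record string."""
    # Sort by column so every respondent's data lands in the same position.
    fields = sorted(CODEBOOK.items(), key=lambda item: item[1]["column"])
    record = ""
    for name, spec in fields:
        raw = str(answers.get(name, "")).strip().lower()
        record += str(spec["codes"].get(raw, MISSING_CODE))
    return record


print(encode_record({"owns_car": "Yes", "satisfaction": "4"}))  # 14
print(encode_record({"owns_car": "no"}))                        # 29
```

Because every record has the same length and each variable always occupies the same column, the second respondent's unanswered question still fills its column, here with the assumed missing code.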
Transcribing involves transferring the data from the questionnaires into computers. If the
information is collected via the internet, this step is unnecessary. The data can be transcribed
in multiple ways, such as using bar codes, software programs, or voice recognition systems. Data
cleaning is a thorough and extensive check to obtain consistent and precise responses.
Consistency checks are the part of the data cleaning process that identifies data that are out
of range, logically inconsistent, or extreme. Software such as SPSS or Excel can be used to
highlight extreme values or responses. Missing responses, which arise for example when answers
are ambiguous, can be treated in different ways. One is substituting a neutral value, typically
the mean response to the variable, so that the mean of the variable remains unchanged. Another
is substituting an imputed response, in which the researcher infers a response from the
available data. Casewise deletion handles missing responses by discarding every respondent who
has any missing response. Another common method is pairwise deletion, in which respondents with
missing values are not discarded outright. Instead, each calculation uses only the respondents
with complete responses on the variables involved.
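The cleaning steps described above can be sketched in a few lines of plain Python. The data set, the variable names q1 and q2, and the valid range of 1 to 5 are invented for illustration, and None is assumed to mark a missing response. The last function shows only the single-variable case of pairwise deletion, where every respondent who answered that variable is used.

```python
from statistics import mean

# Hypothetical 5-point ratings; None marks a missing response.
responses = [
    {"id": 1, "q1": 4, "q2": 5},
    {"id": 2, "q1": 7, "q2": 3},      # 7 is outside the 1-5 range
    {"id": 3, "q1": None, "q2": 1},
    {"id": 4, "q1": 2, "q2": None},
]
VALID = range(1, 6)


def out_of_range(rows, var):
    """Consistency check: ids of respondents whose answer is not a valid code."""
    return [r["id"] for r in rows if r[var] is not None and r[var] not in VALID]


def substitute_mean(rows, var):
    """Neutral-value substitution: fill gaps with the mean of the observed values."""
    m = mean(r[var] for r in rows if r[var] is not None)
    return [m if r[var] is None else r[var] for r in rows]


def casewise_deletion(rows, variables):
    """Keep only respondents with no missing response on any listed variable."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


def available_case_mean(rows, var):
    """Pairwise-style calculation: use every respondent who answered this variable."""
    return mean(r[var] for r in rows if r[var] is not None)


print(out_of_range(responses, "q1"))                # [2]
print(substitute_mean(responses, "q2"))             # [5, 3, 1, 3]
kept = casewise_deletion(responses, ["q1", "q2"])
print([r["id"] for r in kept])                      # [1, 2]
print(mean(r["q2"] for r in kept))                  # 4
print(available_case_mean(responses, "q2"))         # 3
```

In this toy data the mean of q2 is 3 both before and after mean substitution, while casewise deletion shrinks the sample to two respondents and shifts the mean of q2 to 4. Pairwise deletion keeps all three respondents who answered q2.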
Weighting is one of three ways of statistically adjusting the data. Each respondent is assigned
a weight that reflects his or her importance relative to the other respondents. Weighting is
usually used to make the sample data more representative of the characteristics of the target
population. A respondent with a higher weight counts for more in the analysis than one with a
lower weight.
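One common way to obtain such weights, shown here only as a sketch, is to divide each group's share of the target population by its share of the sample. The groups, population shares, and ratings below are invented for the example.

```python
from collections import Counter


def group_weights(groups, population_share):
    """Weight for each group = population share / sample share."""
    counts = Counter(groups)
    n = len(groups)
    return {g: population_share[g] / (counts[g] / n) for g in counts}


def weighted_mean(values, weights):
    """Mean in which each respondent counts in proportion to his or her weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical sample: women are over-represented (75%) relative to an
# assumed population split of 50/50.
groups = ["female", "female", "female", "male"]
ratings = [4, 5, 3, 2]
population_share = {"female": 0.5, "male": 0.5}

by_group = group_weights(groups, population_share)
weights = [by_group[g] for g in groups]

print(by_group)                          # female below 1, male above 1
print(sum(ratings) / len(ratings))       # unweighted mean: 3.5
print(weighted_mean(ratings, weights))   # weighted mean, close to 3.0
```

The under-represented group receives a weight above 1, so its single respondent pulls the weighted mean toward his rating.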
Variable respecification transforms the data to create new variables or modify existing ones.
It often involves dummy variables, which take only two values, such as 0 and 1. Lastly, scale
transformation is a manipulation of scale values to ensure that they are comparable with other
scales and suitable for analysis. It is needed, for example, when some variables are measured on
a 5-point Likert scale and others on a 7-point semantic differential scale. Standardization is
the process of correcting data to reduce them to the same scale by subtracting the mean and
dividing by the standard deviation.
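Both transformations are short enough to sketch directly. The example assumes the usual convention of coding a variable with k categories as k - 1 dummy variables, with the last category as the reference, and it uses the sample standard deviation. The region names and scores are invented.

```python
from statistics import mean, stdev


def dummy_variables(values, categories):
    """Code a categorical variable as 0/1 dummies; the last category is the reference."""
    return [[1 if v == c else 0 for c in categories[:-1]] for v in values]


def standardize(values):
    """z = (x - mean) / standard deviation, using the sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


regions = ["north", "south", "west"]
print(dummy_variables(["north", "south", "west"], regions))  # [[1, 0], [0, 1], [0, 0]]

likert_5 = [1, 2, 3, 4, 5]        # hypothetical 5-point scores
semantic_7 = [1, 3, 4, 5, 7]      # hypothetical 7-point scores
z5 = standardize(likert_5)
z7 = standardize(semantic_7)
# After standardization both variables have mean 0 and standard deviation 1,
# so their values can be compared directly.
print(z5)
print(z7)
```

A respondent's standardized score then expresses how many standard deviations the answer lies above or below the average answer, regardless of how many points the original scale had.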
The choice of a data analysis strategy is based on the earlier steps of the marketing research
process, the known characteristics of the data, and the statistical techniques available. The
properties of those techniques must be considered.
Univariate techniques are suitable when there is a single measurement of each element in the
sample, or when there are several measurements of each element but each variable is analyzed in
isolation. Multivariate techniques are used when there are two or more measurements of each
element and the variables are analyzed simultaneously. Univariate techniques can be classified
by whether the data are metric or non-metric. Metric data are measured on an interval or ratio
scale, whereas non-metric data are measured on a nominal or ordinal scale.
Samples are independent if they are drawn randomly from different populations. On the other
hand, samples are paired when the data for the two samples relate to the same group of
respondents.
Multivariate techniques can be categorized as dependence or interdependence techniques. When
one or more variables can be identified as dependent variables and the remaining ones as
independent variables, a dependence technique is appropriate. Interdependence techniques make
no such distinction. They attempt to group data based on underlying similarity and so allow
interpretation of the data.
Data analysis can be conducted at the individual level, within a country, or across countries.
In individual-level analysis, the data from each respondent are analyzed separately. In
within-country analysis, the data are analyzed separately for each country. Cross-country, or
pan-cultural, analysis requires that the respondents from all countries be pooled and analyzed
together.
