Epidemiological Data Collection and Management
11-Feb-21 1
Chapter 1
Identification of lesions.
Methods of data collection…
• Observation is central to the practice of clinical veterinary medicine and
investigation).
data.
• However, there are occasions when the appropriate information is not readily
• If the procedure for a diagnostic test involves complex equipment, the person
using it must master all its aspects before the survey begins, to ensure that
an acceptable level of consistency in the measurements is obtained.
A. Errors due to variations between observers
• Many epidemiological studies are conducted with the help of enumerators, usually field
services staff
• Criteria need to be established by which a diagnosis is arrived at and adhered to by all those
engaged in the study.
• An additional problem frequently encountered is bias on the part of the observer.
• This can be avoided by using a “blind” technique, whereby the observer is kept ignorant of
the distribution of the determinant in the groups being studied.
B. Errors due to measurement
• Errors inherent in the procedures by which a variable is measured are
common in epidemiological studies.
• For example, if two weighing scales are being used in a study, one scale
may consistently give a higher reading than the other.
• Careful checking and monitoring of such apparatus before and during the
study will reduce errors of this kind.
• Further errors may occur when diagnostic tests are used to determine
the presence or absence of an infectious agent.
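The weighing-scale example can be checked with a few paired readings. The sketch below is illustrative only: the readings are invented, and `mean_offset` is a hypothetical helper, not part of any standard procedure. It assumes the same reference weights were placed on both scales.

```python
from statistics import mean

# Hypothetical paired readings (kg): the same reference weights on each scale.
scale_a = [10.0, 20.1, 30.0, 40.2, 50.1]
scale_b = [10.4, 20.6, 30.5, 40.6, 50.6]

def mean_offset(readings_a, readings_b):
    """Average of (b - a) over paired readings.

    A value consistently far from zero suggests one scale reads
    systematically higher than the other.
    """
    if len(readings_a) != len(readings_b):
        raise ValueError("readings must be paired")
    return mean(b - a for a, b in zip(readings_a, readings_b))

offset = mean_offset(scale_a, scale_b)
print(f"Scale B reads {offset:+.2f} kg relative to scale A on average")
```

Repeating such a check before and during the study gives a simple record of whether the apparatus has drifted.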
The terms used to describe the reliability of diagnostic procedures are:
Questionnaires
A questionnaire is a set of written questions, which can be structured in different ways.
• Types of questionnaire
• Structured vs semi-structured
Open-ended questions
• These allow the respondent freedom to answer in his or her own words
• Advantage:
• The respondent is allowed to comment, pass opinions and discuss other events that
are related to the question's topic.
• Disadvantages:
• Can increase the length of time taken to complete a questionnaire and the answers
cannot be coded when the questionnaire is designed, because the full range
of answers is not known.
Closed questions
• The questions may be dichotomous, that is, with two possible answers (yes or no).
• Advantages:
• Ease of analysis and coding because of the limited, fixed response that is allowed.
• Easy to answer.
• Disadvantage:
• Because the options of answers are fixed, the answers may not reveal
related events that may be significant.
Completing questionnaire
• By mail
Mailed and self-completed questionnaires
The main requirements for a mailed or self-completed questionnaire are great clarity and a
covering letter that politely explains the reason for sending the questionnaire.
Advantages:
• Allows a highly motivated respondent to 'check the facts' over a period of time;
Disadvantages:
• Questions must be very clear, and the response rate is often low: 50% is not uncommon, and
the value can be as low as 10%.
Interviews
• Can overcome some of the disadvantages of mailed and self-completed questionnaires.
• Is particularly useful if many of the questions are open-ended, and where illiteracy of the
respondent is a problem.
• Questionnaires can be longer than self-completed ones, and response rates of 90% can
sometimes be achieved.
• Telephone interviews have high response rates, and can produce results more quickly and
cheaply than personal interviews and mailed questionnaires.
• A polite letter, explaining the reason for producing the questionnaire, and the value
of the results deriving from its completion should be enclosed.
• Ultimately how the data will be processed (coding and computer entry)
Testing questionnaires
• Several drafts of a questionnaire are usually required following testing.
• Informal testing is carried out on colleagues, who can detect trivia, ambiguities and defects in
questionnaire design.
• Formal testing is undertaken on a small random sample of the population on which the full survey
will be conducted.
• The size of the sample is chosen using the guidelines for sample-size determination in surveys.
• This pilot survey should never be used as part of the full survey, and respondents used in the
pilot survey should never be used again in the full one.
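The separation between pilot and full survey can be enforced when the samples are drawn. The following is a minimal sketch under stated assumptions: the sampling frame, the farm identifiers, and both sample sizes are invented for illustration, and simple random sampling is assumed.

```python
import random

# Hypothetical sampling frame: identifiers of 200 farms in the study population.
sampling_frame = [f"farm_{i:03d}" for i in range(1, 201)]

rng = random.Random(42)  # fixed seed so the draw can be reproduced

PILOT_SIZE = 20  # assumed value; in practice taken from a sample-size calculation
pilot_respondents = rng.sample(sampling_frame, PILOT_SIZE)

# Remove pilot respondents from the frame before drawing the full-survey sample,
# so nobody who tested the questionnaire is asked again.
pilot_set = set(pilot_respondents)
remaining_frame = [farm for farm in sampling_frame if farm not in pilot_set]

FULL_SIZE = 100  # assumed value
full_respondents = rng.sample(remaining_frame, FULL_SIZE)
```

Keeping the list of pilot respondents on file makes it possible to verify later that none of them appear in the full survey.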
Coding and editing of questionnaire
• Before administering any questionnaire, procedures for coding of responses and
computer data entry should be considered.
• Coding of responses is best accomplished directly on the paper forms
• Do not attempt to combine coding and data entry into a single step.
• It is a good idea to use a distinctive colour of ink for recording all codes
on the forms so it is easy to differentiate writing done by the coder from
that done by the respondent or interviewer.
• While spreadsheets are convenient and easy to set up for data entry, the ability to sort
individual columns makes it possible to completely destroy the data.
• General-purpose database managers are useful and allow greater manipulation of the data.
• However, because most data will ultimately be transferred to a statistical package for verification
and analysis, it is advisable to perform all data manipulations in that statistical package, where it
is easier to document and record all procedures carried out.
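The danger of sorting a single column can be shown with two short lists standing in for spreadsheet columns. The animal IDs and weights below are invented for illustration.

```python
# Hypothetical records: each row links an animal ID to its weight (kg).
animal_ids = ["cow_3", "cow_1", "cow_2"]
weights_kg = [410, 520, 465]

# Unsafe: sorting one column on its own, as a spreadsheet allows.
ids_sorted_alone = sorted(animal_ids)
broken_rows = list(zip(ids_sorted_alone, weights_kg))
# cow_1 is now paired with 410, which was cow_3's weight.

# Safe: sort whole rows so each weight stays with its animal.
rows = list(zip(animal_ids, weights_kg))
intact_rows = sorted(rows)
```

Nothing in `broken_rows` looks wrong on inspection, which is why this kind of damage often goes unnoticed until analysis.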
Data management
A. Data collection sheets
• Some things to consider when dealing with the file are as follows:
• Never ship the original to another location without first making copies of it.
• Set up a system for recording the insertion of data collection sheets into the file so
that you know how many remain to be collected before further work begins.
• Once all of the forms have been collected, before you do anything else, scan
through all sheets to get an impression for their completeness.
• If there are omissions in the data collection sheet (e.g. forgetting to complete the
last page of a questionnaire), returning to the data source to complete these data
is more likely to be successful if it is done soon after the data were initially
collected rather than weeks or months later (after data analysis has begun).
B. Data coding
• It is advisable to have a space to allow for coding directly on the data collection sheet.
• If you have 'open' questions, scan the responses and develop a list of needed codes before
starting coding.
• In general, avoid the use of string variables except for rare instances where you need to
capture some textual information (e.g. a comment field).
• Never make compound codes, e.g. 1 = male domestic shorthair, 2 = female domestic
shorthair, 3 = male Siamese, etc.
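The alternative to compound codes is one numeric variable per attribute. The sketch below assumes a hypothetical coding scheme; the code values and the helper `code_cat` are illustrative and not taken from any particular study.

```python
# Hypothetical coding scheme: one variable per attribute, numeric codes only.
SEX_CODES = {"male": 1, "female": 2}
BREED_CODES = {"domestic shorthair": 1, "siamese": 2}

def code_cat(sex, breed):
    """Return separate numeric codes for sex and breed."""
    return {"sex": SEX_CODES[sex.lower()], "breed": BREED_CODES[breed.lower()]}

records = [
    code_cat("male", "Siamese"),
    code_cat("female", "domestic shorthair"),
]

# Each attribute can be analysed on its own, without decoding a combined value.
n_females = sum(1 for r in records if r["sex"] == SEX_CODES["female"])
```

With separate variables, adding a new breed needs one new code rather than one per sex.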
• For all types of data, note any obvious outlier responses
C. Data entry
Issues to consider when entering your data into a computer file are as follows:
• Double data entry, followed by comparison of the two files to detect any inconsistencies, is preferable to
single data entry.
• Spreadsheets are a convenient tool for initial data entry, but they must be used with extreme caution: because it is
possible to sort individual columns, it is possible to destroy your entire dataset with one inappropriate
'sort' command.
• Custom data entry software programs provide a greater margin of safety and allow more data verification at
the time of entry, e.g. EpiData (http://www.epidata.dk/).
• Using hierarchical database software can make data entry and retrieval more efficient for large quantities of
multi-level data (e.g. every lactation for each dairy cow from several herds over several years).
• Alternatively, it is possible to set up separate files for data at each level (e.g. a herd file, a cow file, etc.) and
merge the files after data entry.
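Comparing two independently keyed files can be sketched as follows. This is a minimal illustration, assuming both entries are CSV text with an `id` column; the column names, the values, and the helpers `read_rows` and `find_discrepancies` are hypothetical.

```python
import csv
import io

# Hypothetical contents of two files keyed independently from the same sheets.
entry_1 = "id,sex,weight_kg\n1,1,410\n2,2,520\n3,1,465\n"
entry_2 = "id,sex,weight_kg\n1,1,410\n2,2,250\n3,1,465\n"

def read_rows(text):
    """Parse CSV text into a dict keyed by record id."""
    return {row["id"]: row for row in csv.DictReader(io.StringIO(text))}

def find_discrepancies(first, second):
    """Return (id, field, value_1, value_2) for every cell that differs."""
    problems = []
    for rec_id in sorted(set(first) | set(second)):
        a, b = first.get(rec_id), second.get(rec_id)
        if a is None or b is None:
            # The record was keyed in only one of the two files.
            problems.append((rec_id, "<record missing>", a, b))
            continue
        for field in a:
            if a[field] != b.get(field):
                problems.append((rec_id, field, a[field], b.get(field)))
    return problems

discrepancies = find_discrepancies(read_rows(entry_1), read_rows(entry_2))
for rec_id, field, v1, v2 in discrepancies:
    print(f"record {rec_id}, {field}: first entry {v1!r}, second entry {v2!r}")
```

Each discrepancy is then resolved by going back to the original data collection sheet, not by guessing which entry is right.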
• As soon as the data-entry process has been completed, save the original data files in a safe
location.
• In large, expensive trials it might be best to have a copy of all originals stored in another
location.
• If the data entry program which you use does not have the ability to save your data in the
format of the statistical package that you are going to use, there are a number of
commercially available software programs geared specifically to convert data from one
format to another.
• If you use a general-purpose program (e.g. a spreadsheet) to enter your data, as soon as the
data are entered, convert them to files usable by the statistical program that you are going
to use for the analysis.
D. Data editing
• Before beginning any analyses, it is very helpful to spend some time editing your
data.
• The most important components of this process are labelling variables and values
within variables, formatting variables and correctly coding missing values.
• All variables should have a label attached to them which more fully describes
the contents of the variable.
• While variable names are often quite short (e.g. <8 or <16 characters), labels
can be much longer.
• Note: with some computer programs, the labels are stored in a separate file.
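The labelling and missing-value steps can be sketched in plain Python. Everything specific here is an assumption made for illustration: `-99` as the number keyed for missing values, `None` as a stand-in for the statistics program's own missing-value code, and the variable names and labels.

```python
# Assumptions: -99 was the number keyed for missing values, and None stands in
# for the statistics program's own missing-value code.
MISSING_KEYED = -99

# Short variable names paired with longer descriptive labels.
VARIABLE_LABELS = {
    "sex": "Sex of the animal",
    "wt": "Body weight at first visit (kg)",
}

# Labels for the categories of coded variables.
VALUE_LABELS = {"sex": {1: "male", 2: "female"}}

raw = [{"sex": 1, "wt": 410}, {"sex": 2, "wt": -99}, {"sex": -99, "wt": 465}]

def edit_record(record):
    """Replace the keyed missing code with None and substitute value labels."""
    cleaned = {}
    for name, value in record.items():
        if value == MISSING_KEYED:
            cleaned[name] = None
        else:
            # Variables without value labels (e.g. weights) pass through unchanged.
            cleaned[name] = VALUE_LABELS.get(name, {}).get(value, value)
    return cleaned

edited = [edit_record(r) for r in raw]
```

A statistical package would normally keep the numeric code and attach the label to it; the substitution above is only a way to show the mapping.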
• Categorical variables should have meaningful labels attached to each of
the categories.
• For example, sex could be coded as 1 or 2, but should have labels for 'male' and
'female' attached to those values.
• The number that was assigned to all missing values needs to be converted into
the code used by your statistics program for missing values.
• Some programs will allow you to attach 'notes' directly to the dataset (or to
individual variables within the dataset).
• BY: If you have a very small dataset, to print the entire dataset and check