Epi-Info 7


At the end of this presentation, students are expected to be able to:

• Prepare an electronic data entry template (view) using Make View

• Prepare a data table and enter data after creating views

• Transfer data entered in Epi-Info to SPSS for analysis

• Apply techniques of data cleaning


Introduction
• Epi Info is statistical software for epidemiology that provides
tools for easy construction of data entry forms and databases, and
for analyses with epidemiologic statistics, maps, and graphs

• Developed by the United States Centers for Disease Control and
Prevention (CDC)

• Can be downloaded from: http://wwwn.cdc.gov/epiinfo/7/index.htm
Introduction … cont’d
Opening Epi-Info programs:

• Either use the Start menu or double-click the icon on the
Windows desktop to open the Epi-Info home page

[Figure: Epi Info home page, with callouts: click to create a form; click to start entering data; click to explore and analyze data in Epi Info; click to exit Epi Info]
Introduction … cont’d
The Epi Info home page consists of Menu bars and Buttons
The Menu bars:
File
Tools
StatCalc
Help
Buttons on the home page that open programs:
Create Forms
Enter Data
Classic
Create Maps
Epi Info Website
Exit
Introduction … cont’d
Concepts and terms
• Project: an electronic storage unit that holds a collection of
views/questionnaires

• A project contains many views (Epi Info questionnaires), and
each view shows information about one data table

• Epi Info questionnaire: a Form or view that organizes,
streamlines, and stores your data in Epi Info; it is visible as a
Form

• Form(s): questionnaire(s) organized within a project


Concepts and terms … cont’d

• Field: the name of a question or statement used in the
questionnaire

• Variable: used interchangeably with the term field to refer to the
question or statement used in the questionnaire

• Database: an organized, computerized collection of related data,
arranged for ease and speed of search and retrieval

• Data entry: the organized process of storing data


Conceptual relationship between Project,
Forms and Fields

[Diagram: a Project contains View 1/Questionnaire 1, View 2/Questionnaire 2, and View 3/Questionnaire 3; each view contains its own Field/Question items]
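
The containment relationship between a project, its forms, and their fields can be sketched as nested data structures. This is only an illustration of the concept, not how Epi Info stores projects internally; the class names `Project`, `Form`, and `Field` and their attributes are hypothetical, and the example names are the ones used in these slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Field:
    """One question or statement on a form (hypothetical model)."""
    name: str        # field/variable name, written without spaces
    prompt: str      # text shown on the questionnaire
    field_type: str  # e.g. "Text", "Number", "Yes/No"


@dataclass
class Form:
    """One view/questionnaire; it corresponds to one data table."""
    name: str
    fields: List[Field] = field(default_factory=list)


@dataclass
class Project:
    """A storage unit holding a collection of forms."""
    name: str
    forms: List[Form] = field(default_factory=list)


# Build a small project with one form and two fields.
project = Project("Nutrition")
form = Form("MaternalHealth")
form.fields.append(Field("Address", "Address", "Text"))
form.fields.append(Field("Age", "Age", "Number"))
project.forms.append(form)

for f in project.forms[0].fields:
    print(project.name, "/", form.name, "/", f.name, f"({f.field_type})")
```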

Designing a Form/questionnaire

Before designing a new Form, we first need to create a new project or
open an existing project.

Step 1: Opening a New Project and New Form

• Open Create Forms
• Click New Project (or go to File → New Project)
• The Create or Open PROJECT dialog box will appear
• Enter a project name (e.g. Nutrition) and a form name (e.g.
MaternalHealth, i.e. without a space between words)
• Under Location, click Choose, highlight Desktop, and click OK
• We have now created a new Form called MaternalHealth
under a project named “Nutrition” (in Microsoft Access
database format, with extension ‘.mdb’), saved on the Desktop.
Major Components of Create Form

A pop-up dialog box to create the Project Name and Form Name

[Figure: project dialog box, with callouts: type the Project Name here; write the description of the project here; type the Form Name here]

• Note: Do not put spaces between words in Project and Form names
Major Components of Create Form
• The left column: displays the project and form names,
page information, field types, and templates

• The right panel: used to design and edit Views (questionnaires)

• Each view has a corresponding data table “in the
background”. No spreadsheet is involved.

• You can also add check codes, create skip patterns, and
even define new variables that are computed automatically
from other variables.
Major Components of Create Form
Form Designer

Form name

This is the Epi Info™ 7 Form Designer after creating a new project.

You’re now ready to create fields and decide how those fields should
appear.
Designing a Form … cont’d
Step 2: Creating fields/variables in the Form
• Right-click on the upper left corner of the blank page (the canvas)
• A Field dialog box appears offering options: New Field, New Field
Group, etc.
• Point the cursor at the New Field option and select the field type
that you want to create (e.g. Text)
• In the Text dialog, type the text in the Question or Prompt field
(e.g. Address), press the TAB key, and finally click OK
Designing a Form … cont’d
• Other field types are also available, including number fields,
date fields, checkboxes, drop-down lists, etc.

• For the next field, move the cursor to a suitable location and
right-click. Select the field type Number and type Age as the
prompt

• Press the TAB key. In the Pattern drop-down list, select ##.
The ## option allows only two digits to be entered

• Check the Range box. Notice how the Upper and Lower limit
boxes activate

• Type the lower value and the upper value according to the
age range that applies to your questionnaire
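
The combined effect of the ## pattern and the Range attribute can be sketched as a simple validation function. This is an illustration of the idea only, not Epi Info's own implementation; the function name `check_age` and the limits 15 and 49 are assumptions chosen for the example, and the sketch assumes ## accepts one or two digits.

```python
import re


def check_age(raw: str, lower: int, upper: int) -> bool:
    """Accept a value only if it has at most two digits and lies in [lower, upper]."""
    # Pattern check: one or two digits, nothing else (assumed meaning of ##).
    if not re.fullmatch(r"\d{1,2}", raw):
        return False
    # Range check: both limits are inclusive.
    return lower <= int(raw) <= upper


# Hypothetical limits for a survey of women aged 15 to 49.
for value in ["25", "9", "125", "ab"]:
    print(value, "accepted" if check_age(value, 15, 49) else "rejected")
```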
Designing a Form … cont’d

Notice that each field type has its own distinctive set of options
available for customizing the field’s appearance and behavior.
Components of a ‘Field’
• Generally, the field type specifies what the text means
i. If the text is a variable name, its response can be open-ended,
numeric, yes/no, date, time, etc.
ii. If the text is a title, then the ‘Label/Title’ type applies

• Question or Prompt: creates the text that will appear on the
questionnaire
• This text can be a variable name, the title of the survey, a
subtitle of the survey, etc.

• Field Name: the label of the variable or text. Epi Info assigns
the same name as the Question or Prompt but without spaces. You can
change it by double-clicking in the prompt

• Pattern (field pattern): defines numerical patterns for specific
types of variables/fields, such as number, date, and time
Type of Fields
• Label/Title – contains the label or title for the
question or prompt on the view. This field is not visible to
Check Code and is not searchable.

• Text – an alphanumeric field that holds up to 255
characters. You can also set the size of this field from 1 to 5
characters to save space. Type the size for sizes greater than
five characters.

• Text[Uppercase] – a forced uppercase field. All
information typed in this field will appear in uppercase. The size
values of this field range from 1 to 5.
Type of Fields
• Multiline – an alphanumeric field that has the capacity to store
up to 1 gigabyte of information, or approximately
2,000,000 characters.
• Option – a field type that creates radio button
selection fields for the view. The Option field is for mutually
exclusive choices; only one choice can be made. If more
than one choice is needed, use the Checkbox field type
• Number – a numeric field that has six predefined value patterns. A
new pattern can be created by typing the pattern into the Pattern
field.
• Phone Number – a pre-determined mask field for phone numbers
only. Phone extensions and international numbers cannot currently
be used in this field.
• Date – an alphanumeric field with pre-set date patterns that are
selected from the pattern drop-down menu. This field cannot be
altered.
Type of Fields

• Time – an alphanumeric field with pre-set time patterns that are
selected from the pattern drop-down menu. This field cannot be
altered.

• Yes/No – a pre-determined field in which the only values that can
be selected are yes or no. The yes or no answer is stored in the
database as a 1 or 0; 1 = Yes and 0 = No.
Type of Field
Check Box (Creating Check Box)
• Text and Number field types allow you to enter open-ended
data, which is not suitable for every question

• If we want to collect data where more than one choice can
be selected, we can use checkbox fields instead

• A checkbox is either checked or unchecked and can never be
set to any other value

• It is well suited to questions where the only answers are
yes or no
Type of Fields - Check Box
• First identify the title of the checkbox group, for example ‘Source
of information’ (e.g. if you want to know where study subjects get
information about HIV prevention)

• Create a Title called “Source of Information”

• Under the title you created, create the checkboxes one by one as
independent fields, selecting ‘Checkbox’ as the field type for each

• The checkboxes could be, for example:
– Radio
– TV
– Newspaper
– Friends etc.
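
Because each checkbox is an independent field, one respondent's answer to a multiple-choice question becomes several true/false values rather than a single value. A minimal sketch of that idea follows; the function name `checkbox_record` is hypothetical and the sketch does not reflect how Epi Info itself stores the data.

```python
# The checkbox prompts listed on the slide.
SOURCES = ["Radio", "TV", "Newspaper", "Friends"]


def checkbox_record(selected):
    """Return one boolean per checkbox; several can be True at once."""
    return {source: (source in selected) for source in SOURCES}


# A respondent who hears about HIV prevention from radio and friends.
record = checkbox_record({"Radio", "Friends"})
print(record)
```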
Type of Fields – Grid
• The Grid field is best suited to repeating data within a
questionnaire, laid out in a table of rows and columns, for example
at the household level.

• Right-click in the upper left corner of the form and enter “Children in
the Household” as the prompt → click on the GRID button in the dialog
→ enter the prompt name → press Tab → click Add → enter the name of
the first grid column, “Name,” in the prompt box that now says ENTER
COLUMN NAME FOR GRID → click on SAVE COLUMN, and so on
Type of Fields -Adding a drop-down list of values

• The Yes/No field is an excellent choice for questions where the
only possible values are yes, no, and unknown.

• What if we want to present a drop-down list of choices to the
user, where we have defined our own list of possible values?

• Epi Info 7 provides three field types with which to do this: Legal
Values, Comment Legal, and Codes.
Types of Fields – Legal Value
• To add a new field with Legal Values, right-click on the canvas.
• From the options available, select Legal Values.
• In the Legal Values dialog box, follow the steps shown in the
picture below

[Figure: Legal Values dialog box, with callouts: click here if another variable with the same levels was used before; click and type the values (1 Female, 2 Male)]
Type of Fields – Comment Legal
• To add a new field with Comment Legal, right-click on the canvas

• From the list of choices, select Comment Legal

• In the dialog box that pops up, follow the steps shown in the
following picture

[Figure: Comment Legal dialog box, with callouts: click here if another variable with the same levels was used before; click and type the values (1-Female, 2-Male)]

• Use a “-” (hyphen) between the number and the text value of a variable
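
The hyphen convention makes each Comment Legal entry easy to split into a code and a descriptive label. The sketch below shows the idea on the slide's two entries; the function name `parse_comment_legal` is hypothetical, and the assumption that the part before the hyphen is the code and the part after it is the comment is an interpretation of the slide, not a statement of Epi Info internals.

```python
def parse_comment_legal(entry: str):
    """Split 'code-label' at the first hyphen into (code, label)."""
    code, _, label = entry.partition("-")
    return code.strip(), label.strip()


# The two entries typed in the Comment Legal dialog on the slide.
choices = dict(parse_comment_legal(e) for e in ["1-Female", "2-Male"])
print(choices)
```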
Automating the data entry process
• Automation helps reduce data entry errors and speeds up
the data entry process

• There are two ways to create this automation in Epi-Info:
o Setting the attributes
o Using the Check Code Editor

• Both should be set up while you are creating and
modifying your “Form” using the Create Forms program (i.e.,
before data entry).
Attributes of Fields
The attributes of fields may include:
• Repeat last: automatically repeats the last value entered in that field

• Required: prevents missing values

• Read only: nothing can be typed in this field

• Range: sets minimum and maximum values

• Font/Prompt font: formats the font, font style, and size of the text of the
title, question, or prompt

• Field font: sets the font type, size, and style of the data to be entered
Using the Check Code Editor
Creating skip rules
• First create variables in a Form (e.g. sex, pregnancy, children
ever born)
• Re-arrange the Form by left-clicking to select the fields and
dragging them; then go to the Format toolbar and click
Alignment

[Figure: the Form after re-arrangement]
Creating skip rules … cont’d

• Then identify the variable on which we want to apply the skip
rule (e.g. sex, i.e. men should not be asked about pregnancy)

• Click the Check Code button → in the Choose Field Block for
Action pane, expand the Page 1 node → double-click on the
variable sex: Comment Legal → left-click on After (because we
want the action to run after we enter the Sex data) → select If under Add
Block: Sex After → complete the If condition in the If Condition
text box using information from the Available Variables box, or type
the value that triggers the skip (e.g. for male it was “1”) →
click the button next to the Then list box → select Goto →
Children ever born → click the Validate Check Code button →
Save → click the Close button
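
The logic of this skip rule can be sketched outside Epi Info as a function that decides which field the cursor moves to next. This is only an illustration of the rule, not Epi Info Check Code; the field names in `FIELD_ORDER` and the function name `next_field` are hypothetical, and the skip value "1" is taken from the slide's example.

```python
# Order in which fields appear on the form (hypothetical names).
FIELD_ORDER = ["Sex", "Pregnancy", "ChildrenEverBorn"]
SKIP_CODE = "1"  # value of Sex that triggers the skip, as in the slide's example


def next_field(current: str, value: str):
    """Return the field to visit after `current`, applying the skip rule."""
    # Skip rule: after Sex is entered with the skip code, jump past Pregnancy.
    if current == "Sex" and value == SKIP_CODE:
        return "ChildrenEverBorn"
    # Otherwise move to the next field in order, or None at the end of the form.
    i = FIELD_ORDER.index(current)
    return FIELD_ORDER[i + 1] if i + 1 < len(FIELD_ORDER) else None


print(next_field("Sex", "1"))  # skip applies
print(next_field("Sex", "2"))  # no skip
```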
Groups
• Questions that are related to each other and have already been
created can easily be grouped

• In the analysis, data fields in a group can be analyzed
separately or as a group.
To group questions:
• Drag the mouse to draw a dotted line around the questions to be
grouped

• Select “Insert” from the menu and select “Group”.

• A dialog box will appear that allows you to provide a
description of the group and set the background color.
Adding page for a Form
Two ways of adding pages to a Form:
1. Click Insert (toolbar) → Add Page
2. Right-click on the Form name (MaternalHealth) → Add Page

[Figure: callout showing where to click to add a page]

Note that both techniques also offer Insert Page, which inserts a
page before the selected page; pages can be renamed after
insertion.
Example
• Create a Project called Obstetrics, and

• A Form called Prenatal under the Obstetrics project

• In the Prenatal questionnaire, create the fields
presented in the table (next slide)
Data
Question or prompt             Type            Remark
Personal information           Label/Title     Bold, size 18
Date of interview              Date            MM-DD-YYYY
Date of birth                  Date            MM-DD-YYYY
Age of mother                  Number          Select “read only”
Marital status                 Comment Legal   married/single
Mother smokes                  Yes/No
Number of cigarettes per day   Number
Postpartum depression          Yes/No

• Form two groups of fields and give an appropriate title to each group

• Make sure that the skip rules are in place
Entering Data

• After the form is created, the next step is to open it for data
entry
• We can open the Form in two ways:

– Click the ‘Enter Data’ button in the main window → Open Form from
Current project → click the ellipsis (browse) button and locate the
file by its project name and form name → click OK to open the form

– If you have already opened the Form, click Enter Data directly →
accept the warning → complete the data table name and starting ID
→ OK
Example

• Open the view ‘prenatal’ that you created under the project
‘obstetrics’, and

• Enter the data in the following slide


Data

Question or prompt        Subject-1  Subject-2  Subject-3  Subject-4  Subject-5
Age of mother
Marital status            married    single     single     married    single
Mother smokes             No         Yes        No         Yes        No
# of cigarettes per day   -          5          -          6          -
Mother uses alcohol       Yes        No         Yes        No         No
Postpartum depression     Yes        No         No         No         No


Data Cleaning
Conventional Definition of Data Quality
• Accuracy: the data was recorded correctly.
• Completeness: all relevant data was recorded.
• Uniqueness: entities are recorded once.
• Timeliness: the data is kept up to date. A special problem in
federated data is time consistency.
• Consistency: the data agrees with itself.
Data Cleaning
• Once the survey data have been gathered, they need to be
entered into a computer data file and checked for:
– errors,
– impossible or implausible values, and
– inconsistencies that may be due to coding or data
entry errors
• Errors can result from incorrect reading, incorrect reporting,
incorrect filling, incorrect sensing, incorrect coding, incorrect
typing, etc.
Techniques of Data Cleaning Using SPSS

Use of simple frequency

Tabulation for consistency (e.g. cross-tabulation)

Sorting in ascending and descending order


Techniques of Data Cleaning
From the ‘cigarette data’:

Delete a few observations and see how:

• The ‘simple frequency’ changes

• ‘Crosstab’ shows the cross-classification of categorical
variables with missing values or outliers

• ‘Sorting’ helps identify where you have missing values or outliers
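
The slides carry out these checks in SPSS; the same three ideas can be sketched in plain Python on the five subjects from the prenatal example. This is an illustrative sketch only: the short key names (`marital`, `smokes`, `cigs`) are hypothetical, a dash in the table is treated as a missing value (`None`), and the age column is left out because the slide gives no values for it.

```python
from collections import Counter

# The five subjects from the prenatal example; None marks a "-" (no value).
records = [
    {"id": 1, "marital": "married", "smokes": "No",  "cigs": None},
    {"id": 2, "marital": "single",  "smokes": "Yes", "cigs": 5},
    {"id": 3, "marital": "single",  "smokes": "No",  "cigs": None},
    {"id": 4, "marital": "married", "smokes": "Yes", "cigs": 6},
    {"id": 5, "marital": "single",  "smokes": "No",  "cigs": None},
]

# 1. Simple frequency: unexpected categories or counts stand out.
freq = Counter(r["smokes"] for r in records)

# 2. Cross-tabulation for consistency: smoking status against whether a
#    cigarette count was recorded. Non-smokers should have no count.
crosstab = Counter((r["smokes"], r["cigs"] is not None) for r in records)
inconsistent = [r["id"] for r in records
                if (r["smokes"] == "No") != (r["cigs"] is None)]

# 3. Sorting: missing values come first, then ascending counts, so gaps
#    and extreme values end up at the two ends of the list.
by_cigs = sorted(records, key=lambda r: (r["cigs"] is not None, r["cigs"] or 0))

print(freq)
print(crosstab)
print("Inconsistent records:", inconsistent)
print("Sorted ids:", [r["id"] for r in by_cigs])
```

Deleting a value (for example, setting `cigs` to `None` for subject 2) and re-running the checks shows how each technique exposes the problem.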


Verifying Data Entry
Limit user choices via Code Tables
Legal Values
Comment Legal Values
Codes

Range check
Check code
Required
Skip patterns
How to import Epi-Info data into SPSS:

Open SPSS, then in the SPSS window go to

File → Open Database → New Query → browse to where the
data is stored → OK → transfer only the table name → Next →
Next → after checking whether the variables are numeric or string → Finish
Sample size using statistical software

As an alternative method, we can use the Epi Info statistical
software to calculate the sample size required for the study.
Start page
• Step 1: Go to the main menu at the top left corner and click on
StatCalc and subsequent options
• You will see the following window
Sample size determination

• Fill in the components needed to determine sample size for the
three scenarios: Population survey, Unmatched case-control,
and Cross-sectional or Cohort study designs
Exercise
1. Let us assume that the target population of the study has size
N = 100,000, and that the proportion of the variable of interest
is not known (no previous study has been done), so we
decide to use 50 percent as the estimate of the prevalence of that
variable.

I. What would be the required sample size if the sampling technique
were simple random sampling?

II. What would be the sample size if the sampling technique is two-stage
sampling and a 10% non-response rate is also expected?
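
As a cross-check on StatCalc, the population survey calculation can be sketched with a commonly used finite-population formula, n = deff × N·p(1−p) / [(d²/z²)(N−1) + p(1−p)]. Several inputs below are assumptions, because the exercise does not state them: a 5% margin of error (d), 95% confidence (z = 1.96), a design effect of 2 for two-stage sampling, and inflating for non-response by dividing by (1 − 0.10). StatCalc's own output should be treated as the reference.

```python
import math


def survey_sample_size(N, p, d, z=1.96, deff=1.0):
    """Sample size for estimating a proportion in a finite population.

    N: population size, p: expected proportion, d: margin of error,
    z: standard normal value for the confidence level, deff: design effect.
    """
    n = deff * N * p * (1 - p) / ((d ** 2 / z ** 2) * (N - 1) + p * (1 - p))
    return math.ceil(n)


# Part I: simple random sampling (assumed d = 0.05, 95% confidence).
n_srs = survey_sample_size(100_000, 0.5, 0.05)

# Part II: two-stage sampling with an ASSUMED design effect of 2,
# then inflated for 10% expected non-response.
n_two_stage = math.ceil(survey_sample_size(100_000, 0.5, 0.05, deff=2.0) / (1 - 0.10))

print("Simple random sampling:", n_srs)
print("Two-stage with non-response:", n_two_stage)
```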
Exercise
2. The prevalence of underweight among newborns is compared
between two regions. In one region, it is estimated that
about 30% of newborns could be underweight. In the other region it
is probably 15%.
If the sample is required to show with a 90% likelihood
(power) that the percentage of underweight newborns differs between
these two regions at the 95% confidence level, what would be the
sample size? Suppose that the current study uses multistage
sampling and another similar study has reported a 91.5%
response rate.
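
For comparing two proportions, one standard normal-approximation formula with a pooled proportion is n₁ = (z₁₋α/₂ + z_power)² · p̄(1−p̄) · (r+1) / [r·(p₁−p₂)²], where r is the ratio of the two group sizes. The sketch below applies it with equal group sizes (r = 1), which is an assumption; StatCalc reports results from more than one formula, so its figures may differ somewhat. The function name `two_proportion_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_sample_size(p1, p2, alpha=0.05, power=0.90, ratio=1.0):
    """Sample size for the first group when comparing two proportions.

    ratio is (size of group 2) / (size of group 1); 1.0 means equal groups.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided confidence level
    z_beta = NormalDist().inv_cdf(power)           # desired power
    p_bar = (p1 + ratio * p2) / (1 + ratio)        # pooled proportion
    n1 = ((z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) * (ratio + 1)
          / (ratio * (p1 - p2) ** 2))
    return math.ceil(n1)


# 30% versus 15% underweight, 95% confidence, 90% power, equal groups.
n_per_region = two_proportion_sample_size(0.30, 0.15)
print("Newborns needed per region:", n_per_region)
```

The result is before any adjustment for the multistage design or for the expected response rate mentioned in the exercise; those adjustments depend on a design effect that the exercise does not give.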
