Health Statistics 1
Code : HES 106

Hours: 60
By: Alex Mwaniki
Module Competence
This module is designed to enable the learner to utilize the
various statistical concepts for the collection, compilation,
computation, analysis, interpretation and dissemination of
health and health-related statistics required for the planning and
management of health services.
Module Outcomes
• Demonstrate understanding of the importance of statistics (20hrs)
• Explain the sampling processes (8hrs)
• Identify and utilize various health administrative statistics and
interpretation of related formulae (10hrs)
• Explain the importance of data organization and presentations
(8hrs)
Module Content
1. Introduction to Health Statistics (20hrs) (10 lessons)
i. Definition of Health, Statistics, Health Statistics,
ii. Health Statistics uses and Importance.
iii. Types of Statistics
iv. Definition of terms: data, information, variables,
v. Measurements of scale,
vi. Sources of data and sources of health data,
vii. Data collection tools, data collection methods,
viii. Survey process, population, pilot survey, purpose of conducting a pilot survey,
procedures of conducting a pilot survey, procedure of pre-testing questionnaire
and interview schedule
ix. Data editing, methods of editing data, procedures of editing, definition and
importance of editing data
x. Approximation, errors and measurements of errors
xi. Code and classify data, different methods of coding data, purpose and importance
of data
Module Content cont...
2. Sampling processes (8hrs) (4 lessons)
i. Definition of terms: Population, Sample, Sampling, Sample size
Determination, probability and Non-probability sampling methods.
3. Health Facility Administrative Statistics (10hrs) (5 lessons)
i. Definition of terms: Description and interpretation of procedures,
tools for collecting health administrative statistics, uses, importance,
ii. Standard terms and terminologies and computation of health facility
administrative statistics
Module Content cont...
4. Data Organization and Presentation (8hrs) (4 lessons)
– Describe data classification, grouped and ungrouped data,
inclusive and exclusive forms of grouping
– Sturges' formula, narrative and array presentation, tabular
presentations, simple tables, multiple column tables, contingency
tables, principles of table construction, chart presentations,
simple bar charts, component bar charts, percentage component
bar charts, multiple bar graphs, principles of bar chart
construction, pie charts, graph presentations, principles of graph
construction, frequency distribution, cumulative frequency tables,
relative frequency, cumulative frequency curve (ogive), histogram,
graphs of time series, Z-charts, Lorenz curve and
pictograms, data analysis and presentation using computers
1. Introduction to Health Statistics (20hrs) (10 lessons)
Definition of Health, Statistics and Health Statistics
• Health
– It’s a state of complete physical, mental and social well-being and
not merely the absence of diseases or infirmity.
• Statistics
– It’s the study of how to collect, organize, analyze and interpret
numerical information from data.
– It’s the body of theory, concepts, methods and methodologies of
collection, analysis, interpretation and representation of
numerical data for decision making.
• Health statistics
– It’s the science of collecting, organizing, summarizing/analyzing,
presenting and interpreting health data and using them for
decision making.
i. Uses of Health Statistics
i. Allocation of resources by hospital administration or management. The resources include finance, human resources
and equipment.
ii. Making informed decisions
iii. Teaching or education of medical students and patients
iv. Research purposes – finding new knowledge in the area of health
v. Allows general conclusions to be made from data provided
(limited/unlimited)
vi. Evidence-based decision making, based on indicators such as:
a. Age and sex distribution of the population by social groups
b. Birth rate, crude death rate and other indicators such as IMR, MMR, CDR, PMR, SBR, NDR, PNDR etc.
c. Incidence, prevalence and attack rates
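As a rough illustration of how some of the indicators listed above are computed, the sketch below works out a crude death rate, an infant mortality rate, an incidence rate and a prevalence rate. All figures are invented for illustration, the helper name `rate` is hypothetical, and the "per 1,000" multiplier is an assumed convention; the mid-year population is used as an approximation of the population at risk.

```python
def rate(events, population, per=1000):
    """Return the number of events per `per` units of population."""
    return events / population * per

# Hypothetical figures for one catchment area in one year (not real data).
mid_year_population = 50_000
deaths = 400
live_births = 1_500
infant_deaths = 60        # deaths under one year of age
new_cases = 250           # new cases of a disease during the year
existing_cases = 1_000    # all cases present at a point in time

crude_death_rate = rate(deaths, mid_year_population)       # per 1,000 population
infant_mortality_rate = rate(infant_deaths, live_births)   # per 1,000 live births
incidence = rate(new_cases, mid_year_population)           # new cases per 1,000
prevalence = rate(existing_cases, mid_year_population)     # existing cases per 1,000

print(f"Crude death rate:      {crude_death_rate:.1f} per 1,000 population")
print(f"Infant mortality rate: {infant_mortality_rate:.1f} per 1,000 live births")
print(f"Incidence:             {incidence:.1f} per 1,000 population")
print(f"Prevalence:            {prevalence:.1f} per 1,000 population")
```

Note that each rate uses its own denominator: the infant mortality rate is expressed per live births, not per total population.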
ii. Importance of Statistics
• Analyses general activities of an organization
• Planning for recurrent and development votes, e.g. capital projects of organizations
• Monitoring and evaluation of activities being undertaken, e.g. clinical practices by
doctors
• Clinical research and clinical trials
• Epidemiological studies
• An aid to supervision
• Basis for planning health care or services for immediate or future needs
• Eyes of administration
• Arithmetic of human welfare
• Discloses connections between related factors
• Helpful in business
• Used in all sciences
• Helpful in data processing
Types of Statistics
1 Descriptive statistics
• Summarizes population data numerically or graphically
• It can be defined as those methods involving the collection, presentation and characterization of a set of data to properly describe the
various features of that set of data.
• It’s a type of statistics that describes large masses of data, e.g.
– Statistics pertaining to central tendency such as mean, median and mode.
– Statistics pertaining to dispersion around the central tendency such as range, standard deviation, variance and quartile deviation.
– Statistics of graphs depicting the shape of a distribution.
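The measures of central tendency and dispersion named above can be sketched with Python's standard `statistics` module. The ages below are made-up values used only for illustration, and the quartile deviation is taken here as half the interquartile range.

```python
import statistics

# Hypothetical ages (years) of ten patients; not real data.
ages = [23, 35, 35, 41, 29, 35, 52, 47, 29, 60]

# Central tendency
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
mode_age = statistics.mode(ages)

# Dispersion around the central tendency
age_range = max(ages) - min(ages)
variance = statistics.variance(ages)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(ages)       # sample standard deviation
q1, q2, q3 = statistics.quantiles(ages, n=4)
quartile_deviation = (q3 - q1) / 2     # half the interquartile range

print(f"Mean {mean_age:.1f}, median {median_age}, mode {mode_age}")
print(f"Range {age_range}, variance {variance:.1f}, SD {std_dev:.1f}")
print(f"Quartile deviation {quartile_deviation:.2f}")
```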
2 Inferential Statistics
• It’s a type of statistics that infers or induces from a small group (sample) and generalizes to the whole group (population).
• Also known as inductive statistics, it deals with the method of drawing conclusions from numbers observed.
• Involves drawing the right conclusions from the statistical analysis that has been performed using descriptive statistics
• Most predictions of the future, and generalizations about a population made by studying a smaller sample, are inferential.
• It includes:
– Estimation
– Modelling relationships
– Hypothesis testing
i Estimation
– It’s the group of statistics which allows for the estimation of population values based upon sample data, e.g. population parameter estimates and
confidence intervals.
ii Modelling relationships
– Allows us to develop mathematical equations which describe the interrelations between two or more variables
iii Hypothesis testing
– Allows us to test whether a particular hypothesis we’ve developed is supported by a systematic analysis of the data.
– It builds on descriptive statistics by going a step further to make interpretations about the population, upon which a decision would be based,
e.g. chi-square test, t-test, F-test etc.
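To make estimation and hypothesis testing more concrete, here is a minimal sketch that builds an approximate 95% confidence interval for a population mean and a simple test statistic. The blood pressure readings and the hypothesised mean of 120 are invented for illustration. The interval uses a normal approximation, which is an assumption made for simplicity; with a sample this small, a t critical value would normally be preferred.

```python
import math
import statistics

# Hypothetical systolic blood pressure readings (mmHg) from 12 patients.
sample = [118, 125, 130, 122, 135, 128, 140, 119, 127, 133, 124, 131]

n = len(sample)
sample_mean = statistics.mean(sample)                      # point estimate
standard_error = statistics.stdev(sample) / math.sqrt(n)

# Estimation: approximate 95% confidence interval (normal approximation).
z = statistics.NormalDist().inv_cdf(0.975)                 # about 1.96
lower = sample_mean - z * standard_error
upper = sample_mean + z * standard_error

# Hypothesis testing: how many standard errors is the sample mean
# away from a hypothesised population mean of 120 mmHg?
hypothesised_mean = 120
test_statistic = (sample_mean - hypothesised_mean) / standard_error

print(f"Sample mean: {sample_mean:.1f} mmHg")
print(f"Approximate 95% CI: {lower:.1f} to {upper:.1f} mmHg")
print(f"Test statistic against {hypothesised_mean}: {test_statistic:.2f}")
```

A large test statistic suggests the sample is unlikely to have come from a population with the hypothesised mean; the formal decision depends on the chosen test and significance level.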
• Data:
Data is a collection of raw facts, figures or instructions that do not have much
meaning to the user.
Data may be in form of numbers, alphabets/letters or symbols, and can be
processed to produce information.
• Information:
Information is the data which has been refined, summarized & manipulated in the
way you want it, or into a more meaningful form for decision-making.
The information must be accurate, timely, complete and relevant.
• Variable
A variable is a quantity that may change within the context of a mathematical
problem or experiment. Typically, we use a single letter to represent a variable.
The letters x, y, and z are common generic symbols used for variables.
Comparison between Data and Information
Data                                         Information
1. Unprocessed (raw) facts or figures.       1. It is the end-product of data processing (processed data).
2. Not arranged.                             2. Arranged into a meaningful format.
3. Does not have much meaning to the user.   3. More meaningful to the user.
4. Cannot be used for decision-making.       4. Can be used to make decisions.
Types of data
a) Quantitative data
b) Qualitative data
What is Quantitative data?
• This refers to data that can be quantified, verified and manipulated statistically
• It is numerical, countable and can be analysed mathematically
• Quantitative data is categorized into:
i) Discrete e.g. number of persons
ii) Continuous e.g. height and time
• Quantitative data:
o Can be used to draw correlations between factors
o Is best used to generalize to a population
o Is easy to present in tables and charts
• Quantitative data is highly reliable.
o Reliability: the extent to which a procedure yields the same results on
repeated trials
What is qualitative data?
• Qualitative data are data that describe meaning and are generally non-numerical
• Qualitative data is classified into:
i) Nominal, e.g. the gender of a person
ii) Ordinal, e.g. a rating of happy, neutral or unhappy
• Qualitative data has greater validity.
o Validity: the extent to which an indicator measures what it intends to measure
• Qualitative data:
o Represents the “voice” of the individual or group
o Is not generalizable to the population
o Is time-consuming to collect and analyse
• Qualitative and quantitative data can further be classified into:
i) Primary data
ii) Secondary data
• Note that a mixture of both quantitative and qualitative data provides the most
comprehensive set of data for program evaluation.
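v. Scales of measurement
1. Nominal scale of measurement: categories with no natural order, e.g. gender.
2. Ordinal scale of measurement: categories with a natural order, e.g. “cold, warm, hot
and very hot” or “one = happy, two = neutral, and three = unhappy”.
3. Interval scale of measurement: equal intervals between values but no true zero, e.g. temperature in degrees Celsius.
4. Ratio scale of measurement: equal intervals and a true zero, e.g. weight or height.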
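The scale of measurement determines which summary measures make sense. The sketch below records the conventional pairing of scales and summaries; the dictionary, the function name `summaries_for` and the sample values are all hypothetical and serve only as an illustration.

```python
import statistics

# Summary measures conventionally treated as meaningful on each scale.
SCALE_SUMMARIES = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval": ["mode", "median", "mean"],
    "ratio": ["mode", "median", "mean", "ratio of two values"],
}

def summaries_for(scale):
    """Return the summary measures usually applied to a given scale."""
    return SCALE_SUMMARIES[scale.lower()]

# Hypothetical examples (not real data).
blood_groups = ["O", "A", "O", "B", "O", "AB"]   # nominal
satisfaction = [1, 2, 2, 3, 1, 2, 2]             # ordinal: 1 = happy, 2 = neutral, 3 = unhappy
weights_kg = [61.5, 70.2, 58.0, 82.4]            # ratio

most_common_group = statistics.mode(blood_groups)       # mode suits nominal data
median_satisfaction = statistics.median(satisfaction)   # median suits ordinal data
mean_weight = statistics.mean(weights_kg)               # mean suits interval/ratio data

print(summaries_for("ordinal"))
print(most_common_group, median_satisfaction, round(mean_weight, 2))
```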
vi. Sources of data and sources of health data
The main sources of health statistics are:
a) Surveys
b) Administrative and medical records
c) Claims data
d) Vital records
e) Surveillance
f) Disease registries
g) Peer-reviewed literature.
We’ll take a look into these sources, and the pros and
cons of using each to create health statistics.
Sources of data and sources of health data
Surveys
• Surveys are an important means of collecting health and social science information from a sample of
people in a standardized way to better understand a larger population. There are many methods used to
conduct surveys, including questionnaires and in-depth interviews via phone, mail, email, and in-person.
• Survey research allows researchers to collect empirical data in a relatively short period of time.
Depending on the design and scope, surveys can collect data on a representative sample of people
Medical records
Medical records are used to track events and transactions between patients and health care providers. They
offer information on diagnoses, procedures, lab tests, and other services. Medical records help us
measure and analyze trends in health care use, patient characteristics, and quality of care.
Claims data or administrative data
Claims data, also known as administrative data, are another sort of electronic record, but on a much bigger
scale.
The good thing about claims data is that, like other medical records, they come directly from notes made by
the health care provider, and the information is recorded at the time the patient sees the doctor.
Vital records
Vital records are collected by the National Vital Statistics System, and are maintained by state and local governments. Vital
records include births, deaths, marriages, divorces, and fetal deaths. They also record information about the cause of
death, or details of the birth.
Sources of data and sources of health data cont...
Surveillance
Surveillance is the ongoing systematic collection, analysis, and interpretation of data, closely
integrated with the timely dissemination of these data to those responsible for preventing and
controlling disease and injury. Surveillance activities are usually associated with the study of
infectious diseases.
Disease registries
• Registries are systems that allow people to collect, store, retrieve, analyze, and
disseminate information about people with a specific disease or condition. Disease
registries let researchers estimate how large a health problem is, determine the
incidence of the disease, study trends over time, and evaluate the effects of certain
environmental exposures. Registries provide information to improve the quality and
safety of care, and allow for comparison of effective treatment.
• Registries are kept by governments, hospitals, universities, non-profits, and private
groups. They store data from hospital records, lab reports, and other sources.
Because clinical data is sent securely to registries from the various points of care
that a patient may receive, registries allow the possibility to track and better
understand rare diseases
vii. Data Collection Methods
1) Questionnaires/Surveys
• A good way of gathering a lot of data; provides a broad perspective.
• Can be administered electronically, by mail or face to face.
• Mail/email surveys have a wider reach and are relatively cheap to administer;
information is standardized and privacy can be maintained.
• Prone to low response rates and limited depth, not appropriate among the
illiterate, and do not allow for any observation.
• Chance of reporting bias, particularly on sensitive issues or where disclosure
on trust is required.
• Piloting on a sample of the target group is required to check validity and
appropriateness for the target group.
2) Interviews
• Used when you want to understand impressions and experiences in more detail and be
able to expand or clarify responses
• Interviews can be conducted face-to-face or by telephone.
• Range from in-depth, semi-structured, unstructured and structured to key informant
interviews, depending on the information being sought.
Face-to-Face Interview:
Advantages
i) Detailed questions can be asked
ii) Probing can be done to provide rich data
iii) Literacy of participants is not an issue
iv) Non-verbal data can be collected through observation
v) Complex and unknown issues can be explored
vi) Higher response rates
Disadvantages
i) They can be expensive and time consuming
ii) Interviewer training is required to reduce bias
iii) Prone to interviewer and interpreter bias
iv) Sensitive issues may be challenging
Telephone Interviews
Advantages
i) Cheaper and faster to conduct than face-to-face interviews
ii) Use fewer resources than face-to-face interviews
iii) Allow the interviewer to clarify questions
iv) Do not require literacy skills
Disadvantages
i) Repeated calls are needed if first calls are not answered
ii) Potential bias if call-backs (follow-ups) are not made to those absent
iii) Only suitable for short surveys
iv) Only accessible to the population with a telephone
v) Not appropriate for exploring sensitive issues
3) Observations
• Information is gathered about a program as the program’s activities occur
• Data is collected by direct or indirect participation in program activities
• The evaluator better understands the context in which measures are undertaken, and
directly experiencing program implementation enables the evaluator to 'feel at home'
with a given issue
• A trained evaluator may also perceive phenomena that escape others' attention because
they seem obvious, as well as issues that participants do not raise in
interviews (such as conflicts or sensitive matters)
• Observation enables the evaluator to go beyond participants' selective perception
• It makes it possible to present a versatile picture of the program that would not be possible
using only questionnaires and interviews.
Advantages of Observation
• Gives relatively more accurate data on behavior, attitude and activities being performed.
• It is the best method to obtain first hand information
• It gives more relevant and accurate information
• Data collected is highly reliable
• It is relatively cheaper
Disadvantages of Observation
• The presence of the investigator may cause the performer or respondent to behave differently.
• It does not automatically produce accurate data
• The results of observations may be different under different conditions
4) Document Analysis (Document Review)
a. Used when program documents or literature are available and can provide insight into the program
or evaluation
b. Advantages include:
i. Data already exists
ii. Does not interrupt the program
iii. Little or no burden on others
iv. Can provide historical or comparison data
v. Introduces little bias
c. Disadvantages include:
i. Time consuming
ii. Data limited to what exists and is available
iii. Data may be incomplete
iv. Requires clearly defining the data you are seeking
d. Examples of documents that can be used as data sources include:
i. Reports
ii. Brochures, Pamphlets, Posters, Flyers
iii. Logs, Registers, Diaries
iv. Minutes of meetings
v. Memos, Notice boards, Dash boards
vi. Patient records
5) Focus Group Discussions
• Focus groups or group discussions are useful to further explore a topic, providing a
broader understanding of why the target group may behave or think in a particular
way, and assist in determining the reason for attitudes and beliefs.
• Conducted with a small sample of the target group and are used to stimulate
discussion and gain greater insights.
Advantages
i. Useful in exploring cultural values and health beliefs
ii. Useful in examining how and why people think in a particular way and how it
influences their beliefs and values
iii. Can be used to explore complex issues
iv. Can be used to develop hypotheses for further research
v. Do not require participants to be literate
vi. Allows in-depth discussion
Disadvantages
i. Lack of privacy (confidentiality)
ii. Need to balance the group to ensure cultural and gender appropriateness
iii. Risk of ‘groupthink’ rather than individual views
iv. Risk of group domination by a few individuals
v. Can be time consuming to conduct and the data are not easy to analyse
6) Laboratory testing
a. Precise measurement of specific objective phenomenon e.g. weight, water quality
7) Key Informant Interviews
viii. Data collection tools/technologies
1) Paper tools e.g.
• Checklists
• Questionnaires
• Interview guides/schedules
• Meeting minutes
• Registers
• Reports e.g.
i. Surveillance
ii. Kenya Demographic and Health Survey
iii. Census
• Patient records
• Tally sheets
2) Electronic Data Capture
• Smart phones
• Computers
i. Audio Computer Assisted Self Interview (ACASI)
ii. Computer Assisted Personal Interview (CAPI)
iii. Computer Assisted Self Interview (CASI)
• Electronic Medical Records/databases
i. C-PAD
ii. IQ-Care
iii. Fan soft
iv. DATIM
v. Kenya EMR
vi. DHIS
• Cameras
• Voice/video recorders
• Kenya Open Data Portal
• Open Data Kit (ODK)
i. A free and open-source set of tools used to manage mobile data collection
solutions. ODK can be used to:
– Design data collection forms
– Collect data on a mobile device and send it to a server
– Aggregate data on a server and extract it in various formats
• Personal Digital Assistants (PDAs)
i. Handheld devices that combine computing, telephone/fax, Internet
and networking features
• Research Electronic Data Capture (REDCap)
i. Free, secure, web-based application designed to support data capture for
research studies
Rules/principles for constructing a questionnaire
• List the objectives that you want the questionnaire to accomplish. This will help in
writing the questions, since questions are related to objectives.
• Clarity is essential. Terms like “several” and “most” have no precise
meaning and should therefore be avoided.
• Questions should be focused and limited to a single idea. Short questions should
be used as they are easier to understand, e.g. “Have you gone for VCT?” followed
by “How often?”
• Double-barreled questions should be avoided. These are questions that ask about more
than one thing yet allow for only one answer, e.g. “Do you think that students should have
more classes about history and culture?” or “What motivates your work: pleasant work
and nice co-workers?”
• Leading or biased questions should be avoided, e.g. asking “Is he a boy?” about a child
whose gender you can see for yourself, or “Were you at the KCs bar on the night of
the 15th?”
• Very personal and sensitive questions should be avoided as the respondent may be
dishonest in answering them.
• Simple words that are easily understood should be used.
Difficult, unfamiliar words will discourage the
respondent. Avoid jargon, shorthand etc.
• Questions that assume facts with no evidence should be
avoided. Such questions offend and discourage the
respondent, e.g. asking a mother whether she can afford formula milk,
judging by her social class. This may
discourage her from responding. Another example: “You got HIV because you were
unfaithful.”
• Avoid psychologically threatening questions, e.g. asking a
mother, “Are you afraid that your child who is HIV positive is
about to die?”
Essentials of a good questionnaire
• Should be comparatively short.
• Questions should proceed in a logical sequence,
e.g. from simple to difficult questions.
• Personal and intimate questions should be left
to the end.
• Questions may be dichotomous (yes/no
answers) or multiple choice.
• There should be provision for indicating
uncertainty, i.e. ‘don’t know’.
Steps for designing a questionnaire
• List the objectives and variables that you want
to include in your questionnaire.
• List down the questions.
• Do the pilot study.
• Evaluate the questionnaire.
Interview Schedule
• This is a data collection tool which is very much like a
questionnaire. The main difference is that the tool is filled in
by an enumerator/investigator/researcher/interviewer
who is specially appointed for the purpose.
• N/B: In certain situations the tool may be handed over to
the respondent and the enumerator helps in recording the
answers.
• The enumerator explains the aims and also clears up any
difficulties the respondent may have in understanding
the definitions or concepts of difficult terms.
Advantages of Interview Schedule
• More information, and in greater depth, can be obtained.
The enumerator has a chance to probe for more
information.
• There is greater flexibility as there is an opportunity to
restructure questions.
• Personal information can be obtained more easily.
• There are no missing returns and non-response is low.
• There is control over who will answer the questions.
• The language of the interview can be adapted to the
ability or educational level of the respondent, so
misinterpretation of questions can be avoided.
Disadvantages of Interview Schedule
• A communication barrier may arise between the interviewer and
the respondent.
• The schedule of the interviewer may not coincide with that of the
respondent.
• It’s time consuming, as the interviewer can spend a lot of time on one
respondent.
• There is a tendency towards bias (interviewer bias).
• The respondent may turn the interviewer away,
e.g. a young person interviewing an old person about his sex
life.
• Very expensive, especially when a large and geographically widespread
sample is taken.
• Certain types of respondents, such as VIPs, may not be approachable under
this method.
Types of interview schedule
• Observation schedule: a schedule with questions which guide an
observer systematically.
• Rating schedule: a set of questions that helps guide a psychologist or sociologist
to measure the attitude and behaviour of an individual.
• Survey schedule: formulated to guide a surveyor in collecting
information.
• Interview schedule: a set of questions with structured answers to guide an interview.
Principles of interview schedule design
Open-ended interviews
1. The first question must confirm that the potential respondent is aware of the purposes
and the scope of the interview, the amount of time it will take, and the way the
responses will be used e.g addressing confidentiality issues.
2. Address whether or not the responses will be recorded. Recording responses has the
advantage that all material is captured but the disadvantage that many respondents
may speak less freely if they are being recorded. Obtain consent before recording.
3. It must contain questions that will answer the research questions.
4. It must not contain questions that are not related to the research questions.
5. Each question must address a single issue.
6. Topics that might be covered include:
a. Demographic questions: age, education, position (occupation); gender can be
observed.
b. Knowledge- questions about what a person knows about a specific topic.
c. Behaviour- questions about what a person does in general or has done on a
specific occasion and/or what a person plans to do in future.
d. Opinions or values- questions as to what a person thinks about a topic.
e. Feelings- questions about how a person feels about an issue.
f. Sensory- questions as to what a person has seen, touched, heard, tasted or
smelt.
Benefits of interview schedule
• The interviewer is present and can establish his/her credentials and
develop rapport with the respondent.
• In a face-to-face interview, the interviewer can use visual aids to
illustrate points or identify issues he/she is addressing. These aids
include photographs, diagrams and physical models.
• If the respondent does not understand a question, the interviewer can
interpret it.
• The respondent is not provided with potential responses, so the
respondent’s genuine views are obtained.
• The interviewer can notice that a respondent is distressed and either
terminate the interview or take steps to reassure the respondent.
• The data are complete as the interviewer can check that all issues have
been covered and all relevant demographic data are recorded.
Checklist
• This is a schedule containing a set of
questions which are filled in by the researcher or
enumerator as he/she directly observes things
of interest around him/her or asks the
respondent.
• The checklist is systematically planned and includes
all the items or points that must be considered
during observation in the field or when extracting
data from existing records, e.g. observing the
functionality of a ventilated pit latrine.
Advantages of checklist
• Subjective bias is eliminated if observation is
done accurately.
• It’s less demanding of active cooperation on
the part of the respondent.
• The information obtained under this method
relates to what is currently happening or what
recently happened.
Disadvantages of checklist
• It’s expensive in terms of labour.
• Information provided by this method is very
limited.
• Sometimes unforeseen factors may interfere
with the observation task.
Focus group discussion guides
• This is a type of tool that collects data from a focus
group discussion. The groups are usually composed of
homogeneous members of the target population, for
instance similar in age, education level, gender,
profession etc.
• Questions used are always open-ended.
• Participants are randomly selected. To obtain meaningful
information, a highly skilled and trained facilitator must
guide the group, taking care not to lead it in a
predetermined direction.
• A group has 6-8 participants.
Advantages of focus group discussion guide
• The researcher can interact with the participants,
pose follow-up questions or ask questions about
the situation.
• Results can be easier to understand than
complicated statistical data.
• The researcher can get information from non-verbal
responses, such as facial expressions or body
language.
• Information is provided more quickly than if people
were interviewed separately.
Disadvantages of focus group discussion guide
• The small sample size means the groups might not
be a good representation of the larger population.
• Group discussions can be difficult to steer and
control, so time can be lost to irrelevant topics.
• Respondents can feel peer pressure to give similar
answers to the moderator’s questions.
• The moderator’s skills in phrasing questions, along
with the setting, can affect the responses and skew
results.
• No guarantee of confidentiality.
Ways of transforming data into information
• By bringing related pieces of data together.
• Summarizing data.
• Tabulation and diagrammatic techniques.
• Statistical analysis.
• Final report.
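As a small illustration of summarizing and tabulating, the sketch below turns a raw list of diagnoses (data) into a frequency table with percentages (information). The diagnoses are invented for illustration and the variable names are hypothetical.

```python
from collections import Counter

# Hypothetical raw data: one diagnosis per outpatient visit (not real records).
visits = ["Malaria", "Pneumonia", "Malaria", "Diarrhoea", "Malaria",
          "Pneumonia", "Diarrhoea", "Malaria", "Typhoid", "Malaria"]

# Bring related pieces of data together by counting each diagnosis.
counts = Counter(visits)
total = len(visits)

# Summarize as (diagnosis, frequency, percentage) rows, most frequent first.
table = [(dx, n, 100 * n / total) for dx, n in counts.most_common()]

# Tabulate the summary.
print(f"{'Diagnosis':<12}{'Cases':>6}{'Percent':>9}")
for dx, n, pct in table:
    print(f"{dx:<12}{n:>6}{pct:>8.1f}%")
```

The raw list says little on its own; the table shows at a glance which condition accounts for most visits.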
ix. Data Analysis
• Data analysis is the process of evaluating data using analytical and logical reasoning to
examine each component of the data provided
• Data analysis refers to the process of inspecting, cleaning, transforming and
modelling data with the goal of discovering useful information, suggesting conclusions and
supporting decision-making
• Data analysis helps in obtaining usable and useful information. Data analysis may:
i. Describe and summarise the data
ii. Identify relationships between variables
iii. Compare variables
iv. Identify the difference between variables
v. Forecast outcomes
• Data analysis must be planned for before data collection
• Plan data analysis as follows:
i. Consider the purpose of your evaluation. How useful is the data in understanding and
improving the program?
ii. Decide who will analyse the data. He/she must have training and experience in the
analysis procedures and software to be used
iii. Develop a database management system to collect, organize and store data
iv. Plan for data cleaning procedures
v. Obtain data analysis software
vi. Analysing Quantitative Data
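A data cleaning procedure, as mentioned in the plan above, can be as simple as checking each record against a few rules before analysis. The following is a minimal sketch; the records, the field names, the age limits and the sex codes are all assumptions made for illustration.

```python
# Hypothetical survey records (not real data).
records = [
    {"id": "P001", "age": 34, "sex": "F"},
    {"id": "P002", "age": None, "sex": "M"},
    {"id": "P003", "age": 210, "sex": "F"},
    {"id": "P004", "age": 58, "sex": "X"},
]

def find_problems(record):
    """Return a list of problems found in one record (empty if clean)."""
    problems = []
    if record["age"] is None:
        problems.append("missing age")
    elif not 0 <= record["age"] <= 120:      # assumed plausible age range
        problems.append("age out of range")
    if record["sex"] not in {"F", "M"}:      # assumed coding scheme
        problems.append("unrecognised sex code")
    return problems

# Records needing follow-up, keyed by record id.
flagged = {r["id"]: find_problems(r) for r in records if find_problems(r)}
# Records that passed every check.
clean = [r for r in records if not find_problems(r)]

print(f"{len(clean)} clean record(s), {len(flagged)} flagged")
for record_id, problems in flagged.items():
    print(record_id, "->", ", ".join(problems))
```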
Survey
• Goal: Increase skills and knowledge on concepts and
procedures of conducting a community survey
• Specific objectives:
– Understand the terms used in community health survey,
reasons and types of survey
– Describe the process of carrying out a survey, biased sample
and response rate in relation to survey
– Outline points to note in constructing a survey tool, causes
of inaccuracy and report presentation
– Describe evaluation and feedback in relation to
community survey
Definitions
• A survey is defined as a brief interview or discussion
with individuals about a specific topic.
• A community survey is a way of asking group or
community members what they see as the most
important needs of that group or community.
• A questionnaire is a research instrument consisting of a
series of questions and other prompts for the purpose of
gathering information from respondents.
• Evaluation is a systematic determination of a subject's
merit, worth and significance, using criteria governed by
a set of standards.
The following are reasons for carrying out a survey:
• Gather information about residents, their opinions, attitudes
and knowledge
• Measure behaviours and population characteristics
• Solicit community reaction to policies, proposals and
solutions
• Let residents assess the effectiveness of programs, facilities and
services
• Make residents aware of problems and their effects
• Provide residents access to the policy-making process
• Provide an opportunity for communities to influence public
decisions
Process of planning and conducting a survey:
• Decide on your goals
– Before you can start your research, you will need to form a clear picture in your mind
of the expected outcome.
– Identify the overall reasons for conducting the survey.
– What will you know after you have conducted the survey? Do you need feedback on
a product or your service? Is the information you are looking for of a general nature
or very specific? Do you have a particular audience in mind, or will you be sending
out online surveys to the general public?
– The answers to these questions will help you to decide how to target your survey.
– Determine the deadline for reporting the data the survey will collect. This will ensure
you complete the survey on time.
• Design the methodology for conducting the survey
– Decide the procedures for conducting the survey: the number of people you will
survey (sample size) and how you will survey them (self-administered questionnaire or
interview).
Process of planning and conducting a survey cont.…
 Create a list of questions

• Identify the questions you want the data to answer.
• There are many different types of questions that can be used on a survey, like open questions, closed
questions, matrix table questions, and single- or multi-response questions.
• One of the benefits of designing an online survey is that participants don't have to fill in questions that are
not relevant to them. Based on their answers certain questions can be skipped.
• Develop the tools and pretest them
 Invite the participants

• Who you want to take part in your survey will help you to decide on the best contact method.
• Identify the study respondents and enumerators
• You can send an email or call the participants
• Train them and assign them responsibilities
 Gather your responses

• Determine when to start the data collection
• It is important to assign a unique code to each participant to make sure that they only take part in your
survey once.
• Distribute survey forms. As they are returned, track the number completed.
Process of planning and conducting a survey cont.…
 Analyze the results

• Analyze your data according to your objectives.
• Make visual representations of the data by presenting the results in tables and graphs.
• You can also print out the data in the form of a spreadsheet, which can then be used
for further analysis.
• There are several software packages available for data analysis (Epi-info, SPSS, SAS,
STATA, RA)
• Present the analyzed report in Narratives, Graphs, Charts, Tables or Pictures
 Write a report
• The final step in conducting online surveys is to write a report explaining your findings.
• A successful survey will provide the answers to the questions you had about your
business, product or service.
• Use proper report writing layout
• Provide summary of your results
Points to be considered when constructing a survey tool
• Open ended questions with probes

• Simple and brief questions
• Avoid closed questions
• Avoid leading questions
• Start with most easy questions
• Make personal questions last
• Each question to address one issue
• When writing the questions, keep the language very simple and avoid
ambiguity or double negations.
• Make questions specific
• Ask questions in a logical order
• Construct response categories carefully – let it not be too long
• Provide clear and sufficient instructions or directions including reasons for the
survey
Errors leading to inaccuracy when constructing a survey tool
• Unclear objectives
• Questions not matching the survey objectives
• Questions are ambiguous
• Tool doesn’t provide adequate space for data
entry
• Questions are not well numbered
Processes of survey feedback:
– Data Collection - The first step in survey feedback is data collection,
usually by a consultant, based on a structured questionnaire.
– Feedback of Information - After the data are analyzed, feedback is
given to the persons who participated in filling in the
questionnaire. The feedback may be given either orally or in
written form. Oral feedback is provided through
group discussion or problem-solving sessions conducted by the
consultant.
– Follow-up Action - Survey feedback programme is not meaningful
unless some follow-up action is taken based on the data collected.
One such follow-up action may be to advise the participants to
develop their own action plans to overcome the problems
revealed by the consultant.
ix. Data editing
It’s the examination of the collected data in order
to find out mistakes, errors and omissions. The
aspects of accuracy, approximation and errors are
analyzed.
Data editing is the process of reviewing the data for
consistency, detection of errors and outliers (values
that are extremely larger or smaller than the rest of
the data) and correction of errors, in order to
improve the quality, accuracy and adequacy of the
data and make it suitable for the purpose
Objectives of data editing
 Detect errors that would affect the validity
of outputs
 Detect inconsistent values and outliers and
adjust them.
 Provide information enabling assessment of
the overall level of accuracy of the data
 Validate the data for the purposes it was
collected for
Types of data editing
1. Validity and completeness of data
The validity of the data refers to the correctness of the responses obtained, based on the
possible range of answers for each variable. This type of checking verifies validity by
ensuring that there are no non-numerical answers in fields meant for numerical answers,
and vice versa
(see Table 1: column 1, row 2: the answer is textual although the entire table is
numerical). Data completeness means ensuring that all of the fields have been filled
and that there are no inappropriate missing values for any fields (see Table 1: column 3,
row 3: no response).
Types of data editing cont…
2. Range
The range sets the minimum and maximum expected values of the variable. In
this type of editing, the items on the questionnaire are individually checked
(the questionnaire is the data collection tool, and includes all the forms used to
record or collect the data) to verify that data in a given field are within the
boundaries specified for that field
3. Duplicate data entry

Verifying that the data of each unit of the register or the database was entered
only once, with no duplication, especially when there are variations in some
index fields of the unit within the record
Types of data editing cont.…
4. Outliers
This type of editing follows other checks and is used for the
detection of extreme values, based on the distribution of the
current data and previous data series, which makes it easier to
detect the values that can be considered unusual or extreme, so
that they can be checked and verified. (See Table 7: income of
employee 2 and employee 6)
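The four types of editing checks above can be sketched in a few lines of Python. This is only an illustrative sketch: the records, the field names, the age range of 0 to 120 and the use of the 1.5 x IQR rule for flagging outliers are all assumptions made for the example, not part of the course material.

```python
from statistics import quantiles

# Hypothetical survey records; field names and values are illustrative only.
records = [
    {"id": 1, "age": 34, "income": 42000},
    {"id": 2, "age": "thirty", "income": 39000},  # text in a numerical field
    {"id": 3, "age": 29, "income": None},         # missing response
    {"id": 4, "age": 151, "income": 41000},       # outside the expected range
    {"id": 5, "age": 45, "income": 40000},
    {"id": 5, "age": 45, "income": 40000},        # entered twice
    {"id": 6, "age": 38, "income": 950000},       # extreme value
    {"id": 7, "age": 51, "income": 43000},
]


def is_number(value):
    """True for int/float values (booleans are not treated as numbers)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validity_and_completeness(rows, fields):
    """1. Return (id, field) pairs whose value is missing or non-numerical."""
    return [(row["id"], f) for row in rows for f in fields if not is_number(row[f])]


def out_of_range(rows, field, low, high):
    """2. Return ids whose numerical value falls outside [low, high]."""
    return [row["id"] for row in rows
            if is_number(row[field]) and not (low <= row[field] <= high)]


def duplicate_ids(rows):
    """3. Return ids that were entered more than once."""
    seen, duplicates = set(), []
    for row in rows:
        if row["id"] in seen:
            duplicates.append(row["id"])
        seen.add(row["id"])
    return duplicates


def outliers(rows, field):
    """4. Flag values beyond 1.5 x IQR from the quartiles (an assumed rule)."""
    values = [row[field] for row in rows if is_number(row[field])]
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    return [row["id"] for row in rows
            if is_number(row[field]) and not (low <= row[field] <= high)]


invalid = validity_and_completeness(records, ["age", "income"])
bad_ages = out_of_range(records, "age", 0, 120)
repeated = duplicate_ids(records)
extreme_incomes = outliers(records, "income")
```

Flagged values should be checked against the source documents before they are corrected, as the text notes for outliers.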
Importance of data editing
• It helps to maximize the usefulness of data,
making it imperative to ensure that the data
used is free of the errors arising during their
collection or entry, and is coherent and
consistent.
Accuracy
 It involves describing a phenomenon exactly as it is.
 Absolute or perfect accuracy cannot be obtained therefore in statistics
relative accuracy is required and not absolute.
 The degree of accuracy depends on the nature and purpose of inquiry and
also on the materials of measurement. E.g when measuring the height of
men and women it should be accurate up to inches or centimeters and
while measuring the length between two cities it should be accurate up to
kilometer.
Approximation
 This is the basis of rounding off the figures with a view to simplify them
thus affecting the standard of reasonable accuracy.
 Rounding means making a number simpler but keeping its value close to
what it was. The result is less accurate, but easier to use.
xi. Data coding
• A systematic way in which to condense
extensive data sets into smaller analyzable
units through the creation of categories and
concepts derived from data.
• When data have been collected in research
they have to be edited and coded in a numerical
form ready to be summarized in tables, charts and
diagrams or grouped into frequencies before
calculations are made.
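As a small illustration of coding responses into numerical form and grouping them into frequencies, consider the sketch below. The codebook (1 = male, 2 = female, 9 = missing or unrecognised) and the responses are hypothetical and chosen only for the example.

```python
from collections import Counter

# Hypothetical codebook: the numerical codes are assumptions for illustration.
CODEBOOK = {"male": 1, "female": 2}
MISSING_CODE = 9  # assumed code for blank or unrecognised answers

responses = ["Female", "male", "FEMALE", "", "Male", "female"]


def code_response(answer):
    """Convert a free-text answer to its numerical code."""
    return CODEBOOK.get(answer.strip().lower(), MISSING_CODE)


coded = [code_response(answer) for answer in responses]

# Group the coded values into frequencies, ready for tabulation.
frequencies = Counter(coded)
```

Keeping the codebook in one place means the same codes are applied consistently to every questionnaire.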
Levels of Data Coding
• Open
 Break down, compare and categorize data.
• Axial
 Make connections between categories after open
coding
• Selective
 Select the core category, relate it to other
categories and confirm the explanation of those
relationships.
Why Data coding
• It lets you make sense of and analyze your data.
• For qualitative studies it can help you generate
a general theory.
• The type of statistical analysis you can use
depends on the type of data you collect, how you
collect it and how it is coded.
• Coding facilitates the organization, retrieval and
interpretation of data and leads to conclusions
on the basis of that interpretation
Measurements of errors
 Errors
• The word error has a special meaning in statistics. We
can distinguish between mistake and error.
• A mistake means an incorrect presentation due to human
factors. It can occur in the collection of the data. E.g. the
respondent may have mistakenly ticked the ‘yes’ box
instead of the ‘no’ box.
• An error means the difference between the actual
figure and the estimated figure. The deviation is just by chance and is not
due to human carelessness.
2. Sampling processes (8hrs) (4 lessons)
 Definition of terms:
 Population, Sample, Sampling, Sample size
Determination, probability and Non-
probability sampling methods.
Population
• In statistics, a population is the entire pool
from which a statistical sample is drawn. A
population may refer to an entire group of
people, objects, events, hospital visits, or
measurements
• The population is the entire group that you
want to draw conclusions about.
Sample
• The sample is the specific group of individuals that you will collect
data from
• When the sample is selected from a population, the units
must be selected at random.
• A sample is a portion, piece or segment that represents the whole
(population).
• Under random selection, every unit in the whole population must
have an equal chance of being selected. The units selected are chosen purely by chance.
If the units are not taken at random,
bias will occur and undue importance will be given to some units.
In that case the sample will not be fair and representative, e.g. if
only the intelligent students in a class are selected for a study,
the sample will not be representative of the performance of the class.
Sampling
• The process of selecting a number of
individuals for a study in such a way that the
selected individuals represent the large group
from which they were selected.
• The purpose of sampling is to secure a
representative group which will enable the
researcher to gather information about a
population.
Advantages of sampling method
• As only a small part of the whole population is
studied, it is cheaper to collect the data.
• The data are collected and analyzed more
quickly, thus sampling saves a lot of time.
• Since only a part of the whole population is to
be studied a good quality of labour with better
supervision can be provided.
• An investigation of a small part of the
population gives us more detailed information.
Sampling frame
• Sampling frame (synonyms: "sample frame",
"survey frame") is the actual set of units from
which a sample has been drawn: in the case of a
simple random sample, all units from the sampling
frame have an equal chance to be drawn and to
occur in the sample.
• The sampling frame is the actual list of individuals
that the sample will be drawn from. Ideally, it
should include the entire target population (and
nobody who is not part of that population).
Sample size
 The number of individuals you should include
in your sample depends on various factors,
including the size and variability of the
population and your research design. There
are different sample size calculators and
formulas depending on what you want to
achieve with statistical analysis.
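One widely used formula for estimating a proportion is Cochran's formula, n0 = z² p(1 − p) / e², with an optional correction for a finite population. The sketch below is only an illustration: the choice of this formula and the default values (95% confidence so z = 1.96, p = 0.5, margin of error e = 0.05) are assumptions, not a method prescribed by this module.

```python
import math


def cochran_sample_size(z=1.96, p=0.5, e=0.05, population=None):
    """Estimate a sample size for a proportion using Cochran's formula.

    z: z-score for the confidence level (1.96 for about 95%)
    p: expected proportion (0.5 gives the largest, most cautious sample)
    e: acceptable margin of error
    population: if given, apply the finite population correction
    """
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)  # always round up to a whole person


large_population_n = cochran_sample_size()
finite_population_n = cochran_sample_size(population=10000)
```

With these assumed defaults the result is 385 for a very large population and 370 when the population is 10,000.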
Types of sampling methods:
To draw valid conclusions from your results, you have
to carefully decide how you will select a sample that is
representative of the group as a whole. There are two
types of sampling methods:
1. Probability sampling involves random selection,
allowing you to make strong statistical inferences
about the whole group.
2. Non-probability sampling involves non-random
selection based on convenience or other criteria,
allowing you to easily collect data
1. Probability sampling
 Probability sampling means that every member of the
population has a chance of being selected. It is mainly
used in quantitative research. If you want to produce
results that are representative of the whole
population, probability sampling techniques are the
most valid choice.
 There are four main types of probability sample ;
i. Simple random sampling
ii. Systematic random sampling
iii. Stratified sampling.
iv. Cluster Sampling
1. Simple random sampling
• The name comes from the fact that no complexities are involved.
• Random expresses the idea of chance being the only criterion for selection.
• It’s therefore a sampling procedure that provides an equal opportunity of
selection to each element in a population. All that is needed is a clearly defined
population in a sampling frame (boundaries should be defined).
• Random samples are satisfactory when the population is homogeneous
(uniformity of certain characteristics)
• The main objective of the simple random is to eliminate any form of bias in
the selection and to obtain a representative sample.
• There are various techniques for selecting randomly. The most common is the
lottery technique, where a symbol for each unit of the population is placed in a
container and mixed well, and then the required number of symbols is drawn; the units drawn constitute
the sample.
• A more sophisticated method particularly used for large populations is the
use of random number tables. These tables are mathematically prepared so
that numbers are written in a random way and therefore each item has an
equal probability of being selected.
Example of Simple random sampling
 Suppose you want to investigate the socio-economic status of patients
with diabetes in Kisumu county, here are the steps to follow;
i. Get all the information on the total number of people with this
condition (sampling frame). Such information can be obtained from
existing health records or previous investigation.
ii. Decide on your sample size
iii. Give a number to each patient in the location within Kisumu County.
iv. Write those numbers on pieces of paper: fold them properly making
sure that the numbers cannot be seen.
v. Put all the papers in the basket and shake them properly.
vi. Pick any of the papers at random, repeat several times until you
reach your sample size.
vii. You will then go to the location and interview those patients whose
numbers you have picked.
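The lottery steps above can be imitated in Python with the standard `random` module. This is a minimal sketch: the sampling frame of 500 numbered patients, the sample size of 50 and the seed are hypothetical values chosen for illustration.

```python
import random


def simple_random_sample(frame, sample_size, seed=None):
    """Draw sample_size units from the sampling frame without replacement.

    Every unit in the frame has an equal chance of being picked,
    like drawing folded papers from a well-shaken basket.
    """
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(frame, sample_size)


# Hypothetical frame: patients numbered 1 to 500.
sampling_frame = list(range(1, 501))
selected_patients = simple_random_sample(sampling_frame, 50, seed=1)
```

The numbers in `selected_patients` identify the patients to be interviewed.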
2. Systematic random sampling
• This type of sampling is very similar to simple random sampling.
Instead of relying on random number
tables, selection is based on picking elements at equal
intervals, starting at a randomly selected element on the
population list.
• To draw a systematic random sample, it’s necessary to compute the
length (k) of the sampling interval, which is determined by the
ratio of the total population (N) to the desired sample size (n).
– K=N/n
– Where;
• K=interval length
• N=total population
• n=sample size
Example of Systematic random sampling
 While drawing the samples in your study in Kisumu County, the following steps will be
followed;
• Determine the affected population. Here the number of people affected is 10,000 (N).
• List the population (in our case diabetes cases) from 1 to 10,000. While listing you have to
make sure that:
• Specific characteristics do not recur at a regular interval.
• Each number corresponds to a specific case in the population (to eliminate
biased representation)
• Determine the sample size, e.g. 1,000 (n)
• Divide the population by sample size, this is the interval which you will use while picking
your samples from the population.
• K = N/n
• K = 10,000/1,000
• K = 10 (the sampling interval)
• Determine the starting point from which you will start picking every 10th item, e.g. using
simple random sampling. If the randomly selected starting point is number 3 and K is 10, then every 10th item after
number 3 is picked, so the samples we come up with are 3, 13, 23, 33, etc.
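The same procedure can be sketched in Python. The function name and its parameters are hypothetical; the sketch assumes the population is numbered from 1 to N and that N is a multiple of n, as in the example (N = 10,000, n = 1,000).

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return (k, selected numbers) for a systematic random sample.

    k = N // n is the sampling interval. The starting point is chosen
    at random from the first k units unless one is supplied.
    """
    k = population_size // sample_size
    if start is None:
        start = random.Random(seed).randint(1, k)
    selected = [start + i * k for i in range(sample_size)]
    return k, selected


# Worked example from the text: N = 10,000, n = 1,000, starting at number 3.
interval, picks = systematic_sample(10000, 1000, start=3)
first_four = picks[:4]

# Same design with a randomly chosen starting point.
_, random_start_picks = systematic_sample(10000, 1000, seed=5)
```

Here `interval` is 10 and `first_four` is `[3, 13, 23, 33]`, matching the worked example.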
Advantages of systematic random sampling
• Easy to organize.
• More precise than simple random sampling
and more evenly spread over population.
• Data analysis is simple to apply and has a
sound mathematical basis.
• Selection bias is reduced.
Disadvantages
• There is no guarantee that the behaviour of
these people represents the behaviour of the
other groups.
3. Stratified sampling.
• This is a method of obtaining a sample from a population when distinct
groups (strata) of a population can be identified. The principle of this
sampling is to divide a population into different groups called strata such
that each element of the population belongs to one and only one stratum.
• The population divided into groups should be in such a way that units
within each group are as similar as possible. The groups should be
homogeneous.
• The composition of groups can be for instance different tribes, religions,
gender, socio-economic groups, occupation, age, income groups etc.
• The chosen variable should be one that results in internally
homogeneous strata. Then, within each stratum, random sampling is
applied using either the simple or the systematic random method to choose the
sample.
• Stratified sampling can be proportional or non-proportional. In
proportional sampling the participants are chosen in proportion to the
number in each group. Non-proportional occurs when the response
Examples of Stratified sampling
Steps
1. Obtain information about the total population of Kisumu town, which is 100,000, and the
population of each location (Kisumu East 30,000 and Kisumu West
70,000).
2. Obtain the proportional representation of each location. This is worked out in the form of a
ratio as follows:
• Kisumu East = 30,000/100,000
= 3/10
• Kisumu West = 70,000/100,000
= 7/10
3. The denominator ten is common to both locations and therefore we shall say that the
proportional representation is 3:7. This means that for every 3 people in Kisumu
East location there are 7 people in Kisumu West location.
4. If the sample size is 1000, apply the same ratio 3:7 in which the population
occurs in order to get smaller sub-samples, which are 3/10 of 1000 (300) for Kisumu East
and 7/10 of 1000 (700) for Kisumu West. These smaller sub-samples are called stratified
samples.
5. After establishing the sub-samples, then apply simple random sampling to get
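The proportional allocation in steps 1 to 4 can be checked with a short Python sketch. The function name is hypothetical; the population figures are those of the example above.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Share the sample among strata in proportion to their population.

    Rounding can make the parts add up to slightly more or less than
    sample_size when the proportions are not exact.
    """
    total = sum(strata_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in strata_sizes.items()}


strata = {"Kisumu East": 30000, "Kisumu West": 70000}
sub_samples = proportional_allocation(strata, 1000)
```

For this example `sub_samples` gives 300 for Kisumu East and 700 for Kisumu West.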
4. Cluster sampling method
• Cluster sampling also involves dividing the population into
subgroups, but each subgroup should have similar characteristics
to the whole sample. Instead of sampling individuals from each
subgroup, you randomly select entire subgroups.
• If it is practically possible, you might include every individual
from each sampled cluster. If the clusters themselves are large,
you can also sample individuals from within each cluster using
one of the techniques above. This is called multistage sampling.
• This method is good for dealing with large and dispersed
populations, but there is more risk of error in the sample, as
there could be substantial differences between clusters. It’s
difficult to guarantee that the sampled clusters are really
representative of the whole population.
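A minimal Python sketch of one-stage cluster sampling follows. The village names and members are hypothetical; the point is that whole clusters are selected at random and every member of a selected cluster is included.

```python
import random


def cluster_sample(clusters, number_of_clusters, seed=None):
    """Randomly select whole clusters and keep all of their members."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), number_of_clusters)
    return {name: clusters[name] for name in chosen_names}


# Hypothetical clusters: villages and the households in each.
villages = {
    "Village A": ["A1", "A2", "A3"],
    "Village B": ["B1", "B2"],
    "Village C": ["C1", "C2", "C3", "C4"],
    "Village D": ["D1"],
}
selected_villages = cluster_sample(villages, 2, seed=7)
```

For multistage sampling, a further random sample could be drawn from within each selected cluster instead of keeping every member.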
Similarities between Cluster sampling and stratified sampling
• Both methods are examples of probability sampling
methods – every member in the population has an equal
probability of being selected to be in the sample.
• Both methods divide a population into distinct groups (either
clusters or strata).
• Both methods tend to be quicker and more cost-effective
ways of obtaining a sample from a population compared to a
simple random sample.
Differences between cluster sampling and stratified sampling
• Stratified sampling divides a population into
groups, then includes some members of all of
the groups.
• Cluster sampling divides a population into
groups, then includes all members
of some randomly chosen groups.
Multi-Stage Sampling
• To draw the sample, this method actually uses
a combination of various techniques. In this
method, the population is divided into groups
at various levels. A group within a group,
within a group and so on, the sample is finally
drawn from the smallest group among all the
groups.
2. Non-probability sampling
• In a non-probability sample, individuals are selected based on non-
random criteria, and not every individual has a chance of being
included.
• This type of sample is easier and cheaper to access, but it has a
higher risk of sampling bias. That means the inferences you can
make about the population are weaker than with probability
samples, and your conclusions may be more limited. If you use a
non-probability sample, you should still aim to make it as
representative of the population as possible.
• Non-probability sampling techniques are often used in exploratory
and qualitative research. In these types of research, the aim is not
to test a hypothesis about a broad population, but to develop an
initial understanding of a small or under-researched population.
Non-probability sampling cont...
• Unlike probability sampling method, non-
probability sampling technique uses
nonrandomized methods to draw the sample.
Non-probability sampling method mostly
involves judgment. Instead of randomization,
participants are selected because they are
easy to access. For example; your classmates
and friends have a better chance to be part of
your sample
Types of non-probability sampling
i. Convenience sampling
ii. Voluntary response sampling
iii. Purposive sampling
iv. Snowball sampling
v. Quota sampling
1. Convenience sampling
• A convenience sample simply includes the
individuals who happen to be most accessible
to the researcher.
• This is an easy and inexpensive way to gather
initial data, but there is no way to tell if the
sample is representative of the population, so
it can’t produce generalizable results.
Convenience sampling cont...
• In this type of sampling, researchers choose participants as per
their own convenience. The researcher selects the nearest available
persons as respondents. In convenience sampling, subjects
who are readily accessible or available to the researcher are
selected. For example, you will choose your classmates and
friends for the study as per your convenience.
• In other words, in this type of non-probability sampling
method, whoever meets the researcher qualifies to be
part of the sample. For example, people in the streets: to get
the questionnaire filled in, you take copies of your
questionnaire and stand at the corner of a street. You then give
the copies of the questionnaire to people passing by.
2. Voluntary response sampling
• Similar to a convenience sample, a voluntary
response sample is mainly based on ease of
access. Instead of the researcher choosing
participants and directly contacting them, people
volunteer themselves (e.g. by responding to a
public online survey).
• Voluntary response samples are always at least
somewhat biased, as some people will inherently
be more likely to volunteer than others
3. Purposive sampling
• This type of sampling, also known as judgement
sampling, involves the researcher using their
expertise to select a sample that is most useful to
the purposes of the research.
• It is often used in qualitative research, where the
researcher wants to gain detailed knowledge about
a specific phenomenon rather than make statistical
inferences, or where the population is very small
and specific. An effective purposive sample must
have clear criteria and rationale for inclusion.
4. Snowball sampling
• Also called "chain referral sampling".
• If the population is hard to access, snowball sampling can be
used to recruit participants via other participants. The number
of people you have access to “snowballs” as you get in contact
with more people.
Example
• You are researching experiences of homelessness in your city.
Since there is no list of all homeless people in the city,
probability sampling isn’t possible. You meet one person who
agrees to participate in the research, and she puts you in
contact with other homeless people that she knows in the
area.
5. Quota sampling
Quota sampling is a two-stage non-probability
sampling method that assigns quotas to the
population in order to ensure that when
elements of the population are selected, the
sample group is representative of the
population’s characteristics. After quotas are
assigned, researchers choose elements from
the subgroups using convenience or judgment
How to conduct Quota Sampling
• Divide the population into subgroups according to
relevant control characteristics. The subgroups should
be mutually exclusive and collectively exhaustive so as
to not have an overlap of elements in subgroups.
• Define the proportions of the subgroups in order to
decide how many elements will be chosen from each
subgroup (quotas).
• Select an appropriate sample size and then select
elements from subgroups keeping in mind how many
elements can be selected from each subgroup (the
quotas).
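The quota-filling step can be sketched in Python as follows. The people, the control characteristic (sex) and the quotas are hypothetical; respondents are taken in the order they are encountered, which stands in for convenience selection within each subgroup.

```python
def quota_sample(people, quotas, key):
    """Select people in the order met until every subgroup quota is full."""
    counts = {group: 0 for group in quotas}
    selected = []
    for person in people:
        group = person[key]
        if group in counts and counts[group] < quotas[group]:
            selected.append(person)
            counts[group] += 1
        if counts == quotas:  # all quotas filled, stop recruiting
            break
    return selected


# Hypothetical respondents in the order the researcher meets them.
passers_by = [
    {"name": "P1", "sex": "F"},
    {"name": "P2", "sex": "M"},
    {"name": "P3", "sex": "F"},
    {"name": "P4", "sex": "F"},
    {"name": "P5", "sex": "M"},
]
quota_selection = quota_sample(passers_by, {"F": 2, "M": 1}, key="sex")
selected_names = [person["name"] for person in quota_selection]
```

Because selection within each subgroup is not random, the result is still a non-probability sample even though the subgroup proportions are controlled.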
3. Health Facility Administrative Statistics (10hrs) (5 lessons)
i. Standard terms and terminologies and computation of
health facility administrative;
ii. Definition of terms: Description and interpretation of
procedures, tools for collecting health administrative
statistics, uses, importance,
Health administrative statistics uses
 Comparison of present and past performance of the
hospital or clinic
 Guide for planning future development of the
hospital or clinic
 Appraisal of work performed by the medical, nursing
and other staff
 Hospital or clinic funding if government funded
 Research
 Information for public/regional health policy making,
based on the real healthcare services
Definition of terms
1. Admission
 The formal process whereby a person is accepted by a hospital for
the purpose of hospital treatment as an inpatient. If an inpatient is
formally discharged from the hospital and then returns for further
treatment, the admission process is repeated and a second
admission is recorded in the statistics
 Live births in the hospital are considered inpatient admissions, but
are always recorded separately as new-born admissions whether or
not they require, during their continuous stay in the hospital since
birth, special medical care in the nursery or in another clinical
service of the hospital (for example, neonatal intensive care unit). A
new-born admission is deemed to occur at the time of birth in the
hospital.
Typically, a patient should be admitted as an inpatient if treatment
and/or care is provided by hospital staff over a period of 24 hours.
Definition of terms cont..
2. Visit (also called Attendance)
 A visit is a single encounter with a healthcare
professional that includes all of the services
supplied during the encounter
 Can be Sub-Divided into two
 New Visit - first time ever attending for treatment services
 Re-Visit - a repeat visit, i.e. not the first time attending
that hospital for treatment services
3. In - patient
 A person who has gone through all the admission procedures and is
expected to occupy a bed in a hospital ward in the in-patient
department at a specified occurrence or session.
4. In-patient days (IPD)
Also referred to as occupied bed days (OBD).
• This is the total number of patients remaining in or occupying a bed in
the ward or hospital each day.
5. Period
 This is usually the available time in which a patient can be treated
while admitted and is usually counted in days. The common period of
reference is usually 365/366 which is one year. However, months or
part of the year can be computed as half a year or quarter year.
6. Allocated beds / Available beds / Staffed beds / bed complement
 This is the number of beds constantly (permanently) allocated to a ward or
hospital for treatment of patients.
7. Bed state / Bed count day
 Refers to the number of patients occupying hospital beds during a given period
of time.
8. Census
 A count of inpatients at a given time. The census is always taken in a hospital at
the same time each day, usually at the lowest migration time period ( eg.
midnight or 0001 hours). The census provides the number of inpatients at
census taking time
9. Daily census (daily inpatient census)
 The daily census is the number of patients present at census taking time, plus
any patients who were admitted after the previous census-taking time and
discharged before the next census-taking time.
10. Percentage occupancy
 This is the percentage of the actual in – patient days in comparison to the
maximum available patient days as determined by the allocated beds capacity
during a given period.
11. Available bed days
 Is the total number of bed days that patients could occupy on the
beds in a ward or hospital. This gives the bed
space available for patients to use.
12. Turnover per bed / through put for bed / bed turnover
 Refers to the number of patients expected to have used each bed available in a
particular ward or hospital during the specified period of time.
13. Turnover interval
 Is the average period of time in which beds stay vacant between two successive
patients’ use of the same bed. This may also be said to be the period between a
discharge and admission on the same bed in ward or hospital. It is also the
number of days beds lie vacant between two successive admissions
14. Length of stay
 Number of days a patient occupies a bed, that is, from the admission time of a particular
patient to the discharge time of the same patient.
15. Occupied bed days
 Is the total number of patients remaining in the hospital / ward each day
added together over the reference period
16. Average occupancy/ average number of patients
 This is the average number of patients in a hospital during a specified period
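The bed-use terms defined so far (available bed days, percentage occupancy, average occupancy, turnover per bed and turnover interval) can be computed as in the sketch below. The formulas are the commonly used forms and are stated here as assumptions, since the definitions above do not spell them out; the ward figures (20 beds, 30 days, 480 occupied bed days, 120 discharges including deaths) are hypothetical.

```python
def available_bed_days(allocated_beds, days_in_period):
    """Allocated beds multiplied by the number of days in the period."""
    return allocated_beds * days_in_period


def percentage_occupancy(occupied_bed_days, allocated_beds, days_in_period):
    """Occupied bed days as a percentage of available bed days."""
    return occupied_bed_days * 100 / available_bed_days(allocated_beds, days_in_period)


def average_occupancy(occupied_bed_days, days_in_period):
    """Average number of patients in the ward per day."""
    return occupied_bed_days / days_in_period


def bed_turnover(discharges_and_deaths, allocated_beds):
    """Average number of patients who used each bed in the period."""
    return discharges_and_deaths / allocated_beds


def turnover_interval(occupied_bed_days, allocated_beds, days_in_period,
                      discharges_and_deaths):
    """Average days a bed lies vacant between successive patients."""
    vacant_bed_days = available_bed_days(allocated_beds, days_in_period) - occupied_bed_days
    return vacant_bed_days / discharges_and_deaths


# Hypothetical ward over a 30-day month.
beds, days, obd, discharges = 20, 30, 480, 120

abd = available_bed_days(beds, days)
occupancy = percentage_occupancy(obd, beds, days)
avg_patients = average_occupancy(obd, days)
turnover = bed_turnover(discharges, beds)
interval_days = turnover_interval(obd, beds, days, discharges)
```

For this hypothetical ward there are 600 available bed days, occupancy is 80%, on average 16 patients are in the ward each day, each bed is used by 6 patients, and a bed stays vacant for 1 day between patients.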
17. Amenity beds
 Is a bed provided in a single room or small ward for the time being required
by any patient on medical grounds for which a patient is charged part of the
cost.
18. Day case -
 Are patients attending as non-resident patients for investigations,
therapeutic tests, operative procedures or other treatment and who require
some form of preparation, period of recovery or both, involving provision of
19. Delivery
The act of giving birth to either a living child or a dead fetus. A pregnant woman
who delivers may have multiple births. For example, a woman who gives birth to twins
will have one delivery but two births.
20. Discharge (Separation)
The formal process whereby an inpatient leaves the hospital at the end of an
episode of care.
21. Encounter
The direct contact between a patient and a physician or other licensed
independent practitioner, to order or furnish healthcare services for the diagnosis
or treatment of a patient. (Horton)
22. Fetal death or Stillborn
“Fetal death is death prior to the complete expulsion or extraction from its mother
of a product of conception, irrespective of the duration of pregnancy; the death is
indicated by the fact that after such separation the fetus does not breathe or
show any other evidence of life, such as beating of the heart, pulsation of the
umbilical cord or definite movement of voluntary muscles.”
23. Live birth
 "The complete expulsion or extraction from its mother of a product of conception,
irrespective of the duration of the pregnancy, which, after such separation, breathes or
shows any other evidence of life, such as beating of the heart, pulsation of the umbilical
cord, or definite movement of voluntary muscles, whether or not the umbilical cord has
been cut or the placenta is attached; each product of such a birth is considered live
born."
24. Maternal death
 Death of any woman while pregnant, or within 42 days of termination of pregnancy,
irrespective of duration and site of pregnancy, from any cause related to or aggravated
by the pregnancy, or its management, but not from accidental or incidental causes.
(1) Direct obstetric deaths
 Those resulting from obstetric complications of the pregnant state (pregnancy, labour
and puerperium), from interventions, omissions, incorrect treatment, or from a chain of
events resulting from any of the above.
(2) Indirect obstetric deaths
 Those resulting from previous existing disease or disease that developed during
pregnancy and which was not due to direct obstetric causes, but which was aggravated
by physiological effects of pregnancy.
26. Neonatal death
The neonatal period commences at birth and ends 28 completed
days after birth. Neonatal deaths (deaths among live births
during the first 28 completed days of life) may be subdivided into
early neonatal deaths, occurring during the first seven days of
life, and late neonatal deaths, occurring after the seventh day
but before 28 completed days of life.
27. Perinatal death
A perinatal death is one occurring during the perinatal period,
which commences at 22 completed weeks (154 days) of
gestation (the time when birth weight is normally 500 g), and
ends seven completed days after birth.
Daily Bed Return (DBR) or Daily Bed State
• The Daily Bed Return is a document completed in each ward to record the ward's
bed state over a 24-hour period. The hospital administration determines the actual
time for completing the DBR; however, it should be completed at night, normally
at 12 midnight, when patient movement within the hospital is minimal.
• The Daily Bed Return (form MOH 328) should indicate:
a) Patient movement in and out of the ward, that is, admissions to and discharges
from the ward.
b) Patient movement within the hospital, that is, inter-ward transfers.
c) The actual patient count, that is, the number of patients in the ward.
d) The number of vacant beds and cots.
e) A section for computation of figures by the Medical Records Officer.
Example of Daily Ward Return (MOH 328), Form 1
Hospital Name …………. Date ……… Ward ………
Section 1: Admissions, discharges and deaths
  Admissions columns: Hospital No., Patient Name
  Discharges and deaths columns: Hospital No., Patient Name, Comments
Section 2: Inter-ward transfers within the hospital
  Admissions columns: Hospital No., Patient Name, From Ward
  Discharges columns: To Ward, Hosp No., Patient Name, Comments
Section 3: Paroles
  Admissions from paroles columns: Hospital No., Patient Name
  Discharges from paroles columns: Hosp No., Patient Name, Comments
Section 4: Computation
  Columns: Previous Daily Return numbers (Beds, Cots, Total) and Today's Daily Return numbers (Beds, Cots, Total)
  Rows: Patients, Well persons, Vacant, Total
Daily In-Patient Statistics
Daily Bed Return summary form for in-patient statistics
Ward……………………………………….. Month……………………………………
  Columns: Day, New Adm, Re-Adm, Total Adm, Discharge Home, Inter-ward Trans, Parole, Deaths, Absco, Trans In, Trans Out, IP Days
  Rows: one row per day of the month
Health Administrative Statistics Summary Tool
Ward……………………………………….. Month……………………………………
  Columns: Beds, Cots, Admissions, Discharges, Transfer In, Transfer Out, Deaths, Absco, No. Well Persons, Occupied Bed Days
  Rows: M/W, F/W, MAT, P/W, NBU
FORMULAS FOR COMPUTING HOSPITAL ADMINISTRATIVE STATISTICS
Symbols: P = period in days; AB = allocated (available) beds; D+D = discharges and deaths.
1. Available bed days = period (days) × allocated beds
$ABD = P \times AB$
2. Occupied bed days / in-patient days = direct summary of the DBR state
OBD = period × daily in-patient days (daily occupancy)
$OBD = P \times DIPD$ (or $P \times DO$), where DIPD (DO) is the average daily in-patient count
OR
$OBD = \dfrac{\%OCC \times ABD}{100}$
OR
$OBD = ALOS \times (D+D)$
3. % occupancy = occupied bed days / available bed days × 100
$\%OCC = \dfrac{OBD}{ABD} \times 100$
4. Vacant bed days = available bed days - occupied bed days
$VBD = ABD - OBD$ (daily bed vacancy over the period)
5. Average number of occupied beds (average number of patients) = occupied bed days / days in period
$OB = \dfrac{OBD}{P}$
6. Excess bed days = occupied bed days - available bed days (daily excess bed state over the period)
$EBD = OBD - ABD$
7. Average length of stay (duration of stay) = occupied bed days / discharges and deaths
$ALOS = \dfrac{OBD}{D+D}$
8. $VBD + OBD = ABD$
9. $OBD = ABD - VBD$
10. Deaths and discharges
$D+D = \dfrac{OBD}{ALOS}$
11. Available beds / staffed beds / average available beds
$AB = \dfrac{ABD}{P}$
12. Turnover per bed = discharges and deaths / number of available beds
$TOB = \dfrac{D+D}{AB}$
13. Discharges and deaths = turnover per bed × available beds
$D+D = TOB \times AB$
14. Turnover interval = vacant bed days / number of deaths and discharges
$TOI = \dfrac{VBD}{D+D}$
15. $VBD = TOI \times (D+D)$
16. $D+D = \dfrac{VBD}{TOI}$
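The formulas above can be checked with a short script. The following is a minimal Python sketch, not an official tool: the class name `BedUse`, its field names and the ward figures are hypothetical and chosen only to illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class BedUse:
    """Bed-use figures for one ward or hospital over one reporting period."""
    days: int                 # P: number of days in the period
    beds: int                 # AB: beds allocated throughout the period
    occupied_bed_days: int    # OBD: in-patient days summed from the DBRs
    discharges_deaths: int    # D+D: discharges plus deaths in the period

    def available_bed_days(self) -> int:
        return self.days * self.beds                      # ABD = P x AB

    def percent_occupancy(self) -> float:
        return self.occupied_bed_days * 100 / self.available_bed_days()

    def vacant_bed_days(self) -> int:
        # A negative value means excess bed days (OBD greater than ABD).
        return self.available_bed_days() - self.occupied_bed_days

    def average_occupied_beds(self) -> float:
        return self.occupied_bed_days / self.days         # OB = OBD / P

    def average_length_of_stay(self) -> float:
        return self.occupied_bed_days / self.discharges_deaths

    def turnover_per_bed(self) -> float:
        return self.discharges_deaths / self.beds

    def turnover_interval(self) -> float:
        return self.vacant_bed_days() / self.discharges_deaths


# Hypothetical ward: 30 beds over a 30-day month.
ward = BedUse(days=30, beds=30, occupied_bed_days=810, discharges_deaths=90)
print(ward.available_bed_days())      # 900
print(ward.percent_occupancy())       # 90.0
print(ward.vacant_bed_days())         # 90
print(ward.average_occupied_beds())   # 27.0
print(ward.average_length_of_stay())  # 9.0
print(ward.turnover_per_bed())        # 3.0
print(ward.turnover_interval())       # 1.0
```

When occupancy exceeds 100%, `vacant_bed_days()` returns a negative number whose size equals the excess bed days.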
Exercise
1. In the year 2003, the Agha Khan gynecological specialty had 120 beds
allocated throughout the year. It had a percentage occupancy of 110%.
Calculate:
i. Occupied bed days
ii. Average daily population
iii. Excess patient days
iv. Average daily population without beds
Exercise
2. In the year 2004, Etihad hospital had 800 beds permanently allocated
for inpatient use. During the period, the hospital's percentage occupancy
was 110%; 1500 patients were discharged home alive, 80 patients went on
parole and there were 20 deaths. Use the information to calculate:
i. Occupied bed days
ii. Average length of stay
iii. Turnover per bed
iv. Excess/vacant bed days
v. Turnover interval
Exercise
3. In 2005 a hospital recorded the following data.
In-patient hospital days = 10,676
Number of allocated beds = 25 (throughout the year)
Discharges and deaths = 124
• Calculate:
i. Excess in-patient days
ii. The extra number of beds required to cater for the excess patient days
through the year
iii. % occupancy
iv. ALOS
v. Average daily population / average number of patients
vi. TOI
vii. TOB
4. Data Organization and Presentation (8 hrs) (4 lessons)
• Describe data classification: grouped and ungrouped data, inclusive and
exclusive forms of grouping
– Sturges' formula; narrative and array presentation; tabular
presentation: simple tables, multiple-column tables, contingency tables,
principles of table construction; chart presentation: simple bar charts,
component bar charts, percentage component bar charts, multiple bar
graphs, principles of bar chart construction, pie charts; graph
presentation: principles of graph construction, frequency distribution,
cumulative frequency tables, relative frequency, cumulative frequency
curve (ogive), histogram, graphs of time series, Z-charts, Lorenz curve
and pictograms; data analysis and presentation using computers
Grouped and ungrouped data
• Grouped data: data that has been organized into groups (into a frequency
distribution). A table that lists classes with their frequencies, such as
the fasting blood glucose table later in this section, shows grouped data.
• Ungrouped data: data that has not been organized into groups. Ungrouped
data looks like a long list of numbers, e.g. 34, 76, 34, 54, 54, 27.
• In simple words, ungrouped data or raw data is a mere list of numbers
that conveys little on its own, because no summarization or aggregation
has yet been applied.
How to Group Data
• On your exam, you may have to construct a frequency distribution. Constructing
a frequency distribution is the same thing as grouping data.
• The first step in grouping data is deciding how large a class interval to use.
(Class interval = class size)
• There are 2 formulas for determining the appropriate class interval. You must be
able to choose the appropriate one for any given problem.
1. $\text{Class interval} = \dfrac{\text{Highest value} - \text{Lowest value}}{\text{Number of classes}}$
Use when the problem states the number of classes to be used.
2. $\text{Class interval} = \dfrac{\text{Highest value} - \text{Lowest value}}{1 + 3.322 \log_{10} N}$, where $N$ is the total number of observations
Use when the problem does not state the number of classes to be used.
**Don't forget to always round up to the nearest whole number when dealing with
class interval.**
Sturges' formula
• Sturges' rule is used to determine the number of classes when the total
number of observations is given.
• Formula used: Sturges' rule gives the number of classes as
$K = 1 + 3.322 \log_{10} N$, where $K$ is the number of classes
and $N$ is the total frequency.
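The two class-interval formulas and Sturges' rule can be tried out on the small ungrouped list quoted earlier (34, 76, 34, 54, 54, 27). The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, rounding K up to a whole number is an assumption (the slides only state that the class interval is rounded up), and the grouping uses inclusive integer classes starting at the lowest value.

```python
import math


def sturges_classes(n: int) -> int:
    """Number of classes K = 1 + 3.322 log10(N); rounding up is an assumption."""
    return math.ceil(1 + 3.322 * math.log10(n))


def class_interval(values, n_classes=None) -> int:
    """(Highest - lowest) / number of classes, rounded up to a whole number.

    When n_classes is None, the denominator 1 + 3.322 log10(N) is used.
    """
    spread = max(values) - min(values)
    if n_classes is None:
        denominator = 1 + 3.322 * math.log10(len(values))
    else:
        denominator = n_classes
    return math.ceil(spread / denominator)


def frequency_distribution(values, width):
    """Count values in consecutive inclusive classes (assumes integer data)."""
    lower = min(values)
    table = []
    while lower <= max(values):
        upper = lower + width - 1
        count = sum(lower <= v <= upper for v in values)
        table.append((lower, upper, count))
        lower = upper + 1
    return table


data = [34, 76, 34, 54, 54, 27]
print(class_interval(data, n_classes=5))  # 10
print(class_interval(data))               # 14
print(sturges_classes(len(data)))         # 4
print(frequency_distribution(data, 14))
# [(27, 40, 3), (41, 54, 2), (55, 68, 0), (69, 82, 1)]
```

With real data the starting point of the first class is usually chosen as a convenient round number at or below the lowest value; starting exactly at the minimum is only a simplification here.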
Array presentation
• This is the arrangement of a group of data in an orderly way. When you
are conducting an investigation, you will reach a stage at which you have
several figures referring to different classifications. When these figures
are written down the way they were obtained from the original
classification, they are referred to as an array.
Presentation of data
Principles of presentation of data:
1. Data should be arranged in such a way that it will
arouse the reader's interest.
2. The data should be made sufficiently concise
without losing important details.
3. The data should be presented in a simple form to
enable the reader to form quick impressions and
to draw some conclusions, directly or indirectly.
4. It should facilitate further statistical analysis.
5. It should define the problem and suggest its
solution.
Methods of presentation of data
The first step in statistical analysis is to present the data in a way
that is easy to understand.
The two basic ways of presenting data are:
1. Tabulation (tables)
2. Charts and diagrams: graphical representation for data visualization
• Examples of types of charts and graphs:
• Bar chart
• Line chart
• Pie chart
• Maps
• Density maps
• Scatter plot
• Gantt chart
Tabulation
• What is tabular presentation of data? It is the use of a table to represent
even a large amount of data in an engaging, easy-to-read and coordinated
manner. The data is arranged in rows and columns. This is one of the most
widely used forms of data presentation, as data tables are simple to
prepare and read.
Objectives of Tabulation
• To simplify complex data
• To bring out essential features of the data
• To facilitate comparison
• To facilitate statistical analysis
• To save space
Explain the Main Parts of a Table
(1) Table number
● The table number is the very first item, placed at the top of each table for easy identification
and further reference.
(2) Title
● The title of the table is the second item and is shown just above the table.
● It narrates the contents of the table, hence it has to be very clear, brief and carefully worded.
(3) Head note
● It is the third item just above the table and is shown after the title.
● It gives information about units of data, like 'amount in rupees or $', 'quantity in tonnes', etc.
● It is generally given in brackets.
(4) Captions or column headings
● At the top of each column in a table, a column designation/head is given to explain the figures
of the column.
● This column heading is known as a 'caption'.
(5) Stubs or row headings
● The titles of the horizontal rows are known as 'stubs'.
(6) Body of the table
● It contains the numeric information and reveals the whole story of the investigated facts. Columns
are read vertically from top to bottom and rows are read horizontally from left to right.
(7) Source note
● It is a brief statement or phrase indicating the source of the data presented in the table.
(8) Footnote
● It explains any specific feature of the table that is not self-explanatory and has not been
explained earlier, for example, points of exception, if any.
Rules and guidelines for tabular presentation
1. Each table must be numbered.
2. A brief and self-explanatory title must be given to each
table.
3. The headings of columns and rows must be clear,
sufficient, concise and fully defined.
4. The data must be presented according to size or
importance, chronologically, alphabetically or
geographically.
5. If the data includes a rate or proportion, mention the
denominator.
6. The table should not be too large.
7. Figures needing comparison should be placed as close
together as possible.
8. The classes should be fully defined and should not lead to any
ambiguity.
9. The classes should be exhaustive, i.e. should include all the
given values.
10. The classes should be mutually exclusive and
non-overlapping.
11. The classes should be of equal width, i.e. the class interval
should be the same.
12. Open-ended classes should be avoided as far as possible.
13. The number of classes should be neither too large nor too
small. It can be 10 to 20 classes.
14. Formula for the number of classes (K):
$K = 1 + 3.322 \log_{10} N$, where $N$ is the total frequency
Tabulation
• A table can be simple or complex, depending
upon the number of measurements of a
single set or multiple sets of items.
• Simple table:
Title: Number of cases of various diseases in Nair Hospital in 2009
Disease          Cases
Malaria           1100
Acute GE           248
Leptospirosis       60
Dengue             100
Total             1508
Frequency distribution table with qualitative data:
• Title: Cases of malaria in adults and children in the
months of June and July 2010 in Nair Hospital.
                    Jun-10          Jul-10
Type of malaria   Adult  Child    Adult  Child    Total
P. vivax             54      9      136     23      222
P. falciparum        11      0       80     13      104
Mixed malaria        11      4       36     12       63
Total                76     13      252     48      389

Frequency distribution table with quantitative data:
• Fasting blood glucose level in diabetics at
the time of diagnosis
Fasting glucose     No. of diabetics
level              Male  Female  Total
120-129               8       4     12
130-139               4       4      8
140-149               6       4     10
150-159               5       5     10
160-169               9       6     15
170-179               9       9     18
180-189               3       2      5
Total                44      34     78
Chart and diagram
Charts and diagrams are graphic presentations used to illustrate
and clarify information. Tables are
essential in the presentation of scientific
data, and diagrams complement them by
summarizing the tables in an easy,
attractive and simple way.
A diagram should:
• Be simple
• Be easy to understand
• Save a lot of words
• Be self-explanatory
• Have a clear title indicating its content
• Be fully labelled
• Usually show frequency on the y-axis (vertical)
Advantages of diagrams and charts
i. They provide an easy and attractive means of
representing data.
ii. They make the information contained in data
readily understandable.
iii. They facilitate comparisons.
iv. They save time and labour.
v. They give an effective impression.
vi. They are easier to remember than
mere figures.
Limitations of diagrams and charts
i. Diagrams do not give accurate results, only rough
ideas.
ii. Constructing a diagram correctly requires technical skill,
so an untrained person may not do it correctly.
iii. Diagrams cannot be compared if their units are
not common or the phenomena they show are not the
same.
iv. They can be misused very easily.
v. This method of data presentation is very
expensive.
Various charts and diagrams
1) Bar Diagram
2) Histogram
3) Frequency polygon
4) Cumulative frequency curve
5) Scatter diagram
6) Line diagram
7) Pie diagram
1. Bar Diagram
• This is a means of presenting information visually by drawing bars
that represent the frequency of specific data. All charts share common
principles of construction.
General principles of bar chart construction:
• It should have a clear title.
• The scale used should be indicated clearly.
• Show the attribute (quality) or variable (quantity) on the horizontal
axis (X-axis).
• Frequencies should appear on the vertical axis (Y-axis).
• The height of the bars should be proportional to the frequencies.
• The source of information should be indicated, normally as a
footnote.
• Each axis should be clearly labelled as the X-axis or the Y-axis.
• The Y-axis should be ¾ of the length of the X-axis.
Bar diagram cont..
• A widely used, easy-to-prepare tool for comparing
categories of mutually exclusive discrete data.
• The different categories are indicated on one axis and the
frequency of data in each category on the other axis.
• The length of each bar indicates the magnitude of the frequency
of the characteristic being compared.
• The spacing between the bars should be equal to half of
the width of a bar.
• There are 3 types of bar diagram:
– Simple
– Multiple or compound
– Component or proportional
Simple bar diagram
• This type of chart is used to present data which has been presented in a
simple table. The height of each bar indicates the size of the figure
represented.
• The width of the bars carries no meaning and should be uniform
for all bars.
Malaria cases in Nair Hospital in July 2010
Year     Pts
2001     61000
2002     75000
2003     50000
2004     80000
Total   266000
[Figure: simple bar diagram, "Malaria cases in Nair Hospital in July 2010". X-axis: type of malaria (P. vivax, P. falciparum, mixed malaria); Y-axis: total number of cases, scale 0 to 120; legend: Male.]
• Multiple bar chart: each observation has more
than one value, represented by a group of bars.
Examples: percentage of males and females in different
countries, percentage of deaths from heart
disease in old and young age groups, mode of delivery
(cesarean or vaginal) in different female age
groups.
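Multiple or Compound diagram
[Figure: multiple bar diagram, "Distribution of malaria cases in Nair Hospital in July 2010". X-axis: type of malaria (P. vivax, P. falciparum, mixed malaria); Y-axis: number of cases, scale 0 to 120; legend: Male, Female. Data labels, in legend order: P. vivax 102 and 57; P. falciparum 62 and 31; mixed malaria 29 and 19.]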
Multiple or Compound bar chart.
[Figure: multiple bar chart of clinic attendance. X-axis: years (2001, 2002, 2003, 2004, total); Y-axis: clinic attendance, scale 0 to 600000; legend: medical, surgical, pediatrics, total.]
Component/Sectional bar chart
• This is also called a subdivided bar diagram. Instead of placing the bars for
each component side by side, we place them one on top of the other.
• Each bar is subdivided into two or more components, one bar for each year.
[Figure: component bar chart of clinic attendance. X-axis: years (2001, 2002, 2003, 2004, total); Y-axis: clinic attendance, scale 0 to 1200000; components: medical, surgical, pediatrics, total.]
Percentage/Component bar charts
• These are component bar charts that show each
component as a percentage of the total
instead of the actual parts and actual totals of the
bar.
• The length of each bar represents
100% of its total, so a series of full bars
will all have the same height.
Pictograms
• Pictograms are also known as symbol charts or
ideographs.
• When the relative values of items are represented
by pictures, the charts are known as pictograms.
• There are two kinds of pictograms:
– Those in which the same picture, always the same size,
is shown repeatedly. The value of a figure represented
is indicated by the number of pictures shown.
– Those in which the pictures change in size. The value of a
figure represented is indicated by the size of the picture.
Pictograms cont..
Year    Production
2004    1250
2005    1500
2006    1000
2007     600
Source: production department.
[Figure: pictogram of production for the years 2004, 2005, 2006 and 2007.]
Histogram:
• It is very similar to the bar chart, with the
difference that the rectangles or bars are
adjacent (without gaps).
• It is used for presenting a class frequency
table (continuous data).
• Each bar represents a class; its height
represents the frequency (number of cases) and
its width represents the class interval.
Histogram
[Figure: histogram, "Distribution of studied group according to their height". X-axis: height in cm (classes 100-, 110-, 120-, 130-, 140-, 150-); Y-axis: number of individuals, scale 0 to 30.]
Frequency Polygon
• Derived from a histogram by connecting
the midpoints of the tops of the rectangles
in the histogram.
• The line connecting the centres of the
histogram rectangles is called the frequency
polygon.
• We can draw the polygon without the rectangles,
which gives a simpler form of line graph.
• A special type of frequency polygon is the
Normal Distribution Curve.
Frequency polygon
[Figure: frequency polygon, "Fasting blood glucose level in diabetics at the time of diagnosis". X-axis: fasting glucose level classes (120-129, 130-139, 140-149, 150-159, 160-169, 170-179, 180-189); Y-axis: number of diabetics, scale 0 to 20.]
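The points of a frequency polygon are the class midpoints paired with the class frequencies. The following is a minimal Python sketch using the totals from the fasting blood glucose table earlier in this section; the variable names are hypothetical, and the midpoint is taken here as the average of the stated class limits.

```python
# Class limits and total frequencies from the fasting blood glucose table.
classes = [(120, 129), (130, 139), (140, 149), (150, 159),
           (160, 169), (170, 179), (180, 189)]
frequencies = [12, 8, 10, 10, 15, 18, 5]

# Midpoint of each class = (lower limit + upper limit) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Each (midpoint, frequency) pair is one point of the polygon.
polygon_points = list(zip(midpoints, frequencies))
print(polygon_points[0])   # (124.5, 12)
print(polygon_points[-1])  # (184.5, 5)
```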
Cumulative frequency diagram or ogive
• Here the frequency of data in each
category represents the sum of the data from
that category and the preceding categories.
• Cumulative frequencies are plotted
against the group limits of the variable.
• These points are joined by a smooth freehand
curve to get a cumulative frequency
diagram or ogive.
Ogive:
[Figure: ogive, "Fasting blood glucose level in diabetics at the time of diagnosis". X-axis: fasting glucose level classes (120-129, 130-139, 140-149, 150-159, 160-169, 170-179, 180-189); Y-axis: cumulative number of diabetics, scale 0 to 90.]
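The cumulative frequencies plotted on an ogive are running totals of the class frequencies. The following is a minimal Python sketch using the totals from the fasting blood glucose table; the variable names are hypothetical, and pairing each running total with the upper limit of its class is an assumption about which group limit is used.

```python
from itertools import accumulate

# Upper class limits and total frequencies from the fasting blood glucose table.
upper_limits = [129, 139, 149, 159, 169, 179, 189]
frequencies = [12, 8, 10, 10, 15, 18, 5]

# Each cumulative frequency is the sum of this class and all preceding classes.
cumulative = list(accumulate(frequencies))
print(cumulative)  # [12, 20, 30, 40, 55, 73, 78]

# Points to plot: (upper class limit, cumulative frequency).
ogive_points = list(zip(upper_limits, cumulative))

# Relative frequency of each class as a percentage of the total.
total = sum(frequencies)
relative_percent = [f * 100 / total for f in frequencies]
```

The last cumulative frequency always equals the total frequency (78 here), which is a quick check on the arithmetic.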
Scatter/dot diagram
Also called a correlation diagram, it is useful for
representing the relationship between two numeric
measurements, each observation being
represented by a point corresponding to its value
on each axis.
In negative correlation, the points are
scattered in a downward direction, meaning that
the relation between the two studied
measurements is inverse, i.e. if one
measure increases, the other decreases.
In positive correlation, the points are
scattered in an upward direction.
[Figure: "Malaria cases during monsoon in Nair Hospital: Year 2010". Y-axis: malaria cases, scale 0 to 500; data labels shown: 30, 89, 304, 450.]
Line diagram:
It is a diagram showing the relationship between two numeric variables
(as in the scatter diagram), but the points are joined together to form a line (either
a broken line or a smooth curve). It is used to show the trend of events with the
passage of time.
[Figure: line diagram, "Changes in body temperature of a patient after use of antibiotic". X-axis: time in hours (1 to 7); Y-axis: temperature, scale 36 to 39.5.]
Pie diagram:
• Consists of a circle whose area represents
the total frequency (100%) and which is
divided into segments.
• Each segment represents a proportional
part of the total frequency.
[Figure: pie diagram, "Distribution of malaria cases in Nair Hospital in July 2010". Segments: P. vivax 53%, P. falciparum 32%, mixed malaria 15%.]
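To draw a pie diagram by hand, each category's share of the total is converted to a percentage and to an angle out of 360 degrees. The following is a minimal Python sketch; the function name `pie_segments` and the admission counts are hypothetical and are not taken from the Nair Hospital data.

```python
def pie_segments(counts):
    """Return {category: (percentage of total, angle in degrees)}."""
    total = sum(counts.values())
    return {
        name: (count * 100 / total, count * 360 / total)
        for name, count in counts.items()
    }


# Hypothetical admissions by department.
admissions = {"Medical": 50, "Surgical": 30, "Paediatric": 20}
for name, (percent, angle) in pie_segments(admissions).items():
    print(f"{name}: {percent:.1f}% of total, {angle:.1f} degrees")
# Medical: 50.0% of total, 180.0 degrees
# Surgical: 30.0% of total, 108.0 degrees
# Paediatric: 20.0% of total, 72.0 degrees
```

The percentages should add up to 100 and the angles to 360, which serves as a check before drawing.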
Thank you
