Health Statistics 1 3 1
• Health
– It’s a state of complete physical, mental and social well-being and
not merely the absence of diseases or infirmity.
• Statistics
– It’s the study of how to collect, organize, analyze and interpret
numerical information from data.
– It’s the body of theory, concepts, methods and methodologies of
collection, analysis, interpretation and representation of
numerical data for decision making.
• Health statistics
– It’s the science of collecting, organizing, summarizing/analyzing,
presenting and interpreting health data and using them for
decision making.
i. Uses of Health Statistics
1 Descriptive statistics
• Summarizes population data numerically or graphically
• It can be defined as those methods involving the collection, presentation and characterization of a set of data to properly describe the
various features of that set of data.
• It’s a type of statistics that describes large masses of data.
e.g.
- Statistics pertaining to central tendency, such as mean, median and mode.
- Statistics pertaining to dispersion around the central tendency, such as range, standard deviation, variance, and quartile deviation.
- Graphs depicting the shape of a distribution.
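The measures listed above can be computed with Python's standard library. This is a minimal sketch; the data set `stay_days` is a hypothetical example (not from these notes), and the quartile deviation is taken as half the interquartile range.

```python
import statistics

# Hypothetical data (an assumption for illustration):
# length of hospital stay, in days, for five patients.
stay_days = [2, 3, 3, 5, 7]

# Measures of central tendency
mean = statistics.mean(stay_days)
median = statistics.median(stay_days)
mode = statistics.mode(stay_days)

# Measures of dispersion
data_range = max(stay_days) - min(stay_days)
variance = statistics.variance(stay_days)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(stay_days)       # sample standard deviation
q1, _, q3 = statistics.quantiles(stay_days, n=4)
quartile_deviation = (q3 - q1) / 2          # half the interquartile range

print(mean, median, mode, data_range, variance, std_dev, quartile_deviation)
```

Note that `statistics.quantiles` supports more than one interpolation method, so quartile values can differ slightly from hand calculations that use another convention.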
2 Inferential Statistics
• They include;
Estimation
Modelling relationships
Hypothesis testing.
• It’s a type of statistics that infers or induces from a small group (sample) and generalizes to the whole group (population).
• Also known as inductive statistics, it deals with the method of drawing conclusions from numbers observed.
• Involves drawing the right conclusions from the statistical analysis that has been performed using descriptive statistics.
• Most predictions of the future and generalizations about a population are made by studying a smaller sample.
i Estimation
– It’s the group of statistics that allows population values to be estimated from sample data, e.g. population parameter estimates and confidence intervals.
i Modelling relationships
– Allows us to develop mathematical equations which describe the interrelations between 2 or more variables
i Hypothesis testing
– Allows us to test for whether a particular hypothesis we’ve developed is supported by a systematic analysis of the data.
– It builds on descriptive statistics by going a step further, interpreting the results with respect to the population on which a decision would be based, e.g. chi-square test, t-test, F-test, etc.
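As an illustration of estimation, the sketch below computes a point estimate and an approximate 95% confidence interval for a population mean. The sample values are hypothetical, and the multiplier 1.96 assumes a normal approximation; for a sample this small a t-distribution multiplier would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample (an assumption for illustration):
# systolic blood pressure readings in mmHg from eight patients.
sample = [118, 122, 125, 130, 135, 128, 121, 127]

n = len(sample)
sample_mean = statistics.mean(sample)                      # point estimate
standard_error = statistics.stdev(sample) / math.sqrt(n)   # SE of the mean

# Normal-approximation multiplier for a 95% interval (an assumption).
z = 1.96
ci_lower = sample_mean - z * standard_error
ci_upper = sample_mean + z * standard_error

print(f"mean = {sample_mean:.2f}, 95% CI = ({ci_lower:.2f}, {ci_upper:.2f})")
```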
iv. Definition of terms: data, information, variables,
• Data:
Data is a collection of raw facts, figures or instructions that do not have much
meaning to the user.
Data may be in form of numbers, alphabets/letters or symbols, and can be
processed to produce information.
• Information:
Information is the data which has been refined, summarized & manipulated in the
way you want it, or into a more meaningful form for decision-making.
The information must be accurate, timely, complete and relevant.
• Variable
A variable is a quantity that may change within the context of a mathematical
problem or experiment. Typically, we use a single letter to represent a variable.
The letters x, y, and z are common generic symbols used for variables.
Comparison between Data and Information
Data Information
a) Quantitative
b) Qualitative data
What is Qualitative data?
Qualitative data are data that describe meaning and are generally non-numerical
Qualitative data is classified into:
i) Nominal : The gender of a person
ii) Ordinal :
Qualitative data has greater validity.
o Validity: extent to which an indicator measures what it intends to measure
Qualitative data:
o Represents the “voice” of the individual or group
o Not generalizable to the population
o Time-consuming to collect and analyse
Qualitative and Quantitative data can further be classified into:
i) Primary data
ii) Secondary data
Note that a mixture of both quantitative and qualitative data provides the most
comprehensive set of data for program evaluation.
v. Measurements of scale
1. Nominal scale of measurement-“cold, warm, hot
and very hot.”
2. Ordinal scale of measurement-‘one = happy, two =
neutral, and three = unhappy.
3. Interval scale of measurement
4. Ratio scale of measurement
vi. Sources of data and sources of health data
Surveys
• Surveys are an important means of collecting health and social science information from a sample of
people in a standardized way to better understand a larger population. There are many methods used to
conduct surveys, including questionnaires and in-depth interviews via phone, mail, email, and in-person.
• Survey research allows researchers to collect empirical data in a relatively short period of time.
Depending on the design and scope, surveys can collect data on a representative sample of people
Medical records
Medical records are used to track events and transactions between patients and health care providers. They
offer information on diagnoses, procedures, lab tests, and other services. Medical records help us
measure and analyze trends in health care use, patient characteristics, and quality of care.
Claims data or administrative data
Claims data, also known as administrative data, are another sort of electronic record, but on a much bigger
scale.
The good thing about claims data is that, like other medical records, they come directly from notes made by
the health care provider, and the information is recorded at the time the patient sees the doctor.
Vital records
Vital records are collected by the National Vital Statistics System, and are maintained by state and local governments. Vital
records include births, deaths, marriages, divorces, and fetal deaths. They also record information about the cause of
death, or details of the birth.
Surveillance
Surveillance is the ongoing systematic collection, analysis, and interpretation of data, closely
integrated with the timely dissemination of these data to those responsible for preventing and
controlling disease and injury. Surveillance activities are usually associated with the study of
infectious diseases.
Disease registries
• Registries are systems that allow people to collect, store, retrieve, analyze, and
disseminate information about people with a specific disease or condition. Disease
registries let researchers estimate how large a health problem is, determine the
incidence of the disease, study trends over time, and evaluate the effects of certain
environmental exposures. Registries provide information to improve the quality and
safety of care, and allow for comparison of effective treatment.
• Registries are kept by governments, hospitals, universities, non-profits, and private
groups. They store data from hospital records, lab reports, and other sources.
Because clinical data is sent securely to registries from the various points of care
that a patient may receive, registries make it possible to track and better
understand rare diseases.
vii. Data Collection Methods
1) Questionnaires/Surveys
Good way of gathering a lot of data; provides a broad perspective.
Can be administered electronically, by mail or face to face.
Mail/email surveys have a wider reach, are relatively cheap to administer, yield standardized information, and allow privacy to be maintained.
Prone to low response rates and limited depth, not appropriate among illiterate respondents, and do not allow for any observation.
Chance of reporting bias, particularly on sensitive issues or where disclosure depends on trust.
Piloting on a sample of the target group is required to check validity and appropriateness for the target group.
2) Interviews
Used when you want to understand impressions and experiences in more detail and be
able to expand or clarify responses
Interviews can be conducted face-to-face or by telephone.
Range from In-depth, Semi-structured, Unstructured, Structured, Key Informant
depending on the information being sought.
Face-to-Face Interview:
Advantages
i)Detailed questions can be asked
ii) Probing can be done to provide rich data
iii) Literacy requirements of participants is not an issue
iv) Non-verbal data can be collected through observation
v) Complex and unknown issues can be explored
vi) Higher response rates
Disadvantages
i)They can be expensive and time consuming
ii) Interviewer training is required to reduce bias
iii) Prone to interviewer and interpreter bias
iv) Sensitive issues may be challenging
Telephone Interviews
Advantages
i)Cheaper and faster than face to face interviews to conduct
ii) Use less resources than face to face interviews
iii) Allow questions to be clarified
iv) Do not require literacy skills
Disadvantages
i)Making repeated calls if first calls are not answered
ii) Potential bias if call backs (follow up) are not made to those absent
iii) Only suitable for short surveys
iv) Only accessible to the population with a telephone
v) Not appropriate for exploring sensitive issues
3) Observations
Information is gathered about a program as the program’s activities occur
Data is collected by direct or indirect participation within program activities
Evaluator better understands the context in which measures are undertaken, and direct exposure to
program implementation enables the evaluator to 'feel at home' with a given issue
A trained evaluator may also perceive phenomena that, being obvious, escape others' attention,
as well as issues that participants do not raise in interviews (such as conflicts or sensitive matters)
Observation enables the evaluator to go beyond participants' selective perception
Makes it possible to present a versatile picture of the program that would not be possible
using only questionnaires and interviews.
Advantages of Observation
• Gives relatively more accurate data on behavior, attitude and activities being performed.
• It is the best method to obtain first hand information
• It gives more relevant and accurate information
• Data collected is highly reliable
• It is relatively cheaper
Disadvantages of Observation
• The presence of the investigator may cause the performer or respondent to behave differently.
• It does not help produce accurate data automatically
• The results of observations may be different under different conditions
Rules/principles for constructing a questionnaire
• List the objectives that you want the questionnaire to accomplish. This helps in writing the questions, since questions are related to objectives.
• Clarity is essential. Terms like "several", "must" and "because" have no precise meaning and should therefore be avoided.
• Questions should be focused and limited to a single idea, i.e. short, clear questions should be used as they are easier to understand, e.g. "Have you gone for VCT, and how often?"
• Double-barreled questions should be avoided. These are questions that ask about two or more things yet allow for only one answer, e.g. "Do you think that students should have more classes about history and culture?" or "What motivates your work? Pleasant work and nice co-workers."
• Leading or biased questions should be avoided, e.g. asking "Is he a boy?" about a child whose gender you can see for yourself, or "Were you at the KCs bar on the night of the 15th?"
• Very personal and sensitive questions should be avoided as the respondent may be dishonest in answering them.
Observation schedule
This is a type of schedule having questions which guide an
observer systematically.
Rating schedule
Set of questions that helps guide a psychologist or sociologist
to measure the attitude and behavior of an individual.
Survey schedule
Formulated for a surveyor to guide him on his information
collection
Interview schedule
Set of questions with structured answers to guide an interview.
Principles of interview schedule design
Open-ended interviews
1. The first question must confirm that the potential respondent is aware of the purposes
and the scope of the interview, the amount of time it will take, and the way the
responses will be used e.g addressing confidentiality issues.
2. Address whether or not the responses will be recorded. Recording responses has the
advantage that all material is captured but the disadvantage that many respondents
may speak less freely if they are being recorded. Obtain consent before recording.
3. It must contain questions that will answer the research questions.
4. It must not contain questions that are not related to the research questions.
5. Each question must address a single issue.
6. Topics that might be covered include:
a. Demographic questions: age, education, position (occupation); gender can be
observed.
b. Knowledge- questions about what a person knows about a specific topic.
c. Behaviour- questions about what a person does in general or has done on a
specific occasion and/or what a person plans to do in future.
d. Opinions or values- questions as to what a person thinks about a topic.
e. Feelings- questions about how a person feels about an issue.
f. Sensory- questions as to what a person has seen, touched, heard, tasted or
smelt.
Benefits of interview schedule
Data analysis is the process of evaluating data using analytical and logical reasoning to
examine each component of the data provided
Data analysis refers to the process of inspecting, cleaning, transforming, and
modeling data with the goal of discovering useful information, suggesting conclusions, and
supporting decision-making
Data analysis helps in obtaining usable and useful information. Data analysis may:
i. Describe and summarise the data
ii. Identify relationships between variables
iii. Compare variables
iv. Identify the difference between variables
v. Forecast outcomes
Data analysis must be planned for before data collection
Plan data analysis as follows:
i. Consider the purpose of your evaluation. How useful is the data in understanding and
improving the program?
ii . Decide who will analyse the data. He/she must have training and experience in the
analysis procedures and software to be used
iii. Develop a database management system to collect, organize and store data
iv. Plan for data cleaning procedures
v. Obtain data analysis software
vi. Analysing Quantitative Data
Survey
Write a report
• The final step in conducting online surveys is to write a report explaining your findings.
• A successful survey will provide the answers to the questions you had about your
business, product or service.
• Use proper report writing layout
• Provide summary of your results
Points to be considered when constructing a survey tool
• Unclear objectives
• Questions not matching the survey objectives
• Questions are ambiguous
• Tool doesn’t provide adequate space for data
entry
• Questions are not well numbered
Processes of survey feedback:
• Open coding
Breaks down, compares and categorizes data.
• Axial coding
Makes connections between categories after open
coding.
• Selective coding
Selects the core category, relates it to other
categories and confirms the explanation of those
relationships.
Why Data coding
Errors
• The word error has a special meaning in statistics. We
can distinguish between mistake and error.
• A mistake means incorrect presentation or human
factors. It can occur in the collection of the data, e.g. the
respondent may have mistakenly ticked the ‘yes’ box
instead of the ‘no’ box.
• An error means the difference between the observed
(estimated) figure and the actual figure. The deviation
arises just by chance and is not due to human carelessness.
Sampling processes (8 hrs, 4 lessons)
Definition of terms:
Population, sample, sampling, sample size determination,
probability and non-probability sampling methods.
Population
• In statistics, a population is the entire pool
from which a statistical sample is drawn. A
population may refer to an entire group of
people, objects, events, hospital visits, or
measurements
• The population is the entire group that you
want to draw conclusions about.
Sample
• The sample is the specific group of individuals that you will collect
data from
• When the sample is selected from a population, the units selected
must be taken at random.
• A sample is a portion, piece or segment that represents the whole
(population).
• Under random selection, every unit in the population must have an
equal chance of being selected. The units selected are chosen purely by
chance. If the units are not taken at random, bias will occur and undue
importance will be given to some units. In that case the sample will not
be fair and representative, e.g. if only the intelligent students in a class
are selected for a study, the sample will not be representative of the
class's performance.
Sampling
• The process of selecting a number of
individuals for a study in such a way that the
selected individuals represent the large group
from which they were selected.
• The purpose of sampling is to secure a
representative group which will enable the
researcher to gather information about a
population.
Advantages of sampling method
While drawing samples for a study in Kisumu County, the following steps are followed:
• Determine the affected population, e.g. the number of people affected is 10000 (N).
• List the population (in our case diabetes cases) from 1 to 10000. While listing, make sure that:
• Specific characteristics do not recur at a regular interval.
• Each number corresponds to a specific point or case in the population (to eliminate
biased representation).
• Determine the sample size, e.g. 1000 (n).
• Divide the population by the sample size; the result is the interval used when picking
samples from the population.
• K = N/n = 10000/1000 = 10
• Determine the starting point from which to pick every 10th item, e.g. using simple random
sampling. If the first randomly selected item is number 3 and K is 10, then the selected
items are 3, 13, 23, 33, etc.
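The systematic sampling steps can be sketched in Python as follows. The function name `systematic_sample` and its parameters are hypothetical names chosen for illustration; the figures N = 10000, n = 1000 and starting point 3 come from the example in these notes.

```python
import random

def systematic_sample(population_size, sample_size, start=None):
    """Return the case numbers picked by systematic sampling.

    Cases are assumed to be numbered 1..population_size.
    """
    k = population_size // sample_size      # sampling interval K = N / n
    if start is None:
        start = random.randint(1, k)        # random start within the first interval
    return [start + i * k for i in range(sample_size)]

# Worked example: N = 10000, n = 1000, random start = 3
selected = systematic_sample(10000, 1000, start=3)
print(selected[:4])   # [3, 13, 23, 33]
```

Leaving `start` unset draws the starting point at random from the first interval, which is what keeps the method a probability sample.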
Advantages of systematic random sampling
• Easy to organize.
• More precise than simple random sampling
and more evenly spread over population.
• Simple to apply the analysis of data and has a
sound mathematical basis.
• Bias is eliminated.
Disadvantages
=7/10
3. Ten is the common denominator for both locations, so the proportional
representation is 3:7. This means that for every 3 people in Kisumu
East location there are 7 people in Kisumu West location.
4. If the sample size is 1000, apply the same 3:7 ratio in which the population
occurs to get smaller sub-samples: 3/10 of 1000 for Kisumu East
and 7/10 of 1000 for Kisumu West. These smaller sub-samples are called stratified
samples.
5. After establishing the sub-samples, then apply simple random sampling to get
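The proportional split in step 4 can be checked with a short calculation. This is a sketch only: the function name `proportional_allocation` is hypothetical, and it uses integer division, which is exact here but would need a rounding rule when the sample size is not divisible by the ratio total.

```python
def proportional_allocation(sample_size, ratio):
    """Split a total sample size across strata in proportion to `ratio`."""
    total_parts = sum(ratio.values())
    return {
        stratum: sample_size * parts // total_parts
        for stratum, parts in ratio.items()
    }

# Ratio 3:7 and sample size 1000, as in the example above.
allocation = proportional_allocation(1000, {"Kisumu East": 3, "Kisumu West": 7})
print(allocation)   # {'Kisumu East': 300, 'Kisumu West': 700}
```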
4. Cluster sampling method
Form 1
Section 1
Hospital Name………….Date………Ward
Admissions Discharges and Deaths
Section 2
Inter-ward transfers within the hospital
Admissions Discharges
Section 3
Paroles
Admissions from Paroles Discharges from Paroles
Section 4 Computation
Previously Daily Return NO Todays Daily Return Numbers
Beds Cots Total Beds Cots Total
Patients Patients
Vacant Vacant
Total Total
Example of Daily Ward Return
Daily In-Patient Statistics
Ward……………………………………….. Month……………………………………
Days
1 New Adm Re-Adm Total Adm Discharge Home Inter ward Trans Parole Deaths Absco Trans in Trans Out IP Days
6
Daily summary form for In-patient Statistics
Ward……………………………………….. Month……………………………………
Transfer Transfer No. Well Occupied Bed
Beds Cots Admissions Discharges In Out Deaths Absco Persons Days
M/W
F/W
MAT
P/W
NBU
FORMULAS FOR COMPUTING HOSPITAL ADMINISTRATIVE STATISTICS
1. Available bed days = period (days) × allocated beds
ABD = P × AB
2. Occupied bed days / in-patient days = direct summary of DBR state
OBD = sum over the period of daily IP days (daily occupancy)
OR
OBD = ABD × %OCC / 100
OR
OBD = LOS × (discharges + deaths)
3. % occupancy = (occupied bed days / available bed days) × 100
%OCC = (OBD / ABD) × 100
4. Vacant bed days = available bed days − occupied bed days
VBD = ABD − OBD (sum of daily bed vacancies over the period)
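A small worked example of these formulas, as a sketch under stated assumptions: the 30-day period, the 40 allocated beds and the daily in-patient counts are hypothetical figures, not data from these notes.

```python
# Hypothetical figures (assumptions for illustration).
period_days = 30
allocated_beds = 40
# Daily in-patient count, one entry per day, as read from daily returns.
daily_inpatients = [32] * 10 + [36] * 10 + [28] * 10

abd = period_days * allocated_beds   # available bed days: 30 x 40 = 1200
obd = sum(daily_inpatients)          # occupied bed days: 320 + 360 + 280 = 960
occupancy_pct = obd * 100 / abd      # % occupancy
vbd = abd - obd                      # vacant bed days

print(abd, obd, occupancy_pct, vbd)  # 1200 960 80.0 240
```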
1. Tabulation (Tables)
2. Charts-graphical representation for data visualization
Types of charts and graphs, examples:
• Bar chart
• Line chart
• Pie chart
• Maps
• Density maps
• Scatter plot
• Gantt chart
Tabulation
What is Tabular Presentation of Data? It is a table that helps to represent
even a large amount of data in an engaging, easy to read, and coordinated
manner. The data is arranged in rows and columns. This is one of the most
popularly used forms of presentation of data as data tables are simple to
prepare and read.
Objectives Of Tabulation
(2) Title
● The title of the table is the second item, shown just above the table.
● It narrates the contents of the table, hence it has to be very clear, brief, and carefully worded.
(3) Head note
● It is the third item, shown just above the table and after the title.
● It gives information about units of data like ‘amount in rupees or $’, ‘quantity in tonnes’, etc.
● It is generally given in brackets.
(4) Captions or column headings
● At the top of each column in a table, a column designation/head is given to explain the figures of the column.
● This column heading is known as a ‘caption’.
(5) Stubs or row headings
● The titles of the horizontal rows are known as ‘stubs’.
(6) Body of the table
● It contains the numeric information and reveals the whole story of the investigated facts. Columns are read vertically from top to bottom and rows are read horizontally from left to right.
(7) Source note
● It is a brief statement or phrase indicating the source of the data presented in the table.
(8) Footnote
● It explains any specific feature of the table which is not self-explanatory and has not been explained earlier, for example points of exception, if any.
Rules and guidelines for tabular presentation
1. Table must be numbered
2. Brief and self explanatory title must be given to each
table.
3. The heading of columns and rows must be clear,
sufficient, concise and fully defined.
4. The data must be presented according to size of
importance, chronologically, alphabetically or
geographically
5. If data includes rate or proportion, mention the
denominator.
6. Table should not be too large.
7. Figures needing comparison should be placed as close
as possible.
8. The classes should be fully defined, should not lead to any
ambiguity.
9. The classes should be exhaustive i.e. should include all the
given values.
10. The classes should be mutually exclusive and non
overlapping.
11. The classes should be of equal width or class interval
should be same
12. Open ended classes should be avoided as far as possible.
13. The number of classes should be neither too large nor too
small. Can be 10-20 classes.
14. Formula for number of classes (K):
K = 1 + 3.322 log10(N), where N is the total frequency
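The formula in rule 14 is straightforward to evaluate. In this sketch the function name `number_of_classes` is a hypothetical label, and N = 78 is used as an example input; rounding the result to a whole number of classes is left to the analyst.

```python
import math

def number_of_classes(total_frequency):
    """K = 1 + 3.322 * log10(N), where N is the total frequency."""
    return 1 + 3.322 * math.log10(total_frequency)

k = number_of_classes(78)
print(round(k, 2))   # about 7.29
```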
Tabulation
• Can be Simple or Complex depending
upon the number of measurements of
single set or multiple sets of items.
• Simple table :
Title: Numbers of cases of various diseases in Nair hospital in 2009
Disease Cases
Malaria 1100
Acute GE 248
Leptospirosis 60
Dengue 100
Total 1508
Frequency distribution table with qualitative data:
Type of malaria   Jun-10 (Adult, Child)   Jul-10 (Adult, Child)   Total
120-129    8    4   12
130-139    4    4    8
140-149    6    4   10
150-159    5    5   10
160-169    9    6   15
170-179    9    9   18
180-189    3    2    5
Total     44   34   78
Chart and diagram
YEAR Pts
2001 61000
2002 75000
2003 50000
2004 80000
Total 266000
Simple bar diagram:
[Figure: bar chart, "Malaria cases in Nair Hospital in July 2010"; x-axis: P. vivax, P. falciparum, mixed malaria; y-axis scale: 0 to 120]
• Multiple bar chart: Each observation has more
than one value, represented by a group of bars.
Examples: percentage of males and females in different
countries, percentage of deaths from heart
diseases in old and young age, mode of delivery
(cesarean or vaginal) in different female age
groups.
Multiple or Compound diagram
[Figure: grouped bar chart, "Distribution of malaria cases in Nair Hospital in July 2010"; x-axis: P. vivax, P. falciparum, mixed malaria; legend: Male, Female; data labels shown: 102, 62, 57, 31, 29, 19]
Multiple or Compound bar chart.
[Figure: bar chart; x-axis: years (2001, 2002, 2003, 2004, TOTAL); y-axis: clinic attendance]
Component/Sectional bar chart
• This is also called a subdivided bar diagram. Instead of placing the bars for
each component side by side, we place them one on top of the other.
• Each bar is subdivided into two or more components across the years.
[Figure: component bar chart; x-axis: years (2001, 2002, 2003, 2004, TOTAL); y-axis: clinic attendance]
Percentage/Component bar charts
[Figure: x-axis: height in cm (classes starting at 100, 110, 120, 130, 140, 150); y-axis: number of individuals]
Frequency Polygon
• Derived from a histogram by connecting
the mid points of the tops of the rectangles
in the histogram.
• The line connecting the centers of
histogram rectangles is called frequency
polygon.
• We can draw polygon without rectangles
so we will get simpler form of line graph.
• A special type of frequency polygon is the
Normal Distribution Curve.
Frequency polygon
[Figure: frequency polygon, "Fasting blood glucose level in diabetics at the time of diagnosis"; x-axis: classes 120-129 to 180-189; y-axis: No of diabetics]
Cumulative frequency diagram or ogive
[Figure: cumulative frequency curve; x-axis: classes 120-129 to 180-189; y-axis: No of diabetics]
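The values plotted on a cumulative frequency diagram are running totals of the class frequencies. The sketch below uses the class totals from the 120-129 to 180-189 frequency table earlier in these notes; treating that table as the data behind this diagram is an assumption.

```python
from itertools import accumulate

classes = ["120-129", "130-139", "140-149", "150-159",
           "160-169", "170-179", "180-189"]
frequencies = [12, 8, 10, 10, 15, 18, 5]   # class totals from the table

# Running total of frequencies: the heights of the ogive.
cumulative = list(accumulate(frequencies))

for label, total in zip(classes, cumulative):
    print(label, total)
# The last cumulative value equals the total frequency (78).
```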
Scatter/ dot diagram
Also called a correlation diagram, it is useful to
represent the relationship between two numeric
measurements, each observation being
represented by a point corresponding to its value
on each axis.
In negative correlation, the points will be
scattered in a downward direction, meaning that
the relation between the two studied
measurements is inverse, i.e. if one
measure increases the other decreases.
In positive correlation, the points will be
scattered in an upward direction.
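The direction of the scatter can be summarized numerically with Pearson's correlation coefficient r, which is positive for an upward pattern and negative for a downward one. This is a minimal sketch; the function `pearson_r` is written out by hand for illustration and the three data series are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(ss_x * ss_y)

# Hypothetical measurements (assumptions for illustration).
age = [30, 40, 50, 60, 70]
systolic_bp = [118, 124, 131, 139, 146]   # tends to rise with age
exercise_hours = [6, 5, 4, 2, 1]          # tends to fall with age

r_positive = pearson_r(age, systolic_bp)      # close to +1: upward scatter
r_negative = pearson_r(age, exercise_hours)   # close to -1: downward scatter
print(round(r_positive, 3), round(r_negative, 3))
```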
[Figure: "Malaria cases during monsoon in Nair Hospital: Year 2010"; series: Malaria cases; data labels shown: 450, 304, 89, 30]
Line diagram:
It is a diagram showing the relationship between two numeric variables
(as in the scatter diagram), but the points are joined together to form a line (either
a broken line or a smooth curve). It is used to show the trend of events with the
passage of time.
[Figure: line diagram, "Changes in body temperature of a patient after use of antibiotic"; x-axis: time in hours; y-axis: temperature, scale 36 to 39.5]
Pie diagram:
• Consist of a circle whose area represents
the total frequency (100%) which is
divided into segments.
• Each segment represents a proportional
composition of the total frequency.
[Figure: pie diagram; segments include P. vivax and P. falciparum; percentage labels shown: 53%, 32%]