
Data Management & Analysis

Types (tool) of data collection methods

1) Observation:
 - Surveys and questionnaires
 - Investigation (lab sample, X-ray, clinical remark, etc.)
2) Interview
3) Review of records
 The researcher's aim is to condense data in a meaningful way to
extract useful information.
The purpose of most studies is to collect data to obtain
information about a particular area of research.
Data are the values of one or more variables that have been collected.
Data can be quantitative (numerical values that are measured or counted)
or qualitative (a quality, characteristic, or constituent of a
person or thing, recorded as a category).
Statistics encompasses the methods of collecting,
summarizing, analyzing and drawing conclusions from the data.

 This is achieved through the use of statistical techniques.


Type of variable

Variables are classified according to their:

 Type:
 Quantitative (continuous, discrete)
 Qualitative ( ordinal, nominal)

 Role in the study:


 Dependent
 Independent
Quantitative (Numerical) Variables

A variable is quantitative when it takes a numerical value. Numerical
data can be of two types:
Discrete data (Interval scale): occur when the variable
can only take certain whole numerical values. These
are often counts of numbers of events, e.g. number of
visits to a medical doctor in a year (3 visits/year). Discrete
data cannot have a value with a fraction or decimal place.

Continuous data (Ratio scale): occur when there is no
limitation on the values that the variable can take, e.g.
weight or height (weight 45.5 kg).
Qualitative (Categorical) variables

Each individual can only belong to one of a number of
distinct categories of the variable.

Nominal data: the categories are not ordered but
simply have names, e.g. marital status
(married/single/widowed/divorced) or gender (male/female).

Ordinal data: the categories are ordered in some way,
e.g. degree of pain (severe, moderate, mild, none).
Types of Data and Scale used to measure data

Variable
  Categorical (qualitative)
    Nominal: categories are unordered (e.g. gender: male/female)
    Ordinal: categories are ordered (e.g. disease stage: mild/moderate/severe)
  Numerical (quantitative)
    Discrete: integer values, typically counts (e.g. days of sickness per year)
    Continuous: takes any value in a range of values (e.g. weight in kg; height in cm)
Dependent variable:
Describes or measures the problem under study (e.g. disease outcome,
performance).
Independent variable:
Describes or measures the factor that is assumed to cause or at least
influence the problem.
E.g.: Cause (independent variable) → Effect/Outcome (dependent variable).
E.g.: case study on diabetes:
Dependent variable (blood glucose or HbA1c)
Independent variables (age, gender, weight, duration of disease,
etc.).
Derived data
To make the data more meaningful, derived data are sometimes
used.
 Percentage (%): a number or ratio expressed as a
fraction of 100.

 Rate: the quantity, amount or degree of something
measured in a specified period of time, e.g. infant mortality
rate: the number of infant deaths per 1000 live births.
 Ratio or quotient: a numerical expression which indicates
the relationship in quantity, amount or size between two or
more parts, e.g. body mass index (BMI), calculated as
weight (kg) / (height (m))^2
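The three kinds of derived data above can be sketched in a few lines of Python. This is only an illustration: the function names and all the figures are hypothetical, and BMI is assumed to use weight in kilograms and height in metres.

```python
def percentage(part, whole):
    """Express part as a fraction of 100 of whole."""
    return 100.0 * part / whole


def rate_per_1000(events, population):
    """Number of events per 1000 units of the population at risk."""
    return 1000.0 * events / population


def bmi(weight_kg, height_m):
    """Body mass index: weight divided by height squared."""
    return weight_kg / height_m ** 2


# Hypothetical figures, for illustration only.
smokers_pct = percentage(30, 120)          # 30 smokers among 120 respondents
infant_mortality = rate_per_1000(45, 1500) # 45 infant deaths per 1500 live births
patient_bmi = round(bmi(45.5, 1.60), 1)    # weight 45.5 kg, height 1.60 m

print(smokers_pct, infant_mortality, patient_bmi)
```

Each function turns raw counts or measurements into a figure that can be compared across groups of different size.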
Variation in Data
A. Bias
B. Accuracy
C. Precision
Bias
Bias is the systematic variation of measurements.
Unlike random variation, bias results in measurements
that are systematically higher or lower than the true
underlying value of the diagnostic variable.

 Bias may result from a flaw (defect) in the
measurement process or from sampling error.

Accuracy
When the measurement process yields values that are equal, on the
average, to the true underlying value of the variable being measured,
the measurement (instrument or process) is accurate or unbiased.
E.g. (B data)
[Figure: distributions A and B, number of observations against diastolic blood pressure (mm Hg), axis ticks at 80 and 90, with chance and bias marked]
Evaluating accuracy: the accuracy of a set of
measurements is determined by comparing the
average (called the mean) of a set of readings of a
given variable with the true underlying value.

Noma, UMST, 2016


Precision
The degree to which a series of measurements fluctuate around a
central measurement is the precision, or reproducibility, of the
measurements. The central value may or may not be the true value
of the variable. (A data)
[Figure: distributions A and B, number of observations against diastolic blood pressure (mm Hg), axis ticks at 80 and 90, mean = 90, with chance and bias marked]

Bias refers to systematic variation in a series of
measurements.
Accuracy refers to a series of measurements that, on the
average, equal the underlying value of the variable being
measured.
Precision describes the degree to which a series of
measurements fluctuate around a central value.
Chance describes the type of variation that, in general,
results in values above or below the true value with equal
probability.
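The definitions above can be checked numerically. The sketch below assumes a true diastolic pressure of 80 mm Hg and two made-up series of readings. Bias is estimated as the mean minus the true value, and precision is summarized by the standard deviation (a smaller value means more reproducible readings). The names and numbers are illustrative only.

```python
from statistics import mean, stdev

TRUE_DBP = 80  # assumed true diastolic blood pressure (mm Hg)


def bias(readings, true_value):
    """Systematic error: how far the average reading is from the true value."""
    return mean(readings) - true_value


def spread(readings):
    """Sample standard deviation: fluctuation around the central value."""
    return stdev(readings)


# Hypothetical readings, for illustration only.
series_a = [89, 90, 91, 90, 90]  # tightly clustered around 90: precise but biased
series_b = [70, 90, 75, 85, 80]  # centered on 80: accurate but less precise

for name, series in (("A", series_a), ("B", series_b)):
    print(name, "bias:", bias(series, TRUE_DBP), "spread:", round(spread(series), 2))
```

A series can therefore be precise without being accurate, and accurate without being precise.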
Data cleaning

During this phase the collected data are
inspected and reviewed, and erroneous data are
corrected.

Data cleaning can be done during data entry. It is
very important to avoid making subjective decisions.
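One way to keep cleaning objective is to fix the checking rules before looking at the data and apply them mechanically. The sketch below is a minimal example of that idea. The field names, plausibility ranges and records are all hypothetical, and flagged values would still need to be verified against the source record before correction.

```python
# Hypothetical plausibility ranges, agreed before data entry starts.
RULES = {
    "age_years": (0, 120),
    "weight_kg": (1, 300),
    "diastolic_mmhg": (30, 150),
}


def find_errors(record):
    """Return the fields that are missing or outside their allowed range."""
    errors = []
    for field, (low, high) in RULES.items():
        value = record.get(field)
        if value is None or not (low <= value <= high):
            errors.append(field)
    return errors


# Made-up records, for illustration only.
records = [
    {"id": 1, "age_years": 34, "weight_kg": 45.5, "diastolic_mmhg": 80},
    {"id": 2, "age_years": 340, "weight_kg": 70, "diastolic_mmhg": 90},
    {"id": 3, "age_years": 51, "weight_kg": None, "diastolic_mmhg": 85},
]

flagged = {}
for record in records:
    errors = find_errors(record)
    if errors:
        flagged[record["id"]] = errors

print(flagged)
```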
Data Analysis
In quantitative research there are two ways in
which data are analyzed:
Descriptive Statistics
 Procedures used to describe a given collection of data.
 The purpose is to describe the sample at hand, i.e. the cases that
we have examined.
Inferential Statistics
 Procedures that let us generalize our findings beyond the
particular sample at hand to the larger population
represented by that sample.
Descriptive Statistics

Measures of central tendency include:
 Mean
 Median
 Mode
Measures of variability include:
 Range
 Percentiles
 Variance and standard deviation
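The measures listed above can all be computed with Python's standard `statistics` module. The data below (days of sickness per year for seven people) are made up for illustration. Quartiles stand in for percentiles, and the variance and standard deviation are the sample versions.

```python
import statistics as st

# Hypothetical data: days of sickness per year for seven people.
days_sick = [2, 3, 3, 5, 7, 8, 14]

# Measures of central tendency
avg = st.mean(days_sick)
med = st.median(days_sick)
most_common = st.mode(days_sick)

# Measures of variability
value_range = max(days_sick) - min(days_sick)
quartiles = st.quantiles(days_sick, n=4)  # 25th, 50th and 75th percentiles
variance = st.variance(days_sick)         # sample variance (divides by n - 1)
std_dev = st.stdev(days_sick)             # square root of the sample variance

print(avg, med, most_common)
print(value_range, quartiles, round(variance, 2), round(std_dev, 2))
```

The standard deviation is the square root of the variance, so it is expressed in the same unit as the data (here, days).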
Inferential Statistics
Statistical inference is the process of using data analysis to
infer properties of an underlying probability distribution.
It uses measures of association and tests of significance, e.g.:
1. RR (Relative Risk), OR (Odds Ratio).
2. Chi-square: to determine association in an analytical study,
using the P value.
3. t-test: compares 2 quantitative means (case and
control) to obtain a P value.
4. ANOVA: compares more than 2 quantitative means
(e.g. sub-cases with control) to obtain a P value.
Note: a probability (P value) of less than 0.05 is usually
considered to indicate a significant difference between the
results of cases and the results of controls.
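As a rough illustration of items 1 and 2, the sketch below computes the relative risk, the odds ratio and a Pearson chi-square test (without continuity correction) for a hypothetical 2x2 table. The counts are made up. For a 2x2 table the test has one degree of freedom, so the P value can be obtained from the standard library's `math.erfc`. In practice a statistical package would normally be used.

```python
import math

# Hypothetical 2x2 table, for illustration only:
#              disease   no disease
# exposed        a=30       b=70
# unexposed      c=15       d=85
a, b, c, d = 30, 70, 15, 85
n = a + b + c + d

# Relative risk: risk in the exposed divided by risk in the unexposed.
relative_risk = (a / (a + b)) / (c / (c + d))

# Odds ratio: odds of disease in the exposed divided by odds in the unexposed.
odds_ratio = (a * d) / (b * c)

# Pearson chi-square statistic for a 2x2 table (no continuity correction).
chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(round(relative_risk, 2), round(odds_ratio, 2))
print(round(chi_square, 2), round(p_value, 3))
print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```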
Descriptive statistics is distinguished
from inferential statistics in that descriptive
statistics aims to summarize a sample, rather than
use the data to learn about the population that the
sample of data is thought to represent.
This generally means that descriptive statistics,
unlike inferential statistics, is not developed on the
basis of probability theory.
Data Analysis Software
Once you have decided on your data collection method,
decided on the level of measurement for your variables,
and collected the data, you are ready to begin analyzing
the data.
There are several software programs, including:
 SPSS (Statistical Package for the Social Sciences)
 Epi Info
 BMDP (Biomedical Package)
 SAS (Statistical Analysis System)
 Stata
