Nursing Research Methods: Ph.D. in Nursing


Ph.D. in Nursing

NURSING RESEARCH METHODS

Module No: 8 DATA ANALYSIS

Name of the subtopic


8.1 PARAMETRIC TEST

Faculty Name:
Date:
Subject Code:
School of Nursing
Learning Objectives
 To know the concept of data analysis
 To understand the purpose of data analysis
 To explore the types of data analysis
 To describe the methods of parametric test

List of Contents
 Introduction
 Data analysis – meaning
 Purpose of analysis
 Types of analysis
 Parametric test
 Methods
 Summary
 References

Introduction
 Analysis of data is a highly skilled and technical job that should
be carried out only by the researcher or under his or her close
supervision. It is a very important step after data collection and
data preparation. Any researcher must know the techniques of
data analysis in order to choose one suited to the method of
study. In this module we will discuss parametric tests.

Meaning of DATA ANALYSIS


 Analysis of data means critical examination of the data for
studying the characteristics of the object under study and for
determining the patterns of relationship among the variables
relating to it, using both quantitative and qualitative methods.

Meaning of DATA ANALYSIS
 Analysis of data is a process of inspecting, cleaning,
transforming, and modeling data with the goal of discovering
useful information, suggesting conclusions, and supporting
decision-making. Data analysis has multiple facets and
approaches, encompassing diverse techniques under a variety
of names, in different business, science, and social science
domains.

Meaning of DATA ANALYSIS
 Analysis refers to breaking a whole into its separate
components for individual examination. Data analysis is a
process for obtaining raw data and converting it into
information useful for decision-making by users. Data is
collected and analyzed to answer questions, test hypotheses or
disprove theories.

Meaning of DATA ANALYSIS

 Statistician John Tukey defined data analysis in 1961 as:

"Procedures for analyzing data, techniques for interpreting the
results of such procedures, ways of planning the gathering of
data to make its analysis easier, more precise or more accurate,
and all the machinery and results of (mathematical) statistics
which apply to analyzing data."

Purpose of Analysis
 It summarizes a large mass of data into an understandable and
meaningful form.
 It makes descriptions exact.
 It aids the drawing of reliable inferences from observational
data.
 It facilitates identification of the causal factors underlying
complex phenomena.
 It helps in making estimations or generalizations from the results
of sample surveys.
 Inferential analysis is useful for assessing the significance of
specific sample results under assumed population conditions.

Steps in Analysis
 The first step involves construction of statistical distributions
and calculation of simple measures like averages, percentages,
etc.
 The second step is to compare two or more distributions or two
or more subgroups within a distribution.
 The third step is to study the nature of relationships among
variables.
 The next step is to find out the factors which affect the relationship
between a set of variables.
 The final step is testing the validity of inferences drawn from a
sample survey by using parametric tests of significance.

Types of Analysis
Statistical analysis may be broadly classified as

 Descriptive Analysis

 Inferential analysis

Descriptive statistics - meaning

 Descriptive statistics is the term given to the analysis of data that
helps describe, show or summarize data in a meaningful way
such that, for example, patterns might emerge from the data.
 Descriptive statistics do not, however, allow us to make
conclusions beyond the data we have analysed or reach
conclusions regarding any hypotheses we might have made.
 They are simply a way to describe our data.
Descriptive statistics - methods

 Typically, there are two general types of statistic that are used
to describe data:
 What is the “location” or “center” of the data? (“measures
of location”)
 How do the data vary? (“measures of variability”)
Measures of Location

 The various measures used to describe the location are the Mean,
Median and Mode.
 The mean is also called the average.
 If describing a population, the mean is denoted μ, the Greek
letter "mu" (a PARAMETER). If describing a sample, it is
denoted x̄, called "x-bar" (a STATISTIC).
 Appropriate for describing measurement data.
 Seriously affected by unusual values called "outliers".
Calculating Sample Mean

Formula:

x̄ = (Σ xᵢ) / n

That is, add up all of the data points and divide by the number
of data points.

Data (# ER arrivals in 1 hr): 2 8 3 4 1

Sample Mean = (2+8+3+4+1)/5 = 3.6 arrivals
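The calculation above can be sketched in a few lines of Python, using the ER-arrivals sample from this slide:

```python
# Sample mean: add up all of the data points, divide by the number of points.
data = [2, 8, 3, 4, 1]          # ER arrivals in 1 hr
sample_mean = sum(data) / len(data)
print(sample_mean)              # 3.6 arrivals
```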
Median

 Another name for the 50th percentile. Appropriate for describing
measurement data.
 "Robust to outliers," that is, not affected much by unusual
values.

Calculating Sample Median

Order data from smallest to largest.
If odd number of data points, the median is the middle value.

Data (# ER arrivals in 1 hr.): 2 8 3 4 1
Ordered Data: 1 2 3 4 8
Median = 3
Calculating Sample Median

Order data from smallest to largest.
If even number of data points, the median is the average of
the two middle values.

Data (# ER arrivals in 1 hr.): 2 8 3 4 1 8
Ordered Data: 1 2 3 4 8 8
Median = (3+4)/2 = 3.5
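Both median rules (odd and even n) can be sketched as one small Python function, using the same data as the two slides above:

```python
def median(values):
    """Middle value for odd n; average of the two middle values for even n."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([2, 8, 3, 4, 1]))     # odd n  -> 3
print(median([2, 8, 3, 4, 1, 8]))  # even n -> 3.5
```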


Mode

 The mode of a data set is the value that occurs with
greatest frequency.
 The greatest frequency can occur at two or more
different values.
 If the data have exactly two modes, the data are
bimodal.
 If the data have more than two modes, the data are
multimodal.
Mode

450 occurred most frequently (7 times)


Mode = 450

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
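The mode can be verified with Python's collections.Counter; the list below is the 70-value data set from this slide:

```python
from collections import Counter

data = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

# most_common(1) returns the (value, count) pair with the highest frequency.
mode_value, frequency = Counter(data).most_common(1)[0]
print(mode_value, frequency)  # 450 occurs 7 times
```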
Percentiles

 A percentile provides information about how the
data are spread over the interval from the smallest
value to the largest value.
 Admission test scores for colleges and universities
are frequently reported in terms of percentiles.

The pth percentile of a data set is a value such that at least p
percent of the items take on this value or less and at least
(100 - p) percent of the items take on this value or more.
Percentiles

Arrange the data in ascending order.

Compute index i, the position of the pth percentile:

i = (p/100)n

If i is not an integer, round up. The pth percentile
is the value in the ith position.

If i is an integer, the pth percentile is the average
of the values in positions i and i+1.
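The rule above translates directly into Python; positions are 1-based, so we subtract 1 when indexing. This is a sketch of the textbook rule only: NumPy and other libraries use slightly different percentile conventions.

```python
import math

def percentile(values, p):
    """pth percentile using the i = (p/100)n rule from this slide."""
    ordered = sorted(values)
    i = (p / 100) * len(ordered)
    if i != int(i):
        # i is fractional: round up and take that (1-based) position.
        return ordered[math.ceil(i) - 1]
    i = int(i)
    # i is an integer: average the values in positions i and i+1.
    return (ordered[i - 1] + ordered[i]) / 2

print(percentile([2, 8, 3, 4, 1], 50))  # i = 2.5 -> position 3 -> 3
```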
90th Percentile

i = (p/100)n = (90/100)70 = 63
Averaging the 63rd and 64th data values:
90th Percentile = (580 + 590)/2 = 585
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Quartiles

 Quartiles are specific percentiles.


 First Quartile = 25th Percentile
 Second Quartile = 50th Percentile = Median
 Third Quartile = 75th Percentile
Third quartile = 75th percentile
i = (p/100)n = (75/100)70 = 52.5, which rounds up to 53
Third quartile = 53rd value = 525
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Measures of Variability
 It is often desirable to consider measures of variability
(dispersion), as well as measures of location.
 For example, in choosing supplier A or supplier B we
might consider not only the average delivery time for
each, but also the variability in delivery time for each.

Range
 The range of a data set is the difference between the
largest and smallest data values.
 It is the simplest measure of variability.
 It is very sensitive to the smallest and largest data
values.
Range
Range = largest value - smallest value
Range = 615 - 425 = 190
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Interquartile Range
 The interquartile range of a data set is the difference
between the third quartile and the first quartile.
 It is the range for the middle 50% of the data.
 It overcomes the sensitivity to extreme data values.
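Range and interquartile range can be sketched together. The ten data values below are a hypothetical small sample (an assumption for illustration), and the quartiles use the percentile rule given earlier:

```python
import math

def pct(values, p):
    """Percentile via the textbook rule: i = (p/100)n, round up if fractional,
    else average the (1-based) positions i and i+1."""
    v = sorted(values)
    i = (p / 100) * len(v)
    if i != int(i):
        return v[math.ceil(i) - 1]
    return (v[int(i) - 1] + v[int(i)]) / 2

data = [425, 430, 450, 450, 460, 475, 490, 510, 550, 615]  # hypothetical sample

data_range = max(data) - min(data)   # largest - smallest = 615 - 425 = 190
iqr = pct(data, 75) - pct(data, 25)  # Q3 - Q1 = 510 - 450 = 60
print(data_range, iqr)
```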
Variance
The variance is a measure of variability that utilizes
all the data.

It is based on the difference between the value of
each observation (xᵢ) and the mean (x̄ for a sample, μ for a
population).

The variance is the average of the squared
differences between each data value and the mean.

The variance is computed as follows:

s² = Σ(xᵢ − x̄)² / (n − 1)   for a sample

σ² = Σ(xᵢ − μ)² / N         for a population
Standard Deviation
The standard deviation of a data set is the positive
square root of the variance.

It is measured in the same units as the data, making
it more easily interpreted than the variance.

The standard deviation is computed as follows:

s = √s²   for a sample
σ = √σ²   for a population
Coefficient of Variation

The coefficient of variation indicates how large the
standard deviation is in relation to the mean.

The coefficient of variation is computed as follows:

CV = (s / x̄) × 100 %   for a sample
CV = (σ / μ) × 100 %   for a population
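Variance, standard deviation, and coefficient of variation can be computed together; a sketch using the five ER-arrival values from earlier:

```python
import math

data = [2, 8, 3, 4, 1]
n = len(data)
x_bar = sum(data) / n

# Sample variance: s^2 = sum((x_i - x_bar)^2) / (n - 1)
s2 = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Standard deviation: positive square root of the variance (same units as data).
s = math.sqrt(s2)

# Coefficient of variation: standard deviation relative to the mean, in percent.
cv = (s / x_bar) * 100

print(round(s2, 2), round(s, 2), round(cv, 1))  # 7.3 2.7 75.1
```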
Descriptive Analysis
Descriptive statistics are used to describe the basic features of the
data in a study. They provide simple summaries about the sample
and the measures. Descriptive statistics is the discipline of
quantitatively describing the main features of a collection of data
or the quantitative description itself.
In such analysis there are
 Univariate analysis
 Bivariate analysis and
 Multivariate analysis.

Descriptive Analysis
 Univariate Analysis
 Univariate analysis involves describing the distribution of a
single variable, including its central tendency (including the
mean, median, and mode) and dispersion (including the range
and quartiles of the data-set, and measures of spread such as
the variance and standard deviation).
 The shape of the distribution may also be described via indices
such as skewness and kurtosis.
 Characteristics of a variable's distribution may also be depicted
in graphical or tabular format, including histograms and stem-
and-leaf display.

Descriptive Analysis
 Bivariate Analysis
Bivariate analysis is one of the simplest forms of quantitative
(statistical) analysis. It involves the analysis of two variables
(often denoted as X, Y), for the purpose of determining the
empirical relationship between them. Common forms of bivariate
analysis involve creating a percentage table or a scatter plot
graph and computing a simple correlation coefficient.

Descriptive Analysis
 Multivariate Analysis
In multivariate analysis multiple relations between multiple
variables are examined simultaneously. Multivariate analysis
(MVA) is based on the statistical principle of multivariate
statistics, which involves observation and analysis of more than
one statistical outcome variable at a time. In design and analysis,
the technique is used to perform trade studies across multiple
dimensions while taking into account the effects of all variables
on the responses of interest.

Descriptive Analysis
Multivariate Analysis - Techniques

 Principal Components Analysis


 Hierarchical Cluster Analysis
 Non-Hierarchical Clustering, or Partitioning
 Discriminant Analysis
 Correspondence Analysis
 Factor Analysis
 Cluster Analysis

Inferential Analysis
 Inferential statistics is concerned with making predictions or
inferences about a population from observations and analyses
of a sample. That is, we can take the results of an analysis
using a sample and can generalize it to the larger population
that the sample represents.
There are two areas of statistical inferences
(a) Statistical estimation and
(b) The testing of hypothesis

Inferential statistical - Meaning

 Inferential statistics are conclusions that extend beyond the
immediate data alone. They help us make inferences about the
population from samples.
 How samples relate to larger collections of data (called
populations) from which they have been drawn is the subject of
inferential statistical methods.
 Inferential statistics are used to make inferences from our data to
more general conditions; we use descriptive statistics simply to
describe what's going on in our data.
Inferential statistical methods

 The methods of inferential statistics are


(1) the estimation of parameter(s) and
(2) testing of statistical hypotheses.
Inferential Statistics involves 5 Steps as follows,
 To determine if SAMPLE means come from same population, use 5 steps
with inferential statistics
1. State Hypothesis
 Ho: no difference between 2 means; any difference found is due to
sampling error
 any significant difference found is not a TRUE difference, but
CHANCE due to sampling error
 results are stated in terms of the probability that Ho is false
 findings are stronger if we can reject Ho
 therefore, we need to specify both Ho and H1
Steps in Inferential Statistics
2. Level of Significance
 Probability that sample means are different enough to reject
Ho (.05 or .01)
 level of probability or level of confidence

3. Computing Calculated Value
 Use a statistical test to derive some calculated value (e.g., t
value or F value)

4. Obtain Critical Value
 a criterion based on df and alpha level (.05 or .01) that is
compared to the calculated value to determine if findings are
significant and therefore reject Ho
Steps in Inferential Statistics

5. Reject or Fail to Reject Ho
 the CALCULATED value is compared to the CRITICAL value to
determine if the difference is significant enough to reject Ho at
the predetermined level of significance
 If CRITICAL value > CALCULATED value --> fail to reject Ho
 If CRITICAL value < CALCULATED value --> reject Ho
 If we reject Ho, this only supports H1; it does not prove H1
Testing Hypothesis

 If we reject Ho and conclude the groups are really different, it
doesn't mean they are different for the reason you hypothesized
 there may be another reason
 Since Ho testing is based on sample means, not population
means, there is a possibility of making an error or wrong
decision in rejecting or failing to reject Ho
 Type I error
 Type II error
Testing Hypothesis

 Type I error -- rejecting Ho when it was true (it should have been
accepted)
 probability equal to alpha (α)
 if α = .05, then there's a 5% chance of a Type I error

 Type II error -- accepting Ho when it should have been rejected
 If you increase α, you will decrease the chance of a Type II error
Identifying the Appropriate Statistical Test of
Difference

One variable: One-way chi-square

Two variables (1 IV with 2 levels; 1 DV): t-test

Two variables (1 IV with 2+ levels; 1 DV): ANOVA

Three or more variables: ANOVA


Tests of Significance
 Parametric tests of significance – used if there are at least 30
observations, the population can be assumed to be normally
distributed, and variables are at least on an interval scale
 Z tests are used with samples over 30. There are four kinds
(two samples or two categories).
 t-tests are used when samples are 30 or less.
 Single sample t-test (one sample)
 Independent t-test (two samples)
 Paired t-test (two categories)
Parametric test
Parametric tests assume that the variable in question has a known
underlying mathematical distribution that can be described
(normal, binomial, Poisson, etc.). This underlying distribution is
the fundamental basis for all sample-to-population inference.
Conventional statistical procedures are also called parametric
tests.
In a parametric test a sample statistic is obtained to estimate the
population parameter. Because this estimation process involves a
sample, a sampling distribution, and a population, certain
parametric assumptions are required to ensure all components are
compatible with each other.

Parametric test
 The assumptions are:

a) The observations must be independent
b) The observations must be drawn from normally distributed
populations
c) These populations must have the same variances
d) The means of these normal and homoscedastic populations
must be linear combinations of effects due to columns and/or
rows

Parametric test
METHODS

 1. Pearson correlation
 2. Independent-measures t-test
 3. Analysis of variance
 4. Paired t-test

Parametric test
1. Pearson correlation:
The quantity r, called the linear correlation coefficient, measures
the strength and the direction of a linear relationship between two
variables. The linear correlation coefficient is sometimes referred
to as the Pearson product moment correlation coefficient in
honour of its developer, Karl Pearson. The mathematical formula
for computing r is:

r = [nΣxy − (Σx)(Σy)] / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]}

where n is the number of pairs of data.
The value of r is such that -1 ≤ r ≤ +1. The + and − signs are used
for positive linear correlations and negative linear correlations,
respectively.

Parametric test - Pearson correlation
 Pearson correlation
Positive correlation: If x and y have a strong positive linear
correlation, r is close to +1. An r value of exactly +1
indicates a perfect positive fit. Positive values indicate a
relationship between the x and y variables such that as values
for x increase, values for y also increase.

Parametric test - Pearson correlation
 Negative correlation:
If x and y have a strong negative linear correlation, r is close to
-1. An r value of exactly -1 indicates a perfect negative fit.
Negative values indicate a relationship between x and y such that
as values for x increase, values for y decrease.

Parametric test - Pearson correlation
 No correlation:
If there is no linear correlation or only a weak linear correlation, r
is close to 0. A value near zero means that there is little or no
linear relationship between the two variables.

 Note that r is a dimensionless quantity; that is, it does not
depend on the units employed.

Parametric test - Pearson correlation
 A perfect correlation of ± 1 occurs only when the data points
all lie exactly on a straight line. If r = +1, the slope of this line
is positive. If r = -1, the slope of this line is negative.
 A correlation greater than 0.8 is generally described as strong,
whereas a correlation less than 0.5 is generally described as
weak.
 These values can vary based upon the "type" of data being
examined.
 A study utilizing scientific data may require a stronger
correlation than a study using social science data.

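A minimal sketch of computing r directly from its definition; the paired data below are hypothetical, chosen only for illustration:

```python
import math

x = [1, 2, 3, 4, 5]   # hypothetical paired observations
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# r = co-deviation of x and y / product of the spreads of x and y
numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
denominator = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                        * sum((b - mean_y) ** 2 for b in y))
r = numerator / denominator
print(round(r, 3))  # 0.775: a fairly strong positive linear correlation
```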
Parametric test
Independent measures t-test
 2. Independent-measures t-test:
The independent t-test involves examination of the significant
differences on one factor or dimension (dependent variable)
between the means of two independent groups (e.g., male vs.
female, with disability vs. without disability) or two
experimental groups (control group vs. treatment group).

Inferential statistics
Used for Testing for Mean Differences

T-test: when experiments include only 2 groups
a. Independent
b. Correlated
i. Within-subjects
ii. Matched

Critical values of the t statistic are based on
df & alpha level
Inferential statistics
Used for Testing for Mean Differences

Analysis of Variance (ANOVA): used when
comparing more than 2 groups

1. Between Subjects
2. Within Subjects – repeated measures

Critical values of the F statistic are based on
df & alpha level

More than one IV = factorial ANOVA (IVs = factors)
Only one IV = one-way ANOVA
Parametric test
Independent measures t-test
 For example, you might want to know whether there is a
significant difference in the level of social activity between
individuals with disabilities and individuals without
disabilities.
 A hypothesis testing procedure that uses separate samples for
each treatment condition (between-subjects design). This test is
used when the population mean and standard deviation are
unknown, and two separate groups are being compared.

Parametric test
Independent measures t-test
 Assumptions for the Independent t-Test:
 Independence:
Observations within each sample must be independent (they
don't influence each other)
 Normal Distribution:
The scores in each population must be normally distributed
 Homogeneity of Variance:
The two populations must have equal variances (the degree to
which the distributions are spread out is approximately equal)

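A sketch of the independent-measures t-test computed from first principles with a pooled variance. The group scores are hypothetical, and the critical value 2.228 is the standard two-tailed t-table entry for df = 10, α = .05:

```python
import math

# Hypothetical scores for two independent groups (e.g., control vs. treatment).
group1 = [5, 7, 6, 8, 9, 7]
group2 = [4, 5, 5, 6, 4, 6]

n1, n2 = len(group1), len(group2)
m1, m2 = sum(group1) / n1, sum(group2) / n2
v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)  # sample variances
v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)

# Pooling the variances relies on the homogeneity-of-variance assumption above.
pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_calc = (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))

t_crit = 2.228  # df = 10, alpha = .05, two-tailed (from a t table)
print(round(t_calc, 2), abs(t_calc) > t_crit)  # 2.93 True -> reject Ho
```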
Parametric test
 3.Analysis of Variance (ANOVA):
ANOVA is a set of statistical methods used mainly to compare
the means of two or more samples. Estimates of variance are
the key intermediate statistics calculated, hence the reference
to variance in the title ANOVA.
 The different types of ANOVA reflect the different
experimental designs and situations for which they have been
developed.
 An analysis of the variation between all of the variables used in
an experiment. This analysis can provide valuable insight into
the behaviour of a security or market index under various
conditions.

Parametric test-Analysis of Variance
 A) One-way ANOVA
This function compares the sample means for k groups. There
is an overall test for k means, multiple comparison methods for
pairs of means and tests for the equality of the variances of the
groups.
 Consider four groups of data that represent one experiment
performed on four occasions with ten different subjects each
time.
 One way ANOVA is more appropriate for finding statistical
evidence of inconsistency or difference across the means of the
four groups.

Parametric test-Analysis of Variance
 One way ANOVA assumes that each group comes from an
approximately normal distribution and that the variability
within the groups is roughly constant.
 The factors are arranged so that experiments are columns and
subjects are rows; this is how you must enter your data in the
StatsDirect workbook.
 The overall F test is fairly robust to small deviations from these
assumptions, but you could use the Kruskal-Wallis test as an
alternative to one-way ANOVA if there is any doubt.

Parametric test-Analysis of Variance
 Assumptions:
a. Random samples
b. Normally distributed observations in each population
c. Equal variance of observations in each population
d. The homogeneity of variance option
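The one-way ANOVA F statistic can be sketched from its sums of squares; the three groups below are hypothetical data invented for illustration:

```python
# One-way ANOVA: F = (between-groups mean square) / (within-groups mean square)
groups = [
    [6, 8, 4, 5, 3, 4],     # hypothetical group 1
    [8, 12, 9, 11, 6, 8],   # hypothetical group 2
    [13, 9, 11, 8, 7, 12],  # hypothetical group 3
]

k = len(groups)                    # number of groups
n = sum(len(g) for g in groups)    # total number of observations
grand_mean = sum(sum(g) for g in groups) / n

# Between-groups SS: spread of the group means around the grand mean.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# Within-groups SS: spread of observations around their own group mean.
ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(round(f_stat, 2))  # 9.26 on (2, 15) degrees of freedom
```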

Parametric test-Analysis of Variance
 B) Two-way ANOVA
This function calculates ANOVA for a two-way randomized
block experiment. There are overall tests for differences between
treatment means and between block means. Multiple comparison
methods are provided for pairs of treatment means.

Parametric test
 4. PAIRED t-TEST:
The paired t-test provides a hypothesis test of the difference
between population means for a pair of random samples whose
differences are approximately normally distributed. Note
that a pair of samples, each of which is not from a normal
distribution, often yields differences that are normally
distributed.
When the confidence interval for the mean difference lies a long
way from zero, a null hypothesis of no difference between the
means is clearly rejected.
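A sketch of the paired t-test computed from the differences; the before/after scores are hypothetical:

```python
import math

before = [72, 68, 75, 70, 74]  # hypothetical paired measurements
after  = [70, 65, 72, 69, 71]

diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
d_bar = sum(diffs) / n
# Sample standard deviation of the differences
s_d = math.sqrt(sum((d - d_bar) ** 2 for d in diffs) / (n - 1))

# t = mean difference / standard error of the differences
t_stat = d_bar / (s_d / math.sqrt(n))
print(round(t_stat, 2))  # 6.0 on df = 4
```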

Parametric test
One group t-test. Example
It is known that the weight of young adult males has a mean value
of 70.0 kg with a standard deviation of 4.0 kg.
Thus the population mean, µ= 70.0 and population standard
deviation, σ= 4.0.
 Data from a random sample of 28 males of similar ages but with a
specific enzyme defect: mean body weight of 67.0 kg and
sample standard deviation of 4.2 kg.
 Question: Does the studied group have a significantly lower
body weight than the general population?

Parametric test
One group t-test. Example
population mean, µ= 70.0
population standard deviation, σ= 4.0.
sample size = 28
sample mean, x = 67.0
sample standard deviation, s= 4.0.
 Null hypothesis: There is no difference between sample mean
and population mean.
 t - statistic = 0.15, p >0.05
 Null hypothesis is accepted at 5% level
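The t statistic for this one-sample example follows directly from the numbers on the slide; with s = 4.2 it works out to about −3.78, well beyond the two-tailed critical value of roughly 2.05 for df = 27:

```python
import math

mu = 70.0     # population mean (kg)
x_bar = 67.0  # sample mean (kg)
s = 4.2       # sample standard deviation (kg)
n = 28

# One-sample t: (sample mean - population mean) / standard error
t_stat = (x_bar - mu) / (s / math.sqrt(n))
print(round(t_stat, 2))  # -3.78 -> |t| > 2.05, so reject the null hypothesis
```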

Summary
 In this module we have covered the meaning, purpose, and types
of data analysis and its methods, along with examples and
assumptions. We have concentrated only on parametric tests. In
the forthcoming module we will discuss non-parametric tests in
detail.

References
 Carol Leslie Macnee, (2008), Understanding Nursing Research:
Using Research in Evidence-based Practice, Lippincott Williams
& Wilkins, ISBN 0781775582, 9780781775588
 Denise F. Polit, et al., (2013). ‘Nursing Research: Principles and
Methods’, revised edition, Philadelphia: Lippincott
 https://statistics.laerd.com/statistical-guides/descriptive-
inferential-statistics.php
 http://www.socialresearchmethods.net/kb/statdesc.php
 http://www.stat.purdue.edu/~wsharaba/stat511/chapter1_print.p
df
 http://fbm.uni-ruse.bg/d/mra/Introduction%20to%20statistical
%20methods.pdf
Thanks

NON-PARAMETRIC TEST

