Nursing Research Methods: Ph.D. in Nursing


Ph.D. in Nursing

NURSING RESEARCH METHODS

Module No: 8 DATA ANALYSIS

Name of the subtopic


8.1 PARAMETRIC TEST

Faculty Name:
Date:
Subject Code:
School of Nursing
Learning Objectives
 To know the concept of data analysis
 To understand the purpose of data analysis
 To explore the types of data analysis
 To describe the methods of parametric test

List of Contents
 Introduction
 Data analysis – meaning
 Purpose of analysis
 Types of analysis
 Parametric test
 Methods
 Summary
 References

Introduction
 Analysis of data is a highly skilled and technical job that should
be carried out only by the researcher or under his or her close
supervision. It is a very important step after data collection and
data preparation. Any researcher must know the techniques of
data analysis in order to choose one suited to the method of
study. In this module we will discuss parametric tests.

Meaning of DATA ANALYSIS


 Analysis of data means critical examination of the data for
studying the characteristics of the object under study and for
determining the patterns of relationship among the variables
relating to it, using both quantitative and qualitative methods.

Meaning of DATA ANALYSIS
 Analysis of data is a process of inspecting, cleaning,
transforming, and modeling data with the goal of discovering
useful information, suggesting conclusions, and supporting
decision-making. Data analysis has multiple facets and
approaches, encompassing diverse techniques under a variety
of names, in different business, science, and social science
domains.

Meaning of DATA ANALYSIS
 Analysis refers to breaking a whole into its separate
components for individual examination. Data analysis is a
process for obtaining raw data and converting it into
information useful for decision-making by users. Data is
collected and analyzed to answer questions, test hypotheses or
disprove theories.

Meaning of DATA ANALYSIS

 Statistician John Tukey defined data analysis in 1961 as:

"Procedures for analyzing data, techniques for interpreting the
results of such procedures, ways of planning the gathering of
data to make its analysis easier, more precise or more accurate,
and all the machinery and results of (mathematical) statistics
which apply to analyzing data."

Purpose of Analysis
 It summarizes a large mass of data into an understandable and
meaningful form.
 It makes descriptions exact.
 It aids the drawing of reliable inferences from observational
data.
 It facilitates identification of the causal factors underlying
complex phenomena.
 It helps in making estimations or generalizations from the results
of sample surveys.
 Inferential analysis is useful for assessing the significance of
specific sample results under assumed population conditions.

Steps in Analysis
 The first step involves construction of statistical distributions
and calculation of simple measures like averages, percentages,
etc.
 The second step is to compare two or more distributions or two
or more subgroups within a distribution.
 The third step is to study the nature of relationships among
variables.
 The next step is to find out the factors which affect the relationship
between a set of variables.
 The final step is testing the validity of inferences drawn from a
sample survey by using parametric tests of significance.

Types of Analysis
Statistical analysis may be broadly classified as

 Descriptive Analysis

 Inferential analysis

Descriptive statistics - meaning

 Descriptive statistics is the term given to the analysis of data that
helps describe, show or summarize data in a meaningful way
such that, for example, patterns might emerge from the data.
 Descriptive statistics do not, however, allow us to make
conclusions beyond the data we have analysed or reach
conclusions regarding any hypotheses we might have made.
 They are simply a way to describe our data.
Descriptive statistics - methods

 Typically, there are two general types of statistic that are used
to describe data:
 What is the “location” or “center” of the data? (“measures
of location”)
 How do the data vary? (“measures of variability”)
Measures of Location

 The various measures used to describe the location are the Mean,
Median and Mode.
 The mean is also called the average.
 If describing a population, the mean is denoted μ, the Greek
letter "mu" (a PARAMETER). If describing a sample, it is
denoted x̄, called "x-bar" (a STATISTIC).
 Appropriate for describing measurement data.
 Seriously affected by unusual values called "outliers".
Calculating Sample Mean

Formula:

x̄ = (Σ xᵢ) / n

That is, add up all of the data points and divide by the number
of data points.

Data (# ER arrivals in 1 hr): 2 8 3 4 1

Sample Mean = (2+8+3+4+1)/5 = 3.6 arrivals
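The calculation above can be sketched in a few lines of Python, using the ER-arrivals sample from this slide:

```python
# Sample mean: add up all of the data points, divide by the number of points.
data = [2, 8, 3, 4, 1]          # ER arrivals in 1 hr
sample_mean = sum(data) / len(data)
print(sample_mean)              # 3.6 arrivals
```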
Median

 Another name for the 50th percentile. Appropriate for describing
measurement data.
 "Robust to outliers," that is, not affected much by unusual
values.

Calculating Sample Median

Order data from smallest to largest.
If odd number of data points, the median is the middle value.

Data (# ER arrivals in 1 hr.): 2 8 3 4 1
Ordered Data: 1 2 3 4 8
Median = 3
Calculating Sample Median

Order data from smallest to largest.
If even number of data points, the median is the average of
the two middle values.

Data (# ER arrivals in 1 hr.): 2 8 3 4 1 8
Ordered Data: 1 2 3 4 8 8
Median = (3+4)/2 = 3.5
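Both median rules (odd and even n) can be sketched as one small Python function, using the same data as the two slides above:

```python
def median(values):
    """Middle value for odd n; average of the two middle values for even n."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([2, 8, 3, 4, 1]))     # odd n  -> 3
print(median([2, 8, 3, 4, 1, 8]))  # even n -> 3.5
```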


Mode

 The mode of a data set is the value that occurs with
greatest frequency.
 The greatest frequency can occur at two or more
different values.
 If the data have exactly two modes, the data are
bimodal.
 If the data have more than two modes, the data are
multimodal.
Mode

450 occurred most frequently (7 times)


Mode = 450

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
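The mode can be verified with Python's collections.Counter; the list below is the 70-value data set from this slide:

```python
from collections import Counter

data = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

# most_common(1) returns the (value, count) pair with the highest frequency.
mode_value, frequency = Counter(data).most_common(1)[0]
print(mode_value, frequency)  # 450 occurs 7 times
```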
Percentiles

 A percentile provides information about how the
data are spread over the interval from the smallest
value to the largest value.
 Admission test scores for colleges and universities
are frequently reported in terms of percentiles.

The pth percentile of a data set is a value such that at least p
percent of the items take on this value or less and at least
(100 - p) percent of the items take on this value or more.
Percentiles

Arrange the data in ascending order.

Compute index i, the position of the pth percentile:

i = (p/100)n

If i is not an integer, round up. The pth percentile
is the value in the ith position.

If i is an integer, the pth percentile is the average
of the values in positions i and i+1.
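The rule above translates directly into Python; positions are 1-based, so we subtract 1 when indexing. This is a sketch of the textbook rule only: NumPy and other libraries use slightly different percentile conventions.

```python
import math

def percentile(values, p):
    """pth percentile using the i = (p/100)n rule from this slide."""
    ordered = sorted(values)
    i = (p / 100) * len(ordered)
    if i != int(i):
        # i is fractional: round up and take that (1-based) position.
        return ordered[math.ceil(i) - 1]
    i = int(i)
    # i is an integer: average the values in positions i and i+1.
    return (ordered[i - 1] + ordered[i]) / 2

print(percentile([2, 8, 3, 4, 1], 50))  # i = 2.5 -> position 3 -> 3
```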
90th Percentile

i = (p/100)n = (90/100)70 = 63
Averaging the 63rd and 64th data values:
90th Percentile = (580 + 590)/2 = 585
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Quartiles

 Quartiles are specific percentiles.


 First Quartile = 25th Percentile
 Second Quartile = 50th Percentile = Median
 Third Quartile = 75th Percentile
Third quartile = 75th percentile
i = (p/100)n = (75/100)70 = 52.5, which rounds up to 53
Third quartile = 53rd value = 525
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Measures of Variability
 It is often desirable to consider measures of variability
(dispersion), as well as measures of location.
 For example, in choosing supplier A or supplier B we
might consider not only the average delivery time for
each, but also the variability in delivery time for each.

Range
 The range of a data set is the difference between the
largest and smallest data values.
 It is the simplest measure of variability.
 It is very sensitive to the smallest and largest data
values.
Range
Range = largest value - smallest value
Range = 615 - 425 = 190
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Interquartile Range
 The interquartile range of a data set is the difference
between the third quartile and the first quartile.
 It is the range for the middle 50% of the data.
 It overcomes the sensitivity to extreme data values.
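Range and interquartile range can be sketched together. The ten data values below are a hypothetical small sample (an assumption for illustration), and the quartiles use the percentile rule given earlier:

```python
import math

def pct(values, p):
    """Percentile via the textbook rule: i = (p/100)n, round up if fractional,
    else average the (1-based) positions i and i+1."""
    v = sorted(values)
    i = (p / 100) * len(v)
    if i != int(i):
        return v[math.ceil(i) - 1]
    return (v[int(i) - 1] + v[int(i)]) / 2

data = [425, 430, 450, 450, 460, 475, 490, 510, 550, 615]  # hypothetical sample

data_range = max(data) - min(data)   # largest - smallest = 615 - 425 = 190
iqr = pct(data, 75) - pct(data, 25)  # Q3 - Q1 = 510 - 450 = 60
print(data_range, iqr)
```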
Variance
The variance is a measure of variability that utilizes
all the data.

It is based on the difference between the value of
each observation (xᵢ) and the mean (x̄ for a sample, μ for a
population).

The variance is the average of the squared
differences between each data value and the mean.

The variance is computed as follows:

s² = Σ(xᵢ − x̄)² / (n − 1)   for a sample

σ² = Σ(xᵢ − μ)² / N         for a population
Standard Deviation
The standard deviation of a data set is the positive
square root of the variance.

It is measured in the same units as the data, making
it more easily interpreted than the variance.

The standard deviation is computed as follows:

s = √s²   for a sample
σ = √σ²   for a population
Coefficient of Variation

The coefficient of variation indicates how large the
standard deviation is in relation to the mean.

The coefficient of variation is computed as follows:

CV = (s / x̄) × 100 %   for a sample
CV = (σ / μ) × 100 %   for a population
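Variance, standard deviation, and coefficient of variation can be computed together; a sketch using the five ER-arrival values from earlier:

```python
import math

data = [2, 8, 3, 4, 1]
n = len(data)
x_bar = sum(data) / n

# Sample variance: s^2 = sum((x_i - x_bar)^2) / (n - 1)
s2 = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Standard deviation: positive square root of the variance (same units as data).
s = math.sqrt(s2)

# Coefficient of variation: standard deviation relative to the mean, in percent.
cv = (s / x_bar) * 100

print(round(s2, 2), round(s, 2), round(cv, 1))  # 7.3 2.7 75.1
```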
Descriptive Analysis
Descriptive statistics are used to describe the basic features of the
data in a study. They provide simple summaries about the sample
and the measures. Descriptive statistics is the discipline of
quantitatively describing the main features of a collection of data
or the quantitative description itself.
In such analysis there are
 Univariate analysis
 Bivariate analysis and
 Multivariate analysis.

Descriptive Analysis
 Univariate Analysis
 Univariate analysis involves describing the distribution of a
single variable, including its central tendency (including the
mean, median, and mode) and dispersion (including the range
and quartiles of the data-set, and measures of spread such as
the variance and standard deviation).
 The shape of the distribution may also be described via indices
such as skewness and kurtosis.
 Characteristics of a variable's distribution may also be depicted
in graphical or tabular format, including histograms and stem-
and-leaf display.

Descriptive Analysis
 Bivariate Analysis
Bivariate analysis is one of the simplest forms of quantitative
(statistical) analysis. It involves the analysis of two variables
(often denoted as X, Y), for the purpose of determining the
empirical relationship between them. Common forms of bivariate
analysis involve creating a percentage table or a scatter plot
graph and computing a simple correlation coefficient.

Descriptive Analysis
 Multivariate Analysis
In multivariate analysis multiple relations between multiple
variables are examined simultaneously. Multivariate analysis
(MVA) is based on the statistical principle of multivariate
statistics, which involves observation and analysis of more than
one statistical outcome variable at a time. In design and analysis,
the technique is used to perform trade studies across multiple
dimensions while taking into account the effects of all variables
on the responses of interest.

Descriptive Analysis
Multivariate Analysis - Techniques

 Principal Components Analysis


 Hierarchical Cluster Analysis
 Non-Hierarchical Clustering, or Partitioning
 Discriminant Analysis
 Correspondence Analysis
 Factor Analysis
 Cluster Analysis

Inferential Analysis
 Inferential statistics is concerned with making predictions or
inferences about a population from observations and analyses
of a sample. That is, we can take the results of an analysis
using a sample and can generalize it to the larger population
that the sample represents.
There are two areas of statistical inferences
(a) Statistical estimation and
(b) The testing of hypothesis

Inferential statistical - Meaning

 Inferential statistics are conclusions that extend beyond the
immediate data alone. They help us make inferences about the
population from samples.
 How samples relate to larger collections of data (called
populations) from which they have been drawn is the subject of
inferential statistical methods.
 Inferential statistics are used to make inferences from our data to
more general conditions; we use descriptive statistics simply to
describe what's going on in our data.
Inferential statistical methods

 The methods of inferential statistics are


(1) the estimation of parameter(s) and
(2) testing of statistical hypotheses.
Inferential Statistics involves 5 Steps as follows,
 To determine if SAMPLE means come from same population, use 5 steps
with inferential statistics
1. State Hypothesis
 Ho: no difference between 2 means; any difference found is due to
sampling error
 any significant difference found is not a TRUE difference, but
CHANCE due to sampling error
 results are stated in terms of the probability that Ho is false
 findings are stronger if we can reject Ho
 therefore, we need to specify both Ho and H1
Steps in Inferential Statistics
2. Level of Significance
 Probability that sample means are different enough to reject
Ho (.05 or .01)
 level of probability or level of confidence

3. Computing Calculated Value
 Use a statistical test to derive some calculated value (e.g., t
value or F value)

4. Obtain Critical Value
 a criterion based on df and alpha level (.05 or .01) that is
compared to the calculated value to determine if findings are
significant and therefore reject Ho
Steps in Inferential Statistics

5. Reject or Fail to Reject Ho
 the CALCULATED value is compared to the CRITICAL value to
determine if the difference is significant enough to reject Ho at
the predetermined level of significance
 If CRITICAL value > CALCULATED value --> fail to reject Ho
 If CRITICAL value < CALCULATED value --> reject Ho
 If we reject Ho, this only supports H1; it does not prove H1
Testing Hypothesis

 If we reject Ho and conclude the groups are really different, it
doesn't mean they are different for the reason you hypothesized
 there may be another reason
 Since Ho testing is based on sample means, not population
means, there is a possibility of making an error or wrong
decision in rejecting or failing to reject Ho
 Type I error
 Type II error
Testing Hypothesis

 Type I error -- rejecting Ho when it was true (it should have been
accepted)
 probability equal to alpha (α)
 if α = .05, then there's a 5% chance of a Type I error

 Type II error -- accepting Ho when it should have been rejected
 If you increase α, you will decrease the chance of a Type II error
Identifying the Appropriate Statistical Test of
Difference

One variable: One-way chi-square

Two variables (1 IV with 2 levels; 1 DV): t-test

Two variables (1 IV with 2+ levels; 1 DV): ANOVA

Three or more variables: ANOVA


Tests of Significance
 Parametric tests of significance – used if there are at least 30
observations, the population can be assumed to be normally
distributed, and variables are at least on an interval scale
 Z tests are used with samples over 30. There are four kinds
(two samples or two categories).
 t-tests are used when samples are 30 or less.
 Single sample t-test (one sample)
 Independent t-test (two samples)
 Paired t-test (two categories)
Parametric test
Parametric tests assume that the variable in question has a known
underlying mathematical distribution that can be described
(normal, binomial, Poisson, etc.). This underlying distribution is
the fundamental basis for all sample-to-population inference.
Conventional statistical procedures are also called parametric
tests.
In a parametric test a sample statistic is obtained to estimate the
population parameter. Because this estimation process involves a
sample, a sampling distribution, and a population, certain
parametric assumptions are required to ensure all components are
compatible with each other.

Parametric test
 The assumptions are:

a) The observations must be independent
b) The observations must be drawn from normally distributed
populations
c) These populations must have the same variances
d) The means of these normal and homoscedastic populations
must be linear combinations of effects due to columns and/or
rows

Parametric test
METHODS

 1. Pearson correlation
 2. Independent-measures t-test
 3. Analysis of variance
 4. Paired t-test

Parametric test
1. Pearson correlation:
The quantity r, called the linear correlation coefficient, measures
the strength and the direction of a linear relationship between two
variables. The linear correlation coefficient is sometimes referred
to as the Pearson product moment correlation coefficient in
honour of its developer, Karl Pearson. The mathematical formula
for computing r is:

r = [nΣxy − (Σx)(Σy)] / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]}

where n is the number of pairs of data.
The value of r is such that -1 ≤ r ≤ +1. The + and − signs are used
for positive linear correlations and negative linear correlations,
respectively.

Parametric test - Pearson correlation
 Pearson correlation
Positive correlation: If x and y have a strong positive linear
correlation, r is close to +1. An r value of exactly +1
indicates a perfect positive fit. Positive values indicate a
relationship between the x and y variables such that as values
for x increase, values for y also increase.

Parametric test - Pearson correlation
 Negative correlation:
If x and y have a strong negative linear correlation, r is close to
-1. An r value of exactly -1 indicates a perfect negative fit.
Negative values indicate a relationship between x and y such that
as values for x increase, values for y decrease.

Parametric test - Pearson correlation
 No correlation:
If there is no linear correlation or only a weak linear correlation, r
is close to 0. A value near zero means that there is little or no
linear relationship between the two variables.

 Note that r is a dimensionless quantity; that is, it does not
depend on the units employed.

Parametric test - Pearson correlation
 A perfect correlation of ± 1 occurs only when the data points
all lie exactly on a straight line. If r = +1, the slope of this line
is positive. If r = -1, the slope of this line is negative.
 A correlation greater than 0.8 is generally described as strong,
whereas a correlation less than 0.5 is generally described as
weak.
 These values can vary based upon the "type" of data being
examined.
 A study utilizing scientific data may require a stronger
correlation than a study using social science data.

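A minimal sketch of computing r directly from its definition; the paired data below are hypothetical, chosen only for illustration:

```python
import math

x = [1, 2, 3, 4, 5]   # hypothetical paired observations
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# r = co-deviation of x and y / product of the spreads of x and y
numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
denominator = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                        * sum((b - mean_y) ** 2 for b in y))
r = numerator / denominator
print(round(r, 3))  # 0.775: a fairly strong positive linear correlation
```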
Parametric test
Independent measures t-test
 2. Independent-measures t-test:
The independent t-test involves examination of the significant
differences on one factor or dimension (dependent variable)
between the means of two independent groups (e.g., male vs.
female, with disability vs. without disability) or two
experimental groups (control group vs. treatment group).

Inferential statistics
Used for Testing for Mean Differences

T-test: when experiments include only 2 groups
a. Independent
b. Correlated
i. Within-subjects
ii. Matched

Critical values of the t statistic are based on
df & alpha level
Inferential statistics
Used for Testing for Mean Differences

Analysis of Variance (ANOVA): used when
comparing more than 2 groups

1. Between Subjects
2. Within Subjects – repeated measures

Critical values of the F statistic are based on
df & alpha level

More than one IV = factorial ANOVA (IVs = factors)
Only one IV = one-way ANOVA
Parametric test
Independent measures t-test
 For example, you might want to know whether there is a
significant difference in the level of social activity between
individuals with disabilities and individuals without
disabilities.
 A hypothesis testing procedure that uses separate samples for
each treatment condition (between-subjects design). This test is
used when the population mean and standard deviation are
unknown, and two separate groups are being compared.

Parametric test
Independent measures t-test
 Assumptions for the Independent t-Test:
 Independence:
Observations within each sample must be independent (they
don't influence each other)
 Normal Distribution:
The scores in each population must be normally distributed
 Homogeneity of Variance:
The two populations must have equal variances (the degree to
which the distributions are spread out is approximately equal)

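A sketch of the independent-measures t-test computed from first principles with a pooled variance. The group scores are hypothetical, and the critical value 2.228 is the standard two-tailed t-table entry for df = 10, α = .05:

```python
import math

# Hypothetical scores for two independent groups (e.g., control vs. treatment).
group1 = [5, 7, 6, 8, 9, 7]
group2 = [4, 5, 5, 6, 4, 6]

n1, n2 = len(group1), len(group2)
m1, m2 = sum(group1) / n1, sum(group2) / n2
v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)  # sample variances
v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)

# Pooling the variances relies on the homogeneity-of-variance assumption above.
pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_calc = (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))

t_crit = 2.228  # df = 10, alpha = .05, two-tailed (from a t table)
print(round(t_calc, 2), abs(t_calc) > t_crit)  # 2.93 True -> reject Ho
```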
Parametric test
 3.Analysis of Variance (ANOVA):
ANOVA is a set of statistical methods used mainly to compare
the means of two or more samples. Estimates of variance are
the key intermediate statistics calculated, hence the reference
to variance in the title ANOVA.
 The different types of ANOVA reflect the different
experimental designs and situations for which they have been
developed.
 An analysis of the variation between all of the variables used in
an experiment. This analysis can provide valuable insight into
the behaviour of a security or market index under various
conditions.

Parametric test-Analysis of Variance
 A) One-way ANOVA
This function compares the sample means for k groups. There
is an overall test for k means, multiple comparison methods for
pairs of means and tests for the equality of the variances of the
groups.
 Consider four groups of data that represent one experiment
performed on four occasions with ten different subjects each
time.
 One way ANOVA is more appropriate for finding statistical
evidence of inconsistency or difference across the means of the
four groups.

Parametric test-Analysis of Variance
 One way ANOVA assumes that each group comes from an
approximately normal distribution and that the variability
within the groups is roughly constant.
 The factors are arranged so that experiments are columns and
subjects are rows; this is how you must enter your data in the
StatsDirect workbook.
 The overall F test is fairly robust to small deviations from these
assumptions, but you could use the Kruskal-Wallis test as an
alternative to one-way ANOVA if there is any doubt.

Parametric test-Analysis of Variance
 Assumptions:
a. Random samples
b. Normally distributed observations in each population
c. Equal variance of observations in each population
d. The homogeneity of variance option
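The one-way ANOVA F statistic can be sketched from its sums of squares; the three groups below are hypothetical data invented for illustration:

```python
# One-way ANOVA: F = (between-groups mean square) / (within-groups mean square)
groups = [
    [6, 8, 4, 5, 3, 4],     # hypothetical group 1
    [8, 12, 9, 11, 6, 8],   # hypothetical group 2
    [13, 9, 11, 8, 7, 12],  # hypothetical group 3
]

k = len(groups)                    # number of groups
n = sum(len(g) for g in groups)    # total number of observations
grand_mean = sum(sum(g) for g in groups) / n

# Between-groups SS: spread of the group means around the grand mean.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# Within-groups SS: spread of observations around their own group mean.
ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(round(f_stat, 2))  # 9.26 on (2, 15) degrees of freedom
```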

Parametric test-Analysis of Variance
 B) Two-way ANOVA
This function calculates ANOVA for a two-way randomized
block experiment. There are overall tests for differences between
treatment means and between block means. Multiple comparison
methods are provided for pairs of treatment means.

Parametric test
 4. PAIRED t-TEST:
The paired t-test provides a hypothesis test of the difference
between population means for a pair of random samples whose
differences are approximately normally distributed. Note
that a pair of samples, each of which is not from a normal
distribution, often yields differences that are normally
distributed.
When the confidence interval for the mean difference lies a long
way from zero, a null hypothesis of no difference between the
means is clearly rejected.
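A sketch of the paired t-test computed from the differences; the before/after scores are hypothetical:

```python
import math

before = [72, 68, 75, 70, 74]  # hypothetical paired measurements
after  = [70, 65, 72, 69, 71]

diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
d_bar = sum(diffs) / n
# Sample standard deviation of the differences
s_d = math.sqrt(sum((d - d_bar) ** 2 for d in diffs) / (n - 1))

# t = mean difference / standard error of the differences
t_stat = d_bar / (s_d / math.sqrt(n))
print(round(t_stat, 2))  # 6.0 on df = 4
```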

Parametric test
One group t-test. Example
It is known that the weight of young adult males has a mean value
of 70.0 kg with a standard deviation of 4.0 kg.
Thus the population mean, µ= 70.0 and population standard
deviation, σ= 4.0.
 Data from a random sample of 28 males of similar ages but with a
specific enzyme defect: mean body weight of 67.0 kg and
sample standard deviation of 4.2 kg.
 Question: Does the studied group have a significantly lower
body weight than the general population?

Parametric test
One group t-test. Example
population mean, µ= 70.0
population standard deviation, σ= 4.0.
sample size = 28
sample mean, x = 67.0
sample standard deviation, s= 4.0.
 Null hypothesis: There is no difference between sample mean
and population mean.
 t - statistic = 0.15, p >0.05
 Null hypothesis is accepted at 5% level
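The t statistic for this one-sample example follows directly from the numbers on the slide; with s = 4.2 it works out to about −3.78, well beyond the two-tailed critical value of roughly 2.05 for df = 27:

```python
import math

mu = 70.0     # population mean (kg)
x_bar = 67.0  # sample mean (kg)
s = 4.2       # sample standard deviation (kg)
n = 28

# One-sample t: (sample mean - population mean) / standard error
t_stat = (x_bar - mu) / (s / math.sqrt(n))
print(round(t_stat, 2))  # -3.78 -> |t| > 2.05, so reject the null hypothesis
```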

Summary
 In this module we have covered the meaning, purpose, and types
of data analysis and its methods, along with examples and
assumptions. We have concentrated only on parametric tests. In
the forthcoming module we will discuss non-parametric tests in
detail.

References
 Carol Leslie Macnee, (2008), Understanding Nursing Research:
Using Research in Evidence-based Practice, Lippincott Williams
& Wilkins, ISBN 0781775582, 9780781775588
 Denise F. Polit, et al., (2013). ‘Nursing Research: Principles and
Methods’, revised edition, Philadelphia: Lippincott
 https://statistics.laerd.com/statistical-guides/descriptive-
inferential-statistics.php
 http://www.socialresearchmethods.net/kb/statdesc.php
 http://www.stat.purdue.edu/~wsharaba/stat511/chapter1_print.p
df
 http://fbm.uni-ruse.bg/d/mra/Introduction%20to%20statistical
%20methods.pdf
Thanks

NON-PARAMETRIC TEST

