Statistical Evaluation of Data QA050-02

STANDARD OPERATING PROCEDURE
SOP Number : QA050–02
Effective Date :
Review Date :
Department : Quality Assurance
Page Number : 1 of 10
Title : Statistical Evaluation of Data
TABLE OF CONTENTS
S. No. Description
1.0 Objective
2.0 Scope
3.0 Responsibilities
4.0 Procedure
5.0 Abbreviations
6.0 References
7.0 Annexure
1.0 Objective
PREPARED BY REVIEWED BY APPROVED BY

User Department
User Department Quality Assurance
Head
(Sign & Date) (Sign & Date) (Sign & Date)

Format Number: QA001/F01–04
1.1 To establish a procedure for identifying basic statistical tools used for evaluation of data in
pharmaceutical industries.
2.0 Scope
2.1 This procedure applies to the Pharmaceutical Manufacturing Facility (Finished Dosage
Forms) of Pharmed Health Care, Sadat City.
3.0 Responsibilities
3.1 All department heads to identify basic statistical tools used and supervise implementation.
3.2 Training coordinator to train relevant staff in the use of statistical tools.
3.3 Quality Assurance Head to assure training and implementation.
4.0 Procedure
4.1 Basic statistics
4.1.1 Mean:
The mean is a measure of the central value around which the data are located.
It is the average of a set of n data x_i:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
4.1.2 Standard deviation:
The standard deviation (s) is the most commonly used measure of the spread or
dispersion of data around the mean. It is defined as the square root of the variance (V).
The variance is defined as the sum of the squared deviations from the mean, divided
by n-1:
$s = \sqrt{V} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
4.1.3 Relative standard deviation (Coefficient of variation).

Because the standard deviation usually depends on the magnitude of the data (the
larger the figures, the larger the standard deviation), it is better to use a
relative term that does not depend on the magnitude of the data. This is called the
relative standard deviation (RSD):
$RSD = s / \bar{x}$
The RSD is expressed as a fraction, but it is more usually expressed as a percentage,
in which case it is called the coefficient of variation (CV):
$CV = (s / \bar{x}) \times 100\%$
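The calculations in 4.1.1 to 4.1.3 can be sketched in a few lines of Python. This is only an illustration: the function name `describe` and the assay values are hypothetical and are not part of this procedure.

```python
import math


def describe(data):
    """Return mean, sample standard deviation, RSD (fraction) and CV (%)."""
    n = len(data)
    if n < 2:
        raise ValueError("at least two values are needed for a standard deviation")
    mean = sum(data) / n
    # Variance: sum of squared deviations from the mean, divided by n - 1
    variance = sum((x - mean) ** 2 for x in data) / (n - 1)
    sd = math.sqrt(variance)
    rsd = sd / mean      # relative standard deviation as a fraction
    cv = 100.0 * rsd     # coefficient of variation as a percentage
    return mean, sd, rsd, cv


# Hypothetical assay results (% of label claim), for illustration only
assay = [99.2, 100.1, 98.7, 100.4, 99.6]
mean, sd, rsd, cv = describe(assay)
print(f"mean = {mean:.2f}, SD = {sd:.3f}, RSD = {rsd:.4f}, CV = {cv:.2f}%")
```

Dividing by n-1 rather than n matches the definition of the variance given in 4.1.2.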
4.2 Confidence limit of measurement.
4.2.1 In case of a single measurement:
In routine analytical work, results are usually single values obtained in batches of
several test samples. No laboratory will analyze a test sample 50 times to be
confident that the result is reliable.
Therefore, the statistical parameters have to be obtained in another way, most usually
by method validation and/or by keeping control charts.
For a single measurement (n = 1), the confidence limit equation reduces to:
$\mu = x \pm t s$
where
µ = "true" mean value.
x = single measurement.
t = applicable t (from tables).
s = standard deviation of set of previous measurements.
4.2.2 In case of replicate measurements:
The more an analysis or measurement is replicated, the closer the mean x¯ of the
results will approach the "true" value µ (assuming absence of bias).
A single analysis of a test sample can be regarded as literally sampling the imaginary
set of a multitude of results obtained for that test sample. The uncertainty of such
subsampling is expressed by
$\mu = \bar{x} \pm t \frac{s}{\sqrt{n}}$

where
µ = "true" mean value (mean of large set of replicates)
x¯ = mean of subsamples
t = a statistical value which depends on the number of data and the required
confidence (usually 95%).
s = standard deviation of subsamples
n = number of subsamples
The critical values for t are tabulated in special tables.

To find the applicable value, identify the number of degrees of freedom df as follows:
df = n -1
N.B.: The term $s/\sqrt{n}$ is known as the standard error of the mean.
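As a rough sketch of section 4.2, the confidence limits µ = x¯ ± t·s/√n (replicates) and µ = x ± t·s (single measurement) can be computed as below. The function names, the data and the small table `T_95` are illustrative assumptions; the table holds the standard two-sided 95% critical values of Student's t for 1 to 10 degrees of freedom. For the single measurement, the sketch assumes that t is looked up with the degrees of freedom of the previous data set from which s was obtained.

```python
import math

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def limits_replicates(values):
    """95% confidence limits of the mean of replicate measurements."""
    n = len(values)
    df = n - 1
    if df not in T_95:
        raise ValueError("this sketch only tabulates t for df = 1..10")
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / df)
    sem = s / math.sqrt(n)          # standard error of the mean
    half_width = T_95[df] * sem
    return mean - half_width, mean + half_width


def limits_single(x, s, n_previous):
    """95% confidence limits of a single result x, with s taken from a
    previous set of n_previous measurements (assumption: df = n_previous - 1)."""
    df = n_previous - 1
    if df not in T_95:
        raise ValueError("this sketch only tabulates t for df = 1..10")
    half_width = T_95[df] * s
    return x - half_width, x + half_width


# Hypothetical data, for illustration only
low, high = limits_replicates([99.2, 100.1, 98.7, 100.4, 99.6])
print(f"replicates: {low:.2f} to {high:.2f}")
s_low, s_high = limits_single(99.0, s=0.7, n_previous=10)
print(f"single:     {s_low:.2f} to {s_high:.2f}")
```

The interval for the single result is much wider than for the mean of replicates, because the standard deviation is not divided by √n.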
4.3 Control charts
A control chart is a statistical tool used to distinguish between process variation
resulting from common causes and variation resulting from special causes.
4.3.1 In each batch of test samples at least one control sample is analyzed and the result is
plotted on the control chart of the attribute.

4.3.1.1 The basic assumption is that when a control result falls within a distance of 2s from the
mean, the system was under control and the results of the batch as a whole can be
accepted. A control result beyond the distance of 2s from the mean (the "Warning
Limit") signals that something may be wrong or is starting to go wrong.
4.3.1.2 A control result beyond 3s from the mean (the "Control Limit" or "Action Limit")
indicates that the system was statistically out of control and that the results have to be
rejected.
4.3.2 Before constructing a control chart, a run chart is constructed to collect data.
A control chart can be started when a sufficient number of data of an attribute of the
control sample is available (or data of the performance of an analyst in analyzing an
attribute, or of the performance of an instrument on an analyte).
4.3.3 Constructing the control chart:
Calculate the mean and the standard deviation of the data of the previous chart (or of
the initial data set).
Five lines are drawn on the next control chart as follows:
- One for the Mean (x¯)
- One for upper warning (alert) limit (x¯ + 2*SD)
- One for lower warning (alert) limit (x¯ - 2*SD)
- One for upper control (action) limit (x¯ + 3*SD)
- One for lower control (action) limit (x¯ - 3*SD)
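The five lines can be calculated from the previous data as sketched below. The function name `control_limits`, the dictionary keys and the control sample values are hypothetical, chosen only for illustration.

```python
import statistics


def control_limits(history):
    """Return the five control chart lines from previous control results."""
    mean = statistics.mean(history)
    sd = statistics.stdev(history)   # sample standard deviation (n - 1)
    return {
        "mean": mean,
        "upper_warning": mean + 2 * sd,
        "lower_warning": mean - 2 * sd,
        "upper_action": mean + 3 * sd,
        "lower_action": mean - 3 * sd,
    }


# Hypothetical control sample results from the previous chart
previous = [10.0, 10.2, 9.8, 10.1, 9.9]
lines = control_limits(previous)
for name, value in lines.items():
    print(f"{name:14s} {value:.3f}")
```

In practice the initial data set would be considerably larger than five points; the short list here only keeps the example readable.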
4.3.4 Interpreting data:
4.3.4.1 Warning rule (if occurring, then data require further inspection):
- One control result beyond Warning Limit.
The Warning Rule is exceeded by mere chance in less than 5% of the cases.
4.3.4.2 Rejection rules (if occurring, then data are rejected):

1. One control result beyond Action Limit.
2. Two successive control results beyond same Warning Limit.
3. Ten successive control results are on the same side of the Mean.
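The warning and rejection rules above can be expressed as a small check on the most recent control results. This is a minimal sketch, assuming the results are supplied in chronological order and that the mean and SD come from the previous chart; the function name `evaluate_control_results` and the returned labels are hypothetical.

```python
def evaluate_control_results(results, mean, sd):
    """Classify the latest control result as 'accept', 'warning' or 'reject'.

    results: control results in chronological order (latest last).
    mean, sd: centre line and standard deviation of the control chart.
    """
    if not results:
        raise ValueError("no control results supplied")
    latest = results[-1]
    upper_warning, lower_warning = mean + 2 * sd, mean - 2 * sd

    # Rejection rule 1: one control result beyond the Action Limit (3s)
    if abs(latest - mean) > 3 * sd:
        return "reject"

    # Rejection rule 2: two successive results beyond the same Warning Limit
    if len(results) >= 2:
        a, b = results[-2], results[-1]
        if (a > upper_warning and b > upper_warning) or \
           (a < lower_warning and b < lower_warning):
            return "reject"

    # Rejection rule 3: ten successive results on the same side of the mean
    if len(results) >= 10:
        last_ten = results[-10:]
        if all(x > mean for x in last_ten) or all(x < mean for x in last_ten):
            return "reject"

    # Warning rule: one control result beyond a Warning Limit (2s)
    if abs(latest - mean) > 2 * sd:
        return "warning"
    return "accept"


# Hypothetical chart with mean 100 and SD 1
print(evaluate_control_results([100.5, 99.8], 100.0, 1.0))
print(evaluate_control_results([100.0, 102.5], 100.0, 1.0))
print(evaluate_control_results([102.3, 102.5], 100.0, 1.0))
```

The rejection rules are tested before the warning rule, so a result that triggers both is reported as rejected.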
4.4 Process Capability:

4.4.1 Introduction:
Process capability is a statistical tool which compares the output of an in-control
process to the specification limits by using capability indices.
This can be represented pictorially by the plot below:
4.4.2 A capable process is one where almost all the measurements (6σ) fall inside the
specification limits.
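4.4.3 There are several statistics to measure the capability index; the most applicable among
them is CpK:
$C_p = \frac{USL - LSL}{6\sigma}$
$C_{pu} = \frac{USL - \mu}{3\sigma}$
$C_{pl} = \frac{\mu - LSL}{3\sigma}$
$C_{pK} = \min(C_{pu}, C_{pl})$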

Where:
Cp = Process capability (Two-sided).

Cpu = Upper process capability (One-sided).
Cpl = Lower process capability (One-sided).
CpK = Process capability index.
USL = Upper specifications limit.
LSL = Lower specifications limit.
µ = "true" mean value (mean of large set of replicates).
σ = "true" standard deviation (of large set of replicates).
4.4.4 Acceptance Criteria:
- Recommended process capability index (CpK) for processes that are stable and normally
distributed is not less than (NLT) 1.33.
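The capability indices and the acceptance criterion can be illustrated with the sketch below, which uses the standard definitions Cp = (USL − LSL)/6σ, Cpu = (USL − µ)/3σ, Cpl = (µ − LSL)/3σ and CpK = min(Cpu, Cpl). The function name `capability_indices` and the numbers are hypothetical; in practice µ and σ are estimated from a large set of in-control data.

```python
def capability_indices(mean, sd, lsl, usl):
    """Return (Cp, Cpu, Cpl, CpK) for a two-sided specification."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    cp = (usl - lsl) / (6 * sd)     # two-sided process capability
    cpu = (usl - mean) / (3 * sd)   # upper one-sided capability
    cpl = (mean - lsl) / (3 * sd)   # lower one-sided capability
    cpk = min(cpu, cpl)             # capability index of the nearer limit
    return cp, cpu, cpl, cpk


# Hypothetical process: mean 100.5, SD 1.0, specification 95.0 to 105.0
cp, cpu, cpl, cpk = capability_indices(100.5, 1.0, 95.0, 105.0)
print(f"Cp = {cp:.2f}, Cpu = {cpu:.2f}, Cpl = {cpl:.2f}, CpK = {cpk:.2f}")
print("meets NLT 1.33" if cpk >= 1.33 else "below 1.33")
```

CpK is lower than Cp whenever the process mean is off-centre, which is why it is the more informative index for acceptance.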
4.5 Linear correlation and regression:
The technique is the same for both, but there is a fundamental difference in concept:
4.5.1 Correlation analysis is applied to independent factors: if X increases, what will Y do
(increase, decrease, or perhaps not change at all)?
4.5.2 In regression analysis a unilateral response is assumed: changes in X result in
changes in Y, but changes in Y do not result in changes in X.
4.5.3 In analytical work, correlation analysis can be used for comparing methods or
laboratories, whereas regression analysis can be used to construct calibration graphs.
Laboratories or methods are in fact independent factors. However, for regression
analysis one factor has to be the independent or "constant" factor.
This factor is by convention designated X, whereas the other factor is then the
dependent factor Y (thus, we speak of "regression of Y on X").
4.5.4 The principle is to establish a statistical linear relationship between two sets of
corresponding data by fitting the data to a straight line by means of the "least squares"
technique.

Such data are, for example, analytical results of two methods applied to the same
samples (correlation), or the response of an instrument to a series of standard
solutions (regression).
The resulting line takes the general form:
y = bx + a
where
a = intercept of the line with the y-axis
b = slope (tangent)
Ideally, in laboratory work, when there is perfect positive correlation without bias, the
intercept a = 0 and the slope b = 1. This is the so-called "1:1 line" passing through the
origin.
If the intercept a ≠ 0 then there is a systematic discrepancy (bias, error) between X
and Y.
When b ≠ 1 then there is a proportional response or difference between X and Y.
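The correlation between X and Y is expressed by the correlation coefficient r which
can be calculated with the following equation:
$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$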
where
xi = data X
x¯ = mean of data X
yi = data Y
y¯ = mean of data Y
It can be shown that r can vary from 1 to -1:
r = 1 perfect positive linear correlation
r = 0 no linear correlation (maybe other correlation)
r = -1 perfect negative linear correlation
4.5.5 r² = the coefficient of determination or coefficient of variance.
The advantage of r² is that, when multiplied by 100, it indicates the percentage of
variation in Y associated with variation in X.

Thus, for example, when r = 0.71 (r² = 0.504), about 50% of the variation in Y is
associated with the variation in X.
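As an illustration of the correlation coefficient, the sketch below computes r as the sum of the cross-products of the deviations divided by the square root of the product of the two sums of squared deviations, and then r². The function name `pearson_r` and the method comparison data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient r between paired data xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must be paired and contain at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical results of two methods applied to the same five samples
method_x = [1.0, 2.0, 3.0, 4.0, 5.0]
method_y = [1.1, 1.9, 3.2, 3.8, 5.0]
r = pearson_r(method_x, method_y)
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
print(f"about {100 * r * r:.0f}% of the variation in Y is associated with X")
```

Swapping the two arguments gives the same r, because the formula is symmetric in X and Y.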
4.5.6 The line parameters b and a are calculated with the following equations:
$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
$a = \bar{y} - b\bar{x}$
4.5.7 It is worth noting that r is independent of the choice of which factor is the independent
factor X and which is the dependent factor Y. However, the regression parameters a and b do
depend on this choice as the regression lines will be different (except when there is
ideal 1:1 correlation).
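A least squares fit of y = bx + a can be sketched as follows, with the slope taken as the sum of cross-products of the deviations divided by the sum of squared deviations of X, and the intercept as a = y¯ − b·x¯. The function name `fit_line` and the calibration data are hypothetical. The last lines fit X on Y to illustrate that the regression line depends on which factor is treated as independent.

```python
def fit_line(xs, ys):
    """Least squares regression of Y on X; returns (a, b) for y = b*x + a."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must be paired and contain at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


# Hypothetical calibration: standard concentrations (X) and instrument response (Y)
conc = [0.0, 2.0, 4.0, 6.0, 8.0]
resp = [0.01, 0.21, 0.40, 0.61, 0.80]
a, b = fit_line(conc, resp)
print(f"y = {b:.4f} x + {a:.4f}")

# Reading an unknown sample from the calibration line: x = (y - a) / b
unknown = (0.50 - a) / b
print(f"response 0.50 corresponds to concentration {unknown:.2f}")

# Regression of X on Y gives a different line; its slope is not exactly 1/b
a_rev, b_rev = fit_line(resp, conc)
print(f"1/b_rev = {1 / b_rev:.5f} versus b = {b:.5f}")
```

For nearly perfect correlation the two lines almost coincide, which is why the difference in this example is small.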
Example of calibration curve. The dashed lines delineate the 95% confidence area of
the graph. Note that the confidence is highest at the centroid of the graph.
5.0 Abbreviations
N/A.

6.0 References
N/A.
7.0 Annexure
N/A.
END OF THE DOCUMENT
