Statistical Evaluation of Data QA050-02
TABLE OF CONTENTS
S. No. Description
1.0 Objective
2.0 Scope
3.0 Responsibilities
4.0 Procedure
5.0 Abbreviations
6.0 References
7.0 Annexure
1.0 Objective
1.1 To establish a procedure for identifying basic statistical tools used for evaluation of data in
pharmaceutical industries.
2.0 Scope
2.1 This procedure applies to the Pharmaceutical Manufacturing Facility (Finished Dosage
Forms) of Pharmed Health Care, Sadat City.
3.0 Responsibilities
3.1 All department heads to identify basic statistical tools used and supervise implementation.
4.0 Procedure
4.1.1 Mean:
The mean is the sum of the individual results divided by the number of results (n); it
indicates the central value around which the data are grouped.
The standard deviation is the most commonly used measure of the spread or dispersion of
data around the mean. It is defined as the square root of the variance (V).
The variance is defined as the sum of the squared deviations from the mean, divided
by n-1.
Because the standard deviation usually depends on the magnitude of the data, the
larger the figures, the larger the standard deviation. Therefore, it is better to use a
relative term that doesn't depend on the magnitude of the data. This is called the relative
standard deviation (RSD): the standard deviation divided by the mean.
The RSD is expressed as a fraction, but more usually it is expressed as a percentage and
is then called the coefficient of variation (CV).
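The definitions above can be put into a short calculation. The following is a minimal sketch, assuming a non-zero mean; the function name `summary_stats` and the assay figures are hypothetical and serve only as an illustration.

```python
import math


def summary_stats(values):
    """Return mean, standard deviation (n-1), RSD (fraction) and CV (%)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    mean = sum(values) / n
    # Variance: sum of squared deviations from the mean, divided by n - 1.
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    sd = math.sqrt(variance)
    rsd = sd / mean      # relative standard deviation as a fraction
    cv = 100.0 * rsd     # the same quantity expressed as a percentage
    return mean, sd, rsd, cv


# Hypothetical assay results (% of label claim), for illustration only.
assay = [99.0, 100.0, 101.0, 100.0]
mean, sd, rsd, cv = summary_stats(assay)
print(f"mean={mean:.2f}  s={sd:.3f}  RSD={rsd:.4f}  CV={cv:.2f}%")
```

Multiplying every result by ten would multiply s by ten as well, but would leave the RSD and CV unchanged, which is why the relative term is preferred for comparing data of different magnitude.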
In routine analytical work, results are usually single values obtained in batches of
several test samples. No laboratory will analyze a test sample 50 times to be
confident that the result is reliable.
Therefore, the statistical parameters have to be obtained in another way, most usually
by method validation and/or by keeping control charts.
$$\mu = x \pm t \cdot s$$
where
$\mu$ = "true" mean value.
$x$ = single measurement.
$t$ = applicable t (from tables).
$s$ = standard deviation of the set of previous measurements.
The more an analysis or measurement is replicated, the closer the mean $\bar{x}$ of the
results will approach the "true" value $\mu$ (assuming absence of bias).
A single analysis of a test sample can be regarded as literally sampling the imaginary
set of a multitude of results obtained for that test sample. The uncertainty of such
subsampling is expressed by
$$\mu = \bar{x} \pm t \cdot \frac{s}{\sqrt{n}}$$
where
$\mu$ = "true" mean value (mean of a large set of replicates)
$\bar{x}$ = mean of subsamples
$t$ = a statistical value which depends on the number of data and the required
confidence (usually 95%).
$s$ = standard deviation of the subsamples
$n$ = number of subsamples
df = n - 1 (degrees of freedom used to look up t)
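As a worked illustration of the two cases described above (a single measurement, where the range is x ± t·s, and the mean of n subsamples, where the range is the mean ± t·s/√n), the following sketch may help. It assumes the t value is read from a table, as the procedure states; the function names and the data are hypothetical.

```python
import math
import statistics


def interval_single(x, s, t):
    """Range for the "true" mean from one measurement: x ± t·s."""
    return x - t * s, x + t * s


def interval_mean(values, t):
    """Range for the "true" mean from n subsamples: mean ± t·s/√n."""
    n = len(values)
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)          # n - 1 in the denominator
    half_width = t * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical replicate results. t = 2.776 is the two-sided 95% table
# value for df = n - 1 = 4.
replicates = [10.1, 9.9, 10.0, 10.2, 9.8]
low, high = interval_mean(replicates, t=2.776)
print(f"95% interval for the mean: {low:.3f} to {high:.3f}")
```

For the single-measurement case, s and t come from the set of previous measurements (for example a control chart), so the degrees of freedom belong to that earlier data set, not to the single result.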
4.3.1 In each batch of test samples at least one control sample is analyzed and the result is
plotted on the control chart of the attribute.
4.3.1.1 The basic assumption is that when a control result falls within a distance of 2s from the
mean, the system was under control and the results of the batch as a whole can be
accepted. A control result beyond the distance of 2s from the mean (the "Warning
Limit") signals that something may be wrong or tends to go wrong.
4.3.1.2 A control result beyond a distance of 3s from the mean (the "Control Limit" or
"Action Limit") indicates that the system was statistically out of control and that the
results have to be rejected.
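The 2s and 3s decision described above can be sketched as a small function. This is only an illustration, assuming the mean and s of the control chart are already known; the function name and the example figures are hypothetical.

```python
def classify_control_result(result, mean, s):
    """Classify a control-sample result against the 2s and 3s limits."""
    distance = abs(result - mean)
    if distance > 3 * s:
        return "action"      # beyond the Control (Action) Limit: reject results
    if distance > 2 * s:
        return "warning"     # beyond the Warning Limit: something may be wrong
    return "in control"      # within 2s of the mean: batch results accepted


# Hypothetical chart parameters, for illustration only.
for value in (100.5, 102.3, 96.8):
    print(value, classify_control_result(value, mean=100.0, s=1.0))
```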
4.3.2 Before constructing a control chart, a run chart is constructed to collect data.
A control chart can be started when a sufficient number of data of an attribute of the
control sample is available (or data of the performance of an analyst in analyzing an
attribute, or of the performance of an instrument on an analyte).
Calculate the mean and the standard deviation of the previous chart (or of the
initial data set).
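Setting up the chart lines from an initial data set might look like the sketch below. The function name, the dictionary keys and the control-sample results are assumptions made for illustration; how many results count as "sufficient" is not fixed here.

```python
import statistics


def control_chart_limits(initial_results):
    """Return the centre line plus Warning (2s) and Action (3s) limits."""
    mean = statistics.mean(initial_results)
    s = statistics.stdev(initial_results)     # n - 1 in the denominator
    return {
        "mean": mean,
        "warning": (mean - 2 * s, mean + 2 * s),
        "action": (mean - 3 * s, mean + 3 * s),
    }


# Hypothetical control-sample results collected on a run chart.
initial = [50.2, 49.8, 50.0, 50.4, 49.6, 50.1, 49.9, 50.0]
limits = control_chart_limits(initial)
for name, value in limits.items():
    print(name, value)
```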
4.3.4.1 Warning rule (if occurring, then data require further inspection):
- One control result beyond Warning Limit.
The Warning Rule is exceeded by mere chance in less than 5% of the cases.
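The "less than 5%" figure can be checked with a short calculation, assuming the control results are normally distributed. The function name `fraction_beyond` is hypothetical; `NormalDist` is part of the Python standard library.

```python
from statistics import NormalDist


def fraction_beyond(k):
    """Two-sided fraction of normal results farther than k·s from the mean."""
    return 2.0 * (1.0 - NormalDist().cdf(k))


p_warning = fraction_beyond(2)   # chance of exceeding the Warning Limit (2s)
p_action = fraction_beyond(3)    # chance of exceeding the Action Limit (3s)
print(f"beyond 2s: {100 * p_warning:.2f}%   beyond 3s: {100 * p_action:.2f}%")
```

Under this assumption, roughly 4.6% of results fall beyond 2s by chance alone, and roughly 0.3% fall beyond 3s.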
4.4.1 Introduction:
4.4.2 A capable process is one where almost all the measurements ($6\sigma$) fall inside the
specification limits.
4.4.3 Several statistics are used to measure process capability; the most applicable among
them is the capability index Cpk.
Where:
- Recommended process capability index for processes that are stable and normally
distributed is NLT 1.33.
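As an illustration, Cpk is commonly defined as the smaller of (USL - mean) and (mean - LSL), divided by 3s, where USL and LSL are the upper and lower specification limits. The sketch below uses that common definition, which is an assumption here and not a formula taken from this procedure. It also estimates the spread with the overall sample standard deviation (some references use a within-subgroup estimate instead). The data and limits are hypothetical.

```python
import statistics


def cpk(values, lsl, usl):
    """Cpk = min(USL - mean, mean - LSL) / (3·s), common textbook form."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)      # overall sample standard deviation
    return min(usl - mean, mean - lsl) / (3 * s)


# Hypothetical tablet weights (mg) and specification limits.
weights = [249.0, 250.0, 251.0, 250.0, 250.0]
value = cpk(weights, lsl=245.0, usl=255.0)
print(f"Cpk = {value:.2f}; meets NLT 1.33: {value >= 1.33}")
```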
The technique is the same for both correlation and regression, but there is a fundamental
difference in concept:
4.5.3 In analytical work, correlation analysis can be used for comparing methods or
laboratories, whereas regression analysis can be used to construct calibration graphs.
The independent factor is by convention designated X, whereas the other factor is then the
dependent factor Y (thus, we speak of "regression of Y on X").
4.5.4 The principle is to establish a statistical linear relationship between two sets of
corresponding data by fitting the data to a straight line by means of the "least squares"
technique.
Such data are, for example, analytical results of two methods applied to the same
samples (correlation), or the response of an instrument to a series of standard
solutions (regression).
y = bx + a
where
a = intercept of the line with the y-axis
b = slope (tangent)
Ideally, in laboratory work, when there is perfect positive correlation without bias, the
intercept a = 0 and the slope b = 1. This is the so-called "1:1 line" passing through the
origin.
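The correlation between the two data sets is expressed by the correlation coefficient r:
$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}$$
where
$x_i$ = data X
$\bar{x}$ = mean of data X
$y_i$ = data Y
$\bar{y}$ = mean of data Y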
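The square of r, the coefficient of determination r², multiplied by 100 indicates the
percentage of variation in Y associated with variation in X.
Thus, for example, when r = 0.71 (r² = 0.504), about 50% of the variation in Y is
associated with the variation in X.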
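A minimal sketch of the correlation coefficient follows, computed as the sum of the products of the deviations of X and Y from their means, divided by the square root of the product of the two sums of squared deviations. The function name and the paired results are hypothetical.

```python
import math


def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient r of two equally long data sets."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical results of two methods applied to the same five samples.
method_a = [1.0, 2.0, 3.0, 4.0, 5.0]
method_b = [1.2, 1.9, 3.2, 3.8, 5.1]
r = correlation_coefficient(method_a, method_b)
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
```

Squaring r shows how quickly the explained variation drops: 0.71 squared is about 0.50, so even a seemingly fair r of 0.71 accounts for only about half of the variation.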
4.5.6 The line parameters b and a are calculated with the following equations:
$$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
$$a = \bar{y} - b\,\bar{x}$$
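For a calibration graph, the least-squares calculation can be sketched as follows: the slope b is the sum of the products of the deviations of X and Y from their means divided by the sum of squared deviations of X, and the intercept a follows from the two means. The function names and the standard-solution data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return intercept a and slope b of the regression of Y on X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def concentration_from_response(y, a, b):
    """Read an unknown back from the calibration line y = bx + a."""
    return (y - a) / b


# Hypothetical standards: concentration (X) and instrument response (Y).
conc = [0.0, 2.0, 4.0, 6.0, 8.0]
response = [0.01, 0.21, 0.41, 0.61, 0.81]
a, b = least_squares_line(conc, response)
print(f"y = {b:.4f}x + {a:.4f}")
print(f"unknown at response 0.51: {concentration_from_response(0.51, a, b):.2f}")
```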
4.5.7 It is worth noting that r is independent of the choice of which factor is the
independent factor X and which is the dependent factor Y. However, the regression
parameters a and b do depend on this choice, as the regression lines will be different
(except when there is ideal 1:1 correlation).
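This point can be illustrated by fitting the same data twice, once as Y on X and once as X on Y. The sketch below is an illustration only; the helper name `fit` and the data are hypothetical.

```python
import math


def fit(xs, ys):
    """Least-squares regression of ys on xs; returns (a, b, r)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b = sxy / sxx
    a = y_bar - b * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r


# Hypothetical paired data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]
a_yx, b_yx, r_yx = fit(x, y)   # regression of Y on X
a_xy, b_xy, r_xy = fit(y, x)   # regression of X on Y

print(f"r is the same either way: {r_yx:.4f} and {r_xy:.4f}")
# Re-expressed with Y as a function of X, the second line has slope 1/b_xy.
print(f"slope of Y on X: {b_yx:.4f}; slope implied by X on Y: {1 / b_xy:.4f}")
```

The product of the two slopes equals r², so the two lines coincide only when r² = 1.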
Example of calibration curve. The dashed lines delineate the 95% confidence area of
the graph. Note that the confidence is highest at the centroid of the graph.
5.0 Abbreviations
N/A.
6.0 References
N/A.
7.0 Annexure
N/A.