
FDA QUALITY METRICS RESEARCH

3RD YEAR REPORT DECEMBER 2019

RESEARCH REPORT INCLUDES SUMMARY OF YEAR 1 AND 2 RESEARCH

Prof. Dr. Thomas Friedli, Dr. Stephan Köhler, Julian Macuvele, Steffen Eich, Marten Ritz (University of St.Gallen)
Prabir Basu, PhD (OPEX and cGMP Consultant)
Nuala Calnan, PhD (Regulatory Science Adjunct Research Fellow at TU Dublin)
TABLE OF CONTENTS



LIST OF FIGURES................................................................................................................... 5
LIST OF TABLES..................................................................................................................... 6
LIST OF ABBREVIATIONS........................................................................................................7
GLOSSARY ............................................................................................................................ 8
0 KEY FINDINGS OF YEAR 1 AND 2 OF THE ST.GALLEN FDA RESEARCH............................. 9
1 EXECUTIVE SUMMARY...............................................................................................10
2 INTRODUCTION AND BACKGROUND............................................................................12
3 SUMMARY OF YEAR 1 AND YEAR 2 RESEARCH OUTCOMES.........................................14
3.1 Main Findings........................................................................................................................................................................ 19
3.2 Summary Year 1 Findings.................................................................................................................................................... 20
3.3 Summary Year 2 Findings..................................................................................................................................................... 21

4 THE ENHANCED PHARMACEUTICAL PRODUCTION SYSTEM MODEL........................... 22


4.1 Extended PPSM enabler landscape.....................................................................................................................................24
4.2 Operationalization of PQS Excellence...............................................................................................................................24
4.3 Drawing from Seminal Work in Operations Management............................................................................................27

5 GOOD QUALITY – GOOD BUSINESS.............................................................................. 30


5.1 Data Sample............................................................................................................................................................................31
5.2 Relation between effectiveness and efficiency...................................................................................................................31
5.3 Distinguishing four groups of plants by levels of effectiveness and efficiency.............................................................31
5.4 Difference between Cluster A and Cluster B.................................................................................................................... 32
5.5 Difference between effective and less effective plants.....................................................................................................34
5.6 Difference between excellent and worst performing plants..........................................................................................36



6 THE ICH Q10 ARCHITECTURE FROM A DATA PERSPECTIVE.......................................... 40
6.1 ICH Q10 Related Enablers & Overall PPSM Enabler Landscape................................................................................... 41
6.2 ICH Q10 Related Enablers & Operational Performance.................................................................................................42

7 THE CRUCIAL ROLE OF THE LABS................................................................................. 44


7.1 The Operating Context and Business Environment.......................................................................................................45
7.2 Three Patterns of QC Labs – A Quantitative Perspective...............................................................................................47
7.3 Three Patterns of QC Labs – A Qualitative Perspective................................................................................................ 49
7.4 Quality Culture in QC.......................................................................................................................................................... 53

8 QUALITY CULTURE – RE-VISITED................................................................................. 56


8.1 Research Approach................................................................................................................................................................ 57
8.2 Results.....................................................................................................................................................................................59
8.3 Conclusion..............................................................................................................................................................................59

9 A CLOSER LOOK FROM A QUALITY RISK PERSPECTIVE.................................................61


9.1 Operational Excellence Benchmarking database and FDA inspection outcomes......................................................62
9.2 Single company case study of operational KPIs and regulatory evaluation............................................................... 64
9.3 Data usage for predictive modelling...................................................................................................................................67

10 SUMMARY................................................................................................................. 68
REFERENCES..................................................................................................................... 70
APPENDIX ...........................................................................................................................72
Appendix 1 ICH Q10 Enabler Questionnaire.................................................................................................................... 73
Appendix 2 Operationalization of operational excellence enabler
dimensions and link to quality behavior/maturity.......................................................................................................... 81



LIST OF FIGURES
Figure 1: Operational Excellence Manufacturing database overview.................................................................................13
Figure 2: Operational Excellence QC Lab database overview...............................................................................................13
Figure 3: St.Gallen OPEX Model............................................................................................................................................... 16
Figure 4: The Pharmaceutical Production System Model.................................................................................................... 16
Figure 5: PPSM House with Metrics and Enabler.................................................................................................................. 18
Figure 6: The enhanced Pharmaceutical Production System Model (PPSM).................................................................... 23
Figure 7: Number of legacy PPSM Enablers assigned per ICH Q10 category................................................................... 25
Figure 8: Newly derived enablers per ICH Q10 category...................................................................................................... 25
Figure 9: Relation between enabler implementation and level PQS Excellence...............................................................27
Figure 10: Relation between PQS Effectiveness and PQS Efficiency.................................................................................... 32
Figure 11: Results of clustering and subsequent grouping of plants.................................................................................... 33
Figure 12: ICH Q10 Enabler Implementation Level & Legacy PPSM
Enabler Implementation Level not assigned to ICH Q10.................................................................................... 41
Figure 13: ICH Q10 Related Enabler Implementation Level & Aggregated PQS Effectiveness........................................43
Figure 14: Management Responsibilities Enabler Implementation Level & Aggregated PQS Effectiveness.................43
Figure 15: Knowledge Management Enabler Implementation Level & Aggregated PQS Effectiveness..........................43
Figure 16: Process Performance/ Product Quality Monitoring System Enabler
Implementation Level & Aggregated PQS Effectiveness......................................................................................43
Figure 17: Scatter plot of enabler relation with QC lab effectiveness for three clusters.................................................. 48
Figure 18: Scatter plot of enabler relation with QC lab effectiveness for two clusters......................................................50
Figure 19: Selected clusters for qualitative research................................................................................................................ 51
Figure 20: Configurational differences between QC labs....................................................................................................... 53
Figure 21: Relation between Quality Maturity and Quality Behavior in QC......................................................................54
Figure 22: Relation between Quality Maturity and Quality Behavior from three data sources
(cf. Patel et al., 2015, Friedli et al., 2017, and this report)...................................................................................... 55
Figure 23: Site clustering on Operational Stability and Quality Maturity for Quality Culture analyses........................58
Figure 24: Matched FDA inspection outcomes compared to overall population (2009 – 04/2019)................................62
Figure 25: Sites' inspection outcomes by enabler implementation level..............................................................................63
Figure 26: Inspection outcomes by detailed Total Quality Management enabler implementation level.......................63
Figure 27: Inspection outcomes by operational KPIs..............................................................................................................65
Figure 28: Technology classification of site data records with good and bad inspection..................................................65
Figure 29: Performance evaluation of Logistic Regression models by input data used.....................................................67



LIST OF TABLES
Table 1: Calculation of aggregated PQS Effectiveness Score............................................................................................26
Table 2: Overview Operational Stability Metrics and Purpose of Measure....................................................................26
Table 3: Overview Supplier Reliability Metrics and Purpose of Measure.......................................................................26
Table 4: Performance Indicators of Lab Robustness..........................................................................................................26
Table 5: Overview of conversion cost components............................................................................................................28
Table 6: Comparison of reasons for launching OPEX (excerpt of selected reasons):
‘Contenders/world class’ vs. ‘Won’t go the distance’...........................................................................................28
Table 7: Comparison of enabler implementation levels for category ‘EI&CI’:
‘Contenders/world class’ vs. ‘Won’t go the distance’...........................................................................................28
Table 8: Comparison of reasons for launching OPEX (excerpt of selected reasons):
‘Contenders/world class’ vs. ‘Promising’...............................................................................................................29
Table 9: Analysis of differences for selected structural factors: ‘Contenders/world class’ vs. ‘Promising’................29
Table 10: Overview of relevant site types in sample..............................................................................................................31
Table 11: Differences in PQS Effectiveness metrics between efficient and less efficient plants................................... 33
Table 12: Differences between efficient and less efficient plants concerning PQS Efficiency components................ 33
Table 13: Differences in productivity in terms of FTE/Batch............................................................................................. 35
Table 14: Differences between efficient and less efficient plants concerning operating context................................. 35
Table 15: Differences between efficient and less efficient plants concerning enabler implementation...................... 35
Table 16: Differences between effective and less effective plants concerning enabler implementation.....................36
Table 17: Differences between effective and less effective plants concerning operating context.................................36
Table 18: Differences between excellent and the worst performing plants concerning Enabler Implementation... 37
Table 19: Differences between excellent and the worst performing plants concerning operating context................ 37
Table 20: Differences between excellent and the worst performing plants concerning headcount structure........... 37
Table 21: Differences between excellent and the worst performing plants concerning employee productivity.......38
Table 22: Overview of group characteristics..........................................................................................................................39
Table 23: Propositions related to the operating context and QC lab effectiveness........................................................ 46
Table 24: QC lab effectiveness definition.............................................................................................................................. 46
Table 25: Analyzed characteristics of context and business environment...................................................................... 46
Table 26: Conclusions regarding research propositions..................................................................................................... 48
Table 27: Hypotheses to test significant difference in the enabler implementation between QCHPs and QCLPs. 48
Table 28: T-test results for all enabler dimensions comparing QCHPs and QCLPs.......................................................50
Table 29: Conclusions from a quantitative perspective....................................................................................................... 51
Table 30: Quality Maturity attributes regression on Quality Behavior.............................................................................58
Table 31: Quality Maturity attributes regression on Operational Stability..................................................................... 60
Table 32: Quality Behavior attributes regression on Operational Stability..................................................................... 60
Table 33: Test configurations showing significant (at least 5%) relationships with inspection outcomes..................63



LIST OF ABBREVIATIONS
API Active Pharmaceutical Ingredient
BE Basic Elements
CAPA Corrective Action and Preventive Action
CI Continuous Improvement
EMA European Medicines Agency
EMS Effective Management System
EH&S Environment, Health & Safety
FDA US Food and Drug Administration
FTE Full-Time-Equivalent
HP High Performer
JIT Just-in-Time
KPI Key Performance Indicator
LP Low Performer
NAI No Action Indicated
OAI Official Action Indicated
OEE Overall Equipment Effectiveness
OOS Out-of-specification
OPEX Operational Excellence
OS Operational Stability
OTIF On-time-in-full
PPSM Pharmaceutical Production System Model
PQS Pharmaceutical Quality System
QC Quality Control
SKU Stock Keeping Unit
SR Supplier Reliability
TPM Total Productive Maintenance
TQM Total Quality Management
VAI Voluntary Action Indicated

Funding for this report was made possible, in part, by the Food and Drug Administration through
grant [1U01FD005675-01]. The views expressed in written materials or publications and by speakers
and moderators do not necessarily reflect the official policies of the Department of Health and Hu-
man Services; nor does any mention of trade names, commercial practices, or organization imply
endorsement by the United States Government.
We express our gratitude to the FDA Quality Metrics Team for three years of outstanding collabo-
ration. An additional thank you goes to the following peer reviewers for their comments on earlier
versions of this report: Clive Brading (Sanofi), Kira Ford (Eli Lilly) and Tina Morris (PDA).



GLOSSARY
Conversion costs: Conversion costs equal the total production cost or Cost of Goods Sold (COGS) minus material cost. This is an exhaustive measure of operational efficiency which excludes material cost and therefore allows an operations-focused assessment of cost efficiency.

Cultural Excellence: Cultural Excellence is operationalized as the average of the KPI-based Employee Engagement Score, Quality Behavior Score and Quality Maturity Score.

Employee Engagement: Employee Engagement is a KPI-based construct that aims to capture how committed employees are to their work. The included metrics are displayed in Figure 5.

Enabler: Enablers represent capabilities of an organization that are associated with reaching a high operational performance.

Moderator: A moderator is a variable that affects a relation between two dimensions A and B. It can affect the direction and/or strength of the relation between A and B.

Operational Stability: Operational Stability equates to the provision of capable and reliable processes and equipment and embodies the core capabilities of Quality and Dependability. The Operational Stability Score is calculated from eight underlying metrics.

PQS Effectiveness¹: Effectiveness describes the “degree to which something is successful in producing a desired result; success.” The effectiveness of an organization’s PQS was defined as its ability to provide quality drugs while ensuring a high level of delivery capability. PQS Effectiveness is an aggregated score comprising the two sub-categories Operational Stability and Supplier Reliability.

PQS Efficiency: Efficiency addresses the “productivity of a process and the utilization of resources.” The revised PQS Efficiency Score is calculated as the inverse of “conversion cost per batch”, which covers all production-related cost items.

PQS Excellence: PQS Excellence comprises both PQS Effectiveness and PQS Efficiency. The newly introduced PQS Excellence Score is calculated as the average of the PQS Effectiveness Score and the PQS Efficiency Score. Accordingly, effectiveness and efficiency are both considered performance dimensions, in contrast to a solely effectiveness- or efficiency-focused performance understanding.

QC Lab Effectiveness: A combination of quality and service performance. It is built on 12 individual performance indicators related to process quality/stability and delivery performance in QC labs.

QC Lab Operating Context & Business Environment: Geographical Distribution (North America, Europe, high/low cost location); Portfolio Complexity (drug substance type, drug product type, number of final drug products tested); Test Allocation Strategy (centralized vs. decentralized, degree of centralization); Organizational Scale (no. of QC FTEs, no. of site FTEs); Economy of Scale (no. of batches processed, no. of tests); Technology & Innovation (age of instruments, age of methods, level of automation); Regulatory Approval (US, EU, China, Japan).

Quality Behavior: See Quality Culture.

Quality Culture: Quality Culture as defined by PDA is an aggregation of Quality Maturity and Quality Behavior. The exact definitions are included in Chapter 8. Both constructs are operationalized as aggregations of enabler questions from the benchmarking database. The categories the enabler questions stem from are illustrated in Figure 5.

Quality Maturity: See Quality Culture.

Structural factors: Structural factors are viewed as background information on the site, such as size and FTEs, technology, and product program. They allow building meaningful peer groups for comparisons (“compare apples with apples”).

Supplier Reliability: Supplier Reliability represents a measure to assess the reliability of external suppliers. The Supplier Reliability Score is calculated as an average of two relative values of the metrics Complaint Rate (Supplier) and Service Level Supplier.

1. The approach to build PQS Effectiveness (and also other scores) on a specific number of metrics is named operationalization. The conclusions built from the aggregated scores (e.g. PQS Effectiveness) have to be linked to the specific operationalization that was used in this research. The operationalization is a construct built on a match between the content of each metric and the aggregate, but may have some limitations due to the reliance on the metrics available in the St.Gallen benchmarking databases.
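Read together, these glossary definitions imply a simple aggregation chain from individual metrics up to the PQS Excellence Score. The Python sketch below illustrates that chain; the function names, and the assumption that all inputs are pre-normalized relative scores combined by plain averaging, are ours for illustration and do not reproduce the exact St.Gallen operationalization (cf. footnote 1).

```python
# Minimal sketch of the score aggregation described in the glossary.
# Assumes all inputs are already normalized to comparable relative scales;
# the exact St.Gallen operationalization may differ (see footnote 1).

def supplier_reliability(complaint_rate_rel: float, service_level_rel: float) -> float:
    """Average of the two relative supplier metrics (glossary definition)."""
    return (complaint_rate_rel + service_level_rel) / 2

def pqs_effectiveness(operational_stability: float, supplier_reliability_score: float) -> float:
    """Aggregate of the two sub-categories; the plain average is our simplification."""
    return (operational_stability + supplier_reliability_score) / 2

def pqs_efficiency(conversion_cost_per_batch: float) -> float:
    """Revised PQS Efficiency Score: inverse of conversion cost per batch."""
    return 1.0 / conversion_cost_per_batch

def pqs_excellence(effectiveness: float, efficiency: float) -> float:
    """Average of the PQS Effectiveness Score and the PQS Efficiency Score."""
    return (effectiveness + efficiency) / 2
```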



0 KEY FINDINGS OF YEAR 1 AND 2 OF THE ST.GALLEN FDA RESEARCH

Year 1

» The St.Gallen Pharmaceutical Production System Model (PPSM) was developed to demonstrate how Pharmaceutical Quality System (PQS) Excellence may be achieved. It is a holistic model that illustrates a system-based understanding of a modern pharmaceutical production environment.

» PQS Excellence comprises both PQS Effectiveness and PQS Efficiency. A positive correlation has been demonstrated between these two elements of the PPSM.

» A key performance indicator, Service Level Delivery (OTIF), has been identified as a suitable surrogate for the effectiveness of the Pharmaceutical Quality System for the purpose of data analysis.

» Operational Stability has been found to have a significant impact on PQS Effectiveness.

» Supplier Reliability has been found to have a significant impact on Operational Stability.

» PQS Effectiveness high performing sites have a significantly higher Cultural Excellence compared to PQS Effectiveness low performing sites.

» A newly developed Inventory – Stability Matrix (ISM) allows for a better understanding of the impact of inventory on PQS performance at a site.

» A high level of Inventory (Days on Hand) can compensate for operational stability issues experienced on sites but may also mask insufficient process capability.

» Sites with Low Stability and Low Inventory have the highest risk profile regarding Rejected Batches, Customer Complaint Rate and Service Level Delivery (OTIF) (PQS Effectiveness surrogate).

» Operational Stability high performing sites have a significantly lower level of Customer Complaints and a significantly lower level of Rejected Batches compared to Operational Stability low performing sites.

Year 2

» The metrics Operational Stability and Lot Acceptance Rate both proved to be meaningful indicators for internal audit and external inspection outcomes at a site.

» The PQS Effectiveness Score correlates positively with the degree of implementation of numerous technical and cultural enablers, demonstrating the importance of an integrated Enabler implementation approach.

» Sterile Liquid production only plants show, on average, a lower PQS Effectiveness Score, including a lower Lot Acceptance Rate, than Oral Solid Dosage only plants. Furthermore, as the Enabler implementation levels between these two technology groups are not significantly different, it appears that the complexity of requirements for sterile production, rather than the OPEX enabler implementation, impacts the PQS Effectiveness.

» QC Lab Robustness High Performers:
  - Have a significantly lower Invalidated OOS (per 100'000 tests).
  - Have a significantly higher level of automation.
  - Show patterns of an integrated (programmatic) Enabler implementation to achieve their superior robustness score, instead of a situational (piecemeal) Enabler implementation.
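The Invalidated OOS comparison above relies on a normalized rate. A minimal sketch of that normalization, with invented counts purely for illustration:

```python
def invalidated_oos_per_100k(invalidated_oos_count: int, total_tests: int) -> float:
    """Invalidated OOS results normalized per 100'000 tests performed."""
    return invalidated_oos_count / total_tests * 100_000

# Invented example: 12 invalidated OOS results across 240'000 tests -> 5.0
print(invalidated_oos_per_100k(12, 240_000))
```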



1 EXECUTIVE SUMMARY

This report presents the findings from three years of Quality Metrics Research and builds on seminal outcomes from earlier operations and quality management research, e.g. Voss et al. (Voss, Blackmon, Hanson, & Oak, 1995), Ferdows and De Meyer (Ferdows & De Meyer, 1990), and Deming (Deming, 1986). The work undertaken in year 3 has deepened the insights and enhanced the models developed in the first two years of Quality Metrics Research by the University of St.Gallen (Friedli, Köhler, Buess, Basu, & Calnan, 2017, 2018). The main results are highlighted below and described in more detail in the relevant chapters noted.

The enhanced Pharmaceutical Production System Model (PPSM) (cf. Chapter 4)


The original Pharmaceutical Production System Model (PPSM) has been further enhanced in year 3 as follows:

» The now operationalized PQS Excellence score is calculated based on both effectiveness and efficiency.

» The enhanced PPSM is now built on an extended foundation comprising both technical and cultural excellence.

» The individual cultural and technical excellence enablers of the enhanced PPSM comprise not only the 136 questions of the traditional St.Gallen Operational Excellence Model (cf. chapter 3) but have been increased to a total of 185 questions, assessing maturity, behaviors and capabilities across a range of OPEX, cultural and technical competencies, including the ICH Q10 architecture, covering all guideline elements.

» Due to the high relevance of Environment, Health and Safety (EH&S), this dimension was also added to the PPSM.

10 | QUALITY METRICS RESEARCH


Good Quality – Good Business (cf. Chapter 5)
Quality excellence describes an advanced approach to quality which goes beyond merely being compliant with regulations. Quality excellence is patient-driven, culturally embedded and built into the processes and behavior of an organization. Accordingly, the PQS excellence score comprises a PQS effectiveness score and a PQS efficiency score. The research team analyzed what distinguishes PQS excellent plants from worse performing plants, with the following outcomes:

» Effectiveness and efficiency are slightly positively correlated.

» Excellent plants excel on most PQS effectiveness metrics and achieve this with lower inventory levels, while handling more Stock Keeping Units (SKUs). They show a higher enabler implementation level across all categories, covering both technical and cultural enablers.

» Plants with low PQS excellence report a higher share of indirect labor QA/QC FTE, likely a reaction to underlying operational stability problems.
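The correlation statement above refers to a standard Pearson correlation across plants. A minimal sketch of such a check on paired plant-level scores follows; the sample values are invented and are not benchmarking data:

```python
import numpy as np

# Invented paired PQS Effectiveness / PQS Efficiency scores per plant.
effectiveness = np.array([0.62, 0.71, 0.55, 0.80, 0.67, 0.74])
efficiency = np.array([0.66, 0.58, 0.60, 0.69, 0.52, 0.71])

# Pearson correlation coefficient; a value above 0 indicates a positive relation.
r = np.corrcoef(effectiveness, efficiency)[0, 1]
print(f"Pearson r = {r:.2f}")
```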

The ICH Q10 Architecture from a Data Perspective (cf. Chapter 6)


A key element of the work undertaken in year 3 has involved mapping the ICH Q10 Pharmaceutical Quality System architecture onto the PPSM Model Architecture.

» 27 legacy enablers from the traditional St.Gallen OPEX Model could be allocated to ICH Q10 categories in the revised PPSM architecture. The legacy dataset does not cover all categories, and the preliminary results will have to be confirmed and detailed with comprehensive datasets (covering all guideline elements) in the future. Therefore, data gathering for all guideline categories is currently on-going.

» There is a positive link between the enabler implementation of these 27 enablers directly related to ICH Q10 and the other 110 enablers not assigned to ICH Q10 elements. Thus, the aggregated ICH Q10 enabler implementation level seems to be an indicator for the overall enabler implementation level.

» There is a positive link between the enabler implementation level of these 27 enablers and the aggregated PQS effectiveness, representing Operational Performance. The degree of determination is on a comparable level to linking all enablers with PQS Effectiveness. This fact provides confidence in the meaningfulness of the ICH Q10 architecture’s set-up.

QC Lab Findings (cf. Chapter 7)


Year 2 of this research focused on understanding the factors impacting QC Lab Robustness. Further examination this year has shown:

» QC Lab Operating Context & Business Environment (see glossary) have a moderating impact on QC lab robustness.

» Robust QC labs (i.e. those with high QC lab effectiveness and high enabler implementation) focus on an integrated implementation of basic and advanced enablers. A majority of low performing QC labs only focus on implementing single basic enablers.

» The Quality Maturity, Quality Behavior link also exists in QC labs: R²=56%.

» Operational Excellence is a journey: Pioneer QC labs that have implemented Operational Excellence enablers for a longer time show higher QC lab robustness (i.e. effectiveness).

Quality Culture – Re-visited (cf. Chapter 8)


The role of Quality Culture has gained increased recognition within the pharmaceutical industry in recent years. A healthy Quality Culture translates into a system state “that puts interest and safety of patients […] above all else and where people do what is right versus what is good enough” (Patel et al., 2015). We separated Quality Culture into Quality Maturity and Quality Behavior in Year 1, following the PDA definitions (see also glossary).

» Beyond our research in year 1, we can now also show that for the majority of the sites there is a positive impact of Quality Maturity (R²=54%) and Quality Behavior (R²=56%) attributes on Operational Stability.
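The R² values quoted here come from regressions of the kind sketched below: Operational Stability regressed on an aggregated culture attribute score. The ordinary-least-squares form and the site-level scores are illustrative assumptions, not the report’s actual model specification or data:

```python
import numpy as np

# Invented site-level scores: x = aggregated Quality Maturity attributes,
# y = Operational Stability.
x = np.array([0.42, 0.55, 0.61, 0.70, 0.73, 0.81, 0.88])
y = np.array([0.40, 0.50, 0.65, 0.68, 0.70, 0.84, 0.86])

# Ordinary least squares fit y = a*x + b, then coefficient of determination R^2.
a, b = np.polyfit(x, y, deg=1)
y_hat = a * x + b
r_squared = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"R^2 = {r_squared:.2f}")
```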

A closer Look from a Risk Perspective (cf. Chapter 9)


For an external validation of the PPSM we researched interdependencies between operational KPIs and Enablers on the one hand, and internal quality audit results and regulatory inspection outcomes on the other. The findings show:

» Sites that show a higher level of implementation of Operational Excellence enablers have better inspection outcomes. This relationship is stronger and highly significant for enablers specifically associated with the quality system, as measured in the enabler category Total Quality Management (TQM).

» Advancement in streamlining operations (Just-in-Time (JIT) enabler implementation) appears to have a similar effect.

» The Quality Maturity construct, remodeled through the St.Gallen Operational Excellence benchmarking database, can also be linked to inspection outcomes.

» Operational KPIs improve the quality of models that predict adverse inspection / internal audit outcomes, providing some additional support for the rationale of the FDA quality metrics initiative.
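Chapter 9 evaluates logistic regression models for this prediction task. The sketch below shows the general shape of such a model using scikit-learn; the feature choice, labels and all values are invented for illustration and do not reproduce the report’s models:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Invented site features: [TQM enabler level, JIT enabler level, rejected-batch rate].
X = np.array([
    [0.80, 0.70, 0.01], [0.30, 0.40, 0.06], [0.90, 0.80, 0.02], [0.20, 0.30, 0.08],
    [0.70, 0.60, 0.02], [0.40, 0.50, 0.05], [0.85, 0.75, 0.01], [0.25, 0.35, 0.07],
])
# Invented labels: 1 = adverse inspection outcome, 0 = acceptable outcome.
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])

# Cross-validated discrimination performance of the classifier.
model = LogisticRegression()
auc = cross_val_score(model, X, y, cv=4, scoring="roc_auc")
print(f"Mean ROC AUC: {auc.mean():.2f}")
```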



2 INTRODUCTION AND
BACKGROUND

The FDA Quality Metrics initiative has emerged directly from the FDA Safety and Innovation Act (US Congress, 2012). It aims to ‘support the modernization of pharmaceutical manufacturing as part of the FDA’s mission to protect and promote public health’ (Research, 2016), by encouraging both the industry and the regulators to implement modern, risk-based pharmaceutical quality assessment systems.

As part of this initiative, the FDA awarded a research grant in 2016 to the University of St.Gallen, Switzerland to help establish a scientific basis for relevant pharmaceutical manufacturing performance metrics, which may prove useful in assessing the current state of quality and in predicting risks of quality failures and/or drug shortages. Results from the first year of this collaboration were published in a research report issued2 in October 2017 and key findings were also presented at the ISPE annual meeting in San Diego (Friedli et al., 2017). In July 2017, the grant was extended for a second year. The results from this second year were again summarized in a comprehensive report3, issued in November 2018 (Friedli, Köhler, Buess, Basu, et al., 2018).

This report, building upon the findings of the first two years of research (see chapter 3 for a comprehensive summary), provides an account of the research activities undertaken in the third year and presents an outlook for future research on this topic. The research in year 3 focused on enhancing the Pharmaceutical Production System Model (PPSM) used for structuring and analyzing the available data, by broadening the enabler base and further operationalizing the outcome calculation (see chapters 4 & 5). This modification strengthens the PPSM to act as a true excellence model providing the basis for an in-depth evaluation of a case for quality (see chapter 5). Additionally, research in year 3 focused on deepening the understanding of the role of the QC Labs (chapter 7), enhancing the evaluation of the importance of the role of culture (chapter 8) and identifying early operational indicators or predictors of quality issues that are likely to occur in the future (chapter 9).

The analyses in this report are mainly based on the St.Gallen OPEX database(s) (cf. Figure 1 and Figure 2). The participating sites either initiate contact with the St.Gallen team or embark on the project after the research team proposes the benchmarking opportunity and benefits. The motives to participate are quite broad: while some sites seek to validate the success of longstanding Operational Excellence initiatives, others use it as a baseline assessment before initiating such an improvement initiative. Large companies that participate with several sites often aim to cover a broad spectrum from lighthouse or flagship sites to others that are lagging in their Operational Excellence journey. The team therefore assumes the sites are not more advanced than the industry average.

The research team gathers these datasets remotely, in very close collaboration with the corresponding manufacturing sites and labs. The data gathering process involves several steps of data validation and feedback to the sites to ensure data quality. At the time of writing this report, the St.Gallen benchmarking databases consisted of 381 manufacturing sites and 66 QC lab locations.

2. The final report of the first year of the FDA Quality Metrics research project can be requested under: www.item.unisg.ch/fda
3. The final report of the second year of the FDA Quality Metrics research project can be requested under: www.item.unisg.ch/fda



Figure 1: Operational Excellence Manufacturing database overview

Figure 2: Operational Excellence QC Lab database overview



3 SUMMARY OF YEAR 1 AND YEAR 2
RESEARCH OUTCOMES



Since 2004, the Institute of Technology Management at the University of St.Gallen has been leading a collaborative effort with various pharmaceutical companies around the world to improve their performance through participation in a unique, first-of-its-kind, Operational Excellence (OPEX) benchmarking study. St.Gallen pharmaceutical OPEX benchmarking has established itself as an important and successful tool providing practitioners in the pharmaceutical companies with exclusive industry intelligence and support for databased decision-making.

At the heart of Operational Excellence is the proper and consistent application of certain methodologies such as Lean and Six Sigma and tools such as Kaizen, DMAIC and Kanban in order to produce quality product and realize business goals. Thus, it is important to have efficient work practices and business processes such as “doing it right the first time”. An integral component of operational excellence is creating the right work culture, one that stresses empowerment and rewards the right behaviors. Ultimately, Operational Excellence is a philosophy of the workplace where problem-solving, teamwork, and leadership lead to continuous improvement in an organization. The process involves focusing on the customers’ needs, keeping the employees empowered, and continually improving the current activities in the workplace. As first utilizers of these principles, Japanese manufacturers were able to offer products of high quality at more efficient production costs than their US American and European counterparts through systems such as Lean. When integrated under the umbrella of operational excellence and applied across the organization, the systematic study, analysis and measurement of operations leads to higher yields, reduction of waste, improvement in quality and increased customer satisfaction.

Traditionally, quality and operational excellence were not always regarded as closely connected. However, the link between quality and continuous improvement leads to an improvement in business. In addition to developing this rare and unique collaborative model between pharma companies across several continents, the St.Gallen OPEX effort has led to the realization and recognition that quality and operational excellence are two sides of the same coin, and they are integrally related. Where quality is driving Operational Excellence there is a culture of quality, with strong executive leadership and organizational structure, employee participation, integrated with corporate objectives, integrated processes, and real-time traceable metric measurement. When one talks about quality, there is often a common theme connecting compliance and Operational Excellence (OPEX). Most pharmaceutical companies understand what quality is and what is required to comply with the relevant CGMP regulations. By tapping into the culture of continuous improvement with OPEX driven by quality, companies will achieve what is “desired” versus just what is “required,” resulting in confidence in a reliable quality output and supply chain.

In part due to the collaborative effort of the OPEX team at St.Gallen with the pharmaceutical industry around the world, quality has come to be recognized as a strategic tool for attaining operational efficiency and improved business performance. Quality was once viewed as an extra operation, step or layer: it added to everyone’s workload and was a burden on a corporation’s profitability and ability to meet market and customer demands. This adversarial view resulted in the past practices of attempting to control quality after the fact. When the big wave of quality tools, such as Lean Manufacturing, Kaizen, and Kanban, rose to prominence in the 1980s and 1990s, they were often grafted onto existing processes, tools and methods. But, again largely due to the collaborative research efforts on OPEX at St.Gallen, it is now common understanding that pharmaceutical manufacturing facilities must synergistically supply quality to their customers in terms of product, service, delay, reliability, etc. The very first step is to stop looking at quality as a weight to carry and, instead, to see it as one of the major strong points. Investment in quality leads to improvement in operational efficiency and ultimately higher profitability.

The St.Gallen OPEX Benchmarking database can provide valuable information about excellence in quality as well as in operations for pharmaceutical companies. The St.Gallen Operational Excellence (OPEX) model is a philosophy that directs an organization towards continuous improvement. It is a guide to balance the management of cost, quality and time, focusing on the needs of the customer, comprised of both structural and behavioral changes that support the necessary activities executed in the best way possible. To be sustainable, it must be driven by top management and must be designed to engage every single employee.

The St.Gallen OPEX Model serves as an analytical “thought model” for the benchmarking process, providing a sound basis for an overall system-based interpretation of data. The current St.Gallen OPEX Model is shown below in Figure 3. The OPEX Model includes several sub-systems, each of which constitutes an important element that contributes to the overall operational performance. Another important characteristic of this model is the way in which these sub-systems reinforce each other. Thus, the model represents manufacturing as a holistic system in which single elements or interventions have both direct and indirect impacts on other elements or sub-systems. At the highest level of abstraction, the St.Gallen OPEX Model is divided into two larger sub-systems:

1. Technical Sub-System - The technical sub-system comprises well-known manufacturing programs such as Total Productive Maintenance (TPM), Total Quality Management (TQM) and Just-in-Time (JIT), and structures them in a logical and consistent manner (Cua, McKone, & Schroeder, 2001).

2. Social Sub-System - The social sub-system takes up the quest for an operational characterization of the quality of management and work organization. This second, high-level sub-system focuses on supporting, encouraging and motivating people to steadily improve processes. By doing so, it applies a range of technical practices and expertise in ways that contribute to the overall goal of the company.



Figure 3: St.Gallen OPEX Model

Figure 4: The Pharmaceutical Production System Model



Connections between the different subsystems, especially between quality and OPEX, are not obvious in pharmaceutical manufacturing because there is a tendency to separate quality from operational excellence. Sometimes, traditional quality departments are suspicious of operational excellence programs, viewing them largely as cost-cutting exercises. The St.Gallen OPEX Model reinforces that maintenance programs, process stability, system productivity and ultimately product quality are all strongly interrelated. In fact, quality, maintenance, and production planning strongly interact and jointly determine those aspects of a company’s success that are related to production quality, which ultimately determines the company’s ability to deliver on time, thus meeting customer expectations, while keeping resource utilization to a minimum level. This research demonstrates the benefits of an integrated approach to the management of quality and overall organizational performance. While compliance is an important part of quality in pharmaceutical manufacturing, the role of quality should never be defined by compliance alone. A compliance-first view fosters the concept of quality as a ‘Cost Center’ and as a Quality Police. In addition, just as it is important to have a good handle on traditional financial metrics like revenue growth, return on assets, cost of goods sold and more, in pharmaceutical manufacturing it is perhaps even more important to understand the operational metrics and the role quality can play in performance. Thus, quality metrics should play an equally important role in this integrated process management system. The quality metrics should reflect a company’s priorities, direction, and culture.

A robust Pharmaceutical Quality System (PQS) provides key elements of assurance and oversight necessary for both pharmaceutical manufacturing and quality control laboratory processes. The question this research seeks to address is how to develop suitable indicators to measure PQS effectiveness and PQS efficiency and ultimately deliver PQS excellence, thereby yielding the desired benefits to the patients and the companies. “A PQS is successful when it assures an ongoing state of control. In a healthy PQS, managers establish a vigilant quality culture in which timely action is taken to prevent risks to quality. Lifecycle adaptations are made to address manufacturing weaknesses and continually improve systems. An effective PQS will ultimately support stable processes, and predictable quality and supply.”4 As can be seen from Figure 3 above, the St.Gallen OPEX model already incorporated the general structure to represent, analyze and measure Pharmaceutical Quality Systems.

Since the beginning of this century, FDA has been promoting its vision of “a maximally efficient, agile, flexible manufacturing sector that reliably produces high-quality drug products without extensive regulatory oversight.” But a new approach to FDA’s quality oversight has evolved in recent years. The FDA Quality Metrics initiative, which stems from the FDA Safety and Innovation Act (US Congress, 2012) (FDASIA, 2012), aims to develop and implement the reporting of a set of standardized manufacturing quality metrics. The establishment and collection of these metrics should provide various stakeholders – from industry to regulators – with better insight into the state of quality at a given manufacturing facility, and allow stakeholders to better anticipate and address quality issues, as well as their associated risks, while simultaneously reducing extensive regulatory burden.

As part of this initiative, the FDA awarded a research grant (Grant #1UO1FD005675-01; Title: FDA Pharmaceutical Manufacturing Quality Metrics Research) to the University of St.Gallen in 2016 to help establish a data-driven, scientific basis for such metrics. One of FDA’s motivations is to establish the scientific base for relevant performance metrics which might be useful in predicting risks of quality failures or drug shortages. An important factor in the academic collaboration for this research was the availability of the St.Gallen Pharmaceutical OPEX Benchmarking database, consisting today of key performance data related to more than 380 pharmaceutical manufacturing sites, facilitating a detailed analysis of the FDA draft guidance metrics of Lot Acceptance Rate and Customer Complaint Rate, among others.

One of the main outcomes of this research project has been the development of the Pharmaceutical Production System Model (PPSM) as a complementary model to the established St.Gallen OPEX model, with the intent of integrating the existing Quality and Excellence functions within pharmaceutical manufacturing operations. It seeks to establish, through data-driven analysis, the underlying relationships between quality and operational performance. The Pharmaceutical Production System Model (PPSM) was specifically developed for this FDA Quality Metrics project. It has been designed to enable a structured analysis of the key components which support the achievement of Pharmaceutical Quality System (PQS) Excellence, which remains the primary focus of this research. The PPSM illustrates a holistic, system-based understanding of a pharmaceutical production environment.

Also in this collaboration with the FDA, an additional database was established with data collected from QC laboratories which support pharmaceutical manufacturing sites. Both databases together provide an even more comprehensive picture of the current state of operational excellence and quality at pharmaceutical plants than was available heretofore. The additional QC lab database also provides the opportunity to analyze Invalidated OOS, the FDA draft guidance metric not previously examined in year 1 of this research project. Furthermore, while conceptually

4. https://www.fda.gov/drugs/pharmaceutical-quality-resources/quality-systems-drugs



Figure 5: PPSM House with Metrics and Enabler⁵

[Figure: the PPSM house, listing the underlying metrics for PQS Excellence (built from PQS Effectiveness and PQS Efficiency), Operational Stability, Supplier Reliability, Lab Quality & Robustness, Engagement Metrics, Cultural Excellence (Quality Maturity and Quality Behavior) and CAPA Effectiveness.]

CAPA: Corrective And Preventive Action
PQS: Pharmaceutical Quality System

“QC Lab Robustness” had already been designed into the PPSM analysis developed in year one; as there was little or no QC Lab data available in the original benchmarking database, a specific QC Lab Robustness score could not be calculated or be considered in the overall consideration of the effectiveness of the Pharmaceutical Quality System (PQS Effectiveness Score).

A high-level view of the PPSM is depicted in Figure 4, and Figure 5 shows the model including all the metrics that went into the model. The three FDA metrics from the draft guidance are highlighted in Figure 4 and Figure 5.

The main characteristics of this PPSM house are:

1. The foundation of the PPSM house is Cultural Excellence6, which combines the employee engagement, quality behavior and quality maturity of a company.

2. The second level of the house is referred to in the model as Corrective Action and Preventive Action (CAPA) effectiveness. This element incorporates broader quality management system capabilities, which are required to correct, prevent and bring about improvements to an organization’s processes by proactively and reactively eliminating causes of non-conformities or other under-performance outcomes, and is an integral part of any GMP environment.

3. The third level considers the stability of production. In the very center of this level, the model includes a performance assessment of Operational Stability. As this stability is also impacted by the reliability of the supply of critical materials and components, Supplier Reliability is the second building block on level 3. The third component at this level are the measures associated with Lab Robustness. As mentioned earlier, this score could not be calculated in year 1 because of the lack of appropriate data but was considered in the year 2 research. A deeper analysis of the role of labs can be found in Chapter 7 of this report.

Based on Cultural Excellence (Level A), robust CAPA processes (Level B) and high Supplier Reliability, Operational Stability and Lab Robustness (Level C), the model has provided evidence of how a company can achieve higher PQS Effectiveness. Related efforts in the assessment of costs and headcounts provide one means to calculate a PQS Efficiency score. These two aggregated measures build the two pillars of assessing PQS Excellence. Another approach for the efficiency calculation is introduced in chapter 4 of this report. The ultimate goal of PQS Excellence can only be achieved by sustaining improvement of each of the PPSM building blocks, and Cultural Excellence is the basis for delivering that improvement. While specific controls and safeguards may influence an improvement in specific high-level KPIs like OTIF (on time in full)7, maximum benefit is gained when organizations are fully committed to excellence and use a superior management of equipment and processes (effectiveness) to drive down costs too (efficiency). For overall PQS Effectiveness, the most significant category is Operational Stability.

The model serves several aims:

1. The PPSM provides a structured and holistic depiction of the relevant, available data from the St.Gallen OPEX Database, including: Key Performance Indicators8 (e.g. metrics within the C-categories), Enabler implementation9 (e.g. qualitative enablers within the category Cultural Excellence) and the Structural Factors10 of the given organization (e.g. site structure, product mix, technology employed).

2. The model facilitates the positioning of the three metrics suggested in the revised FDA Draft Guidance (2016) within the broader context of the holistic St.Gallen understanding, in order to test them for significance from a system perspective (see the sketch after this list). By doing so:

   a. The KPI Lot Acceptance Rate was assigned to the C-category Operational Stability.

   b. The KPI Invalidated OOS was assigned to the C-category Lab Quality and Robustness.

   c. Customer Complaint Rate is considered as an outcome metric within the PPSM and is therefore located in the D-category PQS Effectiveness.

3. The model facilitates the grouping and discussion of the elements within the PPSM as well as the examination of the relationships between elements. For instance, the proposal to examine the “Relationship of individual Operational Stability metrics with PQS Effectiveness” clearly defines the scope of the analysis to be discussed.

4. The PPSM provides a structure for the overall research project as it facilitates the tracking and communication of each analysis already performed as well as indicating any potential blank spots between the different PPSM elements, thereby supporting the identification of future potentially interesting analyses.
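The metric assignment in point 2 above can be restated compactly as a lookup from draft-guidance metric to PPSM category. The following sketch is merely a convenient notation for the assignment described in the text; the constant name is ours, and the category letters follow Figure 4:

```python
# Assignment of the three FDA draft guidance metrics to PPSM categories,
# as described in point 2 above (category letters follow Figure 4).
FDA_METRIC_TO_PPSM_CATEGORY = {
    "Lot Acceptance Rate": ("C", "Operational Stability"),
    "Invalidated OOS": ("C", "Lab Quality and Robustness"),
    "Customer Complaint Rate": ("D", "PQS Effectiveness (outcome metric)"),
}
```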



3.1 Main Findings

3.1.1 Operational Excellence and Quality Management are two sides of the same coin

In pharmaceutical manufacturing, quality should drive operational excellence. This has been confirmed by this research. One of the key conclusions arising out of this research program is that for pharmaceutical companies it is beneficial to align the reporting of quality performance metrics with internal OPEX programs in order to:

» Justify the additional reporting effort and highlight the benefits to the business

» Further systematize continuous improvement activities within organizations

» Improve the understanding of the actual performance of the company’s production system in general, and the reported FDA metrics in particular.

This will lead to the integration of business and quality. Analysis and measurement of quality will be process oriented. This analysis will lead to prioritization of work processes, changes in the culture on and beyond the shop floor, as well as changes in how management views quality as a benefit to improving efficiency and productivity. To deal with this radical shift, organizations would need to change their mindset. Where Quality is driving Operational Excellence there is a Culture of Quality, with strong Executive Leadership and Organizational Structure, integrated with corporate objectives, integrated processes, and real-time traceable metric measurement.

3.1.2 St.Gallen Research validates the FDA Quality Metrics Program

I. Lot Acceptance Rate and Customer Complaint Rate are reasonable measures to assess Operational Stability and PQS Effectiveness and should remain part of the Quality Metrics Program.

II. The level of detail of the FDA suggested quality metrics definitions is appropriate given the limited number of metrics requested.

III. A prerequisite to identifying risks based on the reportable metrics will be to define appropriate thresholds or ranges for these metrics.11

IV. The absence of any metrics addressing culture should be reconsidered given the demonstrated high importance of Cultural Excellence for PQS performance.

V. Reporting on a product level should also be reconsidered, as the additional value (e.g. for preventing drug shortages) is limited and may not justify the comparably high reporting burden across the supply chain. On the other hand, it must be acknowledged that FDA intends to use quality metrics data for other reasons as well, such as a more targeted preparation of site inspections.

VI. Without considering the level of inventory (measured as Days on Hand), the program’s ability to assess the risk of drug shortages is limited.12

VII. Evaluating the advantages and disadvantages of other voluntary reporting programs (such as (OSHA, 2019)) versus mandatory participation is recommended.

The chosen focus on measurable KPIs in the draft guidance is reasonable and does make sense for a regulator. However, the research has demonstrated that a focus on the level of integration and level of enabler implementation should also be included for any targeted discussions with industry or during the inspection process. In particular, the importance of cultural excellence should not be underestimated and is best considered together with any KPI-led performance assessment. The complexity in the patterns examined, especially the interdependencies, would benefit from further critical understanding of the specific context for each organization. The existence of significant numbers of “out of pattern” groups indicates that it is useful to explore the potential for engaging in direct discussions with the pharmaceutical industry about the metrics and metrics systems in use within organizations.

Based on the findings and the comparison of the FDA Quality Metrics with the St.Gallen PPSM approach, the research team can confirm that the FDA proposed metrics are key metrics of the Pharmaceutical Quality System.

5. The selection of metrics is often influenced and limited by the availability of metrics in the OPEX benchmarking database. The operationalization of subsystems like 'Supplier Reliability' could use additional KPIs for a broader representation.
6. Cultural Excellence has been investigated in year 1 of this research. The importance of culture for superior quality could be shown based on the available data. For deeper insight compare Friedli et al. (Friedli et al., 2017).
7. We showed in year 1 that high inventory levels might have a positive impact on OTIF (Friedli et al., 2017).
8. Key performance indicators (KPIs) are a set of quantifiable measures that a company uses to gauge its performance over time. These metrics are used to determine a company's progress in achieving its strategic and operational goals, and also to compare a company's finances and performance against other businesses within its industry.
9. Enablers are production principles (methods and tools, but also observable behavior). The values show the degree of implementation based on a self-assessment on a 5-point Likert scale.
10. Structural factors provide background information on a site, such as size and FTEs, technology, and product program. Structural factors allow the building of meaningful peer groups for comparisons ("compare apples with apples").
11. Doing this bears some complexity: first, risk has to be operationalized, and then a certain amount of data is needed to find relations between the metric values and the risk exposure. As FDA intends to do the analysis only in combination with other data it already has available, other patterns may arise that serve the aim of identifying the respective risks.
12. This conclusion has not been derived from data analysis but from theory and from the study of sources like the Drug Shortages report by the International Society for Pharmaceutical Engineering [ISPE] and The Pew Charitable Trusts [PEW] (The Pew Charitable Trusts & International Society for Pharmaceutical Engineering, 2017).



3.1.3 The PPSM provides a framework for PQS Excellence

The main findings arising from the first two years of research conducted by the University of St.Gallen in close collaboration with the FDA Quality Metrics Team are summarized below. The St.Gallen Pharmaceutical Production System Model (PPSM) was developed as a prerequisite for conducting a structured data analysis and as a framework for how Pharmaceutical Quality System (PQS) Excellence may be achieved. It is a holistic model that is based on a system-based understanding of pharmaceutical production. For an overview of the model and the architecture of the included constructs like 'Cultural Excellence' please refer to Figure 5. Some of the key characteristics are:

a. PQS Excellence comprises both PQS Effectiveness and PQS Efficiency. The relation between these two elements of the PPSM is further investigated in chapter 5.

b. The key performance indicator Service Level Delivery (OTIF) has been identified as a suitable surrogate for the effectiveness of the Pharmaceutical Quality System for the purpose of data analysis.

c. Operational Stability has been found to have a significant impact on PQS Effectiveness.

d. Supplier Reliability has been found to have a significant impact on Operational Stability.

e. PQS Effectiveness high performing sites have a significantly higher Cultural Excellence compared to PQS Effectiveness low performing sites.

f. A newly developed Inventory-Stability Matrix (ISM) allows for a better understanding of the impact of inventory on the PQS performance of a site.

g. A high level of Inventory (Days on Hand) can compensate for stability issues experienced on sites, including insufficient process capability.

I. Sites with Low Stability and Low Inventory have the highest risk profile regarding Rejected Batches, Customer Complaint Rate and Service Level Delivery (OTIF) (PQS Effectiveness surrogate).

II. Operational Stability high performing sites have a significantly lower level of Customer Complaints and a significantly lower level of Rejected Batches compared to Operational Stability low performing sites.

III. Fostering Quality Maturity will have a positive impact on the Quality Behavior at a site, leading to superior Cultural Excellence and subsequently providing the foundation of PQS Excellence.

3.1.4 Implications for the Pharmaceutical Industry

I. Metrics systems should become integrated across Operations and Quality and be used to derive joint conclusions. It is important to also include an assessment of Enabler capabilities. The focus should be on integrated enabler implementation and on linking enablers to specific outcome metrics.13

II. The role of the labs for a lean and excellent overall value chain should be explored and optimized accordingly.

3.2 Summary Year 1 Findings

Service Level Delivery (OTIF)

I. Service Level Delivery (OTIF) is deemed to be a good surrogate for PQS Effectiveness measured by the aggregated PQS Effectiveness Score.

II. Sites with high levels of Rejected Batches and low inventory show a comparably low level of Service Level Delivery.

Operational Stability

I. A high level of operational stability seems to be the major lever to achieve high levels of Service Level Delivery. Service Level Delivery (OTIF) is deemed to be a good surrogate for PQS Effectiveness measured by the aggregated PQS Effectiveness Score.

II. Sites with low operational stability show significantly higher levels of Rejected Batches.

III. Sites with low stability and low inventory show a weak performance for both Rejected Batches and Customer Complaint Rate.

Customer Complaint Rate

I. Sites with low stability and low inventory show a weak performance for both metrics, Rejected Batches and Customer Complaint Rate.

II. A higher Customer Complaint Rate is accompanied by a low aggregated PQS Effectiveness Score.

13. Today, in most OPEX and Production System implementations, the main focus is on introducing the various tools and enablers. What is usually controlled is the degree to which the different components of an OPEX program or a production system have been implemented. Unfortunately, the link to performance is not always made; frequently, no KPI systems are introduced to measure the impact of these implementations and to adapt the programs accordingly.



Inventory Levels

I. A high level of operational stability seems to be the major lever to achieve high levels of Service Level Delivery.

II. A high level of inventories may compensate for stability issues.

III. Inventory mitigates the negative effect of high levels of Rejected Batches on Service Level Delivery (OTIF).

IV. The level of inventory has a mitigating effect on the impact of low operational stability and a high level of Rejected Batches on the Customer Complaint Rate.

V. A high level of inventory reduces the negative impact of Rejected Batches on the Service Level Delivery level.

PQS Effectiveness

I. A high degree of PQS Effectiveness is accompanied by a high level of Cultural Excellence evidenced at the site. The impact of Cultural Excellence and Quality Culture on the effectiveness of a site's PQS can be statistically confirmed with the data of the St.Gallen OPEX Benchmarking database.

II. Regarding CAPA, a highly significant correlation is only detectable between the metric 'Number of non-critical overdue CAPAs' and PQS Effectiveness (OTIF).

III. Pharmaceutical manufacturing sites with a higher PQS Effectiveness tend to also show a higher PQS Efficiency, though the degree of determination is rather limited.

IV. PQS Effectiveness High Performers have a significantly higher implementation level of Cultural Excellence compared to the PQS Effectiveness Low Performers.

Quality Maturity

I. Quality Maturity is strongly correlated with Quality Behavior.

II. We identified the ten individual Quality Maturity attributes that differentiate sites most regarding their overall Quality Maturity score (details included in Chapter 8).

3.3 Summary Year 2 Findings

PQS Effectiveness

I. PQS Effectiveness High Performers (HP) (top 10%) show a higher level of performance across all performance metrics considered in this research.

II. The PQS Effectiveness Score correlates positively with the degree of implementation of numerous technical and cultural enablers, demonstrating the importance of an integrated Enabler implementation approach.

QC Lab Performance

For the QC lab performance, the following findings, based on the QC Lab Robustness score, were derived. The QC Lab Robustness High Performers:

I. Show a better performance for all KPIs.

II. Have a significantly lower Invalidated OOS rate (per 100,000 tests).

III. Have a significantly higher level of automation.

IV. Show patterns of an integrated (programmatic) Enabler implementation to achieve their superior robustness score, instead of a situational (piecemeal) Enabler implementation.

V. Have an average Invalidated OOS rate of 12 (per 100,000 tests).

VI. Invalidated OOS (per 100,000 tests) can be directly linked to QC Lab Robustness performance.

Furthermore, it was shown that:

» There is a positive link between operational excellence Enabler implementation and QC Lab Robustness.
» There is a positive link between management commitment and the Technical Enablers.



4 THE ENHANCED PHARMACEUTICAL
PRODUCTION SYSTEM MODEL



Chapter 4 focuses on the further development of the Pharmaceutical Production System Model (PPSM) in the recent research period. Expanding the foundation of the legacy PPSM used in the first two years of the research project by additional enablers allows a comprehensive analysis including all ICH Q10 categories (cf. chapter 4.1). Furthermore, this chapter introduces the newly developed PQS Excellence Score, which is based on a PQS Efficiency Score revised relative to the final research reports Year 1 & Year 2 (Friedli et al., 2017; Friedli, Köhler, Buess, Basu, et al., 2018), and uses data from the St.Gallen OPEX Database to replicate an analysis conducted by Voss et al. to separate performance clusters of production plants (Voss et al., 1995). The PQS Excellence Score is an aggregated score comprising PQS Effectiveness and PQS Efficiency and forms the basis for the analysis in chapter 5. The newly revised model is shown in Figure 6.

[Figure 6 depicts the enhanced PPSM as a layered model: PQS Excellence at the top, resting on PQS Effectiveness and PQS Efficiency; below these, Supplier Reliability, Operational Stability and Lab Robustness; then CAPA Effectiveness; then Cultural Excellence and Technical Excellence; and, as the foundation, the ICH Q10 elements (Management Responsibilities, Knowledge Management, CAPA System, Quality Risk Management, Process Performance & Product Quality Monitoring System, Change Management System, Management Review) together with Environment, Health and Safety.]

Figure 6: The enhanced Pharmaceutical Production System Model (PPSM)



4.1 Extended PPSM Enabler Landscape

The legacy St.Gallen OPEX enabler landscape addresses capabilities, e.g. practices, tools and routines implemented to support and drive superb operational performance. It contains both cultural and technical excellence aspects. The traditional St.Gallen Operational Excellence Model (see Figure 3) groups the enablers in the dimensions Total Productive Maintenance (TPM), Total Quality Management (TQM), Just-in-Time (JIT), Effective Management System (EMS) and Basic Elements (BE).

Following the objective to additionally cover all guideline categories of ICH Q10, the guideline categories are integrated into the legacy enabler landscape as shown in Figure 6 and act as the new structure for the enhanced enabler foundation of the PPSM. While ICH Q10 uses the term enabler only for Knowledge Management and Quality Risk Management, the objective of the present research is to assess the implementation level of ICH Q10 comprehensively. That is why an assessment addressing the implementation level of all guideline categories is designed, regardless of whether the categories are tagged as enabler or as PQS element in ICH Q10. As the assessment focuses on the implementation level of approaches and principles that enable driving performance forward, instead of measuring outcome performance based on metrics, the assigned and newly derived questions are named enablers even if related to ICH Q10 elements.

Therefore, the enabler landscape extension follows a two-step approach. First, all legacy enablers are evaluated and assigned to the appropriate ICH Q10 category. As a first-step result, the content of 27 out of the 137 legacy OPEX enablers aligns with ICH Q10, so those 27 enablers are identified as ICH Q10 relevant and assigned to one of the guideline's categories Management Responsibilities, Knowledge Management, Quality Risk Management, Process Performance and Product Quality Monitoring System, Change Management and Management Review. There is no legacy enabler covering aspects of the CAPA System as described in ICH Q10. Figure 7 summarizes the number of assigned legacy OPEX enablers per ICH Q10 category.

As the assigned legacy enablers are not sufficient to cover all aspects discussed in ICH Q10, additional enablers dedicated to the guideline's categories are derived to ensure the content of ICH Q10 is fully covered. Overall, 44 newly conceptualized enablers are added to the enhanced PPSM enabler architecture. Figure 8 depicts the number of newly added enablers per ICH Q10 category.

Besides integrating the legacy OPEX enabler architecture into the PPSM and ensuring full compatibility with ICH Q10, the dimension Environment, Health and Safety (EH&S) completes the enhanced PPSM, as suggested by industry representatives across pharmaceutical companies. EH&S is seen as one of the most important action fields when improving and sustaining operations. Following a two-step approach similar to the ICH Q10 extension described above, 4 of the legacy OPEX enablers can be classified as EH&S related. Furthermore, 4 additional enablers dedicated to EH&S are derived and included in the enhanced PPSM enabler landscape.

Overall, the legacy OPEX enabler architecture (consisting of 137 legacy enablers) is extended by 44 newly derived enabler questions related to ICH Q10 as well as 4 new EH&S enablers. The enhanced set of 185 enablers in total allows a comprehensive assessment of the enabler implementation level within a pharmaceutical production site. The extended ICH Q10 enabler questionnaire (see Appendix 1) is incorporated in the St.Gallen Operational Excellence benchmarking program, and data gathering is currently ongoing. Depending on the specific focus of an analysis, evaluation results can be aggregated both along the dimensions of the St.Gallen Operational Excellence Model and along the ICH Q10 guideline, to assess and compare the implementation level of ICH Q10 across the pharmaceutical industry.

4.2 Operationalization of PQS Excellence

The fundamental principle of PQS Excellence is derived from existing Excellence Models in Operations Management. The literature suggests combining effectiveness and efficiency to measure excellence (Cross & Lynch, 1988; EFQM, 2012; Kaplan & Norton, 1992; Keegan, Eiler, & Jones, 1989; MBNQA, 2017; Neely, Adams, & Crowe, 2001). Therefore, the newly introduced PQS Excellence Score is calculated as the average of the PQS Effectiveness Score and the PQS Efficiency Score, as stated below.
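Expressed formally (a direct restatement of the averaging rule above; no weighting beyond the plain average is described in the report):

\[
\text{PQS Excellence Score} = \frac{\text{PQS Effectiveness Score} + \text{PQS Efficiency Score}}{2}
\]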
Accordingly, effectiveness and efficiency are both considered performance dimensions, in contrast to a solely effectiveness- or efficiency-focused performance understanding.

The operationalization of the PQS Effectiveness Score corresponds to the operationalization developed during the first year of this Quality Metrics research project. The PQS Efficiency Score, on the other hand, has been completely revised compared to the final research reports Year 1 & Year 2 (Friedli et al., 2017; Friedli, Köhler, Buess, Basu, et al., 2018). The reasons for this revision are twofold. On the one hand, the previous operationalization carried the material costs, which are not directly influenced by the quality of the manufacturing operations. On the other hand, it did not allow for a comprehensive enough assessment, since relevant cost items were only addressed partially. For instance, only QA and QC headcounts were considered previously, to the neglect of other direct and indirect labor areas (Friedli et al., 2017; Friedli, Köhler, Buess, Basu, et al., 2018). Consequently, the revised PQS Efficiency Score aims to cover all production related cost items, to allow for a comprehensive assessment of operational efficiency. In the following, the operationalization of the PQS Excellence Score and its sub-components, the PQS Effectiveness Score and the PQS Efficiency Score, is outlined.



Legacy Enablers assigned to ICH Q10 categories:
Management Responsibilities: 8 | Knowledge Management: 6 | CAPA System: 0 | Quality Risk Management: 2 | Process Perf. & Product Quality Monitoring System: 9 | Change Management System: 1 | Management Review: 1

Figure 7: Number of legacy PPSM Enablers assigned per ICH Q10 category

Newly derived enablers assigned to ICH Q10 categories:
Management Responsibilities: 6 | Knowledge Management: 6 | CAPA System: 6 | Quality Risk Management: 6 | Process Perf. & Product Quality Monitoring System: 6 | Change Management System: 6 | Management Review: 8

Figure 8: Newly derived enablers per ICH Q10 category

4.2.1 Operationalization of PQS Effectiveness

During the first year of this Quality Metrics research project, the effectiveness of an organization's PQS was defined as its ability to provide high-quality drugs while ensuring a high level of delivery capability. As this ability depends on stable production, on the reliability of delivery from suppliers, as well as on the consistent achievement of a certain level of quality of incoming materials, PQS Effectiveness was defined as an aggregated score comprising the two performance categories Operational Stability and Supplier Reliability (cf. chapter 4.2). Operational Stability represents the most significant category of PQS Effectiveness and is calculated out of eight metrics. Supplier Reliability, on the other hand, comprises two metrics. In order to receive a meaningful PQS Effectiveness Score for a plant, the research team defined that at least four out of the eight underlying Operational Stability metrics and both Supplier Reliability metrics must be available. Table 1 depicts the underlying metrics of the PQS Effectiveness Score calculation. The upward or downward facing arrows indicate whether higher or lower values are considered better for the PQS Effectiveness Score aggregation.

The following understanding of Operational Stability and Supplier Reliability has already been outlined in the final research report year 1 (Friedli et al., 2017). However, as both are critical for the analysis in this report (cf. chapter 5), and as this report is intended as a stand-alone report, they are introduced again here.

Operational Stability (OS)

In the PPSM, Operational Stability equates to the provision of capable and reliable processes and equipment. Referring to the Sand Cone Model (Ferdows & De Meyer, 1990), Operational Stability embodies the core capabilities of Quality and Dependability. The importance of robust manufacturing processes was highlighted in the ICH Quality Implementation Working Group on Q8/Q9/Q10 Questions & Answers document, which outlines the potential benefits of implementing an effective PQS as follows: "Facilitated robustness of the manufacturing process, through facilitation of continual improvement through science and risk-based post approval change processes; further reducing risk of product failure and incidence of complaints and recalls thereby providing greater assurance of pharmaceutical product consistency and availability (supply) to the patient" (FDA, 2011).

Operational Stability Score

The Operational Stability Score is calculated as an average of the relative values of the metrics shown in Table 2.

Supplier Reliability (SR)

According to the ICH Q10 guideline, the pharmaceutical quality system also extends to the control and review of any outsourced activities and the quality of purchased materials. The PQS is responsible for implementing systematic processes which ensure the control of outsourced activities and the quality of all purchased material. This includes the assessment of the suitability and competence of any other third party prior to outsourcing operations or selecting material suppliers. It also requires the establishment of a clear definition of responsibilities for all quality-related activities of any involved parties and for the monitoring of the quality of incoming material (FDA, 2009). In order to assess the reliability of external suppliers, represented by the PPSM Supplier Reliability Score, the research team uses the following metrics from the St.Gallen OPEX Benchmarking: Service Level Supplier, which is a measurement of the supplier's ability to deliver on time, and Complaint Rate Supplier, which is a measurement of the supplier's ability to deliver products of high quality.14

Supplier Reliability Score

The Supplier Reliability Score is calculated as an average of the relative values of the metrics depicted in Table 3, namely Complaint Rate (Supplier) and Service Level Supplier.
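The aggregation logic and the data-availability rule of this chapter can be summarized in a short sketch. The following Python snippet is illustrative only: the function and metric names are hypothetical, the input values are assumed to be the direction-adjusted "relative values" rescaled to a 0-1 range, and the unweighted averaging of the two category scores is an assumption, as the report does not state explicit weights.

```python
from statistics import mean

# Metric names follow Table 1; input values are assumed to be already
# direction-adjusted relative values on a 0-1 scale (higher = better).
OS_METRICS = ("unplanned_maintenance", "oee", "rejected_batches",
              "deviations_per_batch", "yield", "scrap_rate",
              "release_time", "deviation_closure_time")
SR_METRICS = ("complaint_rate_supplier", "service_level_supplier")

def pqs_effectiveness(scores: dict[str, float]) -> float | None:
    """Aggregate relative metric scores into a PQS Effectiveness Score.

    Availability rule (cf. chapter 4.2.1): at least four of the eight
    Operational Stability metrics and both Supplier Reliability metrics
    must be available, otherwise no meaningful score is computed.
    """
    os_vals = [scores[m] for m in OS_METRICS if m in scores]
    sr_vals = [scores[m] for m in SR_METRICS if m in scores]
    if len(os_vals) < 4 or len(sr_vals) < 2:
        return None
    os_score = mean(os_vals)  # Operational Stability Score (average of relative values)
    sr_score = mean(sr_vals)  # Supplier Reliability Score (average of relative values)
    return mean([os_score, sr_score])  # unweighted average: an assumption
```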

14. In order to describe Supplier Reliability the research team used all available measures from the benchmarking questionnaire that help assess supplier relia-
bility. However, overall supplier reliability could encompass more aspects than the ones captured with the benchmarking questionnaire.



Lab Robustness

In the PPSM, QC Lab Robustness is positioned on the same level as Supplier Reliability and Operational Stability (cf. Figure 6). However, it is not integrated into the PQS Effectiveness calculation. Whereas in year 2 of the St.Gallen research for FDA the focus was set on integrating Lab Robustness into PQS Effectiveness (Friedli, Köhler, Buess, Basu, et al., 2018), in the enhanced PPSM Lab Robustness is increasingly seen as a moderator. Consequently, different from Supplier Reliability and Operational Stability, Lab Robustness is not considered in the PQS Effectiveness calculation in this research. The research team has found first evidence for a non-linear relation between Lab Robustness and PQS Effectiveness. In the future, the focus is set on analyzing Lab Robustness as a moderator for the relation between PQS Effectiveness and PQS Efficiency.15

Nevertheless, in the PPSM Lab Robustness stays on the same layer as Supplier Reliability and Operational Stability, as the model is based on systems thinking with a holistic understanding of PQS Excellence. It therefore includes the pharmaceutical value chain on one level, comprising Supplier Reliability, Operational Stability and Lab Robustness. This meets the understanding of pharmaceutical end-to-end value creation from an operational lens.

Lab Robustness represents the effectiveness of the QC lab in delivering robust test results. It covers process quality and service/delivery-related aspects of the QC operations to generate accurate test results. The efficiency of QC labs is not covered in the dimension Lab Robustness. QC efficiency (i.e. direct and indirect QC labor cost) is covered in the PQS Efficiency definition in the "roof" of the PPSM (cf. Figure 6). In this way, QC efficiency and QC effectiveness are independently assessed as part of the PPSM. Following the cumulative logic of Ferdows and De Meyer (Ferdows & De Meyer, 1990), Lab Robustness (effectiveness) builds one of the fundaments for PQS Efficiency (incl. lab efficiency). Table 4 depicts the performance indicators that are considered for Lab Robustness. All performance indicators in Table 4 relate to process stability and to the impact of discrepancies from the routine that result in additional, unplanned effort for the QC lab.

Metric | Unit | Better if | Aggregation 1 | Aggregation 2
Complaint Rate Supplier | % | ⬇ | Supplier Reliability Score | Aggregated PQS Effectiveness Score
Service Level Supplier | % | ⬆ | Supplier Reliability Score | Aggregated PQS Effectiveness Score
Unplanned Maintenance | % | ⬇ | Operational Stability Score | Aggregated PQS Effectiveness Score
OEE | % | ⬆ | Operational Stability Score | Aggregated PQS Effectiveness Score
Rejected batches | % | ⬇ | Operational Stability Score | Aggregated PQS Effectiveness Score
Deviations per batch | Number/batch | ⬇ | Operational Stability Score | Aggregated PQS Effectiveness Score
Yield | % | ⬆ | Operational Stability Score | Aggregated PQS Effectiveness Score
Scrap Rate | % | ⬇ | Operational Stability Score | Aggregated PQS Effectiveness Score
Release Time | Working days | ⬇ | Operational Stability Score | Aggregated PQS Effectiveness Score
Deviation Closure Time | Working days | ⬇ | Operational Stability Score | Aggregated PQS Effectiveness Score

Table 1: Calculation of the aggregated PQS Effectiveness Score

Metric | Measured Aspect
Unplanned Maintenance | Equipment stability
OEE | Equipment stability
Rejected batches | Process stability/product quality
Deviations per batch | Process stability/product quality
Yield | Process stability
Scrap Rate | Process stability
Release Time | Delivery stability
Deviation Closure Time | Delivery stability

Table 2: Overview of Operational Stability Metrics and Purpose of Measure

Metric | Measured Aspect
Complaint Rate Supplier | Quality of input material
Service Level Supplier | Reliability of supply

Table 3: Overview of Supplier Reliability Metrics and Purpose of Measure

Indicator | Unit
Adherence to Lead Time | %
Adherence to Schedule | %
Analytical Right First Time | %
Customer Complaint Investigation Rate | No./100,000 Tests
Invalidated OOS Rate | No./100,000 Tests
Lab CAPAs Overdue | %
Lab Deviation Rate | No./1,000 Tests
Lab Investigation Rate | No./1,000 Tests
Product Re-Tests due to Complaints | %
Recurring Lab Deviations | %

Table 4: Performance Indicators of Lab Robustness

15. A moderator is a variable that affects a relation between two dimensions A and B. It can affect the direction and/or strength of the relation between A and
B. A moderator analysis to understand the moderating effects of Lab Robustness on the relation between PQS Effectiveness and PQS Efficiency is part of
on-going research that will be published in the future.



4.2.2 Operationalization of PQS Efficiency

To comprehensively assess operational efficiency, the revised PQS Efficiency Score is calculated as the inverse of the "conversion cost per batch", thereby ensuring that lower cost values are considered better. Conversion costs equal the total production cost, or Cost of Goods Sold (COGS), minus material cost. Thus, on the one hand, this operationalization has the advantage that it excludes material cost and therefore allows an operations-focused assessment of cost efficiency. On the other hand, it represents an exhaustive measure that allows a comprehensive efficiency assessment of the quality system. Table 5 depicts the components of conversion cost.

In order to avoid comparing absolute costs, which are dependent on the company size and the volume of produced goods, and to create a comparable PQS Efficiency measure, the cost component has to be normalized. For this purpose, different alternative divisors were evaluated, such as the total number of FTEs, the number of FTEs in production, or the number of batches produced. The research team identified the number of batches produced as the best suited divisor, because it is not influenced by the number of FTEs and therefore allows comparison of both highly and less automated plants. Consequently, the conversion cost item is divided by the number of batches produced.

Operationalization of the PQS Efficiency Score:

\[
\text{PQS Efficiency Score} = \left(\frac{\text{Conversion Cost}}{\text{Number of Batches Produced}}\right)^{-1}, \qquad \text{Conversion Cost} = \text{COGS} - \text{Material Cost}
\]

As the PQS Efficiency Score is calculated from the conversion cost and the number of batches produced, it can be influenced by a variety of factors that affect either the denominator or the numerator. One example is smaller batch sizes, which would improve the denominator. At the same time, the additional number of required changeovers could also impact the numerator. While it would be value adding to study the impact of factors such as lot size on the PQS Efficiency Score, their impact can only be quantified based on the available data sample. Consequently, other factors than the ones discussed in the following might also have an impact on the PQS Efficiency Score.
more effective and more efficient, consistently show significantly
higher overall enabler implementation levels. For the promising
cluster, the research team hypothesizes that these are likely plants,
4.3 Drawing from Seminal Work in which have started their OPEX journey more recently and may not
yet have translated their above median enabler implementation
Operations Management into increased performance.

In their seminal work, Voss et al. examine competitiveness by fo- The future oriented character of these hypotheses complicates a
cusing on the relation between operational performance and OPEX reliable verification or falsification. Furthermore, the available data
practice implementation (Voss et al., 1995). The study aims, among does neither provide insights concerning when an OPEX program
others, at identifying how companies benefit from adopting suc- was launched at a plant nor does it allow adequate longitudinal
cessful practices from key areas, such as quality, production, prod- analysis of performance developments. Nevertheless, two plausi-
uct development. For this purpose, performance is operationalized ble explanations can be made based on the available data from the
in terms of productivity, market and financial measures. Voss et al. St.Gallen OPEX Benchmarking database.
suggest that higher practice implementation will lead to increased To identify why won’t go the distance plants are vulnerable to per-
levels of operational performance, which will in turn result in su- formance drops, the research team draws on OPEX literature,
perior business performance (Voss et al., 1995). In a similar fashion, which stresses the importance of empowered employees for sus-
the research team has replicated the logic of Voss et al. by corre- tainable performance (Zhang, L.; Narkhede, B. E.; Chaple, 2017).
lating enabler implementation and PQS Excellence based on the When comparing the reasons for launching OPEX of won’t go the
established St.Gallen OPEX Benchmarking database (cf. Figure 9). distance plants with the reasons of contenders/world class plants, the
Based on the level of practice implementation and performance, analysis shows significant differences in only three of the thirteen
Voss et al. distinguish six categories of plants (Voss et al., 1995), of surveyed reasons, namely to empower employees, to transition
which the research team adopts the four categories ‘punchbags’, towards process organization and to increase cost awareness (cf.
‘won’t go the distance’, ‘promising’ and ‘contenders/world class’. Table 6). A subsequent statistical analysis comparing the level of



enabler implementation in the category employee involvement and continuous improvement (EI&CI) reveals significant differences between won't go the distance plants and contenders/world class plants. The analysis supports the view that won't go the distance plants indeed lack critical capabilities for sustainable performance. On the other hand, the statistical analysis comparing promising plants with contenders/world class plants shows neither significant differences in the respective OPEX objectives nor in the analyzed structural factors, such as the level of automation. This supports the hypothesis that promising plants do not show significant differences regarding other factors than performance.

The tables depicted below show the results of the statistical analysis concerning the reasons for launching OPEX. To better understand the categories won't go the distance and promising and to validate the hypotheses outlined above, future research should apply further longitudinal research to study the relation between OPEX practice implementation and performance, with a focus on promising and won't go the distance plants.

Cost item | Components
Direct labor cost | Production, QC lab testing, in-process testing
Indirect labor cost | Incoming inspection, QC equipment maintenance, QA, maintenance, supply chain
Cost for machines/tools | Operating cost, depreciation, amortization (CAPEX)
Cost for property/plants | Operating cost, depreciation, amortization (CAPEX)
Corporate allocations | Expenses related to owning or operating properties
Other cost | Other cost not covered by the categories above, such as inventory write-off cost

Table 5: Overview of conversion cost components

Reason | A: n, mean (SD) | B: n, mean (SD) | Sig. (2-tailed) | t-value
Meet FDA regulations | 21, 2.52 (1.436) | 9, 2.78 (1.787) | 0.683 | -0.413
Transition from functional organization to process organization | 22, 3.23 (1.270) | 9, 2.11 (1.269) | 0.034 | 2.222
Increase employee empowerment | 22, 4.45 (0.739) | 9, 3.67 (1.118) | 0.028 | 2.315
Increase employee involvement | 22, 4.64 (0.581) | 9, 4.22 (0.667) | 0.095 | 1.727
Increase cost awareness | 22, 4.50 (0.673) | 9, 3.56 (1.333) | 0.013 | 2.639
Initiate cultural change for continuous improvement | 22, 4.64 (0.727) | 9, 4.78 (0.441) | 0.593 | -0.541
Introduce standardized methodologies | 22, 4.45 (0.739) | 9, 4.11 (0.782) | 0.257 | 1.156
Fulfill site targets | 21, 4.29 (0.845) | 8, 4.63 (0.744) | 0.328 | -0.996

(A) Contenders/world class; (B) Won't go the distance

Table 6: Comparison of reasons for launching OPEX (excerpt of selected reasons): 'Contenders/world class' vs. 'Won't go the distance'

Enabler category | A: n, mean (SD) | B: n, mean (SD) | Sig. (2-tailed) | t-value
Employee involvement and continuous improvement (EI&CI) | 22, .633 (.222) | 9, .428 (.253) | 0.032 | 2.246

(A) Contenders/world class; (B) Won't go the distance

Table 7: Comparison of enabler implementation levels for category 'EI&CI': 'Contenders/world class' vs. 'Won't go the distance'



Reason | A: n, mean (SD) | C: n, mean (SD) | Sig. (2-tailed) | t-value
Meet FDA regulations | 21, 2.52 (1.436) | 15, 2.87 (1.598) | 0.505 | -0.674
Transition from functional organization to process organization | 22, 3.23 (1.270) | 15, 3.27 (1.280) | 0.927 | -0.092
Increase employee empowerment | 22, 4.45 (0.739) | 15, 4.33 (0.617) | 0.604 | 0.523
Increase employee involvement | 22, 4.64 (0.581) | 15, 4.67 (0.617) | 0.880 | -0.152
Increase cost awareness | 22, 4.50 (0.673) | 15, 4.20 (1.014) | 0.286 | 1.084
Initiate cultural change for continuous improvement | 22, 4.64 (0.727) | 15, 4.47 (1.060) | 0.566 | 0.579
Introduce standardized methodologies | 22, 4.45 (0.739) | 15, 4.40 (0.632) | 0.817 | 0.233
Fulfill site targets | 21, 4.29 (0.845) | 15, 3.73 (1.387) | 0.147 | 1.484

(A) Contenders/world class; (C) Promising

Table 8: Comparison of reasons for launching OPEX (excerpt of selected reasons): 'Contenders/world class' vs. 'Promising'

Factor | A: n, mean (SD) | C: n, mean (SD) | Sig. (2-tailed) | t-value
Inventory level (days on hand) | 21, 76.9 (50.1) | 13, 189.3 (184.8) | .051 | -2.144
% of manually operated machines | 22, 0.3 (0.4) | 15, 0.3 (0.3) | .638 | 0.475
# of Total FTEs | 21, 398.8 (405.1) | 16, 329.8 (169.5) | .488 | 0.703
# of different market products produced | 22, 62.3 (104.4) | 16, 25.4 (19.3) | .119 | 1.619
# of different SKUs | 18, 280.8 (231.1) | 10, 257.3 (493.1) | .889 | 0.142
# of new drug introductions in last 3 years | 22, 15.6 (17.2) | 16, 9.0 (9.6) | .139 | 1.516
# of SKUs at site in last 3 years | 21, 139.9 (183.3) | 13, 49.1 (68.6) | .050 | 2.050

(A) Contenders/world class; (C) Promising

Table 9: Analysis of differences for selected structural factors: 'Contenders/world class' vs. 'Promising'



5 GOOD QUALITY – GOOD BUSINESS



This research sets out to investigate whether a positive correlation between Quality and Efficiency, as suggested by seminal work from Deming (Deming, 1986) and Ferdows (Ferdows & De Meyer, 1990), can be shown based on St.Gallen OPEX Benchmarking data. To study the relation between Quality and Efficiency based on the PPSM, first, the research team builds on the operationalization of PQS Excellence described in chapter 4.2. Additionally, this research intends to uncover what differentiates excellent plants in terms of PQS Excellence from those that show low scores for both PQS Effectiveness and PQS Efficiency. In a first step, the usable data sample will be briefly introduced (cf. chapter 5.1). In a second step, a high-level overview of the relation between effectiveness and efficiency will be provided (cf. chapter 5.2). As PQS Excellence comprises PQS Effectiveness and PQS Efficiency, in a third step, four groups of plants are identified based on the respective levels of effectiveness and efficiency (cf. chapter 5.3). Subsequently, differences between efficient and less efficient plants, respectively between effective and less effective plants, are identified and outlined in the fourth and fifth steps (cf. chapters 5.4 and 5.5). Finally, differences between excellent and the worst performing plants are identified and outlined (cf. chapter 5.6). Consequently, in the following, the terms excellence, effectiveness and efficiency refer to PQS Excellence, PQS Effectiveness and PQS Efficiency respectively.

5.1 Data Sample

The data sample that is used in the following is based on the St.Gallen OPEX Benchmarking database and includes all plants for which a PQS Efficiency and a PQS Effectiveness Score can be calculated (cf. chapter 4.2). The usable sample size equals n=62 and comprises three categories of plants: first, plants with formulation and packaging operations (n=37); second, plants with API, formulation and packaging operations (n=4); and third, plants with API production only (n=21). The sample does not include sites that primarily do packaging. Therefore, the number of formulation and API batches produced represents the most suitable divisor to calculate the PQS Efficiency measure (cf. chapter 4.2.2). Table 10 provides an overview of the relevant site types for the following analysis.

5.2 Relation between effectiveness and efficiency

To analyze the relation between the revised PQS Efficiency Score and PQS Effectiveness, a correlation analysis is performed. The correlation analysis shows a slightly positive correlation between the PQS Effectiveness Score and the revised PQS Efficiency Score (cf. chapter 4.2.2). The results suggest that higher effectiveness also results in higher efficiency. However, PQS Effectiveness only explains around 4% of the variation in PQS Efficiency. Figure 10 shows four linear regression lines, one representing the entire sample and three representing each of the three relevant site types outlined in chapter 5.1. In addition to the overall positive correlation, the plot also shows a positive correlation for all three site types individually.

While the correlation analysis reveals a slightly positive correlation between effectiveness and efficiency, the rather low R²-value indicates that, to understand differences between high and low performing plants in terms of PQS Excellence, additional factors beyond effectiveness and efficiency need to be investigated. In the OPEX literature, the implementation of OPEX practices represents the primary driver of operational performance (Cua, McKone, & Schroeder, 2001; Flynn, Sakakibara, & Schroeder, 1995; Voss et al., 1995). These OPEX practices are referred to as enablers in the St.Gallen OPEX Benchmarking (cf. chapter 4). Thus, to understand whether the two clusters are doing things differently, the research team focused on analyzing to what degree the two clusters show different enabler implementation levels in the enabler categories of the St.Gallen OPEX Benchmarking, namely Total Productive Maintenance (TPM), Total Quality Management (TQM), Just-in-Time Production (JIT), Effective Management System (EMS) as well as Basic Elements (BE) (cf. Figure 3). Additionally, a focus is set on differences in operating context, such as size of plant, headcount structure or level of automation.

5.3 Distinguishing four groups of plants by levels of effectiveness and efficiency

To identify truly excellent plants, the research team analyzes what distinguishes excellent plants from those that are either efficient (PQS Efficiency) only, effective (PQS Effectiveness) only, or neither efficient nor effective. Thus, to better understand differences between high and low performing plants, four groups of plants are distinguished based on their respective levels of efficiency and effectiveness.

5.3.1 Research Approach

To understand differences between high and low performing plants, in a first step, a two-step cluster analysis is conducted. The cluster analysis uses PQS Effectiveness and PQS Efficiency as continuous cluster variables. Furthermore, the final number of clusters is not predetermined by the research team. Even though the clustering is based on the two variables, interestingly, the clustering nearly separates the sample across the PQS Efficiency median (cf. Figure 11). A subsequent comparison of the standard deviations shows differences between PQS Efficiency (0.228) and PQS Effectiveness (0.112). For a more detailed analysis, in a second step, each cluster is further broken down across the PQS Effectiveness median into bottom/top sub-samples, resulting in four groups with different levels of effectiveness and efficiency. Figure 11 presents the two clusters, Cluster A (yellow) and Cluster B (blue), which are further broken down across the PQS Effectiveness median into groups G1/G2 and G3/G4 (see the sketch below Table 10).

The four groups form the basis to investigate the differences between high and low performing plants with regard to effectiveness and efficiency. Therefore, the four groups are analyzed not only regarding differences in levels of effectiveness and efficiency, but also regarding differences in other factors, such as enabler implementation or operating context.

Relevant site types in sample | Number of batches produced
Formulation + Packaging | Number of Formulation batches
API | Number of API batches
Formulation + API + Packaging | Number of Formulation + API batches

Table 10: Overview of relevant site types in sample
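The grouping procedure of chapter 5.3.1 can be sketched in a few lines of Python. KMeans with k=2 is used here as a plain substitute for the two-step cluster analysis applied in the report (which, unlike this sketch, also determines the number of clusters itself); all names are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_and_group(effectiveness: np.ndarray, efficiency: np.ndarray) -> np.ndarray:
    """Assign each plant to one of the four groups G1-G4 (cf. Figure 11).

    KMeans with k=2 stands in for the two-step cluster analysis; each
    cluster is then split across the PQS Effectiveness median.
    """
    X = np.column_stack([effectiveness, efficiency])
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    # Call the on-average more efficient cluster "Cluster A".
    mean0 = efficiency[labels == 0].mean()
    mean1 = efficiency[labels == 1].mean()
    in_a = labels == (0 if mean0 > mean1 else 1)
    high_eff = effectiveness > np.median(effectiveness)
    # G1: Cluster A, top effectiveness half; G2: Cluster A, bottom half;
    # G4: Cluster B, top half; G3: Cluster B, bottom half.
    return np.where(in_a & high_eff, "G1",
           np.where(in_a, "G2",
           np.where(high_eff, "G4", "G3")))
```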



Figure 10: Relation between PQS Effectiveness and PQS Efficiency
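The correlation analysis behind Figure 10 can be reproduced in outline with a simple linear regression. A minimal sketch, assuming per-plant scores are available as arrays (the numbers below are placeholders, not the study data):

```python
import numpy as np
from scipy import stats

def effectiveness_vs_efficiency(effectiveness: np.ndarray, efficiency: np.ndarray):
    """Regress PQS Efficiency on PQS Effectiveness; an R² of roughly 0.04
    would correspond to the ~4% of explained variation reported in 5.2."""
    res = stats.linregress(effectiveness, efficiency)
    return res.slope, res.intercept, res.rvalue ** 2

slope, intercept, r_squared = effectiveness_vs_efficiency(
    np.array([0.42, 0.55, 0.61, 0.48, 0.70]),  # placeholder effectiveness scores
    np.array([0.31, 0.58, 0.44, 0.52, 0.49]),  # placeholder efficiency scores
)
```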

5.3.2 Results

The two-step cluster analysis reveals two similarly sized clusters with comparable effectiveness, but with one cluster (Cluster A), on average, being more efficient than the other (Cluster B). Breaking down each cluster across the PQS Effectiveness median into bottom/top sub-samples results in the previously mentioned four groups, of which group G1 depicts the highest performing sub-sample, based on the assumption that excellence must be built on high effectiveness. On the other hand, group G3 represents the lowest performing group under consideration of both dimensions (cf. Figure 11). Accordingly, in this chapter, plants from group G1 are referred to as 'excellent' plants, while plants from group G3 represent the 'worst performing plants'.

5.3.3 Conclusion

Breaking down the two clusters (Cluster A and Cluster B) across the PQS Effectiveness median results in four distinct groups of plants (G1 to G4) that can be further analyzed, on the one hand, to understand why quality is good business and whether effectiveness can be viewed as the basis for efficiency. Accordingly, the four groups can be analyzed with regard to differences in effectiveness or efficiency. On the other hand, the four groups can be used to understand differences between high and low performing plants in terms of PQS Excellence. For both objectives, further analysis needs to look beyond differences in PQS performance, as underlined by the fact that the two clusters show different PQS Efficiency but almost similar PQS Effectiveness. Thus, effectiveness and efficiency alone are insufficient to identify the key levers of PQS Excellence.

5.4 Difference between Cluster A and Cluster B

The analyses reveal significant differences between efficient and less efficient plants. In the following, the research approach and the key differences will be outlined. Subsequently, the conclusions will be derived.

5.4.1 Research Approach

To study differences between efficient and less efficient plants, the following analyses build on the results of the aforementioned cluster analysis. As the cluster analysis identified two clusters which almost split the sample across PQS Efficiency in half, these same clusters can be used to investigate differences between efficient and less efficient plants (cf. chapter 5.3.2). First, to investigate whether the difference in PQS Efficiency is significant, an independent-samples t-test is conducted, comparing plants from Cluster A with plants from Cluster B with regard to PQS Efficiency. Second, in addition to investigating differences in PQS Efficiency, the t-test is also applied to identify differences in other factors, such as levels of enabler implementation or operating context.
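The comparison described here corresponds to a standard independent-samples t-test, e.g. via SciPy. Whether equal variances were assumed is not stated in the report; `scipy.stats.ttest_ind` defaults to the equal-variance form. A sketch, with hypothetical variable names:

```python
from scipy import stats

def compare_clusters(scores_a, scores_b):
    """Independent-samples t-test comparing Cluster A and Cluster B,
    e.g. on PQS Efficiency; returns the t-value and the two-tailed
    significance as reported in Tables 11-15."""
    t_value, p_two_tailed = stats.ttest_ind(scores_a, scores_b)
    return t_value, p_two_tailed
```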
5.4.2 Results

Differences between efficient and less efficient plants are manifold and include differences in enabler implementation levels, PQS Effectiveness and operating context. In the following, the main differences between efficient and less efficient plants are outlined.

Differences between Cluster A and Cluster B in PQS Effectiveness performance



[Figure 11 plots the plants by PQS Effectiveness (x-axis) against PQS Efficiency (y-axis). The two-step cluster analysis yields Cluster A and Cluster B; the PQS Effectiveness median (x) splits them into the groups G2/G1 (upper half) and G3/G4 (lower half).]

Figure 11: Results of clustering and subsequent grouping of plants

PQS Effectiveness metric | A: n, mean (SD) | B: n, mean (SD) | Sig. (2-tailed) | t-value
Unplanned Maintenance | 31, .533 (.277) | 25, .469 (.285) | .409 | 0.834
OEE | 26, .561 (.164) | 25, .586 (.319) | .728 | -0.350
Rejected batches | 32, .619 (.234) | 30, .584 (.260) | .578 | 0.559
Deviations per batch | 32, .601 (.232) | 30, .323 (.245) | .000 | 4.589
Yield | 31, .553 (.231) | 27, .598 (.314) | .546 | -0.608
Scrap Rate | 27, .670 (.245) | 20, .739 (.214) | .319 | -1.007
Release Time | 31, .591 (.258) | 28, .435 (.295) | .034 | 2.169
Deviation Closure Time | 32, .619 (.244) | 30, .447 (.268) | .010 | 2.657
Complaint Rate Supplier | 32, .482 (.298) | 30, .567 (.264) | .644 | 0.456
Service Level Supplier | 32, .567 (.286) | 30, .557 (.269) | .892 | 0.136

Table 11: Differences in PQS Effectiveness metrics between efficient and less efficient plants

PQS Efficiency component | A: n, mean (SD) | B: n, mean (SD) | Sig. (2-tailed) | t-value
Direct labor cost / batches produced | 32, .671 (.1865) | 30, .274 (.2075) | .000 | 7.955
Indirect labor cost / batches produced | 32, .619 (.1752) | 30, .324 (.2199) | .000 | 5.875
Cost for machines & tools / batches produced | 32, .644 (.1962) | 30, .304 (.2636) | .000 | 5.793
Cost for property & plant / batches produced | 32, .575 (.2196) | 30, .339 (.2859) | .001 | 3.653
Corporate allocations / batches produced | 32, .608 (.3007) | 30, .511 (.3979) | .280 | 1.089
Other cost / batches produced | 32, .659 (.2837) | 30, .434 (.3425) | .007 | 2.813

Table 12: Differences between efficient and less efficient plants concerning PQS Efficiency components



The t-test reveals that the more efficient cluster (Cluster A) on average performs better for most underlying effectiveness measures compared to plants from Cluster B. For three out of the ten underlying PQS Effectiveness KPIs, plants from Cluster A perform significantly better (cf. Table 11). The depicted scores take into account whether a higher or lower value is better for the respective metric (cf. Table 1). The scores for each metric are normalized so that each score ranges between 0 and 1; thereby, higher values consistently indicate better performance (see the sketch below). Accordingly, Table 11 shows that plants from Cluster A have significantly fewer deviations per batch, shorter release times and shorter deviation closure times at a 0.05 level. Based on the previously defined threshold for the PQS Effectiveness Score calculation (cf. chapter 4.2.1), some plants might not have provided data for all metrics depicted in Table 11. Consequently, the table shows variable sample sizes for the same clusters depending on the data availability for the respective KPIs.
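The 0-to-1, direction-aware score normalization described above could be implemented as follows. Min-max scaling across the sample is an assumption; the report only states that scores range from 0 to 1 with higher values consistently indicating better performance.

```python
import numpy as np

def relative_score(values: np.ndarray, higher_is_better: bool) -> np.ndarray:
    """Rescale a raw metric across the sample to a 0-1 score where higher
    is consistently better, flipping metrics where lower raw values are
    better (cf. the arrows in Table 1)."""
    lo, hi = np.nanmin(values), np.nanmax(values)
    scaled = (values - lo) / (hi - lo)
    return scaled if higher_is_better else 1.0 - scaled

# e.g. relative_score(release_time_days, higher_is_better=False)
```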
Differences between Cluster A and Cluster B in PQS Efficiency performance

As outlined above, plants from Cluster A are on average more efficient than plants from Cluster B. A more detailed view reveals that, in fact, the more efficient plants perform significantly better for most of the underlying PQS Efficiency components, and not just for a small number of components of the PQS Efficiency measure (cf. Table 5). The only exception is the cost for corporate allocations, which is dependent on the overall production network setup and on the degree of centralization of resources. Accordingly, higher efficiency is not only achieved by high performance for isolated conversion cost components but by high efficiency across the board, including direct labor cost, indirect labor cost, and machine and property cost per batch. Table 12 shows the results of a t-test comparing Cluster A and Cluster B concerning all underlying components of the PQS Efficiency Score (cf. chapter 4.2.2). The scores for each component are normalized so that all values range from 0 to 1, whereby a higher value indicates better cost efficiency per batch.

Differences between Cluster A and Cluster B in operating context

To understand to what degree efficient and less efficient plants are operating in a different context, the research team analyzed several factors of the operating context. Based on the available St.Gallen OPEX Benchmarking data, no significant differences are identified for the factors level of automation or company size in terms of total FTEs, for which the research team had hypothesized significant differences. However, while both clusters show similar headcount structures, efficient plants demonstrate significantly higher productivity (in terms of batches/FTE) in all analyzed direct and indirect labor areas. Table 13 shows the results of the statistical analysis comparing plants from Cluster A with plants from Cluster B. The respective scores are normalized in a way that higher scores on a range from 0 to 1 indicate higher productivity in terms of batches/FTE.

The analyses further reveal that the efficient plants are both more effective and more efficient while having significantly lower inventory levels compared to the less efficient plants. Inventory levels are operationalized as the average inventory less write-downs, multiplied by 365 and divided by the COGS (see the formula below). The more efficient plants also launched significantly more SKUs in the last three years. Table 14 shows the actual average values for all depicted metrics, which reveals that plants from Cluster A have on average inventory levels of 71.9 days, compared to 189.9 days for Cluster B.
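In formula form, the inventory operationalization above reads:

\[
\text{Inventory level (days on hand)} = \frac{(\text{Average Inventory} - \text{Write-downs}) \times 365}{\text{COGS}}
\]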

The analyses further reveal that efficient plants are both more ef-
fective and efficient while having significantly lower inventory lev-
5.5.1 Research Approach
els compared to less efficient plants. Inventory levels are operation- To study potential differences between effective and less effective
alized as the average Inventory less write downs multiplied by 365 plants the sample is broken down across PQS Effectiveness medi-
and divided by the COGS. The more efficient plants also launched an as outlined in chapter 5.3. To identify key differences between
significantly more SKUs in the last three years. Table 14 shows the effective and less effective plants, an independent sample T-test is
actual average values for all depicted metrics, which reveals that conducted subsequently, comparing the two resulting sub-samples
plants from Cluster A have on average inventory levels of 71.9 days (G2, G3) with (G1, G4) (cf. Figure 11).
compared to 189.9 days for Cluster B.
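The group comparisons in this chapter all follow the same recipe: split the sample at the median of a score and compare the two halves with an independent-samples T-test. The following minimal sketch (Python with pandas and SciPy) illustrates the mechanics; the column names and figures are hypothetical and not taken from the benchmarking database.

```python
import pandas as pd
from scipy import stats

def median_split_ttest(df: pd.DataFrame, split_col: str, metric: str):
    """Split plants at the median of split_col and T-test metric between halves."""
    median = df[split_col].median()
    upper = df.loc[df[split_col] >= median, metric].dropna()
    lower = df.loc[df[split_col] < median, metric].dropna()
    # Independent-samples T-test; pass equal_var=False for the Welch variant.
    t, p = stats.ttest_ind(upper, lower)
    return t, p

# Hypothetical example: is 'deviations per batch' lower in the effective half?
plants = pd.DataFrame({
    "pqs_effectiveness":    [0.61, 0.44, 0.70, 0.39, 0.55, 0.48],
    "deviations_per_batch": [0.2, 1.4, 0.1, 1.9, 0.6, 1.1],
})
t, p = median_split_ttest(plants, "pqs_effectiveness", "deviations_per_batch")
print(f"t = {t:.2f}, p = {p:.3f}")
```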

Productivity component | Cluster A: n, mean (SD) | Cluster B: n, mean (SD) | Sig. (2-tailed) | t-value
FTE Direct labor / # of batches produced | 31, .572 (.231) | 30, .283 (.197) | .000 | -5.253
FTE Indirect labor / # of batches produced | 31, .582 (.244) | 30, .271 (.204) | .000 | -5.373

Table 13: Differences in productivity in terms of FTE/Batch

Operating context factor | Cluster A: n, mean (SD) | Cluster B: n, mean (SD) | Sig. (2-tailed) | t-value
Inventory level (days on hand) | 31, 71.9 (56.4) | 25, 189.9 (175.5) | .001 | -3.231
% of manually operated machines | 32, 40 (40) | 29, 30 (30) | .303 | 1.039
# of Total FTEs | 31, 450.9 (441.3) | 30, 341.9 (156.8) | .204 | 1.294
# of different market products produced | 32, 55.7 (87.5) | 30, 27.5 (45.4) | .115 | 1.607
# of different SKUs | 22, 258.2 (219.4) | 19, 177.9 (370.7) | .415 | 0.827
# of new drug introductions in last 3 years | 32, 22.0 (48.4) | 30, 5.5 (7.3) | .066 | 1.900
# of SKU at site in last 3 years | 31, 114.3 (156.5) | 23, 30.3 (54.5) | .017 | 2.770

Table 14: Differences between efficient and less efficient plants concerning operating context
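As a worked example of the inventory operationalization described above (average inventory less write-downs, multiplied by 365 and divided by annual COGS), the short Python sketch below uses invented figures; none of the numbers stem from the benchmarking sample.

```python
def inventory_days_on_hand(avg_inventory: float, write_downs: float,
                           annual_cogs: float) -> float:
    """Average inventory less write-downs, expressed in days of annual COGS."""
    return (avg_inventory - write_downs) / annual_cogs * 365

# Invented example: 12.0 MUSD average inventory, 0.5 MUSD write-downs and
# 55.0 MUSD annual COGS yield roughly 76.3 days on hand.
print(round(inventory_days_on_hand(12.0, 0.5, 55.0), 1))
```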

Enabler category | Cluster A: n, mean (SD) | Cluster B: n, mean (SD) | Sig. (2-tailed) | t-value
Overall Enabler | 32, .639 (.274) | 30, .487 (.298) | .039 | 2.105
Total Productive Maintenance (TPM) | 32, .589 (.313) | 30, .293 (.293) | .832 | -0.213
Total Quality Management (TQM) | 32, .265 (.265) | 30, .271 (.271) | .162 | 1.415
Just-in-Time Production (JIT) | 32, .579 (.284) | 30, .374 (.287) | .006 | 2.830
Effective Management System (EMS) | 32, .599 (.279) | 30, .470 (.284) | .076 | 1.808
Basic Elements (BE) | 32, .579 (.288) | 30, .435 (.311) | .064 | 1.893

Table 15: Differences between efficient and less efficient plants concerning enabler implementation
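The 0-to-1 scores reported in Tables 12 to 15 (and later in the headcount tables) rest on a normalization in which a higher value is always the better one. The exact scaling used in the benchmarking is not disclosed in this report, so the following sketch assumes a simple min-max scaling, inverted for "lower is better" raw metrics such as cost or FTEs per batch.

```python
import pandas as pd

def normalize(series: pd.Series, higher_is_better: bool = True) -> pd.Series:
    """Min-max scale to [0, 1]; invert when a low raw value is the good one."""
    scaled = (series - series.min()) / (series.max() - series.min())
    return scaled if higher_is_better else 1.0 - scaled

# Invented cost-per-batch figures: the cheapest plant ends up with score 1.0.
cost_per_batch = pd.Series([120.0, 80.0, 200.0, 95.0])
print(normalize(cost_per_batch, higher_is_better=False).round(2).tolist())
# -> [0.67, 1.0, 0.0, 0.88]
```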



Enabler category | Group 2+3: n, mean (SD) | Group 1+4: n, mean (SD) | Sig. (2-tailed) | t-value
Overall Enabler | 31, .482 (.297) | 31, .649 (.270) | .024 | -2.314
Total Productive Maintenance (TPM) | 31, .589 (.285) | 31, .606 (.320) | .828 | -0.017
Total Quality Management (TQM) | 31, .516 (.278) | 31, .684 (.238) | .013 | -0.168
Just-in-Time Production (JIT) | 31, .377 (.258) | 31, .584 (.310) | .006 | -0.021
Effective Management System (EMS) | 31, .476 (.306) | 31, .598 (.256) | .093 | -0.122
Basic Elements (BE) | 31, .449 (.312) | 31, .568 (.292) | .127 | -0.119

Table 16: Differences between effective and less effective plants concerning enabler implementation

Operating context factor | Group 2+3: n, mean (SD) | Group 1+4: n, mean (SD) | Sig. (2-tailed) | t-value
Inventory level (days on hand) | 28, 151.4 (157.1) | 28, 97.7 (109.0) | .144 | 1.486
% of manually operated machines | 30, 40 (30) | 31, 30 (40) | .239 | 1.188
Total FTEs | 31, 411.1 (326.5) | 30, 383.1 (348.7) | .748 | 0.323
# of different market products produced | 31, 39.3 (54.4) | 31, 44.7 (85.7) | .768 | -0.297
# of different SKUs | 21, 171.0 (203.8) | 20, 273.5 (370.8) | .285 | -1.089
# of new drug introductions in last 3 years | 31, 15.0 (48.6) | 31, 13.1 (15.8) | .840 | 0.204
# of SKU at site in last 3 years | 27, 78.9 (162.4) | 27, 78.3 (89.4) | .987 | 0.017

Table 17: Differences between effective and less effective plants concerning operating context

5.5.2 Results

The T-test reveals significant differences between effective (G1, G4) and less effective (G2, G3) plants (cf. Figure 11). The main differences are outlined in the following.

Differences between effective and less effective plants in enabler implementation

Enabler implementation is the key lever for driving effectiveness. Effective sites have significantly higher Overall as well as individual JIT and TQM enabler implementation levels, indicated by higher scores compared to less effective plants (cf. Table 16).

Differences between effective and less effective plants in operating context

In contrast, the T-test does not reveal significant differences between effective and less effective plants for factors such as the level of automation, number of FTEs, number of produced SKUs and number of drug introductions in the last three years (cf. Table 17). Accordingly, the operating context is not viewed as a major driver of effectiveness.

5.5.3 Conclusion

The findings suggest that, in order to achieve high levels of effectiveness that can ultimately lead to higher efficiency, companies should focus on enabler implementation. In contrast, the operating context plays a subordinate role in influencing effectiveness.

5.6 Difference between excellent and worst performing plants

After focusing on isolated differences in efficiency (cf. chapter 5.4) and effectiveness (cf. chapter 5.5), this chapter focuses on differences between excellent and the worst performing plants in terms of PQS Excellence. Accordingly, in the following, both performance dimensions are taken into account.

5.6.1 Research Approach

As outlined in chapter 5.3.2, plants from group G1 are defined in this report as 'excellent' since it is the only group that achieves high levels of effectiveness and efficiency concurrently. In contrast, plants from group G3 represent the worst performing plants. Consequently, differences between the two groups are identified in the following based on independent sample T-tests.

5.6.2 Results

The T-test reveals several significant differences between excellent and the worst performing plants, which are in line with the findings outlined in chapters 5.4 and 5.5. The significant differences between the two groups are presented in the following.

Differences in enabler implementation

The comparison of enabler implementation levels between the two groups supports the view of enabler implementation as the key lever for driving PQS Excellence (cf. chapters 5.4.2 and 5.5.2). The T-test shows that excellent plants have significantly higher enabler implementation levels in most enabler categories, covering both technical and social aspects of the organization. Specifically, excellent plants build upon significantly higher enabler implementation Overall as well as in the subordinate categories TQM, JIT, EMS and BE, which is indicated by higher scores (cf. Table 18).

Differences in operating context

Excellent plants achieve higher effectiveness and efficiency with significantly lower inventory levels (days on hand). They are also able to handle higher complexity, which is reflected by significantly higher numbers of different products and different SKUs produced in the last year as well as higher numbers of new drug introductions and newly launched SKUs in the last three years, while maintaining high levels of efficiency (cf. Table 19).

The worst performing plants, on the other hand, require a significantly higher share of indirect QA/QC labor, which can be an indicator for underlying stability problems (cf. Table 20).


Enabler category | Group 1: n, mean (SD) | Group 3: n, mean (SD) | Sig. (2-tailed) | t-value
Overall Enabler | 20, .653 (.279) | 19, .397 (.283) | .007 | -2.849
Total Productive Maintenance (TPM) | 20, .559 (.325) | 19, .558 (.279) | .981 | -0.023
Total Quality Management (TQM) | 20, .685 (.261) | 19, .474 (.282) | .020 | -2.427
Just-in-Time Production (JIT) | 20, .611 (.310) | 19, .282 (.228) | .001 | 3.785
Effective Management System (EMS) | 20, .607 (.261) | 19, .406 (.283) | .027 | -2.303
Basic Elements (BE) | 20, .564 (.302) | 19, .353 (.302) | .036 | -2.182

Table 18: Differences between excellent and the worst performing plants concerning enabler implementation

Operating context factor | Group 1: n, mean (SD) | Group 3: n, mean (SD) | Sig. (2-tailed) | t-value
Inventory level (days on hand) | 16, 61.6 (53.4) | 19, 198.8 (190.3) | .012 | -2.793
% of manually operated machines | 18, 30 (40) | 20, 40 (30) | .930 | -0.088
Total FTEs | 19, 427.8 (414.0) | 19, 362.7 (139.9) | .523 | 0.649
# of different market products produced | 19, 60.2 (103.6) | 20, 33.7 (55.0) | .324 | 1.003
# of different SKUs | 11, 260.4 (207.8) | 12, 94.2 (125.9) | .031 | 2.342
# of new drug introductions in last 3 years | 19, 17.0 (17.5) | 20, 5.3 (6.3) | .010 | 2.818
# of SKU at site in last 3 years | 15, 94.4 (88.7) | 19, 25.3 (32.3) | .004 | 3.144

Table 19: Differences between excellent and the worst performing plants concerning operating context

Headcount component | Group 1: n, mean (SD) | Group 3: n, mean (SD) | Sig. (2-tailed) | t-value
FTE Direct labor / Total FTE | 19, 0.401 (0.245) | 19, 0.551 (0.246) | .068 | -1.880
FTE Direct Production labor / Total FTE | 19, 0.424 (0.245) | 19, 0.563 (0.248) | .092 | -1.733
FTE Direct QC labor / Total FTE | 19, 0.387 (0.280) | 19, 0.472 (0.328) | .394 | -0.864
FTE Indirect labor / Total FTE | 19, 0.598 (0.247) | 19, 0.447 (0.247) | .067 | 1.890
FTE Indirect QA/QC labor / Total FTE | 19, 0.563 (0.218) | 19, 0.354 (0.268) | .013 | 2.631
FTE Indirect maintenance labor / Total FTE | 19, 0.527 (0.271) | 19, 0.540 (0.323) | .896 | -0.132
FTE Indirect other labor / Total FTE | 19, 0.593 (0.278) | 19, 0.539 (0.268) | .544 | 0.612

Table 20: Differences between excellent and the worst performing plants concerning headcount structure


Headcount component | Group 1: n, mean (SD) | Group 3: n, mean (SD) | Sig. (2-tailed) | t-value
FTE Direct labor / # of batches produced | 19, 0.595 (0.235) | 19, 0.289 (0.173) | .000 | 4.559
FTE Direct Production labor / # of batches produced | 19, 0.588 (0.248) | 19, 0.285 (0.171) | .000 | 4.400
FTE Direct QC labor / # of batches produced | 19, 0.600 (0.214) | 19, 0.362 (0.300) | .008 | 2.813
FTE Indirect labor / # of batches produced | 19, 0.666 (0.202) | 19, 0.288 (0.227) | .000 | 5.428
FTE Indirect QA/QC labor / # of batches produced | 19, 0.678 (0.214) | 19, 0.282 (0.248) | .000 | 5.267
FTE Indirect maintenance labor / # of batches produced | 19, 0.654 (0.207) | 19, 0.340 (0.241) | .000 | 4.301
FTE Indirect other labor / # of batches produced | 20, 0.612 (0.255) | 19, 0.323 (0.255) | .001 | 4.181

Table 21: Differences between excellent and the worst performing plants concerning employee productivity

The headcount shares depicted in Table 20 are normalized, whereby higher values indicate a lower share of FTEs for the respective component.

Finally, in addition to differences in headcount structure, excellent plants show significant differences in employee productivity in terms of FTEs per batch. A T-test reveals that excellent plants show significantly higher employee productivity in all direct and indirect areas, indicated by higher normalized values for FTEs per batch. Table 21 depicts the differences between the two groups concerning employee productivity.

5.6.3 Conclusion

The results outlined in this chapter support the view of enabler implementation as a key lever for driving PQS Excellence. Plants from group G1 excel for the enabler category Overall, but also for the enabler sub-categories TQM, JIT and EMS. In contrast, plants from group G3, which represent the worst performing group, show the lowest enabler implementation levels. Table 22 provides a descriptive overview of key characteristics of the four groups G1 to G4. For each criterion, the respective highest and lowest average values are marked with (A) and (B). The objective of this table is to provide an overview of mean values for each of the criteria discussed in this chapter and for each group; the table therefore does not focus on ranking the different groups. By definition, excellent plants show high levels of both PQS Effectiveness and PQS Efficiency. Consequently, these plants excel for most of the underlying metrics of both the PQS Effectiveness Score and the PQS Efficiency Score. More detailed information on the significance of the group differences is therefore not provided in Table 22.

The results of the descriptive analysis outlined in Table 22 show clear differences in enabler implementation levels between the best and the worst performing groups (group G1 and group G3, respectively). They also show that all groups but group G3 have similarly high Overall enabler implementation scores. A more detailed look at the underlying enabler categories, however, reveals that the more effective groups (G1 and G4) also show the highest TQM enabler implementation levels. Furthermore, the very best group (G1) separates itself from the other groups by showing the highest JIT and EMS enabler implementation levels, which underlines the importance of an effective management system and of JIT capabilities built on high Overall enabler implementation.

The results further suggest that excellent plants and the worst performing plants are structurally different. However, these differences are viewed as the outcome of PQS performance rather than as drivers. Examples include the significantly lower inventory levels and the higher number of different SKUs produced by excellent plants (cf. Table 19) as well as their lower share of indirect labor (cf. Table 20). Therefore, companies should not try to improve performance by reducing inventory or headcount without having achieved sustainable operational stability first. Since the OPEX literature emphasizes continuous improvement based on an engaged workforce as a key driver of sustainable and superior performance, companies should strive for enhanced employee involvement in CI rather than for headcount cost reduction.


Criteria | G1: n, mean (SD) | G2: n, mean (SD) | G3: n, mean (SD) | G4: n, mean (SD)
PQS Excellence | 20, 0.609 (0.092) | 12, 0.622 (A) (0.054) | 19, 0.327 (B) (0.099) | 11, 0.395 (0.045)
PQS Effectiveness | 20, 0.630 (A) (0.069) | 12, 0.488 (0.046) | 19, 0.441 (B) (0.105) | 11, 0.612 (0.032)
PQS Efficiency | 20, 0.588 (0.175) | 12, 0.756 (A) (0.092) | 19, 0.214 (0.131) | 11, 0.177 (B) (0.084)
Overall Enabler | 20, 0.653 (A) (0.279) | 12, 0.617 (0.278) | 19, 0.397 (B) (0.283) | 11, 0.641 (0.268)
Total Productive Maintenance (TPM) | 20, 0.560 (0.325) | 12, 0.639 (0.298) | 19, 0.558 (B) (0.280) | 11, 0.689 (A) (0.309)
Total Quality Management (TQM) | 20, 0.685 (A) (0.261) | 12, 0.581 (0.271) | 19, 0.474 (B) (0.282) | 11, 0.681 (0.201)
Just-in-Time Production (JIT) | 20, 0.611 (A) (0.310) | 12, 0.527 (0.238) | 19, 0.282 (B) (0.228) | 11, 0.533 (0.319)
Effective Management System (EMS) | 20, 0.607 (A) (0.261) | 12, 0.587 (0.318) | 19, 0.406 (B) (0.283) | 11, 0.582 (0.258)
Basic Elements (BE) | 20, 0.564 (0.302) | 12, 0.603 (A) (0.273) | 19, 0.353 (B) (0.302) | 11, 0.577 (0.285)

(A) highest average value in the sample
(B) lowest average value in the sample

Table 22: Overview of group characteristics


6 THE ICH Q10 ARCHITECTURE FROM A DATA PERSPECTIVE


The following chapter outlines first findings from analyzing enabler relations within the enhanced Pharmaceutical Production System Model (PPSM). A special focus is set on better understanding the impact of a higher implementation level of ICH Q10 related enablers. Chapter 6.1 depicts the interrelation between ICH Q10 related enablers and the overall PPSM enabler landscape, while chapter 6.2 discusses observable links between the implementation level of ICH Q10 related enablers and the operational performance of pharmaceutical production sites.

As data gathering for the newly derived and recently added enablers is currently ongoing, both presented analyses build on the subset of 27 legacy OPEX enablers assigned to the ICH Q10 guideline categories (see chapter 4.1).

6.1 ICH Q10 Related Enablers & Overall PPSM Enabler Landscape

The subset of 27 legacy OPEX enablers is assigned to the ICH Q10 categories based on the content addressed in the individual legacy enabler questions. As ICH Q10 is intended to be a comprehensive model (ICH, 2008), the relationship between the implementation level of the ICH Q10 related enablers and the 110 other legacy enablers is tested.

6.1.1 Research Approach

The enabler implementation level in pharmaceutical production sites is assessed as an integral part of the St.Gallen Operational Excellence Benchmarking program. Specific questions allow evaluating the implementation level of every single enabler on a 5-point Likert scale. Please see the final year 1 research report for additional details about the enabler implementation level evaluation procedure (Friedli et al., 2017).

At the point of analysis, enabler implementation level data of 358 pharmaceutical production sites is available. Summarizing the implementation levels of enabler subsets, two different average enabler implementation level scores are calculated per production site:

1. Average implementation level of the 27 legacy OPEX enablers assigned to ICH Q10 guideline categories

2. Average implementation level of the 110 legacy OPEX enablers not assigned to ICH Q10 guideline categories

To understand the relation between the ICH Q10 related enabler implementation level and the overall enhanced PPSM enabler implementation level, a scatter plot is built and the degree of determination is used to assess how much variance in the overall PPSM enabler implementation level can be explained by the ICH Q10 related enabler implementation level.

6.1.2 Results

Figure 12 depicts the relation between the average implementation level of the 27 legacy OPEX enablers assigned to ICH Q10 guideline categories on the x-axis and the average implementation level of the 110 legacy OPEX enablers not assigned to ICH Q10 guideline categories on the y-axis.

The scatter plot in Figure 12 shows a positive correlation. The degree of determination is 69.1 %. This means that 69.1 % of the variation in the implementation level of the 110 legacy OPEX enablers not assigned to ICH Q10 guideline categories can be explained by the implementation level of the enablers assigned to ICH Q10 categories.

6.1.3 Conclusion

The scatter plot shows a positive correlation and a high degree of determination, meaning that a large share of the variation in the average enabler implementation level can be explained by the average implementation level of the enablers assigned to ICH Q10 categories. The implementation level of ICH Q10 related enablers in a pharmaceutical production plant can therefore be seen as a surrogate for the overall PPSM enabler implementation level. However, relying on the ICH Q10 assigned enablers alone is not recommended, as they do not provide the full, comprehensive picture needed to derive OPEX priorities following the principles of continuous improvement.

Figure 12: ICH Q10 Enabler Implementation Level & Legacy PPSM Enabler Implementation Level not assigned to ICH Q10
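The "degree of determination" used throughout this chapter is the coefficient of determination R², i.e. the squared Pearson correlation of the two averaged scores in a simple linear fit. A minimal NumPy sketch with placeholder values:

```python
import numpy as np

def degree_of_determination(x: np.ndarray, y: np.ndarray) -> float:
    """R^2 of a univariate linear regression of y on x (squared Pearson r)."""
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2

# Placeholder per-site average implementation levels (5-point Likert scale).
ich_q10_enablers = np.array([2.1, 3.4, 2.8, 4.0, 3.1])
other_enablers   = np.array([2.3, 3.6, 2.6, 3.9, 3.3])
print(f"R^2 = {degree_of_determination(ich_q10_enablers, other_enablers):.3f}")
```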



6.2 ICH Q10 Related Enablers & Operational Performance

ICH Q10 was conceptualized as a reference model for an effective pharmaceutical quality system (ICH, 2008). The hypothesis that a higher implementation level of ICH Q10 related enablers leads to better operational performance is tested in the following chapter.

6.2.1 Research Approach

The relationship between the average implementation level of ICH Q10 assigned enablers and the PQS effectiveness of pharmaceutical production sites is analyzed. Following the approaches taken in the Year 1 and Year 2 research, Aggregated PQS Effectiveness is calculated based on Supplier Reliability (SR) and Operational Stability (OS), summarizing the KPIs Complaint Rate Supplier, Service Level Supplier, Unplanned Maintenance, OEE, Rejected Batches, Yield, Scrap Rate, Release Time and Deviation Closure Time. Please see the final year 1 research report for additional details on Aggregated PQS Effectiveness (Friedli et al., 2017). At the moment of analysis, Aggregated PQS Effectiveness and average enabler implementation levels can be calculated for 105 pharmaceutical production sites.

To understand the relation between the ICH Q10 related enabler implementation level and the Aggregated PQS Effectiveness of pharmaceutical production sites, a scatter plot is built and the degree of determination is used to assess how much variance in Aggregated PQS Effectiveness can be explained by the ICH Q10 related enabler implementation level. Furthermore, the relations between Aggregated PQS Effectiveness and the implementation levels of the enablers assigned to the ICH Q10 categories Management Responsibilities, Knowledge Management and Process Performance / Product Quality Monitoring System are analyzed individually. The other ICH Q10 categories cannot be analyzed separately, as the number of legacy OPEX enablers assigned to these categories is too small to draw robust conclusions. The results therefore have to be understood as preliminary and need to be detailed and confirmed as soon as datasets covering all ICH Q10 categories (see Appendix 1) are available for analysis.

6.2.2 Results

Figure 13 depicts the relation between the average implementation level of the 27 legacy OPEX enablers assigned to ICH Q10 guideline categories on the x-axis and Aggregated PQS Effectiveness on the y-axis.

The scatter plot in Figure 13 shows a positive correlation. The degree of determination is 9.1 %. This means that 9.1 % of the variation in Aggregated PQS Effectiveness can be explained by the implementation level of the enablers assigned to ICH Q10 categories.

Figures 14 to 16 depict the relation between the average implementation level of the legacy OPEX enablers assigned to the ICH Q10 guideline categories Management Responsibilities (Figure 14), Knowledge Management (Figure 15) and Process Performance / Product Quality Monitoring System (Figure 16) on the x-axis and the Aggregated PQS Effectiveness of pharmaceutical production sites on the y-axis.

All scatter plots presented in Figures 14, 15 and 16 show a positive correlation. The degrees of determination are 5.8 %, 7.4 % and 7.8 %. This means that 5.8 % (Management Responsibilities), 7.4 % (Knowledge Management) and 7.8 % (Process Performance / Product Quality Monitoring System) of the variation in Aggregated PQS Effectiveness can be explained by the implementation level of the enablers assigned to the individually analyzed ICH Q10 category.

6.2.3 Conclusion

A positive relation between the ICH Q10 related enabler implementation level and operational performance, measured by Aggregated PQS Effectiveness, is shown in Figure 13. The degree of determination (9.1 %) is on a comparable level to the degree of determination (11.4 %) observable when linking Aggregated PQS Effectiveness to the implementation level of the entire legacy OPEX enabler set, as analyzed in the final year 2 research report (Friedli, Köhler, Buess, Basu, et al., 2018). Thus, the implementation of ICH Q10 related enablers contributes to PQS effectiveness similarly to the implementation of all legacy OPEX enablers. Degrees of determination around 10 % are in line with previous studies investigating the relation between enabler implementation levels and operational performance, see exemplarily Ghosh (2012).

For all three individually analyzed ICH Q10 categories, Management Responsibilities, Knowledge Management and Process Performance / Product Quality Monitoring System, a positive link to Aggregated PQS Effectiveness can be shown. The variation explainable by the implementation level of the individual categories (5.8 %, 7.4 % and 7.8 %) is on a comparable level. Thus, there is no particularly striking driver across the ICH Q10 categories.

The presented analyses build on the limited datasets of legacy OPEX enablers assigned to the ICH Q10 categories. These legacy datasets are not fully representative of all ICH Q10 guideline categories. Thus, the results need to be understood as preliminary. The research team will reconfirm and further investigate the early findings as soon as an appropriate number of datasets including the newly derived questions covering the whole ICH Q10 architecture (see Appendix 1) has been collected.
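The year 1 report holds the authoritative construction of Aggregated PQS Effectiveness. Purely for illustration, the sketch below assumes an unweighted mean of min-max-normalized KPIs in which "lower is better" KPIs are inverted first; the KPI column names and the weighting are assumptions, not the report's definition.

```python
import pandas as pd

# KPIs where a lower raw value is the better one (assumed classification).
LOWER_IS_BETTER = {"complaint_rate_supplier", "unplanned_maintenance",
                   "rejected_batches", "scrap_rate", "release_time",
                   "deviation_closure_time"}

def aggregate_effectiveness(kpis: pd.DataFrame) -> pd.Series:
    """Min-max normalize each KPI across sites, invert cost-type KPIs, average."""
    scores = {}
    for col in kpis.columns:
        s = (kpis[col] - kpis[col].min()) / (kpis[col].max() - kpis[col].min())
        scores[col] = 1.0 - s if col in LOWER_IS_BETTER else s
    return pd.DataFrame(scores).mean(axis=1)  # one score per site

# Invented figures for three sites:
kpis = pd.DataFrame({"yield": [97.0, 92.5, 99.1],
                     "scrap_rate": [1.2, 3.4, 0.8]})
print(aggregate_effectiveness(kpis).round(2).tolist())
```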



Figure 13: ICH Q10 Related Enabler Implementation Level & Aggregated PQS Effectiveness

Figure 14: Management Responsibilities Enabler Implementation Level & Aggregated PQS Effectiveness

Figure 15: Knowledge Management Enabler Implementation Level & Aggregated PQS Effectiveness

Figure 16: Process Performance / Product Quality Monitoring System Enabler Implementation Level & Aggregated PQS Effectiveness


7 THE CRUCIAL ROLE OF THE LABS



This chapter focuses on QC lab robustness within the PPSM. A mixed-methods approach combining quantitative and qualitative analyses allows a sound understanding of the phenomenon.

First, the impact of the operating context and business environment is investigated. Second, a quantitative perspective on QC lab robustness (i.e. QC lab effectiveness) is provided. Third, three case studies serve as a complement to the preceding quantitative analysis; the qualitative analysis deepens the understanding of the impact of the operating context on QC lab performance. Fourth, a past analysis on Quality Culture (Friedli et al., 2017) is transferred to the QC labs. More specifically, the relation between Quality Maturity and Quality Behavior in QC is analyzed.

7.1 The Operating Context and Business Environment

The operating context is often discussed as an influencing factor for enabler implementation and performance improvements (Ahmad, Schroeder, & Sinha, 2003; Shah & Ward, 2003). To test the impact of the operating context for QC labs, we derived eight propositions. Each of the propositions summarizes specific dimensions that are closely linked together (see Table 23).

The following analysis focuses on striking differences related to geographical location, portfolio complexity, test allocation strategy, organizational scale, economy of scale, technology & innovation, and regulatory approval. For each proposition and its related dimensions, the analysis investigates whether a striking relation with QC lab effectiveness exists. Table 23 depicts an overview of the propositions and details the aspects that are considered.

7.1.1 Research Approach

At the point of analysis, the St.Gallen QC Lab OPEX Benchmarking database comprised 53 QC labs. In some cases, separating the 53 QC labs based on their operating context leads to relatively low sample sizes. A descriptive statistic separating above-median performing QC labs from below-median performing QC labs allows finding indications of how the operating context impacts QC lab effectiveness. However, due to the sometimes low sample sizes for certain operating context characteristics, no test for statistical significance is performed.

QC lab robustness in the PPSM is represented by QC lab effectiveness, a combination of quality and service performance. The aggregated QC lab effectiveness score builds the basis for distinguishing above- from below-median performing QC labs. The score is built on 12 individual performance indicators related to quality and service in QC labs (Köhler, 2019). Table 24 exhibits an overview of the performance indicators used for QC lab effectiveness.

The group of above-median performing QC labs represents the QC Lab Effectiveness High Performers (QCHPs). The group of QC Lab Effectiveness Low Performers (QCLPs) includes all below-median performing QC labs. The overall sample of 53 QC labs is thus split into 26 QCHPs and 27 QCLPs.

The descriptive comparison focuses on analyzing whether there are striking context factors for QCHPs and QCLPs. The conclusions are based on comparing the characteristics of QCHPs and QCLPs as well as on comparing the characteristics within both groups. The decision to reject a proposition is primarily based on the between-groups comparison. The within-group comparison allows a better understanding of the relation.

7.1.2 Results

In the following sections, the descriptive statistics for the outlined propositions are described. For more details and tabular overviews, reference should be made to Köhler, Friedli, & Basu (2019). As stated above, the presented results are of a descriptive nature and, due to the sample sizes, no decision on significance can be made. No results are presented for individual proposition characteristics covering five or fewer QC labs.

Table 25 depicts an overview of the operating context and the analyzed characteristics of each dimension.

7.1.2.1 Geographical Location

The comparison between QCHPs and QCLPs regarding their regional distribution reveals no potential differences.

In this category, the analysis compared Europe and North America as well as high and low cost locations. QC lab effectiveness does not seem to be related to the geographical location of a QC lab. In both Europe and North America, approximately half of the QC labs are QCHPs and half QCLPs. Dividing the overall sample into high and low cost QC labs likewise depicts an equal distribution: in high cost locations, 49 % of the QC labs are QCHPs and 51 % QCLPs; in low cost locations, the split is 50:50. Other regions of the world (e.g. Asia) were not compared as part of this analysis due to limited data from these regions. Therefore, no conclusions can be made for regions other than those mentioned above.

7.1.2.2 Portfolio Complexity

The comparison between QCHPs and QCLPs regarding their portfolio complexity reveals potential differences.

In this category, the analysis compared drug substance type, drug product type, and the number of final drug product types tested. Independent of the number of drug product types tested, the analysis shows a difference between chemical and biological drug substance testing. Counter-intuitively, a majority of QC labs (78 %) testing biological drug substance belongs to the QCHPs, while a majority of QC labs (67 %) testing chemical drug substance belongs to the QCLPs. Comparing QC labs that test no drug product with those that do, the majority (75 %) of QC labs testing no drug product belongs to the QCHPs; however, this is mainly driven by the biological drug substance QC labs, as stated before. 64 % of QC labs that only test the drug product type sterile liquids belong to the QCHPs, whereas 63 % of QC labs testing a mix of different drug product types belong to the QCLPs. Additionally, almost two thirds of QC labs testing up to 50 final drug product types are QCHPs, while the majority (72 %) of QC labs testing more than 100 final drug product types belongs to the QCLPs.


No. | Proposition | Dimensions
P1 | The operating context of a QC lab has no impact on the QC lab effectiveness. | Summary of P2 to P8
P2 | The geographical location of the QC lab has no impact on the QC lab effectiveness. | Country, Regional Distribution, Cost Location
P3 | The portfolio complexity of the QC lab has no impact on the QC lab effectiveness. | Drug Substance Type, Drug Product Type, No. of final Drug Product Types Tested
P4 | The test allocation strategy of the QC lab has no impact on the QC lab effectiveness. | Centralization, Degree of Centralization
P5 | The organizational scale of the QC lab and site has no impact on the QC lab effectiveness. | QC FTEs, Site FTEs
P6 | The economy of scale of the QC lab has no impact on the QC lab effectiveness. | No. of Batches processed, No. of Tests
P7 | The technology and innovation structure of the QC lab has no impact on the QC lab effectiveness. | Age of Instruments, Age of Methods, Automation
P8 | The regulatory approval of the QC lab has no impact on the QC lab effectiveness. | US Approval, EU Approval, China Approval, Japan Approval

Table 23: Propositions related to the operating context and QC lab effectiveness

Performance Dimension | Indicator | Unit
Service | Adherence to Lead Time | %
Service | Adherence to Schedule | %
Quality | Analytical Right First Time (A) | %
Quality | Customer Complaint Investigation Rate | No./100,000 Tests
Quality | Invalidated OOS Rate (A) | No./100,000 Tests
Quality | Lab CAPAs Overdue | %
Quality | Lab Deviation Rate | No./1,000 Tests
Quality | Lab Investigation Rate (A) | No./1,000 Tests
Quality | Product Re-Tests due to Complaints | %
Quality | Recurring Lab Deviations | %

(A) Metric is aggregated from the different testing types performed in the lab (drug substance, intermediate, in-process-control, raw material, stability, drug product, packaged product, microbial environmental, microbial product, component & packaging material).

Table 24: QC lab effectiveness definition
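A minimal sketch of the median split described in section 7.1.1, assuming the aggregated QC lab effectiveness score is an unweighted mean of already normalized, direction-adjusted indicators (the report does not state the weighting here):

```python
import pandas as pd

def split_qchp_qclp(indicators: pd.DataFrame) -> pd.Series:
    """Aggregate normalized indicators per lab and split at the median score."""
    score = indicators.mean(axis=1)  # aggregated QC lab effectiveness per lab
    # Strictly-above-median labs become QCHPs; with 53 labs this yields the
    # reported 26 / 27 split, since the median lab itself falls into QCLPs.
    return score.gt(score.median()).map({True: "QCHP", False: "QCLP"})
```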

Operating Context & Business Environment | Analyzed Characteristics
Geographical Location | Country, Regional Distribution, Cost Location
Portfolio Complexity | Drug Substance Type, Drug Product Type, No. of final Drug Product Types
Test Allocation Strategy | Centralization, Degree of Centralization
Organizational Scale | QC FTEs, Site FTEs
Economy of Scale | No. of Batches processed, No. of Tests
Technology & Innovation | Age of Instruments, Age of Methods, Automation
Regulatory Approval | US Approval, EU Approval, China Approval, Japan Approval

Table 25: Analyzed characteristics of context and business environment
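The descriptive comparisons in section 7.1.2 boil down to cross-tabulating a context characteristic against the QCHP/QCLP label and reading off the row-wise shares. A sketch with hypothetical lab data:

```python
import pandas as pd

# Hypothetical lab-level data; 'performer' follows the QCHP/QCLP median split.
labs = pd.DataFrame({
    "performer":      ["QCHP", "QCLP", "QCHP", "QCHP", "QCLP", "QCLP"],
    "drug_substance": ["biological", "chemical", "biological",
                       "biological", "chemical", "biological"],
})
# Row-wise shares, e.g. the share of biological labs that are QCHPs.
shares = pd.crosstab(labs["drug_substance"], labs["performer"],
                     normalize="index")
print(shares.round(2))
```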



7.1.2.3 Test Allocation Strategy

The comparison between QCHPs and QCLPs regarding their test allocation strategy reveals potential differences.

In this category, the analysis compared the fact of QC testing centralization16 and its degree. Whereas decentralized and centralized QC labs are almost evenly distributed across QCHPs and QCLPs, the degree of centralization seems to have an impact. Looking only at the centralized QC labs, a majority (60 %) of those with up to a 25 % degree of centralization belongs to the QCLPs, whereas 78 % of the QC labs with a degree of centralization above 25 % belong to the QCHPs.

7.1.2.4 Organizational Scale

The comparison between QCHPs and QCLPs regarding their organizational scale reveals potential differences.

In this category, the analysis compared QC FTEs and site FTEs. Regarding QC FTEs, the QCLPs show a majority (71 %) of QC labs with up to 30 FTEs. On the contrary, a majority (62 %) of labs with more than 90 FTEs are QCHPs. On site level, a majority of QCLPs belongs to sites with fewer than 200 FTEs, while the largest share of QCHPs belongs to the category of sites with more than 600 FTEs. Nevertheless, the differences regarding organizational scale are not as distinct as in other dimensions. Further investigation on an increased number of QC labs will allow a reevaluation of the impact of organizational scale.

7.1.2.5 Economy of Scale

The comparison between QCHPs and QCLPs regarding their economy of scale does not reveal potential differences.

In this category, the analysis compared the number of batches processed and the number of tests. For both, there is an almost even distribution of QCHPs and QCLPs across the categories of a low respectively high number of processed batches and tests.

7.1.2.6 Technology & Innovation

The comparison between QCHPs and QCLPs regarding their technology & innovation structure reveals substantial differences.

In this category, the analysis distinguished age of instruments, age of methods, and level of automation. The age of instruments and methods does not show substantial differences between QCHPs and QCLPs: a large majority of QC labs work with more than 50 % of instruments and methods that are older than five years. Looking at QCHPs in isolation, they are evenly distributed across low and high levels of automation, whereas a majority of 67 % of QCLPs have a low level of automation. This results in a majority of QCHPs (59 %) in the category high automation level and a majority of QCLPs (58 %) in the category low automation level.

7.1.2.7 Regulatory Approval

The comparison between QCHPs and QCLPs regarding their regulatory approval does not reveal potential differences.

In this category, the analysis compared US, EU, China, and Japan approval. For all four regulatory approvals, there is an almost equal distribution of QCHPs and QCLPs.

7.1.3 Conclusion

The analysis of the operating context concludes that it has an impact on the effectiveness of QC labs. QCHPs and QCLPs show a different portfolio complexity, test allocation strategy, organizational scale, and technology & innovation structure.

QC labs testing biological drug substance account for a large proportion of QCHPs. Additionally, QC labs testing sterile liquids or no drug product (i.e. any drug substance, but mainly driven by biological drug substance) are more often QCHPs than QCLPs. Compared to QCLPs, which test a high number of final drug product types, QCHPs test a low number of final drug product types. While the fact of centralization as such does not have a major impact, the great majority (more than two thirds) of centralized QC labs with a degree of centralization above 25 % are QCHPs. While QCHPs tend to be larger organizations, QCLPs include a majority of organizations with a lower number of employees; however, this does not mean that small organizations cannot achieve high QC lab effectiveness. Moreover, QCHPs show a high level of automation. Table 26 depicts an overview of the operating context conclusions.

7.2 Three Patterns of QC Labs – A Quantitative Perspective

The quantitative analysis focuses on the relation between enabler implementation and QC lab effectiveness. The enabler implementation comprises 68 individual enabler questions that are summarized in 13 dimensions: Preventive Maintenance (1), Technology Assessment & Usage (2), Housekeeping (3), Process Management (4), Standardization & Simplification (5), Set-up Time Reduction (6), Pull Approach (7), Layout Optimization (8), Planning Adherence (9), Visual Management (10), Management Commitment & Company Culture (11), Employee Involvement & Continuous Improvement (12), and Functional Integration & Qualification (13). The enablers are assessed on a 5-point Likert scale based on self-assessment. For the definition of QC lab effectiveness, reference should be made to chapter 7.1.1.

7.2.1 Research Approach

At the point of analysis, the St.Gallen QC Lab OPEX Benchmarking database comprised 53 QC labs. To allow generalization of the results, the findings of the preceding operating context analysis are not used to build separate peer groups. Consequently, all 53 QC labs are used unless otherwise indicated.

First, to structure and understand the available data, a two-step cluster analysis is applied. This exploratory technique allows identifying patterns of similarity within the overall dataset. Second, an independent samples T-test is performed to identify significant differences in the enabler implementation (on the dimensional level) between above- and below-median performing QC labs. Table 27 summarizes the hypotheses of the quantitative analysis.

16. A centralized lab conducts tests for its own site and for other internal or external sites. A decentralized lab only tests products that are manufactured on-site where the lab is located.



No. Conclusion
P1 The operating context has an impact on QC lab effectiveness.
P2 The geographical location does not have an impact on QC lab effectiveness.
P3 The portfolio complexity has an impact on QC lab effectiveness.
P4 The test allocation strategy has an impact on QC lab effectiveness.
P5 The organizational scale has an impact on QC lab effectiveness.
P6 The economy of scale does not have an impact on QC lab effectiveness.
P7 The technology and innovation structure has an impact on QC lab effectiveness.
P8 The regulatory approval does not have an impact on QC lab effectiveness.

Table 26: Conclusions regarding research propositions

No. | Hypothesis
H1 | QCHPs do not show a significantly higher enabler implementation in Preventive Maintenance compared to QCLPs.
H2 | QCHPs do not show a significantly higher enabler implementation in Technology Assessment & Usage compared to QCLPs.
H3 | QCHPs do not show a significantly higher enabler implementation in Housekeeping compared to QCLPs.
H4 | QCHPs do not show a significantly higher enabler implementation in Process Management compared to QCLPs.
H5 | QCHPs do not show a significantly higher enabler implementation in Standardization & Simplification compared to QCLPs.
H6 | QCHPs do not show a significantly higher enabler implementation in Set-up Time Reduction compared to QCLPs.
H7 | QCHPs do not show a significantly higher enabler implementation in Pull Approach compared to QCLPs.
H8 | QCHPs do not show a significantly higher enabler implementation in Layout Optimization compared to QCLPs.
H9 | QCHPs do not show a significantly higher enabler implementation in Planning Adherence compared to QCLPs.
H10 | QCHPs do not show a significantly higher enabler implementation in Visual Management compared to QCLPs.
H11 | QCHPs do not show a significantly higher enabler implementation in Management Commitment & Company Culture compared to QCLPs.
H12 | QCHPs do not show a significantly higher enabler implementation in Employee Involvement & Continuous Improvement compared to QCLPs.
H13 | QCHPs do not show a significantly higher enabler implementation in Functional Integration & Qualification compared to QCLPs.

Table 27: Hypotheses to test significant differences in the enabler implementation between QCHPs and QCLPs
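The two-step cluster analysis reported below (an SPSS procedure) was run on the two variables enabler implementation and QC lab effectiveness. There is no direct open-source equivalent; as a rough, illustrative stand-in only, the sketch below clusters standardized placeholder data with k-means (k = 3) and reports the largest-to-smallest cluster size ratio used as a quality indication in section 7.2.2.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Placeholder data: 50 labs with (enabler implementation, effectiveness) pairs.
X = rng.uniform([0.5, 0.2], [1.0, 0.8], size=(50, 2))

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(X))
sizes = np.bincount(labels)
print(sizes, "size ratio:", round(sizes.max() / sizes.min(), 2))
```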

[Figure 17 shows a scatter plot of Enabler Implementation (x-axis, 0.50 to 1.00) against QC Lab Effectiveness (y-axis, 0.20 to 0.80), with three marked clusters: Low Effectiveness, Low Enabler (C1); Low Effectiveness, High Enabler (C2); High Effectiveness, High Enabler (C3).]

Figure 17: Scatter plot of enabler relation with QC lab effectiveness for three clusters


7.2.2 Results

The two variables QC lab effectiveness and enabler implementation were used in the two-step cluster analysis to build the clusters. The goal of applying the cluster analysis was to identify whether the QC labs support the Operations Management understanding of a supportive relation between enabler implementation and performance. The cluster analysis reveals three distinct clusters with different characteristics. Of all 53 available QC labs, this analysis was conducted with 50; the remaining 3 QC labs were early pilot benchmarking participants, and during the pilot the enabler section was not yet included in the benchmarking questionnaire. The first cluster (26 QC labs) combines low enabler implementation with low QC lab effectiveness. The second cluster (15 QC labs) combines high enabler implementation with low QC lab effectiveness. The third cluster (9 QC labs) combines high enabler implementation with high QC lab effectiveness. The cluster quality was determined as high, with a size ratio between the largest and smallest cluster of 2.89. Figure 17 illustrates the three clusters that were identified by applying the two-step cluster analysis.

The result of the cluster analysis shows that there is one group (cluster 2) of QC labs that does not support the common understanding of a supportive relation between enabler implementation and performance. To better understand what distinguishes well from not so well performing QC labs, the following quantitative analysis focuses on cluster 1 and cluster 3. The quantitative statistical analysis methods do not allow deriving conclusions as to why cluster 2 differs from the common understanding of the relation between enablers and performance. Thus, this cluster is investigated in qualitative case studies to better understand why the unexpected relation between enabler implementation and QC lab effectiveness exists (cf. chapter 7.3).

In the following, the T-test for significant differences between QCHPs and QCLPs regarding the enabler implementation level is based on cluster 1 and cluster 3, a total of 35 QC labs. Figure 18 shows the remaining QC labs that are included in the subsequent quantitative analysis after excluding cluster 2.

The T-test comparing the enabler implementation levels of QCHPs and QCLPs reveals that QCHPs have a significantly higher implementation in 9 out of 13 enabler dimensions: Technology Assessment & Usage, Housekeeping, Process Management, Standardization & Simplification, Pull Approach, Layout Optimization, Planning Adherence, Employee Involvement & Continuous Improvement, and Functional Integration & Qualification. No significant difference can be observed for Preventive Maintenance, Set-up Time Reduction, Visual Management, and Management Commitment & Company Culture. For further details, refer to Appendix 2, which includes a detailed list of all individual enablers of each enabler dimension. Table 28 depicts the T-test results for all 13 enabler dimensions.

The results described above confirm and enhance the research findings of the past research report17, which disclosed that well-performing QC labs have a significantly higher implementation in Layout Optimization, Process Management, Standardization & Simplification, Planning Adherence, and Housekeeping. In this most recent analysis, all these dimensions and four additional dimensions (see above) were found to be significantly different between QCHPs and QCLPs.

7.2.3 Conclusion

The combination of the two quantitative research methods allows a deep understanding of the relation between enabler implementation and QC lab effectiveness. First, the cluster analysis shows that there is not only one pattern but three different patterns between enabler implementation and QC lab effectiveness. Two of the three identified clusters support the Operations Management understanding associating enabler implementation with performance improvements. One cluster contradicts this understanding and serves as the basis for the qualitative analysis in the subsequent chapter 7.3. Table 29 depicts a summary of the conclusions of the quantitative analyses.

7.3 Three Patterns of QC Labs – A Qualitative Perspective

In chapter 7.2, three patterns were identified for the relation between enabler implementation and QC lab effectiveness. Whereas two of the three clusters support the common Operations Management understanding of associating a high enabler implementation with performance improvements, one cluster contradicts this understanding. This contradicting cluster, which shows a high enabler implementation but a low QC lab effectiveness, builds the basis for the qualitative perspective in this chapter. By contrasting the supportive clusters with the contradicting cluster, distinct differences are disclosed.

7.3.1 Research Approach

To explain how and why cluster 2 contradicts the common Operations Management understanding, case study research is conducted. To provide profound research results, a set of representative companies was derived from the preceding quantitative analysis. The three-step selection process resulted in three companies; consequently, three comprehensive case studies were conducted. In total, 22 QC labs were investigated to understand the differences between cluster 2 and the rest of the QC labs. The sources of data varied in terms of quantity and depth but combined personal interviews with corporate and local senior executives, confidential and publicly available company material, workshop results, personal notes, emails and, in some cases, on-site lab observations. Triangulation ensured the validity and reliability of the research findings. Figure 19 illustrates the clusters selected for the qualitative case studies.

The configurational approach of the case study research allows identifying "multidimensional constellation[s] of […] distinct characteristics that commonly occur together" (Meyer, Tsui, & Hinings, 1993, p. 1175). The cross-case analysis consolidated the in-depth findings of each company case study and highlights commonalities across the companies regarding cluster 2. It also highlights characteristics of clusters 1 and 3 to depict differences.

The applied case study approach with multiple companies results in multiple in-depth company cases followed by a cross-case analysis. To present the key results and conclusions in a target-oriented way, the focus in this report is set on the cross-case analysis. For all individual case studies, interested readers should refer to Köhler (2019).

17. FDA Quality Metrics Research – 2nd Year Report (Friedli, Köhler, Buess, Calnan, & Basu, 2018)



[Figure 18 shows the same scatter plot of Enabler Implementation (x-axis) against QC Lab Effectiveness (y-axis), restricted to two clusters: Low Effectiveness, Low Enabler (C1) and High Effectiveness, High Enabler (C3).]

Figure 18: Scatter plot of enabler relation with QC lab effectiveness for two clusters

Enabler dimension | QCHPs (n=17): mean (SD) | QCLPs (n=18): mean (SD) | Sig. (2-tailed) | t-value
Preventive Maintenance | 0.72 (0.15) | 0.69 (0.12) | .536 | 0.626
Technology Assessment & Usage (1) | 0.64 (0.12) | 0.56 (0.06) | .027 | 2.372
Housekeeping | 0.86 (0.13) | 0.73 (0.19) | .027 | 2.315
Process Management | 0.79 (0.12) | 0.69 (0.12) | .021 | 2.420
Standardization & Simplification | 0.83 (0.13) | 0.73 (0.11) | .022 | 2.399
Set-up Time Reduction | 0.66 (0.19) | 0.55 (0.16) | .076 | 1.830
Pull Approach | 0.74 (0.19) | 0.60 (0.17) | .025 | 2.353
Layout Optimization | 0.75 (0.14) | 0.64 (0.11) | .020 | 2.436
Planning Adherence | 0.78 (0.14) | 0.67 (0.13) | .016 | 2.542
Visual Management | 0.69 (0.30) | 0.64 (0.26) | .601 | 0.528
Management Commitment & Company Culture (1) | 0.79 (0.11) | 0.75 (0.06) | .294 | 1.073
Employee Involvement & Continuous Improvement (1) | 0.69 (0.10) | 0.60 (0.06) | .005 | 3.098
Functional Integration & Qualification | 0.79 (0.14) | 0.68 (0.11) | .016 | 2.528

(1) Equal variances not assumed.

Table 28: T-test results for all enabler dimensions comparing QCHPs and QCLPs
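Table 28 flags three dimensions with "equal variances not assumed". A common workflow, assumed here rather than stated in the report, is to run Levene's test first and fall back to Welch's T-test when the variances differ significantly:

```python
from scipy import stats

def compare_groups(a, b, alpha: float = 0.05):
    """Student's or Welch's T-test depending on a Levene variance pre-test."""
    _, p_levene = stats.levene(a, b)
    equal_var = p_levene >= alpha  # assume equal variances unless rejected
    t, p = stats.ttest_ind(a, b, equal_var=equal_var)
    return t, p, equal_var

# Hypothetical per-lab dimension scores for the two performer groups.
qchp = [0.64, 0.70, 0.59, 0.66, 0.61]
qclp = [0.56, 0.55, 0.58, 0.54, 0.57]
print(compare_groups(qchp, qclp))
```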



Analysis | Conclusion
Cluster Analysis | Three different patterns regarding the relation between enabler implementation and QC lab effectiveness exist: low enabler implementation with low QC lab effectiveness; high enabler implementation with low QC lab effectiveness; high enabler implementation with high QC lab effectiveness.
T-Tests | QCHPs have a significantly higher enabler implementation in 9 out of 13 enabler dimensions. The significantly higher implementation is linked to: Technology Assessment & Usage, Housekeeping, Process Management, Standardization & Simplification, Pull Approach, Layout Optimization, Planning Adherence, Employee Involvement & Continuous Improvement, and Functional Integration & Qualification. No significantly different enabler implementation was found for18: Preventive Maintenance, Set-up Time Reduction, Visual Management, and Management Commitment & Company Culture.

Table 29: Conclusions from a quantitative perspective

Figure 19: Selected clusters for qualitative research

18. The fact that no significant difference was found for these enabler dimensions does not mean that they are irrelevant. For the available dataset of QC labs, these dimensions were not significantly different between QCHPs and QCLPs; this is linked to the operationalization of these elements (see Appendix 2) and may be subject to change when the database size increases. Please review the definition of each dimension carefully; conclusions can only be made based on the outlined questions per enabler dimension.



7.3.2 Results

In the following, the differences between clusters 1, 2, and 3 regarding enablers, performance, business environment, and organization & people are presented.

7.3.2.1 Enabler

All three clusters focus on the Technical Enabler System19 and the Management Enabler System20. A more granular analysis of individual enabler dimensions reveals that clusters 1 and 2, with a low respectively high enabler implementation and a low QC lab effectiveness, especially focus on basic enabler dimensions in the Technical Enabler System. The basic enabler dimensions are the foundational elements Housekeeping, Standardization & Simplification, and Visual Management, which are typically implemented at the beginning of an OPEX journey. Cluster 3, the well performing QC labs, differs in two ways from the other two clusters. First, the well performing QC labs show a higher implementation of the basic enabler dimensions compared to QC labs in clusters 1 and 2. Second, the QC labs in cluster 3 also focus on the more advanced technical enabler dimensions Pull Approach and Planning Adherence.

Investigating the Management Enabler System more closely, the case studies disclose that cluster 1 QC labs focus on the basic managerial enabler dimension Management Commitment & Company Culture. Cluster 2 and 3 QC labs have expanded their effort beyond this focus to the more advanced enabler dimensions Employee Involvement & Continuous Improvement and Functional Integration & Qualification.

Regarding the approach to implementing enablers, all three clusters differ. Whereas cluster 1 especially focuses on implementing single dimensions, cluster 2 shows a low level of integration. Only cluster 3 achieves a high level of integration, which has been proven to be beneficial for the overall success of a company (Shah & Ward, 2007).

7.3.2.2 Performance

The performance differences between clusters 1, 2, and 3 are grouped into five categories: Performance Focus, Process Robustness, Customer Complaints, Planning Adherence, and Performance Variation.

Both cluster 1 and cluster 2 have a tendency toward a higher quality performance compared to their service performance. However, their process robustness and planning adherence are low: the QC labs have a high number of invalidated OOS results and lab investigations, and the process reliability shows improvement potential. These labs do not succeed in delivering as planned while handling unplanned tasks at the same time. In contrast, the QC labs in cluster 3 achieve equally high quality and service performance, resulting in high process robustness and planning adherence. A closer look at the performance variation among individual performance indicators discloses a three-fold pattern: QC labs in cluster 3 show the lowest variability, cluster 2 QC labs do not show a balanced performance across all indicators, and QC labs in cluster 1 show the highest variability across the performance indicators.

7.3.2.3 Business Environment

Whereas the QC labs in clusters 1 and 2 focus primarily on traditional (chemical) drug substance or a mix of traditional and new (biological) drug substance, the majority of the well performing QC labs in cluster 3 work exclusively on new (biological) drug substance. Additionally, the QC labs in cluster 3 have a high homogeneity and predictability caused by a low variety of testing activities. The absence of major changes in their testing portfolio has allowed these QC labs to invest resources in continuous improvement. While a majority of QC labs in clusters 1 and 2 started their OPEX journey at a more recent point in time, the QC labs in cluster 3 were pioneers and have already moved to a late transformation stage. Besides, cluster 2 is confronted with a constantly high number of new product launches. This combination leaves the QC labs in cluster 2 with fewer resources to develop long-term, lasting routines; instead, these labs dedicate all their resources to constantly changing business requirements.

7.3.2.4 Organization & People

A closer look at the organizational set-up and employee development reveals improvement potential in clusters 1 and 2. Whereas testing variety and testing volume are business decisions and cannot be changed, there are other aspects that can be improved to achieve a higher QC lab effectiveness in the future. The proactiveness toward CI and the availability of resources can be improved in clusters 1 and 2. In addition, the span of control and the training effort in cluster 2 must be better aligned with the complexity of the business environment in this cluster. The combination of below industry average training days per employee across the labs in cluster 2 with the low homogeneity and low predictability of the business in these labs is not sustainable.

7.3.3 Conclusion

Overall, the research team identified distinct differences between the three clusters. The results support the previous descriptive statistics showing the impact of the operating context in chapter 7.1. The qualitative case studies and the cross-case analysis reemphasize that the business environment has an impact on QC lab effectiveness. Moreover, a major finding of the qualitative analysis is the mismatch between the organizational set-up, employee development and the business complexity in some of the QC labs. This mismatch prevents these QC labs from translating enabler implementation into QC lab effectiveness. The case studies conclude that there are several levers to improve the overall performance state of a QC lab. Regarding the enablers, these levers are an appropriate enabler focus, a thorough enabler implementation, and an integrated approach to enabler implementation. Regarding performance, the levers are a balanced performance focus, improvements in process robustness and planning adherence, a low number of customer complaints disturbing the routine, and minimized performance variation. The business environment is a given and cannot be changed in the short term. However, a further lever for a QC lab is to improve the alignment of the organizational set-up and employee development with the complexity of the business. Figure 20 depicts a summary of the configurational differences between clusters 1, 2, and 3.

19. Technical Enabler System: Preventive Maintenance, Technology Assessment & Usage, Housekeeping, Process Management, Standardization & Simplification, Set-up Time Reduction, Pull Approach, Layout Optimization, Planning Adherence, and Visual Management
20. Management Enabler System: Management Commitment & Company Culture, Employee Involvement & Continuous Improvement, and Functional Integration & Qualification.

Figure 20: Configurational differences between QC labs (matrix comparing cluster 1, 2, and 3 along the four dimensions Enabler, Performance, Business Environment, and Organization & People)

7.4 Quality Culture in QC

Industry and academia acknowledge the important role of Cultural Excellence in achieving superior performance (Friedli et al., 2017; Patel et al., 2015). In 2014, a Quality Culture study by PDA found a positive correlation between Quality Maturity and Quality Behavior (Patel et al., 2015). In 2017, the University of St.Gallen was able to confirm this relation with the St.Gallen OPEX Benchmarking data of more than 300 pharmaceutical manufacturing sites. In this research, the analysis of Quality Maturity and Quality Behavior is further enhanced by transferring the logic introduced by PDA to the QC labs and the St.Gallen QC Lab OPEX Benchmarking data.

7.4.1 Research Approach

This QC Quality Culture analysis is built on the PDA definitions of Quality Maturity and Quality Behavior. Quality Maturity represents "objective characteristics of a quality system that can be observed or verified upon inspection or internal audit" (Patel et al., 2015). Quality Behavior represents "observed specific behaviors in their organization or site in such categories as communication & transparency, commitment & engagement, technical excellence, and standardization of requirements" (Patel et al., 2015). To understand the relation between the two dimensions, the research team allocated OPEX enablers of the St.Gallen QC Lab OPEX Benchmarking to either of the two categories. Enablers that did not match either of the two definitions are not considered in the analysis. In total, 18 enablers can be associated with Quality Maturity and 19 enablers are linked to Quality Behavior. Appendix 2 provides a detailed overview of which enablers are included in this analysis as Quality Maturity or Quality Behavior. To understand the relation between Quality Maturity and Quality Behavior, a scatter plot is built and the degree of determination is used to assess how much variance in Quality Behavior can be explained by Quality Maturity (a short computational sketch is given at the end of this sub-chapter).

7.4.2 Results

At the point of analysis, the St.Gallen QC Lab OPEX Benchmarking database included 54 QC labs. The scatter plot in Figure 21 linking Quality Maturity with Quality Behavior shows a positive correlation. The degree of determination is 55.9%. This means that 55.9% of the variation in Quality Behavior can be explained by Quality Maturity.

7.4.3 Conclusion

The research team concludes that the previously disclosed positive relation between Quality Maturity and Quality Behavior driving Quality Culture also exists in QC labs. The results of PDA in 2013 and of the University of St.Gallen in 2017 can be confirmed in a new unit of analysis – the QC lab. While the analysis of PDA showed a degree of determination of 33.7%, in the present analysis the degree of determination totals 55.9%. The highest degree of determination (66.4%) existed in the 2017 analysis by the University of St.Gallen on Quality Maturity and Quality Behavior on site-level. Figure 22 compares the results of the 2013 PDA analysis, the 2017 University of St.Gallen analysis, and the most recent analysis in this research report.
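The computation behind Figure 21 can be illustrated in a few lines of code. The following is a minimal sketch in Python, not the authors' original analysis: it fits an ordinary least-squares line and reports the degree of determination (R^2); the scores are hypothetical stand-ins for the enabler-based lab scores.

    # Minimal sketch (hypothetical data): fitting Quality Behavior against
    # Quality Maturity and computing the degree of determination (R^2).
    import numpy as np
    from scipy import stats

    # Hypothetical 0..1 enabler-based scores for a handful of QC labs;
    # the real analysis uses the 54 labs in the benchmarking database.
    maturity = np.array([0.55, 0.62, 0.70, 0.74, 0.81, 0.88, 0.93])
    behavior = np.array([0.61, 0.66, 0.72, 0.75, 0.81, 0.83, 0.90])

    fit = stats.linregress(maturity, behavior)
    print(f"Quality Behavior = {fit.intercept:.2f} + {fit.slope:.2f} * Quality Maturity")
    print(f"Degree of determination R^2 = {fit.rvalue ** 2:.3f}")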



Figure 21: Relation between Quality Maturity and Quality Behavior in QC (scatter plot of QC Quality Behavior over QC Quality Maturity, both on a 0.50–1.00 scale, with linear fit y = 0.24 + 0.68x and R² = 0.559)



Figure 22: Relation between Quality Maturity and Quality Behavior from three
data sources (cf. Patel et al., 2015, Friedli et al., 2017, and this report)



8 QUALITY CULTURE – RE-VISITED



Building on the previous research published in the year 1 report (Friedli et al., 2017), the aim of this chapter is to deepen the analyses around the enabler items in the St.Gallen Operational Excellence database that were assigned to the categories 'Quality Maturity' and 'Quality Behavior' following the PDA definitions. The two aggregations seek to replicate the constructs Quality Maturity and Quality Behavior as coined by Patel et al. (2015) and to examine whether these quality attributes could be used as surrogates (or proxy variables):

Quality Behavior:
Behaviors observed at the site or organization that are associated with a strong quality culture in areas such as clear communication and transparency, commitment and engagement, technical excellence, and standardization of requirements.

Quality Maturity:
Objective characteristics of a quality system that can be observed or verified upon inspection or internal audit and that have a positive relationship with quality culture behaviors, including formal programs in preventive maintenance, environmental health and safety, risk management, human error prevention, and training or continuous improvement.

'Cultural Excellence' is an aggregation of the two previously mentioned concepts and a KPI-based score on employee engagement metrics. The individual concepts and their contents are shown in Figure 3 and explained thereafter. In the first year, there were two main outcomes of that research:

» Quality Maturity and Quality Behavior strongly correlate
» Out of the 36 attributes it is compiled of, we published the ten attributes that most strongly differentiate sites in terms of their overall Quality Maturity implementation ('Quality Maturity score')

The complete list of these ten elements was found to be as follows:

1. Optimized set-up and cleaning procedures are documented as best-practice process and rolled out throughout the whole plant.
2. A large percentage of equipment on the shop floor is currently under statistical process control (SPC).
3. For root cause analysis we have standardized tools to get a deeper understanding of the influencing factors (e.g. DMAIC).
4. Goals and objectives of the manufacturing unit are closely linked and consistent with corporate objectives. The site has a clear focus.
5. We have joint improvement programs with our suppliers to increase our performance.
6. All potential bottleneck machines are identified and supplied with additional spare parts.
7. For product and process transfers between different units or sites standardized procedures exist, which ensure a fast, stable and compliant knowledge transfer.
8. Charts showing the current performance status (e.g. current scrap-rates, current up-times etc.) are posted on the shop-floor and visible for everyone.
9. We regularly survey our customers' requirements.
10. We rank our suppliers; therefore, we conduct supplier qualifications and audits.

On a high aggregation level, this research report presents a link between Quality Maturity, Quality Behavior and Operational Stability. The complete Quality Culture findings are summarized in this chapter.

8.1 Research Approach

Similar to the research in year 1, the main tool used for the analyses is multiple linear regression (MLR) with forward selection (a code sketch follows at the end of this section). In the results section, all analyses are represented by the model summaries obtained through forward selection with up to ten coefficients. Additionally, the coefficient tables with the ten most significant coefficients are displayed. Note that being among the top 10 most significant coefficients does not mean that any of the typical significance levels are met or that the remaining coefficients are not significant.

Prior to the analyses, the sites were clustered based on their overall relationship between Quality Maturity and Operational Stability, as shown in Figure 23. A two-step clustering algorithm was applied. The highlighted clusters represent two characteristic groups of sites. In the authors' view, cluster three represents sites that are tuned for high outputs of non-complex products. The majority of these sites operate generics and/or contract manufacturing. Typically, the majority of these sites do not invest heavily in the advancement of their systems and processes beyond what is necessary for compliant production. Experience shows that there were several quality problems with these types of sites. Therefore, it is not the intention of the researchers to discourage such sites from investing in Operational Excellence and Quality Culture development. Cluster four consists of sites that show a high maturity, but relatively low performance. Along with the high implementation levels, the majority of these sites on average also show very high training efforts, high qualification levels of their employees and the largest share of new machinery. Also, a majority of these sites belong to smaller companies and show low utilization. The cluster is interpreted as impacted by launch sites for complex products, which might hamper their Operational Stability. The sites in clusters one, two and five create a dataset of 132 sites for the following analyses.
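As a minimal sketch of the forward-selection MLR described in 8.1 (our illustration in Python, not the authors' original tooling), the following uses scikit-learn's SequentialFeatureSelector on a synthetic stand-in for the 132-site dataset. Note that selecting by cross-validated fit, as scikit-learn does, differs slightly from classical p-value-based forward selection.

    # Sketch: multiple linear regression with forward selection of up to
    # ten attributes (synthetic stand-in data; column meanings hypothetical).
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.feature_selection import SequentialFeatureSelector

    rng = np.random.default_rng(0)
    X = rng.uniform(1, 5, size=(132, 36))   # 132 sites, 36 enabler attributes
    y = 0.5 * X[:, 0] + 0.3 * X[:, 5] + rng.normal(0.0, 0.5, 132)  # e.g. Quality Behavior

    # Forward selection: repeatedly add the attribute that improves the
    # (cross-validated) fit the most, until ten attributes are selected.
    selector = SequentialFeatureSelector(
        LinearRegression(), n_features_to_select=10, direction="forward"
    )
    selector.fit(X, y)
    selected = np.flatnonzero(selector.get_support())

    model = LinearRegression().fit(X[:, selected], y)
    print("Selected attribute indices:", selected)
    print(f"R^2 of the ten-coefficient model: {model.score(X[:, selected], y):.3f}")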



Figure 23: Site clustering on Operational Stability and Quality Maturity for Quality Culture analyses

Table 30: Quality Maturity attributes regression on Quality Behavior



8.2 Results

The results of the MLR for the analysis of individual Quality Maturity attributes and overall Quality Behavior are shown in Table 30. Highly significant coefficients at a confidence level below 5% are displayed in green, coefficients meeting the 10% confidence level in light green.

For the selected sites, the model with ten coefficients explains 71.3% of the variance in Quality Behavior. All ten displayed coefficients meet the 10% confidence level of being significant in their impact. Similarly, the resulting tables for the impact of Quality Maturity attributes (Table 31) and Quality Behavior attributes (Table 32) on Operational Stability are shown on the next page.

The models based on Quality Maturity and Quality Behavior are suitable to explain 54 and 56 percent of the variance in Operational Stability, respectively.

Both tables show negative coefficients. These can occur when coefficients are inter-correlated and higher ranked coefficients already explain large parts of the variance of the dependent variable (a small synthetic illustration of this effect is given below). In such cases, the literature suggests focusing on significance levels instead of coefficients (Brosius, 2013). Tests show that individually these attributes are positively correlated with Operational Stability.

8.3 Conclusion

The shown models generally gain little explanatory value after the first four to five highly significant attributes. This highlights the importance of the first few attributes in the respective models. However, from a corporate managerial perspective it is the authors' belief that initiatives like an Operational Excellence campaign or a Lean Production System should very much be holistic and oriented to the overall system. As illustrated in the St.Gallen OPEX Model (Figure 3), all parts of quality, effectiveness and efficiency improvement come together as interlinked pieces that should be addressed in a comprehensive way.

Specific Quality Maturity attributes that have the highest impact on the variability of Quality Behavior21 (adjusted R square = 71.3%) focus on overall alignment, engagement and integration as well as 5S fundamentals and pro-activeness in maintenance. The most significant Quality Maturity attributes from Table 30 are:

» All potential bottleneck machines are identified and supplied with additional spare parts
» In our company, product and process development are closely linked to each other
» Our plant procedures emphasize putting all tools and fixtures in their place
» In our company there are monthly open feedback meetings
» We have implemented tools and methods to deploy a continuous improvement process
» Goals and objectives of the manufacturing unit are closely linked and consistent with corporate objectives; the production site has a clear focus

Based on the cluster analysis, the majority of sites (68%) show a positive relationship between Quality Maturity and Operational Stability. For these sites, this indicates a direct link between the achieved levels in Quality Culture and performance metrics:

Quality Maturity explains a high degree of variation in Operational Stability, with the degree of determination found to be 54%. The themes of the driving factors are real-time process analytics, root cause analysis as well as visual performance management. Furthermore, cross training, 5S and customer feedback are important. The most significant individual attributes from Table 31 are:

» Each of our employees within our work teams is cross-trained so that they can fill in for others when necessary
» For root cause analysis we have standardized tools to get a deeper understanding of the influencing factors
» Our customers frequently give us feedback on quality and delivery performance
» We have a housekeeping checklist to continuously monitor the condition and cleanness of our machines and equipment
» Performance charts at each of our production processes indicate the annual performance objectives
» We operate with a high level of PAT implementation for real-time process monitoring and controlling
» Goals and objectives of the manufacturing unit are closely linked and consistent with corporate objectives; the production site has a clear focus

Quality Behavior explains a high degree of variation (56%) in Operational Stability. The underlying concepts are commitment to improvement among employees and management as well as maintenance focus and standardization. The most significant attributes from Table 32 are:

» Our employees continuously strive to reduce any kind of waste in every process
» Plant management is personally involved in improvement projects
» We use our documented operating procedures to standardize our processes
» Our maintenance department focuses on assisting machine operators to perform their own preventive maintenance
» We emphasize good maintenance as a strategy for increasing quality and planning for compliance

21. In the year 1 report, we analyzed and published the individual Quality Maturity attributes that are most responsible for the variance of overall Quality Maturity. We also showed that Quality Maturity and Quality Behavior strongly correlate. Here, we show which individual Quality Maturity attributes most strongly and directly foster Quality Behavior.
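The sign-flip effect described in 8.2 is easy to reproduce on synthetic data. In this illustration (ours, not the report's), both predictors correlate positively with the outcome, yet the joint regression assigns one of them a negative coefficient because the first predictor already carries the shared variance:

    # Illustration: inter-correlated predictors can produce negative
    # coefficients even though each predictor correlates positively with y.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(1)
    x1 = rng.normal(size=1000)
    e = rng.normal(scale=0.3, size=1000)
    x2 = x1 + e                    # x2 is strongly inter-correlated with x1
    y = x1 - 0.5 * e               # algebraically, y = 1.5*x1 - 0.5*x2

    print("corr(x1, y):", round(np.corrcoef(x1, y)[0, 1], 2))   # positive
    print("corr(x2, y):", round(np.corrcoef(x2, y)[0, 1], 2))   # positive
    coefs = LinearRegression().fit(np.column_stack([x1, x2]), y).coef_
    print("joint coefficients:", coefs.round(2))                # approx. [1.5, -0.5]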



Table 31: Quality Maturity attributes regression on Operational Stability

Table 32: Quality Behavior attributes regression on Operational Stability



9 A CLOSER LOOK FROM A QUALITY RISK PERSPECTIVE


Using operational metrics to enhance the risk-based site selection process for FDA inspections was one aspect of the Quality Metrics initiative that caused notable discussions with and within industry. The year 2 report presented sites grouped by their Lot Acceptance Rates with their related FDA inspection outcomes (Friedli, Köhler, Buess, Basu, et al., 2018). Since then, this field of research has evolved, and the results are displayed in the following chapter. Chapter 9.1 depicts the continuation of our previous analyses based on our proprietary database and public FDA inspection outcomes. The research in 9.2 makes use of a company-specific dataset that incorporates inspection outcomes of the FDA and of various national agencies that are part of the European Agency for the Evaluation of Medicinal Products (EMA).

9.1 Operational Excellence Benchmarking database and FDA inspection outcomes

While FDA inspection outcomes have often been researched as a dependent variable and surrogate for quality risk, there is no published research on a link to site-internal operational metrics, system or cultural attributes (Gray, Siemsen, & Vasudeva, 2015; Gray, Roth, & Leiblein, 2011). The aim of this chapter is to explore that gap.

9.1.1 Research Approach

The analyses are based on the match of sites in the St.Gallen Operational Excellence benchmarking database and published FDA inspection outcomes. 61 site benchmarking datasets were matched with inspection results. The comparison between the two databases was done using company name, site address and year as key criteria for a match. The data reported into the St.Gallen OPEX benchmarking database typically refers to a site's operational performance related to the full calendar year immediately prior to when the benchmark is undertaken. Therefore, the criterion "year" seeks to compare the St.Gallen OPEX benchmarking dataset with the last available FDA inspection conducted before the benchmark for that given organization. Inspection outcomes from more than three years prior to the benchmark were not considered. The cut-off is a compromise between timeliness and data availability. The authors acknowledge that this bears certain limitations, because it includes data pairs with an already considerable time gap. Also, the included inspection could be followed by changes at the site, particularly based on CAPAs after adverse inspection outcomes. In our opinion, CAPAs following an inspection typically target compliance specifically and do not necessarily have an impact on operational KPIs or enabler implementation. Furthermore, sites might reassign resources from OPEX and continuous improvement to CAPA and remediation activities. Therefore, the analysis approach is valid with these limitations kept in mind.

The resulting inspection results distribution shows fewer 'Official Action Indicated OAI' final decisions than the overall population (see Figure 24), but is not significantly different when evaluated per Chi-squared test (χ² = 2.73).

The Chi-squared test is also used to evaluate possible relationships between the inspection outcomes and the independent variables. Therefore, the datasets are clustered in groups based on how they rank in terms of the respective independent variables. To overcome the challenge of small samples not meeting the required conditions for the Chi-squared test, the dependent variable is simplified to the binary categories:

» 'No Action Indicated NAI'
» Any action indicated *AI = {'Official Action Indicated OAI', 'Voluntary Action Indicated VAI'}

Figure 24: Matched FDA inspection outcomes compared to overall population (2009 – 04/2019) (two pie charts of the NAI / VAI / OAI shares for all FDA drug quality assurance inspection outcomes 2009–2019, n = 18,452, and for the outcomes matched with St.Gallen, n = 61)



Table 33: Test configurations showing significant (at least 5%) relationships with inspection outcomes

Figure 25: Sites' inspection outcomes by enabler implementation level (counts of NAI / *AI):
» Groups defined by Overall Enabler Score (n=55): Bottom half 6 / 21; Top half 15 / 13
» Groups defined by TQM Enabler Score (n=55): Bottom half 5 / 22; Top half 16 / 12
» Groups defined by JIT Enabler Score (n=55): Bottom half 6 / 21; Top half 15 / 13
» Groups defined by Quality Maturity (n=55): Bottom half 6 / 19; Top half 15 / 15
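To make the test concrete, the following brief sketch (ours, using SciPy) runs the Chi-squared test on the 2x2 table for the Overall Enabler Score groups from Figure 25; with the default continuity correction the result is significant at the 5% level, consistent with Table 33.

    # Chi-squared test on the Overall Enabler Score groups from Figure 25.
    from scipy.stats import chi2_contingency

    table = [
        [6, 21],   # bottom half: NAI, *AI
        [15, 13],  # top half:    NAI, *AI
    ]
    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")  # p < 0.05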

Figure 26: Inspection outcomes by detailed Total Quality Management enabler implementation level (counts of NAI / *AI, n=55):
» Bottom quartile: 3 / 11
» Lower middle quartile: 2 / 11
» Upper middle quartile: 7 / 7
» Top quartile: 9 / 5



While OAI certainly is a highly adverse and rare inspection outcome, VAI is quite common and comparatively uncritical. The authors are aware of this limitation of the analysis approach, and it shall be pronounced here. Nevertheless, OAI and VAI are both notably different from the ideal inspection outcome NAI. Also, this approach was chosen in alignment with a recent study describing predictive models for risk-based inspection planning (Seiss, 2018).

Similarly, the groups defined around the independent variables are either quartiles ('Top quartile', 'Upper middle quartile', 'Lower middle quartile' and 'Bottom quartile') or only two groups divided by the median ('Top half', 'Bottom half'). The underlying values are always normalized in a way that the "Top 25%" group contains the sites with the best characteristics for a particular feature and not necessarily the highest values, e.g. the group shows the best, i.e. lowest, customer complaint rates.

Since the Chi-squared test only serves to show a relationship between the two variables, a second, qualitative perspective is necessary to understand the nature of such relationships. Therefore, bar charts showing the distributions are provided.

The completed analyses comprise various performance metrics, aggregations of such metrics ("scores") and aggregated enabler implementation levels. Chapter 9.1.2 summarizes the analyses with significant results.

9.1.2 Results

To keep the results compact, Table 33 only depicts dependent variables that show relationships to the inspection outcome at a minimum 5% confidence level. Other variables, such as the enabler implementation specifically around Statistical Process Control or – as shown in the year 2 report – Lot Acceptance Rate, also result in promising trends (Friedli, Köhler, Buess, Basu, et al., 2018). However, in these cases the current dataset does not show a relationship that is significant at the 5% confidence level.

As shown above, several significant relationships can be established between different enabler groups and operational KPIs. The collection of charts in Figure 25 depicts the enabler-related results in individual bar charts.

While the overall share of inspections without FDA advised actions is only about 36% in the overall sample, the sites with higher implementation levels always show at least 50% preferable final decisions. For the enablers describing the quality system, there exists a notable distribution when examining the implementation level in quartiles (cf. Figure 26).

Figure 27 shows the charts analyzing the distributions of inspection outcomes by the sites' operational KPIs. For the customer complaint rate, the figure also includes an additional chart with the distribution across the quartiles.

As discussed in 9.1.1, a significantly non-random distribution does not always constitute an interpretable relationship. This is the case for the distribution around the independent variable 'release time', where it is difficult to argue the nature of the relationship. Hence, it was marked (*) in the overview Table 33.

9.1.3 Conclusion

Sites that have a greater level of implementation of their overall Operational Excellence plans and programs have better inspection outcomes. This relationship is stronger and highly significant for enablers specifically associated with the quality systems, as measured in the enabler category Total Quality Management (TQM). Advancement in streamlining operations (Just-In-Time (JIT) enabler implementation) appears to have a similar effect. Lastly, the PDA Quality Maturity construct, remodeled through St.Gallen Operational Excellence benchmarking database enablers, can also be linked to inspection outcomes.

In parallel to the FDA draft guidance metrics, we analyzed relationships between operational KPIs and FDA inspection outcomes: Sites which perform better in customer complaint rate and delivery service level (on-time-in-full, OTIF) also show better inspection results. This is particularly interesting as the customer complaint rate is part of the FDA draft guidance metrics and OTIF is a surrogate for overall PQS effectiveness, as established in the year 1 research report (Friedli et al., 2017).

9.2 Single company case study of operational KPIs and regulatory evaluation

Data gathering and management in the context of manufacturing site internal information, corporate quality functions and external regulators is challenging. Therefore, the research team is thankful for the opportunity to look into proprietary data of one of the world's largest pharma companies. Thanks to access to longitudinal data around operational KPIs, internal quality audits and agency inspection outcomes, the findings of chapter 9.1 can be deepened and extended. The performance KPIs shown in this chapter were selected based on their availability within the project.

9.2.1 Research Approach

The available data contains information from 37 production sites. Internal Quality Compliance audit information and inspection results from 2013 to mid-2019 are available. Operational data is available for the time frame since 2016. While the company deals with various regulators around the world, our original focus is the evaluation by the US FDA. There are more than 50 drug quality assurance inspection records for the selected sites in the respective time frame. To increase the dataset, it was decided to expand it by also integrating FDA-recognized national agencies that are part of the European network EMA. This results in an overall dataset of 149 inspections to be evaluated. Different inspections of the same site are treated as independent data records. The allocation of sites and site types to the individual groups of data records is subsequently controlled for.

Since the rating mechanisms across the agencies are different, a normalization process was necessary. No inspection outcome as easily interpretable as the FDA final decisions used in 9.1 is available for this dataset. Therefore, the focus of the analyses was set on the



Figure 27: Inspection outcomes by operational KPIs (counts of NAI / *AI):
» Groups defined by Customer Complaint Rate (n=47): Bottom half 4 / 19; Top half 13 / 11
» Groups defined by Customer Complaint Rate (n=47), by quartile: Bottom quartile 2 / 9; Lower middle quartile 2 / 10; Upper middle quartile 8 / 4; Top quartile 5 / 7
» Groups defined by OTIF – Service level (n=45): Bottom half 4 / 18; Top half 12 / 11
» Groups defined by Release Time (n=51): Bottom quartile 4 / 7; Lower middle quartile 6 / 8; Upper middle quartile 1 / 12; Top quartile 8 / 5

Figure 28: Technology classification of site data records with good and bad inspection outcomes (two pie charts, for inspection scores ≥ 0.5 and < 0.5, showing the shares of API, other DP, sterile DP, and drug-device combination sites)

Table 34: Operational KPIs and previous compliance results for groups defined by current inspection



number of findings or citations in the inspections. The inspections of European agencies as well as the internal quality audits distinguish findings into three categories (e.g. critical / major / minor). Since the agencies do not average around the same number of citations, these are normalized against the maximum number of citations in that category per agency.

The normalized number of citations is then aggregated into a preliminary inspection / audit score as a weighted sum using the following weights:

» Critical findings / citations ≙ 9
» Major findings / citations ≙ 3
» Minor findings / citations ≙ 1

The preliminary inspection scores then again need normalization. Thereby, two distinctions were made:

» Internally and externally, API sites seem to receive significantly fewer citations than drug product sites. They therefore have separate normalization streams.
» The European agencies' and the FDA preliminary inspection scores are quite different due to the different weighted criticalities of citations. Hence, there again are separate normalization streams.

As a result, we receive the corresponding normalization equations; expressed in words, the preliminary score is the weighted sum 9 x (critical citations / maximum critical citations) + 3 x (major / maximum major) + 1 x (minor / maximum minor), computed per normalization stream. The preliminary score is then converted into the final inspection score via a percent rank within its stream.

Since the FDA does not provide citations with a classification of severity, the calculation is simplified and illustrated in the following fictional example: The given FDA inspection of an API site was concluded with three citations. The maximum number of citations of the FDA for any API site in the sample is seven, so the preliminary score is 3/7 ≈ 0.43. Using the percentrank function on the overall population of preliminary FDA inspection scores for API sites (PERCENTRANK.INC in MS Excel ranks values by how many values in the sample are bigger or smaller, on a scale from 0 to 1), this computes to an inspection score of 0.6. At 0.6, the given inspection ranks in the better half of inspections.

In consequence, all inspections are equipped with a comparable score between 0 and 1, with a higher score being more favorable, representing fewer findings. The audits are processed similarly, but without any differentiation by agency, since there is only one internal corporate audit function for quality compliance. (A code sketch of this scoring logic is given at the end of this sub-chapter.)

There is no direct link between the number of findings and the actual severity of the overall inspection outcome. An inspection with more citations is not necessarily worse than an inspection with only one. This is especially true for FDA inspections, where minor or critical citations cannot be differentiated. The research team is aware of that limitation.

The data records are assigned to groups based on their current regulatory evaluations. The groups are then compared by their operational KPIs and preceding quality compliance performances. Differences between the groups are tested for significance using t-tests. The compilation of sites' data records in terms of technologies and drug products (DP) is shown in Figure 28. "Good" inspections are ones with no or only few citations.

Overall, the distribution of technologies is similar. However, the group with comparably more findings / citations in their current inspection shows a higher share of non-sterile drug product manufacturing sites (typically solids & liquids).

9.2.2 Results

There are significant differences between the two groups in several aspects. The summary of results is shown in Table 34.

Other than the KPIs shown above, the research team also analyzed the customer complaint rate as an overlap with the analyses in Chapter 9.1. However, the results achieved on the St.Gallen dataset cannot be confirmed here.

There are no significant differences between the two groups in terms of their previous internal quality compliance audits. This might be due to several factors. Audits and inspections can have different focus areas. Also, a site evaluated with many findings in an internal audit can correct these deficits and be evaluated well in a following inspection, so that the results are very different. These interlinks are not yet fully uncovered.

9.2.3 Conclusion

In addition to the work on the St.Gallen Operational Excellence benchmarking database, the research team investigated quality compliance risk in a single case study based on proprietary longitudinal compliance and operational data from one of the top five global pharma companies:

» Sites that receive fewer citations in the current inspection likely also had fewer citations in previous inspections. Similarly, before an inspection with more findings, typically more time has passed since the previous inspection. This confirms some of the factors used in the FDA's risk-based site selection methodology (FDA, 2018).
» Better quality compliance evaluations, meaning inspections with fewer (weighted) citations, go together with better performance in operational metrics. This confirms the findings from the first sub-chapter.
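The scoring logic described in 9.2.1 can be sketched as follows. This is our reading of the prose, not the authors' exact equations: category counts are normalized by the stream-specific maxima, weighted 9/3/1, and the preliminary score is then converted into a 0–1 percent-rank-style score where higher is better.

    # Sketch of the citation-based inspection scoring (assumed reading of 9.2.1).
    import numpy as np

    WEIGHTS = {"critical": 9, "major": 3, "minor": 1}

    def preliminary_score(counts, max_counts):
        """Weighted sum of citation counts, each normalized by the maximum
        observed in the same normalization stream (agency x site type)."""
        return sum(WEIGHTS[c] * counts[c] / max_counts[c] for c in WEIGHTS)

    def inspection_score(prelim, population):
        """Percent-rank-style score: the share of inspections in the same
        stream with a higher (worse) preliminary score; higher is better."""
        return float((np.asarray(population) > prelim).mean())

    # FDA citations carry no severity classes, so the preliminary score
    # reduces to n / max, e.g. 3 citations against a sample maximum of 7:
    prelim = 3 / 7
    population = [0 / 7, 1 / 7, 2 / 7, 3 / 7, 4 / 7, 5 / 7, 6 / 7, 7 / 7]  # hypothetical
    print(f"inspection score = {inspection_score(prelim, population):.2f}")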



9.3 Data usage for predictive modelling

Based on the data analyzed before and the corresponding results, the aim of this research is to practically show the benefit of using operational KPIs for the risk evaluation of production sites. This could serve as a proof of concept for a possible extension of the current risk-based site selection model (FDA - Center for Drug Evaluation and Research, 2018).

9.3.1 Research Approach

In order to achieve easily comprehensible results and explainable models, logistic regression models were chosen to be applied to the data. Several models were built while varying the input data provided as predictors. Analogously to Chapter 9.1, a binary target variable was chosen that differentiates inspections with or without citations, neglecting their number and criticality.

As in Chapter 9.2, the dataset contains 149 inspection records. 93 of these records include the operational KPIs. The model performance was evaluated using a ten-fold cross-validation. The high number of segmentations was chosen to maximize the available training data in each iteration; the computational cost of the operation is negligible.

9.3.2 Results

The performance evaluation of the resulting models is displayed in Figure 29. For each model the figure displays the data used as inputs, the ROC (Receiver Operating Characteristic) curve chart and the associated AUC (area under the curve) value.

The ROC calculation requires the model to provide a probability for the predicted outcome event between 0 and 1. The algorithm then tests several probability cut-offs and evaluates the number of true and false predictions based on the probabilities and the varied cut-off values. The area under the curve then provides a simplified measure of the prediction performance between 0.5 and 1. An ideal model would result in a curve that stretches to the top left corner of the chart. Generally, the more the curve deviates from the straight diagonal line toward the top left, the better the model performs. Regarding the performance metric, an AUC value of 0.5 equals a simple guess on a dataset with 50% events.

The graphs show that the performance improves with more predictors used by the models. The strongest gain can be shown when the operational KPIs are included. The highest performance is achieved when the dataset is reduced to those inspection records where operational KPIs are available.

9.3.3 Conclusion

Even though the available datasets are relatively small for a machine learning application, the tested models show notable performance in predicting inspection outcomes. Operational KPIs are significantly different for sites inspected with fewer or more citations. Consequently, these KPIs are suitable to strongly improve models in predicting inspection outcomes.

Figure 29: Performance evaluation of logistic regression models by input data used:
» Inspection + audit history: AUC = 0.54
» Inspection + audit history, production type: AUC = 0.593
» Inspection + audit history, production type, rejected batches, right first time, product complaints: AUC = 0.677
» Inspection + audit history, production type, rejected batches, right first time, product complaints, only data records with KPIs: AUC = 0.715
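For reference, the evaluation set-up of 9.3 can be sketched roughly as below (synthetic stand-in data; our code rather than the study's): a logistic regression on a binary "citations issued" target, with out-of-fold probabilities collected via ten-fold cross-validation and summarized as ROC AUC.

    # Sketch: logistic regression evaluated with ten-fold cross-validated ROC AUC.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_predict
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(42)
    n = 93                                   # records with operational KPIs
    X = rng.normal(size=(n, 5))              # e.g. history, type, KPI predictors
    y = (X[:, 0] + rng.normal(size=n) > 0).astype(int)  # 1 = citations issued

    proba = cross_val_predict(
        LogisticRegression(max_iter=1000), X, y, cv=10, method="predict_proba"
    )[:, 1]
    print(f"10-fold CV ROC AUC = {roc_auc_score(y, proba):.3f}")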



10 SUMMARY



» The research has demonstrated that, as well as a balanced measurement of operational KPIs (taking into account effectiveness and efficiency measures), a systematic evaluation of the level of integration and maturity of enabler implementation should also be included in any targeted discussions with industry or during the inspection process. The fact that some plants can show acceptable operational performance levels without having built up the respective practices (assessed by enabler implementation level), whereas others have sound practices that have not yet been transformed into a better outcome, is proof of the necessity of a comprehensive assessment. Nevertheless, the majority of plants and labs evaluated follow the pattern that higher capabilities (including culture) correlate with higher overall excellence (effectiveness and efficiency). The separate analysis of effectiveness and efficiency is still meaningful from a risk perspective, as it is possible to achieve good excellence scores by over-emphasizing efficiency and neglecting effectiveness.

» The analysis of differences in productivity (batches per FTE, etc.) can point to possible instabilities in production. In some cases, increasing the number of FTEs is a reaction to quality issues. However, concluding that fewer FTEs would be better is wrong. First, the equipment/processes must be stable before considering a reallocation of FTEs (e.g., increasing resources focused on proactive continual improvement, or a reduction).

» To be able to derive the right conclusions, companies need to implement suitable metrics systems which help them to see a holistic picture of the current performance status and maturity of their operations. The metrics system itself, therefore, becomes a major driver for improvements and should be part of any efforts related to audit and inspection readiness. If the metrics system is healthy and integrated into the business processes, it will enable sustainable performance improvements. If the metrics system is not measuring balanced system performance comprehensively, there is a risk of improving isolated KPIs while others, or the overall performance, deteriorate. In other words, if the wrong aspects are being measured, the wrong actions will be taken and the wrong organizational outcomes will result.

» A deeper understanding of the work of the QC labs has now been shown to be crucial from an overall quality performance perspective. Lab performance is better understood as a moderator rather than as an outcome component. The QC lab acts as an important safeguard in the overall value chain. There is still a need to continue data gathering for a better understanding of the relation between the QC labs and PQS effectiveness.

» Operational KPIs can be early predictive indicators for looming quality issues and partly correlate with internal audit as well as external inspection outcomes. That is why using operational data as an additional input in risk calculations is recommended, irrespective of whether the efforts are related to audit / inspection readiness within companies internally or are part of the risk-based inspection scheduling undertaken by regulators.

» Quality Culture could again be shown to separate better from worse plants in terms of Operational Stability. Additionally, the analyses show a direct strong impact of Quality Maturity and Quality Behavior on Operational Stability for a majority of sites. As Quality Maturity drives Quality Behavior, and both are correlated with the St.Gallen Engagement Score as analyzed in year 1, objective measures of this score (e.g. sick leave or turnover rate) could be included as a proxy to get a first hint about the underlying culture.

» Models like the enhanced PPSM introduced in this report can become powerful in driving sustainable improvements, as they foster and rationalize cross-functional discussions (e.g. between Quality and Operations).

» The ICH Q10 architecture appears to be meaningful for assessing the PQS. At the same time, the implementation level of ICH Q10 category-related enablers correlates with the overall OPEX enabler implementation level, based on the available limited data. Thus, a focus on ICH Q10 content in audits/inspections above and beyond the GMP compliance aspects is recommended. Ongoing data gathering, including the newly derived questions (see Chapter 4.1 and Appendix 1), will allow the research team to further investigate the impact of ICH Q10 in the future.

» The work completed under this collaborative research grant leads to numerous potentials for further development:
  – The PPSM provides the potential to objectively demonstrate PQS effectiveness in the sense of ICH Q12, discussing firms' PQSs in the context of managing post-approval CMC changes effectively (ICH, 2017). However, further analyses on extended data sets are required.
  – The combination of operational data with other components of the risk-based site selection model is purposeful. These new combined models have yet to be designed and fully tested.
  – Establishing an evaluation tool for internal metrics systems could be key to moving the industry in the direction of sustainable improvements. Both industry and regulators would benefit from the availability of such a comprehensive assessment tool.



11 REFERENCES



Ahmad, S., Schroeder, R. G., & Sinha, K. K. (2003). The Role of Infrastructure Practices in the Effectiveness of JIT Practices: Implications for Plant Competitiveness. Journal of Engineering and Technology Management, 20(3), 161–191.

Brosius, F. (2013). Lineare Regression. In SPSS 21 (pp. 541–594). Heidelberg: MITP.

Cross, K. F., & Lynch, R. L. (1988). The "SMART" Way to Define and Sustain Success. National Productivity Review, 8(1), 23–33.

Cua, K. O., McKone, K. E., & Schroeder, R. G. (2001). Relationships between Implementation of TQM, JIT, and TPM and Manufacturing Performance. Journal of Operations Management, 19(6), 675–694.

Deming, W. E. (1986). Out of the Crisis: Quality, Productivity and Competitive Position. Cambridge: University Press.

EFQM. (2012). An Overview of the EFQM Excellence Model.

FDA. (2009). Guidance for Industry Q10 Pharmaceutical Quality System.

FDA - Center for Drug Evaluation and Research. (2016). Submission of Quality Metrics Data Guidance for Industry - Draft Guidance. Silver Spring, MD.

FDA - Center for Drug Evaluation and Research. (2018). Program description: Understanding CDER's Risk-Based Site Selection Model. USA.

FDASIA. (2012). Public Law 112–144 (pp. 1–140).

Ferdows, K., & De Meyer, A. (1990). Lasting Improvements in Manufacturing Performance: In Search of a New Theory. Journal of Operations Management, 9(2), 168–184.

Flynn, B. B., Sakakibara, S., & Schroeder, R. G. (1995). Relationship between JIT and TQM: Practices and Performance. The Academy of Management Journal, 38(5), 1325–1360.

Friedli, T., Köhler, S., Buess, P., Basu, P., & Calnan, N. (2017). FDA Quality Metrics Research Final Report.

Friedli, T., Köhler, S., Buess, P., Basu, P., & Calnan, N. (2018). FDA Quality Metrics Research Final Report Year 2.

Friedli, T., Köhler, S., Buess, P., Calnan, N., & Basu, P. (2018). Outlook – The St.Gallen Pharmaceutical Production System Model and its Contribution to the FDA Quality Metrics Initiative. In T. Friedli, P. Basu, N. Calnan, & C. Mänder (Eds.), 21c Quality Management in the Pharmaceutical Industry (pp. 279–283). Aulendorf: ECV.

Ghosh, M. (2012). Lean manufacturing performance in Indian manufacturing plants. Journal of Manufacturing Technology Management.

Gray, J. V., Roth, A. V., & Leiblein, M. J. (2011). Quality risk in offshore manufacturing: Evidence from the pharmaceutical industry. Journal of Operations Management, 29(7–8), 737–752. https://doi.org/10.1016/j.jom.2011.06.004

Gray, J. V., Siemsen, E., & Vasudeva, G. (2015). Colocation Still Matters: Conformance Quality and the Interdependence of R&D and Manufacturing in the Pharmaceutical Industry. Management Science, 61(11), 2760–2781. https://doi.org/10.1287/mnsc.2014.2104

ICH. (2008). ICH Pharmaceutical Quality System Q10.

ICH. (2017). Technical and regulatory considerations for pharmaceutical product lifecycle management Q12.

Kaplan, R. S., & Norton, D. P. (1992). The Balanced Scorecard: Measures that drive Performance. Harvard Business Review, (January–February), 71–79.

Keegan, D. P., Eiler, R. G., & Jones, C. R. (1989). Are your Performance Measures Obsolete? Management Accounting, 70(12), 45–50.

Köhler, S. (2019). Measuring Operational Excellence Performance – A Mixed-methods Conceptualization and Application in Pharmaceutical Quality Control Laboratories. University of St.Gallen.

Köhler, S., Friedli, T., & Basu, P. (2019). Operational Excellence in Pharmaceutical Quality Control Labs: Driver of an Effective Quality System. Journal of Pharmaceutical Innovation, x(x), x–x.

MBNQA. (2017). Baldrige Performance Excellence Framework: 2017–2018 Criteria Commentary.

Meyer, A. D., Tsui, A. S., & Hinings, C. R. (1993). Configurational Approaches to Organizational Analysis. Academy of Management Journal, 36(6), 1175–1195.

Neely, A., Adams, C., & Crowe, P. (2001). The Performance Prism in Practice. Measuring Business Excellence, 5(2), 6–12.

OSHA. (2019). Voluntary Protection Programs.

Patel, P., Baker, D., Burdick, R., Chen, C., Hill, J., Holland, M., & Sawant, A. (2015). Quality Culture Survey Report. PDA Journal of Pharmaceutical Science and Technology, 69(5), 631–642. https://doi.org/10.5731/pdajpst.2015.01078

Seiss, M. (2018). FDA CDER ORA Site Selection Model Improvement Pilot Study.

Shah, R., & Ward, P. T. (2003). Lean Manufacturing: Context, Practice Bundles, and Performance. Journal of Operations Management, 21(2), 129–149.

Shah, R., & Ward, P. T. (2007). Defining and Developing Measures of Lean Production. Journal of Operations Management, 25(4), 785–805.

The Pew Charitable Trusts, & International Society for Pharmaceutical Engineering. (2017). Drug Shortages.

US Congress. (2012). Food and Drug Administration Safety and Innovation Act.

Voss, C., Blackmon, K., Hanson, P., & Oak, B. (1995). The Competitiveness of European Manufacturing – A Four Country Study. Business Strategy Review, 6(1), 1–25.

Zhang, L., Narkhede, B. E., & Chaple, A. P. (2017). Evaluating lean manufacturing barriers: an interpretive process. Journal of Manufacturing Technology Management, 28(8), 1086–1114. https://doi.org/10.1108/JMTM-04-2017-0071



APPENDIX



Appendix 1 ICH Q10 Enabler Questionnaire

The following table shows the ICH Q10 enabler questionnaire clustered along the guideline's categories. Questions classified with a UID starting with N are newly derived; all questions with a UID starting with D are legacy OPEX questionnaire enablers assigned to the ICH Q10 categories.

Management Responsibilities (response scale 1–5)

N01 – To what degree does your site management participate in the design, implementation, monitoring and maintenance of an effective pharmaceutical quality system?
1: Management does not participate in these activities
2: Management rarely participates in these activities
3: Management sometimes participates in these activities
4: Management regularly participates in these activities
5: Management always participates in these activities

N02 – To what degree does your quality policy facilitate continual improvement of the PQS?
1: Quality policy represents a major barrier for continual improvement of the PQS
2: Quality policy sometimes represents a barrier for continual improvement of the PQS
3: Quality policy does not represent a barrier for continual improvement of the PQS
4: Quality policy in part facilitates continual improvement of the PQS
5: Quality policy is designed to facilitate continual improvement of the PQS

N03 – To what degree does your site management determine and provide adequate and appropriate resources (e.g. human, financial) to implement and maintain the PQS and to continually improve its effectiveness?
1: Site management does not determine and provide adequate and appropriate resources
2: Site management rarely determines and provides adequate and appropriate resources
3: Site management sometimes determines and provides adequate and appropriate resources
4: Site management regularly determines and provides adequate and appropriate resources
5: Site management always determines and provides adequate and appropriate resources

N04 – To what degree are performance indicators, which measure progress against quality objectives, established, monitored, communicated regularly and acted upon?
1: No such performance indicators are established
2: Performance indicators are established, but not monitored, communicated or acted upon
3: Performance indicators are established and monitored, but not communicated and acted upon
4: Performance indicators are established, monitored and communicated, but not acted upon
5: Performance indicators are established, monitored, communicated and acted upon

N05 – To what degree has your management established and trained all employees in a quality policy that describes the overall quality-related intentions and direction of your company?
1: Management has not established such a quality policy
2: Management has established, but has not clearly communicated such a quality policy to site employees
3: Management has established and communicated such a quality policy to site employees, but has not yet trained the employees
4: Management has established and communicated such a quality policy to site employees and has started to train its employees
5: Management has established such a quality policy, has communicated it to all site employees and has trained all employees

N06 – To what degree does your management conduct reviews of process performance and product quality on a regular basis?
1: Management does not conduct reviews of process performance and product quality
2: Management rarely conducts reviews of process performance and product quality
3: Management sometimes conducts reviews of process performance and product quality
4: Management regularly conducts reviews of process performance and product quality
5: Management always conducts reviews of process performance and product quality



Knowledge Management (response scale 1–5)

N07 – Is product & process knowledge managed comprehensively along the product lifecycle?
1: Knowledge is not managed systematically
2: Knowledge is managed within departments
3: Knowledge is managed across R&D and production
4: Knowledge is managed across R&D, production and QC
5: Knowledge is managed along the entire product lifecycle, incl. commercial life & discontinuation

N08 – To what degree are knowledge management processes defined?
1: Processes are not defined
2: Processes have been defined only partially
3: Processes have been defined
4: Processes have been jointly defined to a limited extent
5: Processes are defined and well-known across the organization

N09 – Does your approach for managing manufacturing process knowledge cover the following tasks: acquire, analyse, store, disseminate knowledge?
1: None of the tasks covered
2: At least one of the tasks covered
3: At least two of the tasks covered
4: At least three of the tasks covered
5: All four tasks covered

N10 – How many of the following knowledge sources do you use systematically: development studies, technology transfer activities, process validation studies, manufacturing experience?
1: None of the sources used systematically
2: One of the sources used systematically
3: At least two of the sources used systematically
4: At least three of the sources used systematically
5: All four sources used systematically

N11 – Please describe the eagerness of employees to share their knowledge & expertise with colleagues.
1: Employees protect their knowledge
2: Employees share their knowledge partially if requested
3: Employees share their knowledge comprehensively if requested
4: Employees share their knowledge proactively
5: Employees share their knowledge and ask colleagues proactively for feedback

N12 – Is there a process to train your employees in the importance and relevance of your knowledge management process?
1: There is no formal process to train employees in knowledge management
2: There is a process to train employees in knowledge management, but it is not applied in practice
3: There is a process to train employees in knowledge management, but it is rarely applied in practice
4: There is a process to train employees in knowledge management; it is applied to new employees
5: There is a process to train employees in knowledge management; it is a non-mandatory part of the routine training schedule



CAPA System

N13: Do you have a process to ensure that information regarding nonconforming product, quality problems and appropriate CAPAs is properly disseminated, including dissemination for management review?
1. There is no process to disseminate this information.
2. There is a process to disseminate this information. It is not used in practice.
3. There is a process to disseminate this information. It is rarely used in practice.
4. There is a process to disseminate this information. It is used in practice.
5. There is a process to disseminate this information. It is used in practice and continuously reviewed.

N14: Please describe your current CAPA documentation process.
1. There is no formal process for documenting CAPA.
2. A formal process for documenting CAPA exists.
3. The process for documenting CAPA is applied, but may have inconsistencies in execution from record to record.
4. The process for documenting CAPA is executed consistently from record to record.
5. The process for documenting CAPA results in robust documentation of all CAPA records.

N15: To what degree do you distinguish corrections, corrective actions & preventive actions?
1. There is no SOP and/or training that outlines the differences between corrections, corrective & preventive actions.
2. Introducing an SOP/training outlining the differences is planned.
3. An SOP/training outlining the differences is in place.
4. An SOP/training outlining the differences is in place & mandatory for operators.
5. An SOP/training outlining the differences is in place, mandatory for operators & frequently updated.

N16: Do you use metrics to measure CAPA effectiveness?
1. There are no measures in place to identify the overall effectiveness/health of the CAPA quality system.
2. We have minimal measures in place & targets are defined.
3. We have defined measures in place & targets are defined.
4. We have defined measures in place, targets are defined & data is partly available.
5. We have defined measures in place, targets are defined, data is available & proactively reviewed by management.

N17: Do you ensure that corrective and preventive actions were verified or validated prior to implementation?
1. CAPAs are not verified or validated prior to implementation.
2. Critical CAPAs are sometimes verified or validated prior to implementation.
3. Critical CAPAs are always verified or validated prior to implementation.
4. All CAPAs are verified or validated prior to implementation.
5. All CAPAs are verified or validated prior to implementation. There is a standard defining how verification/validation has to be performed.

N18: Do you follow a formal SOP on how to implement a new CAPA process and are your employees trained on this process?
1. There is no formal SOP on how to implement a new CAPA process.
2. There is a formal SOP on how to implement a new CAPA process. It is rarely applied in practice.
3. There is a formal SOP on how to implement a new CAPA process. It is applied to new CAPAs with special attention.
4. There is a formal SOP on how to implement a new CAPA process. It is applied to all new CAPAs.
5. There is a formal SOP on how to implement a new CAPA process. The SOP is applied to all new CAPAs and continuously reviewed.
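Question N16 implies that mature sites quantify the health of their CAPA system rather than merely documenting it. As a purely illustrative sketch (not part of the questionnaire or the study's method), the Python snippet below shows one way a site might roll individual CAPA records up into effectiveness indicators such as on-time closure, pre-implementation verification (cf. N17) and recurrence; all names (`CapaRecord`, `capa_effectiveness`, the chosen metrics) are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class CapaRecord:
    """Hypothetical CAPA record; the fields are illustrative, not prescriptive."""
    opened: date
    due: date
    closed: Optional[date]  # None while the CAPA is still open
    verified: bool          # verified/validated before implementation (cf. N17)
    recurred: bool          # same root cause reappeared after closure

def capa_effectiveness(records: list) -> dict:
    """Aggregate closed CAPA records into simple effectiveness metrics (cf. N16)."""
    closed = [r for r in records if r.closed is not None]
    n = len(closed)
    if n == 0:
        return {"on_time_closure": None, "verification_rate": None, "recurrence_rate": None}
    return {
        "on_time_closure": sum(r.closed <= r.due for r in closed) / n,
        "verification_rate": sum(r.verified for r in closed) / n,
        "recurrence_rate": sum(r.recurred for r in closed) / n,
    }

# Illustrative use with two made-up records
records = [
    CapaRecord(date(2019, 1, 7), date(2019, 3, 1), date(2019, 2, 20), True, False),
    CapaRecord(date(2019, 2, 4), date(2019, 4, 1), date(2019, 4, 10), False, True),
]
print(capa_effectiveness(records))  # e.g. {'on_time_closure': 0.5, ...}
```

Metrics of this kind, with defined targets and proactive management review, correspond to the highest maturity level of N16.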



Quality Risk Management

N19: Do you operate cross-functional teams dedicated to analysing risks, with qualified experts from the following departments: Site Management, Quality, Business Development, Engineering, Regulatory Affairs, Operations & Legal?
1. There are no cross-functional teams.
2. At least two of the departments are involved.
3. At least three of the departments are involved.
4. At least four of the departments are involved.
5. All five departments are involved.

N20: Are you aware of existing risks, do you review existing risks on a regular basis, and do you prioritize & communicate them proactively?
1. We do not document existing risks.
2. We document existing risks if we come across them.
3. We follow a structured approach to identify & document risks.
4. We follow a structured approach to identify risks, document them and communicate them.
5. We follow a structured approach to identify risks, document them and communicate them across the organization. All documented risks are reviewed periodically.

N21: Is your quality risk management process integrated with the overall Quality System and do you assess the effectiveness of the QRM?
1. We do not have a formal QRM process. We do not measure its effectiveness.
2. The QRM process is not integrated with the overall Quality Management System. Its effectiveness is not measured.
3. The QRM process is not integrated with the overall Quality Management System, or its effectiveness is not measured.
4. The QRM process is integrated with the overall Quality Management System and its effectiveness is measured.
5. The QRM process is integrated with the overall QMS, its effectiveness is measured, and the interaction with the QMS is continuously reviewed.

N22: Do you use a set of methods & tools dedicated to assess & manage risks (e.g. Fault Tree Analysis, Hazard Operability Analysis, etc.)? If so, is there a systematic & logical approach to select the methods & tools?
1. We do not use tools dedicated to assessing & managing risks.
2. We use one tool to assess & manage all types of risks.
3. We use a set of tools to assess & manage different types of risks.
4. We use a set of tools to assess & manage different types of risks. There is a structured approach to summarize and harmonize all tools' outcomes.
5. We use a set of tools to assess & manage different types of risks. All tools' outcomes are summarized. We strive to develop further tools on our own.

N23: Do you have an SOP to conduct risk assessments and are your employees trained on this SOP? Are risk management procedures for each area of application clearly defined?
1. We do not have an SOP to conduct risk assessments.
2. We have an appropriate SOP. However, procedures are not defined for each area or employees are not trained dedicatedly.
3. We have an appropriate SOP. However, our procedures are not defined for each area and employees are not trained dedicatedly.
4. We have an SOP to conduct risk assessments. Procedures are defined for each area and employees are trained dedicated to the SOP.
5. We continuously review our SOP to conduct risk assessments. Procedures are defined for each area and employees are trained dedicatedly.

N24: Are the risk acceptance criteria adequate for the specific situations in question, and are the risk-based decision(s) well-informed and data-based?
1. We do not distinguish acceptance criteria with regard to specific situations.
2. We distinguish tailored acceptance criteria with regard to specific situations.
3. We distinguish tailored acceptance criteria. Risk-based decision(s) are well informed, involving dedicated experts in decision making.
4. We distinguish tailored acceptance criteria. Risk-based decision(s) are always data based. Dedicated experts are involved in decision making.
5. We continuously update tailored acceptance criteria. Risk-based decision(s) are always data based. Dedicated experts are involved in decision making.
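The tool-based risk assessment of N22 and the tailored, data-based acceptance criteria of N24 can be made concrete with an FMEA-style risk priority number, a common QRM technique related to, but not named in, the questionnaire. The sketch below is illustrative only; the 1-10 scales, the thresholds and the names (`Risk`, `rpn`, `accept`) are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Risk:
    """Hypothetical risk entry; 1-10 scales are assumed, not prescribed."""
    name: str
    severity: int    # impact on product quality / patient safety
    occurrence: int  # likelihood of the failure mode
    detection: int   # 10 = practically undetectable before release

def rpn(risk: Risk) -> int:
    """FMEA-style Risk Priority Number: severity x occurrence x detection."""
    return risk.severity * risk.occurrence * risk.detection

def accept(risk: Risk, threshold: int) -> bool:
    """Data-based acceptance decision against a situation-specific threshold
    (cf. N24); the threshold itself would be set by dedicated experts."""
    return rpn(risk) <= threshold

# Illustrative use: tailored acceptance criteria per situation
sterile = Risk("filter integrity failure", severity=9, occurrence=2, detection=4)
packaging = Risk("label smudging", severity=3, occurrence=4, detection=2)
print(accept(sterile, threshold=60))     # False -> mitigate before acceptance
print(accept(packaging, threshold=100))  # True
```

Using different thresholds for the sterile-filling risk and the packaging risk mirrors the "tailored acceptance criteria" that distinguish the higher maturity levels of N24.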



Environment, Health & Safety

N25: Do you have an approved SOP for incident investigation and management? Are the employees trained on the procedure?
1. We do not have an approved SOP for incident investigation and management.
2. We have an approved SOP for incident investigation and management, but it is rarely applied in practice.
3. We have an approved SOP for incident investigation and management.
4. We have an approved SOP for incident investigation and management. The SOP is reviewed & updated on a regular basis.
5. We have an approved SOP for incident investigation and management. It is reviewed & updated continuously, and employees are trained on it.

N26: Do you have defined procedures for waste disposal and hazardous handling?
1. We do not have defined procedures for waste disposal and hazardous handling.
2. We have defined procedures for waste disposal and hazardous handling, but they are rarely used in practice.
3. We have defined procedures for waste disposal and hazardous handling.
4. We have defined procedures for waste disposal and hazardous handling and measure their effectiveness.
5. We continuously review our procedures for waste disposal and hazardous handling and measure their effectiveness.

N27: Do you have a robust training process for employees on the EH&S procedures?
1. There is no formal process to train employees on EH&S procedures.
2. There is a process to train employees on EH&S procedures, but it is rarely applied in practice.
3. There is a formal process to train employees on EH&S.
4. There is a formal process to train employees on EH&S. It is robust and applied to all employees.
5. There is a formal process to train employees on EH&S. It is robust, applied to all employees and reviewed/updated continuously.

N28: Do you have an effective accident reporting and recording system?
1. We do not have an accident reporting and recording system.
2. We have an accident reporting and recording system, but it is rarely used in practice.
3. We have an accident reporting and recording system.
4. We have an accident reporting and recording system and measure the system's effectiveness.
5. We have an accident reporting and recording system, measure the system's effectiveness and review/update the system continuously.

D15: To what degree do employees strive to keep the site neat and clean?
1. Housekeeping is not a core part of our site culture.
2. Housekeeping is a small part of our site culture.
3. Housekeeping is a reasonable part of our site culture and a part of our training program.
4. Housekeeping is a significant part of our site culture and training program.
5. Housekeeping is a core part of our site culture, and we perform regular audits.

D17: To what degree are housekeeping checklists used to continuously monitor the condition and cleanliness of your machines and equipment?
1. We do not have housekeeping checklists.
2. Housekeeping checklists exist but are not widely visible.
3. Housekeeping checklists exist and are visible, but are adhered to unevenly.
4. Housekeeping checklists are adhered to across the site.
5. Regularly updated housekeeping checklists are adhered to across the site.

D87: To what degree do your employees continuously strive to reduce waste in processes (e.g. waste of time, waste of production space, etc.)?
1. Waste reduction is not a core activity.
2. Waste reduction is a focus for specialised employees only.
3. Shop floor employees strive to reduce process waste in some of the plant.
4. Shop floor employees strive to reduce process waste in most of the plant.
5. Shop floor employees strive to reduce all kinds of waste in every process.



Process Performance & Product Quality Monitoring System

N29: To what degree do you analyse parameters and attributes identified in the control strategy to verify continued operation within a state of control?
1. We do not analyse these parameters and attributes to verify the state of control.
2. We rarely analyse these parameters and attributes to verify the state of control.
3. We sometimes analyse these parameters and attributes to verify the state of control.
4. We regularly analyse these parameters and attributes to verify the state of control.
5. We continuously analyse these parameters and attributes to verify the state of control.

N30: To what degree do you identify sources of variation affecting process performance and product quality?
1. We do not identify these sources of variation.
2. We rarely identify these sources of variation.
3. We sometimes identify these sources of variation.
4. We regularly identify these sources of variation.
5. We continuously identify these sources of variation.

N31: To what degree does your process performance and product quality monitoring system include feedback on product quality from both internal and external sources?
1. Our monitoring system includes feedback on product quality neither from internal nor from external sources.
2. Our monitoring system includes feedback on product quality from internal but not from external sources.
3. Our monitoring system includes feedback on product quality from internal and, in rare instances, from external sources.
4. Our monitoring system includes feedback on product quality from internal and sometimes from external sources.
5. Our monitoring system always includes feedback on product quality from both internal and, frequently, external sources.

N32: To what degree does your process performance and product quality monitoring system provide knowledge to enhance process understanding?
1. Our monitoring system does not provide knowledge to enhance process understanding.
2. Our monitoring system rarely provides knowledge to enhance process understanding.
3. Our monitoring system sometimes provides knowledge to enhance process understanding.
4. Our monitoring system regularly provides knowledge to enhance process understanding.
5. Our monitoring system continuously provides knowledge to enhance process understanding.

N33: To what degree does your process performance and product quality monitoring system ensure a state of control is maintained and areas for continual improvement are identified?
1. Our monitoring system is neither suited to ensure a state of control nor to identify areas for continual improvement.
2. Our monitoring system is mostly suited to ensure a state of control, but not to identify areas for continual improvement.
3. Our monitoring system is suited to ensure a state of control, but not to identify areas for continual improvement.
4. Our monitoring system is suited to ensure a state of control and, to some degree, to identify areas for continual improvement.
5. Our monitoring system is suited to ensure a state of control and to identify areas for continual improvement in real time.

N34: To what degree do you monitor, measure and control the process quality of outsourcing activities?
1. We do not monitor, measure and control the process quality of outsourcing activities.
2. We rarely monitor, measure and control the process quality of outsourcing activities.
3. We sometimes monitor, measure and control the process quality of outsourcing activities.
4. We regularly monitor, measure and control the process quality of outsourcing activities.
5. We continuously monitor, measure and control the process quality of outsourcing activities.
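Verifying "continued operation within a state of control" (N29) is commonly operationalized with control charts. The sketch below, a minimal Shewhart-style individuals chart in Python, is offered only as an illustration of that idea; the 3-sigma convention, the made-up assay values and the function names are assumptions, not part of the questionnaire.

```python
import statistics

def control_limits(history: list) -> tuple:
    """Centre line and +/-3 sigma limits from historical in-control data
    (the conventional Shewhart choice, assumed here, not prescribed)."""
    centre = statistics.fmean(history)
    sigma = statistics.stdev(history)
    return centre - 3 * sigma, centre, centre + 3 * sigma

def out_of_control(value: float, lcl: float, ucl: float) -> bool:
    """Flag an individual result outside the limits for investigation,
    i.e. a potential new source of variation (cf. N30)."""
    return not (lcl <= value <= ucl)

# Illustrative use with made-up assay results (% of label claim)
history = [99.8, 100.1, 100.3, 99.9, 100.0, 100.2, 99.7, 100.1]
lcl, centre, ucl = control_limits(history)
print(out_of_control(101.9, lcl, ucl))  # True -> investigate
```

Routinely reviewing such charts for every parameter in the control strategy corresponds to the "continuously analyse" end of the N29 scale, and each flagged point feeds the identification of sources of variation asked about in N30.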



Change Management System

N35: To what degree do you utilize quality risk management to evaluate proposed changes?
1. We do not utilize quality risk management to evaluate proposed changes.
2. We sometimes utilize quality risk management, but not with an appropriate level of effort and formality to evaluate proposed changes.
3. We utilize quality risk management with an appropriate level of effort and formality to evaluate proposed changes.
4. We regularly utilize quality risk management with an appropriate level of effort and formality to evaluate proposed changes.
5. We always utilize quality risk management with an appropriate level of effort and formality to evaluate proposed changes.

N36: To what degree are proposed changes evaluated by expert teams from relevant areas (e.g., Pharmaceutical Development, Manufacturing, Quality, Regulatory Affairs and Medical) to ensure the change is technically justified?
1. Proposed changes are not evaluated by expert teams that ensure the change is technically justified.
2. Proposed changes are rarely evaluated by expert teams that ensure the change is technically justified.
3. Proposed changes are sometimes evaluated by expert teams that ensure the change is technically justified.
4. Proposed changes are regularly evaluated by expert teams that ensure the change is technically justified.
5. Proposed changes are always evaluated by expert teams that ensure the change is technically justified.

N37: To what degree does Management evaluate whether the change objectives were achieved?
1. We do not evaluate whether the change objectives were achieved.
2. We rarely evaluate whether the change objectives were achieved.
3. We sometimes evaluate whether the change objectives were achieved.
4. We regularly evaluate whether the change objectives were achieved.
5. We always evaluate whether the change objectives were achieved.

N38: To what degree do you evaluate that there was no deleterious impact on quality across the entire product lifecycle after implementation of changes?
1. We do not evaluate the impact of changes on product quality.
2. We do evaluate the impact of changes on product quality, but do not have a formalized risk-based change management system in place.
3. We evaluate the impact of changes based on a formalized, but not risk-based, change management system.
4. We evaluate the impact of changes based on a formalized change management system that includes product quality risks.
5. We evaluate the impact of changes based on a formalized change management system that includes all product and process related risks.

N39: Do you have SOPs for proposing, evaluating, approving, reviewing and managing change control, and are your employees trained on these SOPs?
1. We do not have any of these SOPs.
2. We have some of these SOPs, but only a few of the involved employees are trained on them.
3. We have some of these SOPs, but not all of the involved employees are trained on them.
4. We have all of these SOPs, but not all involved employees are trained on them.
5. We have all of these SOPs and all involved employees are trained on them.

N40: How often is the quality department actively involved in the change management process?
1. The quality department is not actively involved in the change management process.
2. The quality department is rarely actively involved in the change management process.
3. The quality department is sometimes actively involved in the change management process.
4. The quality department is usually actively involved in the change management process.
5. The quality department is always actively involved in the change management process.
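Taken together, N35, N36 and N39 describe a gate: a change proceeds only after a quality-risk evaluation and sign-off by the relevant expert functions. The sketch below illustrates that logic only; the record layout, the reviewer set and the threshold (`ChangeRequest`, `REQUIRED_REVIEWERS`, `ready_to_implement`) are all hypothetical, not the questionnaire's prescription.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumed minimal reviewer set; a real SOP would define this per change type.
REQUIRED_REVIEWERS = {"Quality", "Manufacturing", "Regulatory Affairs"}

@dataclass
class ChangeRequest:
    """Hypothetical change-control record; names are illustrative."""
    title: str
    risk_score: int                        # output of the QRM evaluation (cf. N35)
    approvals: set = field(default_factory=set)
    objectives_met: Optional[bool] = None  # post-implementation review (cf. N37)

def ready_to_implement(cr: ChangeRequest, risk_threshold: int = 100) -> bool:
    """A change proceeds only if its risk score is acceptable and every
    required expert function has approved it (cf. N36)."""
    return cr.risk_score <= risk_threshold and REQUIRED_REVIEWERS <= cr.approvals

# Illustrative use
cr = ChangeRequest("new tablet press", risk_score=48,
                   approvals={"Quality", "Manufacturing"})
print(ready_to_implement(cr))  # False: Regulatory Affairs has not approved yet
```

Recording `objectives_met` after implementation closes the loop that N37 asks about: the change is not merely executed but checked against its stated objectives.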



Management Review

N41: To what degree does management measure the effectiveness of pharmaceutical quality system (PQS) objectives (including process performance and product quality) at your site?
1. Management does not measure the achievement of PQS objectives.
2. Management rarely measures the achievement of PQS objectives.
3. Management sometimes measures the achievement of PQS objectives.
4. Management regularly measures the achievement of PQS objectives.
5. Management continuously measures the achievement of PQS objectives.

N42: To what degree does Management Review at your site include a timely and effective communication and escalation process to raise appropriate quality issues to senior levels of management for review?
1. It does not include a structured approach for timely and effective communication and escalation.
2. It includes a structured approach for timely and effective communication and escalation, but it is not widely visible.
3. It includes a structured approach for timely and effective communication and escalation, which is visible but adhered to unevenly.
4. It includes a structured approach for timely and effective communication and escalation, which is visible and adhered to across most of the site.
5. It includes a structured approach for timely and effective communication and escalation, which is visible and adhered to all across the site.

N43: To what degree does your site management review the results of regulatory inspections and findings, audits and other assessments, and commitments made to regulatory authorities?
1. Site management does not review the results of such inspections.
2. Site management rarely reviews the results of such inspections.
3. Site management sometimes reviews the results of such inspections.
4. Site management regularly reviews the results of such inspections.
5. Site management continuously reviews the results of such inspections.

N44: To what degree does your site management perform periodic reviews that include measures of customer satisfaction, such as product quality complaints and recalls?
1. Periodic reviews do not include measures of customer satisfaction.
2. Periodic reviews rarely include measures of customer satisfaction.
3. Periodic reviews sometimes include measures of customer satisfaction.
4. Periodic reviews usually include measures of customer satisfaction.
5. Periodic reviews always include measures of customer satisfaction.

N45: To what degree does your site management perform periodic reviews that include conclusions of process performance and product quality monitoring?
1. Periodic reviews do not include conclusions of process performance and product quality.
2. Periodic reviews rarely include conclusions of process performance and product quality.
3. Periodic reviews sometimes include conclusions of process performance and product quality.
4. Periodic reviews usually include conclusions of process performance and product quality.
5. Periodic reviews always include conclusions of process performance and product quality.

N46: To what degree is the quality policy at your site reviewed periodically for continuing effectiveness?
1. The quality policy at our site has not been reviewed in the last 5 years for continuing effectiveness.
2. The quality policy has been reviewed within the last 5 years and has since been rarely reviewed.
3. The quality policy has been reviewed within the last 5 years and has since been sometimes reviewed.
4. The quality policy has been reviewed within the last 5 years and has since been regularly reviewed.
5. The quality policy has been reviewed within the last 5 years and has since been continuously reviewed.
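Because every item in these tables is rated on the same 1-5 scale, responses can be rolled up into a maturity profile per PQS element. The sketch below shows one simple, equally weighted aggregation; the equal weighting, the made-up ratings and the name `maturity_profile` are assumptions for illustration, not the study's scoring method.

```python
import statistics

def maturity_profile(responses: dict) -> dict:
    """Average the 1-5 item ratings of each PQS element into a site profile.
    Equal weighting of items is an assumption, not part of the instrument."""
    return {element: round(statistics.fmean(items.values()), 2)
            for element, items in responses.items()}

# Illustrative use with made-up ratings for two elements
site = {
    "CAPA System": {"N13": 4, "N14": 3, "N15": 4, "N16": 2, "N17": 4, "N18": 3},
    "Management Review": {"N41": 3, "N42": 2, "N43": 4, "N44": 3, "N45": 3, "N46": 2},
}
print(maturity_profile(site))  # {'CAPA System': 3.33, 'Management Review': 2.83}
```

A profile of this kind makes the relative strengths and weaknesses across PQS elements visible at a glance, which is precisely the kind of input a management review would act on.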



Appendix 2 Operationalization of operational excellence enabler dimensions and link to quality behavior/maturity for the
Quality Control Labs

The following overview shows each enabler assigned to quality behavior, quality maturity, or neither category; assigned enablers are marked with an "x". As the analysis follows up on the corresponding research in manufacturing (Friedli et al., 2017), only enablers included in both St.Gallen Operational Excellence Questionnaires (Manufacturing and QC Lab) are considered. New enablers that appear only in the lab questionnaire are marked "New in QC lab benchmarking".

Preventive Maintenance
- To what degree is there a formal program for maintaining your lab equipment? (x)
- To what degree are maintenance jobs (e.g. calibration programs) documented, and maintenance plans and checklists posted close to instruments? (x)
- To what degree is potential bottleneck lab equipment identified and supplied with additional spare parts? (x)
- To what degree is the maintenance program continuously optimized based on a dedicated failure analysis? (x)
- To what degree does the maintenance department focus on assisting analysts to perform their own preventive maintenance? (x)
- To what degree are analysts actively involved in the decision-making process when buying new equipment? (x)
- To what degree is your equipment maintained internally vs. externally?
- To what degree is your preventive maintenance effort focused on proactive activities rather than reactive activities? (New in QC lab benchmarking)

Technology Assessment & Usage
- To what degree is the lab situated at the leading edge of new technology?
- To what degree do you screen the market for new production technology and assess new technology concerning its technical and financial benefit?
- To what degree is the lab effectively using new technology?
- To what degree does the lab rely on vendors for its equipment?
- To what degree is proprietary process technology and equipment used to gain a competitive advantage?
- To what degree do you put an emphasis on smart lab system implementation?

Housekeeping
- To what degree do employees strive to keep the lab neat and clean? (x)
- To what degree are tools and consumables put in their place (e.g. usage of a shadow board)? (x)
- To what degree are housekeeping checklists used to continuously monitor the condition and cleanliness of the equipment? (x)
- To what degree do you do a regular review of the "As-Is" situation (e.g. by doing a walkthrough) in your lab in order to identify potential improvement areas (e.g. by doing a gap analysis)? (New in QC lab benchmarking)

Process Management
- To what degree are direct and indirect processes documented? (x)
- To what degree is process quality continually measured using process metrics? (x)
- To what degree are dedicated process owners defined and responsible for planning, managing, and improving their processes? (x)
- What proportion of the equipment in the lab is currently under statistical process control (SPC)? (x)
- To what degree are standardized tools in place for root cause analysis, to get a deeper understanding of the influencing factors (e.g. DMAIC)? (x)



Standardization and Simplification
- To what degree is standardization emphasized as a strategy for continuous improvement of lab processes and equipment? (x)
- To what degree are documented operating procedures used to standardize processes (e.g. set-ups)? (x)
- To what degree are optimized lab operating procedures (e.g. shortened set-ups) documented as best-practice processes and rolled out throughout the whole quality organization? (x)
- To what degree are standardized functional descriptions used to reduce the period of vocational training for new employees?
- To what degree is standardized lab equipment (e.g. standardized design, standardized spare parts) used to achieve a high uptime of the equipment?
- To what degree do you pursue lowering material costs with the help of standardized equipment (e.g. for spare parts) and standardized consumables?

Set-up Reduction
- To what degree do you continuously work to lower set-up and cleaning times in your lab?
- To what degree do analysts practice set-ups to reduce the time required?
- What proportion of equipment set-ups are scheduled so that the testing process is not affected (e.g. to shorten lead time)?
- To what degree are optimized set-up and cleaning procedures documented as best-practices and rolled out throughout the whole lab? (x)

Pull Approach
- Do you use a pull system (Kanban squares, containers or signals) for your consumables?
- To what degree do you test according to forecast?
- To what degree do you have tools installed for a regular demand and FTE capacity analysis? (New in QC lab benchmarking)

Layout Optimization
- To what degree are your processes located close together so that material handling and consumable storage are minimized?
- What proportion of testing substances/products are classified into groups with similar processing requirements to reduce set-up times?
- To what degree does the layout of the lab facilitate low inventories and fast throughput?
- To what degree can your lab layout be characterized as separated into "mini-labs", if testing substances/products have been classified based on their specific requirements?
- To what degree do your testing processes, from incoming testing material to release, involve almost no interruptions, and can they be described as a full continuous flow?
- To what degree do you use "Value Stream Mapping" as a methodology to visualize and optimize processes?

Planning Adherence
- To what degree do you meet your daily lab testing plans?
- To what degree do you know the root causes of variance in your lab working schedule and continuously try to eliminate them?
- To what degree does your lab have flexible working shift models in order to easily adjust labor capacity according to current demand changes?
- Beyond flexible working shifts, do you assign extra resources within the lab for testing during peak loads, or do you outsource activities? (New in QC lab benchmarking)
- To what degree do you prefer to increase productivity over short lead time, or vice versa?



Visual Management
- To what degree do you utilize performance charts to show weekly/monthly/annual performance objectives? (x)
- To what degree do you utilize charts showing the current performance status (e.g. current RFT rate) in your lab? (x)

Management Commitment and Company Culture
- To what degree do the head of quality and management empower employees to continuously improve processes and reduce failure? (x)
- To what degree are the head of quality and management personally involved in improvement projects? (x)
- To what degree does your site have an open communication culture and encourage the flow of information between production and the lab? (x)
- To what degree are problems (e.g. complaints) traced back to their origin to identify root causes? (x)
- To what degree do you align the achievement of quality standards between production and QC/QA (e.g. shared responsibility or primarily the task of QA/QC)? (x)
- To what degree do your employees continuously strive to reduce waste in processes (e.g. waste of time, consumables)? (x)
- To what degree do you prefer improvement programs initiated and promoted by the site lab rather than the global organization, and vice versa? (New in QC lab benchmarking)

Employee Involvement and Continuous Improvement
- To what degree have you implemented tools and methods to deploy a continuous improvement process in your lab? (x)
- To what degree are your analysts involved in writing standard operating procedures (SOPs)? (x)
- To what degree do lab employees actively drive suggestion programs (not exclusively linked to a suggestion system in place)? (x)
- To what degree do your analysts have the authority to correct problems (e.g. with equipment, testing methods) when they occur, without consulting a supervisor? (x)
- To what degree do supervisors focus on assisting analysts to perform their own problem solving? (x)
- To what degree does your site form cross-functional project teams to solve problems in your lab? (x)
- To what degree does your lab follow a vision-based approach to continuous improvement, integrating constraints into the vision, rather than an incremental approach? (New in QC lab benchmarking)
- Does the global quality organization have a lab certification program for best performing labs?

Functional Integration and Qualification
- To what degree do you put emphasis on cross-training analysts to the required level so that they can fill in for others when necessary? (x)
- To what degree is information and skill evaluation from official feedback meetings used in further training? (x)
- To what degree does your site invest in the training and qualification of your lab employees? (x)
- To what degree do your cross-trained analysts rotate on the job, performing different tasks? (x)



