Chemometrics and Intelligent Laboratory Systems 33 (1996) 47-61

Discriminant analysis of high-dimensional data: a comparison of principal components analysis and partial least squares data reduction methods

E.K. Kemsley
Institute of Food Research, Norwich Research Park, Colney, Norwich NR4 7UA, UK

Received 15 June 1995; accepted 29 October 1995

Abstract

Partial least squares (PLS) methods are presented as valuable alternatives to principal components analysis (PCA) for compressing high-dimensional data before performing linear discriminant analysis (LDA). It is shown that using PLS, considerable improvement in class separation and thus discriminant ability can be obtained. In general, fewer of the compressed dimensions are required to give the same level of prediction successes, and for some data sets, PLS methods yield higher prediction success rates than those obtainable using PCA scores. Results are presented for two experimental data sets, comprising mid-infrared spectra of edible oils and plant seeds. The potential dangers of PLS methods are also demonstrated, in particular their ability to introduce apparent groupings into data where there is no inherent class structure.

Keywords: Partial least squares; Principal components analysis; Linear discriminant analysis; Infrared spectroscopy

1. Introduction

Infrared, Raman and nuclear magnetic resonance (NMR) spectroscopies are powerful analytical techniques [1], used in research laboratories all over the world to provide qualitative information on the structure and composition of a diverse range of samples. Increasing interest is also being shown in addressing quantitative problems by spectroscopic methods, and in particular, by infrared spectroscopy. In recent years, considerable effort has been expended on exploiting the data compression or reduction methods of principal component analysis (PCA) and partial least squares (PLS). With the rapid expansion in affordable computing power that has taken place over the last decade, PCA and PLS have now moved from the chemometrician's development package to the spectroscopist's instrument-driver software. Increasingly, large data sets and complex applications are being treated by these methods. Since most infrared spectral data is high-dimensional, with a single spectrum containing several hundred or even several thousand variables, it is perhaps not surprising that compression methods have quickly become established as valuable tools for spectroscopic data analysis.

PCA is a well-known technique of multivariate analysis. It was first proposed in 1901 by Pearson [2], and developed independently some years later by Hotelling [3], but in common with many multivariate methods, was not widely used until the arrival of modern computing technology. Today, however, it is available in virtually every statistical computer package. The main goal of PCA is to reduce the dimensionality of a data set in which there are a large number of intercorrelated variables, whilst retaining as much as possible of the information present in the original data.
This reduction is achieved by a linear transformation to a new set of variables, the principal component (PC) scores, which are uncorrelated, and ordered such that the first few retain most of the variation present in all of the original variables. A subset comprising only a few of the transformed variables may then be used in further procedures of comparatively reduced complexity. PCA has already been used in conjunction with a range of discriminant analysis techniques to tackle classification problems [4-7]; for example, a popular approach is to use a subset of scores as variables in a linear discriminant analysis (LDA). An alternative strategy is to carry out principal component regression (PCR) on a dummy variable or variables that indicate class membership; however, this approach has been less widely adopted in the quantitative spectroscopy field, and thus the discussion in this paper is restricted to data compression followed by traditional LDA.

PLS is a generic term for a family of related multivariate modelling methods, derived from the concepts of iterative fitting developed by Wold around a decade ago [8]. These ideas arose as pragmatic solutions to some of the problems that are encountered when conventional maximum likelihood methods are applied to large, intercorrelated data sets. In its basic regression form, PLS models the relationship between two data sets using a series of local least-squares fits. Like PCA, it can be viewed as an axis rotation method, and there are many similarities between the two techniques. PLS regression has been used extensively by chemometricians to tackle calibration applications in the physical sciences, with considerable success [9-11].

In this paper, PLS methods are presented as alternatives to PCA for data reduction prior to LDA. Although in the past PLS has been used mostly for calibration, it is possible to stop short of the regression step, and use a suitably modified algorithm for data compression only. The transformed variables can then be used in an LDA. This procedure has been applied to data sets comprising mid-infrared spectra of extra virgin and refined olive oils, and of plant seeds in three different categories. It is shown that considerable improvement in class separation and discriminant ability can be obtained using PLS data reduction methods.

2. Methods

2.1. PCA and PLS data reductions

An infrared spectrum comprises measurements of absorbance at d different wavelengths, where d is typically several hundred. An experiment usually involves collecting n such spectra, and in practice it is almost always the case that n < d. The two reduction methods differ in the information they use: PCA operates on the (n x d) matrix X alone, whereas a PLS reduction makes use of one or more dependent variables. For a single dependent variable, an n-element y-vector is used (the PLS1 case); for s > 1 dependent variables, an (n x s) matrix Y is required. Again, X and y or Y are generally mean-centred as a first step. Various algorithms exist that provide related but different definitions of PLS. The original and computationally most simple algorithm, termed orthogonalised PLS1 for one y-variable, was devised by Wold et al. in 1983 [13]. An alternative definition, known as non-orthogonalised PLS, was developed by Martens et al. in 1987 [14]. In terms of modelling the dependent variable, the two methods are equivalent, yielding the same regression equation between y and X. However, when viewed as data rotation methods, there are some differences. In both formulations, vectors w_i are chosen such that the covariance of each score z_i = X_i w_i with y_i is maximised, where X_i and y_i are the residual variability in X and y at the beginning of the ith pass through the algorithm.
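The two reductions described in this section can be sketched in a few lines of code. The following is a minimal illustration only, assuming NumPy; the function names are illustrative, and the PLS routine follows the orthogonalised (deflation-based) PLS1 formulation outlined above rather than reproducing the author's original implementation.

```python
import numpy as np

def pca_scores(X, n_components):
    """PCA reduction: project mean-centred X onto its leading principal axes."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the mean-centred data are the PC loadings
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T               # (n x n_components) score matrix

def pls1_scores(X, y, n_components):
    """Orthogonalised PLS1 reduction: each weight vector w_i maximises the
    covariance of the score z_i = X_i w_i with the current residual y_i."""
    Xi = X - X.mean(axis=0)
    yi = y - y.mean()
    scores = []
    for _ in range(n_components):
        w = Xi.T @ yi                             # direction of maximal covariance with y_i
        w /= np.linalg.norm(w)                    # set |w| = 1
        z = Xi @ w                                # score for this pass
        b = Xi.T @ z / (z @ z)                    # estimated loading b_i
        Xi = Xi - np.outer(z, b)                  # deflate X so later scores are uncorrelated
        yi = yi - z * (yi @ z) / (z @ z)          # deflate y
        scores.append(z)
    return np.column_stack(scores)                # (n x n_components) score matrix
```

Either score matrix can then be truncated to its first r columns and passed to the discriminant step described in Section 2.2.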
Up to i = min(n - 1, d) scores can be calculated. The first vector w_1 and score z_1 are equivalent in both methods; thereafter, the algorithms differ. In non-orthogonalised PLS, all vectors w_i are in fact orthogonal, and the linear transformation matrix W = P describes a rigid rotation. However, the scores obtained are not necessarily orthogonal. This definition of PLS is conceptually easy to understand: it is a rigid rotation of the original co-ordinate system, such that scores along the transformed axes have successively maximised covariance with y. In contrast, the orthogonalised PLS algorithm ensures that the scores obtained are uncorrelated, by using an alternative method of expressing the residual variability in X at each stage. This requires a set of additional vectors b_i, termed the estimated loadings, onto which the projections of X_i have maximum covariance with the scores z_i. The linear transformation matrix is given by P = W(B^T W)^{-1}, and in general, its columns are not orthogonal, so that this form of PLS does not describe a rigid rotation.

2.2. PCA and PLS in LDA

In this paper, PCA and PLS scores are used as variables in LDA. It is shown that PLS reductions, when performed with a y-vector or Y-matrix filled with suitable dummy variables, give improved discriminant ability in comparison with the use of PCA scores. The origin of this improvement is believed to be that PLS reductions yield scores that maximise the between-groups variance, as will now be shown.

Suppose the observations in the matrix X can be assigned to one of g = 2 groups, and are arranged such that the first n_1 rows belong to group 1, and the subsequent n_2 rows to group 2. A y-vector can be constructed containing n_1 entries of n_2/n, and n_2 entries of -n_1/n; this vector has column mean zero. The covariance of z_1 and y can be written:

$$ \mathrm{cov}(y, z_1) = \frac{y^T z_1}{n-1} = \frac{y^T X w_1}{n-1} $$

This quantity is clearly maximised when

$$ w_1 = \frac{X^T y}{\left( y^T X X^T y \right)^{1/2}} \qquad (1) $$

where the denominator is used to set |w_1| = 1. This vector defines the first PLS loading. Now consider the between-groups variance of the elements of z_1, defined as:

$$ \frac{1}{g-1} \sum_{j=1}^{g} n_j \left( \bar{z}_1^{(j)} - \bar{z}_1 \right)^2 \qquad (2) $$

where \bar{z}_1^{(j)} denotes the mean of the entries in z_1 assigned to group j. Since X is mean-centred, \bar{z}_1 = 0; furthermore, it follows that:

$$ y^T z_1 = \sum_{i=1}^{n} y_i z_{1i} = \frac{n_2}{n} \sum_{i \in \mathrm{group\,1}} z_{1i} - \frac{n_1}{n} \sum_{i \in \mathrm{group\,2}} z_{1i} \qquad (3) $$

where z_{1i} denotes the ith entry in z_1. Since \bar{z}_1 = 0, we can write n_1 \bar{z}_1^{(1)} = -n_2 \bar{z}_1^{(2)}, so that Eq. (3) becomes:

$$ y^T z_1 = n_1 \bar{z}_1^{(1)} = -n_2 \bar{z}_1^{(2)} $$

The between-groups variance (Eq. (2) with g = 2 and \bar{z}_1 = 0) can thus be written:

$$ n_1 \left( \bar{z}_1^{(1)} \right)^2 + n_2 \left( \bar{z}_1^{(2)} \right)^2 = \frac{n}{n_1 n_2} \left( y^T z_1 \right)^2 = \frac{n}{n_1 n_2} \left( y^T X w_1 \right)^2 $$

Clearly, when w_1 corresponds to the first PLS loading defined by Eq. (1), the between-groups variance will also be maximised. This indicates that the first PLS score represents the best single dimension for separating the two classes. Indeed, it is found that the between-groups variances of subsequently calculated scores are equal to zero, implying perhaps that these will be of minimal use for discrimination; however, as will be seen from the experimental data, this is not always the case, and sometimes their inclusion can improve the prediction success rate. This can be understood by considering that when LDA is applied to multiple scores, it is the multivariate analogues of the between-groups and within-groups variances, that is, the between-groups and within-groups covariance matrices, that influence the success of a model [15]; these are affected even by scores for which the univariate between-groups variance defined by Eq. (2) is zero.
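The two-group construction just described is easy to check numerically. The sketch below is illustrative only: it assumes NumPy, uses random data as a stand-in for spectra, builds the dummy y-vector with n_1 entries of n_2/n and n_2 entries of -n_1/n, computes the first PLS loading of Eq. (1), and compares the between-groups variance of the resulting score with that of the first PC score of the same data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a spectral matrix: n observations in two groups, d variables
n1, n2, d = 30, 20, 200
n = n1 + n2
X = rng.normal(size=(n, d))
X[:n1] += 0.3 * rng.normal(size=d)        # small group-1 offset so some class structure exists
X = X - X.mean(axis=0)                    # mean-centre

# Dummy y-vector: n1 entries of n2/n followed by n2 entries of -n1/n (column mean zero)
y = np.concatenate([np.full(n1, n2 / n), np.full(n2, -n1 / n)])

# First PLS loading, Eq. (1): w1 = X^T y / sqrt(y^T X X^T y), so that |w1| = 1
w1 = X.T @ y
w1 /= np.sqrt(y @ X @ X.T @ y)
z1 = X @ w1                               # first PLS score

def between_groups_variance(z):
    """Eq. (2) for g = 2 groups and mean-centred scores."""
    return n1 * z[:n1].mean() ** 2 + n2 * z[n1:].mean() ** 2

# For comparison, the first PC score of the same mean-centred data
pc1 = X @ np.linalg.svd(X, full_matrices=False)[2][0]

print(between_groups_variance(z1), between_groups_variance(pc1))
```

Because w_1 maximises (y^T X w)^2 over all unit vectors, the between-groups variance printed for z_1 is never smaller than that of the first PC score.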
When there are g > 2 classes, the matrix Y can be constructed from g columns of binary variables, which are then mean-centred as a preprocessing step. (Conventionally, g - 1 binary variables are used to indicate group membership, but some of the associated algebra is simplified if Y is of order (n x g), which will be assumed throughout this work.) In such cases, a PLS2 algorithm for multiple dependent variables is required. This differs from PLS1 in requiring the introduction of an additional vector u to summarise the residual variability in Y. It is the covariance of u with the scores that is successively maximised. For the first score, the relationship between u and Y can be written [16]:

$$ u = \frac{Y q}{q^T q} \qquad (4) $$

where

$$ q = \frac{Y^T z_1}{z_1^T z_1} \qquad (5) $$

Combining Eqs. (4) and (5) leads to:

$$ u = \frac{Y Y^T z_1}{c} $$

where c is a scalar given by:

$$ c = \frac{z_1^T Y Y^T z_1}{z_1^T z_1} $$

The covariance of u and z_1 can thus be written:

$$ \mathrm{cov}(u, z_1) = \frac{u^T z_1}{n-1} = \frac{z_1^T Y Y^T z_1}{c(n-1)} = \frac{w_1^T X^T Y Y^T X w_1}{c(n-1)} \qquad (6) $$

Maximisation of this quantity is brought about by an iterative procedure; a full description of the algorithm can be found in the text by Martens and Naes [14].

With the observations in X mean-centred and arranged groupwise as above, the between-groups variance of the first score can be written:

$$ \frac{z_1^T Y Q Y^T z_1}{g-1} = \frac{w_1^T X^T Y Q Y^T X w_1}{g-1} \qquad (7) $$

in which Q is a diagonal matrix with entries 1/n_1, ..., 1/n_g, where n_1, ..., n_g are the numbers of observations in each of the j = 1, ..., g groups. By examination of Eqs. (6) and (7), it is clear that maximising the covariance of z_1 and u simultaneously maximises the between-groups variance. Moreover, it is found that the first g - 1 scores have substantially non-zero between-groups variance; this is consistent with what one might expect, since to characterise the separations between g groups, g - 1 dimensions are required.

In this paper, the PCA and PLS data transformations described above are applied to two sets of experimental data. The scores obtained have been used in LDA. The procedure is summarised as follows:

(i) Preprocessing. Each observation is assigned to a class. An appropriate y-vector (or matrix) of dummy variables is constructed. The X- and y-data are mean-centred, and the X-data variance-scaled.

(ii) Data reduction. PCA or PLS data reduction is performed to yield a scores matrix Z. A subset of r of these scores is retained in a reduced matrix Z_r, of order (n x r), and the remainder discarded.

(iii) LDA applied to training set. The class mean scores are calculated. The Mahalanobis D² distance [17] of each observation's scores from each group mean is computed, and the observation re-assigned to the nearest group mean. The percentage correctly re-classified is examined. (The n observations and g group means are represented by (1 x r) row vectors. The Mahalanobis D² between the jth observation z_j and the kth group mean \bar{z}^{(k)} is given by (z_j - \bar{z}^{(k)}) S^{-1} (z_j - \bar{z}^{(k)})^T, where S is the pooled within-groups covariance matrix of the retained scores.)
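A compact sketch of steps (i) and (iii) for g groups is given below, again assuming NumPy. The helper names are hypothetical, the class labels are assumed to be an integer array with values 0 to g - 1, and the reduced score matrix Zr (step (ii)) could be the first r columns of either the PCA or the PLS score matrix computed from the mean-centred, variance-scaled X. The re-classification uses the pooled within-groups covariance matrix of the retained scores, in line with the Mahalanobis D² description above.

```python
import numpy as np

def dummy_Y(labels, g):
    """Step (i): (n x g) matrix of binary group indicators, mean-centred column-wise.
    labels: integer NumPy array with values 0..g-1."""
    Y = np.zeros((len(labels), g))
    Y[np.arange(len(labels)), labels] = 1.0
    return Y - Y.mean(axis=0)

def lda_reclassify(Zr, labels, g):
    """Step (iii): re-assign each observation's scores to the nearest group mean
    in the Mahalanobis D^2 sense, and report the percentage correctly re-classified."""
    means = np.vstack([Zr[labels == j].mean(axis=0) for j in range(g)])
    resid = Zr - means[labels]                      # deviations from own group mean
    S = resid.T @ resid / (len(labels) - g)         # pooled within-groups covariance
    S_inv = np.linalg.inv(S)
    d2 = np.array([[(z - m) @ S_inv @ (z - m) for m in means] for z in Zr])
    assigned = d2.argmin(axis=1)
    return assigned, 100.0 * np.mean(assigned == labels)
```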
