
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/24063640

Exploratory data analysis with MATLAB

Article  in  Psychometrika · February 2007


DOI: 10.1007/s11336-005-1362-2 · Source: RePEc


3 authors:

Clintin P Davis-Stober, University of Missouri
Stephen B. Broomell, Carnegie Mellon University
Florian Lorenz, Stroz Friedberg



PSYCHOMETRIKA—VOL. 72, NO. 1, 107–108
MARCH 2007
DOI: 10.1007/s11336-005-1362-2

BOOK REVIEW

W. Martinez and A. Martinez (2005). Exploratory data analysis with MATLAB. Chapman &
Hall/CRC Press. 405+xv pages. US$79.95. ISBN: 1-58488-366-9.

Wendy and Angel Martinez borrow a quote from John Tukey describing exploratory data
analysis as “detective work.” Researchers use exploratory data techniques to describe patterns
within data rather than to test hypotheses. This work can range from dimensionality reduction of
data to various types of scaling and clustering. These approaches are used in countless fields for
gaining insight into the structure of multidimensional data.
The first chapter of EDA with Matlab begins with an important discussion about the philosophy
of EDA and ethical treatment of data sets. After a brief introduction to some basic concepts
of exploratory data analysis, the authors leap into the first half of the volume, which covers pattern
recognition. This segment describes techniques of dimensionality reduction, starting with principal
component analysis, and also covers newer techniques such as self-organizing maps.
Over the course of six chapters, a range of topics including multidimensional scaling, various
clustering techniques, and data smoothing is described in detail. Within this wide scope of topics
the book strikes an excellent balance, providing lessons that are complete yet compact. The
second half of the book is dedicated to data visualization and graphing. In these chapters the
reader is shown graphical tools including scatterplot matrices, dendrograms,
bivariate histograms, and hexagonal binning.
EDA with Matlab shines as a programming reference to ensure the effective implementation
of EDA code. This volume organizes a wide variety of techniques, distilling each method into a
flexible programming framework ready for application. Each topic begins with a brief history and
explanation of the methods and is supplemented by many worthwhile references. The treatment
of the techniques moves beyond a simple “black box” introduction. The mathematical motivation
behind each technique is given without burdening the reader with the formal proofs and derivations.
Techniques also come complete with an instructive example using real data sets, which
are available online. The structure and content compare favorably with similar texts on Matlab
programming and analysis, such as Scientific computing with Matlab (Quarteroni & Saleri, 2003).
The Matlab code provided by the authors is efficient without sacrificing clarity. Readers who
are well versed in Matlab or languages with similar syntax will find the code quite intuitive. A
nice additional detail is that the authors do an excellent job of providing descriptive comments
within the code itself. This code is available from the authors online at the Carnegie Mellon
Statistics Department software archive, found at http://lib.stat.cmu.edu.
The only drawback is the concentration on Matlab implementation, which keeps the focus
of the book very narrow. If you are looking for implementation beyond the Matlab environment,
you won’t find it here. The description of techniques is clear and concise but is not sufficient to
treat this work as a stand-alone reference, as it does not provide a complete discussion of the
theory behind EDA.
Potential readers should be forewarned that this volume is not to be treated as a “crash
course” in Matlab. Do not remove the shrink-wrap unless you have had prior exposure to the
Matlab programming environment.
As reviewers, we consider ourselves users, but not yet hardened veterans, of Matlab;
we found the code very easy to follow, and it ran well in the programs we tested. A
solid graduate course in applied statistics should be sufficient preparation for understanding the
techniques themselves.

© 2006 The Psychometric Society
An instructor teaching a graduate course in applied data analysis using Matlab will find EDA
with Matlab to be an excellent textbook. For a more general statistics course, it could make a nice
supplementary reference.
EDA with Matlab excels at effectively bridging the gulf between theory and application of
EDA in the analysis of data. This volume organizes a wide variety of techniques, distilling each
method into a flexible programming framework ready for application. If you are comfortable with
data analysis and are a regular Matlab user, EDA with Matlab is worth the money.

UNIVERSITY OF ILLINOIS
Clintin Davis-Stober
Stephen Broomell
Florian Lorenz

References

Quarteroni, A., & Saleri, F. (2003). Scientific computing with Matlab. Berlin: Springer-Verlag.

Published Online Date: 13 JUN 2007
