
REVIEW ESSAY

QDA Miner 2.0: Mixed-Model Qualitative Data Analysis Software

Most qualitative data analysis (QDA) software is still largely based on
the same approaches addressed more than a decade ago by Weitzman and
Miles (1995) in their tour of QDA packages. Noteworthy changes, such as
those we’ve seen with successive versions of QDA software leaders like
ATLAS/ti and NVivo (Lewis 2004) and the other packages recently surveyed
by Lewins and Silver (2005), tend to enhance functionality without
significantly altering the QDA research model. To a certain extent, this
makes good sense. After all, many QDA researchers work alone or as
members of small teams, and their projects are relatively small, typically
focused only on scores or hundreds of cases.
It is surprising, nevertheless, that little of the increased functionality of
individual researcher-centered QDA software has ventured across the “two
cultures” divide (Snow 1964) and incorporated aspects of the qualitative/
quantitative or mixed-model data analysis and visualization tools that
accompanied the phenomenal growth of data mining and text mining over
the past two decades. Among the many reasons informally tossed about
at conferences and in academic department hallways are that some QDA
researchers oppose the application of quantitative approaches in their
studies; others lack the mathematics training and computing skills
needed to take advantage of such tools; still others note that data
mining research, with its emphasis on machine learning and artificial
intelligence tools, tends to ask very different questions than those posed by
other QDA research and may grind through thousands of cases a day (if not
every minute or two!) to seek its answers.
These are good points, but they fail to justify a reluctance to explore new
data analysis and visualization tools (e.g., Keim 2002; Berry 2003) that are
proving to work well, are potentially appropriate for answering many kinds
of QDA questions, and can often be readily scaled to the data requirements
of most QDA projects. Viewed in this light, mixed-model software, which
integrates qualitative and quantitative data management and data analysis
approaches in one package, is one of the more interesting and potentially
useful growth areas in qualitative research methods.

Field Methods, Vol. 19, No. 1, February 2007 87–108
DOI: 10.1177/1525822X06296589
© 2007 Sage Publications

Downloaded from fmx.sagepub.com at UQ Library on March 15, 2015

QDA Miner, the latest version (v. 2.0) of which was released in June
2006 by Provalis Research of Montreal, Canada, achieves mixed-model
integration well, and the result warrants serious attention. This product is a
welcome complement to two other Provalis Research products—WordStat,
a quantitative text analysis package, and SimStat, a general purpose statis-
tical data analysis package—both of which were previously reviewed in
this journal (Lewis 1999). We’re hard put to identify competing mixed-
model software like QDA Miner that emphasizes qualitative data analysis
features and is designed for the more-or-less-numerate researcher with pro-
jects that are small enough to code by hand.
This review describes QDA Miner and evaluates its functionality as a
qualitative data analysis tool that incorporates many quantitative features to
enhance the researcher’s ability to identify potentially important data pat-
terns. It is not an exhaustive inventory of the program or its capabilities.
The following sections examine setting up the program and the main ele-
ments of its interface, the relative ease of getting data into and out of QDA
Miner and managing it while it is there, data coding, and finally, QDA
Miner’s main qualitative and quantitative analytical tools. To anticipate our
main conclusion, QDA Miner is a feature-rich mixed-model software pack-
age that offers considerable value for a reasonable price.

SETTING UP AND SUPPORT

When you purchase QDA Miner, you receive a CD-ROM that contains
a software installer to guide you through the setup options for this and other
Provalis Research programs you may have bought. If QDA Miner was your
only purchase, you will still find demo copies of other Provalis Research
products on the CD-ROM. Also in the package is a printed user guide for
each program that you purchased. An Adobe Acrobat PDF file of the user
guide is on the CD-ROM.
The minimum system platform for QDA Miner 2.0 is Microsoft
Windows 98 or later, 48 MB of RAM, and 9 MB of disk space. It
performed well on our two Windows XP/Pentium 4 test machines, one of
which runs at 2.5 GHz with 512 MB of RAM, the other at 3.0 GHz
with 1 GB of RAM. There are no Mac OS or Linux versions of QDA Miner.
The well-designed 139-page user guide describes how the program
works and how to get the most utility out of it. It also provides ample screen
shots of the program in use. The detailed table of contents makes up for the
lack of an index.

The onboard Help options are typical Windows fare, allowing you to
search through a topic-oriented table of contents. Technical support is free
and unlimited. You may contact Provalis Research either by e-mail or by
phone (though you pay for the call to Provalis in Montreal, Canada; see the
appendix for contact information).
Incremental software updates within a version are also free.
In other words, if you buy QDA Miner version 2.0, you can get any version
2.* update as a free download. However, if Provalis Research later releases
version 3 and you bought 2.*, then upgrading to the new version will cost you
money (currently less than one-third the purchase price of the latest version).
Customers who purchased QDA Miner 1.3 after January 1, 2006, qualify for a
free upgrade to QDA Miner 2.0.

INPUT AND MANAGE DATA

After successfully installing new software, the next task is to get data into
the program so you can get down to work. Fortunately, QDA Miner easily
handles text documents in Microsoft Word, Microsoft Windows Write, Word-
Perfect, Rich Text Format (*.rtf), ASCII text, HTML files, and—the icing on
the cake—Adobe Acrobat text files.1 QDA Miner treats such files as “docu-
ment variables,” a data type that it displays as a codable, editable text docu-
ment. Because documents are handled differently than other QDA Miner
variables, document variables are simply called “documents” or “document
types” in this review. QDA Miner can also import spreadsheet and database
files in Microsoft Access and Excel formats, dBase, Lotus 1-2-3, Paradox, tab
or comma-delimited text file formats, SPSS *.sav files, and Triple-S2 format.
The wide range of accessible file formats is a plus because real-world data
come in many different formats.
QDA Miner gives you four ways to create a “project,” which contains
the study’s data structure of codes, documents, and variables of numeric,
nominal, ordinal, date, Boolean, or short-string type. First, you can create a
new empty project and then manually enter the codes, documents, and vari-
ables. Second, you can import data from a database query or existing files
saved in any of the file formats listed above. Third, a valuable utility
program called the Document Conversion Wizard enables users to import
existing documents (but not database files).3 Document Conversion Wizard
users can also set the delimiter (e.g., end of page, carriage return, end of
paragraph) for each record or case to be imported. Finally, and new to QDA
Miner in Version 2.0, users can create new projects simply by selecting
existing documents (but not database files) to be included in a new project
and by giving the result a project name. The latter method can also be used
to import additional documents into existing projects.
To test QDA Miner’s capabilities, we threw at it fifty HTML files, which
represent one-half of the biographies on the controversial “100 Welsh
Heroes” Web site (http://www.100welshheroes.com/). We chose these test
data because they include both quantitative information (e.g., hero rank,
number of votes received, date of birth) and qualitative information (short
biographies for each hero).
The 100 Welsh Heroes poll was sponsored in 2003 by the Welsh
Assembly Government, which spent £154,000 (roughly US$270,000) of
public money on this project, the outcome of which was intended to cele-
brate Welsh culture and heritage. The poll received more than 40,000 votes
and ended in February 2004, with Aneurin Bevan, a major Welsh political
figure of the early to mid–twentieth century, in the number-one spot.
Questions about the possible rigging of the poll were raised even before the
voting ended, after some of the poll staff members went public with the
claim that thousands of votes for entertainers such as Tom Jones had been
“discounted” to ensure that Bevan would emerge the winner (Shipton 2004a,
2004b). Insofar as we know, the controversy remains unresolved.
Although we made no changes whatsoever to the fifty even-numbered
Welsh heroes biography Web pages (ordered by name, not by Welsh hero
rank), the Document Conversion Wizard imported each document flaw-
lessly. It automatically converted each imported Web page to rich text for-
mat (*.rtf) but preserved the formatting and fonts of each page,4 all of
which can be edited from within QDA Miner. In another test, we imported
MS Word files with complex formatting and embedded graphics into a new
project and found that, although some of the text formatting did not import
well, the jpg-format digital photos and line drawings did make it into the
QDA Miner document. We also found that, using the Document Conversion
Wizard, you can automate the extraction of variables from structured doc-
uments as they are imported into a project (more on this point in the next
section).
In sum, QDA Miner makes it easy for the researcher to work with data
stored in a wide range of file formats. Most important, QDA Miner per-
forms these functions well.

CODING DATA

One big implication of QDA Miner’s mixed-model design is that the
user needs to think of his or her data in terms of variables amenable to
quantitative analysis; codes, which are the fodder of qualitative analyses;
the documents with which these variables and codes are associated; and
possible interactions between variables and codes. The QDA Miner data
coding process includes the familiar QDA tasks of assigning text passages
to codes or word tags that reflect particular concepts or qualities of
analytical interest, as well as assigning numeric and/or nominal values to
categorical, numeric, Boolean, and other variable types. QDA Miner 2.0 offers
a wide range of tools to facilitate the ease and accuracy with which these
tasks are done.

FIGURE 1
Main Parts of the QDA Miner Interface

NOTE: The Document Window is the main workspace for each active case. At the top of the
Document Window are the menu bars for the editing and format of the document text. On the
left side of the screen is a stack of three smaller resizable windows that display (from top to
bottom) the case list, the numeric and categorical variables included in the study, and the codes
assigned by the analyst. The Coding Margin on the far right of the screen shows the text
segments to which codes have been assigned.

When QDA Miner opens an existing project, the Document Window
(Figure 1) displays the document5 you were working with at the end of the
previous session. The Case Window in the upper left window of the screen
holds a scrollable list of all cases in the project. The default format of the
Case Window list is “Case #—file name”; this can be changed by right-
clicking on any case or left-clicking on the title bar of the window and
choosing “Case descriptor . . . ,” which opens up a dialog box that gives the
user considerable control over the grouping variables by which cases are
displayed in the Case Window. To display a given Welsh Heroes project
document, one simply clicks on a case name and the document opens in the
Document Window. Stacked beneath the Case Window are the Variable and
Code Windows (Figures 2 and 3). The Variable Window displays the list of
project variables and their associated values for the active case. The Code
Window shows all the project codes associated with the current project, not
just those associated with the current document. The Coding Margin on the
right side of the main screen displays the coding assigned to passages in the
active document (Figure 1).

Variables
Variables can be created at any time in a QDA Miner project as the
researcher decides that a particular concept or quality needs to be mea-
sured, counted, or classified for the cases under examination. An anthro-
pologist, for example, might work with interview transcripts as his or her
primary data but also want to investigate whether some views expressed in
these transcripts vary in a patterned way with the gender, religion, age, or
annual income of the interviewees. Traditional QDA software packages
offer relatively limited tools for the analysis of such data in relation to
coded documents. QDA Miner offers these tools and much more to give the
researcher a wide range of choices of qualitative and quantitative analyses.
Figure 2 shows ten variables we created for the Welsh Heroes data set,
plus two variables (FILENO and TEXT1 for case name and document vari-
able, respectively) that were automatically assigned by QDA Miner when
the cases were imported into the project. The list contains a mix of differ-
ent types, including categorical variables (e.g., religion, political, category)
and numeric variables (e.g., Welsh heroes list rank, year died). A pilot study
of ten Welsh heroes was used to identify the variables and levels of mea-
surement we would use in our analysis of the Welsh Heroes data set.
Variable values were subsequently entered for each Welsh hero as we coded
the Welsh hero biographies.

FIGURE 2
Variables Window

NOTE: The Variables Window enables the user to create numeric, date, Boolean, short string,
and document variables as well as dropdown lists of nominal or ordinal variable values from
which the user can choose. In this example, the entry for Alexander Cordell is being assigned to
the “Creative” category. The dropdown list shows that Creative is one of five identified categories
to which a Welsh hero can be assigned. QDA Miner also includes the facility for easily
assigning additional categorical values to this variable.

Had we elected to do so, we could have also created some variables and
automatically imported the associated values available in the Welsh hero
biographies for each case by using the Variable Extraction feature of the
Document Conversion Wizard when we imported the Web pages. Researchers
who work with highly structured documents, such as accident reports, port and
harbor docking records, and other forms-based information, may find the
Variable Extraction feature particularly useful. To use it, the researcher defines
a set of extraction rules for QDA Miner to apply as it imports a set of
documents. Each rule identifies a tag, say, “Date of Embarkation”; an action to
apply when the tag is encountered during import, say, “UP TO end of line”;
and a variable name, say, “EmbDate,” in which to store the data associated
with this rule. Subsequently, as the Document Conversion Wizard imports
each document, it automatically applies all defined variable extraction rules.
In the example described above, the wizard would import a document, search
it for the string “Date of Embarkation,” and, if found, copy all the text on the
Date of Embarkation line and insert it in the EmbDate variable associated with
that case.
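
For readers who think in code, the extraction rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not QDA Miner's implementation: the rule format, the function name `extract_variables`, the sample document, and the assumption that the stored value is the text following the tag on the same line are all ours.

```python
import re

# Hypothetical rule format: (tag to look for, variable that receives the value).
# Each rule mimics an "UP TO end of line" action.
RULES = [("Date of Embarkation", "EmbDate")]


def extract_variables(document, rules=RULES):
    """Return {variable: value} for every rule whose tag occurs in the document."""
    values = {}
    for tag, variable in rules:
        # Match the tag, skip an optional colon and spaces, then capture the
        # remainder of that line (MULTILINE makes "$" match at each line end).
        pattern = re.escape(tag) + r"[: \t]*(.*)$"
        match = re.search(pattern, document, flags=re.MULTILINE)
        if match:
            values[variable] = match.group(1).strip()
    return values


# Invented sample record, used only to exercise the sketch.
sample = "Vessel: Cambria\nDate of Embarkation: 12 March 1907\nPort: Cardiff\n"
print(extract_variables(sample))
```

A document that lacks the tag simply yields no value for that variable, which mirrors the "if found" condition in the description above.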
Once created, some properties of a variable, such as its name and descrip-
tion, can be edited. You cannot, however, change a variable’s type. If it was
created in the QDA Miner project as an integer-valued variable, then it will
remain one; short string–type variables have to stay as strings, real numbers
can’t be changed to strings, and so on. New values can be added to categori-
cal variables and existing ones can be edited.
Variables that share the same scales, say, of ordinal values between 1 and
5, or “strongly disagree,” “disagree,” “indifferent,” “agree somewhat,”
“strongly agree,” can also be copied by simply selecting an existing vari-
able with the scale of choice when a new variable is created.

Codes
Although one of QDA Miner’s main selling points is its mixed-model
data handling and analysis capabilities, coding and analyzing text passages
is central to its design. The Code Window displays a tree list of codes
included in a given project (Figure 3). Codes can easily be added, deleted,
edited, merged with other codes, or split into two or more new codes at any
time by right-clicking on a code and choosing the appropriate task. Users
can also control the number of code branches in the tree (Figure 3) and
move codes to new branches.
QDA Miner offers several ways to assign one or more codes to selected
text passages in a document: Click-and-drag from selected passage to tar-
get code or from selected code to target passage, select the passage and
double-click the code, or the most cumbersome, select the desired code
from the code dropdown list above the document window, highlight the
passage to code, and then click on the code list button on the right side of
the code dropdown list.
Regardless of how coding is done, the result is one or (typically) more code
marks (i.e., brackets and code labels) that identify the location and span of
a coded passage in the Coding Margin window (Figure 1). Most ways of
coding text in QDA Miner also permit the user to set the scope of a given
passage from a single character to the entire document. Once assigned,
codes associated with a given passage can be removed, recoded, or resized
to increase or narrow the assigned code’s scope. Code mark colors can also
be assigned by the user to different codes or even to different coders.
In sum, “coding” in QDA Miner often involves assigning values to vari-
ables as well as assigning code labels to selected text passages. The end
result can set the stage for a particularly rich body of potential analyses of
the patterns in a data set, to which we now turn.

FIGURE 3
Codes Window

NOTE: The Codes Window provides a tree list of the codebook defined for a given project.
Each codebook category can have a default maximum of three levels of subcategories (not
counting the root category); the maximum number of levels can be changed to between two
and eight levels by the user. Codes can be edited, added, deleted, split, merged, or sorted by
simply right-clicking on the code name.

SEARCHES AND QUERIES

The analytical advantages offered by QDA Miner’s mixed-model design
immediately become apparent when users begin to ask questions of their
data. The facilities for searches, queries, and other more ambitious analy-
ses of data patterns are found under the Analysis item in the menu bar.
The simplest such tools query project data for passages, paragraphs,
or entire documents that meet user-specified criteria. These queries are
handled by the Text Retrieval, Coding Retrieval, Section Retrieval, and
Keyword Retrieval tools, all of which share the same basic interface. When
activated, each tool appears as a window with two tabs, one for setting the
scope and target of the search, the other for viewing the search results (e.g.,
Figure 4).

FIGURE 4
Text Retrieval Command Window

NOTE: The Text Retrieval command window has two main parts, the Search Expression
and Search Hits tabs. The search command is mostly constructed by choosing items from
dropdown lists.

Text Retrieval is the most general of these commands and serves as a
good example for all of them. To define a search using this tool, users click
on the Search Expression tab (Figure 4) and select the document types,
codes, variables, and other criteria that will yield the desired subset of the
project data. Once constructed, most queries can be saved for future use.
In the Figure 4 example, the “Search in” criterion was the default of all
fifty Welsh hero biographies (i.e., the TEXT1 document variable). Had
other document types been included in this study, for example, interviews
with people who participated in the poll or the Oxford Dictionary of
National Biography entries for each Welsh hero, a user could choose to
limit the search to one or more categories. The “Search unit” selects what
information a given search should return—documents, paragraphs, sen-
tences, or coded segments—that meet the defined search criteria. In Figure
4, the user chose “Coded Segments” and narrowed the search to passages
to which the Military, Career Lift, or Great Contribution codes were
assigned. The “Search expression” further honed the query by identifying
the words, phrases, or combinations to search for. Here, the user specified
the stem “war” and added the wildcard character “*” to limit the query to
target-coded segments containing words that begin with “war” and end with
any combination of adjacent characters (e.g., war, warfare, warrior, warble).
Boolean expressions (e.g., “war AND injury”) or thesaurus-based searches
(e.g., “@WAR,” where WAR words are defined in a QDA Miner thesaurus
entry) can also be used. Finally, the “Add variables” box at the bottom of
the Search Expression window does not constrain the search; it merely
identifies variables (Welsh heroes list rank and birth year in the example
given in Figure 4) that QDA Miner should include for each coded segment
that appears in the Search Hits window.

FIGURE 5
The Search Hits Tab of the Text Retrieval Command Window
Displaying the Results of a Query

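
The wildcard and Boolean search expressions described above behave much like a regular-expression filter. The Python sketch below shows one plausible reading: “*” stands for any run of word characters, and several expressions combine as a Boolean AND. The function names and sample segments are hypothetical, and QDA Miner's actual matching rules may differ.

```python
import re


def wildcard_to_regex(expression):
    """Translate a stem such as 'war*' into a whole-word, case-insensitive regex."""
    # Escape everything, then turn the escaped "*" into "any word characters".
    body = re.escape(expression).replace(r"\*", r"\w*")
    return re.compile(r"\b" + body + r"\b", flags=re.IGNORECASE)


def search_segments(segments, *expressions):
    """Keep the segments that match every expression (a Boolean AND)."""
    patterns = [wildcard_to_regex(e) for e in expressions]
    return [s for s in segments if all(p.search(s) for p in patterns)]


# Invented coded segments, used only to exercise the sketch.
segments = [
    "He served in the war of 1914 and suffered an injury.",
    "A warrior and poet.",
    "He won an award for singing.",
]
print(search_segments(segments, "war*"))
print(search_segments(segments, "war*", "injur*"))
```

Because the pattern is anchored at a word boundary, “award” does not count as a hit for “war*”, which matches the reviewers' description of words that begin with the stem.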
A click on the Search Hits tab in the Text, Coding, Section, or Keyword
Retrieval windows displays the results of a given query. Figure 5 shows five
of the seven hits that resulted from the search specified in the preceding
paragraph and given graphically in Figure 4. Data columns to the left of
Rank in Figure 5 were generated by default by QDA Miner. Rank and Born,
the remaining columns, were, of course, added as part of the original query.
Although having a scrollable set of search results is sufficient to answer
many questions, QDA Miner packs a lot more functionality into its search
capabilities. The results for Text, Coding, Section, or Keyword Retrieval
searches can be coded in the Search Hits Window, sorted, printed, or saved to
disk as MS Excel, MS Word, ASCII text, comma- or tab-delimited, HTML,
or XML format files.
As noted above, the Coding, Section, and Keyword Retrieval tools employ
the same basic interface as that of Text Retrieval. They differ in the particu-
lar type of query offered to the user. The Coding Retrieval tool, as the name
suggests, searches passages coded by the user. The searches can be simple
ones, such as retrieving all Welsh Hero passages assigned to the Career Lift
code, or be much more complex. One could, for example, specify a Coding
Retrieval query to display all Career Lift coded passages that are near pas-
sages coded to Anglican and not overlapping with passages coded for Great
Contribution simply by clicking on dropdown lists in the Search Expression
window and selecting the appropriate operators and codes.
The Section Retrieval command is designed for queries of structured doc-
uments (e.g., a form such as the personal history section of an employment
application) in which alphanumeric strings (e.g., “NAME,” “SSAN,” “CUR-
RENT ADDRESS”) that appear in each document can be used to set the
query’s scope. Keyword Retrieval, on the other hand, works with any docu-
ment type but requires the Provalis Research software product WordStat 5.*
(Lewis 1999) to generate the keywords applied in this type of search. Without
WordStat 5.*, it’s more a tease than a tool in QDA Miner. We did not test its
capabilities.
In sum, QDA Miner’s search and coding features are well designed and give
the user a lot of flexibility in setting up a wide range of queries. Researchers
who work with structured documents will particularly appreciate the consider-
ation given to making their search and coding tasks easier.

ANALYSES

Where QDA Miner truly shines is with its mixed-model analytical tools.
There is a good mix of these commands, they work as advertised, and they
are well integrated with other aspects of the program.
The simplest of the mixed-model tools is the Coding Frequencies com-
mand, a new addition to QDA Miner with Version 2.0. Although it can be
used just to generate a code list for the active project, the Statistics option
delivers a spreadsheet-like display of counts and percentages for each code,
including the number of times a given code is used in the data set, the
number of cases in which it occurs, and the number of words in all passages
assigned to each code (Figure 6, rear window). User-selected rows in the
resulting table can be displayed as editable 2-D or 3-D bar charts (e.g.,
Figure 6, front window) or pie charts and can be printed, copied to the clip-
board, or saved to the same range of file types as described above for the
various Retrieval command searches.
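
The three statistics reported by the Coding Frequencies table are easy to reproduce by hand, which can be useful when checking exported results. The following Python sketch assumes a hypothetical export of (case, code, passage text) triples; the data and the name `coding_frequencies` are invented for illustration and say nothing about how QDA Miner stores its codings.

```python
from collections import defaultdict


def coding_frequencies(coded_segments):
    """Per code: number of codings, share of all codings, cases, and words."""
    stats = defaultdict(lambda: {"count": 0, "cases": set(), "words": 0})
    for case_id, code, text in coded_segments:
        entry = stats[code]
        entry["count"] += 1
        entry["cases"].add(case_id)
        entry["words"] += len(text.split())  # whitespace-delimited word count
    total = sum(entry["count"] for entry in stats.values())
    return {
        code: {
            "count": entry["count"],
            "percent": 100 * entry["count"] / total,
            "cases": len(entry["cases"]),
            "words": entry["words"],
        }
        for code, entry in stats.items()
    }


# Invented codings for two cases.
codings = [
    (1, "Military", "served in the Royal Navy"),
    (1, "Career Lift", "won a scholarship"),
    (2, "Military", "fought at Agincourt"),
    (2, "Military", "led the archers"),
]
table = coding_frequencies(codings)
print(table["Military"])
```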

FIGURE 6
Coding Frequencies Command and Display

NOTE: The Coding Frequencies command generates code counts that can be selected and
exported as spreadsheet files or plotted as bar charts or pie charts with a couple of mouse clicks.

Perhaps more interesting to many prospective users, and certainly a lot
more complex, are the Coding Co-Occurrences, Coding Sequence, and
Coding by Variables commands. At first glance, the command windows for
these tools may seem a little ho-hum, but once you execute a query and
begin to examine the results by clicking on the various window tabs, it is
immediately apparent that QDA Miner is stuffed with useful features that
facilitate analysis and interpretation.
As the name suggests, the Coding Co-Occurrences command examines
the extent to which selected codes or code categories tend to co-occur in the
active project data set. The command window is divided into three main
parts (Figure 7). Users select the domain of the query and the codes to
examine from the top two dropdown lists. Immediately below these drop-
down lists are options that can be set for the average-linkage cluster6 algo-
rithm that QDA Miner employs to classify the selected codes or the project
cases by their respective similarity or distance matrices. Finally, the bottom
half of the command window contains the options that can be set for QDA
Miner’s multidimensional scaling7 algorithm, which is used to create “con-
cept maps” or graphical representations of the conceptual proximity of the
selected codes or cases in relation to each other.
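
To make the clustering step concrete, here is a minimal Python sketch of average-linkage agglomeration over a code similarity matrix. It rests on two assumptions of ours: that co-occurrence is measured at the case level, and that similarity is the Jaccard coefficient. The review does not say which similarity index QDA Miner uses, and the data are invented.

```python
from itertools import combinations


def jaccard(a, b):
    """Shared cases divided by all cases in which either code occurs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def average_linkage(cases_by_code):
    """Repeatedly merge the two clusters with the highest mean pairwise similarity."""
    clusters = [(code,) for code in sorted(cases_by_code)]
    merges = []

    def mean_similarity(c1, c2):
        pairs = [(x, y) for x in c1 for y in c2]
        total = sum(jaccard(cases_by_code[x], cases_by_code[y]) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        left, right = max(combinations(clusters, 2),
                          key=lambda pair: mean_similarity(*pair))
        merges.append((left, right, mean_similarity(left, right)))
        clusters = [c for c in clusters if c not in (left, right)] + [left + right]
    return merges


# Invented data: the set of case numbers in which each code was applied.
cases_by_code = {
    "Career Lift": {1, 2, 3, 4},
    "Great Contribution": {1, 2, 3, 5},
    "Anglican": {5, 6},
}
for left, right, similarity in average_linkage(cases_by_code):
    print(left, right, round(similarity, 2))
```

The list of merges, read from first to last, is the information a dendrogram draws: the earliest merge joins the most similar codes, and later merges join progressively less similar groups.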

FIGURE 7
Options Tab of the Coding Co-Occurrences Command Window

The results of a Coding Co-Occurrences query are displayed by clicking
on the other tabs in the command window. The main tabs are, first, a cluster
analysis dendrogram, or treelike arrangement of cases based on similarity.
Users can set the number of clusters to display and choose to scale the
dendrogram either by the similarity index or the agglomeration order of the
clustering.8 Second, the multidimensional scaling results can be displayed as
either a 2-D or 3-D graph, the latter of which can be rotated by the user.
Third, proximity plots can be created that graphically represent a
selected code’s similarity or distance in relation to all other codes included
in an analysis. Figure 8 shows an example proximity plot result for a
Coding Co-Occurrences query of the Welsh Hero data set, in which we
examined the co-occurrences for codes in the Religion, Pivotal Moments,
and Early Life categories. The plot shows that the Career Lift (events that
improved the hero’s lot in life) and Great Contribution codes tend to
strongly co-occur, and at the other end of the proximity scale, statements
about religion seldom figure in the passages coded against Career Lift in
the Welsh Hero biographies.

FIGURE 8
Coding Co-Occurrences Window

NOTE: The Coding Co-Occurrences command packs a lot of functionality into six window
tabs, including cluster analysis, multidimensional scaling, proximity plots (shown here), and
similarity matrix data.

The Coding Sequence command takes the co-occurrence idea a step further
and examines the specific order of code co-occurrences, given a set of
user-specified conditions. Unlike most QDA Miner analytical commands,
the Coding Sequence command results are tabular rather than graphical. The
results center on showing the number of times (and percentages) that Code
A precedes and/or follows Code B. For example, a Coding Sequence analysis
of selected codes in the Welsh Hero data set identified five instances in
which some mention of the military experience of a Welsh hero was fol-
lowed in their biographical sketch by mention of their greatest contribu-
tions to Welsh culture and society; there were, however, no instances in
which mention of greatest contributions preceded passages coded against
some mention of their military experience.
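
The ordered counts described above can be approximated with a short Python sketch. We assume, for illustration only, that "A precedes B" means B is coded anywhere later in the same document; QDA Miner's user-specified conditions (such as a maximum distance between codes) are not modeled, and the case data are invented.

```python
from collections import Counter


def sequence_counts(codes_by_case):
    """Count ordered pairs (A, B) where code B occurs after code A in a case."""
    counts = Counter()
    for codes in codes_by_case.values():
        for position, first in enumerate(codes):
            for later in codes[position + 1:]:
                if later != first:
                    counts[(first, later)] += 1
    return counts


# Invented data: codes in the order they appear in each biography.
codes_by_case = {
    "case_1": ["Early Life", "Military", "Great Contribution"],
    "case_2": ["Military", "Career Lift", "Great Contribution"],
    "case_3": ["Great Contribution"],
}
counts = sequence_counts(codes_by_case)
print(counts[("Military", "Great Contribution")])
print(counts[("Great Contribution", "Military")])
```

Comparing the two directions of a pair, as the last two lines do, is the same asymmetry the reviewers report for military experience and greatest contributions.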

FIGURE 9
Correspondence Analysis Plot Generated by the Coding
by Variables Command

NOTE: Results are limited to the first three dimensions, each combination of which can be dis-
played as a 2-D graphic.

The Coding by Variables command enables the researcher to easily
examine coding patterns in relation to numeric or nominal variables. The
initial tabular result of this command is a contingency table of counts or
percentages. Users can test the association between codes and variables by
choosing between eleven different significance tests.9 Associations can be
graphically displayed as bar charts, scatterplots, correspondence analysis
graphs, and heatmap plots, all of which can be edited, printed, saved to
disk, or copied to the clipboard.
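
As a rough illustration of what testing the association in a contingency table involves, the Python sketch below computes a Pearson chi-square statistic by hand. The review does not list the eleven tests QDA Miner offers, so treating chi-square as one of them is our assumption; the table values are invented, and no p-value is computed because that would need a statistics library.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if codes and variable were independent.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Invented counts: rows are two codes, columns are two values of a variable.
observed = [[10, 20],
            [20, 10]]
statistic, dof = chi_square(observed)
print(round(statistic, 3), dof)
```

The sketch assumes every row and column total is nonzero; an empty row or column would make an expected count zero and the division undefined.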
Figure 9 shows the correspondence analysis10 result from a Coding by
Variables query in which we cross-tabulated the social class of each Welsh
hero and the Welsh region in which he or she spent the early years of life
against the five Welsh Hero categories set by the poll organizers (the latter
appear in boxes in the graph). To briefly interpret a couple of the obvious
patterns revealed by this correspondence analysis, Welsh heroes categorized
as Leaders by the poll are closely associated with the Gentry and to a
lesser extent with the Middle Class. There is no strong association between
Welsh regions and Leaders; in point of fact, as a group they can be viewed
as not very Welsh. Thinkers, on the other hand, are most closely associated
with South Wales and the Working Class. Unlike Leaders, Thinkers and
Performers also tend to have a strong association with Wales itself.

FIGURE 10
Heatmap Display

NOTE: Heatmap plots can also be constructed with the Coding by Variables command. These
complex graphics show the structure of a two-way contingency table as different levels of
brightness keyed to the table’s cell values.

The heatmap (Figure 10) is an unusual graphic that displays cross-tabulated
data patterns by replacing the number in each cell with a color, the brightness
of which varies with the cell value. Combine this display with cluster analy-
sis dendrograms, as in Figure 10, and you have a multilayered picture of the
data patterns in the original table. And if this sounds complicated, you’re right.
It is complicated, and most users will need to devote a few minutes to work-
ing through a couple of examples before they feel comfortable enough with
this tool to apply it in their research. The heatmap plot in Figure 10 displays
the results of the same query considered in the correspondence analysis
described above (see Note 11). The monochrome color ramp in the lower left corner of
Figure 10 shows how the changes in heatmap brightness are keyed to the percentage
values of the table’s cells. The heatmap color plot and the two cluster
analysis dendrograms (both based on similarity
matrices) provide essentially the same basis for inference that can
be seen graphically in the correspondence analysis. Here, too, we can see the
strong association between Thinkers, Performers and South Wales, and between
Leaders and the Gentry.
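The core idea of the heatmap, replacing each cell value with a brightness level, can be sketched as follows. This is a simplified illustration under our own assumptions, not the program's algorithm: the function names, the five-step character ramp, and the percentages are hypothetical, and the sketch omits the cluster analysis dendrograms entirely.

```python
def brightness_levels(table, levels=5):
    """Map each cell to an integer level from 0 (lowest value in the
    table) to levels - 1 (highest value in the table)."""
    cells = [value for row in table for value in row]
    low, high = min(cells), max(cells)
    span = (high - low) or 1  # avoid dividing by zero for a constant table
    return [
        [min(levels - 1, int((value - low) / span * levels)) for value in row]
        for row in table
    ]


# A five-step text "ramp" standing in for the monochrome color ramp.
SHADES = " .:*#"


def render(table):
    """Render a table as rows of shade characters, one per cell."""
    return [
        "".join(SHADES[level] for level in row)
        for row in brightness_levels(table, len(SHADES))
    ]


# Hypothetical cell percentages, invented for illustration only.
percentages = [[0.0, 25.0, 100.0], [50.0, 75.0, 10.0]]
levels = brightness_levels(percentages)
picture = render(percentages)
```

With a table this small, the shaded version adds little to the numbers themselves, which is consistent with the caveat in Note 11 about small contingency tables.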
In sum, QDA Miner possesses outstanding mixed-model analytical
tools, all of which work well. We are impressed not only by the range of
functionality represented in this software (not all of which we have the
space to examine here), but also by the good design evident in how the
program’s user interface presents the various components of each tool to
the user.

OTHER FEATURES

A few more of QDA Miner’s many features warrant mention. First, it
includes an Inter-Coder Agreement command, which is very useful in those
research situations where multiple coders work with the same project doc-
uments. The Inter-Coder Agreement command evaluates several levels of
agreement and provides three quantitative measures by which researchers
can assess coder differences.
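For readers unfamiliar with agreement statistics, the following Python sketch shows two widely used measures, simple percent agreement and Cohen's kappa. We offer it only as background: the text above does not name the three measures QDA Miner reports, so treating these two as representative is our assumption, and the coder data are hypothetical.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of segments to which both coders assigned the same code."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: product of each coder's marginal proportions.
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to the same ten segments.
coder_a = ["Military"] * 5 + ["Contribution"] * 5
coder_b = ["Military"] * 4 + ["Contribution"] * 6

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
```

In this example the coders agree on nine of ten segments, but kappa is lower than the raw agreement because half of that agreement would be expected by chance alone.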
Second, QDA Miner can also access the functionality of WordStat and
SimStat (see Note 12), if these programs are installed on the same computer. WordStat
extends the capabilities of QDA Miner to include content analysis of the
coded documents. SimStat can also be accessed from within QDA Miner to
perform the wide range of statistical analyses available in that full-featured
statistical analysis package. Our tests of these features were limited to ver-
ifying to our satisfaction that, using the Welsh Heroes data set, we could
access both programs from within QDA Miner.

CONCLUSIONS

QDA Miner is a full-featured software package for coding, searching, and
analyzing mixed-model data. It performs as advertised and is good value for
the money. It shares a number of features with WordStat and SimStat, two
other Provalis Research products. In fact, Péladeau and Stovall’s (2005)
recent WordStat/SimStat study of airline safety reports provides an excellent
case study of the research application of many tools also available in
QDA Miner.
No software package is perfect. Among the important data management
features that we would like to see added in the next QDA Miner version are
“Save” and “Undo” options in the menu bar. QDA Miner automatically
saves your changes as you make them, and when you finish, you simply
close the program. Although QDA Miner has a backup command by which
users can archive “snapshots” of a project, we believe it is absolutely essen-
tial for users to have the option to control when a given data set is saved to
disk and not simply be at the mercy of the program. Similarly, every user
needs the ability to undo mistakes. As things now stand in QDA Miner 2.0,
if you make a change, it cannot be undone.
When we look across the field of possible competitors, we find surpris-
ingly few software packages that offer QDA Miner’s mixed-model features.
QDA packages such as ATLAS/ti and NVivo (Lewis 2004) are just that:
qualitative data analysis packages. Both products can export
code counts and other data that can be analyzed separately in a stand-alone
statistical data analysis package, but they do not offer built-in mixed-model
analytical features that come close to those found in QDA Miner.
QDA Miner’s main screen (Figure 1) also somewhat resembles that of SPSS
Text Analysis for Surveys (see the Web site
http://www.spss.com/textanalysis_surveys/ for details), but closer inspection reveals that they are very different
products. As the name suggests, SPSS Text Analysis for Surveys is aimed at
extracting patterns from a particular class of documents. QDA Miner’s design
is much more flexible, and it offers greater functionality. SPSS Text Analysis
for Surveys users must also turn to other SPSS products to do many of the
kinds of mixed-model analyses that are built into QDA Miner. SAS, another
major statistical software product line, also offers text mining software that
shares some QDA Miner features. SAS Text Miner
(http://www.sas.com/technologies/analytics/datamining/textminer/index.html) is optimized for a
wider range of documents than SPSS Text Analysis for Surveys, but both are
just as clearly written for large data sets and business and industry users rather
than academic researchers and relatively small data sets.
Our overall impression of QDA Miner is enthusiastically positive. It’s a
rock-solid product. If your needs extend to mixed-model research in which
you must have good text coding and searching tools, as well as quantitative
tools for analyzing data patterns, then download the demo version of QDA
Miner and try it out on a test data set. We think that, like us, you will find
that it does this job well. Just remember to do frequent backups so you
don’t have to remember with a groan what we said about the lack of an
undo command.


APPENDIX

Characteristic                 QDA Miner 2.0

Minimum System Requirements    MS Windows 98 or later, 48 MB RAM,
                               and 9 MB disk space
Manual                         139-page spiral-bound booklet and
                               20-page addendum; Adobe Acrobat
                               PDF copy of the manual on the distribution
                               CD-ROM may be more up to date
Help                           Online help file; Web page, mailing list,
                               free tech support over the telephone
Importable File Types          Coding Documents: ASCII text (*.txt),
                               MS Word, MS Windows Write,
                               WordPerfect, Rich text (*.rtf), HTML
                               (*.html), Adobe Acrobat (*.pdf)
                               Variables: MS Excel, MS Access, Lotus 1-2-3,
                               dBase, Paradox, SPSS (*.sav), Triple-S, and
                               tab- or comma-delimited data files
Demo Version Download          http://www.provalisresearch.com/Download/download.html
Web Page                       http://www.provalisresearch.com
Ease of Use                    Excellent
Value                          Excellent
Single License Pricing         $695 retail, $375 academic; upgrade from
                               Version 1.* costs $195 retail, $95 academic
                               (all prices in U.S. dollars); upgrades to 2.0
                               are free for customers who purchased
                               1.3 after January 1, 2006; contact
                               licenses@provalisresearch.com for
                               information about other licensing options
Where can you buy it?          Provalis Research
                               2414 Bennett Avenue
                               Montreal, QC
                               H1V 3S4
                               Canada
                               phone: 514-899-1672
                               fax: 514-899-1750
                               email: sales@provalisresearch.com

NOTES

1. QDA Miner will import the text from Adobe Acrobat files containing a mix of text and
images; it will not import Acrobat files containing only images.
2. Triple-S is a public source survey metadata data structure (Wright 2002).
3. The Document Conversion Wizard can be accessed as a stand-alone program or directly
from within QDA Miner.


4. We also could have imported these documents as plain text but at the cost of formatting,
fonts, tables, graphics, and so on.
5. A case can comprise multiple documents. In the Welsh Heroes data set, there is only
one document per case: a biographical sketch for each Welsh hero. Projects in which each case
may contain several documents are common. For example, an accident case might consist of
the report filed by the officer at the scene, interviews with each witness, an accident recon-
struction report, follow-up interviews with hospitalized accident victims, and so on. QDA
Miner provides the facility to limit the span of a search to a particular document type (e.g.,
accident reconstruction reports).
6. An average linkage clustering algorithm classifies cases into clusters on the basis of the
average similarity of a case to all existing cases in a cluster.
7. Multidimensional scaling is a statistical technique for displaying the similarities or dis-
tances of multivariate data in a low-dimensional space. The result is often a two-dimensional
graph of similarities.
8. The Statistics tab of the Codes Co-Occurrence window displays the similarity and
co-occurrence matrices.
9. The choice of 1- or 2-tailed probabilities is user-selectable for several of these tests.
10. Correspondence analysis represents the rows and columns of a two-way contingency
table “using a low-dimensional Euclidean space such that the locations of the row and column
points are consistent with their associations in the table” (Péladeau 2006: 116). Interpretation
largely focuses on inspection of the resulting graphic.
11. This heatmap plot is based on a relatively small contingency table, and as such, it does not
make the most convincing case for the interpretive utility of such plots. Heatmap plots are partic-
ularly useful in the interpretation of large tables, say, those greater than ten rows and columns.
12. See Lewis (1999) for a review of the capabilities of these Provalis Research products.
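
The average linkage rule described in Note 6 can be sketched in Python as follows. This is a generic textbook version written for illustration, not QDA Miner's code: the function name and the similarity values are hypothetical, and the sketch assumes a symmetric similarity matrix in which larger values mean greater similarity.

```python
def average_linkage(similarity, labels):
    """Repeatedly merge the two clusters with the highest average
    pairwise similarity; return the merges in the order they occur."""
    clusters = [[i] for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average similarity over all cross-cluster item pairs.
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                avg = sum(similarity[i][j] for i, j in pairs) / len(pairs)
                if best is None or avg > best[0]:
                    best = (avg, a, b)
        avg, a, b = best
        members = sorted(labels[i] for i in clusters[a] + clusters[b])
        merges.append((members, avg))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges


# Hypothetical similarity matrix, invented for illustration only.
labels = ["Leaders", "Thinkers", "Performers"]
similarity = [
    [1.0, 0.2, 0.1],
    [0.2, 1.0, 0.8],
    [0.1, 0.8, 1.0],
]
merges = average_linkage(similarity, labels)
```

The ordered list of merges, together with the similarity at which each occurs, is the information a dendrogram displays.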

REFERENCES

Berry, M. W. 2003. Survey of text mining: Clustering, classification, and retrieval. New York:
Springer-Verlag.
Keim, D. A. 2002. Information visualization and visual data mining. IEEE Transactions on
Visualization and Computer Graphics 8 (1): 1–8.
Lewins, A., and C. Silver. 2005. Choosing a CAQDAS package. 3rd ed. CAQDAS
Networking Project: http://caqdas.soc.surrey.ac.uk/ChoosingLewins&SilverV3Nov05.pdf
(accessed February 27, 2006).
Lewis, R. B. 1999. SIMSTAT with WORDSTAT: A comprehensive statistical package with a
content analysis module. Field Methods 11 (2): 166–79.
———. 2004. NVivo 2.0 and ATLAS/ti 5.0: A comparative review of two popular qualitative
data analysis programs. Field Methods 16 (4): 439–64.
Péladeau, N. 2006. QDA Miner, qualitative data analysis software: User’s guide. Montreal,
Canada: Provalis Research.
Péladeau, N., and C. Stovall. 2005. Application of Provalis Research Corp.’s statistical con-
tent analysis text mining to airline safety reports. Montreal, Canada: Provalis Research.
Shipton, M. 2004a. “Dirty tricks” heroes claim. Western Mail (July 23), http://icwales.icnetwork
.co.uk/0900entertainment/0050artsnews/tm_method=full%26objectid=14453751%26siteid=
50082-name_page.html (accessed March 1, 2006).


———. 2004b. “Welsh Heroes” row flares up. Western Mail (August 31), http://icwales.icnetwork
.co.uk/0100news/newspolitics/tm_objectid=14586997&method=full&siteid=50082&head
line=-welsh-heroes—row-flares-up-name_page.html (accessed March 1, 2006).
Snow, C. P. 1964. The two cultures and a second look: An expanded version of the two
cultures and the scientific revolution. Cambridge: Cambridge University Press.
Weitzman, E., and M. B. Miles. 1995. Computer programs for qualitative data analysis:
A software sourcebook. Thousand Oaks, CA: Sage.
Wright, G. 2002. The Triple-S standard. Paper presented at the Association for Survey
Computing Conference, “Open Standards: Breaking down the Barriers,” at Imperial College,
London, September 19.

R. BARRY LEWIS and STEVEN M. MAAS
University of Illinois at Urbana-Champaign
