QDA Miner 2.0: Mixed-Model Qualitative Data Analysis Software
When you purchase QDA Miner, you receive a CD-ROM that contains
a software installer to guide you through the setup options for this and other
Provalis Research programs you may have bought. If QDA Miner was your
only purchase, you will still find demo copies of other Provalis Research
products on the CD-ROM. Also in the package is a printed user guide for
each program that you purchased. An Adobe Acrobat PDF file of the user
guide is on the CD-ROM.
The minimum system platform for QDA Miner 2.0 is Microsoft
Windows 98 or later, 48 MB of RAM, and 9 MB of disk space. It performed
well on our two Windows XP/Pentium 4 test machines, one of
which runs at 2.5 GHz with 512 MB of RAM, the other at 3.0 GHz
with 1 GB of RAM. There are no Mac OS or Linux versions of QDA Miner.
The well-designed 139-page user guide describes how the program
works and how to get the most utility out of it. It also provides ample screen
shots of the program in use. The detailed table of contents makes up for the
lack of an index.
The onboard Help options are typical Windows fare, allowing you to
search through a topic-oriented table of contents. Technical support is free
and unlimited. You may contact Provalis Research either by e-mail or by
phone (though you pay for the call to Provalis in Montreal, Canada; see the
appendix for contact information).
Software updates are also free for incremental upgrades within a version.
In other words, if you buy QDA Miner version 2.0, you can get any version
2.* update as a free download. However, if Provalis Research later releases
version 3 and you bought 2.*, then upgrading to the new version will cost you
money (currently less than one-third the purchase price of the latest version).
Customers who purchased QDA Miner 1.3 after January 1, 2006, qualify for a
free upgrade to QDA Miner 2.0.
After successfully installing new software, the next task is to get data into
the program so you can get down to work. Fortunately, QDA Miner easily
handles text documents in Microsoft Word, Microsoft Windows Write,
WordPerfect, Rich Text Format (*.rtf), ASCII text, HTML files, and, the icing on
the cake, Adobe Acrobat text files (see note 1). QDA Miner treats such files as
“document variables,” a data type that it displays as a codable, editable text
document. Because documents are handled differently than other QDA Miner
variables, document variables are simply called “documents” or “document
types” in this review. QDA Miner can also import spreadsheet and database
files in Microsoft Access and Excel formats, dBase, Lotus 1-2-3, Paradox, tab-
or comma-delimited text file formats, SPSS *.sav files, and Triple-S format (see note 2).
The wide range of accessible file formats is a plus because real-world data
come in many different formats.
QDA Miner gives you four ways to create a “project,” which contains
the study’s data structure of codes, documents, and variables of numeric,
nominal, ordinal, date, Boolean, or short-string type. First, you can create a
new empty project and then manually enter the codes, documents, and vari-
ables. Second, you can import data from a database query or existing files
saved in any of the file formats listed above. Third, a valuable utility
program called the Document Conversion Wizard enables users to import
existing documents (but not database files; see note 3). Document Conversion Wizard
users can also set the delimiter (e.g., end of page, carriage return, end of
paragraph) for each record or case to be imported. Finally, and new to QDA
Miner in Version 2.0, users can create new projects simply by selecting
existing documents (but not database files) to be included in a new project
and by giving the result a project name. This last method can also be used
to import additional documents into existing projects.
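The record delimiter option described above can be pictured with a minimal sketch. It assumes that "end of page" corresponds to a form-feed character, "carriage return" to a single line break, and "end of paragraph" to a blank line; these mappings, and the names `DELIMITERS` and `split_into_cases`, are illustrative assumptions and not QDA Miner's own implementation.

```python
# Hypothetical delimiter names mapped to the characters assumed to mark them.
DELIMITERS = {
    "end of page": "\f",         # form feed between pages
    "carriage return": "\n",     # one record per line
    "end of paragraph": "\n\n",  # blank line between records
}


def split_into_cases(text, delimiter_name):
    """Split one imported text into cases at the chosen delimiter."""
    marker = DELIMITERS[delimiter_name]
    # Drop empty chunks so trailing delimiters do not create blank cases.
    return [chunk.strip() for chunk in text.split(marker) if chunk.strip()]


# Made-up sample text, not taken from the Welsh Heroes data.
raw = "First biography.\nStill the first.\n\nSecond biography.\n\nThird biography.\n"
cases = split_into_cases(raw, "end of paragraph")
print(len(cases), "cases")
```

The same text yields a different number of cases under a different delimiter, which is why the choice matters at import time.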
To test QDA Miner’s capabilities, we threw at it fifty HTML files, which
represent one-half of the biographies on the controversial “100 Welsh
Heroes” Web site (http://www.100welshheroes.com/). We chose these test
data because they include both quantitative information (e.g., hero rank,
number of votes received, date of birth) and qualitative information (short
biographies for each hero).
The 100 Welsh Heroes poll was sponsored in 2003 by the Welsh
Assembly Government, which spent £154,000 (roughly US$270,000) of
public money on this project, the outcome of which was intended to cele-
brate Welsh culture and heritage. The poll received more than 40,000 votes
and ended in February 2004, with Aneurin Bevan, a major Welsh political
figure of the early to mid–twentieth century, in the number-one spot.
Questions about the possible rigging of the poll were raised even before the
voting ended, after some of the poll staff members went public with the
claim that thousands of votes for entertainers such as Tom Jones had been
“discounted” to ensure that Bevan would emerge the winner (Shipton 2004a,
2004b). As far as we know, the controversy remains unresolved.
Although we made no changes whatsoever to the fifty even-numbered
Welsh heroes biography Web pages (ordered by name, not by Welsh hero
rank), the Document Conversion Wizard imported each document flawlessly.
It automatically converted each imported Web page to rich text format
(*.rtf) but preserved the formatting and fonts of each page (see note 4), all of
which can be edited from within QDA Miner. In another test, we imported
MS Word files with complex formatting and embedded graphics into a new
project and found that, although some of the text formatting did not import
well, the jpg-format digital photos and line drawings did make it into the
QDA Miner document. We also found that, using the Document Conversion
Wizard, you can automate the extraction of variables from structured doc-
uments as they are imported into a project (more on this point in the next
section).
In sum, QDA Miner makes it easy for the researcher to work with data
stored in a wide range of file formats. Most important, QDA Miner per-
forms these functions well.
CODING DATA
FIGURE 1
Main Parts of the QDA Miner Interface
NOTE: The Document Window is the main workspace for each active case. At the top of the
Document Window are the menu bars for the editing and format of the document text. On the
left side of the screen is a stack of three smaller resizable windows that display (from top to
bottom) the case list, the numeric and categorical variables included in the study, and the codes
assigned by the analyst. The Coding Margin on the far right of the screen shows the text seg-
ments to which codes have been assigned.
Case Window list is “Case #—file name”; this can be changed by right-
clicking on any case or left-clicking on the title bar of the window and
choosing “Case descriptor . . . ,” which opens up a dialog box that gives the
user considerable control over the grouping variables by which cases are
displayed in the Case Window. To display a given Welsh Heroes project
document, one simply clicks on a case name and the document opens in the
Document Window. Stacked beneath the Case Window are the Variable and
Code Windows (Figures 2 and 3). The Variable Window displays the list of
project variables and their associated values for the active case. The Code
Window shows all the project codes associated with the current project, not
just those associated with the current document. The Coding Margin on the
right side of the main screen displays the coding assigned to passages in the
active document (Figure 1).
Variables
Variables can be created at any time in a QDA Miner project as the
researcher decides that a particular concept or quality needs to be mea-
sured, counted, or classified for the cases under examination. An anthro-
pologist, for example, might work with interview transcripts as his or her
primary data but also want to investigate whether some views expressed in
these transcripts vary in a patterned way with the gender, religion, age, or
annual income of the interviewees. Traditional QDA software packages
offer relatively limited tools for the analysis of such data in relation to
coded documents. QDA Miner offers these tools and much more to give the
researcher a wide range of choices of qualitative and quantitative analyses.
Figure 2 shows ten variables we created for the Welsh Heroes data set,
plus two variables (FILENO and TEXT1 for case name and document vari-
able, respectively) that were automatically assigned by QDA Miner when
the cases were imported into the project. The list contains a mix of differ-
ent types, including categorical variables (e.g., religion, political, category)
and numeric variables (e.g., Welsh heroes list rank, year died). A pilot study
of ten Welsh heroes was used to identify the variables and levels of mea-
surement we would use in our analysis of the Welsh Heroes data set.
Variable values were subsequently entered for each Welsh hero as we coded
the Welsh hero biographies.
FIGURE 2
Variables Window
NOTE: The Variables Window enables the user to create numeric, date, Boolean, short string,
and document variables as well as dropdown lists of nominal or ordinal variable values from
which the user can choose. In this example, the entry for Alexander Cordell is being assigned to
the “Creative” category. The dropdown list shows that Creative is one of five identified categories
to which a Welsh hero can be assigned. QDA Miner also includes the facility for easily assigning
additional categorical values to this variable.
Had we elected to do so, we could have also created some variables and
automatically imported the associated values available in the Welsh hero
biographies for each case by using the Variable Extraction feature of the
Document Conversion Wizard when we imported the Web pages. Researchers
who work with highly structured documents, such as accident reports, port and
harbor docking records, and other forms-based information, may find the
Variable Extraction feature particularly useful. To use it, the researcher defines
a set of extraction rules for QDA Miner to apply as it imports a set of docu-
ments. Each rule identifies a tag, say, “Date of Embarkation,” a rule to apply
when the tag is encountered when importing a document, say, “UP TO end of
line,” and a variable name, say, “EmbDate,” in which to store the data associ-
ated with this rule. Subsequently, as the Document Conversion Wizard imports
each document, it automatically applies all defined variable extraction rules.
In the example described above, the wizard would import a document, search
it for the string “Date of Embarkation,” and, if found, copy all the text on the
Date of Embarkation line and insert it in the EmbDate variable associated with
that case.
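The extraction rule described above can be sketched in a few lines. This is only an illustration of the idea, not Provalis Research's code: the `ExtractionRule` class, the `extract_variables` function, and the sample docking record are all hypothetical, and "UP TO end of line" is assumed to mean "everything that follows the tag on the same line."

```python
from dataclasses import dataclass


@dataclass
class ExtractionRule:
    tag: str       # text that marks the value, e.g. "Date of Embarkation"
    variable: str  # name of the variable that receives the value


def extract_variables(document, rules):
    """Apply every rule to one document and return {variable: value}."""
    values = {}
    for rule in rules:
        for line in document.splitlines():
            pos = line.find(rule.tag)
            if pos == -1:
                continue
            # Assumed meaning of "UP TO end of line": keep the rest of the line.
            rest = line[pos + len(rule.tag):]
            values[rule.variable] = rest.lstrip(" :\t").rstrip()
            break  # use the first line on which the tag appears
    return values


# Made-up forms-based record for illustration only.
sample = "Vessel: Example Rose\nDate of Embarkation: 12 March 1904\nPort: Exampleton\n"
rules = [ExtractionRule("Date of Embarkation", "EmbDate")]
result = extract_variables(sample, rules)
print(result)
```

A document that lacks the tag simply produces no value for that variable, which mirrors the "if found" condition in the description above.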
Once created, some properties of a variable, such as its name and descrip-
tion, can be edited. You cannot, however, change a variable’s type. If it was
created in the QDA Miner project as an integer-valued variable, then it will
remain one; short string–type variables have to stay as strings, real numbers
can’t be changed to strings, and so on. New values can be added to categori-
cal variables and existing ones can be edited.
Scales can also be reused. If a new variable shares its scale with an existing
one, say, ordinal values between 1 and 5, or “strongly disagree,” “disagree,”
“indifferent,” “agree somewhat,” “strongly agree,” the scale can be copied by
simply selecting the existing variable with the scale of choice when the new
variable is created.
Codes
Although one of QDA Miner’s main selling points is its mixed-model
data handling and analysis capabilities, coding and analyzing text passages
is central to its design. The Code Window displays a tree list of codes
included in a given project (Figure 3). Codes can easily be added, deleted,
edited, merged with other codes, or split into two or more new codes at any
time by right-clicking on a code and choosing the appropriate task. Users
can also control the number of code branches in the tree (Figure 3) and
move codes to new branches.
QDA Miner offers several ways to assign one or more codes to selected
text passages in a document. You can click-and-drag from the selected passage
to the target code or from the selected code to the target passage. You can
select the passage and double-click the code. Or, most cumbersome of all, you
can select the desired code from the code dropdown list above the document
window, highlight the passage to code, and then click on the code list button
on the right side of the code dropdown list.
Regardless of how coding is done, the result is one or (typically) more code
marks (i.e., brackets and code labels) that identify the location and span of
a coded passage in the Coding Margin window (Figure 1). Most ways of
coding text in QDA Miner also permit the user to set the scope of a given
passage from a single character to the entire document. Once assigned,
codes associated with a given passage can be removed, recoded, or resized
to increase or narrow the assigned code’s scope. Code mark colors can also
be assigned by the user to different codes or even to different coders.
In sum, “coding” in QDA Miner often involves assigning values to vari-
ables as well as assigning code labels to selected text passages. The end
result can set the stage for a particularly rich body of potential analyses of
the patterns in a data set, to which we now turn.
FIGURE 3
Codes Window
NOTE: The Codes Window provides a tree list of the codebook defined for a given project.
Each codebook category can have a default maximum of three levels of subcategories (not
counting the root category); the maximum number of levels can be changed to between two
and eight levels by the user. Codes can be edited, added, deleted, split, merged, or sorted by
simply right-clicking on the code name.
FIGURE 4
Text Retrieval Command Window
NOTE: The Text Retrieval command window has two main parts, the Search Expression
and Search Hits tabs. The search command is mostly constructed by choosing items from
dropdown lists.
FIGURE 5
The Search Hits Tab of the Text Retrieval Command Window
Displaying the Results of a Query
(e.g., “@WAR,” where WAR words are defined in a QDA Miner thesaurus
entry) can also be used. Finally, the “Add variables” box at the bottom of
the Search Expression window does not constrain the search; it merely
identifies variables (Welsh heroes list rank and birth year in the example
given in Figure 4) that QDA Miner should include for each coded segment
that appears in the Search Hits window.
A click on the Search Hits tab in the Text, Coding, Section, or Keyword
Retrieval windows displays the results of a given query. Figure 5 shows five
of the seven hits that resulted from the search specified in the preceding
paragraph and given graphically in Figure 4. Data columns to the left of
Rank in Figure 5 were generated by default by QDA Miner. Rank and Born,
the remaining columns, were, of course, added as part of the original query.
Although having a scrollable set of search results is sufficient to answer
many questions, QDA Miner packs a lot more functionality into its search
capabilities. The results for Text, Coding, Section, or Keyword Retrieval
searches can be coded in the Search Hits Window, sorted, printed, or saved to
disk as MS Excel, MS Word, ASCII text, comma- or tab-delimited, HTML,
or XML format files.
As noted above, the Coding, Section, and Keyword Retrieval tools employ
the same basic interface as that of Text Retrieval. They differ in the particu-
lar type of query offered to the user. The Coding Retrieval tool, as the name
suggests, searches passages coded by the user. The searches can be simple
ones, such as retrieving all Welsh Hero passages assigned to the Career Lift
code, or be much more complex. One could, for example, specify a Coding
Retrieval query to display all Career Lift coded passages that are near pas-
sages coded to Anglican and not overlapping with passages coded for Great
Contribution simply by clicking on dropdown lists in the Search Expression
window and selecting the appropriate operators and codes.
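To make the logic of such a compound query concrete, here is a minimal sketch under stated assumptions. Coded passages are modeled as character ranges within a case, "near" is taken to mean "separated by no more than `max_gap` characters in the same case," and the case names and offsets are invented. QDA Miner's own definition of proximity may differ.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    case: str
    code: str
    start: int  # character offset where the coded passage begins
    end: int    # character offset just past its last character


def overlaps(a, b):
    return a.case == b.case and a.start < b.end and b.start < a.end


def gap(a, b):
    """Characters separating two passages (0 if they touch or overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def retrieve(segments, target, near_code, not_overlapping_code, max_gap=200):
    """Passages coded `target` that are near `near_code` and do not
    overlap any passage coded `not_overlapping_code`."""
    hits = []
    for seg in segments:
        if seg.code != target:
            continue
        same_case = [s for s in segments if s.case == seg.case and s is not seg]
        is_near = any(s.code == near_code and gap(seg, s) <= max_gap
                      for s in same_case)
        is_clear = not any(s.code == not_overlapping_code and overlaps(seg, s)
                           for s in same_case)
        if is_near and is_clear:
            hits.append(seg)
    return hits


# Hypothetical codings; case names and offsets are made up.
segments = [
    Segment("case-01", "Career Lift", 100, 180),
    Segment("case-01", "Anglican", 200, 240),
    Segment("case-02", "Career Lift", 50, 120),
    Segment("case-02", "Anglican", 130, 160),
    Segment("case-02", "Great Contribution", 110, 150),
    Segment("case-03", "Career Lift", 10, 60),
]
hits = retrieve(segments, "Career Lift", "Anglican", "Great Contribution")
print(hits)
```

Only the first case survives both conditions: the second fails the "not overlapping" test and the third has no Anglican passage nearby.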
The Section Retrieval command is designed for queries of structured doc-
uments (e.g., a form such as the personal history section of an employment
application) in which alphanumeric strings (e.g., “NAME,” “SSAN,” “CUR-
RENT ADDRESS”) that appear in each document can be used to set the
query’s scope. Keyword Retrieval, on the other hand, works with any docu-
ment type but requires the Provalis Research software product WordStat 5.*
(Lewis 1999) to generate the keywords applied in this type of search. Without
WordStat 5.*, it’s more a tease than a tool in QDA Miner. We did not test its
capabilities.
In sum, QDA Miner’s search and coding features are well designed and give
the user a lot of flexibility in setting up a wide range of queries. Researchers
who work with structured documents will particularly appreciate the consider-
ation given to making their search and coding tasks easier.
ANALYSES
Where QDA Miner truly shines is with its mixed-model analytical tools.
There is a good mix of these commands, they work as advertised, and they
are well integrated with other aspects of the program.
The simplest of the mixed-model tools is the Coding Frequencies com-
mand, a new addition to QDA Miner with Version 2.0. Although it can be
used just to generate a code list for the active project, the Statistics option
delivers a spreadsheet-like display of counts and percentages for each code,
including the number of times a given code is used in the data set, the
number of cases in which it occurs, and the number of words in all passages
assigned to each code (Figure 6, rear window). User-selected rows in the
resulting table can be displayed as editable 2-D or 3-D bar charts (e.g.,
Figure 6, front window) or pie charts and can be printed, copied to the clip-
board, or saved to the same range of file types as described above for the
various Retrieval command searches.
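The three tallies reported by the Statistics option (uses of a code, cases in which it occurs, and words in its passages) are simple to reproduce by hand. The sketch below assumes each coded passage is available as a (case, code, text) triple and counts words by splitting on whitespace; the function name and sample passages are hypothetical.

```python
from collections import defaultdict


def coding_frequencies(coded_passages):
    """coded_passages: iterable of (case, code, passage_text) triples."""
    stats = defaultdict(lambda: {"count": 0, "cases": set(), "words": 0})
    for case, code, text in coded_passages:
        entry = stats[code]
        entry["count"] += 1
        entry["cases"].add(case)
        entry["words"] += len(text.split())  # whitespace word count (assumption)
    total = sum(entry["count"] for entry in stats.values())
    return {
        code: {
            "count": entry["count"],
            "pct_of_codes": round(100 * entry["count"] / total, 1),
            "cases": len(entry["cases"]),
            "words": entry["words"],
        }
        for code, entry in stats.items()
    }


# Made-up coded passages for illustration only.
passages = [
    ("case-01", "Career Lift", "won a scholarship to study abroad"),
    ("case-01", "Great Contribution", "founded a national institution"),
    ("case-02", "Career Lift", "signed a recording contract"),
    ("case-02", "Career Lift", "moved to a larger stage"),
]
table = coding_frequencies(passages)
for code, row in table.items():
    print(code, row)
```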
Perhaps more interesting to many prospective users, and certainly a lot
more complex, are the Coding Co-Occurrence, Coding Sequence, and
Coding by Variables commands. At first glance, the command windows for
these tools may seem a little ho-hum, but once you execute a query and
begin to examine the results by clicking on the various window tabs, it is
immediately apparent that QDA Miner is stuffed with useful features that
facilitate analysis and interpretation.
FIGURE 6
Coding Frequencies Command and Display
NOTE: The Coding Frequencies command generates code counts that can be selected and
exported as spreadsheet files or plotted as bar charts or pie charts with a couple of mouse clicks.
As the name suggests, the Coding Co-Occurrences command examines
the extent to which selected codes or code categories tend to co-occur in the
active project data set. The command window is divided into three main
parts (Figure 7). Users select the domain of the query and the codes to
examine from the top two dropdown lists. Immediately below these dropdown
lists are options that can be set for the average-linkage clustering algorithm
(see note 6) that QDA Miner employs to classify the selected codes or the project
cases by their respective similarity or distance matrices. Finally, the bottom
half of the command window contains the options that can be set for QDA
Miner’s multidimensional scaling algorithm (see note 7), which is used to create
“concept maps” or graphical representations of the conceptual proximity of the
selected codes or cases in relation to each other.
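For readers unfamiliar with the technique, the sketch below shows how a code similarity matrix can feed an average-linkage clustering. It rests on assumptions of our own: two codes are said to co-occur when they appear in the same case, similarity is measured with the Jaccard coefficient (QDA Miner's options may offer other measures), and the code-to-case data are invented.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of cases containing either code that contain both."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(code_cases):
    """code_cases: {code: set of case ids in which the code occurs}."""
    return {
        frozenset((a, b)): jaccard(code_cases[a], code_cases[b])
        for a, b in combinations(sorted(code_cases), 2)
    }


def average_linkage(codes, sim):
    """Repeatedly merge the two clusters with the highest average
    pairwise similarity; return the merge history."""
    clusters = [frozenset([c]) for c in sorted(codes)]

    def cluster_sim(x, y):
        pairs = [sim[frozenset((a, b))] for a in x for b in y]
        return sum(pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        x, y = max(combinations(clusters, 2), key=lambda p: cluster_sim(*p))
        merges.append((sorted(x), sorted(y), cluster_sim(x, y)))
        clusters = [c for c in clusters if c not in (x, y)] + [x | y]
    return merges


# Hypothetical data: which cases (by number) contain each code.
code_cases = {
    "Career Lift": {1, 2, 3, 4},
    "Great Contribution": {1, 2, 3},
    "Military": {3, 5},
    "Religion": {5, 6},
}
sim = similarity_matrix(code_cases)
merges = average_linkage(code_cases, sim)
for left, right, score in merges:
    print(left, "+", right, "at similarity", round(score, 3))
```

The merge history is the information a dendrogram displays: codes joined early and at high similarity sit close together in the tree.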
The results of a Coding Co-Occurrences query are displayed by clicking
on the other tabs in the command window. The main tabs are, first, a cluster
FIGURE 7
Options Tab of the Coding Co-Occurrences Command Window
FIGURE 8
Coding Co-Occurrences Window
NOTE: The Coding Co-Occurrences command packs a lot of functionality into six window
tabs, including cluster analysis, multidimensional scaling, proximity plots (shown here), and
similarity matrix data.
improved the hero’s lot in life) and Great Contribution codes tend to
strongly co-occur, and at the other end of the proximity scale, statements
about religion seldom figure in the passages coded against Career Lift in
the Welsh Hero biographies.
The Code Sequence command takes the co-occurrence idea a step fur-
ther and examines the specific order of code co-occurrences, given a set of
user-specified conditions. Unlike most QDA Miner analytical commands,
the Code Sequence command results are tabular rather than graphical. The
results center on showing the number of times (and percentages) that Code
A precedes and/or follows Code B. For example, a Code Sequence analy-
sis of selected codes in the Welsh Hero data set identified five instances in
which some mention of the military experience of a Welsh hero was fol-
lowed in their biographical sketch by mention of their greatest contribu-
tions to Welsh culture and society; there were, however, no instances in
which mention of greatest contributions preceded passages coded against
some mention of their military experience.
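The kind of tally behind such a result can be sketched as follows. The sketch assumes each coded passage is reduced to its starting offset within a case and counts every ordered pair of different codes in which one begins before the other; the optional `max_distance` stands in for the user-specified conditions, and all names and data are hypothetical.

```python
from collections import Counter


def sequence_counts(case_codings, max_distance=None):
    """case_codings: {case: [(start_offset, code), ...]}.
    Returns a Counter keyed by (earlier_code, later_code)."""
    follows = Counter()
    for codings in case_codings.values():
        ordered = sorted(codings)
        for i, (start_a, code_a) in enumerate(ordered):
            for start_b, code_b in ordered[i + 1:]:
                if max_distance is not None and start_b - start_a > max_distance:
                    break  # later passages are even farther away
                if code_a != code_b and start_b > start_a:
                    follows[(code_a, code_b)] += 1
    return follows


# Made-up codings for two cases.
case_codings = {
    "case-01": [(10, "Military"), (300, "Great Contribution")],
    "case-02": [(40, "Military"), (90, "Career Lift"), (400, "Great Contribution")],
}
counts = sequence_counts(case_codings)
print(counts[("Military", "Great Contribution")])   # Military first
print(counts[("Great Contribution", "Military")])   # reverse order
```

Comparing the two directions of each pair is what reveals an asymmetry like the one reported for military experience and greatest contributions.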
FIGURE 9
Correspondence Analysis Plot Generated by the Coding
by Variables Command
NOTE: Results are limited to the first three dimensions, each combination of which can be dis-
played as a 2-D graphic.
FIGURE 10
Heatmap Display
NOTE: Heatmap plots can also be constructed with the Coding by Variables command. These
complex graphics show the structure of a two-way contingency table as different levels of
brightness keyed to the table’s cell values.
lesser extent with the Middle Class. There is no strong association between
Welsh regions and Leaders; in point of fact, as a group they can be viewed
as not very Welsh. Thinkers, on the other hand, are most closely associated
with South Wales and the Working Class. Unlike Leaders, Thinkers and
Performers also tend to have a strong association with Wales itself.
The heatmap (Figure 10) is an unusual graphic that displays cross-tabulated
data patterns by replacing the number in each cell with a color, the brightness
of which varies with the cell value. Combine this display with cluster analy-
sis dendrograms, as in Figure 10, and you have a multilayered picture of the
data patterns in the original table. And if this sounds complicated, you’re right.
It is complicated, and most users will need to devote a few minutes to work-
ing through a couple of examples before they feel comfortable enough with
this tool to apply it in their research. The heatmap plot in Figure 10 displays
the results of the same query considered in the correspondence analysis
described above (see note 11). The monochrome color ramp in the lower left
corner of Figure 10 shows how the changes in heatmap brightness are keyed to
the percentage values of the table’s cells. The heatmap color plot and the
cluster analysis dendrograms (both of which are based on the similarity
matrices) provide essentially the same basis for inference that can
be seen graphically in the correspondence analysis. Here, too, we can see the
strong association between Thinkers, Performers and South Wales, and between
Leaders and the Gentry.
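The core of a heatmap, replacing each cell value with a brightness level, can be illustrated with a small text-only sketch. It assumes cell values are first converted to row percentages and then binned into a five-step ramp of characters standing in for shades; the ramp, the function names, and the table values are illustrative assumptions, and the dendrograms are omitted.

```python
def row_percentages(table):
    """Convert each row of counts to percentages of that row's total."""
    result = []
    for row in table:
        total = sum(row)
        result.append([100 * v / total if total else 0.0 for v in row])
    return result


def shade(pct, ramp=" .:*#"):
    """Map a percentage (0-100) to one character of a dark-to-bright ramp."""
    index = min(int(pct * len(ramp) // 100), len(ramp) - 1)
    return ramp[index]


def render(table):
    return ["".join(shade(p) for p in row) for row in row_percentages(table)]


# Hypothetical two-way contingency table (rows: categories, columns: groups).
table = [
    [8, 1, 1],
    [2, 6, 2],
]
rendered = render(table)
for line in rendered:
    print(repr(line))
```

Even in this crude form, the brightest cell in each row shows at a glance where the strongest association lies.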
In sum, QDA Miner possesses outstanding mixed-model analytical
tools, all of which work well. We are impressed not only by the range of
functionality represented in this software (not all of which we have the
space to examine here), but also by the good design evident in how the
program’s user interface presents the various components of each tool to
the user.
OTHER FEATURES
CONCLUSIONS
APPENDIX
Characteristic QDA Miner 2.0
NOTES
1. QDA Miner will import the text from Adobe Acrobat files containing a mix of text and
images; it will not import Acrobat files containing only images.
2. Triple-S is a public source survey metadata data structure (Wright 2002).
3. The Document Conversion Wizard can be accessed as a stand-alone program or directly
from within QDA Miner.
4. We also could have imported these documents as plain text but at the cost of formatting,
fonts, tables, graphics, and so on.
5. A case can comprise multiple documents. In the Welsh Heroes data set, there is only
one document per case: a biographical sketch for each Welsh hero. Projects in which each case
may contain several documents are common. For example, an accident case might consist of
the report filed by the officer at the scene, interviews with each witness, an accident recon-
struction report, follow-up interviews with hospitalized accident victims, and so on. QDA
Miner provides the facility to limit the span of a search to a particular document type (e.g.,
accident reconstruction reports).
6. An average linkage clustering algorithm classifies cases into clusters on the basis of the
average similarity of a case given all existing cases in a cluster.
7. Multidimensional scaling is a statistical technique for displaying the similarities or dis-
tances of multivariate data in a low-dimensional space. The result is often a two-dimensional
graph of similarities.
8. The Statistics tab of the Coding Co-Occurrences window displays the similarity and
co-occurrence matrices.
9. The choice of 1- or 2-tailed probabilities is user-selectable for several of these tests.
10. Correspondence analysis represents the rows and columns of a two-way contingency
table “using a low-dimensional Euclidean space such that the locations of the row and column
points are consistent with their associations in the table” (Péladeau 2006: 116). Interpretation
largely focuses on inspection of the resulting graphic.
11. This heatmap plot is based on a relatively small contingency table, and as such, it does not
make the most convincing case for the interpretive utility of such plots. Heatmap plots are partic-
ularly useful in the interpretation of large tables, say, those greater than ten rows and columns.
12. See Lewis (1999) for a review of the capabilities of these Provalis Research products.
REFERENCES
Berry, M. W. 2003. Survey of text mining: Clustering, classification, and retrieval. New York:
Springer-Verlag.
Keim, D. A. 2002. Information visualization and visual data mining. IEEE Transactions on
Visualization and Computer Graphics 8 (1): 1–8.
Lewins, A., and C. Silver. 2005. Choosing a CAQDAS package. 3rd ed. CAQDAS
Networking Project: http://caqdas.soc.surrey.ac.uk/ChoosingLewins&SilverV3Nov05.pdf
(accessed February 27, 2006).
Lewis, R. B. 1999. SIMSTAT with WORDSTAT: A comprehensive statistical package with a
content analysis module. Field Methods 11 (2): 166–79.
———. 2004. NVivo 2.0 and ATLAS/ti 5.0: A comparative review of two popular qualitative
data analysis programs. Field Methods 16 (4): 439–64.
Péladeau, N. 2006. QDA Miner, qualitative data analysis software: User’s guide. Montreal,
Canada: Provalis Research.
Péladeau, N., and C. Stovall. 2005. Application of Provalis Research Corp.’s statistical con-
tent analysis text mining to airline safety reports. Montreal, Canada: Provalis Research.
Shipton, M. 2004a. “Dirty tricks” heroes claim. Western Mail (July 23),
http://icwales.icnetwork.co.uk/0900entertainment/0050artsnews/tm_method=full%26objectid=14453751%26siteid=50082-name_page.html
(accessed March 1, 2006).
———. 2004b. “Welsh Heroes” row flares up. Western Mail (August 31), http://icwales.icnetwork
.co.uk/0100news/newspolitics/tm_objectid=14586997&method=full&siteid=50082&head
line=-welsh-heroes—row-flares-up-name_page.html (accessed March 1, 2006).
Snow, C. P. 1964. The two cultures and a second look: An expanded version of the two
cultures and the scientific revolution. Cambridge: Cambridge University Press.
Weitzman, E., and M. B. Miles. 1995. Computer programs for qualitative data analysis:
A software sourcebook. Thousand Oaks, CA: Sage.
Wright, G. 2002. The Triple-S standard. Paper presented at the Association for Survey
Computing Conference, “Open Standards: Breaking down the Barriers,” at Imperial College,
London, September 19.