Power Tools: Code, Analyze, and Illustrate Data With The Provalis Research Suite

software review

By Deborah Bobier

Montreal’s Provalis Research offers a suite of data and text analysis
tools for those interested in both quantitative and qualitative data. It
is particularly useful for those who wish to “put numbers” to qualitative
data to help establish frequencies and relationships. The current
offering consists of: Simstat, a Microsoft Windows-based, full-purpose
statistical package; WordStat, a content analysis and text mining module;
and the newer (released in 2004) QDA Miner, a qualitative data analysis
package. Note that WordStat is not a stand-alone application; users must
access it through Simstat or QDA Miner.

Simstat. This is a statistical analysis package not unlike SPSS for
Windows, and those familiar with the latter will find that navigation is
easy and straightforward. It contains a tutorial to help users become
familiar with the possibilities of the software (and there are many), and
the comprehensive manual provides clear instructions on how to perform
actions and analyses, not just a description of what is available. This
is often sorely lacking in software manuals, forcing users to find other
sources of information to work with new software, and is added value
here. Users can enter data directly into Simstat or import them from a
variety of sources, including SPSS, Excel, and comma- or tab-delimited
ASCII data. If necessary, they can easily merge data files, a handy
feature if multiple people are entering data. Simstat provides the full
range of statistical offerings, from more basic cross-tabs and
descriptive statistics to factor-, bootstrap-, and time-series analysis.
User-friendly descriptions of each test and the available options are
presented (users must bring their own knowledge of interpretation, and
the software cannot guard against inappropriate uses). Users can create
and edit a full range of charts (e.g., bar, pie, Pareto, histogram,
box-and-whisker, scatter) following analysis, and transfer them to the
clipboard or export them to other applications for use elsewhere.

When running analyses, the notebook displays the statistical output for
all analyses performed during a session. For those who have lost their
way searching for a particular data run, through page after page of
logged results in a single file, one of the nicest features of the
notebook is that users can add pages to it to organize output and make
navigation faster and easier. Labeled tabs on pages make searching even
simpler. Blank pages can be added to the notebook, allowing users to
include the analysis plan, make comments, and outline next steps if
desired.

Simstat also enables users to save analysis specifications with a script
feature. The script automatically keeps track of what was done in a
session, and users can save this for later. This is extremely helpful
when someone frequently uses an analysis, or when the analysis is complex
and would be time-consuming to recreate with the Windows point-and-click
method. Descriptions of the commands/syntax conventions and examples of
scripts help less-familiar users.

WordStat. This is an add-on module for studying textual information such
as responses to open-ended questions and interviews, articles, speeches,
and other communications. Its power is that it allows for automatic
categorization of text with a dictionary approach (after some user
setup). For those who have ever needed to find themes or relationships in
verbatim responses, focus group transcripts, or other text sources,
WordStat is very attractive indeed. Although getting started might take a
while, both in formatting the text to be analyzed and setting up the
dictionary, the results make it worthwhile.

Exhibit 1: Heat plot

To get full value from WordStat’s capabilities, users must invest time in
customizing and maintaining the appropriate dictionaries (for
categorization and exclusion). They must populate the inclusion
dictionary with the categories under consideration, all of the words and
technical terms to be included in the category, and so on. The advantage
is that this is a subject-specific word list, and users can save it,
reuse it,
and modify it. They can also append words and categories into other
dictionaries, which makes establishing subsequent dictionaries more
efficient.

WordStat offers tools to help users compile dictionaries. For example,
they can look at simple word frequencies of all the words contained in
the data, automatically generated by WordStat. They can also look at the
words not contained in the inclusion dictionary, to make sure important
words haven’t been overlooked. The phrase finder looks at idioms,
phrases, and expressions throughout the text, which otherwise might be
missed. WordStat also will suggest synonyms for the existing words in
users’ categories, which can enhance the dictionary. The
keyword-in-context page allows users to see, in one table, all
occurrences of a word or category in the original text. Users then can
sort these instances to look for similarities or differences in word
usage, as well as inconsistencies in word meanings. Any discrepancies in
usage can then be addressed by refining the dictionary.

Users also can create rules to specify under which conditions a word or
category should be coded. If specified properly, this can reduce
ambiguity when words have multiple meanings, because the different
meanings can be clearly defined. An exclusion dictionary specifies which
words should not be included in the analysis. It comes already populated,
and can be edited by users.

Once the dictionaries have been set up, users must prepare their data
files and documents for import into the program. (All text must be in raw
text, or ASCII, format.) WordStat searches for specific spellings;
therefore users should check text before beginning analysis, to ensure
that words aren’t missed (Provalis supports English, French, Spanish, and
several other languages). Other formatting issues will also need to be
addressed. WordStat generally treats hyphenated words as separate words,
and uses brackets and braces as specific markers, so users need to remove
these symbols or replace them with other symbols. Lemmatization is also
used: treating word stems, singular and plural, and tense forms as one
word. Users can adjust these rules for the dictionary.

Now the fun begins. Users can cross-tab categories and words with other
categorical variables of interest to uncover patterns in the data, and it
is easy to switch between the cross-tabs and the keyword-in-context page
to get additional insight. Anyone interested in some of the more powerful
ways to depict relationships between categories and categorical variables
will enjoy the heat plots, which show relative correlation between words
or categories using a color spectrum. WordStat offers multiple spectra to
suit various tastes.

Users can easily create dendrograms (tree graphs) to display hierarchical
clusterings of categories. Correspondence maps can be used to illustrate
the likelihood that items (categories and variables) will appear
together. This can be shown in two-dimensional and three-dimensional
maps, and gives a snapshot of similarities between items. If this isn’t
enough for the average user, then multivariate statistics also are
available.

QDA Miner. This is designed for use with already coded data, or for
coding text data. QDA Miner and WordStat possess many similarities, and
the two can work together for enhanced capability. For instance, the
word- and phrase-finder functions of WordStat can identify items that
might be included in the code book, and QDA Miner also offers heat maps,
dendrograms, and correspondence maps. The manual provides easy
instructions for running these analyses, and detailed explanations of the
output.

QDA Miner presents a straightforward way to manually code text. Users
select the text to be coded and click the desired code, and a code mark
appears in the right margin. Code marks appear as the “name” of the code
and a colored bracket, to show the physical limits of the coded segment.
Users can code sections within segments multiple times, with additional
code marks appearing in the right margin, and easily add or remove code
marks as needed. A nice feature is that they can add comments to coded
text by clicking the code mark; these appear as a small yellow square in
the middle of the code bracket.

Exhibit 2: Code mark example

Some of the package’s other highly useful functions are the multiuser
possibilities, and the ability to merge files or projects when coding is
performed on different computers, or in different files by different team
members. Creating project backups is simple using an archiving procedure.
Users should do this regularly, so they can recover lost variables or go
back to earlier versions.

Separately or together, Provalis’ statistical and text analysis software
packages provide basic and higher-level tools for coding, analyzing, and
illustrating data. Users looking for only a basic statistics package
might find Simstat somewhat overwhelming; however, it is easy to use, and
help is available with the tutorial and user’s manual. WordStat and QDA
Miner will allow researchers to fully investigate textual information,
and although startup may be time-consuming, the end result is well worth
it.

Those interested can download demonstrations from the company’s Web site
(www.provalisresearch.com). They can also order products there. Current
prices are $355 for Simstat, $595 for QDA Miner, and $1,150 for WordStat.

Deborah Bobier is an account executive at Millward Brown in Toronto. She
may be reached at deborah.bobier@ca.millwardbrown.com.
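
For readers curious about what the dictionary approach in the WordStat
section amounts to, here is a minimal Python sketch. It is not WordStat's
implementation or its file format; the two dictionaries, the sample
responses, and every function name are hypothetical and exist only to
show the idea: skip excluded words, count the rest toward their
categories, list the words in neither dictionary, and cross-tab the
category counts against a categorical variable.

```python
import re
from collections import Counter, defaultdict

# Hypothetical inclusion dictionary: category -> words that count toward it.
INCLUSION = {
    "price": {"price", "cost", "expensive", "cheap"},
    "service": {"service", "staff", "support", "helpful"},
}

# Hypothetical exclusion dictionary: words left out of the analysis.
EXCLUSION = {"the", "a", "and", "was", "is", "but", "too", "very"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def categorize(text):
    """Count how often each dictionary category occurs in one response."""
    counts = Counter()
    for word in tokenize(text):
        if word in EXCLUSION:
            continue
        for category, words in INCLUSION.items():
            if word in words:
                counts[category] += 1
    return counts


def uncategorized(text):
    """Return words found in neither dictionary, as candidates to review."""
    known = set().union(*INCLUSION.values()) | EXCLUSION
    return sorted(set(tokenize(text)) - known)


def crosstab(responses):
    """Tally category counts by a categorical variable (a segment label)."""
    table = defaultdict(Counter)
    for segment, text in responses:
        table[segment].update(categorize(text))
    return {segment: dict(counts) for segment, counts in table.items()}


# Made-up open-ended responses, each tagged with a made-up segment.
responses = [
    ("new customer", "The price was too expensive but the staff was helpful."),
    ("new customer", "Cheap and very good service."),
    ("returning", "Support is slow and the cost is high."),
]
table = crosstab(responses)
leftovers = uncategorized("Cheap and very good service.")
```

In this toy data, the word "good" turns up in the leftovers list, which
is the kind of prompt that would send a user back to refine the
dictionary.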

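The keyword-in-context idea described in the WordStat section can be
sketched in a similar way. Again, this is an illustration under stated
assumptions rather than the product's own method: the function name
kwic, the width parameter, and the sample sentence are all hypothetical.
Each occurrence of a keyword is returned with a few words on either
side, so the rows can be sorted and scanned for differences in usage.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence."""
    words = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    rows = []
    for i, word in enumerate(words):
        if word == target:
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            rows.append((left, word, right))
    return rows


# Made-up text in which one word carries two meanings.
text = "The bank raised rates. We walked along the river bank at dusk."
rows = kwic(text, "bank", width=2)

# Sorting by the right-hand context groups similar usages together.
by_right_context = sorted(rows, key=lambda row: row[2])
```

Seeing "raised rates" next to one occurrence and "the river" next to
the other is the kind of inconsistency in word meaning that a coding
rule or a dictionary refinement would then need to resolve.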
