Statistica Release Notes - 125 PDF

What's new in Dell Statistica v12

Overview
Better. Bigger. Faster.
An explosive combination of Big Data growth, digital storage capabilities, and technological advances has forever
altered the modern business analytics landscape. The application of analytic tools and decision making is no longer
limited to the realm of data scientists, computer programmers, engineers, and the like. Rather, analytics are now being
integrated into day-to-day tasks across all departments, utilized by project managers, business analysts, predictive
modelers, customer agents, and executive leaders who need access to sensible, actionable information, and who
need visual user interfaces to create, consume, and share KPIs, graphs, reports, slide presentations, and more.
To meet these changes head on, we made Statistica even faster, more flexible, and more functional than ever:
- We boosted the Big Data performance of the entire product line.
- We added a visual user interface to write SQL queries with the new Advanced Query Builder in all products.
- We reinvented the visual analytic workspace in Statistica Enterprise and Data Miner for a more intuitive user
  experience, with greater visual workflow and storage capabilities to help users understand and communicate
  their findings.
- We strengthened the predictive/prescriptive capabilities of Decisioning Platform.
- We introduced the highly flexible Reporting Tables product that enables users to visually build tables of
  summary statistics and use them in presentations and other reports.
- We developed new nodes, such as the practical Data Health Check that facilitates cleanup of a large number of
  variables.
With the rollout of Statistica 12 in April 2013, we've built on a nearly 30-year legacy of exceeding customer expectations,
furnishing this ever-growing business landscape with a host of relevant features and performance improvements that
will make our analytic solutions even faster, more accessible, and more effective for business leaders and power users
alike.

We fit into your IT world better than any alternative. Whether handling medium data or Big Data, Statistica 12 takes
greater advantage of existing data warehouses and IT tools than ever before, helping move businesses even closer and
faster to meaningful ROI.

All components
Advanced Query Builder

Advanced Query Builder (AQB) makes it possible for even non-technical staff to write complex queries to retrieve data.
It has a new visual user interface to build queries (dragging, dropping, nesting, selecting). The application's parsing
engine determines the current context.

Dell Software

August 2014

Offering features usually found only in specialized applications, AQB can build left, right, and full outer joins
graphically; can build queries with aggregate functions; can build complex queries involving unions and
minus operations; can graphically represent complex SQL queries and ER diagrams; and lets you change the
SQL dialect when the universal default is not practical.
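
To make these query types concrete, here is a minimal sketch of the kind of SQL such a builder might generate: a left outer join with aggregate functions, and a set difference. It runs against Python's built-in sqlite3 module, not Statistica, and the table and column names are hypothetical. SQLite spells the minus operation EXCEPT, which is one example of why a dialect switch can matter.

```python
import sqlite3

# Hypothetical in-memory tables; nothing here comes from Statistica itself.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 10.0), (1, 15.0), (2, 7.5);
""")

# Left outer join with aggregates: customers without orders are still listed.
totals = conn.execute("""
    SELECT c.name, COUNT(o.amount), COALESCE(SUM(o.amount), 0)
    FROM customers AS c
    LEFT OUTER JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

# Set difference (MINUS in some dialects): customer ids with no orders.
no_orders = conn.execute("""
    SELECT id FROM customers
    EXCEPT
    SELECT customer_id FROM orders
""").fetchall()
```

Here `totals` holds one row per customer, including the customer with no orders, and `no_orders` holds the ids that appear only in the customers table.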

Spreadsheet Improvements
NEW FILE FORMAT FOR BETTER SUPPORT OF BIG DATA
Statistica now features a new data file format that is optimized for Big Data by supporting variable storage length for
text variables. When text variables include sparsely populated columns, the space occupied by those values is
automatically optimized, reducing spreadsheet sizes sufficiently to produce significant performance improvements.
SPREADSHEET VIRTUAL VARIABLES
Spreadsheets now use virtual variables that can be specified by formula and evaluated at run time, requiring no real
storage. These virtual variables are added or deleted behind the scenes without needing to rewrite entire spreadsheet
data sections, so users will notice only enhanced performance. New data is held in a separate vector on disk and is
merged with the original spreadsheet when the data is saved. The performance gain is greatest when you add
transformed variables to large spreadsheets.
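
The idea of a formula-defined variable that occupies no storage can be sketched in a few lines. This is an illustration only, not Statistica's implementation; the `Sheet` class and its methods are hypothetical names.

```python
class Sheet:
    """Toy spreadsheet: stored columns plus formula-defined virtual columns."""

    def __init__(self, columns):
        self.columns = dict(columns)   # name -> list of stored values
        self.virtual = {}              # name -> formula taking a row dict

    def add_virtual(self, name, formula):
        # Only the formula is recorded; no stored data is written or rewritten.
        self.virtual[name] = formula

    def value(self, name, row):
        if name in self.columns:
            return self.columns[name][row]
        # Evaluate the formula at read time from the stored values of this row.
        stored = {col: vals[row] for col, vals in self.columns.items()}
        return self.virtual[name](stored)


sheet = Sheet({"height_cm": [150.0, 180.0], "weight_kg": [45.0, 81.0]})
sheet.add_virtual("bmi", lambda r: r["weight_kg"] / (r["height_cm"] / 100) ** 2)
```

Calling `sheet.value("bmi", 0)` computes the value on demand, and adding the variable never touches the stored columns.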
INCREASE IN TEXT LABELS
Text Label support in spreadsheets has now been increased to millions of distinct labels with significant performance
improvements for name/value lookup. This makes Text Labels a good choice for text fields with large numbers of
distinct values, inheriting all the performance benefits from a fixed storage size of the numeric value and avoiding
duplication of repeated values.
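
The storage benefit described here resembles dictionary encoding: each distinct text value is stored once and every cell holds a fixed-size numeric code. A minimal sketch of that general technique follows; the function name is hypothetical, and the actual Statistica file layout is not described in these notes.

```python
def encode_labels(values):
    """Map each distinct text value to a numeric code (dictionary encoding)."""
    label_to_code = {}
    codes = []
    for text in values:
        # setdefault assigns the next free code only the first time a label is seen.
        codes.append(label_to_code.setdefault(text, len(label_to_code)))
    code_to_label = {code: text for text, code in label_to_code.items()}
    return codes, code_to_label


codes, labels = encode_labels(["red", "blue", "red", "green", "blue"])
```

Repeated values such as "red" are stored once in `labels`, while `codes` holds one small integer per cell.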

AGGREGATE FUNCTIONS IN OLE DB PROVIDER FOR Statistica SPREADSHEETS
The OLE DB provider now supports aggregate functions such as average, count, max, min, and sum.
IMPORTING TEXT FILES USING AUTO-FIXED IMPORTING VARIABLE OPERATIONS
This enhancement lets you take blocks of data that contain fixed-length pieces of information and specify the
fixed lengths used to import variable-specific information.

Statistica now has the option for a Fixed import setting.
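
As an illustration of what a fixed-length import does, the sketch below slices each record into fields of given widths. It is plain Python, not the Statistica import dialog; the field names and widths are hypothetical.

```python
def parse_fixed_width(lines, widths, names):
    """Split each line into named fields of the given fixed widths."""
    records = []
    for line in lines:
        record, start = {}, 0
        for name, width in zip(names, widths):
            record[name] = line[start:start + width].strip()
            start += width
        records.append(record)
    return records


# Build two sample records: 4-character id, 10-character surname, 3-character score.
sample = [
    "A001" + "Smith".ljust(10) + "042",
    "B017" + "Nakamura".ljust(10) + "107",
]
rows = parse_fixed_width(sample, widths=[4, 10, 3], names=["id", "surname", "score"])
```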

Data Visualization
Several new options have been added to provide additional features and tools for visualizing data.
- "Orthogonal regression" fit type is now supported in 2D scatterplots
- Points on graphs can now be annotated
- New options in compound graphs improve visual appearance by controlling the scaling display
- A new data file can be created by brushing the points to be included
- Date and time support was added for meaningful time intervals in graph scales
- Now you can modify the margins of all plots in an original graph (e.g., multi-graph layout)
- Create Pareto charts more easily
- We added a new graph type, the parallel coordinate plot, which shows multiple variables, side-by-side, on
  comparable scales, thus making it easier to compare values across variables

In a parallel coordinate plot, each Y-axis corresponds to a variable in a Statistica spreadsheet and can be defined
according to standalone values or two-sided values (e.g., range boundaries or upper and lower limits).
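
For readers unfamiliar with the orthogonal regression fit listed among the new scatterplot options: it minimizes perpendicular distances from the points to the line, rather than vertical distances as in ordinary least squares. The sketch below uses the standard closed-form slope for the equal-error-variance case. It is an independent illustration, and the notes do not say which computation Statistica uses.

```python
import math


def orthogonal_fit(xs, ys):
    """Fit y = intercept + slope * x by minimizing perpendicular distances."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxy == 0:
        raise ValueError("closed form needs a nonzero covariance between x and y")
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return my - slope * mx, slope


# Points lying exactly on y = 1 + 2x should be recovered exactly.
intercept, slope = orthogonal_fit([0, 1, 2, 3], [1, 3, 5, 7])
```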

Statistics
FALSE DISCOVERY RATE
False Discovery Rate (FDR) and Qvalues were added. FDR applies the Benjamini and Hochberg method, and Qvalues
applies the method described in Storey's 2002 paper.
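
As a reminder of what the Benjamini and Hochberg method computes, here is a minimal sketch of the adjusted p-values. It is a textbook version written for illustration, not Statistica code, and it does not cover Storey's q-value estimator.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A hypothesis is rejected at FDR level q when its adjusted p-value is at most q.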
NEW DISTRIBUTIONS
New distributions were added to the Probability Distribution Calculator, STATISTICA Visual Basic functions, and
spreadsheet functions. These are for hypergeometric distributions (inverse, cumulative, prob) and the inverse Poisson
and inverse binomial distributions.
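
The three hypergeometric quantities named above (probability, cumulative, inverse) can be sketched with the standard library as follows. The function names and argument order are hypothetical and do not correspond to the Statistica spreadsheet or SVB function names. The inverse is taken here as the smallest count whose cumulative probability reaches p, which is an assumption about the convention.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(X = k) successes in `draws` draws without replacement."""
    if k < max(0, draws - (population - successes)) or k > min(draws, successes):
        return 0.0
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


def hypergeom_cdf(k, population, successes, draws):
    """P(X <= k)."""
    return sum(hypergeom_pmf(i, population, successes, draws) for i in range(k + 1))


def hypergeom_inverse(p, population, successes, draws):
    """Smallest k whose cumulative probability is at least p (assumed convention)."""
    total = 0.0
    for k in range(draws + 1):
        total += hypergeom_pmf(k, population, successes, draws)
        if total >= p - 1e-12:
            return k
    return draws
```

For example, with a population of 10 containing 4 successes and 3 draws, `hypergeom_pmf(1, 10, 4, 3)` is the chance of drawing exactly one success.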
STEPWISE MODEL BUILDER (Statistica ADVANCED)
Stepwise Model Builder provides control over model building and gives the modeler a what-if environment. This is
useful when regulation or a company's standard practices limit which variables can be used to build models. For
example, a bank cannot discriminate based on age or gender.
NEGATIVE BINOMIAL DISTRIBUTION (Statistica ADVANCED)
This new option is available within GLZ. It enables you to specify the Negative Binomial as the distribution for the
response variable. This specific form is referred to as the Poisson-Gamma mixture form and is the discrete analog to
the continuous gamma distribution.
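
For reference, one common way to write the Poisson-Gamma mixture is shown below. The symbols \mu (mean) and \alpha (dispersion) are notation chosen here for illustration; these notes do not state the exact parameterization used in GLZ.

```latex
Y \mid \lambda \sim \mathrm{Poisson}(\lambda), \qquad
\lambda \sim \mathrm{Gamma}\!\left(\text{shape} = \tfrac{1}{\alpha},\ \text{scale} = \alpha\mu\right)
\;\Longrightarrow\;
P(Y = k) = \frac{\Gamma\!\left(k + \tfrac{1}{\alpha}\right)}{k!\,\Gamma\!\left(\tfrac{1}{\alpha}\right)}
\left(\frac{1}{1 + \alpha\mu}\right)^{1/\alpha}
\left(\frac{\alpha\mu}{1 + \alpha\mu}\right)^{k},
\qquad
\mathrm{E}[Y] = \mu, \quad \mathrm{Var}[Y] = \mu + \alpha\mu^{2}.
```

The extra term \alpha\mu^{2} in the variance is what lets this distribution accommodate counts that are more dispersed than a Poisson model allows.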
QUALITY CONTROL CHARTS (Statistica QUALITY CONTROL)
Quality Control now includes options that can set the background color for in control, out of control, and out of
warning lines on quality control graphs.

Other
MICROSOFT OFFICE 2010 STYLE TOOLBARS
Statistica now uses the Office 2010 style toolbars. The Help menu has been moved to the File tab.
SEARCH FACILITY
Now you can search for modules by name, select a module, and start it. This feature indexes all available ribbon bar
options and displays them alphabetically. Typing in the search box will start restricting the list to those entries that
match any of the words from the ribbon bar option. Pressing ENTER will open the selected module's dialog box.
HIGH RESOLUTION DPI 120 SUPPORTED
Starting with the release of Windows Vista and the greater availability of very high resolution monitors, Microsoft made
it much easier to change the DPI setting. For Windows 7, themes come with a default of 120 DPI for high resolution.
This resolution is now supported in Statistica.

Data Miner
Data Miner Workspace Enhancements
The Workspace has been upgraded to include a large number of new features to improve usability and performance,
especially with respect to handling very large data sets.

A new system of nodes has been introduced with enhancements of the user interface to closely resemble the user
interface in the respective modules. The previous nodes are still offered and supported for backwards compatibility.
ENHANCED ABILITY TO IMPORT EXCEL FILES
Statistica now has the ability to import Excel files using the nomenclature of Excel spreadsheets: letters for columns
and numbers for cases.

This functionality is not only available interactively, but is also translated to the Workspace utilizing the new Import
Excel node.

You can use this node to import Excel data directly from a spreadsheet into a Workspace.
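
The Excel nomenclature mentioned above addresses columns by letters (A, B, ..., Z, AA, ...). A small sketch of how such letters map to a column position is shown below; the helper name is hypothetical and is not part of the Import Excel node.

```python
def column_index(letters):
    """Convert Excel column letters (A, B, ..., Z, AA, ...) to a 1-based index."""
    index = 0
    for ch in letters.upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"invalid column letters: {letters!r}")
        # Treat the letters as digits of a base-26 number with A = 1.
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index
```

With this mapping, a range such as columns B through AA covers positions 2 through 27.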

Analytic Enhancements
DATA HEALTH CHECK
The Data Health Check node is new in Statistica 12 and is available to all Statistica Data Miner users. This node detects
common data issues for each variable, completes basic data cleaning, and generates a report that can be used in
deciding how to further clean the data. The Data Health Check node is especially useful for exploring a large number
of variables automatically.
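
These notes do not list the individual checks the node performs. As a rough illustration of per-variable screening, the sketch below flags variables that are mostly missing or constant. Both checks, the threshold, and the function name are assumptions made for the example.

```python
def health_report(columns, max_missing=0.5):
    """Flag variables that are mostly missing or constant (hypothetical checks)."""
    report = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        missing_share = 1 - len(present) / len(values) if values else 1.0
        issues = []
        if missing_share > max_missing:
            issues.append("mostly missing")
        if len(set(present)) <= 1:
            issues.append("constant")
        report[name] = {"missing_share": missing_share, "issues": issues}
    return report


report = health_report({
    "age": [34, 51, None, 28],
    "region": ["N", "N", "N", "N"],
    "notes": [None, None, None, "x"],
})
```

The resulting dictionary plays the role of the report: one entry per variable, listing what was found.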
CONSTRUCTION OF TREES, SENSITIVITY ANALYSIS
This new sensitivity option enables you to learn more detail about a specific node. You can then use this knowledge
to redefine the splits of the proposed tree in an expert way.
ORDERED TWOING CRITERION
This option treats the categories of a dependent variable as ordered. It is useful when categories represent levels
(for example, low, medium, high).
PREDICTOR SCREENING
This is a new method for analyzing predictors that was added to Feature Selection. This functionality can be used as a
quick, first look at a predictor to provide a basic set of statistics.

Data Access Enhancements


TERADATA CODE DEPLOYMENT (Statistica DATA MINER WITH CODE GENERATOR)
User-defined functions can now be defined for the Teradata database, which allows for in-database scoring.

Enterprise
Enterprise Workspace Enhancements
The Workspace has been upgraded to include a large number of new features to improve usability and performance,
especially with respect to handling very large data sets.

A new system of nodes has been introduced with enhancements of the user interface to closely resemble the user
interface in the respective modules. The previous nodes are still offered and supported for backwards compatibility.
ENHANCED ABILITY TO IMPORT EXCEL FILES
Statistica now has the ability to import Excel files using the nomenclature of Excel spreadsheets: letters for columns
and numbers for cases.

This functionality is not only available interactively, but is also translated to the Workspace utilizing the new Import
Excel node.

You can use this node to import Excel data directly from a spreadsheet into a Workspace.

Analytic Enhancements
DATA HEALTH CHECK
The Data Health Check node is new in Statistica 12 and is available to all Statistica Enterprise users. This node detects
common data issues for each variable, completes basic data cleaning, and generates a report that can be used in
deciding how to further clean the data. The Data Health Check node is especially useful for exploring a large number
of variables automatically.
REPORTING
Spreadsheet cells can now be selected as dynamic tags, so the value of a particular cell can be inserted into the
text of a report. This works for both text (including paragraph text strings) and numeric values.
Individual workbook items can be specified as dynamic tags, making it possible for these items to be included in
reports. Additionally, Statistica now supports an expanded list of keyword tags, including workflow name, SDMS
version numbers, and more.
QUALITY CONTROL CHARTS
Statistica Enterprise now supports full color and pattern control for the elements of QC charts, in the same manner
that these options are supported in the interactive usage of Statistica. These controls are accessible from inside
the Enterprise Manager application.

Data Access Enhancements


SVB DATA CONFIGURATIONS
With SVB Data Configurations, you can access non-traditional databases that don't have an ODBC or OLE DB provider.
As an example, a large text file can be thought of as a database if someone wants to obtain its data. As a text file,
however, it does not have an ODBC or OLE DB provider. But with an SVB Data Configuration, it is possible to access
this text file as a database and make its data available to Statistica. If you want to execute different queries based on
predetermined conditions, those conditions can also be coded into the SVB Data Configuration.
GENERAL DOCUMENT STORE
Files can now be saved/opened within the Enterprise System View, so Statistica documents and other document
types can be stored within Enterprise Manager and shared among users outside a file share. The Enterprise System
View is the default destination for saving reports. Additionally, standard Statistica Enterprise permissions
and SDMS versioning are supported.
SVB and SVX code can be stored within Enterprise using the General Document store. Now all the places in
Enterprise that use SVB can reference the stored code; changing the code in one place can simultaneously implement
that change in SVB Analysis Configurations, SVB Data Configurations, Workspace node code, and Secondary SVB
Programs within Enterprise.
BROWSER SUPPORT (Statistica ENTERPRISE SERVER)
Support is provided for all mainstream browsers: Internet Explorer, Chrome, Firefox, Safari, and Opera. This makes it
possible for you to use Statistica Enterprise Server from your iPad or laptop.
WORKBOOK SUPPORTED (Statistica ENTERPRISE SERVER)
Workbooks can now be shared easily with others through the Statistica Enterprise Server Portal. After a file is
published, a Download from Server link (URL) will be provided.
Versioning Support (Statistica Enterprise Compliance Edition)
Statistica Enterprise Compliance Edition is an integration of Statistica Enterprise with a highly scalable document
management system that enables you to securely manage documents of any kind, and it is designed to ensure
compliance with FDA 21 CFR Part 11 regulations, Sarbanes-Oxley legislation, as well as ISO 9000, 9001, and 14001
documentation requirements. New functionality provides for easy version comparison and opening of previous
versions of documents.
VERSION COMPARISON
Now when SDMS integration is enabled, you can compare different versions of SDMS objects in Enterprise Manager.
Each versionable Enterprise object will have a text representation:
- Data Configuration: list of query, data types, and OLE DB column properties
- IQC Analysis Configuration: summary of QC settings/parameters
- SVB Analysis Configuration: SVB text and properties
- Rules object: text representation of rules
- PMML object: PMML representation of model
- Workflow: text detailing all contained nodes and parameters
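
Because every versionable object has a text representation, comparing two versions amounts to a text diff. The sketch below shows that idea with Python's difflib. The sample text is hypothetical, and this is not how Enterprise Manager itself produces or displays the comparison.

```python
import difflib


def compare_versions(old_text, new_text, name="object"):
    """Return unified-diff lines between two text representations."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=f"{name} (v1)",
        tofile=f"{name} (v2)",
        lineterm="",
    ))


# Hypothetical text representations of two versions of a Data Configuration.
v1 = "query: SELECT * FROM lots\nlimit: 1000"
v2 = "query: SELECT * FROM lots\nlimit: 5000"
diff = compare_versions(v1, v2, name="Data Configuration")
```

Lines prefixed with `-` exist only in the older version and lines prefixed with `+` only in the newer one.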
OPEN PREVIOUS VERSION
For those versionable objects that can be opened directly in Enterprise, including Workspaces, PMML, and Rules
objects, Statistica will allow a specified previous version of the object to be opened as a read-only object.
Labels (Statistica Web Data Entry)
Labels are used with the Data Entry product. Labels can now be stored in one or more system folders. Customers will
find it easier to manage Labels with this new option.

Scorecard
CALIBRATION TESTS
Calibration Tests is a tool that makes it possible to compare the forecast probability of default (PD) with the realized
PD that eventually occurs.
A typical use case in financial institutions is to divide customers into segments of like customers, realizing that each
separate segment will have a certain number of customers who meet credit obligations and a certain number who will
not. Based upon the model the financial institution has agreed upon, each segment has a forecast PD. After the model
has been used for a period of time, the accuracy of the model must be tested. Performing such tests is very easy
in Statistica, which even includes a built-in "traffic light approach" described in a popular reference on guidelines in
credit risk management (Oesterreichische Nationalbank, 2004).
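
To illustrate the comparison of forecast and realized PD, the sketch below runs a simple one-sided binomial test per segment and maps the result to a colour. The cutoffs (5% and 1%) and function names are assumptions made for this example. They are not taken from the cited guideline or from Statistica's built-in test.

```python
from math import comb


def binomial_tail(n, p, d):
    """P(X >= d) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(d, n + 1))


def traffic_light(n, forecast_pd, defaults, yellow=0.05, red=0.01):
    """Colour a segment by how surprising its default count is (assumed cutoffs)."""
    tail = binomial_tail(n, forecast_pd, defaults)
    if tail < red:
        return "red"
    if tail < yellow:
        return "yellow"
    return "green"
```

For a segment of 100 customers with a forecast PD of 2%, two observed defaults are unremarkable, while ten would be very unlikely if the forecast were accurate.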
RULES
Statistica Scorecard is now integrated with Statistica Decisioning Platform. This tool can now generate rules for batch
scoring or live scoring.

Decisioning Platform
Versioning Support
Statistica Compliance Edition is an integration of Statistica with a highly scalable document management system that
enables you to securely manage documents of any kind, and it is designed to ensure compliance with FDA 21 CFR Part
11 regulations, Sarbanes-Oxley legislation, as well as ISO 9000, 9001, and 14001 documentation requirements. New
functionality provides for easy version comparison and opening of previous versions of documents.
VERSION COMPARISON
Now when SDMS integration is enabled, you can compare different versions of SDMS objects. Each versionable object
will have a text representation:
- Data Configuration: list of query, data types, and OLE DB column properties
- IQC Analysis Configuration: summary of QC settings/parameters
- SVB Analysis Configuration: SVB text and properties
- Rules object: text representation of rules
- PMML object: PMML representation of model
- Workflow: text detailing all contained nodes and parameters

Weight of Evidence
This new product is important to anyone engaged in binary prediction (yes/no). This tool automates the
time-consuming task of binning predictors.
Two methods are used:
- Optimal
- Interpreted (e.g., observed risk of prediction probability)
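
For context, the weight of evidence of a bin is commonly defined as the log ratio of the bin's share of one outcome to its share of the other. The sketch below computes it for bins that are already defined. It does not reproduce either binning method above, and the good-over-bad orientation is an assumption because sign conventions vary.

```python
import math


def weight_of_evidence(bins):
    """bins maps bin label -> (good_count, bad_count); returns label -> WoE."""
    total_good = sum(good for good, _ in bins.values())
    total_bad = sum(bad for _, bad in bins.values())
    woe = {}
    for label, (good, bad) in bins.items():
        if good == 0 or bad == 0:
            raise ValueError(f"bin {label!r} needs both outcomes; merge it first")
        # Share of all goods in this bin divided by share of all bads in this bin.
        woe[label] = math.log((good / total_good) / (bad / total_bad))
    return woe


# Hypothetical counts for three bins of one predictor.
woe = weight_of_evidence({"low": (80, 20), "mid": (60, 40), "high": (20, 60)})
```

Positive values mark bins where the "good" outcome is over-represented, and negative values mark the opposite.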

Rules Builder
Every organization has rules that govern its behavior. Consistently applying these rules to analytic projects or reports is
a common challenge. Rules Builder solves this problem.
Business users, developers, or modelers find it easy to create, maintain, share, and re-use sets of rules. A rule set for
data transformation could be created and then used by one or thousands of analytic projects. Role-based security
controls access to these rules.

Rules Builder has the ability to conditionally execute models with pre-scoring segment rules and then apply post-scoring
policy rules. Rules can retrieve reason codes for individual predictions, which can be critical for many industries,
such as banking or insurance. For example, banks are required to state why a loan application was denied.
The execution of rules can be visually traced with sample data to aid in troubleshooting complex scenarios.
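
The flow described above (segment rule before scoring, model, policy rules after scoring, reason codes) can be sketched as follows. The field names, thresholds, and stand-in model are hypothetical and serve only to show the order of operations. Rules Builder itself is configured visually rather than in code.

```python
def score_application(app, model):
    """Pre-scoring segment rule -> model -> post-scoring policy rules."""
    # Pre-scoring segment rule: too little history means the model is not run.
    if app["months_as_customer"] < 6:
        return {"decision": "refer", "reasons": ["insufficient history for model"]}

    probability = model(app)

    # Post-scoring policy rules collect a reason code for each failed check.
    reasons = []
    if probability > 0.20:
        reasons.append("predicted default risk above 20%")
    if app["debt_to_income"] > 0.45:
        reasons.append("debt-to-income ratio above 45%")
    return {"decision": "decline" if reasons else "approve", "reasons": reasons}


def toy_model(app):
    # Stand-in for a deployed model: risk grows with debt-to-income.
    return min(1.0, 0.5 * app["debt_to_income"])
```

Returning the reasons alongside the decision is what makes it possible to tell an applicant why a loan was denied.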

New components
Statistica Reporting Tables (optional)
Businesses are challenged to:
- Summarize large amounts of data into formats that are easily understood
- Easily emphasize particular data segments (e.g., only report on Oklahoma and France)

Statistica Reporting Tables (an optional product to be purchased separately for Version 12) automatically sorts and
summarizes data based on specifications made while developing the table. The tables are generated interactively by
visually dragging and dropping variables into the appropriate four sections of the Reporting Tables dialog box (Layers,
Column Label, Row Label, and Sigma). As the tables are customized, they can be previewed, and final results can be
generated with the click of a button.

Options are available for processing Multiple Response Categories, Crosstable Groups, and Conditional Formatting.