white paper

A Revolutionary Approach for Advanced Analytics and Big Data Management

Aster Database: The First MPP Database with Applications Inside

Contents
Executive Summary
Introducing Aster Database: An MPP Database with Analytic Applications Running Inside
Breakthrough Performance and Scalability
Advanced In-Database Analytics with MapReduce
Accelerating Development of Advanced Analytics
Efficient Management for Both Data and Applications
Competing and Winning with Analytics
About Teradata Aster

Teradata Aster

Executive Summary
Organizations in every major market are turning to a new generation of advanced
analytic applications that leverage huge volumes of data in new ways to provide deeper,
smarter insights. Applications like fraud detection, customer behavior analysis, trending
and forecasting, scenario modeling, service personalization and targeting, and deep clickstream analysis are increasingly being used to drive real-time decisions, increase revenue,
and reduce costs.
Operating a business today without serious insight into business data is simply not an
option. Competitive advantage depends on the ability to manage and analyze all the
critical data entering a business environment. With over 100% data growth per year in
many enterprise applications and with 70-80% of enterprise data residing outside a
traditional data warehouse, addressing the Big Data challenge has become a top priority
for companies.
Legacy architectures and approaches to data management and analytics are inherently
unfit for today's realities of Big Data and advanced analytics. The traditional database
architectures and multi-tier data pipelines are under severe strain because they were
simply not designed to store and process terabytes to petabytes of data nor to perform the
advanced analysis that has become common today. Traditional databases struggle with
the complexity and poor performance that result from trying to express rich analytics in
SQL and provide only very limited capabilities for going beyond SQL's limitations.
Additionally, moving large volumes of data through the traditional data pipeline from
the data warehouse to an analytic application for processing takes significant amounts of
time and delays analysis of fresh data. The larger the volume of data, the larger the time
and effort needed to move it from one tier to another.
These challenges are so severe that application developers and analysts are forced to
compromise the richness and depth of their analyses. They first reduce Big Data to small
data via aggregations, windowing, or sampling and then perform computations on a
subset of the data rather than the entire data set. They also are forced to spend significant
amounts of time writing, testing, and modifying complex analytic logic in order to fit it
into the limitations of traditional databases. Lower-quality analytics are the result, for
which the organization pays the price in poor decisions and lost revenue opportunities.
This paper explains how Aster Database solves the challenges of advanced analytics that
can scale to Big Data with a monumental shift in the way data and analytics are processed
and managed. Aster Database delivers a revolutionary architecture for analytics and data
management that allows rich analytic applications to be pushed down into Teradata
Aster's Massively Parallel Processing (MPP) database so that they can run where the data
natively resides. With Aster Database, it is no longer necessary to push Big Data through a
network to an overworked application server, build aggregates or random samples, or
limit the richness of analytics to fit the limitations of the database. Instead, business users
get the business-critical insights they need in the fastest possible time, taking advantage
of Aster Database's parallel processing of both data and analytics, with unprecedented
simplicity and affordability.

Introducing Aster Database: An MPP Database with Analytic Applications Running Inside
A new generation of advanced analytics has become critical to daily business operations.
Their impact continues to grow as new applications constantly emerge to help
organizations run their business, manage their products and services, and interact with
their customers at speeds and scales that were once inconceivable.
While these analytic applications are used in many different industries and business
processes, they have a few key characteristics. One is that they process Big Data (terabytes
to even petabytes) from far more sources and events than in the past. Enterprises are
capturing more and more touch points and interactions leading to every customer
decision, creating an explosion of data that needs to be examined and understood. New
types of data such as clickstream, GPS, biometric, and RFID data are flowing into the
organization at a tremendous rate. With millions or billions of events occurring on short
notice, most of this data (typically 70-80%) lives outside the enterprise data warehouse.
While the intelligence in this data is critical, most companies have not been able to
leverage it until now.
These applications are also characterized by increased depth and richness of analysis.
They typically require access to full data sets rather than samples or aggregations in
order to provide valuable insight; for example, detecting fraud requires looking at all
transaction data in order to identify outliers or unusual patterns. Further, techniques such
as statistical analysis, graph analysis, predictive modeling, and collaborative clustering
are increasingly used to drive critical applications. They use techniques that are far more
complex than what traditional Business Intelligence (BI) applications were designed for,
such as simulations that weigh thousands or millions of possible variables and models
that morph and change over time.
Finally, these new applications are dynamic: rather than simply reporting on past events,
the new generation of advanced analytics is a critical tool in helping organizations make
the best decisions as events unfold by predicting likely outcomes. As a result, rather than
only a few statisticians and analysts accessing the system occasionally to run reports,
many business analysts could be accessing the system constantly. These analysts require
rapid results for both the interactive, ad hoc analysis that is crucial for data exploration as
well as their recurring analytical workloads.

Aster Database is designed from the ground up to meet these challenges. Aster Database
is a massively parallel (MPP) row and column database with an integrated analytics
engine. This combination, called a massively parallel analytics platform, provides the
industry's most powerful platform for high-performance advanced analytics on terabytes
to petabytes of data, greatly improving the ability of enterprise users to make informed
decisions. Teradata Aster's approach addresses major limitations of traditional data
warehousing and business intelligence systems by making it easy to create advanced
analytics and then embed 100% of analytic processing inside the database so that
analytics are co-located with the data to drive rapid analysis, eliminate massive data
movement, and avoid forced sampling. Combined with a visual development
environment, easy administration, and the exceptional reliability required by
business-critical applications, Aster Database makes it possible for organizations to perform rich
and deep analysis of large data sets at ultra-fast speeds so that they can leverage their data
in ways that were previously impractical or impossible.
Aster Database is available in a flexible set of offerings for deployment on premise or in
the cloud:

- Aster Database packaged software provides Teradata's analytic platform for
  installation on any certified commodity hardware.
- The Teradata Aster MapReduce Data Warehouse Appliance combines server
  hardware with Aster Database and valuable third-party software to simplify
  purchase, deployment, and configuration.
- The Aster Database Cloud Edition provides cloud integration between Aster
  Database and Amazon Web Services (AWS), AppNexus, Dell's Data Cloud, and
  Terremark platforms. It is ideally suited to fully use the elasticity, scalability, and
  persistence of cloud computing.

Breakthrough Performance and Scalability


Pervasive Parallelism
Aster Database is a Massively Parallel Processing (MPP) database that provides end-to-end
parallelism of data and analytic processing, allowing organizations to examine very large
data sets with unprecedented granularity and depth of analysis. The result: 10 to 100 times
faster performance than other architectures and scalability to terabytes and even
petabytes of data.
Aster Database is the first MPP database to deliver pervasive parallelism, independently
parallelizing all functions rather than simply parallelizing a few functions and leaving
others to run sequentially over a shared resource. Aster Database executes loads, queries,
exports, backups, recoveries, installs and upgrades in parallel to take full advantage of all
resources, optimizing performance for all data warehouse and analytic operations. Each
function can be scaled independently simply by adding additional commodity servers at
any time. Independent parallelization and scaling of each function prevents bottlenecks
from occurring anywhere in the data management and data processing lifecycle.
Pervasive parallelism is delivered by Aster Database's internal architecture. Aster
Database consists of four separate classes of nodes that reside on commodity servers:
Queens, Workers, and Loaders, as well as an independent Backup Cluster as illustrated in
Figure 1 below.

Figure 1: Aster Database's pervasive parallelism provides end-to-end parallelism and
optimized performance for all data warehouse and analytics operations
Queen nodes provide the external interface to the data warehouse. End users and database
administrators can connect to a Queen through ODBC/JDBC, while systems
administrators monitor Aster Database through the Aster Management Console (AMC).
The Queen nodes are also responsible for coordinating the cluster servers in query
processing, result aggregation, and failure handling.
Worker nodes are responsible for parallel execution of queries and in-database analytics,
directed by one or more Queen nodes. Worker nodes also store partitions of data and
replicas of data that reside on other Worker nodes. Finally, Worker nodes participate in
maintenance tasks (e.g., indexing, load balancing) initiated by Queen nodes.
Loader nodes are responsible for rapid loading and partitioning of new data into the
Worker nodes. Loaders can perform both trickle-feed loading for granular data loading as
well as bulk loading proven to be able to load up to eight terabytes of fresh data per hour.
Additionally, Loader nodes can export data for use in other systems.
Backup nodes are responsible for backing up compressed Aster Database user data and
metadata. Backup servers provide local data protection (over a LAN) and remote disaster
protection (over a WAN) and offer multiple backup options including full backup,
incremental backup, and logical table-level backup. All backups and recoveries are
performed in parallel, running non-disruptively in the background across all
backup servers.
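
The Worker-node behavior described above (each Worker stores partitions of data plus replicas of data that reside on other Workers) can be illustrated with a minimal Python sketch. The hash-based placement, the "replica on the next worker" rule, and the names `partition_for` and `place_rows` are assumptions made purely for illustration; this paper does not describe the partitioning or replication scheme Aster Database actually uses.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_workers: int) -> int:
    """Map a distribution key to a worker index with a stable hash (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % num_workers


def place_rows(rows, key_field, num_workers):
    """Return {worker: {"primary": [...], "replica": [...]}} for the given rows.

    Each row gets one primary copy and one replica on the next worker
    (a hypothetical rule), so no worker holds both copies of the same row.
    """
    if num_workers < 2:
        raise ValueError("replication needs at least two workers")
    placement = defaultdict(lambda: {"primary": [], "replica": []})
    for row in rows:
        primary = partition_for(str(row[key_field]), num_workers)
        replica = (primary + 1) % num_workers
        placement[primary]["primary"].append(row)
        placement[replica]["replica"].append(row)
    return dict(placement)


rows = [{"user_id": f"u{i}", "amount": i * 10} for i in range(12)]
layout = place_rows(rows, "user_id", num_workers=3)
for worker in sorted(layout):
    copies = layout[worker]
    print(worker, len(copies["primary"]), "primary,", len(copies["replica"]), "replica")
```

Under these assumptions, losing any single worker still leaves one copy of every row on another worker, which is the property the replica description implies.
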
Another advantage of the Aster Database architecture is the ability to allocate
heterogeneous hardware for different tiers. For example, the Loader tier may leverage
CPU- or memory-heavy servers with only one or two small-capacity disk drives since data
is not persistently stored on Loader nodes. In contrast, Backup nodes may leverage servers
with 48 large-capacity drives since the Backup tier is focused on optimizing backup cost
per gigabyte of data. By offering the flexibility to allocate the appropriate server hardware
for each tier on demand, enterprises not only ensure optimal SLAs for that particular
function, but also ensure lower costs.

Unlimited Linear Scalability


Aster Database's unique Online Precision Scaling enables Aster Database to achieve
breakthrough performance with linear scale-out to terabytes and petabytes of user data.
Architected to take advantage of ongoing advances in large-scale distributed computing,
Aster Database provides linear scalability for loads, queries, and backups, independently
or in unison to meet requirements.
Aster Database's multi-tier architecture allows each tier (query processing, backup, and
loading) to be independently scaled, providing the flexibility to scale each function
cost-effectively as needed based on workload characteristics rather than be forced to scale the
entire system to address a single bottleneck. Capacity can be added to functions
independently, offering significant flexibility and Total Cost of Ownership (TCO)
advantages compared to scaling the entire system. For example, if data volumes or
processing requirements grow, the number of Worker servers can be increased; if faster
loads or higher volume loads or exports are desired, more Loader servers can be
provisioned; if backup retention policies lengthen or backup volumes grow, the number
of backup servers can be increased. This architecture also allows you to choose the most
cost-effective hardware for each function. For example, functions such as backup that
occur less frequently and are more storage-intensive than CPU-intensive can run on
lower-cost hardware configured with extra storage.
When new resources are added, Aster Database is designed to take full advantage of those
resources with maximum parallelism. Aster Database's capabilities for granular splitting
and load balancing of virtual partitions ensure that new resources are efficiently
leveraged to deliver increased performance. Aster Database's patent-pending, dual-stage
query optimizer ensures use of maximum resources as the system scales and processing
demands grow. After optimizing each query globally across all MPP nodes, local
optimization on each partition fine-tunes processing at a local level. This ability to
adapt on-the-fly to the latest resource use ensures the fastest possible response even at
massive scale.
Aster Database's unique architecture also provides administrators with the critical ability
to scale to hundreds of diverse workloads and users while ensuring predictable and
consistent performance. Aster Database's Dynamic Mixed Workload Management
capabilities allocate scheduling and resources to ensure consistent performance even as
the number of users and workloads grows. In addition, Aster Database's multi-tier
architecture ensures that heavy processing in one area does not impact other areas. For
example, load execution is kept independent of query execution, which is kept
independent of export or backup execution.
Aster Database delivers this scalability with exceptional ease of use and flexibility. Adding
more capacity is a simple matter of plugging a new commodity server into the local
network and performing one-click incorporation through the Aster Management Console.
The system automatically recognizes the new resources and rebalances the workload.
During peak load windows, a system administrator has the flexibility to dynamically
re-provision Worker nodes into Loader node identities to provide additional loading
resources for load balancing. In peak query windows, the reverse can occur. No additional
outside servers are required, enabling cost-effective task-based scaling without disruption.

Dynamic Workload Management for High Concurrency and Predictable Service Levels
As companies have come to rely on data-driven applications and ultra-fast analytics across
all sectors of the organization, data warehouses are now expected to serve a broad range
of applications and users in their daily business operations. Many of these users require
rapid results and low latencies in order to support urgent analysis as well as ad hoc and
exploratory analysis. At the same time, other tasks such as loading and backups that are
running in the background need access to resources but cannot be allowed to disrupt
other important workloads.
Delivering consistent, scalable performance with these many concurrent users and
workloads without disruptions is critical to enabling large-scale advanced analytics. When
hundreds or thousands of mixed workloads are executing simultaneously, it becomes
increasingly difficult to prioritize and intelligently allocate the right amount of system
resources to the right workloads at the right time. Traditional manual database tuning
cannot possibly keep pace with rapidly changing workload demands. Manual tuning is
not only slow and complex, it is also very expensive: industry surveys have consistently
shown that up to 80% of TCO is attributed to ongoing maintenance, dwarfing initial
hardware acquisition costs. Systems need to optimize performance and cost, in particular
avoiding duplication of data and systems just to separate different workloads.
Aster Database addresses these challenges with automated tools that keep the system
running efficiently even under peak demands, saving a great deal of administrative time
that would be required to tune a traditional system. Aster Database includes a Dynamic
Mixed Workload Manager, the industry's most advanced workload management
capability for large-scale distributed computing on commodity hardware. Intuitive,
fine-grained policy controls allow administrators to define and manage diverse workloads to
meet the organization's business priorities. Using the Workload Manager (Figure 2),
administrators can define rules that reallocate resources on the fly across hundreds of
distributed nodes to adapt to new workloads and changing priorities in real time. The
result is highly predictable performance and guaranteed service levels for the complex
mixed workloads of an enterprise data warehouse and analytic-intensive applications.
Workload management rules are easily created and managed using the Teradata Aster
Management Console. Rules are written as easy-to-read SQL predicates, eliminating
complex tuning.

Figure 2: Rules-based resource reallocation for different constituents and workloads
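
The idea of predicate-based workload rules can be sketched in a few lines of Python. This is only an illustration of the general technique (match each running query against an ordered list of predicates, then divide resources by rule weight). The rule names, the query attributes (`user_group`, `kind`), the weights, and the first-match-wins policy are all hypothetical; the paper states that rules are written as SQL predicates but does not show their syntax, so Python lambdas stand in for them here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    matches: Callable[[Dict[str, str]], bool]  # stand-in for a SQL predicate
    weight: int  # relative resource share (hypothetical unit)


# Hypothetical policy, evaluated top to bottom; the first matching rule wins.
RULES: List[Rule] = [
    Rule("executive_dashboards", lambda q: q["user_group"] == "executives", 50),
    Rule("nightly_loads", lambda q: q["kind"] == "load", 10),
    Rule("default", lambda q: True, 25),
]


def classify(query: Dict[str, str]) -> Rule:
    """Return the first rule whose predicate accepts the query."""
    return next(rule for rule in RULES if rule.matches(query))


def allocate_shares(queries: List[Dict[str, str]]) -> List[float]:
    """Split resources among running queries in proportion to rule weights."""
    weights = [classify(q).weight for q in queries]
    total = sum(weights)
    return [w / total for w in weights]


running = [
    {"user_group": "executives", "kind": "query"},
    {"user_group": "etl", "kind": "load"},
    {"user_group": "analysts", "kind": "query"},
]
shares = allocate_shares(running)
for query, share in zip(running, shares):
    print(classify(query).name, round(share, 3))
```

Because shares are recomputed from whatever is currently running, adding or finishing a workload changes every other workload's share, which is the "reallocate on the fly" behavior the text describes in spirit.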

Advanced In-Database Analytics with MapReduce


As enterprises struggle to manage and leverage exploding data volumes, they also face a
lack of capabilities within the traditional data warehouse for processing and scaling
advanced analytics. This includes event-based analytic applications like fraud detection
that execute within a business process and responsive, exploratory analytic applications
on fine-grained data.
Advanced analysis has traditionally required complex coordination and interaction
among specialized analysts and data programmers to extract samples of data from large
data sets, appropriately transform the data for advanced analysis, build analytical models
using data mining toolsets and then develop scoring programs in proprietary database
languages to execute data mining model results on the complete data set. Some analysts
note that up to 70% of the time on data mining projects can be spent on preparing data
for analysis, just to get the data into a usable state.
Adding to these challenges, traditional data warehouses were designed for SQL, a useful
declarative language for simple queries and many administrative tasks but one that has
significant limitations for expressing rich analytic operations. Common types of analysis
such as time series analysis, clickstream analysis, graph analysis, and the like are
generally highly complex to express in SQL as well as being difficult for SQL to process
and scale with acceptable performance. Although some databases have basic in-database
capabilities such as stored procedures or User-Defined Functions (UDFs), these
approaches have significant limitations in flexibility, richness, and performance that limit
their ability to perform advanced analysis.
As a result of these obstacles, enterprise IT organizations are forced to export data from
their database to an external analytic application for processing. In addition to the cost of
storing and processing data in multiple systems, this architecture presents significant
challenges that only worsen over time:

- High data latency: moving data between the data warehouse and the analytics
  server can take many hours and require significant network bandwidth.
- High processing latency: most analytic applications run on symmetric
  multiprocessing (SMP) systems that process data serially with limited CPU and
  storage resources, taking hours to days to finish.
- Limited insights: sampling or aggregation is typically required to avoid
  overwhelming the analytics server and network, but yields much weaker insights
  than analyzing the full data set because it can fail to include outliers or
  uncommon events. Analytics done inside the database in order to analyze full
  data sets struggle to express advanced logic as a result of the limitations of SQL
  and of traditional in-database processing capabilities.
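
The "limited insights" point can be made concrete with a small Python sketch. It builds a synthetic transaction set containing one anomalous amount and shows that a 1% sample can miss it entirely while a scan of the full data set finds it. The data, the threshold, and the choice of systematic sampling (every 100th row, used so the result is deterministic) are assumptions for illustration only; a random sample of the same size would include the anomalous row only about 1% of the time.

```python
def flag_outliers(transactions, threshold):
    """Return (index, amount) pairs whose amount exceeds the threshold."""
    return [(i, amount) for i, amount in transactions if amount > threshold]


# Synthetic transactions: 10,000 ordinary amounts and one anomalous one.
transactions = [(i, 20.0 + (i % 50)) for i in range(10_000)]
transactions[7_321] = (7_321, 9_500.0)  # the single unusual event

THRESHOLD = 1_000.0  # hypothetical cutoff for "unusual"

# A 1% systematic sample keeps every 100th row.
sample = transactions[::100]

found_in_sample = flag_outliers(sample, THRESHOLD)
found_in_full = flag_outliers(transactions, THRESHOLD)

print("sample rows:", len(sample), "outliers found:", found_in_sample)
print("full rows:", len(transactions), "outliers found:", found_in_full)
```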

In-Database Processing of 100% of Analytic Computations


With Aster Database, Teradata Aster has taken a unique approach to solving the cost and
performance challenges of running advanced analytics against Big Data. Teradata Aster is
the first commercial vendor to bring to market a unified solution that seamlessly embeds
analytic applications with the data where it natively resides for ultra-fast and ultra-deep
analytics and data processing. By avoiding the need to move data from the database to the
analytic application for processing, Teradata Aster overcomes critical challenges and
limitations of the traditional analytics architecture. This new approach to analytic
application processing delivers orders of magnitude better performance and scalability
for advanced analytics including exploratory and ad hoc analysis, providing analysts and
businesses the freedom to examine the largest data sets and gain unique analytical
insights in the shortest time possible.
Aster Database's unique Applications-Within architecture delivers this breakthrough by
processing full analytic logic inside the database, leveraging Aster Database's massively
parallel architecture and patent-pending SQL-MapReduce (SQL-MR) framework to fully
parallelize processing for ultra-fast analysis of massive data sets. With Aster Database's
Applications-Within architecture, all data stored in the database is available to all
in-database analytics, eliminating the need to move data between the database and analytic
applications. Rather than spending hours to weeks processing Big Data, Aster Database
distributes application processing and data processing across commodity hardware for
results in minutes to seconds. This frees up analysts from the complexity and delays of
data preparation and data movement tasks so that they can spend more time on data
exploration and richer analytics, resulting in deeper and more accurate analysis and
allowing a dramatic reduction in cycle time.
The Teradata Aster architecture makes it possible for any analytic application, custom or
packaged, to be pushed into the database without requiring rewriting (see Figure 3).
Embedded applications have access as a first-class citizen to all of the services available in
the database, including memory management, workload management, and fault tolerance.
Teradata Aster's approach of running applications within the database goes beyond what
is provided by traditional stored procedures and User-Defined Functions (UDFs). Aster
Database provides a complete application execution environment, separate from data
management, so that applications have the resources required to execute with maximum
performance and stability. Database services including failover, performance, Information
Lifecycle Management (ILM), and dynamic workload management are optimized
specifically to meet the requirements for both analytics execution and high-performance,
scalable data management.

Figure 3: Aster Database embeds applications with data in a unified MPP platform for
scalable data management and ultra-fast analysis of large volumes of data

Automatically Parallel Analytics via SQL-MapReduce


Until recently, massively parallel processing of data required extremely specialized
programming skills to design for parallelization. The MapReduce framework popularized
by Google has rapidly emerged as a standard way to simplify parallelization, but does
require specialized developers who are experienced with its programming paradigm.
Teradata Aster has dramatically increased the accessibility and ease of use of MapReduce
for analytics by coupling SQL with MapReduce in the patent-pending SQL-MapReduce
(SQL-MR) framework. The SQL-MapReduce framework combines the analytic power of
MapReduce with the familiarity of SQL, so that any business analyst can leverage the
power of MapReduce from any SQL statement without needing to learn MapReduce
programming or parallel programming concepts. Using SQL-MR, any analytics code
running in database or any MapReduce function can be incorporated into an analytic
application through a SQL statement. Aster Database automatically parallelizes the
application processing using MapReduce so that any in-database application runs in a
massively parallel processing environment supported by commodity hardware.
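
For readers unfamiliar with the MapReduce programming model referred to above, the following is a minimal single-process Python sketch of its three steps: map each row to key/value pairs, group the pairs by key into partitions, then reduce each key independently. It illustrates generic MapReduce only. It is not SQL-MR syntax, which this paper does not show, and the function names (`map_phase`, `shuffle`, `reduce_phase`), the CRC32 partitioning, and the clickstream rows are hypothetical.

```python
import zlib
from collections import defaultdict
from itertools import chain


def map_phase(rows, map_fn):
    """Apply map_fn to each row; each call may emit any number of (key, value) pairs."""
    return chain.from_iterable(map_fn(row) for row in rows)


def shuffle(pairs, num_partitions):
    """Group values by key inside the partition chosen for that key."""
    partitions = [defaultdict(list) for _ in range(num_partitions)]
    for key, value in pairs:
        index = zlib.crc32(str(key).encode("utf-8")) % num_partitions
        partitions[index][key].append(value)
    return partitions


def reduce_phase(partitions, reduce_fn):
    """Run reduce_fn once per key; partitions share no state, so a real
    system could process them on different nodes in parallel."""
    results = {}
    for partition in partitions:
        for key, values in partition.items():
            results[key] = reduce_fn(key, values)
    return results


clicks = [
    {"user": "alice", "page": "/home"},
    {"user": "bob", "page": "/home"},
    {"user": "alice", "page": "/cart"},
    {"user": "alice", "page": "/checkout"},
    {"user": "bob", "page": "/cart"},
]


def emit_click(row):
    # Map step: one (user, 1) pair per click.
    yield row["user"], 1


def count_clicks(user, ones):
    # Reduce step: total clicks for one user.
    return sum(ones)


pairs = map_phase(clicks, emit_click)
clicks_per_user = reduce_phase(shuffle(pairs, num_partitions=4), count_clicks)
print(clicks_per_user)
```

The analyst-facing part of this sketch is only the two small functions `emit_click` and `count_clicks`; the framework code owns partitioning and grouping. That division of labor is what lets a platform parallelize the work without the function author writing any parallel code.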
The SQL-MapReduce approach to parallelizing analytics is a significant technological
innovation. As queries running in a traditional database become more complex, including
multi-pass statements or exceeding the limitations of SQL, parallelizing analytic logic
becomes much more complicated. No longer is it sufficient to replicate a query across
multiple nodes and aggregate the output. Parallelization of complex queries requires
designing queries for parallelism, a challenge addressed by the SQL-MapReduce
framework and the pre-defined SQL-MapReduce functions that Teradata Aster provides in
Aster MapReduce Portfolio. The SQL-MR framework makes it possible for Aster Database
to seamlessly assemble, parallelize, and execute standard ANSI SQL, custom functions,
and packaged analytic applications, all in an extremely cost-effective software platform
that's easy to use and manage.
Using SQL-MR, developers can write powerful and highly expressive functions in a
variety of languages including Java, C, C++, C#, Python, and R and push them into the
database for advanced in-database analytics (see Figure 4). Additionally, pre-packaged
analytic toolsets for business intelligence or data mining that use standard SQL can
natively access a MapReduce-enabled analytic application without any code changes,
making the power of MapReduce easily and transparently accessible to business analysts.

Figure 4: Seamless execution of SQL and SQL-MR functions (using Java or other
language of choice) inside the Aster Database massively parallel analytic platform
The ability to apply MapReduce to data from diverse sources with varying degrees of
structure provides groundbreaking potential for unique, provocative insights to enable
enterprises to gain competitive advantage. Analytics that were unimaginable in the past
are now easy to achieve and execute in seconds.
A few real-world examples of applications that benefit from this architecture include:

- Predictive and granular forecasting
- Trend analysis and modeling
- Sequential pattern analysis (e.g. fraud detection, attribution, or behavior analysis)
- Time series analysis (e.g. financial trading and risk modeling)
- Graph analytics (e.g. network optimization, human intelligence, influencer
  marketing)
- Text analytics (voice of the customer for improved customer satisfaction
  and retention)
- Statistical machine learning algorithms (linear regression, K-means clusters,
  SVMs, etc.)
- Transformation pre-processing (e.g. aggregations and other cleansing or
  normalization routines)
- And many others

Accelerating Development of Advanced Analytics


Development of advanced analytic applications has traditionally been hindered by
complexity and inefficiency, delaying deployment and forcing analysts to spend
significant effort on the mechanics of development rather than on the application logic
that delivers insights for the business questions they are trying to answer. Teradata Aster
provides an intuitive, fully-integrated development environment and a suite of powerful
MapReduce analytics functions to dramatically simplify and accelerate the development,
testing, and deployment of rich analytic applications. These tools make it possible to build
rich analytics applications not in weeks or months but in days due to the simplicity of
SQL-MapReduce and Teradata Aster's extensive suite of pre-built rich analytics functions.

Visual Development Environment


Aster Developer Express accelerates development, validation, and deployment of
advanced analytic applications by providing the first integrated visual environment for
developing analytic applications with SQL and MapReduce.
Intuitive development: Developer Express integrates with the popular Eclipse Integrated
Development Environment (IDE) to enable developers to write applications that leverage
SQL and MapReduce in a rich graphical development environment. Using Developer
Express, developers can easily write, compile, and validate analytic applications; transfer
existing code to SQL-MR applications; and leverage the pre-built analytics in Aster
MapReduce Portfolio. Developer Express also includes SQL-MR wizards that
automatically package custom analytic logic for push-down into Aster Database,
enabling developers to focus on valuable analytic logic rather than on the mechanics
of integrating that logic with the database.
Rapid testing: Developer Express enables rapid, frequent testing by allowing developers to
validate their application code on their desktop without requiring access to a running
database. Developers can launch tests of their applications from within the graphical
development environment, using the desktop test environment to simulate application
processing, and then view the results of their test within the Eclipse IDE.
One-click deployment: Developers can embed their completed applications in the Aster
Database with a single click directly from the IDE. Rather than spending time working
through the mechanics of embedding their applications in the database, developers and
administrators can focus on higher-value development.

Figure 5: Aster Developer Express makes advanced analytics on big data easy with
the first integrated environment for developing, testing, and deploying advanced
in-database analytics

Optimized MapReduce Analytic Functions


Aster MapReduce Portfolio accelerates the development of rich analytics with the first
suite of analytic functions built for in-database MapReduce. These powerful ready-to-use
functions are optimized to take advantage of Teradata Aster's SQL-MR framework for
advanced in-database processing. Using the functions in Aster MapReduce Portfolio
makes it simple to rapidly create advanced analytics that leverage the performance and
scalability enabled by Aster Database's in-database MapReduce and SQL-MR.
The Aster MapReduce Portfolio components provide powerful, pre-tested analytic
functions that can be plugged into a wide variety of applications. For example, nPath is a
sequential and trending pattern analysis framework built on SQL-MapReduce that
discovers relationships between rows of data that usually cannot be expressed through
SQL. Invoking nPath as a simple extension to the SQL language leverages the
compute power of Aster Database to greatly increase query performance.
Applications include customer shopping sequences, telephone calling patterns, stock
market trading sequences, and more. Other examples of types of functions in Aster
MapReduce Portfolio include:

Path analysis functions to discover patterns in rows of sequential data for use in
scenarios including time-series analysis, predictive analytics, and web analytics
such as click-stream analysis.

Statistical analysis functions to perform high-performance processing of common
statistical calculations for use in a variety of applications including analysis of
portfolios, market prices, consumer behavior, and security.

Relational analysis functions for discovering important relationships among
transaction, graph, or text data for use cases that include retail optimization,
network analysis, and log file analysis.
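
To make the idea of path analysis concrete, the following minimal Python sketch illustrates the general technique: rows are partitioned by user, ordered by time, and each ordered sequence is tested against a pattern. It is not nPath's syntax or Teradata Aster's implementation. The function name find_paths, the sample click-stream rows, and the single-letter symbol encoding are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def find_paths(events, pattern, symbols):
    """Return the ordered page sequence of every user whose path matches.

    events  -- iterable of (user_id, timestamp, page) rows (hypothetical schema)
    pattern -- regular expression over the single-letter symbols
    symbols -- mapping from page name to a single-letter symbol
    """
    # Partition the rows by user, as a partitioned query would.
    by_user = defaultdict(list)
    for user, ts, page in events:
        by_user[user].append((ts, page))

    regex = re.compile(pattern)
    matches = {}
    for user, rows in by_user.items():
        rows.sort()  # order each partition by timestamp
        # Encode the ordered pages as a string so a regex can scan the sequence.
        encoded = "".join(symbols.get(page, "?") for _, page in rows)
        if regex.search(encoded):
            matches[user] = [page for _, page in rows]
    return matches


# Illustrative click-stream rows, deliberately out of order for user u1.
events = [
    ("u1", 3, "checkout"), ("u1", 1, "home"), ("u1", 2, "product"),
    ("u2", 1, "home"), ("u2", 2, "checkout"),
    ("u3", 1, "product"), ("u3", 2, "product"), ("u3", 3, "checkout"),
]
symbols = {"home": "H", "product": "P", "checkout": "C"}

# "P+C": one or more product views followed directly by a checkout.
result = find_paths(events, "P+C", symbols)
print(sorted(result))
```

In this sketch, users u1 and u3 match because a product view immediately precedes their checkout, while u2 goes from the home page straight to checkout. The loop here runs on a single machine; it only shows the row-ordering logic that is awkward to express in plain SQL.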

Efficient Management for Both Data and Applications


Aster Database automates and simplifies important aspects of monitoring and managing
availability and performance, not only for data queries but also for in-database analytic
applications. As the system scales from three to 30 to 300 servers or more, Aster
Database's management capabilities ensure that administrative effort and costs
remain minimal.

Always-On Fault Tolerance and Online Administration


Downtime, whether due to system failures or to maintenance operations, is disruptive to
both administrators and users, particularly when analytics are integral to the business and
people need to access the system around the clock. Traditional systems often require
downtime or disruption for routine operations such as loading data, backing up data, or
scaling up the system. Avoiding planned downtime requires a highly available
architecture that can process loading, backups and similar tasks without disrupting
performance. Fault tolerance is also critical so that operations are not affected in the
event of hardware failures, software failures, or user errors. As the size of the system
scales out to accommodate data growth, innovative approaches are needed to maintain
application availability in the face of increasing risk of failures in a component of
the system.
Aster Database's Always-On architecture is designed to minimize and avoid both
planned and unplanned downtime. Live Administration enables non-disruptive
operations including online scaling, simultaneous load and export during queries, online
backup and recovery, and online restoration, eliminating downtime or disruption
traditionally required for these routine tasks. Aster Database is also designed from the
ground up to avoid unplanned outages due to hardware and software failures, user or
administrator error, and local or regional disasters. Aster Database leads the industry in
massive-scale fault tolerance with replication, automatic failover, NIC bonding, failure
heuristics, and clustered backup to prevent unplanned downtime due to hardware or
software failures. In the event of hardware or software failure, Aster Database's
patent-pending Recovery-Oriented Computing (ROC) capabilities and innovations in online data
redistribution ensure real-time recovery.
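
As a rough illustration of how replication and automatic failover keep data available, the sketch below models a small cluster in which every partition has a primary copy and one replica on a different node. It is a toy model under stated assumptions, not a description of Aster Database internals: the Cluster class, the rule that places each replica on the next node, and the node and partition names are all hypothetical.

```python
class Cluster:
    """Toy model: each partition has a primary and one replica on another node."""

    def __init__(self, nodes, partitions):
        if len(nodes) < 2:
            raise ValueError("replication needs at least two nodes")
        self.alive = set(nodes)
        self.placement = {}
        for i, part in enumerate(partitions):
            primary = nodes[i % len(nodes)]
            replica = nodes[(i + 1) % len(nodes)]  # assumed placement rule
            self.placement[part] = [primary, replica]

    def fail(self, node):
        """Mark a node as failed."""
        self.alive.discard(node)

    def serving_node(self, partition):
        """Return the first live copy: the primary, or the replica on failover."""
        for node in self.placement[partition]:
            if node in self.alive:
                return node
        raise RuntimeError("no live copy of " + partition)


cluster = Cluster(["n1", "n2", "n3"], ["p0", "p1", "p2"])
before = cluster.serving_node("p0")  # primary copy on n1
cluster.fail("n1")
after = cluster.serving_node("p0")   # replica on n2 takes over
print(before, after)
```

With one replica per partition, this model keeps every partition readable after any single node failure. A production system must also re-create the lost copies on the surviving nodes, which the sketch does not attempt.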
Aster Database's Always-On architecture enables a massively scalable platform with
continuous availability using standard, off-the-shelf commodity servers. Aster Database
can process queries with consistently high performance even when it is:

Experiencing a hardware failure

Recovering from a hardware failure

Adding capacity

Performing backups

Loading data

Exporting data

Aster Database is uniquely designed to deliver this level of mission-critical resilience
against both unplanned and planned downtime on MPP commodity hardware.

Powerful Console for Visibility and Control


The Aster Management Console (AMC) provides rich visibility into and control of the
Aster Database platform and the applications running inside, making it easy to configure,
manage and monitor data, applications, users, and infrastructure.
The AMC's intuitive web-based graphical interface enables easy monitoring with
summary dashboards, graphical views of query and process execution, and easy access to
common administrative operations. Dashboards provide at-a-glance visibility and easy
drill-down into system status, application metrics, and query performance. Single-click
scaling and point-and-click access to workload management policies and backup
processes further streamline and simplify management.

Figure 6: The Aster Management Console (AMC) provides deep visibility, monitoring,
and control of data and analytic application processing in an intuitive graphical console

Competing and Winning with Analytics


Aster Database revolutionizes the ability to capture critical intelligence from the huge
data volumes flowing into organizations. For enterprises that depend on advanced
analysis of large data volumes to drive daily operations and profitability, there's simply
nothing else like it.
With Aster Database, Teradata Aster opens up a new world of high-performance, scalable
analytics that was previously out of reach for most companies. In this new world, analytic
functions run natively where the data resides, in parallel across hundreds or thousands of
processing instances (as many as you require) for dramatic performance gains. The ability
to quickly build rich analytic functions and push them into the MPP database with a
single click allows companies to leverage the insights hidden in their data in powerful
new and actionable ways.
Aster Database is ideal for any analytics that require sophisticated analysis that can scale
to massive data sets: from data mining to credit scoring, risk modeling, ad targeting,
fraud detection, cross-sell/up-sell bundling, and many more. Hundreds of concurrent
workloads and users, whether running simple queries or the most complex analytics,
execute with phenomenal speed, availability, and scalability.

Welcome to the new world of advanced analytics and Big Data, a world in which deep
analytic insights gathered from many data sources and large volumes of data enable
data-driven business decisions that deliver competitive advantage with less work and
lower costs.

About Teradata Aster


The Teradata Aster MapReduce Platform is the market-leading big data analytics
solution. This analytic platform embeds MapReduce analytic processing for deeper
insights on new data sources and multistructured data types to deliver analytic
capabilities with breakthrough performance and scalability. Teradata Aster's solution
utilizes Aster's patented SQL-MapReduce to parallelize the processing of data and
applications and deliver rich analytic insights at scale. For more information, visit
www.asterdata.com or for more about Teradata, visit teradata.com.
