
Teradata

Leaders in Enterprise Data Warehousing

John Tulley
Vice President, Teradata Canada
Email: John.tulley@ncr.com
Office: 905-478-8997

NCR Corporate Overview


Fortune 500 company
Global operations in more than 100 countries & territories
28,500 employees
2004 revenue: $5.984B
1999-2004: >51% revenue growth

2004 Revenue by Business Unit
Teradata, Financial, Retail, Systemedia, Customer Service, Payment & Imaging, Other

Retail Solutions

Teradata Data Warehouse

Financial Solutions

Systemedia

Worldwide Customer Services

Top Industry Leaders Rely on Teradata


Teradata Top 10
> 80% of Top 10 Global Telco Firms
> 60% of Top 10 Most Admired Global Companies
> 60% of Top 10 Global Airlines
> 50% of Top 10 Global Retailers
> 50% of the Top 10 Transportation Logistic Firms
FORTUNE Global Rankings, July 2005

Leading industries
> Banking
> Government
> Insurance & Healthcare
> Manufacturing
> Retail
> Telecommunications
> Transportation Logistics
> Travel
World class customer list
> More than 800 customers
> Over 1200 installations
Global presence
> Over 100 countries
4,000 world-wide professionals dedicated to data warehousing

The Teradata Difference


What We Do.
Enterprise data warehouse
Windows 2003/Unix/Linux; scales from Intel laptop to MPP
Analytic capabilities transform data into information
Extreme high availability
Industry leader in analytical applications
Integration with SAP, Siebel, Hyperion
Partnerships include Accenture, BearingPoint, Capgemini, Deloitte, EDS, Lockheed Martin
Strong customer references
All we do is Data Warehousing!

Teradata - the recognized leader in data warehousing and high-performance decision analytics (Gartner ASEM).

[Chart: Gartner ASEM ratings, scored from Worst to Best, for Teradata and the platforms below]
IBM S/390, OS/390, DB2 EEE
Sun Enterprise, Solaris, Oracle
HP HP9000, HP-UX, Oracle
IBM SP RS/6000, AIX, DB2 EEE
Compaq Alpha, Tru64, Oracle
Unisys ES7000, Win2000, SQL Server
Generic Intel IA-32, Win2000, SQL Server
Teradata
Rating criteria: Data Mgmt., Data Admin., Scalability and Suitability, Concurrent Query Mgmt., DW Track Record, Query Perform.
Source: Gartner ASEM Ratings 2004

Industry Leadership Recognition


Gartner - Dominant Lead 5th Consecutive Year

> DBMS is surely the place where NCR Teradata sets the gold standard. As in previous years, the Teradata score was 98%, leaving little scope (and need) for improvement.

Gartner's [Application Server Evaluation Model] ASEM Data Warehouse Server Update, A. Butler, K. Strange, J. Enck, M. Chuba, November 2004

> Teradata [database management system] DBMS capabilities remain unchallenged by its competitors in the market.
Gartner's Magic Quadrant for Data Warehouse DBMSs, 2004, Kevin H. Strange, June 2004

> Teradata continues to drive a strong vision.


Gartner Research, MarketScope: Customer Relationship Marketing, 1Q04, G. Herschel, J. Radcliffe, Feb 2004

> Gartner Dataquest recognized Teradata as the growth leader in the RDBMS market, with above-market growth of 17.4% (2005).

> Teradata is rated Positive in Gartner's MarketScope for Campaign Management, the highest rating awarded (2005).

META Group
> Teradata has displayed unmatched (but often copied) strength of vision and focus in the [enterprise data warehouse] EDW market.
METAspectrum Market Summary, Enterprise Data Warehouse METAspectrumSM Evaluation, 2004

Industry Awards and Recognition - 2005


BI Excellence Award (Sponsor: Gartner Group)
> Continental Airlines - winner
> Cardinal Health - finalist
Technology Leadership Award (Sponsor: Frost & Sullivan)
> Teradata selected for Leadership Award, CRM Analytics
TDWI Best Practices Award
> sunrise TDC Switzerland AG - winner, Customer Relationship Management
1to1 Impact Award (Sponsor: Peppers & Rogers)
> Continental Airlines recognized as Technology Optimization winner
Editors' Choice Awards (Sponsor: Intelligent Enterprise)
> Teradata selected for the Dozen Most Influential BI Companies
> Winner, Customer Analytics category
NEXUS Awards (Sponsor: New Zealand Direct Marketing Association)
> Bank of New Zealand, silver award - data mining & analytics; bronze award - data management

Government Agencies with Teradata Presence


US Air Force
US Navy
US Transportation Command
Defense Commissary Agency
Army, Air Force Exchange
Intelligence Community
US Postal Service
Italian Post Office
Dept. of Justice
Dept. of Housing and Urban Development
Dept. of Agriculture
Arizona, Iowa, Florida, Texas, Illinois, New York, Utah, Michigan
RAMQ Quebec
Australian Tax Office
South African Tax Office

Teradata Solutions Methodology


[Diagram: eight methodology phases with the services in each; Project Management spans all phases]
Strategy: Opportunity Assessment, Enterprise Assessment
Research: Business Value, EDW Roadmap, Information Sourcing
Analyze: Application Requirement, Logical Model, Data Mapping, Infrastructure & Education
Design: System Architecture, Package Adaptation, Custom Component, Test Plan, Education Plan
Equip: Hardware Platform, Software Platform, Support Management, Operational Mentoring, Technical Education
Build: Physical Database, ECTL Application, Information Exploitation, Operational Applications, Backup & Recovery, User Curriculum
Integrate: Components for Testing, System Test, Production Install, Initial Data, Acceptance Testing, User Training, Value Assessment
Manage: Help Desk, Capacity Planning, System Performance, Business Continuity, Data Migration, HW/SW Upgrade, Availability SLA, System DBA, Solution Architect

Technology Neutral Services

Teradata's success is the combination of hardware, software and methodology

Data Warehouse Needs Will Evolve


Query complexity grows
Workload mixture grows
Data volume grows
Schema complexity grows
Depth of history grows
Number of users grows
Expectations grow

[Chart: five stages of data warehouse evolution, plotted as Workload Complexity against Data Sophistication]
1. REPORTING: WHAT happened? Primarily batch & some ad hoc reports
2. ANALYZING: WHY did it happen? Increase in ad hoc analysis
3. PREDICTING: WHAT WILL happen? Analytical modeling grows
4. OPERATIONALIZING: WHAT IS happening?
5. ACTIVATING: MAKE it happen! Event-based triggering takes hold
Workload legend: Batch, Ad Hoc, Analytics, Continuous Update/Short Queries, Event-Based Triggering

Enterprise Analytical Topologies


[Diagram: four topologies and their components]
Data Mart Centric: Sources, Marts, Users
Virtual, Distributed, Federated: Sources, Middleware, Users
Hub-and-Spoke Data Warehouse: Sources, ODS, DW, Marts, Users
Enterprise Data Warehouse: Sources, DW, Users

Independent Data Marts
Pros:
> Easy to Build Organizationally
> Easy to Build Technically
Cons:
> Business Enterprise view unavailable
> Redundant data costs
> High ETL costs
> High App costs
> High DBA and operational costs

Leave Data Where it Lies


> No need for ETL
> No need for separate platform

Dependent Data Marts


Allows easier customization of user interfaces & reports

Centralized Integrated Data With Direct Access


> Enterprise view
> Design consistency & data quality
> Data reusability
> Requires vision
> Requires Data Owners to willingly participate

> No ETL
> Meta data issues
> Network bandwidth and join complexity issues
> Only viable for low volume

> Business Enterprise view challenging
> Redundant data costs
> High DBA and operational costs
> Data latency
> ODS duplication


Typical Data Warehouse Architecture


What's wrong with this picture?
[Diagram: Transaction Systems feed Operational Data Stores, then a central store/hub/clearing house, then Data Marts]
1. There are too many copies of the data. Will they all be the same?
2. There is too much latency - too long to get the data to the people who need it. Everyone sees different, inconsistent points in time.
3. The solution is too complex. Every line on the chart represents an ETL process that requires $$ for Life Cycle Maintenance.
4. The solution is too expensive. There are numerous components that lead to increased costs. Costs are often hidden in a distributed organization.

Teradata's Enterprise Data Warehouse


An Integrated, Centralized Data Warehouse Solution
Single version of data

[Diagram labels]
Inputs: Transactional Users, Transactional Data
Data movement: Optional ETL Hub, Optional ELT, Data Transformation, Data Replication, Middleware/Enterprise Message Bus
Stores: Operational Data Store (ODS) (optional), Enterprise Data Warehouse, Data Marts
Data mart styles: Dependent DM, Dimensional, Application Co-Located, Logical (Views), Virtual Views (optional)
Design: Logical Data Model, Physical Data Base Design, Metadata
Services: Enterprise, System, & Database Management; Business & Technology Consultation; Support & Education Services
Decision Users: Strategic Users, Tactical Users, Reporting, OLAP Users, Data Miners, Event-driven/Closed Loop

Sample logical data model
ORDER: ORDER NUMBER, ORDER DATE, STATUS
ORDER ITEM BACKORDERED: QUANTITY
ORDER ITEM SHIPPED: QUANTITY, SHIP DATE
ITEM: ITEM NUMBER, QUANTITY, DESCRIPTION
CUSTOMER: CUSTOMER NUMBER, CUSTOMER NAME, CUSTOMER CITY, CUSTOMER POST, CUSTOMER ST, CUSTOMER ADDR, CUSTOMER PHONE, CUSTOMER FAX

Sample dimensional model
SALES: PERIOD KEY, PRODUCT KEY, CUSTOMER KEY, MARKET KEY, DOLLARS, UNITS
PERIOD: PERIOD KEY, DATE, DAY, MONTH, YEAR, QUARTER, TRIMESTER
PRODUCT: PRODUCT KEY, PRODUCT NAME, DISTRIBUTOR, PRODUCT DESCRIPTION, PRODUCT HEIGHT, PRODUCT WIDTH, PRODUCT DEPTH, PRODUCT WEIGHT
CUSTOMER: CUSTOMER KEY, CUSTOMER NAME, CUSTOMER CITY, CUSTOMER POST, CUSTOMER ST, CUSTOMER ADDR, CUSTOMER PHONE, CUSTOMER FAX
MARKET: MARKET KEY, CITY, STATE, ZIP, ZIP4, DISTRICT, REGION, COUNTRY
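
The "Logical (Views)" and "Virtual Views" labels suggest data marts defined as views over the single stored copy rather than as separate physical extracts. The sketch below illustrates that idea only. It uses Python's built-in sqlite3 module as a stand-in database (an assumption, not Teradata), a trimmed version of the SALES, PRODUCT, and MARKET tables from the sample dimensional model, and a hypothetical view name, regional_sales.

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Create a tiny in-memory 'warehouse' with one view acting as a virtual mart."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE product (product_key INTEGER PRIMARY KEY, product_name TEXT);
        CREATE TABLE market  (market_key  INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE sales   (period_key INTEGER, product_key INTEGER,
                              customer_key INTEGER, market_key INTEGER,
                              dollars REAL, units INTEGER);
        -- The "data mart" is only a view: no second copy of the rows is stored.
        CREATE VIEW regional_sales AS
            SELECT m.region, p.product_name,
                   SUM(s.dollars) AS dollars, SUM(s.units) AS units
            FROM sales s
            JOIN product p ON p.product_key = s.product_key
            JOIN market  m ON m.market_key  = s.market_key
            GROUP BY m.region, p.product_name;
        """
    )
    con.executemany("INSERT INTO product VALUES (?, ?)", [(1, "Widget"), (2, "Gadget")])
    con.executemany("INSERT INTO market VALUES (?, ?)", [(10, "East"), (20, "West")])
    con.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?)",
        [
            (200501, 1, 100, 10, 50.0, 5),
            (200501, 1, 101, 10, 30.0, 3),
            (200501, 2, 100, 20, 20.0, 1),
        ],
    )
    return con


def mart_rows(con: sqlite3.Connection) -> list:
    """Read the virtual mart in a stable order."""
    query = (
        "SELECT region, product_name, dollars, units "
        "FROM regional_sales ORDER BY region, product_name"
    )
    return con.execute(query).fetchall()


if __name__ == "__main__":
    warehouse = build_warehouse()
    print(mart_rows(warehouse))
    # A new detail row is visible through the view at once, with no reload step.
    warehouse.execute("INSERT INTO sales VALUES (200502, 2, 102, 20, 10.0, 2)")
    print(mart_rows(warehouse))
```

Because the view is evaluated against the base tables at query time, the mart cannot drift from the detail data, which is the "single version of data" point the slide makes.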

TERADATA is an Open System


Virtually any application or middleware framework can be integrated with TERADATA!

[Diagram labels]
Clients: Messages, WEB
Access stacks: JMS / JAVA / JDBC; JMS / EJB / JDBC; JSP / TAP Appl / JDBC; IIOP / CORBA / ODBC; ASP / .NET / OLE-DB
Messaging: Publish & Subscribe, Queues, Message Bus
Database side: TERADATA, TERADATA Utilities, Adapter(s)

Teradata Active Data Warehouse in action


1. Continuous transaction feeds on supplies usage
2. Conditioning & loading of trans data (Ascential, Informatica)
3. Stored procedures and trigger-based event detection; TERADATA sends alert to Warfighter, Warfighter Support, & DOD Supplier via MSTR Narrowcaster
4. and/or DOD Vendor notified and reorders
5. Warfighter receives alert via Secure Blackberry, adjusts battle plans to align with rush replenishment

[Diagram labels]
Participants: Front Line, Base Supply, DOD Supplier, Warfighter Support
Networks: Secure Wireless, Secure DOD Network
Environments: Transactional Environment, Decision Making Environment
Data flow: Legacy Systems, Information Exchange, Data Acquisition, Direct Data Access, Strategic & Tactical Queries
Utilities: T-Pump, MQ Adapter, Fast Load, Multi Load, Fast Export
Database features: Stored Procedures, Q Tables, UDF, Triggers
Enterprise Application Integration (EAI): Web Services, WebSphere, Tibco, .NET
Business Services: OLAP Queries, Rules Engine, Event Engine, Intel Agents
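
Steps 1 to 4 describe event detection on a continuous feed: usage transactions arrive, a trigger notices a condition, and an alert is queued for delivery. The following is a minimal sketch of that pattern in plain Python, not Teradata stored-procedure or trigger code. The class name SupplyMonitor, the reorder-point rule, and the item name are all hypothetical; the recipient names come from the slide, and a deque stands in for a queue table.

```python
from collections import deque
from dataclasses import dataclass, field

# Recipients named on the slide; everything else in this sketch is assumed.
RECIPIENTS = ("Warfighter", "Warfighter Support", "DOD Supplier")


@dataclass
class SupplyMonitor:
    """Tracks stock per item and queues an alert when stock crosses a threshold."""

    on_hand: dict[str, int]
    reorder_point: dict[str, int]
    alert_queue: deque = field(default_factory=deque)  # stand-in for a queue table

    def record_usage(self, item: str, quantity: int) -> None:
        """Apply one usage transaction and run the 'trigger' logic."""
        before = self.on_hand[item]
        after = before - quantity
        self.on_hand[item] = after
        threshold = self.reorder_point[item]
        # Fire only on the downward crossing, so one shortage yields one alert.
        if before > threshold >= after:
            self.alert_queue.append(
                {"item": item, "on_hand": after, "notify": RECIPIENTS}
            )

    def next_alert(self):
        """Consume the oldest pending alert, or return None if the queue is empty."""
        return self.alert_queue.popleft() if self.alert_queue else None


if __name__ == "__main__":
    monitor = SupplyMonitor(on_hand={"fuel filter": 100}, reorder_point={"fuel filter": 40})
    for used in (30, 35, 5):
        monitor.record_usage("fuel filter", used)
    print(monitor.next_alert())
    print(monitor.next_alert())
```

Firing on the threshold crossing rather than on every low reading is a design choice made here to avoid flooding recipients; the slide does not say how the real system handles repeated alerts.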

So what is Teradata?

> RDBMS designed to run the world's largest databases
> Latest Intel technology nodes
> UNIX MP-RAS, Windows 2003; Linux in Fall 2005
> Scales linearly from Laptop to MPP
> Has a parallel-aware optimizer that allows multiple complex queries to run concurrently
> Standard access language (SQL)
> Uses a Shared-Nothing architecture
> Unlimited, unconditional parallelism
> Linear Scalability allows for increased workload without decreased throughput.

Teradata Hardware Architecture


[Diagram: four SMP nodes (SMP Node1 to SMP Node4), each running PEs and AMPs, joined by the BYNET Interconnect]

SMP Nodes
> Latest Intel SMP CPUs
> Configured in 2 to 8 node cliques
> Windows, Unix or Linux
BYNET Interconnect
> Fully scalable bandwidth
> 1 to 1024 nodes
Connectivity
> Fully scalable
> Channel - ESCON
> LAN, WAN
Storage
> Independent I/O
> Scales per node
Server Management
> One console to view the entire system

Teradata Shared Nothing Architecture

[Diagram: nodes built from processors (P) on a front-side bus (FSB), with Memory and I/O blocks; the nodes connect through their I/O interfaces]

> Similar to Large SMP, except Interconnect runs at I/O Rates and not Memory Rates
> Longer Lifetime: I/O Interfaces have a 3-5 Year Lifetime
> Scaling Is By Increasing Link Data Rates and Parallel Links

SMP vs. MPP: The Teradata Advantage


Configuration              Relative CPUs   Memory    Bus            Memory bandwidth   I/O
2-Way SMP                  1.8             4 GB      3.2 GB/Sec     3.2 GB/Sec         1.5 GB/Sec
4-Way SMP                  3.1             4 GB      3.2 GB/Sec     3.2 GB/Sec         1.5 GB/Sec
2 2-Way Teradata Nodes     3.6             8 GB      6.4 GB/Sec     6.4 GB/Sec         3 GB/Sec
32 2-Way Teradata Nodes    57.6            128 GB    102.0 GB/Sec   102.0 GB/Sec       48 GB/Sec
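
The Teradata node figures follow from multiplying the 2-way SMP figures by the node count, which is the linear-scaling argument the slide is making. The sketch below reproduces that arithmetic. The names NodeSpec and mpp_totals are illustrative, and the assumption that every resource scales exactly linearly is the slide's claim, not a measurement. Straight multiplication gives 102.4 GB/Sec of bus and memory bandwidth for 32 nodes, slightly above the 102.0 shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node figures copied from the 2-Way SMP entry."""

    relative_cpus: float = 1.8
    memory_gb: float = 4.0
    bus_gb_per_sec: float = 3.2
    memory_gb_per_sec: float = 3.2
    io_gb_per_sec: float = 1.5


def mpp_totals(node: NodeSpec, node_count: int) -> dict[str, float]:
    """Aggregate capacity, assuming each figure scales linearly with node count."""
    return {
        "relative_cpus": round(node.relative_cpus * node_count, 1),
        "memory_gb": round(node.memory_gb * node_count, 1),
        "bus_gb_per_sec": round(node.bus_gb_per_sec * node_count, 1),
        "memory_gb_per_sec": round(node.memory_gb_per_sec * node_count, 1),
        "io_gb_per_sec": round(node.io_gb_per_sec * node_count, 1),
    }


if __name__ == "__main__":
    for count in (1, 2, 32):
        print(count, mpp_totals(NodeSpec(), count))
```

By contrast, the 4-Way SMP entry adds CPUs (3.1 relative) while memory, bus, and I/O figures stay at the 2-way values, which is why the slide presents adding nodes as the better way to scale.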

Teradata Data Distribution


Dividing the Work
Rows are distributed evenly by hash partitioning
> Done in real-time as data are loaded, appended, or changed.
> No reorgs, repartitioning, space management

Shared nothing software:
> Each VAMP owns an equal slice of the data.
> Each VAMP works exclusively & independently on its rows
> Nothing centralized: No single point of control for any operation (I/O, Buffers, Locking, Logging, Dictionary)

[Diagram: rows of Table A, Table B, and Table C pass their Primary Index through the Teradata Parallel Hash Function, which yields a RowHash (Hash Bucket) stored with the Data Fields; the rows are spread across VAMP1, VAMP2, VAMP3, VAMP4 ... VAMPn]
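
The distribution scheme can be sketched in a few lines: hash the primary index value, reduce the hash to a bucket, and map the bucket to a VAMP. This is an illustration under stated assumptions, not Teradata's algorithm. The real hash function is not described in the deck, so MD5 from hashlib is used as a stand-in, and both the bucket count (HASH_BUCKETS) and the modulo mapping from bucket to VAMP are assumptions chosen for simplicity.

```python
import hashlib
from collections import Counter

HASH_BUCKETS = 65536  # assumed bucket count, for illustration only


def row_hash(primary_index_value: str) -> int:
    """Stand-in row hash: MD5 of the primary index value, folded to a bucket."""
    digest = hashlib.md5(primary_index_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % HASH_BUCKETS


def owning_vamp(primary_index_value: str, vamp_count: int) -> int:
    """Map the hash bucket to one of vamp_count VAMPs (simple modulo, assumed)."""
    return row_hash(primary_index_value) % vamp_count


def distribute(keys, vamp_count: int) -> Counter:
    """Count how many rows each VAMP would own for the given primary index values."""
    return Counter(owning_vamp(key, vamp_count) for key in keys)


if __name__ == "__main__":
    customer_numbers = (f"CUST{i}" for i in range(40000))
    for vamp, rows in sorted(distribute(customer_numbers, 4).items()):
        print(f"VAMP{vamp + 1}: {rows} rows")
```

Because the owner is computed from the key alone, any component can locate a row without consulting a central directory, which matches the "nothing centralized" bullet. The split is only even when the primary index values are reasonably unique; a heavily repeated value would send all of its rows to one VAMP.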


File System
File system architecture is fundamentally different
> Broke all the rules
> No Pages, BufferPools, TableSpaces, Extents,...
> Data location and management are entirely automatic
> Space allocation is entirely dynamic

Absolutely minimal labor required
> No reorgs
  Don't even have a reorg utility
> No index rebuilds
> No re-partitioning
> No detailed space management
> Easy database and table definition
> Minimum ongoing maintenance
  All performed automatically

Self Managing Architecture

Teradata's self-managing philosophy provides the lowest total cost of ownership of any RDBMS
> Automatic, random and even data distribution
> Parallel-aware optimizer eliminates query tuning
> Parallel utilities with low setup and checkpoint restart
> Single operational view of entire MPP complex (AWS)
> Single point of control for the DBA (Teradata Manager)
> SQL-ready database management information (log files)

Teradata DBAs Don't Worry About!

1. Install the Database
2. Understand, monitor and tune extensive operating system parameters
3. Understand, monitor and tune extensive database parameters
4. Determine the size and physical location and/or space allocations of tables and index partitions
5. Perform periodic table and index re-orgs
6. Manually restart multi-step load process when failure occurs
7. Ability to run queries and data maintenance 24x7
8. Sort data before loading
9. Calculate and configure fail-over plans in a clustered multiprocessing environment
10. Spend a lot of time planning and expanding the system
11. Query tuning for decision support

Teradata High Availability


[Diagram: four SMP nodes (SMP Node1 to SMP Node4), each running PEs and AMPs, joined by the BYNET Interconnect]

Teradata software provides high availability beyond other databases
> Compensates for hardware failures:
  Automatic failover for dynamic workload rebalancing (migrating VPROCS)
  Online, continuous backup (Fallback)
> Recycles before the operating system completes its reboot (multi-node system)
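
"Migrating VPROCS" means that when a node fails, the virtual processors it hosted restart on surviving nodes so the workload is rebalanced. The sketch below shows one way such a reassignment could work. It is a toy model, not Teradata's failover logic: the function name migrate_vprocs, the least-loaded placement rule, and the four-AMPs-per-node layout (taken from the diagram) are assumptions for illustration.

```python
def migrate_vprocs(clique: dict[str, list[str]], failed_node: str) -> dict[str, list[str]]:
    """Return a new node-to-AMP map with the failed node's AMPs moved to survivors."""
    if failed_node not in clique:
        raise KeyError(failed_node)
    survivors = {node: list(amps) for node, amps in clique.items() if node != failed_node}
    if not survivors:
        raise RuntimeError("no surviving node in the clique")
    # Hand each orphaned AMP to the survivor hosting the fewest AMPs so far;
    # ties are broken by node name to keep the result deterministic.
    for amp in clique[failed_node]:
        target = min(survivors, key=lambda node: (len(survivors[node]), node))
        survivors[target].append(amp)
    return survivors


if __name__ == "__main__":
    clique = {
        f"SMP Node{n}": [f"AMP{(n - 1) * 4 + i}" for i in range(1, 5)]
        for n in range(1, 5)
    }
    after = migrate_vprocs(clique, "SMP Node2")
    for node, amps in after.items():
        print(node, len(amps), amps)
```

Every AMP keeps running after the failure, so all data slices remain reachable; the cost is that the surviving nodes each carry more work until the failed node returns.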


Teradata's Multidimensional Scalability

(It's more than just big data)

[Diagram: four dimensions of scalability]
Amount of Detailed Data
Concurrent Users
Multiple Subject Areas (sample schema: ORDER, ORDER ITEM BACKORDERED, ORDER ITEM SHIPPED, ITEM, CUSTOMER)
Sophisticated Queries: Simple, Direct (at the start); Moderate, Multi-table Join; Regression analysis; Query tool support

EDW Requires Multi-dimensional Scalability


Data Volume
(Raw, User Data)

Mixed Workload

Query Concurrency

Data Freshness

Query Complexity

Query Freedom

Query Data Volume

Schema Sophistication

The Teradata Difference Multi-dimensional Scalability


Data Volume
(Raw, User Data)

Mixed Workload
Teradata can Scale Simultaneously Across Multiple Dimensions. Driven by Business!

Query Concurrency
Competition Scales One Dimension at the Expense of Others. Limited by Technology!

Data Freshness

Query Complexity

Query Freedom

Query Data Volume

Schema Sophistication

The Teradata Difference Multi-dimensional Scalability


[Chart: Teradata compared with Others on six scalability axes]
Data Storage (raw, user data): 5 TB, 10 TB, 15 TB, 20 TB, 100s TBs +
Schema Sophistication: Simple Star; Multiple, Integrated Stars; Normalized; Multiple, Integrated Stars and Normalized
# of Concurrent Queries: up to 1,000s
Query Complexity: 3-5 Way Joins; 5-10 Way Joins; 15+ way Joins + OLAP operations + Aggregation + Complex Where constraints + Views; Parallelism
Query Data Volumes: MBs, GBs, TBs
Workload Mix: Batch Reporting, Repetitive Queries; Iterative, Ad Hoc Queries; Data Analysis/Mining; Near Real Time Data Feeds; Active Data Warehousing

State of Michigan, Department of Community Health (DCH)


Customer Profile
Teradata Customer Since 1991
As the largest department in the State of Michigan, DCH is responsible for managing delivery of health care services to more than 1.2 million clients and overseeing an annual budget of $9.5 billion. DCH administers many of the state's most critical programs, including Medicaid, WIC, and child immunizations.

Business Solutions
Data warehouse integrates claims/encounters; beneficiary eligibility data; provider data; birth records; death records; long-term care assessments; WIC data; immunizations; lead screening; newborn screening; & notifiable diseases.
> Fraud & abuse
> Contract management with health plans
> Healthcare cost & quality assessment
> Overpayment & COB analysis
> Program effectiveness
> Predict state's healthcare needs
> Prioritize health initiatives for future

Implementation Summary
> Integrated data from nine separate health-related agencies
> Managed and used by agency subject matter/programmatic experts, not by the IT department
> Over 200 users in Medicaid and 8,000 state-wide

Realizations and ROI
> Estimated annual savings of $75 million to $100 million due to advanced health care analysis
> Medicaid administrative costs have been reduced by 25 percent
> Recoveries for Medicaid fraud have doubled
> Maximized Medicaid program savings while sustaining quality care
> Warehouse helped Michigan go from last to first in child immunization rates
> Track and substantiate savings in Medicaid pharmacy costs
> 2004 TDWI Best Practice Award Winner - Government and Non-Profit Category

The New York State Department of Health (DoH)


Customer Profile
Teradata Customer Since 1999
New York's Medicaid program provides critical health care services to more than 3.7 million participants, 2.4 million in New York City alone. To serve this constituency, the state processes and analyzes more than 300 million claims totaling more than $38 billion annually. It is the largest Medicaid program in the US.

Business Solutions
New York is making more rapid, informed decisions about programs, policies, and people across its vast Medicaid system.
> Fraud & abuse
> Tracking bio-terrorism indicators daily by pharmaceutical purchases with acute illness data from hospital emergency rooms
> Determining disease patterns and trends and the best possible treatment
> Tracking drug pattern usage to prevent abuse
> Program effectiveness
> Service delivery effectiveness
> Enhanced audit control
> Forecasting the cost and utilization of expensive prescription drugs
> Identification of overpayments
> Responding quickly to legislative inquiries

Implementation Summary
> More than five years of history
> 1.3 billion claims
> 650 users from 17 counties, expected to grow to thousands

Realizations and ROI
> First year in operation paid for entire implementation of the DW!
> Better analysis of integrated data resulted in recoveries in the millions! $16m - Coordination of Benefits, $5m - duplicate payments, $1 million - overpayments
> $187 million saved due to better policy decisions based on medical and pharmaceutical analysis
> Millions saved due to efficiency of analysis, such as audit process reduced to 2 hours from 8 weeks
> 2004 NASCIO Award - Best Information Architecture Category

Iowa Department of Revenue


Tax Compliance: Business Benefits
Have more accurate leads because of better information
Experienced substantial savings; staff can:
> Analyze greater volumes of data
> Manage a greater number of cases
> Exercise a higher level of control over taxpaying behavior
> Before the EDW, this additional work would have required a 20-25% increase in audit staff
Generated $69.7M in incremental collections and refund reductions in 2003
> $30.6M through office examinations
> $17.4M in refund reductions
> $9.1M from tax gap revenues
> $7.5M in out-of-state audits of multi-state businesses
> $5.1M from in-state field audits

The Teradata Mission


Teradata Active Data Warehousing
Strategic, tactical, and event-driven decision making in a single, centralized, mission-critical, up-to-date version of the enterprise data

[Diagram: Sources feed the Active Data Warehouse, which serves strategic and tactical Users]

Any Question, By Any User, At Any Time
All Decision Making from One Copy of the Data.

The Industry Leader in Data Warehousing

john.tulley@ncr.com