
Module 2

Planning Data Warehouse Infrastructure

Module Overview

• Considerations for Data Warehouse Infrastructure
• Planning Data Warehouse Hardware
Lesson 1: Considerations for Data Warehouse Infrastructure

• System Sizing Considerations
• Data Warehouse Workloads
• Typical Server Topologies for a BI Solution
• Scaling Out a BI Solution
• Planning for High Availability
System Sizing Considerations

• Data Volume
• Analysis/Report Complexity
• Number of Users
• Availability Requirements


Data Warehouse Workloads

ETL
• Control flow tasks
• Data query and insert
• Network data transfer
• In-memory data pipeline
• SSIS Catalog or msdb I/O

Data Models
• Processing
• Aggregation storage
• Multidimensional on disk
• Tabular in memory
• Query execution

Reporting
• Client requests
• Data source queries
• Report rendering
• Caching
• Snapshot execution
• Subscription processing
• Report Server Catalog I/O

Operations and DW Maintenance
• OS activity
• Logging
• SQL Server Agent Jobs
• SSIS packages
• Indexes
• Backups
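Typical Server Topologies for a BI Solution

[Diagram: a spectrum from a single-server topology (few servers) to a distributed topology (many servers)]

Factors that vary along this spectrum:
• Hardware costs
• Software license costs
• Configuration complexity
• Scalability and performance
• Flexibility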
Scaling Out a BI Solution

• Data Warehouse
• Analysis Services
• Integration Services
• Reporting Services


Planning for High Availability

Data Warehouse
• AlwaysOn Failover Cluster
• RAID Storage

Analysis Services
• AlwaysOn Failover Cluster

Integration Services
• AlwaysOn Availability Group

Reporting Services
• NLB Report Servers
• AlwaysOn Availability Group or AlwaysOn Failover Cluster
Lesson 2: Planning Data Warehouse Hardware

• SQL Server Fast Track Data Warehouse Reference Architectures
• Core-Balanced System Architecture
• Demonstration: Calculating Maximum Consumption Rate (MCR)
• Determining Processor and Memory Requirements
• Determining Storage Requirements
• Considerations for Storage Hardware
• SQL Server Data Warehouse Appliances
• SQL Server Parallel Data Warehouse
SQL Server Fast Track Data Warehouse Reference Architectures

• Pre-tested and approved hardware specifications and guidance
• Available from multiple hardware vendors in partnership with Microsoft
• Support for a range of data warehouse sizes
• Tools provided to calculate the required specification
Core-Balanced System Architecture

[Diagram: a server running Windows Server and SQL Server, with two quad-core CPUs and dual-port FC HBAs, connected through a fibre switch to three storage enclosures, each with storage processors and 4-spindle RAID 10 disk groups]

• Per-core MCR = 200 MBps
• Total MCR = 1,600 MBps
• 2 x FC port per processor
• Max I/O rates labeled in the diagram: 2,000 MBps, 2,000 MBps, and 1,800 MBps
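One way to read the figures in this diagram is as a balance check: the I/O path should deliver data at least as fast as the CPU cores can consume it. The Python sketch below assumes that total MCR is simply cores × per-core MCR (consistent with 8 × 200 = 1,600 MBps here). The function names are illustrative, and the three I/O rates are listed without component names because the extracted slide text does not show which component each rate belongs to.

```python
def total_mcr_mbps(cores, per_core_mcr_mbps):
    """Aggregate CPU consumption rate, assuming it scales linearly with cores."""
    return cores * per_core_mcr_mbps


def io_keeps_up(total_mcr, io_rates_mbps):
    """True if every listed I/O rate meets or exceeds the total MCR."""
    return all(rate >= total_mcr for rate in io_rates_mbps)


# Figures from the diagram: two quad-core CPUs at 200 MBps per core.
cpu_rate = total_mcr_mbps(cores=8, per_core_mcr_mbps=200)
io_rates = [2000, 2000, 1800]  # Max I/O rates shown in the diagram (MBps)

print(cpu_rate)                         # 1600
print(io_keeps_up(cpu_rate, io_rates))  # True
```

If any I/O rate fell below the total MCR, that component would limit throughput and the cores would sit idle waiting for data.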
Demonstration: Calculating Maximum Consumption Rate (MCR)

In this demonstration, you will see how to:

• Create tables for benchmark queries
• Execute a query to retrieve I/O statistics
• Calculate MCR from the I/O statistics
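The slide does not state the calculation itself. A commonly cited form, assumed here, divides the logical reads reported for the benchmark query by the CPU time in seconds, then converts 8 KB pages to megabytes. The Python sketch below applies that form; the function name and the sample numbers are hypothetical, and the CPU time is taken in milliseconds because that is how SQL Server reports it with SET STATISTICS TIME.

```python
PAGE_SIZE_KB = 8  # SQL Server data pages are 8 KB


def mcr_mb_per_sec(logical_reads, cpu_time_ms):
    """MCR in MB/s from logical page reads and CPU time (assumed formula)."""
    if cpu_time_ms <= 0:
        raise ValueError("cpu_time_ms must be positive")
    cpu_seconds = cpu_time_ms / 1000
    return (logical_reads / cpu_seconds) * PAGE_SIZE_KB / 1024


# Hypothetical benchmark result: 100,000 logical reads in 4,000 ms of CPU time.
print(round(mcr_mb_per_sec(100_000, 4_000), 1))  # 195.3
```

Running the benchmark several times and averaging the results should give a steadier figure than a single run.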
Determining Processor and Memory Requirements

Estimating CPU Requirements:

• Determine core MCR
• Apply formula to estimate required number of cores:
  ((Average query size in MB ÷ MCR) × Concurrent users) ÷ Target response time
• Spread cores across CPUs based on the number of storage arrays

Estimating RAM Requirements:

• Use a minimum of 4 GB per core (or 64–128 GB per socket)
• Target 20% of data volume
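The core-count formula and the two RAM guidelines can be sketched in Python as follows. This assumes MCR is in MB per second and the target response time is in seconds, so the units cancel; it also rounds up to whole cores, which the slide does not state. The function names and the workload figures are hypothetical.

```python
import math


def required_cores(avg_query_mb, core_mcr_mbps, concurrent_users, target_response_s):
    """Apply the slide's formula and round up to whole cores."""
    cores = (avg_query_mb / core_mcr_mbps) * concurrent_users / target_response_s
    return math.ceil(cores)


def ram_guidelines_gb(cores, data_volume_gb):
    """Return the two RAM figures from the slide: per-core minimum and 20% target."""
    return {"minimum": 4 * cores, "target": 0.20 * data_volume_gb}


# Hypothetical workload: 8,000 MB average query, 200 MBps core MCR,
# 10 concurrent users, 20-second target response time.
cores = required_cores(8_000, 200, 10, 20)
print(cores)                            # 20
print(ram_guidelines_gb(cores, 1_000))  # {'minimum': 80, 'target': 200.0}
```

The two RAM figures are reported separately because the slide gives them as independent guidelines rather than as one combined rule.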
Determining Storage Requirements

Estimating Data Volumes for the Data Warehouse
1. Estimate Initial Fact Data
• Number of fact table rows × row size
• Use 100 bytes per row as an estimate if unknown
2. Allow for Indexes and Dimensions
• Add 30–40% for dimensions and indexes
3. Project Fact Data Growth
• Number of new fact rows per month
4. Factor in Compression
• Typically 3:1

Other storage requirements

• Configuration databases
• Log files
• tempdb
• Staging tables
• Backups
• Analysis Services models
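The four estimation steps above can be combined into a rough calculation. The Python sketch below is one possible reading, not a prescribed method: it assumes the 30–40% allowance (35% by default) applies to initial and projected fact data alike, that compression applies to the whole total, and that sizes are in decimal gigabytes. The function name, the parameters, and the sample row counts are hypothetical. It covers only the data warehouse volume, not the other storage requirements listed.

```python
def estimate_dw_storage_gb(initial_rows, new_rows_per_month, months,
                           row_bytes=100, overhead=0.35, compression_ratio=3.0):
    """Rough data warehouse volume in GB after growth, overhead, and compression."""
    # Steps 1 and 3: initial fact rows plus projected growth.
    total_rows = initial_rows + new_rows_per_month * months
    raw_gb = total_rows * row_bytes / 1e9  # decimal gigabytes
    # Step 2: allowance for dimensions and indexes.
    with_overhead_gb = raw_gb * (1 + overhead)
    # Step 4: compression, typically 3:1.
    return with_overhead_gb / compression_ratio


# Hypothetical warehouse: 1 billion initial fact rows, 50 million new rows
# per month, projected over 36 months.
print(round(estimate_dw_storage_gb(1_000_000_000, 50_000_000, 36)))  # 126
```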
Considerations for Storage Hardware

• Use more, smaller disks instead of fewer, larger disks
• Use the fastest disks you can afford
• Consider solid state disks, especially for random I/O
• Use RAID 10, or minimally RAID 5
• Consider a dedicated storage area network for manageability and extensibility
• Balance I/O across enclosures, storage processors, and disk groups
SQL Server Data Warehouse Appliances

• Pre-built hardware and software solutions, based on tested configurations
• Part of a range of appliances that are based on SQL Server
• Available from multiple hardware vendors
SQL Server Parallel Data Warehouse

• A special SQL Server edition only available in hardware appliances
• Shared-nothing architecture
• Massively parallel processing
• Dedicated control nodes, compute nodes, and storage nodes
Lab: Planning Data Warehouse Infrastructure

• Exercise 1: Planning Data Warehouse Hardware

Logon Information
Virtual machine: 20767C-MIA-SQL
User name: ADVENTUREWORKS\Student
Password: Pa55w.rd

Estimated Time: 30 minutes


Lab Scenario

You are planning a data warehouse solution for Adventure Works Cycles, and have been asked to specify the hardware that is required. You must design a solution that is based on SQL Server that provides the right balance of functionality, performance, and cost.
Lab Review

• Review DWHardwareSpec.xlsx in the D:\Labfiles\Lab02\Solution folder. How does the hardware specification in this workbook compare to the one that you created in the lab?
Module Review and Takeaways

• Review Question(s)
