CH2 Notes: Chapter 2: Choosing The SQL Server 2019


CH2 Notes

Chapter 2: Choosing the SQL Server 2019 Analytic Model for Your BI Needs
Strengths of the multidimensional model

- The multidimensional model is mature: it offers stability and availability and is
well documented.
- Scaling for large datasets is built on the filesystem, with data compression of
about 3x.
- Using actions to enhance the user experience: actions allow the developer to create
a more interactive or elegant solution directly in the cube with data in the cube, for
example by creating links to websites.
- Building complex relationships and rollups: parent-child relationships (self-join
structures) are not supported in tabular models but are easily supported in
multidimensional models. Custom rollups are particularly helpful when working with
financial models.
- Solving 'what if' scenarios with write back: this feature is typically used to handle
'what if' scenarios, such as budgeting and forecasting. The write back capability in
multidimensional models retains and updates data modified in the process.
- Forces good data modeling techniques: you must have a solid star schema built on
dimensional modeling techniques to create the cube. While some may consider this a
weakness because it requires more time and technical development, the end result is
highly trusted.

Strengths of tabular models

- Everything is a table: this gives the designer flexibility to create different types
of models to support business needs and to manage changes in the model. The fields in
a table are also all treated the same, whereas in multidimensional design we work with
measure groups and dimensions.
- Scaling for large datasets is built into memory: tabular models have a much higher
compression rate, at about 10x.
- DAX is typically easier for users: however, it is still an expression language that
is not designed for complex scenarios or simple querying. DAX is getting more support
for set-based operations, which is where it is significantly weaker than SQL or MDX.
- Extending tables with calculated columns.
- Using data source capabilities with DirectQuery: DirectQuery allows you to work with
sets of data beyond the size that can be supported in tabular models due to memory
restrictions. It also allows you to take advantage of the underlying data source
servers to return results, so the model serves as a semantic layer and sends queries
back to the supported data sources.
- Distinct count is a simple expression.
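The compression figures quoted above (about 3x for multidimensional storage and about 10x for tabular storage) can be turned into a rough size estimate. This is a minimal sketch, assuming those ratios hold for your data; the function name and the 600 GB source size are hypothetical, and real compression depends heavily on the data itself.

```python
# Rough storage estimate from the compression ratios quoted in the notes.
# These are rules of thumb, not guarantees; actual results depend on the data.
MULTIDIMENSIONAL_RATIO = 3   # about 3x compression on the filesystem
TABULAR_RATIO = 10           # about 10x compression in memory


def compressed_size_gb(source_gb: float, ratio: float) -> float:
    """Return the estimated model size for a given source data size."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return source_gb / ratio


source_gb = 600.0  # hypothetical size of the uncompressed source data
multidimensional_gb = compressed_size_gb(source_gb, MULTIDIMENSIONAL_RATIO)
tabular_gb = compressed_size_gb(source_gb, TABULAR_RATIO)

print(f"multidimensional: ~{multidimensional_gb:.0f} GB on disk")
print(f"tabular: ~{tabular_gb:.0f} GB in memory")
```

The tabular estimate is the compressed model size only; the memory needed to run the model is larger than that.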
Multidimensional model challenges

- Difficult implementation of distinct count: the issue is resolved by creating a
separate measure group.
- MDX can be difficult: the traditional understanding of how tables and relationships
work doesn't apply.
- Small changes require a full reload: the actual storage of data, the design of
aggregations, and the index designs are all impacted by a change as simple as adding
an attribute to a dimension.

Challenges with tabular models

- Model size is limited to memory capacity: you need to understand the potential impact
on server memory when your model is fully deployed and in use. A typical rule of thumb
is that you need enough memory to support the size of your model twice over (size x 2)
for normal operations. Ways to work within this limit include using multiple models to
deliver solutions, using DirectQuery for large models (although you will need to
convert your model to DirectQuery), and using composite models in Power BI.
- Subpar design experience in Visual Studio: tabular model design in Visual Studio can
be an exercise in frustration. This is magnified by the simplicity of creating similar
solutions with Power BI and Power Pivot in Excel.
- Code is contained in a single file: tabular models are effectively limited to one
developer at a time due to merge issues.

Power Query is intended to be a lightweight data mashup tool for retrieving and shaping
data prior to loading it into a model.
- This capability lets users mash the data together, which can eliminate the need for
a relational data warehouse.

- This should not be considered as a replacement for a standard extraction,
transformation, and load (ETL) process using tools such as SQL Server
Integration Services (SSIS). Power Query, while simple and flexible, struggles to
perform in large enterprise solutions

- There are still many benefits to having the transformed or shaped data stored in a
warehouse, including access by reporting tools and analysis by other analytic
tools, such as Azure Machine Learning.

Partitioning

Partitioning is the process of separating a table into sections to improve performance or
maintainability.

Partitioning larger tables can improve the processing time. Partitioning allows you to
process the partitions independently so that you can reload smaller amounts of data,
which reduces load time accordingly. This means that partitioning helps if you need to
reduce the time it takes to bring the latest data into a model of either type.

Partitioning improves query performance only in multidimensional models.

The lesson here is: do not use partitions to improve query performance in tabular
models.

In multidimensional models, partitions are only applied to measure groups or fact
tables.

You need to consider why and how you plan to implement partitions in your
model.
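The processing benefit described in this section, reloading only the partitions that changed, can be illustrated in plain Python. This is a conceptual sketch only, not Analysis Services code: the monthly partitioning scheme, the function names, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def partition_key(d: date) -> str:
    """Assign a row to a monthly partition (hypothetical partitioning scheme)."""
    return f"{d.year:04d}-{d.month:02d}"


def build_partitions(rows):
    """Group rows into partitions keyed by month."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_key(row["order_date"])].append(row)
    return dict(partitions)


def partitions_to_process(new_rows):
    """Return only the partition keys touched by newly arrived rows."""
    return sorted({partition_key(row["order_date"]) for row in new_rows})


existing = [
    {"order_date": date(2024, 1, 15), "amount": 100.0},
    {"order_date": date(2024, 2, 3), "amount": 250.0},
]
incoming = [
    {"order_date": date(2024, 3, 9), "amount": 75.0},
    {"order_date": date(2024, 3, 21), "amount": 40.0},
]

partitions = build_partitions(existing)

# Reload only the affected partitions; January and February are left untouched.
for key in partitions_to_process(incoming):
    partitions[key] = [
        row for row in existing + incoming
        if partition_key(row["order_date"]) == key
    ]

print(partitions_to_process(incoming))
print(sorted(partitions))
```

Only the March partition is rebuilt here, which is the reason partitioning shortens load time in either model type.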

Role-playing dimensions
Role-playing dimensions refer to those dimension tables that may have multiple
relationships with a fact table. The most common example is the date dimension.

This functionality is built into the multidimensional model natively.

You can add multiple relationships to the table in the data source view, and then
create multiple dimensions on the table. This allows the fact table to be sliced or
filtered by any of its dates (for example, an order date or a ship date).
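The idea of one date dimension playing several roles can be shown with an ordinary relational join. The sketch below uses Python's built-in sqlite3 module rather than Analysis Services; the table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, calendar_month TEXT);
CREATE TABLE fact_sales (
    order_date_key INTEGER REFERENCES dim_date(date_key),
    ship_date_key  INTEGER REFERENCES dim_date(date_key),
    amount REAL
);
INSERT INTO dim_date VALUES (20240131, '2024-01'), (20240202, '2024-02');
INSERT INTO fact_sales VALUES
    (20240131, 20240202, 100.0),
    (20240202, 20240202, 50.0);
""")

ALLOWED_ROLES = ("order_date_key", "ship_date_key")


def sales_by(role_column: str):
    """Total sales per month, with dim_date playing the requested role."""
    # The column name is checked against a fixed list before interpolation.
    if role_column not in ALLOWED_ROLES:
        raise ValueError(f"unknown role: {role_column}")
    query = f"""
        SELECT d.calendar_month, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_key = f.{role_column}
        GROUP BY d.calendar_month
        ORDER BY d.calendar_month
    """
    return conn.execute(query).fetchall()


by_order_date = sales_by("order_date_key")
by_ship_date = sales_by("ship_date_key")
print(by_order_date)
print(by_ship_date)
```

The same fact rows produce different monthly totals depending on which relationship to the date table is used, which is what a role-playing dimension expresses.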

OLAP versus relational concepts

The crux of this difference is that multidimensional models were designed for Online
Analytical Processing (OLAP) databases, where dimensions and facts are the normal
design pattern.

When traditional BI architects move to tabular models, they see that this is no longer
true.

Tabular models are built on relational concepts. There are strictly two dimensions:
columns and rows. This simplifies the model and makes it much easier to comprehend for
people familiar with Excel or relational databases.
Multidimensional models and OLAP models are built with the concepts of cells with
intersecting dimensionality. One of the ways we often see this being handled is by
implementing star schemas and dimensional models to be used by tabular models.
While that structure is not required, there are definitely advantages to using that design.
It simplifies the overall model and follows a known design pattern that can be supported
by many BI practitioners.
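The contrast between rows and columns on one side and cells with intersecting dimensionality on the other can be sketched with plain Python data structures. This is only an analogy under assumed sample data; neither engine stores data this way internally.

```python
from collections import defaultdict

# Relational (tabular) view: a flat table of rows and columns.
rows = [
    {"year": 2023, "region": "East", "product": "Bike", "sales": 120.0},
    {"year": 2023, "region": "West", "product": "Bike", "sales": 80.0},
    {"year": 2024, "region": "East", "product": "Helmet", "sales": 30.0},
    {"year": 2024, "region": "East", "product": "Bike", "sales": 150.0},
]

# Answering a question means filtering rows, then aggregating a column.
east_2024 = sum(
    r["sales"] for r in rows if r["year"] == 2024 and r["region"] == "East"
)

# OLAP (multidimensional) view: each value lives in a cell addressed by
# one member from every dimension (year, region, product).
cube = defaultdict(float)
for r in rows:
    cube[(r["year"], r["region"], r["product"])] += r["sales"]

# Answering a question means addressing a cell by its coordinates.
bike_east_2024 = cube[(2024, "East", "Bike")]

print(east_2024)
print(bike_east_2024)
```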

Hardware requirements

Multidimensional models require more overall performance considerations as the
data is retrieved from disk.

Tabular models only use the hard disk to store metadata and data when the server
is shut down.

When working with SSAS in multidimensional mode, we recommend that you use
SSDs.

Both systems require substantial memory for querying and processing.

Multidimensional models use the memory capacity to optimize queries through
various caching mechanisms.

With tabular models, memory is the most important consideration. The entire tabular
database must be loaded into memory, and around three to four times the size of the
database will be required to properly support a tabular model.

Both modes need high-speed CPUs with onboard caching to support query and processing
operations.

Because both model types use parallel processing techniques, more cores are preferred.

Once you have reached the peak of performance with the hardware capability, we
recommend scaling out Analysis Services rather than scaling up. Scaling up, or
increasing hardware capacity, only solves some of the issues that you may experience
with your deployments.

You should never put instances of both models on the same machine (bare metal or
virtual).
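The tabular memory guidance in this section (around three to four times the size of the database) can be expressed as a quick sizing check. This is a minimal sketch of that rule of thumb only; the function names and the 50 GB example are hypothetical, and it is not a substitute for measuring a deployed model.

```python
def tabular_memory_range_gb(database_gb: float,
                            low_factor: float = 3.0,
                            high_factor: float = 4.0) -> tuple:
    """Return the (low, high) memory estimate for a tabular database."""
    return database_gb * low_factor, database_gb * high_factor


def server_is_large_enough(database_gb: float, server_memory_gb: float) -> bool:
    """Conservative check: require the high end of the estimated range."""
    _, high = tabular_memory_range_gb(database_gb)
    return server_memory_gb >= high


low, high = tabular_memory_range_gb(50.0)  # hypothetical 50 GB tabular database
print(f"plan for roughly {low:.0f} to {high:.0f} GB of memory")
print(server_is_large_enough(50.0, 256.0))
print(server_is_large_enough(50.0, 128.0))
```

The check deliberately uses the upper factor so that a server which only meets the lower bound is flagged for review.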

Choosing the model type for business-specific reasons

Rapid development and change
- Multidimensional mode: Multidimensional models require a solid star schema, which
involves more complex ETL solutions as well. This design requirement also slows down
the ability to handle business changes quickly.
- Tabular mode: Tabular models are by far the better option to support frequent changes.
The primary factors for this recommendation include the ability to retrieve and shape
data using Power Query.

Cloud readiness
- Multidimensional mode: Multidimensional models are not natively supported structures
in Azure and require virtual machines.
- Tabular mode: Azure Analysis Services is built on the same technology as Power BI and
tabular models. This means that tabular models are much easier to migrate to Azure when
you are ready for a full cloud solution. The cloud supports tabular features.

Complex analysis
- Multidimensional mode: Multidimensional models have a significant amount of maturity.
This includes support for complex analytical scenarios. One such scenario concerns
financial analysis, including balance sheets, charts of account, and other financial
statements that change signs as the data is rolled up.
- Tabular mode: There is currently no simple, built-in functionality to support this
type of reporting in tabular models.

Client tools
- Multidimensional mode: In many cases, the tools are sending MDX to both models.
Analysis Services has the ability to use this for both model types, but you should
understand the tooling so that you can make an informed decision.
- Tabular mode: Power BI, for example, supports both model types well with Live
Connection capabilities. Excel is best equipped for MDX, which it has been doing for
years.
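The financial scenario mentioned under complex analysis, statements that change signs as data is rolled up a parent-child hierarchy, can be sketched in plain Python. This illustrates the concept only and is not Analysis Services syntax: the account names, the amounts, and the "+"/"-" operator column are assumptions made for the example.

```python
# Each account: (parent, operator applied when rolling up into the parent,
# leaf amount or None for a parent account). All values are hypothetical.
accounts = {
    "Net Income": (None, "+", None),
    "Revenue": ("Net Income", "+", 1000.0),
    "Expenses": ("Net Income", "-", None),
    "Salaries": ("Expenses", "+", 400.0),
    "Rent": ("Expenses", "+", 100.0),
}


def rollup(name: str) -> float:
    """Return an account's value, rolling up children with their signs."""
    _, _, amount = accounts[name]
    if amount is not None:
        return amount  # leaf account: use its own amount
    total = 0.0
    for child, (child_parent, child_op, _) in accounts.items():
        if child_parent == name:
            value = rollup(child)
            # A "-" operator flips the sign as the child rolls into its parent.
            total += value if child_op == "+" else -value
    return total


print(rollup("Expenses"))
print(rollup("Net Income"))
```

Expenses add up normally among themselves but are subtracted when they roll into Net Income, which is the sign change that the notes describe.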

💡 While tabular models are supported in Azure Analysis Services and Power BI
Premium, not all features in SQL Server 2019 Analysis Services are available
in the online servers. We highly recommend that you do not commit to a
purely cloud solution until you have tested the features you are using. In most
cases, you will not have feature issues. The other area to consider is the size
of your model. At the time of writing, Azure Analysis Services models are
limited to 400 GB.

You can also use virtual machines to support larger tabular models or use
features not currently available in true cloud deployments.
