Download as pdf or txt
Download as pdf or txt
You are on page 1of 36

SPICA Join Path problems

DATA SYSTEMS

A join path is a series of joins that a query can use to


access data in the tables linked by the joins.
Join path problems can arise from the limited way that
dimension and fact tables are related in a relational
database.
The three major join path problems that you encounter
when designing a schema –
• loops
• chasm traps
• fan traps
SPICA Loops Explained
DATA SYSTEMS

• In a relational database schema, a common type of join


path that returns too few rows is called a loop.

• Loops create additional conditions in the generated SQL


statement. Additional statements are added to the where
clause, producing in most cases an incorrect final result
set of a Business Objects query (too few rows).

• A loop is a set of joins that defines a closed path through


a set of tables in a schema. Loops occur when joins form
multiple paths between lookup tables.
SPICA Loops
DATA SYSTEMS
SPICA Loop returns incorrect results
DATA SYSTEMS
SPICA Run following Query
DATA SYSTEMS

For each resort country, give me the number of


guests from each country that
stay at each resort.
SPICA Expected Result
DATA SYSTEMS
SPICA Actual Result
DATA SYSTEMS
SPICA Reason
DATA SYSTEMS

The Where clause created by the loop is shown below:


WHERE
( Country.country_id=Resort.country_id )
AND ( Resort.resort_id=Service_Line.resort_id )
AND ( Service_Line.sl_id=Service.sl_id )
AND ( Service.service_id=Invoice_Line.service_id )
AND ( Sales.inv_id=Invoice_Line.inv_id )
AND ( Customer.cust_id=Sales.cust_id )
AND ( City.city_id=Customer.city_id )
AND ( Region.region_id=City.region_id )
AND ( Country.country_id=Region.country_id )
AND ( Service_Line.service_line = 'Accommodation' )
SPICA Joins causing the problem
DATA SYSTEMS

The following two joins are both applying a


restriction to the Country table:
• Country.country_id=Resort.country_id
• Country.country_id=Region.country_id
SPICA Solutions
DATA SYSTEMS

• Create alias of affecting dimensions.

• Create Contexts for join paths.


SPICA Aliases Defined
DATA SYSTEMS

• An alias is an alternate name for a database


table.

• It is in some way similar to database view.


SPICA Create Alias for the Country
DATA SYSTEMS
SPICA Where clause after alias
DATA SYSTEMS

WHERE
( City.city_id=Customer.city_id )
AND ( City.region_id=Region.region_id )
AND ( Country_Region.country_id=Region.country_id )
AND (Country.country_id=Resort.country_id )
AND ( Customer.cust_id=Sales.cust_id )
AND ( Invoice_Line.inv_id=Sales.inv_id )
AND ( Invoice_Line.service_id=Service.service_id )
AND ( Resort.resort_id=Service_Line.resort_id )
AND ( Service.sl_id=Service_Line.sl_id )
AND ( Service_Line.service_line = 'Accommodation' )
SPICA Disadvantage of Aliases
DATA SYSTEMS

• Multiple copies of Similar Objects


SPICA Context Defined
DATA SYSTEMS

A context resolves a loop by defining a set of


joins that specify one specific path through
tables in a loop. It ensures that joins are not
included from different paths within the same
SQL query.

You often use contexts in schemas that contain


multiple fact tables (“multiple stars”) that share
lookup tables and query requires data from only
one fact table at a time.
Consider Following schema
SPICA with Loop
DATA SYSTEMS
SPICA Alias doesn’t solve the problem
DATA SYSTEMS

If you created an alias for


the Customer so that you
had a Customer to
Reservation join and a
Customer_Sales to Sales
join, you break the loop,
but if you want to add a
City table to the schema,
you end up with a loop
again as shown below:
SPICA No end to Aliasing
DATA SYSTEMS

You must continue creating aliases for


each new table you add to the schema.

This is difficult to maintain, and also ends


up proliferating the number of similar
objects using each table in the universe.
SPICA Create Context
DATA SYSTEMS

The only way to resolve this loop is to


leave the loop in place, and create a
context that specifies one or the other path
around the schema. This ensures that
queries answer questions for one
transaction or the other, such as:
Is the customer information needed from
the perspective of sales or reservations?
SPICA Designer Detects two contexts
DATA SYSTEMS

• In the example, you can follow two different


paths from the Customer table to the
Service table:
SPICA Reservation Line Context
DATA SYSTEMS
SPICA Sales Line Context
DATA SYSTEMS
Automatically Identifying and
SPICA Resolving Loops
DATA SYSTEMS

You can use Designer to automatically


detect loops and propose candidate
aliases and contexts that you can insert in
your schema to resolve the loops.

Cardinalities must be set before


detecting loops
SPICA Chasm Trap
DATA SYSTEMS

A chasm trap is a type of join path between three tables


when two "many-to-one“ joins converge on a single
table, and there is no context in place that separates the
converging join paths (The three tables have been
separated from the rest of the schema to illustrate the
chasm trap). Chasm Trap inflates no. of rows in the
query.

Contd…
Incorrect results only when all the
following conditions exist
SPICA
DATA SYSTEMS
Results of two
SPICA separate Queries
DATA SYSTEMS
SPICA Result of combined Query
DATA SYSTEMS
SPICA Reason
DATA SYSTEMS

The number of guests that have used, and


future guests who have reserved to use
the Sports service has increased
considerably.

A Cartesian product has been returned


and the results are incorrect.
Resolving a Chasm Trap
SPICA
DATA SYSTEMS

The second option works when fact table has only numerical columns. If
the fact table has descriptive columns, context is the only way to resolve
Chasm Trap. Junk dimension should be designed for descriptive
elements of fact table, in this way contexts can be avoided.
SPICA Fan Traps
DATA SYSTEMS

A fan trap is a type of join path between three tables


when a “one-to many” join links a table which is in turn
linked by another “one-to-many” join. The fanning out
effect of “one-to-many” joins can cause incorrect results
to be returned when a query includes objects based on
both tables.
SPICA Correct Result- w/o Model Id
DATA SYSTEMS
Incorrect result after
SPICA adding Model Id
DATA SYSTEMS
SPICA Reason
DATA SYSTEMS

Sales is not available at Model Id Level


SPICA Solution
DATA SYSTEMS
SPICA Correct Results
DATA SYSTEMS
SPICA Exercise # 2
DATA SYSTEMS

1. Insert product_promotion_facts &


promotion_lookup tables into the existing
universe.
2. Insert joins & cardinalities.
3. Create required Classes & Objects for these
tables.
4. Connect shop_facts & product_promotion_facts
using shortcut join.
5. Check integrity & solve Loops if any.
6. Detect Loops and Contexts
7. Manually create Aliases, Contexts

You might also like