Index Aware
The information in this document is subject to change without notice. If you find any problems with this article, please
report them using the feedback link on this site or at techweb@businessobjects.com. Business Objects does not warrant
that this document is error free.
Trademarks:
The Business Objects logo, BusinessMiner, BusinessQuery, and WebIntelligence are registered trademarks of Business
Objects S.A.
The Business Objects tagline, Broadcast Agent, BusinessObjects, Personal Trainer, Rapid Deployment Templates, and
Set Analyzer are trademarks of Business Objects S.A.
Microsoft, Windows, Windows NT, Access, Microsoft VBA, the Visual Basic Logo and other names of Microsoft products
referenced herein are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or
other countries.
Oracle is a registered trademark of Oracle Corporation. All other names of Oracle products referenced herein are
trademarks or registered trademarks of Oracle Corporation.
All other product and company names mentioned herein are the trademarks of their respective owners.
This software and documentation is commercial computer software under Federal Acquisition regulations, and is
provided only under the Restricted Rights of the Federal Acquisition Regulations applicable to commercial computer
software provided at private expense. The use, duplication, or disclosure by the U.S. Government is subject to
restrictions set forth in subdivision (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at
252.227-7013.
Pre-requisites
The reader should be very familiar with Business Objects universe and report building concepts.
Familiarity with RDBMS concepts and database performance tuning would also be advantageous.
Disclaimer
The functionality and features detailed herein are based on the author’s understanding of the software version
stated earlier. Every attempt has been made to create an accurate document by testing workflows and
confirming understanding with corporate development, but the reader assumes all risk in using the
information contained herein.
Resources needed
The following documents or files will be needed to perform the task:
Database Index
A database construct which allows faster access to rows in a table. Indexes are defined by the database
designer and maintained automatically by the RDBMS. Since indexes are maintained in a sorted sequence, the
database can quickly locate a row or rows in the table.
Primary Key
A database designer defined column or columns which uniquely identify a row in a table. If the RDBMS is
told about the Primary Key during database design, it will automatically maintain an index to help it guarantee
uniqueness and provide referential integrity.
Foreign Key
A database designer defined column or columns which match the Primary Key of another table. This facilitates
referential integrity and provides likely columns for joins to a table that contains the Primary Key or other
tables with Foreign Key instances.
Surrogate keys
A common Data Warehouse type of column where the actual Primary Key and Foreign Key values are
referenced via a smaller, faster column type, i.e. a number instead of a longer character type.
This allows joins and indexes to be faster because the indexes are smaller (same rows, less width) and
therefore less memory and/or disk space is consumed during their navigation.
Partitions
A way of dividing the physical layout of a table across separate disk locations to avoid contention when
accessing the table via the disk system. Usually the partitioning is based on a commonly accessed dimension
such as time, e.g. breaking a fact table into separate partitions based on the year/month of the transactions.
Referential Integrity
This is not really a performance tuning term but it is related because this RDBMS feature requires Foreign Key
and Primary Keys to exist.
There is no guarantee that the database optimizer will use any specific performance-improving construct
just because the user has provided the opportunity for it to do so, e.g. just because an index could be
used does not mean it will be used. The database optimizer may decide that other rules, limits and conditions
apply and choose a different route to the data.
Although Index Awareness will help the optimizer make a better decision, it is no guarantee that a Primary Key
or Foreign Key entry (see later in document) will cause the index to be used.
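One way to see what an optimizer actually chose is to ask the database for its query plan. The sketch below is
only a stand-in under stated assumptions: it uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY
PLAN rather than the RDBMS behind a real universe, and the small City table is modelled on the ‘club.mdb’
values used in this document. Other databases expose similar information through their own plan tools, and
their choices may differ from SQLite's.

```python
import sqlite3

# Illustrative stand-in only: SQLite in memory, not the database behind a real universe.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE City (city_id INTEGER PRIMARY KEY, city TEXT, region INTEGER)")
conn.executemany(
    "INSERT INTO City VALUES (?, ?, ?)",
    [(10, "Houston", 20), (11, "Dallas", 20), (12, "San Francisco", 21),
     (13, "Los Angeles", 21), (14, "San Diego", 21)],
)

def plan(sql):
    """Return the plan steps SQLite reports (the 4th column of each plan row)."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Restriction on the non-indexed display column.
name_plan = plan("SELECT city FROM City WHERE city IN ('Dallas', 'Houston')")
# Restriction on the primary key column.
key_plan = plan("SELECT city FROM City WHERE city_id IN (11, 10)")

print(name_plan)  # describes a scan of the whole City table
print(key_plan)   # describes a search using the primary key
```

The exact wording of the plan text varies between SQLite versions, so treat it as a diagnostic to read, not a
value to depend on.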
Tuning philosophy
Although no one disputes that using the Primary Key of a table is a good thing, there seem to be two opinions
on whether tuning should avoid dimension tables or use them explicitly.
Avoiding dimension tables: once the restrictions have been processed by the fact table, the database will join
out to the smaller dimension tables.
If this is your intention, then you should set up your Index Awareness with Foreign Key entries that will
reduce the number of tables when the weightiest tables are involved.
Using dimension tables explicitly: leaving conditions on the dimension tables causes the database to restrict
the smaller dimension tables first and then join into the fact table with fewer rows.
If this is your intention, then you should set up your Index Awareness with fewer Foreign Key entries that
directly point to the larger tables in the schema.
The performance improvements discussed in this document are theoretical. Because the ‘club.mdb’ is so small
and Microsoft Access is not considered a production capable database, it is unlikely you will notice any
performance improvement in running the small queries that ‘club.mdb’ is capable of.
Any LOVs shown in this document that involve editing the LOV (DESIGNER\OBJECT
PROPERTIES\PROPERTIES\EDIT) have sorts applied to the objects to allow the hierarchical view of the
LOV dialog to display values without duplicates. This step is discussed in another document by the same
author.
Note : MS Access may have an ‘explain plan’ feature (mentioned earlier) but this was not investigated by the
author.
Now add a condition that restricts it to return only customers from Dallas, Houston, Los Angeles, San Diego
and San Francisco.
But what if we knew that the ‘city’ column is not indexed and is therefore not a good candidate for WHERE
clause restrictions (see schema description earlier)?
We also know that column ‘city_id’ is the primary key of the city table and, more importantly, that it is
indexed. From our earlier discussion of good performance, we determined that using the primary key of a
table is one of the fastest ways of retrieving the rows of that table.
So if we could get the SQL to automatically use the Primary Key of the city table, we could achieve a faster
retrieval of the records.
Note : Of course we could change the object’s SELECT definition to point to that Primary Key column
(city_id), but that would mean expecting users to know what those Primary Key values mean. Since Primary Key
and Foreign Key values are typically based on surrogate keys, and surrogate keys are automatically allocated
by the ETL or the RDBMS itself, an end user will not normally know which Primary Key or Foreign Key value
relates to a real-life value that they understand.
SELECT
  Customer.last_name,
  sum(Invoice_Line.days * Invoice_Line.nb_guests * Service.price)
FROM
  Customer,
  Invoice_Line,
  Service,
  City,
  Sales
WHERE
  ( City.city_id=Customer.city_id )
  AND ( Customer.cust_id=Sales.cust_id )
  AND ( Sales.inv_id=Invoice_Line.inv_id )
  AND ( Invoice_Line.service_id=Service.service_id )
  AND (
    City.city_id IN (11, 10, 13, 14, 12)
  )
GROUP BY
  Customer.last_name
Notice that the ‘city.city’ column reference has been replaced by the ‘city.city_id’ column and that the city
names have been replaced with the primary key values that represent their name.
This query will likely run much faster because the main condition is now on an indexed column.
Note : if the SQL does not show any change, this could be a sporadic bug noticed by the author, where the SQL
would only change after the LOV dialog had been visited and OK’d, whether or not any changes were made to the
LOV.
Without Index Awareness the condition would have used ‘City.city’; with Index Awareness it uses
‘City.city_id’ instead.
But how did Business Objects convert the City names you selected in the LOV into the primary key values
necessary for the SQL? Remember that the City table has the following values (that are relevant to our
example) :
City table
City_id (Primary Key)   City            Region
11                      Dallas          20
10                      Houston         20
13                      Los Angeles     21
14                      San Diego       21
12                      San Francisco   21
It did this by adding the Primary Key SELECT to the query it normally generates for the LOV. When you
selected the City names, Business Objects matched them to the corresponding Primary Key values.
The SQL generated for a non-Index Aware LOV involving City would be:
SELECT DISTINCT City.city FROM City
But the SQL generated for the LOV on the Index Aware City object was:
SELECT DISTINCT City.city, City.city_id FROM City
So when you selected the City names from the LOV dialog, you were indirectly selecting the Primary Key
values as well. Think of the Primary Key values as a hidden column in the LOV dialog.
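The ‘hidden column’ idea can be sketched in a few lines. This is a rough model of the behaviour described
above, not Business Objects code: it assumes Python's built-in sqlite3 module as a stand-in database, and the
variable names (lov_rows, hidden_keys, picked) are hypothetical. It runs the Index Aware LOV query, keeps the
key alongside each display value, and substitutes the keys into the condition.

```python
import sqlite3

# Stand-in database holding the City values listed in the text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE City (city_id INTEGER PRIMARY KEY, city TEXT, region INTEGER)")
conn.executemany(
    "INSERT INTO City VALUES (?, ?, ?)",
    [(11, "Dallas", 20), (10, "Houston", 20), (13, "Los Angeles", 21),
     (14, "San Diego", 21), (12, "San Francisco", 21)],
)

# The Index Aware LOV query from the text: display value plus hidden key.
lov_rows = conn.execute("SELECT DISTINCT City.city, City.city_id FROM City").fetchall()
hidden_keys = {city: city_id for city, city_id in lov_rows}

# What the user picks in the LOV dialog (names only).
picked = ["Dallas", "Houston", "Los Angeles", "San Diego", "San Francisco"]

# Substitute the hidden keys for the names when building the condition.
keys = [hidden_keys[name] for name in picked]
condition = "City.city_id IN ({})".format(", ".join(str(k) for k in keys))
print(condition)  # City.city_id IN (11, 10, 13, 14, 12)
```

A plain dictionary works here only because each City name maps to exactly one key.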
Summary
Index Awareness allowed us to automatically redirect a WHERE clause condition to another column (on the
same table for this example) that we know would provide better performance at query time.
We determined which column to choose as an alternative based on our knowledge of the database schema and
the RDBMS optimizer.
The LOV values we select tell Business Objects which Primary Key values to substitute in the final query
SQL. The KEYS tab tells Business Objects which SQL syntax to substitute in the final query SQL.
Rule – when an Index Aware object is used in the CONDITIONS pane, the
Primary Key entry will replace the object’s SELECT.
Rule – the operand ‘Show list of values’ dialog returns the Primary Key values
that match the visible values in the LOV dialog.
The SQL from the last example improved performance by restricting on the Primary Key (indexed) of the city
table.
SELECT
  Customer.last_name,
  sum(Invoice_Line.days * Invoice_Line.nb_guests * Service.price)
FROM
  Invoice_Line,
  Sales,
  City,
  Customer,
  Service
WHERE
  ( City.city_id=Customer.city_id )
  AND ( Customer.cust_id=Sales.cust_id )
  AND ( Sales.inv_id=Invoice_Line.inv_id )
  AND ( Invoice_Line.service_id=Service.service_id )
  AND (
    City.city_id IN (11, 10, 13, 14, 12)
  )
GROUP BY
  Customer.last_name
Both the Customer and City tables appear in this query, and both can provide the City_id column.
The City table is only needed to satisfy the WHERE clause and is not needed in the SELECT or GROUP BY
clauses.
Is it possible to remove the City table from the query completely, i.e. to tell Business Objects to get
City_id from the Customer table whenever it can?
City_id can come from more than one table
SELECT
  Customer.last_name,
  sum(Invoice_Line.days * Invoice_Line.nb_guests * Service.price)
FROM
  Invoice_Line,
  Sales,
  Customer,
  Service
WHERE
  ( Customer.cust_id=Sales.cust_id )
  AND ( Sales.inv_id=Invoice_Line.inv_id )
  AND ( Invoice_Line.service_id=Service.service_id )
  AND (
    Customer.city_id IN (11, 10, 13, 14, 12)
  )
GROUP BY
  Customer.last_name
Note that the City table is no longer referenced in the query and that the City_id is being restricted using the
Customer table instead.
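To check that dropping the City table does not change the answer, the two forms of the query can be run side
by side. The sketch below is a simplified illustration using Python's built-in sqlite3 module: the customer
names and amounts are invented, and the hypothetical Sales.amount column stands in for the Invoice_Line and
Service revenue arithmetic so that the example stays short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE City (city_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE Customer (cust_id INTEGER PRIMARY KEY, last_name TEXT,
                       city_id INTEGER REFERENCES City(city_id));
-- 'amount' is a hypothetical stand-in for the invoice line revenue calculation.
CREATE TABLE Sales (inv_id INTEGER PRIMARY KEY, cust_id INTEGER, amount REAL);
INSERT INTO City VALUES (10, 'Houston'), (11, 'Dallas'), (25, 'Albertville');
INSERT INTO Customer VALUES (1, 'Baker', 10), (2, 'Brendt', 11), (3, 'Dupont', 25);
INSERT INTO Sales VALUES (100, 1, 250.0), (101, 1, 100.0), (102, 2, 75.0), (103, 3, 500.0);
""")

# Primary Key entry: the City table is joined in only to carry the restriction.
with_city = """
SELECT Customer.last_name, sum(Sales.amount)
FROM Customer, City, Sales
WHERE City.city_id = Customer.city_id
  AND Customer.cust_id = Sales.cust_id
  AND City.city_id IN (11, 10)
GROUP BY Customer.last_name
ORDER BY Customer.last_name
"""

# Foreign Key entry: same restriction on Customer.city_id, City table dropped.
without_city = """
SELECT Customer.last_name, sum(Sales.amount)
FROM Customer, Sales
WHERE Customer.cust_id = Sales.cust_id
  AND Customer.city_id IN (11, 10)
GROUP BY Customer.last_name
ORDER BY Customer.last_name
"""

rows_pk = conn.execute(with_city).fetchall()
rows_fk = conn.execute(without_city).fetchall()
print(rows_pk)  # [('Baker', 350.0), ('Brendt', 75.0)]
print(rows_fk)  # [('Baker', 350.0), ('Brendt', 75.0)]
```

The two forms agree only while every Customer.city_id has a matching row in City. A customer with an orphaned
city_id would appear in the second query but not in the first, which is one reason referential integrity
matters here.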
When we check the SQL of the LOV for the City object we see it remains as it was in example 1 i.e. it has the
Primary Key SELECT added to it.
So we can conclude that the LOV SQL is only affected by the Primary Key entry and that the LOV selection
will always return the Primary Key entries ‘behind the scenes’. Therefore all columns you provide in Foreign
Key entries must have the same type of values and column type as the column referenced by the Primary Key
entry.
If the Primary Key table is still needed in the query’s RESULTS, SORTS or CONDITIONS in a way that does not
involve Index Awareness on the same table, then the table will remain in the query and the Foreign Key entry
will not be applied.
City object used in the RESULTS causing the Foreign Key entry not to be used i.e. City table remains in the FROM
clause and is used in the WHERE clause (restricting on its Primary Key)
But if the Primary Key table (the original object’s table) is not needed anywhere but the WHERE clause, then
the Foreign Key entry will be applied:
Example of Foreign Key being applied because Primary Key table not being needed anywhere else in the query
Rule – a Foreign Key entry will be ignored if it does not result in fewer tables
being used. Business Objects can only use fewer tables if the Primary Key table
is referenced only inside the WHERE clause.
Database designers may denormalize Primary Key values beyond what is necessary to satisfy constraints, to
assist performance and simplify SQL generation.
It is normal to see an entity appear twice: first in the entity’s own table as its Primary Key, and then in a
table that refers to this Primary Key through its Foreign Key.
Denormalization implies that the database designer has gone beyond this duplication and has more than 2
instances of the value in a schema.
If we search the ‘club.mdb’ (Islands Resorts Marketing) database to find such entities, we will find only one,
i.e. sales person.
Tables list
Sponsor.sales_id
Sales_Person.sales_id
Customer.sales_id
The Sponsor table is used to restrict on Sales Person (sales_id) because it is one of the Foreign Key entries and
the Sales_Person table is not used anywhere else in the query so can be dropped by Business Objects.
But why does Business Objects use the Sponsor table and not the Customer table, when both tables exist in the
FROM clause? Perhaps it is the sequence of Foreign Key entries in the KEYS dialog. If we rearrange them so
that they are listed as follows:
The SQL has changed again, this time using the Customer table to restrict Sales Person via sales_id.
Rule – the sequence of Foreign Key entries in the KEYS dialog determines
which entry is given preference i.e. the last enabled entry in the list that
results in the least number of tables.
Rule - If the ‘best’ Foreign Key entry does not result in a table count reduction
compared to the original query, the Primary Key entry will be the only one
that applies
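The two rules above can be restated as a small function. This is only a model of the rules as the author
observed them, not Business Objects code; the function name choose_entry and its parameters are hypothetical,
and the tables in the usage lines are taken from the Sales Person example.

```python
def choose_entry(other_tables, pk_table, fk_entries):
    """Pick the table whose key column carries the restriction.

    other_tables: tables the query needs apart from the condition itself.
    pk_table:     table referenced by the Primary Key entry.
    fk_entries:   (table, enabled) pairs in KEYS dialog order.
    """
    baseline = len(set(other_tables) | {pk_table})
    best_table, best_count = pk_table, baseline
    for table, enabled in fk_entries:
        if not enabled:
            continue
        count = len(set(other_tables) | {table})
        # Must reduce the table count; '<=' lets a later entry win a tie,
        # matching "the last enabled entry in the list".
        if count < baseline and count <= best_count:
            best_table, best_count = table, count
    return best_table

needed = ["Customer", "Sponsor"]
print(choose_entry(needed, "Sales_Person", [("Customer", True), ("Sponsor", True)]))  # Sponsor
print(choose_entry(needed, "Sales_Person", [("Sponsor", True), ("Customer", True)]))  # Customer
# Primary Key table needed elsewhere in the query: no reduction, so the Primary Key entry applies.
print(choose_entry(needed + ["Sales_Person"], "Sales_Person", [("Customer", True)]))  # Sales_Person
```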
Then build an example query (using Customer and Revenue as RESULTS and Service IN LIST <list of
values>):
Why does the LOV have duplicates? Normally LOVs have a DISTINCT on them and
no duplicates are allowed, as shown in the screenshot on the right:
When we look at the service table, we see the duplicates there as well e.g. Hotel Room is repeated for SL_ID =
21, 31 and 41. SL_ID refers to Service Line.
So how can we help a user navigate to the correct value in a LOV when there are duplicates? Easy: we use a
normal customized LOV in Designer.
EDIT the LOV, add the Service Line object, then apply a primary ascending sort to the Service Line object and
a secondary ascending sort to the Service object. Note : the sorts are only necessary if the LOV is to be
viewed in Hierarchical View.
If we use this LOV as it is with the Primary Key entry, we’ll see the following SQL generated (based on
original query RESULTS=Customer, Revenue CONDITIONS=Service IN LIST <values>):
The Primary Key values from the Service table have been selected (behind the scenes) and put into the SQL. These values
represent all the ‘Accommodation’ Service Line values. This is as expected.
Although the duplication is annoying, selecting all the Accommodation values while in Tabular View (in this
example) does result in the correct SQL and therefore the correct results.
This time only 3 Primary Key values have been selected, i.e. 212, 211 and 213, which represent Bungalow, Hotel
Room and Hotel Suite where SL_ID=21. This means using the Hierarchical View instead of the Tabular View would
give different results.
Clearly it would be unacceptable to users that whether they got the correct results depended on whether they
used Tabular View or Hierarchical View.
Note : Even before Index Aware, the Hierarchical View would always apply a further DISTINCT on the display
values. The problem only arises now because, with Index Aware, we are not actually limiting the query by the
values we select in the dialog but by the Primary Key values being selected indirectly. See a separate article
by the author on the LOV dialog.
We look at the data and see that the Service name is not really unique in its own
table. That table is actually a ‘Service within Service Line’ table. So we look at the
Service Line table and discover that the Service Line name is not unique without being
qualified by the Resort at which the Service Line is available.
We could look beyond Resort because it is shown with its country, but the Resort
name is actually unique across all countries. Although we determined this by
looking at the data, we really should confirm it by talking to the schema designer
as well, since this may be test data which may not reflect all the true relationships in
the lifetime of the database.
Tabular View – selecting the Bungalow, Hotel Room and Hotel Suite values for Bahamas Beach and French Riviera
generates the correct SQL much like before.
Rule - The LOV dialog matches the values on the first match it can find and
does not take into consideration that previous Primary Key values were
remembered.
Clearly the problem is that the LOV dialog matches on the first occurrence of the values selected and does not
take the Primary Key value into account.
Rule – use of Primary Key entries in LOV can only be used with accuracy when
the value is unique e.g. customer social security number, product code.
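The first-match problem can be reproduced with a few lines of plain Python. This is a model of the behaviour
described above, not the LOV dialog's actual code. The keys 211, 212 and 213 come from the example; the keys
shown for the other two Service Lines (31x and 41x) are hypothetical, as are the function names.

```python
# LOV rows as (display value, hidden primary key).
# 211-213 are from the text; the 31x and 41x keys are hypothetical.
lov_rows = [
    ("Bungalow", 212), ("Hotel Room", 211), ("Hotel Suite", 213),
    ("Bungalow", 312), ("Hotel Room", 311), ("Hotel Suite", 313),
    ("Bungalow", 412), ("Hotel Room", 411), ("Hotel Suite", 413),
]

def keys_first_match(selected):
    """Model of the observed behaviour: each display value resolves to its first key only."""
    first = {}
    for value, key in lov_rows:
        first.setdefault(value, key)  # later duplicates are ignored
    return [first[value] for value in selected]

def keys_all_matches(selected):
    """What the user probably intended: every key whose display value was selected."""
    return [key for value, key in lov_rows if value in selected]

selected = ["Bungalow", "Hotel Room", "Hotel Suite"]
print(keys_first_match(selected))  # [212, 211, 213]
print(keys_all_matches(selected))  # all nine keys
```

When every display value is unique the two functions return the same keys, which is why the rule above
restricts Primary Key entries to unique values.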
Let’s build a query with Customer and Revenue in the RESULTS pane.
First of all we’ll see what the SQL will be without Index Aware being active:
Notice that the SELECT of the query has changed, a MAX function has been added to the Customer.last_name
(the Customer object did not have a MAX function in its SELECT definition).
Also the GROUP BY clause has been changed i.e. Customer.last_name replaced with Customer.cust_id
Now, a real database (I apologize to the designer of the ‘club.mdb’ demo database!) would have last names
that are not unique.
The row counts for the 2 refreshes are identical. The first refresh had Index Aware disabled and the second
has it enabled.
20 rows are returned when Index Awareness is not applied but 21 are returned when it is applied. The
difference in the result set is that the Okumura last name has been aggregated at the database level when
Index Aware is not enabled, but when Index Awareness was enabled both Okumura records were returned from the
database and optionally aggregated locally within the report.
So this means that Index Awareness can affect result sets and can cause less aggregation to be performed on
the database, which is exactly where we’d want aggregation to be performed to reduce the load on Reporter
(BUSOBJ.EXE).
Rule – when Index Aware objects are used in the RESULTS pane their Primary
Key entries can cause additional rows to be returned from the database.
Report results should remain the same if no local reporter calculations
depend on the row count and level of uniqueness of the result set
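The change in row count can be reproduced on any database. The sketch below uses Python's built-in sqlite3
module with invented rows: two different customers share the last name ‘Okumura’, and the hypothetical
Sales.amount column stands in for the Revenue calculation. The second query mirrors the SQL shape described
above (MAX on the name, GROUP BY on the key).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (cust_id INTEGER PRIMARY KEY, last_name TEXT);
CREATE TABLE Sales (inv_id INTEGER PRIMARY KEY, cust_id INTEGER, amount REAL);
-- Two different customers share a last name (ids and amounts are made up).
INSERT INTO Customer VALUES (1, 'Okumura'), (2, 'Okumura'), (3, 'Baker');
INSERT INTO Sales VALUES (100, 1, 10.0), (101, 2, 20.0), (102, 3, 5.0);
""")

# Without Index Awareness: grouped by the display column.
by_name = conn.execute("""
    SELECT Customer.last_name, sum(Sales.amount)
    FROM Customer, Sales
    WHERE Customer.cust_id = Sales.cust_id
    GROUP BY Customer.last_name
""").fetchall()

# With Index Awareness: MAX on the display column, grouped by the key.
by_key = conn.execute("""
    SELECT max(Customer.last_name), sum(Sales.amount)
    FROM Customer, Sales
    WHERE Customer.cust_id = Sales.cust_id
    GROUP BY Customer.cust_id
""").fetchall()

print(len(by_name), len(by_key))  # 2 3
```

The grand total is the same in both result sets; only the level at which the database aggregates differs.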
So far all our examples have ignored the WHERE clause part of the Primary Key and Foreign Key entries
within the KEYS dialog.
First, let’s see what happens when you add a WHERE clause to the Primary Key entry.
The WHERE clause SQL gets added to the LOV query for that object:
So far I cannot think of a use for adding WHERE clauses to the Primary Key entries. I would assume that if the
original object has a WHERE clause, you would want the Primary Key WHERE clause to replace it, much as
the Primary Key SELECT clause replaces the original SELECT clause. But this means the universe developer
would need to look up the Primary Key values in advance.
Note : TODO CHECK THIS STATEMENT If these Primary Key values are based on surrogate keys there is no
guarantee that a database reload will maintain the same surrogate key values.
Based on:
We get:
Again, right now I cannot think of a use for the addition of the WHERE clause entry.
Rules:
• WHERE clause entries for Primary Key and Foreign Key entries are
ANDed to the query once the SELECT entry has been implemented.
• For Primary Key entries, WHERE clauses must reference the same tables as
the Index Aware object’s original tables list
• For Foreign Key entries, WHERE clauses should try to reference the same
tables as the Index Aware object’s original tables list, to avoid
introducing more complexity into the query
Where a table does not have one column which uniquely identifies the rows but instead relies on more than
one column, this is a ‘multi-column primary key’
Note : this is not to be confused with a ‘concatenated index’ or ‘composite index’, which is a database index
that is made up of more than one column in a table.
There are no examples of this in the ‘club.mdb’ so we have to create our own schema or make believe using
‘club.mdb’. To save time let’s make believe…
Since the KEYS dialog only allows one Primary Key entry we have to make that one entry relate to our multi-
column primary key. So we define a Primary Key entry as “Country_id & Resort.Resort_id” (the & is MS
The value of 61 comes from the concatenation of ‘6’ and ‘1’. ‘6’ represents the country_id for Australia and
‘1’ represents the resort_id for ‘Australian Reef’.
So although the Index Awareness feature allows SELECT clauses that span multiple columns (or even tables) it
is unlikely that the database indexes will do the same. You should avoid multi-column primary key entries
and/or discuss indexes with your DBA. Surrogate keys are a good way around multi-column primary keys in
database design and your database designer should consider them
Rule – Index Awareness is best used with single column Primary Keys. Multi-
column primary keys should be avoided and/or your DBA should consider
‘function based indexes’ or surrogate keys
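The concern about indexes can be illustrated with a query plan. The sketch below rests on several
assumptions: it uses Python's built-in sqlite3 module as a stand-in, a make-believe Resort table with a
two-column primary key, and SQLite's || operator in place of the MS Access & operator. Your own RDBMS may
plan these queries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Make-believe schema: a two-column primary key on Resort.
CREATE TABLE Resort (country_id INTEGER, resort_id INTEGER, resort TEXT,
                     PRIMARY KEY (country_id, resort_id));
INSERT INTO Resort VALUES (6, 1, 'Australian Reef');
INSERT INTO Resort VALUES (2, 5, 'Example Resort');  -- hypothetical extra row
""")

def plan(sql):
    """Return the plan steps SQLite reports (the 4th column of each plan row)."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Restriction on the concatenated expression, as a single Primary Key entry would produce.
concatenated = "SELECT resort FROM Resort WHERE country_id || resort_id = '61'"
# Restriction on the individual key columns.
per_column = "SELECT resort FROM Resort WHERE country_id = 6 AND resort_id = 1"

print(conn.execute(concatenated).fetchall())  # [('Australian Reef',)]
print(plan(concatenated))  # a scan: the expression cannot use the key's index
print(plan(per_column))    # a search using the primary key's index
```

Both queries return the same row, but only the per-column form lets SQLite use the primary key index.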
Row Based Restrictions (RBR) are settings created in Supervisor that allow additional WHERE clauses to be
ANDed to the final query SQL should that table be involved in the query. See Designer documentation for
more information.
City object with Primary Key and Foreign Key entries and no RBR applied to the City table. As expected the
restriction of ‘City EQUAL TO Albertville’ has been converted to a Primary Key value (25) pointing to the
Customer.City_id Foreign Key
The Customer.City_id is not used, instead the original Primary Key entry is used and the RBR is applied to the
City table (as expected)
Rule - If an Index Aware object’s tables list references a table that has RBR, we
cannot use any Foreign Key entries. Why? Because the RBR in Supervisor will
be invalid for the Foreign Key table.
The KEYS tab of the Object Properties dialog contains the following controls:
• ‘type of values in PK and FK’
• DETECT
• PARSE
• Key Type
• Select
• Where
• Enable
• OK
• Cancel
• Apply
• Help
• Insert
• Delete
Note : Buttons INSERT, DELETE, Cancel and Help are not detailed below because they are standard
Designer (and Windows) buttons and their purpose should be self explanatory
This drop down lists all the standard data types that Business Objects handles, i.e. Character, Date, Long Text
and Number. You have to inform Business Objects which data type is relevant for the series of Primary
Key and Foreign Key entries, because that data type may differ from the type of the object you are defining
Index Awareness for, e.g. column ‘City.city’ is a character column and the object ‘City’ is also a Character
type, but its Primary Key column ‘City.city_id’ is a numeric type.
All Primary Key and Foreign Key entries must be of the same value type because there is only one drop down
for all entries. This makes sense because depending on the rules applied, any of the Primary Key and Foreign
Key entries could be used in the final query.
If you select the wrong type and then try to PARSE then you will receive the error ‘The expression type is not
compatible with the object type’ as shown below:
Detects Primary Key and Foreign Keys if the middleware and database support their detection. Some
databases do not support their detection e.g. MS Access.
Should the database be unable to supply key information, the error ‘Key is not supported’ is raised by Business
Objects.
Should the Index Aware object reference no tables or more than one table, the error ‘No detected key for
Note : this error description is slightly misleading because it is raised even when the object makes reference to
no tables i.e. the Index Aware object SELECT and WHERE clauses result in no tables being referenced in the
TABLES list
Any Primary Key detected will over-write without warning any existing Primary Key entry
Once the Primary Key has been detected then the Foreign Keys are sought. All Foreign Key entries will be
added to the list if they didn’t previously exist i.e. no duplicates will be allowed
Apart from the Business Objects caught error ‘The expression type is not compatible with the object type’ all
other errors will be database specific, usually relating to errors in the building of the SELECT and WHERE
SQL statements.
Key Type
There are only 2 key types i.e. Primary Key and Foreign Key. Only one Primary Key
entry can exist for a single Index Aware object. There can be zero or more Foreign Key
entries but before even 1 Foreign Key entry can be used a Primary Key entry must
exist.
Note : Should you try to create more than one Primary Key entry, an existing Primary Key entry will be
converted automatically, without warning, to a Foreign Key. Its SQL will remain unchanged; only its type
changes. This could lead to incorrect SQL in the final query if the changed Primary Key entry is not
reviewed by the universe developer.
Select
Enter or build the SELECT SQL which will:
a. Replace the Index Aware object’s original SELECT SQL should the object be used in the
CONDITIONS pane of a query (subject to other rules stated elsewhere)
b. Be added to the SQL of the Index Aware object’s LOV query
c. Replace the Index Aware object’s SELECT SQL in a GROUP BY clause should the Index Aware
object be used in the RESULTS pane of a query
Tip – add SQL comments to the Primary Key and Foreign Key entries so that they are easily identifiable in the
final query SQL e.g. [ Customer.City_id /* Primary Key */ ] for Oracle, where /* comment */ is Oracle
comment syntax
ANDing additional conditions can cause unexpected results. You cannot choose to have the condition ORed
instead; it is always ANDed. Consider the effect of ANDing the conditions carefully.
Tip – add SQL comments to the Primary Key and Foreign Key entries so that they are easily identifiable in the
final query SQL e.g. [ Customer.City_id IN (10, 11, 12) /* Foreign Key */ ]
Enable
Only Primary Key or Foreign Key entries that are enabled will be taken into consideration during SQL
generation.
But enabled or not, the entries will be checked when the PARSE button is used.
Apply or OK
Applies the changes and optionally closes the dialog.
These errors should be self explanatory based on our understanding of the rules determined so far.
This dialog provides normal SQL constructs as well as more complex items such as:
If your target database allows them, you can add SQL comments to the Primary Key and Foreign Key entries
so that they are easily identifiable in the final query SQL e.g. [ Customer.city_id /* Index
Awareness Foreign Key */ ] for Oracle
TODO – check what effect @Aggregate_Aware, which also changes the tables involved in a query, has on
Index Awareness
Typing or entering a value will not trigger it even if the values are
identical to those in the LOV
Why? Because Index Awareness uses the selection from the LOV
dialog to grab the Primary Key value for each entry. If you entered
your selection using ‘Type a new constant’ (for example) Index
Awareness would then have to go out to the database again and locate
the matching Primary Key value. (or at least interrogate the cached
LOV which it currently does not do)
Also, creating a condition by using the ‘Simple Condition’ toolbar button within the query panel does not
permit Index Awareness either
The following is a repetition of the rules discovered and explained within the examples earlier in the
document. They are listed here for reference:
• A Foreign Key entry will be ignored if it does not result in fewer tables being used. Business Objects
can only use fewer tables if the Primary Key table is referenced only inside the WHERE clause.
• The sequence of Foreign Key entries in the KEYS dialog determines which entry is given preference
i.e. the last enabled entry in the list that results in the least number of tables. But if the ‘best’ Foreign
Key entry does not result in a table count reduction compared to the original query, the Primary Key
entry will be the only one that applies
• Use of Primary Key entries in LOV can only be used with accuracy when the value is unique e.g.
customer social security number, product code.
• When Index Aware objects are used in the RESULTS pane their Primary Key entries can cause
additional rows to be returned from the database. Report results should remain the same if no local
reporter calculations depend on the row count and level of uniqueness of the result set
• WHERE clause entries for Primary Key and Foreign Key entries are ANDed to the query once the
SELECT entry has been implemented.
• For Primary Key entries the WHERE clauses must reference the same tables as the Index Aware
object’s original tables list
• For Foreign Key entries the WHERE clauses should try to reference the same tables as the Index
Aware object’s original tables list, to avoid introducing more complexity into the query
• Index Awareness is best used with single column Primary Keys. Multi-column primary keys should
be avoided and/or your DBA should consider ‘function based indexes’ and multi-column indexes.
• If an Index Aware object tables list references a table that has RBR, we cannot use any Foreign Key
entries. Why? because the RBR in Supervisor will be invalid for the Foreign Key table.
The more feedback the developers receive, the better it will guide their coding efforts.
Designer SDK
There seems to be no way using the SDK to access Index Aware information i.e. the properties and methods
are not exposed.
A universe called ‘combined.unv’ exists, written by the author of this document, which will include a first
attempt at reporting on the KEY information. The additions to the universe are not perfect; their limitations
are:
1. outer joins are not handled – so trying to list objects whether they have KEY records or not will not
work i.e. only objects with KEY records will be returned
2. returning the SELECT and WHERE clauses in a single Business Objects query will generate 2
SELECT statements because these are on separate contexts. This may not be correct but it works for
now
It is suggested you experiment with the 3 tables highlighted and the combined.unv universe.
1. The use of the Primary Key to solve the old problem of hierarchical prompts only returning the value
selected and not the path
3. What happens when you ignore ‘the expression type is not compatible with the object type’
• Columns that have low cardinality are good candidates (if the cardinality of a column is <= 0.1%, the
column is an ideal candidate; consider also 0.2% – 1%)
• Tables that have no or little insert/update are good candidates (static data in warehouse)
• Stream of bits: each bit relates to a column value in a single row of table
Begin
  For i in 1..1000000
  Loop
    Insert into test_normal
    values(i, dbms_random.string('U',30), dbms_random.value(1000,7000));
    If mod(i, 10000) = 0 then
      Commit;
    End if;
  End loop;
End;
/
Total Rows
----------
1000000
Distinct Values
---------------
1000000
Elapsed: 00:00:06.09
SQL> select count(*) "Total Rows" from test_random;
Total Rows
----------
1000000
Elapsed: 00:00:03.05
SQL> select count(distinct empno) "Distinct Values" from test_random;
Distinct Values
---------------
1000000
Elapsed: 00:00:12.07
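The load-and-verify sequence above can be tried at small scale without an Oracle instance. The sketch below uses Python's built-in sqlite3 module as a stand-in and makes several assumptions: the row count is scaled down, the column types are guessed, SAL is stored as an integer, and TEST_RANDOM is produced by shuffling the EMPNO values (the document does not show how that table was built).

```python
import random
import sqlite3
import string

ROWS = 10_000          # scaled down from the 1,000,000 rows used in the text
COMMIT_EVERY = 1_000   # scaled down from the commit interval of 10,000


def load_test_table(conn, table_name, empnos, rng):
    """Create a three-column test table and load one row per EMPNO."""
    conn.execute(
        f"create table {table_name} (empno integer, ename text, sal integer)"
    )
    for n, empno in enumerate(empnos, start=1):
        # Rough equivalents of dbms_random.string('U', 30) and
        # dbms_random.value(1000, 7000); integer SAL is an assumption.
        ename = "".join(rng.choice(string.ascii_uppercase) for _ in range(30))
        sal = rng.randrange(1000, 7000)
        conn.execute(
            f"insert into {table_name} values (?, ?, ?)", (empno, ename, sal)
        )
        if n % COMMIT_EVERY == 0:
            conn.commit()
    conn.commit()


def count_rows_and_distinct(conn, table_name):
    """Return (total rows, distinct EMPNO values) for the table."""
    total = conn.execute(f"select count(*) from {table_name}").fetchone()[0]
    distinct = conn.execute(
        f"select count(distinct empno) from {table_name}"
    ).fetchone()[0]
    return total, distinct


rng = random.Random(42)
conn = sqlite3.connect(":memory:")

ordered_empnos = list(range(1, ROWS + 1))
shuffled_empnos = ordered_empnos[:]
rng.shuffle(shuffled_empnos)   # assumed way of getting "disorganized" data

load_test_table(conn, "test_normal", ordered_empnos, rng)
load_test_table(conn, "test_random", shuffled_empnos, rng)

normal_counts = count_rows_and_distinct(conn, "test_normal")
random_counts = count_rows_and_distinct(conn, "test_random")
print(normal_counts, random_counts)
```

Both tables hold the same EMPNO values. Only the physical insertion order differs, and that order is what the later steps examine.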
Note that the TEST_NORMAL table is organized and that the TEST_RANDOM table is
randomly created and hence has disorganized data. In the above table, column EMPNO has
100-percent distinct values and is a good candidate to become a primary key. If you define
this column as a primary key, you will create a B-tree index and not a bitmap index because
Oracle does not support bitmap primary key indexes.
To analyze the behavior of these indexes, we will perform the following steps:
1. On TEST_NORMAL:
A. Create a bitmap index on the EMPNO column and execute some queries with
equality predicates.
B. Create a B-tree index on the EMPNO column, execute some queries with
equality predicates, and compare the logical and physical I/Os done by the
queries to fetch the results for different sets of values.
2. On TEST_RANDOM:
A. Same as Step 1A.
B. Same as Step 1B.
3. On TEST_NORMAL:
A. Same as Step 1A, except that the queries are executed within a range of
predicates.
B. Same as Step 1B, except that the queries are executed within a range of
predicates. Now compare the statistics.
4. On TEST_RANDOM:
A. Same as Step 3A.
B. Same as Step 3B.
5. On TEST_NORMAL:
Index created.
Elapsed: 00:00:29.06
SQL> analyze table test_normal compute statistics for table for all
indexes for all indexed columns;
Table analyzed.
Elapsed: 00:00:19.01
SQL> select substr(segment_name,1,30) segment_name, bytes/1024/1024 "Size
in MB"
2 from user_segments
3* where segment_name in ('TEST_NORMAL','NORMAL_EMPNO_BMX');
SEGMENT_NAME Size in MB
------------------------------------ ---------------
TEST_NORMAL 50
NORMAL_EMPNO_BMX 28
Elapsed: 00:00:02.00
SQL> select index_name, clustering_factor from user_indexes;
INDEX_NAME CLUSTERING_FACTOR
------------------------------ ---------------------------------
NORMAL_EMPNO_BMX 1000000
Elapsed: 00:00:00.00
You can see in the preceding table that the size of the index is 28MB and that the clustering
factor is equal to the number of rows in the table. Now let's execute the queries with equality
predicates for different sets of values:
SQL> set autotrace traceonly
SQL> select * from test_normal where empno=&empno;
Enter value for empno: 1000
old 1: select * from test_normal where empno=&empno
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=34)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_NORMAL' (Cost=4 Card=1 Bytes=34)
2 1 BITMAP CONVERSION (TO ROWIDS)
3 2 BITMAP INDEX (SINGLE VALUE) OF 'NORMAL_EMPNO_BMX'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
5 consistent gets
0 physical reads
0 redo size
515 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Step 1B (on TEST_NORMAL)
Now we will drop this bitmap index and create a B-tree index on the EMPNO column. As
before, we will check for the size of the index and its clustering factor and execute the same
queries for the same set of values, to compare the I/Os.
SQL> drop index NORMAL_EMPNO_BMX;
Index dropped.
Index created.
SQL> analyze table test_normal compute statistics for table for all
indexes for all indexed columns;
Table analyzed.
SEGMENT_NAME Size in MB
---------------------------------- ---------------
TEST_NORMAL 50
NORMAL_EMPNO_IDX 18
INDEX_NAME CLUSTERING_FACTOR
---------------------------------- ----------------------------------
NORMAL_EMPNO_IDX 6210
It is clear in this table that the B-tree index is smaller than the bitmap index on the EMPNO
column. The clustering factor of the B-tree index is much nearer to the number of blocks in
a table; for that reason, the B-tree index is efficient for range predicate queries.
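To make the clustering factor more concrete, here is a minimal Python sketch of the usual description of the statistic: walk the index entries in key order and count how often the table block changes. The block layout (a fixed number of rows per block), the row counts, and the function names are assumptions for illustration and do not reproduce Oracle's actual computation.

```python
import random


def blocks_in_key_order(keys_in_storage_order, rows_per_block):
    """Map each key to its table block, then list those blocks in key order."""
    block_of_key = {
        key: position // rows_per_block
        for position, key in enumerate(keys_in_storage_order)
    }
    return [block_of_key[key] for key in sorted(block_of_key)]


def clustering_factor(blocks):
    """Count table-block changes while walking the index in key order."""
    factor = 0
    previous_block = None
    for block in blocks:
        if block != previous_block:
            factor += 1
            previous_block = block
    return factor


ROWS = 10_000
ROWS_PER_BLOCK = 100   # hypothetical; gives 100 table blocks

ordered_keys = list(range(1, ROWS + 1))
shuffled_keys = ordered_keys[:]
random.Random(7).shuffle(shuffled_keys)

cf_ordered = clustering_factor(blocks_in_key_order(ordered_keys, ROWS_PER_BLOCK))
cf_shuffled = clustering_factor(blocks_in_key_order(shuffled_keys, ROWS_PER_BLOCK))
print("ordered:", cf_ordered, "shuffled:", cf_shuffled)
```

In this model the ordered table yields a clustering factor equal to the number of blocks, while the shuffled table yields a value close to the number of rows.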
Now we'll run the same queries for the same set of values, using our B-tree index.
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=34)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_NORMAL' (Cost=4 Card=1 Bytes=34)
2 1 INDEX (RANGE SCAN) OF 'NORMAL_EMPNO_IDX' (NON-UNIQUE) (Cost=3 Card=1)
Statistics
----------------------------------------------------------
29 recursive calls
0 db block gets
5 consistent gets
0 physical reads
0 redo size
515 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
As you can see, when the queries are executed for different sets of values, the numbers of
consistent gets and physical reads are identical for bitmap and B-tree indexes on a
100-percent unique column.
          BITMAP                           B-TREE
Consistent    Physical      EMPNO      Consistent    Physical
Reads         Reads                    Reads         Reads
5             0             1000       5             0
5             2             2398       5             2
5             2             8545       5             2
5             2             98008      5             2
5             2             85342      5             2
5             2             128444     5             2
5             2             858        5             2
Step 2A (on TEST_RANDOM)
Now we'll perform the same experiment on TEST_RANDOM:
SQL> create bitmap index random_empno_bmx on test_random(empno);
Index created.
SQL> analyze table test_random compute statistics for table for all
indexes for all indexed columns;
Table analyzed.
SEGMENT_NAME Size in MB
------------------------------------ ---------------
TEST_RANDOM 50
RANDOM_EMPNO_BMX 28
INDEX_NAME CLUSTERING_FACTOR
------------------------------ ---------------------------------
RANDOM_EMPNO_BMX 1000000
Again, the statistics (size and clustering factor) are identical to those of the index on the
TEST_NORMAL table:
SQL> select * from test_random where empno=&empno;
Enter value for empno: 1000
old 1: select * from test_random where empno=&empno
new 1: select * from test_random where empno=1000
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=34)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_RANDOM' (Cost=4 Card=1
Bytes=34)
2 1 BITMAP CONVERSION (TO ROWIDS)
3 2 BITMAP INDEX (SINGLE VALUE) OF 'RANDOM_EMPNO_BMX'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
5 consistent gets
0 physical reads
0 redo size
515 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Step 2B (on TEST_RANDOM)
Now, as in Step 1B, we will drop the bitmap index and create a B-tree index on the EMPNO
column.
SQL> drop index RANDOM_EMPNO_BMX;
Index dropped.
Index created.
SQL> analyze table test_random compute statistics for table for all
indexes for all indexed columns;
Table analyzed.
INDEX_NAME CLUSTERING_FACTOR
---------------------------------- ----------------------------------
RANDOM_EMPNO_IDX 999830
This table shows that the size of the index is equal to the size of the corresponding index on
the TEST_NORMAL table, but the clustering factor is much nearer to the number of rows,
which makes this index inefficient for range predicate queries (which we'll see in Step 4).
This clustering factor will not affect the equality predicate queries because the rows have
100-percent distinct values and the number of rows per key is 1.
Now let's run the queries with equality predicates and the same set of values.
SQL> select * from test_random where empno=&empno;
Enter value for empno: 1000
old 1: select * from test_random where empno=&empno
new 1: select * from test_random where empno=1000
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=34)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_RANDOM' (Cost=4 Card=1
Bytes=34)
2 1 INDEX (RANGE SCAN) OF 'RANDOM_EMPNO_IDX' (NON-UNIQUE)
(Cost=3 Card=1)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
5 consistent gets
0 physical reads
0 redo size
515 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Again, the results are almost identical to those in Steps 1A and 1B. The data distribution did
not affect the number of consistent gets and physical reads for a unique column.
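A small model helps show why physical ordering does not matter for an equality lookup on a unique column but does matter for a range. The Python sketch below counts the distinct table blocks that hold the matching rows. The block size, row count, and function name are assumptions, and index blocks are ignored entirely.

```python
import random


def table_blocks_visited(keys_in_storage_order, rows_per_block, low, high):
    """Number of distinct table blocks holding keys in the range [low, high]."""
    blocks = {
        position // rows_per_block
        for position, key in enumerate(keys_in_storage_order)
        if low <= key <= high
    }
    return len(blocks)


ROWS = 10_000
ROWS_PER_BLOCK = 100   # hypothetical layout with 100 table blocks

ordered_keys = list(range(1, ROWS + 1))
shuffled_keys = ordered_keys[:]
random.Random(11).shuffle(shuffled_keys)

# Equality predicate on a unique key: exactly one row, hence one block.
eq_ordered = table_blocks_visited(ordered_keys, ROWS_PER_BLOCK, 1000, 1000)
eq_shuffled = table_blocks_visited(shuffled_keys, ROWS_PER_BLOCK, 1000, 1000)

# Range predicate: contiguous blocks when ordered, scattered when shuffled.
range_ordered = table_blocks_visited(ordered_keys, ROWS_PER_BLOCK, 1, 2300)
range_shuffled = table_blocks_visited(shuffled_keys, ROWS_PER_BLOCK, 1, 2300)

print("equality:", eq_ordered, eq_shuffled)
print("range:   ", range_ordered, range_shuffled)
```

With one row per key, either layout touches a single table block for an equality lookup. The same range of keys, however, is packed into a few blocks in the ordered layout and spread across nearly all blocks in the shuffled one.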
Step 3A (on TEST_NORMAL)
In this step, we will create the bitmap index (similar to Step 1A). We know the size and the
clustering factor of the index, which equals the number of rows in the table. Now let's run
some queries with range predicates.
SQL> select * from test_normal where empno between &range1 and &range2;
Enter value for range1: 1
Enter value for range2: 2300
old 1: select * from test_normal where empno between &range1 and &range2
new 1: select * from test_normal where empno between 1 and 2300
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=451 Card=2299
Bytes=78166)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_NORMAL' (Cost=451
Card=2299 Bytes=78166)
2 1 BITMAP CONVERSION (TO ROWIDS)
3 2 BITMAP INDEX (RANGE SCAN) OF 'NORMAL_EMPNO_BMX'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
331 consistent gets
0 physical reads
0 redo size
111416 bytes sent via SQL*Net to client
2182 bytes received via SQL*Net from client
155 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2300 rows processed
Step 3B (on TEST_NORMAL)
In this step, we'll execute the queries against the TEST_NORMAL table with a B-tree index
on it.
SQL> select * from test_normal where empno between &range1 and &range2;
Enter value for range1: 1
Enter value for range2: 2300
old 1: select * from test_normal where empno between &range1 and &range2
new 1: select * from test_normal where empno between 1 and 2300
Elapsed: 00:00:00.02
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=23 Card=2299
Bytes=78166)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_NORMAL' (Cost=23
Card=2299 Bytes=78166)
2 1 INDEX (RANGE SCAN) OF 'NORMAL_EMPNO_IDX' (NON-UNIQUE)
(Cost=8 Card=2299)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
329 consistent gets
15 physical reads
0 redo size
111416 bytes sent via SQL*Net to client
2182 bytes received via SQL*Net from client
155 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2300 rows processed
When these queries are executed for different sets of ranges, the results below show:
Elapsed: 00:00:08.01
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=453 Card=2299
Bytes=78166)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_RANDOM' (Cost=453
Card=2299 Bytes=78166)
2 1 BITMAP CONVERSION (TO ROWIDS)
3 2 BITMAP INDEX (RANGE SCAN) OF 'RANDOM_EMPNO_BMX'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2463 consistent gets
1200 physical reads
0 redo size
111416 bytes sent via SQL*Net to client
2182 bytes received via SQL*Net from client
155 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2300 rows processed
Elapsed: 00:00:03.04
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=613 Card=2299
Bytes=78166)
1 0 TABLE ACCESS (FULL) OF 'TEST_RANDOM' (Cost=613 Card=2299
Bytes=78166)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6415 consistent gets
4910 physical reads
0 redo size
111416 bytes sent via SQL*Net to client
2182 bytes received via SQL*Net from client
155 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2300 rows processed
The optimizer opted for a full table scan rather than using the index because of the
clustering factor:
          BITMAP                                B-TREE
Consistent    Physical    EMPNO (Range)     Consistent    Physical
Reads         Reads                         Reads         Reads
2463          1200        1-2300            6415          4910
2114          31          8-1980            6389          4910
2572          1135        1850-4250         6418          4909
3173          1620        28888-31850       6456          4909
2762          1358        82900-85478       6431          4909
7254          3329        984888-1000000    7254          4909
For the last range (984888-1000000) only, the optimizer opted for a full table scan for the
bitmap index, whereas for all ranges, it opted for a full table scan for the B-tree index. This
disparity is due to the clustering factor: The optimizer does not consider the value of the
clustering factor when generating execution plans using a bitmap index, whereas for a B-
tree index, it does. In this scenario, the bitmap index performs more efficiently than the B-
tree index.
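The effect of the clustering factor on the B-tree plan choice can be illustrated with a commonly cited simplification of the cost of a B-tree index range scan: index height, plus the fraction of leaf blocks read, plus the fraction of the clustering factor. This is a sketch and not Oracle's full cost model. The clustering factors (6210 and 999830) and the full-scan costs (601 and 613) are taken from the output shown in this document, while ASSUMED_BLEVEL and ASSUMED_LEAF_BLOCKS (an 18MB index with an assumed 8KB block size) are hypothetical inputs.

```python
import math


def index_range_scan_cost(blevel, leaf_blocks, clustering_factor, selectivity):
    """Simplified B-tree range scan cost: index part plus table part."""
    index_part = blevel + math.ceil(leaf_blocks * selectivity)
    table_part = math.ceil(clustering_factor * selectivity)
    return index_part + table_part


def choose_access_path(index_cost, full_scan_cost):
    """Pick whichever path the simplified model says is cheaper."""
    if index_cost < full_scan_cost:
        return "INDEX (RANGE SCAN)"
    return "TABLE ACCESS (FULL)"


SELECTIVITY = 2300 / 1_000_000    # range 1-2300 out of 1,000,000 rows
ASSUMED_BLEVEL = 2                # hypothetical index height
ASSUMED_LEAF_BLOCKS = 2304        # hypothetical: 18MB index / 8KB blocks

cost_normal = index_range_scan_cost(
    ASSUMED_BLEVEL, ASSUMED_LEAF_BLOCKS, 6210, SELECTIVITY
)
cost_random = index_range_scan_cost(
    ASSUMED_BLEVEL, ASSUMED_LEAF_BLOCKS, 999830, SELECTIVITY
)

plan_normal = choose_access_path(cost_normal, 601)   # full-scan cost, TEST_NORMAL
plan_random = choose_access_path(cost_random, 613)   # full-scan cost, TEST_RANDOM

print("TEST_NORMAL:", cost_normal, plan_normal)
print("TEST_RANDOM:", cost_random, plan_random)
```

Under these assumptions the model favours the index on TEST_NORMAL and a full table scan on TEST_RANDOM, which is the same direction as the plans shown above. The exact figures depend on the assumed inputs and should not be read as a reproduction of the optimizer's arithmetic.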
The following steps reveal more interesting facts about these indexes.
Step 5A (on TEST_NORMAL)
Index created.
SQL> analyze table test_normal compute statistics for table for all
indexes for all indexed columns;
Table analyzed.
Now let's get the size of the index and the clustering factor.
SQL> select substr(segment_name,1,30) segment_name, bytes/1024/1024 "Size
in MB"
2 from user_segments
3* where segment_name in ('TEST_NORMAL','NORMAL_SAL_BMX');
SEGMENT_NAME Size in MB
------------------------------ --------------
TEST_NORMAL 50
NORMAL_SAL_BMX 4
INDEX_NAME CLUSTERING_FACTOR
------------------------------ ----------------------------------
NORMAL_SAL_BMX 6001
Now for the queries. First run them with equality predicates:
SQL> set autot trace
SQL> select * from test_normal where sal=&sal;
Enter value for sal: 1869
old 1: select * from test_normal where sal=&sal
new 1: select * from test_normal where sal=1869
Elapsed: 00:00:00.08
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=39 Card=168 Bytes=4032)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_NORMAL' (Cost=39
Card=168 Bytes=4032)
2 1 BITMAP CONVERSION (TO ROWIDS)
3 2 BITMAP INDEX (SINGLE VALUE) OF 'NORMAL_SAL_BMX'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
165 consistent gets
0 physical reads
0 redo size
8461 bytes sent via SQL*Net to client
609 bytes received via SQL*Net from client
12 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
164 rows processed
and then with range predicates:
SQL> select * from test_normal where sal between &sal1 and &sal2;
Enter value for sal1: 1500
Enter value for sal2: 2000
Elapsed: 00:00:05.00
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=601 Card=83376 Bytes=2001024)
1 0 TABLE ACCESS (FULL) OF 'TEST_NORMAL' (Cost=601 Card=83376
Bytes=2001024)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
11778 consistent gets
5850 physical reads
0 redo size
4123553 bytes sent via SQL*Net to client
61901 bytes received via SQL*Net from client
5584 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
83743 rows processed
Now drop the bitmap index and create a B-tree index on TEST_NORMAL.
SQL> create index normal_sal_idx on test_normal(sal);
Index created.
SQL> analyze table test_normal compute statistics for table for all
indexes for all indexed columns;
Table analyzed.
Take a look at the size of the index and the clustering factor.
SQL> select substr(segment_name,1,30) segment_name, bytes/1024/1024 "Size
in MB"
2 from user_segments
3 where segment_name in ('TEST_NORMAL','NORMAL_SAL_IDX');
SEGMENT_NAME Size in MB
------------------------------ ---------------
TEST_NORMAL 50
NORMAL_SAL_IDX 17
INDEX_NAME CLUSTERING_FACTOR
------------------------------ ----------------------------------
NORMAL_SAL_IDX 986778
In the above table, you can see that this index is larger than the bitmap index on the same
column. The clustering factor is also near the number of rows in this table.
Now for the tests; equality predicates first:
SQL> set autot trace
SQL> select * from test_normal where sal=&sal;
Enter value for sal: 1869
old 1: select * from test_normal where sal=&sal
new 1: select * from test_normal where sal=1869
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=169 Card=168 Bytes=4032)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_NORMAL' (Cost=169
Card=168 Bytes=4032)
2 1 INDEX (RANGE SCAN) OF 'NORMAL_SAL_IDX' (NON-UNIQUE) (Cost=3
Card=168)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
177 consistent gets
0 physical reads
0 redo size
8461 bytes sent via SQL*Net to client
609 bytes received via SQL*Net from client
12 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
164 rows processed
...and then, range predicates:
SQL> select * from test_normal where sal between &sal1 and &sal2;
Enter value for sal1: 1500
Enter value for sal2: 2000
old 1: select * from test_normal where sal between &sal1 and &sal2
new 1: select * from test_normal where sal between 1500 and 2000
Elapsed: 00:00:04.03
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=601 Card=83376 Bytes=2001024)
1 0 TABLE ACCESS (FULL) OF 'TEST_NORMAL' (Cost=601 Card=83376
Bytes=2001024)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
11778 consistent gets
3891 physical reads
0 redo size
4123553 bytes sent via SQL*Net to client
61901 bytes received via SQL*Net from client
5584 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
83743 rows processed
When the queries were executed for different sets of values, the resulting output, as shown in
the tables below, reveals that the numbers of consistent gets and physical reads are nearly identical.
          BITMAP                                   B-TREE
Consistent    Physical    SAL           Consistent    Physical    Rows
Reads         Reads       (Equality)    Reads         Reads       Fetched
165           0           1869          177                       164
Table altered.
S COUNT(*)
- ----------
F 333769
M 499921
166310
3 rows selected.
The size of the bitmap index on this column is around 570KB, as indicated in the table
below:
SQL> create bitmap index normal_GENDER_bmx on test_normal(GENDER);
Index created.
Elapsed: 00:00:02.08
SEGMENT_NAME Size in MB
------------------------------ ---------------
TEST_NORMAL 50
2 rows selected.
In contrast, the B-tree index on this column is 13MB in size, which is much bigger than the
bitmap index on this column.
SQL> create index normal_GENDER_idx on test_normal(GENDER);
Index created.
SEGMENT_NAME Size in MB
------------------------------ ---------------
TEST_NORMAL 50
NORMAL_GENDER_IDX 13
2 rows selected.
Now, if we execute a query with equality predicates, the optimizer will not make use of this
index, be it a bitmap or a B-tree. Rather, it will prefer a full table scan.
SQL> select * from test_normal where GENDER is null;
Elapsed: 00:00:06.08
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=601 Card=166310
Bytes=4157750)
1 0 TABLE ACCESS (FULL) OF 'TEST_NORMAL' (Cost=601 Card=166310
Bytes=4157750)
Elapsed: 00:00:16.07
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=601 Card=499921
Bytes=12498025)
1 0 TABLE ACCESS (FULL) OF 'TEST_NORMAL' (Cost=601
Card=499921 Bytes=12498025)
Elapsed: 00:00:12.02
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=601 Card=333769 Bytes=8344225)
1 0 TABLE ACCESS (FULL) OF 'TEST_NORMAL' (Cost=601 Card=333769
Bytes=8344225)
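The three full table scans above become less surprising once the selectivity of each GENDER predicate is worked out from the row counts listed earlier. The short Python sketch below does only that arithmetic. The variable names are illustrative, and None stands in for the NULL group.

```python
# Row counts per GENDER value, as listed in the COUNT(*) output above.
gender_counts = {"F": 333769, "M": 499921, None: 166310}

total_rows = sum(gender_counts.values())

# Fraction of the table returned by an equality (or IS NULL) predicate.
selectivity = {
    value: count / total_rows for value, count in gender_counts.items()
}

for value, fraction in selectivity.items():
    print(value, f"{fraction:.1%}")
```

Every predicate returns a large share of the table (roughly 17 percent even for the NULL group), so reading the whole table in one pass is estimated to be cheaper than visiting that many rows through an index.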
Conclusions
Elapsed: 00:00:02.03
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=198 Card=754
Bytes=18850)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_NORMAL' (Cost=198
Card=754 Bytes=18850)
2 1 BITMAP CONVERSION (TO ROWIDS)
3 2 BITMAP AND
4 3 BITMAP OR
5 4 BITMAP INDEX (SINGLE VALUE) OF 'NORMAL_SAL_BMX'
6 4 BITMAP INDEX (SINGLE VALUE) OF 'NORMAL_SAL_BMX'
7 4 BITMAP INDEX (SINGLE VALUE) OF 'NORMAL_SAL_BMX'
8 4 BITMAP INDEX (SINGLE VALUE) OF 'NORMAL_SAL_BMX'
9 4 BITMAP INDEX (SINGLE VALUE) OF 'NORMAL_SAL_BMX'
10 4 BITMAP INDEX (SINGLE VALUE) OF 'NORMAL_SAL_BMX'
11 4 BITMAP INDEX (SINGLE VALUE) OF 'NORMAL_SAL_BMX'
12 4 BITMAP INDEX (SINGLE VALUE) OF 'NORMAL_SAL_BMX'
13 4 BITMAP INDEX (SINGLE VALUE) OF 'NORMAL_SAL_BMX'
14 3 BITMAP INDEX (SINGLE VALUE) OF 'NORMAL_GENDER_BMX'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1353 consistent gets
920 physical reads
0 redo size
75604 bytes sent via SQL*Net to client
1555 bytes received via SQL*Net from client
98 SQL*Net roundtrips to/from client
Elapsed: 00:00:03.01
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=601 Card=754
Bytes=18850)
1 0 TABLE ACCESS (FULL) OF 'TEST_NORMAL' (Cost=601 Card=754
Bytes=18850)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6333 consistent gets
4412 physical reads
0 redo size
75604 bytes sent via SQL*Net to client
1555 bytes received via SQL*Net from client
98 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1453 rows processed
As you can see here, with the B-tree index, the optimizer opted for a full table scan, whereas
in the case of the bitmap index, it used the index to answer the query. You can deduce
performance by the number of I/Os required to fetch the result.
In summary, bitmap indexes are best suited for DSS regardless of cardinality for these
reasons:
• With bitmap indexes, the optimizer can efficiently answer queries that include AND,
OR, or XOR. (Oracle supports dynamic B-tree-to-bitmap conversion, but it can be
inefficient.)
• With bitmaps, the optimizer can answer queries when searching or counting for
nulls. Null values are also indexed in bitmap indexes (unlike B-tree indexes).
• Most important, bitmap indexes in DSS systems support ad hoc queries, whereas B-
tree indexes do not. More specifically, if you have a table with 50 columns and users
frequently query on 10 of them—either the combination of all 10 columns or
sometimes a single column—creating a B-tree index will be very difficult. If you
create 10 bitmap indexes on all these columns, all the queries can be answered by
these indexes, whether they are queries on all 10 columns, on 4 or 6 columns out of
the 10, or on a single column. The AND_EQUAL hint provides this functionality for
B-tree indexes, but no more than five indexes can be used by a query. This limit is
not imposed with bitmap indexes.
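The way bitmaps from several indexes combine with AND and OR, and the way NULLs remain searchable, can be sketched with Python integers used as bit strings. This is a conceptual model only. The sample SAL and GENDER values, the helper names, and the example predicate are assumptions, since the query behind the BITMAP AND / BITMAP OR plan is not shown in this document.

```python
def build_bitmaps(column_values):
    """Return {value: bitmap}; bit i is set when row i holds that value."""
    bitmaps = {}
    for row, value in enumerate(column_values):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps


def rows_of(bitmap):
    """Row numbers whose bit is set."""
    return [row for row in range(bitmap.bit_length()) if (bitmap >> row) & 1]


# Hypothetical eight-row table; None stands in for NULL.
sal = [1000, 1500, 2000, 1500, 3000, 1000, 2000, 1500]
gender = ["M", "F", "M", "M", None, "F", "M", "F"]

sal_bmx = build_bitmaps(sal)
gender_bmx = build_bitmaps(gender)

# Assumed predicate: sal in (1000, 1500) and gender = 'M'
sal_or = sal_bmx[1000] | sal_bmx[1500]      # BITMAP OR over the SAL values
matching = sal_or & gender_bmx["M"]         # BITMAP AND with the GENDER bitmap

matching_rows = rows_of(matching)
null_gender_rows = rows_of(gender_bmx[None])   # NULLs have a bitmap too

print("sal in (1000, 1500) and gender = 'M':", matching_rows)
print("gender is null:", null_gender_rows)
```

Because each column contributes an independent bitmap, any subset of indexed columns can be combined in the same way, which is the property the ad hoc query argument above relies on.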
In contrast, B-tree indexes are well suited for OLTP applications in which users' queries are
relatively routine (and well tuned before deployment in production), as opposed to ad hoc
queries, which are much less frequent and executed during nonpeak business hours. Because
data is frequently updated in and deleted from OLTP applications, bitmap indexes can cause
a serious locking problem in these situations.