DT-EDU-DEN80EDU18S04 Cache Configuration

DEN80EDU18S04

Cache Configuration
Denodo Platform 8.0

Denodo
Operations
Management

FOR TRAINING PURPOSES ONLY


Licensed to OnDemand Training Courses - 2020
Agenda
1. Configuring the Cache
2. Bulk Data Load
3. Cache Maintenance
4. Cache Best Practices

Cache Configuration
Configuring the Cache

Configuring the cache
Introduction

The Cache module can store copies of the source data in a relational database
accessible through JDBC.
■ This module automatically maintains a table in the cache database for each
base view or derived view for which the cache has been configured.
■ From an administration point of view, the cache server can be configured
globally at server level or per virtual database.

Configuring the cache
Global configuration

The Cache system can be enabled globally at server level.
■ This is done from the “Administration > Server” configuration dialog.
■ To activate the cache, the cache status has to be set to On.
■ Once the cache is enabled, its parameters can be configured.

Configuring the cache
Database configuration

A Denodo Virtual DataPort server can contain several virtual databases.
■ Each virtual database is independent of the other virtual databases on the
server and, as described previously, the cache settings can be configured
for a specific virtual database.
■ This is done from the “Administration > Database management > Cache” dialog.
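
The relationship between the global and the per-database configuration can be pictured with a small Python model. This is only an illustrative sketch: the CacheSettings class, its fields and the JDBC URLs are hypothetical and are not part of the Denodo Platform, and the rule that a database-specific configuration takes precedence over the global one is stated here as an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CacheSettings:
    """Hypothetical, simplified cache settings (not Denodo's real schema)."""
    enabled: bool
    jdbc_url: str


def effective_cache_settings(
    global_settings: CacheSettings,
    database_settings: Optional[CacheSettings],
) -> CacheSettings:
    """Return the database-specific settings if present, else the global ones."""
    if database_settings is not None:
        return database_settings
    return global_settings


# Example values are invented for illustration only.
server_wide = CacheSettings(enabled=True, jdbc_url="jdbc:postgresql://cache-host/global_cache")
sales_only = CacheSettings(enabled=True, jdbc_url="jdbc:postgresql://cache-host/sales_cache")

print(effective_cache_settings(server_wide, None).jdbc_url)
print(effective_cache_settings(server_wide, sales_only).jdbc_url)
```

A virtual database without its own settings falls back to the server-wide cache, while one with specific settings uses those instead.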

Live Demo DEN80EDU18S04DEM01

Configure Cache at Server Level

This demonstration explains how to configure a relational database
as a Global Cache server in Denodo.

Cache Configuration
Bulk Data Load

Bulk Data Load
About Bulk Data Load...

For very large data loads (millions of rows), various DBMSs provide a Bulk Data Load API.
A Bulk Data Load differs from a “normal” load in two ways:
■ It does not run single or multiple INSERT statements to load the data.
■ It is optimized for very large data loads, so significant improvements are
reached only when dealing with large amounts of data.

Bulk Data Load
… About Bulk Data Load

If the DBMS used as cache system allows administrators to use the Bulk Data Load
feature:
■ The Virtual DataPort server writes the data to an external file with a specific format.
  ■ Not all DBMSs use these external files.
  ■ These files are transferred to the DBMS.
■ The implementation of Bulk Data Load is vendor specific.
  ■ Most databases provide a proprietary auxiliary application to load large amounts of
  data.
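
The file-based flow described above can be sketched in Python as follows. This is a minimal illustration and not how Virtual DataPort is implemented: the pipe delimiter, the loader name "vendor_loader" and its flags are invented for the example, and the command is only built, never executed.

```python
import csv
import tempfile
from pathlib import Path
from typing import Iterable, Sequence


def write_staging_file(rows: Iterable[Sequence[object]], work_dir: Path) -> Path:
    """Write rows to a delimited temporary file inside work_dir and return its path."""
    with tempfile.NamedTemporaryFile(
        mode="w", newline="", suffix=".csv", dir=work_dir, delete=False, encoding="utf-8"
    ) as handle:
        # One line per row; the delimiter is an arbitrary choice for this sketch.
        csv.writer(handle, delimiter="|").writerows(rows)
        return Path(handle.name)


def build_loader_command(tool: str, staging_file: Path, table: str) -> list[str]:
    """Build an argument list for a HYPOTHETICAL vendor loader; the flags are invented."""
    return [tool, "--table", table, "--input", str(staging_file)]


with tempfile.TemporaryDirectory() as tmp:
    staged = write_staging_file([(1, "alice"), (2, "bob")], Path(tmp))
    staged_lines = staged.read_text(encoding="utf-8").splitlines()
    command = build_loader_command("vendor_loader", staged, "cache_table")

print(staged_lines)
print(command[:3])
```

The point of the sketch is the shape of the process: the rows are written once to a file, and a single external load replaces many individual INSERT statements.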

Bulk Data Load
Configuring Bulk Data Load

Depending on the DBMS selected as cache system, some parameters can be configured
when enabling this feature. A typical configuration includes:
■ Work Path:
  ■ This field allows administrators to select a different directory in which to create the
  temporary files, instead of using the path where the Denodo Platform is installed.
  ■ This can be used to point to a fast storage location such as an SSD or an in-memory drive.
■ Tool executable location:
  ■ Location where the auxiliary application provided by the database vendor is installed.
  ■ Some DBMSs do not need it because this functionality is included in their JDBC driver.
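
As a rough illustration of these two settings, the Python sketch below checks that a work path is a writable directory and that an optional loader executable can be found. The BulkLoadConfig class and the tool name are hypothetical and do not correspond to any Denodo API; a tool of None stands for the case where the JDBC driver handles the load.

```python
import os
import shutil
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class BulkLoadConfig:
    """Hypothetical holder for the two settings described above."""
    work_path: Path
    tool_executable: Optional[str] = None  # None: the JDBC driver handles loading


def validate(config: BulkLoadConfig) -> list[str]:
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    if not config.work_path.is_dir():
        problems.append(f"work path {config.work_path} is not a directory")
    elif not os.access(config.work_path, os.W_OK):
        problems.append(f"work path {config.work_path} is not writable")
    if config.tool_executable is not None and shutil.which(config.tool_executable) is None:
        problems.append(f"tool {config.tool_executable!r} was not found")
    return problems


# The system temporary directory stands in for a fast work path.
driver_only = BulkLoadConfig(work_path=Path(tempfile.gettempdir()))
missing_tool = BulkLoadConfig(Path(tempfile.gettempdir()), "no_such_vendor_loader_xyz")

print(validate(driver_only))
print(validate(missing_tool))
```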

Cache Configuration
Cache Maintenance

Cache Maintenance
Removing expired rows from Cache…

The Cache system has a maintenance task that deletes the expired entries.
■ It is activated by default when a Denodo administrator enables the Cache.
■ It executes the CLEAN_CACHE_DATABASE stored procedure in the background.

Cache Maintenance
… Removing expired rows from Cache

It is strongly recommended to disable this task in production environments so that
it does not run periodically.
■ The maintenance task is a resource-intensive action.
■ Administrators should use the Denodo Scheduler to execute the
CLEAN_CACHE_DATABASE procedure at a time when the Virtual DataPort server and
the cache database are not expected to be under heavy load.
■ Only Global Administrators can execute the CLEAN_CACHE_DATABASE procedure.
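
The idea of restricting the clean-up to a low-load period can be illustrated with a small Python check. This is only a sketch of the scheduling logic: the 01:00 to 04:00 window is an assumed example, the function is hypothetical, and the actual invocation of CLEAN_CACHE_DATABASE through the Denodo Scheduler is deliberately left out.

```python
from datetime import time


def in_maintenance_window(now: time, start: time, end: time) -> bool:
    """True if now falls in [start, end); the window may wrap past midnight."""
    if start <= end:
        return start <= now < end
    # Wrapping window, for example 22:00 to 02:00.
    return now >= start or now < end


# Assumed low-load window for this example.
WINDOW_START, WINDOW_END = time(1, 0), time(4, 0)

for candidate in (time(2, 30), time(12, 0)):
    allowed = in_maintenance_window(candidate, WINDOW_START, WINDOW_END)
    print(candidate, "run clean-up" if allowed else "skip")
```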

LAB
Cache and maintenance task

* Check the Labs Guide document for a complete description of the Lab DEN80EDU18S04LAB01

Cache Configuration
Cache Best Practices

Cache Best Practices
Shared cache server

The cache architecture makes it well suited to being shared among a cluster
of Denodo servers.
■ Loading the VQL file that points to that cache server is enough to set up a shared
cache.
■ All Virtual DataPort nodes should disable the maintenance task, and a Scheduler
job must be created to run the CLEAN_CACHE_DATABASE stored procedure.

Cache Best Practices
Bulk Data Load

The Bulk Data Load feature should be enabled on the cache engine:
■ When the number of rows to be inserted is in the tens of thousands or higher.
  ■ With fewer rows there is no performance increase, and sometimes there may
  even be a performance decrease.
  ■ HDFS-based databases require Bulk Data Load to be enabled to be eligible for caching.
■ When caching huge datasets (millions of rows), as it will significantly improve
the insertion phase.
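
These guidelines can be summarized as a simple decision rule, sketched below in Python. The function and the 10,000-row cut-off are assumptions chosen to represent "tens of thousands"; the real break-even point depends on the DBMS and should be measured.

```python
BULK_LOAD_ROW_THRESHOLD = 10_000  # assumed cut-off for "tens of thousands"


def should_enable_bulk_load(expected_rows: int, hdfs_based: bool = False) -> bool:
    """Apply the rule of thumb: HDFS-based caches always, others above the threshold."""
    if hdfs_based:
        # HDFS-based databases need bulk load to be eligible for caching at all.
        return True
    return expected_rows >= BULK_LOAD_ROW_THRESHOLD


print(should_enable_bulk_load(500))
print(should_enable_bulk_load(2_000_000))
print(should_enable_bulk_load(500, hdfs_based=True))
```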

Cache Best Practices
Connection pool

The cache system is a DBMS accessed through a JDBC connection pool, which must
be configured properly.
■ The database used as cache should not also be used as a data source.
■ Overusing the cache may end up making the system slower:
  ■ Views with the cache enabled need to query the cache even if the data is not actually cached.
  ■ The system will be slower if the waiting times in the cache connection pool are longer than
  the time needed to query the original source.
■ Monitor the connection pool to ensure it is properly sized!
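
As a back-of-the-envelope aid for sizing, the Python sketch below applies Little's Law (average concurrency equals arrival rate multiplied by the mean time a connection is held) and the comparison described above between pool waiting time and source query time. Both functions and all the numbers are hypothetical; real figures should come from monitoring the pool.

```python
import math


def estimated_connections_needed(queries_per_second: float, mean_hold_seconds: float) -> int:
    """Little's Law: average concurrency = arrival rate x mean time a connection is held."""
    return math.ceil(queries_per_second * mean_hold_seconds)


def pool_wait_exceeds_source(pool_wait_ms: float, source_query_ms: float) -> bool:
    """True when waiting for a cache connection takes longer than querying the source."""
    return pool_wait_ms > source_query_ms


# Assumed workload: 40 cache queries per second, each holding a connection for 0.25 s.
needed = estimated_connections_needed(40, 0.25)
print("connections needed on average:", needed)
print("cache slower than source:", pool_wait_exceeds_source(pool_wait_ms=800, source_query_ms=300))
```

The estimate is an average, so a pool sized exactly to it would still queue requests during peaks; some headroom above that figure is usually sensible.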

Cache Best Practices
Database Maintenance

Since the cache database will see intensive insert/delete usage:
■ Its tables should be compacted accordingly to avoid fragmentation of its
data files.
■ Indexes can be added to cache tables to improve cache queries.
  ■ Administrators must add only the indexes that are really needed, because indexes can make
  other cache operations slower.
■ It is good practice to rebuild indexes periodically.

References
Denodo Platform 8.0 at Denodo Community

Denodo Platform 8.0 Reference Manuals:
■ Configuring the Cache
■ Bulk Data Load
■ Best practices
■ Best Practices to Maximize Performance III: Caching

www.denodo.com info@denodo.com

© Copyright Denodo Technologies. All rights reserved


Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying
and microfilm, without the prior written authorization of Denodo Technologies.
