End-to-End Hadoop Development Using OBIEE, ODI and Oracle Big Data Discovery
E : info@rittmanmead.com
W : www.rittmanmead.com
Twitter Stream
Step 1
Ingesting Basic Dataset into Hadoop
[Diagram: Hadoop data flow through Loading, Stage, Process and Publish. Inputs: Real-Time Logs / Events and File / Unstructured Imports. Components: Flume, MongoDB, HDFS, Hive, Spark, Exadata. Other labels: Dim Attributes, SQL for BDA Exec.]
[Diagram: OBIEE, TimesTen, Exalytics 12c In-Mem and Big Data SQL. Label: Filtered & Projected Rows / Columns.]
What is MongoDB?
Open-source document-store NoSQL database
Flexible data model: each document (record) can have its own JSON schema
Highly scalable across multiple nodes (shards)
MongoDB databases are made up of collections of documents
Add new attributes to a document just by using them
Single-table (collection) design, no joins
Very useful for holding JSON output from web apps, for example Twitter data from DataSift
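The flexible-schema point can be illustrated without a running MongoDB server. The sketch below models a collection as a plain Python list of dictionaries; the field names (`user`, `text`, `hashtags`, `retweets`) and the helpers `attribute_names` and `find` are made up for illustration and are not taken from the DataSift feed or from any MongoDB API.

```python
import json

# A "collection" modelled as a list of JSON-like documents.
# Field names are hypothetical, chosen only for illustration.
tweets = [
    {"user": "alice", "text": "Trying out ODI"},
    {"user": "bob", "text": "OBIEE dashboards", "hashtags": ["obiee"]},
]

# Adding a new attribute needs no schema change: the document just uses it.
tweets.append({"user": "carol", "text": "Hadoop!", "retweets": 3})


def attribute_names(collection):
    """Return the sorted union of attribute names used across all documents."""
    names = set()
    for document in collection:
        names.update(document)
    return sorted(names)


def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        document
        for document in collection
        if all(document.get(key) == value for key, value in criteria.items())
    ]


if __name__ == "__main__":
    print(attribute_names(tweets))
    print(json.dumps(find(tweets, user="bob"), indent=2))
```

Against a real server, a client library such as pymongo would take the place of the list, with `insert_one` and `find` on a collection playing roughly the same roles as `append` and the `find` helper here.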
Step 2
Processing, Enhancing & Transforming Data
[Screenshot: Hive table description listing 16 columns, each with the comment "from deserializer"]
Step 3
Publishing & Analyzing Data
Combined output in report form
access_per_post_categories.ip_integer
BETWEEN geoip_country.start_ip_int
AND geoip_country.end_ip_int
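The join condition above matches an IP address, stored as an integer, against country ranges. As a rough sketch of the same lookup outside SQL, the Python below converts a dotted IPv4 address to an integer and searches a small range table. The range values and the country codes `AA` and `BB` are placeholders, not real GeoIP data, and the code assumes the ranges do not overlap.

```python
import bisect
import ipaddress


def ip_to_int(ip):
    """Convert a dotted IPv4 address such as '192.0.2.10' to an integer."""
    return int(ipaddress.IPv4Address(ip))


# Hypothetical rows of (start_ip_int, end_ip_int, country), sorted by start.
# Real GeoIP tables are far larger; these values are illustrative only.
GEOIP_COUNTRY = sorted([
    (ip_to_int("10.0.0.0"), ip_to_int("10.0.0.255"), "AA"),
    (ip_to_int("192.0.2.0"), ip_to_int("192.0.2.255"), "BB"),
])
_STARTS = [row[0] for row in GEOIP_COUNTRY]


def lookup_country(ip):
    """Return the country whose range contains ip, or None if no range does."""
    ip_integer = ip_to_int(ip)
    # Find the last range that starts at or before ip_integer.
    index = bisect.bisect_right(_STARTS, ip_integer) - 1
    if index < 0:
        return None
    start_ip_int, end_ip_int, country = GEOIP_COUNTRY[index]
    # Same test as: ip_integer BETWEEN start_ip_int AND end_ip_int
    return country if start_ip_int <= ip_integer <= end_ip_int else None


if __name__ == "__main__":
    for address in ("192.0.2.10", "10.0.0.255", "8.8.8.8"):
        print(address, ip_to_int(address), lookup_country(address))
```

The binary search is a design choice for the sketch: a range condition like this cannot be served by a plain equality join, so sorting the ranges once and searching by start value keeps each lookup cheap.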
Step 4
Enabling for Data Discovery
[Diagram: BDD Data Processing with Spark, Hive and HDFS, plus BDD Studio Web UI, BDD DGraph Gateway, HDFS Client and Hive Client. Callouts: Ingest semi-process logs (1m rows); Ingest processed Twitter activity; Upload Site page and comment contents; Write-back Transformations to full datasets; Data Discovery using Studio web-based app.]
[Chart: Cost and Accuracy plotted against Amount of data queried]
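The trade-off between cost, accuracy and amount of data queried can be sketched with a simple sampling experiment. The code below is only an illustration under stated assumptions: the synthetic `page_load_ms` values, the sample sizes and the helper names are all invented here, and it says nothing about how any Oracle product samples data.

```python
import random
import statistics


def sample_mean(values, sample_size, seed=0):
    """Estimate the mean of values from a random sample of the given size."""
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    return statistics.fmean(sample)


def compare_sample_sizes(values, sizes, seed=0):
    """Return (sample_size, absolute_error) pairs against the true mean."""
    true_mean = statistics.fmean(values)
    return [
        (size, abs(sample_mean(values, size, seed) - true_mean))
        for size in sizes
    ]


if __name__ == "__main__":
    # Synthetic data standing in for a large table; values are made up.
    rng = random.Random(42)
    page_load_ms = [rng.gauss(200, 50) for _ in range(100_000)]
    for size, error in compare_sample_sizes(page_load_ms, [100, 10_000, 100_000]):
        # Rows read is a stand-in for query cost.
        print(f"rows read: {size:>7}  absolute error of mean: {error:.4f}")
```

Reading every row gives an exact answer at the highest cost, while a smaller sample is cheaper but only approximates it.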
{
"@class" : "com.oracle.endeca.pdi.client.config.workflow.ProvisionDataSetFromHiveConfig",
"hiveTableName" : "rm_linked_tweets",
"hiveDatabaseName" : "default",
"newCollectionName" : edp_cli_edp_a5dbdb38-b065,
"runEnrichment" : true,
"maxRecordsForNewDataSet" : 1000000,
"languageOverride" : "unknown"
}
[Diagram: Apache Spark processing for BDD, with Profiling and Enrichment applied to the pageviews table (>1m rows)]
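The `maxRecordsForNewDataSet` setting suggests that tables larger than the cap are loaded as a sample. A minimal sketch of that arithmetic follows, assuming the cap acts as a simple upper bound on the rows loaded; the function name `sampling_plan` and its return fields are hypothetical, and the actual sampling method used by BDD is not shown in the slides.

```python
def sampling_plan(row_count, max_records=1_000_000):
    """Work out whether a table must be sampled to fit under max_records.

    Assumption: the cap is a plain upper bound on the rows loaded.
    """
    if row_count < 0:
        raise ValueError("row_count must not be negative")
    if row_count <= max_records:
        # The whole table fits, so no sampling is needed.
        return {"sampled": False, "fraction": 1.0, "expected_rows": row_count}
    # Otherwise keep roughly max_records rows out of row_count.
    return {
        "sampled": True,
        "fraction": max_records / row_count,
        "expected_rows": max_records,
    }


if __name__ == "__main__":
    for rows in (500_000, 4_000_000):
        print(rows, sampling_plan(rows))
```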
Results filtered on selected refinement
Further refinement on OBIEE in post keywords
Summary
Oracle Big Data, together with OBIEE, ODI and Oracle Big Data Discovery
Complete end-to-end solution with engineered hardware and Hadoop-native tooling