Graph Databases (GDB) : Adrian Silvescu Doina Caragea Anna Atramentov

You might also like

Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 24

Graph Databases (GDB)

Adrian Silvescu
Doina Caragea
Anna Atramentov
Problems And Motivations

• The necessity to represent, store and manipulate


complex data make RDBMS somewhat obsolete
• [P1] Problem 1: Violations of the 1NF
– Multi-valued attributes
– Complex attributes
– Complex combination of the previous two
• [P2] Problem 2 : Accommodate Changes
– Appears when acquiring data from autonomous
dynamic sources or Web (eg: genexp & restaurants ).
– RDBMS may require schema renormalization
Problems and Motivations (contd)

• [P3] Problem 3: Unified representation for:


– Data
– Knowledge (Schemas are a subset of this)
– Queries (More generally: Goals) [results+def]
– Models (Concepts are a particular example)
• In order to facilitate the application of
learning and reasoning methods on these
structures
The Big Picture USER

Discovery & Mode lli ng Que ry = Goal Specification

Information
Exploita tion Informa tion

AVAILABLE
INFORMA TION
IN THE WORLD
INFORMA TION INTEGRATION
Informa tion
Integra tion

... ... ...

KNOWLEDGE LINK DATA


SOURCES SOURCES SOURCES
Existing Approaches
• RDBMS – may need schema renormalisation
• Approaches that try to fix the above mentioned
problems:
– OO Databases [P1], [P2] - graphs [but procedural]
– XML Databases [P1] (somewhat [P3]) – trees
– OORDBMS [P1] – graphs with foreign keys
• Others
– Datalog – More Efficient Crippled Prolog
– Network Models - graphs
– Hierarchical Models – trees
• Therefore the motivation for Graph Databases
Outline
• Graph Databases
– Examples
– DDL
– DML – Queries
– DML – UPDATES
– Informal Semantics
– DB => GDB
• GDB vs. OO, XML, OR, …
• Conclusions and Further Work
Graph Databases
• We propose a new kind of Database: Graph
Databases (GDB) as a solution to Problems [P1],
[P2] and [P3].
• In order to define the GDB we will specify:
– The Data Definition Language (DDL)
– The Query Language (more generally DML)
– Informal Semantics of the above languages
• We will also show how to convert existing DBs (RDBMS)
into the GDB DDL to facilitate the transition to GDBs
Goals and Design Choice

• Goals
– Declarativity
– Change
• Design Choice : Have unique instance
identifiers vs. having foreign keys
– Close in Spirit to OO
– Will allow us to cope easier with Change
– Declarativity is an issue in OO, but not for GDB as we
will show
Database Representation
• Sailors(sid:integer, sname:char(10), rating: integer, age:real)
• Boats(bid:integer, bname:char(10), color:char(10))
• Reserve(sid:integer, bid:integer, day:date)

Sailors Reserves Boats

sid sname rating age sid bid day bid bname color

22 dustin 7 45.0 101 Interlake red


22 101 10/10/96
31 lubber 8 55.5 102 Clipper green
58 103 11/12/96
58 rusty 10 35.0 103 Marine red
22
Graph Representation sid

name dustin
IOF ID1
rating 7

IO Sailor
age 45.0
TB F s
L
sid 31
IOF
IOF
name lubbe
IOF ID2 r
Boat
s rating 8
Reserves
IOF IOF age 55.5

IOF ID8
ID6 ID3

ID4 ID5 : ID7 :
: :
Foreign Keys 22
sid

name dustin
ID1
rating 7

age 45.0
Sailor

sid 22

day 10/10/96
ID4

bid 101
Boat
bid 101
ID6
bname Interlake

color red
Data Representation in the GDB DDL
Name1 Val
1
Name2 Val
ID 2
……
Name Val
N N

• ID:(Name1=Val1,…,NameN=ValN)
Examples:
ID1:(sid=22, name=“Dustin”, rating=7, age=45.0)
ID4:(sailor=ID1, day=“10/10/96”, boat=ID6)
ID6:(bid=101, bname=“Interlake”, color=“red”)
Defining New Concepts in GDB DDL– Grandson

Person Person Person

IOF IOF IOF

_ID1 GrSon _ID2 :- _ID1 Son _ID3 Son _ID2

GrSon
_ID1:(GrSon=_ID2) :- _ID1:
(IOF=“Person”,Son=_ID3),
_ID3:(IOF=“Person”, Son=_ID2),
_ID2:(IOF=“Person”).
[DML-QL] Writing simple queries:
•The names of all sailors who have reserved a red boat

_X Sailors _X Boats

Name IOF Name IOF

_ID :- _ID Boat _ID1 Color Red

_ID:(Name = _X) :-
_ID:(IOF = Sailors, Boat = _ID1, Name = _X),
_ID1:(IOF = Boats, Color = Red).
Informal Semantics

• Three kinds of Definitions


– Facts:
• G1. [Extensional definition]
– Definitions:
• G1 :- G2. [Intensional definition]
• G1 :- PROC f(x1,…,xn). [Procedural definition]
• Queries = Graphs to be Matched = QG
– The same as a definition: Query :- QG.
Informal Semantics - Picture Query

Query match

Facts
Extended Graph
RDBMS => GDB
• Sailors(sid:integer, sname:char(10), rating: integer, age:real)
• Boats(bid:integer, bname:char(10), color:char(10))
• Reserve(sid:integer, bid:integer, day:date)

Sailors Reserves Boats

sid sname rating age sid bid day bid bname color

22 dustin 7 45.0 101 Interlake red


22 101 10/10/96
31 lubber 8 55.5 102 Clipper green
58 103 11/12/96
58 rusty 10 35.0 103 Marine red
DML - Updates
• Inline Query
– _X : [ _ID: (IOF = Sailor, sname = lubber, rating = _X)]
• Updates: MODIFY (QryGraph, UpdList)
– Add to all GrandSons the money of the Grandparent as a
potential inheritance.
– MODIFY ( _ID : (GrSon = _ID1),
(=> NEWID:(IOF = POT_INHER, BENFICIARY =
_ID1, AMOUNT = _AMNT: [ _ID:(Money =
_AMNT )]) ))
– .
Change

Book

IOF
Title Databases
_ID
Author Ramakrishan

Author Gehrke
Gene_ex
Change

IOF

_ID value 0.7


IOF _ID1 value x1

x / Exp IOF _ID2 value x2


:
IOF _IDn value xN

Aggregate
Operation
High Order Queries

Find all the fields from Tables that contain the name John.

_ID:(Name=_X) :-
_ID1:(IOF=Tables),
_ID2:(IOF=_ID1, _X=“John”).
GDB vs. OO, XML, OR, CG, …
• GDB are close in spirit to OO but not the same
(GDB : no encapsulation + more IDs).
• Close To Datalog but with IDs(links) vs foreign
keys
• The same for ORDBMs and somewhat XML
• Close to Conceptual Graphs But CG do not have
IDs
• We can also use foreign keys:
_ID:[ _ID(IOF = Sailors, sname = lubber)].
OO vs. GDB

OO GDB

ID1 Person

Class: Person IOF


age: 42 age 42 Car
name: john
car ID1
name John IOF
ID2

Class: Car car ID2


color: red
color

red
Conclusions and Further work
• Advantages & Disadvantages of GDB
• Conclusions
– Proposed a new database model DBG that copes with
problems existing in previous approaches
– Showed how to link existing DB with GDB
– Showed advantages of GDB over existing database
approaches
• Further Work
– Use GDB for learning

You might also like