
SQL Server 2017 introduced graph database extensions. Because there are
millions of SQL Server users worldwide, this feature greatly broadens
the audience of potential users. But what exactly should you expect from a graph
database? How do you query it? Is SQL Server fully featured compared to other
products? This session answers these questions. We start by illustrating the
concepts behind the model: how relationships are handled, what the
common patterns and issues for a graph are, and which data connections a
graph can easily represent. Then we compare the semantic model with SQL
Server to discover how to apply it to the real world. We analyze some case studies:
pattern matching, path finding, aggregation, ranking, … For each of them we
show how to use standard T-SQL and how to rewrite the query using graph
objects. Finally, we look at the benefit of reformulating our queries in terms of clarity
and performance, and at what is already available that makes SQL Server a
valuable player.
BIG Thanks to SQL Sat Denmark sponsors
GOLD

SILVER

BRONZE
Raffle and goodbye Beer
Remember to visit the sponsors, stay for the raffle and
goodbye beers ☺

Join our sponsors for a lunch break session in: cust 0.01 and cust 1.06

We hope you’ll all have a great Saturday.

Regis, Kenneth
Speaker Info
• First name: Andrea. Last name: Martorana Tusa.
• Microsoft MVP Data Platform
• Title: BI Specialist (database, data warehouse, cubes, reporting, big
data, …)
• Company: Widex, a Danish manufacturer of hearing aids.
• Speaker for many community-driven events: SQL Saturday (all Europe),
Power BI Summit (Ireland), Power BI World Tour (Denmark), SQL
Konferenz (Germany), Data Minds Connects (Belgium), SQL Nexus
(Denmark), Intelligent Cloud (Denmark), SQL Days (Poland), …
• Author for sqlservercentral.com, sqlshack.com, UGISS (User Group
Italiano SQL Server).
andrea.martoranatusa@gmail.com twitter: @bruco441
Agenda
• Introducing graph databases
• Graphs and SQL Server 2017
• Query syntax and T-SQL graph extensions in SQL Server 2017
• Rethink your code: how to query a graph database
Introducing graphs
What is a graph?
A graph is a collection of Nodes and Edges
• Nodes = Entities (customers, products, territories,
…)
• Edges = Relationships between entities
• Properties = Node or Edge attributes
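The three building blocks above can be sketched in a few lines of Python. This is only a conceptual model, not how SQL Server stores a graph; the class names, the BOUGHT relationship and the sample data are assumptions made up for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                   # entity type, e.g. "Customer"
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    name: str                                    # relationship name (hypothetical)
    start: Node                                  # edges are directed: start -> end
    end: Node
    properties: dict = field(default_factory=dict)


def outgoing(node, edges):
    """Return the edges that originate at the given node."""
    return [e for e in edges if e.start is node]


# Invented sample data: one customer, one product, one relationship.
alice = Node("Customer", {"name": "Alice"})
bike = Node("Product", {"name": "Bike"})
bought = Edge("BOUGHT", alice, bike, {"quantity": 2})
```

Calling `outgoing(alice, [bought])` returns the single BOUGHT edge, while `outgoing(bike, [bought])` returns an empty list because the edge is directed.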
What is a graph?
Node
Represents an entity: Product, Customer, Supplier, …
Nodes can contain properties
Nodes can be labeled with one or more labels
Stored as a physical table in the database
What is a graph?
Edges
Relationships between entities
Relationships are named and directed, and always
have a start and an end node
Relationships can also contain properties
Stored as a physical table in the database
What is a graph?
A graph database adds the standard CRUD operations available in
every DBMS (Create, Read, Update, Delete) to the graph features
(nodes and edges).
What is a graph?
The strength of graph databases lies in the concepts of
relationships and connections between elements.
In a graph database, connected data is stored as connected
data.
In a relational database, data is stored in different tables
linked through joins.
Furthermore, the flexibility of a graph model allows new nodes
and new edges to be added without compromising the
existing structure.
What is a graph?
[Figure: a sample graph with its nodes and edges labeled]
What is a graph?
[Figure: Twitter users, relationships and messages. Source: Robinson, Webber, Eifrem, ”Graph Databases”, O’Reilly]
Why use a graph?
Typical use cases for a graph database:
• Real-Time Recommendation Engines – correlate product, customer,
inventory, supplier, logistics and even social sentiment data. Instantly capture
any new interests shown in the customer’s current visit
• Master Data Management – unify your master data, including customer,
product, supplier and logistics information to power the next generation of
eCommerce, fraud detection, supply chain and logistics applications
• Social networks – people and interactions, and how they are related on social
media: who follows whom on Twitter, who is a friend of whom on Facebook
• Network & IoT – interconnections of devices and machines sending signals to
a receiver
• Identity & Access Management – manage multiple changing roles,
groups, products and authorizations; seamlessly track all identity and access
authorizations and inheritances
Graphs and SQL Server 2017
SQL graph database architecture

Source: https://docs.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-architecture
Writing queries in a graph database
• There’s nothing you can do in a graph database
that cannot be done in a relational database
• A graph database can handle some relationships
more easily than a relational database. Some queries
can be written in a more linear way
• A graph database is optimized for highly connected
data
SQL graph database architecture
• Users can create one graph per database
• Two new table types: node tables and edge tables
• Node or edge tables can be created under any schema in the
database
• Since nodes and edges are stored in tables, most of the
operations supported on regular tables are supported on node or
edge tables
• Users can model many-to-many relationships using edge tables.
A single edge type can connect multiple types of nodes with each
other, in contrast to foreign keys in relational tables.
Node tables
• Every time a node table is created, along with the
user-defined columns, an implicit $node_id column
is created, which uniquely identifies a given node in
the database
• The values in $node_id are automatically generated
and are a combination of object_id of that node
table and an internally generated bigint value
• When the $node_id column is selected, a computed
value in the form of a JSON string is displayed
Edge tables
• Edges are always directed and connect two nodes. An edge table
enables users to model many-to-many relationships in the graph
• Every time an edge table is created, along with the user-defined
attributes, three implicit columns are created in the edge table:
1. $edge_id: Uniquely identifies a given edge in the database
2. $from_id: Stores the $node_id of the node, from where the
edge originates
3. $to_id: Stores the $node_id of the node, at which the edge
terminates
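As a rough illustration of the node and edge columns described above, the following Python sketch uses the standard sqlite3 module as a stand-in for SQL Server. The node_id, edge_id, from_id and to_id columns only imitate the implicit $node_id, $edge_id, $from_id and $to_id columns (which the real engine generates itself), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Two "node tables": one row per entity, identified by node_id.
CREATE TABLE Person     (node_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Restaurant (node_id INTEGER PRIMARY KEY, name TEXT);

-- One "edge table": its own id, the two endpoints, plus a user attribute.
CREATE TABLE Likes (
    edge_id INTEGER PRIMARY KEY,
    from_id INTEGER NOT NULL,   -- node where the edge originates (a Person)
    to_id   INTEGER NOT NULL,   -- node where the edge terminates (a Restaurant)
    rating  INTEGER
);

INSERT INTO Person     VALUES (1, 'John'), (2, 'Mary');
INSERT INTO Restaurant VALUES (1, 'Pizza Hut');
INSERT INTO Likes (from_id, to_id, rating) VALUES (1, 1, 8), (2, 1, 9);
""")

# Follow each edge from its origin node to its target node.
rows = conn.execute("""
    SELECT p.name, r.name, l.rating
    FROM Likes AS l
    JOIN Person     AS p ON p.node_id = l.from_id
    JOIN Restaurant AS r ON r.node_id = l.to_id
    ORDER BY p.name
""").fetchall()
```

Unlike a real edge table, this simplified Likes table is tied to one fixed pair of node tables, because the plain integer ids do not carry the identity of the table they belong to.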
Node and edge tables in SQL Server
Query syntax and T-SQL graph extensions in SQL Server 2017
T-SQL Extensions
• CREATE TABLE … AS NODE / AS EDGE
• CREATE EDGE CONSTRAINTS: enforce specific semantics and maintain data integrity
• MATCH: built-in function to support pattern matching and traversal through the graph
T-SQL Extensions
MATCH
Specifies a search condition for a graph. MATCH can be used only with graph
node and edge tables, in the SELECT statement as part of the WHERE clause.

Syntax: node-(edge)->node or node<-(edge)-node

From one node to another via an edge
Edge names go inside parentheses
Easier than a relational JOIN
T-SQL Extensions
Cypher:
START a=node:user(name='Michael')
MATCH (a)-[:KNOWS]->(b)-[:KNOWS]->(c), (a)-[:KNOWS]->(c)
RETURN b, c

T-SQL:
SELECT Person2.name AS FriendName
FROM Person Person1, friend, Person Person2
WHERE MATCH(Person1-(friend)->Person2)
AND Person1.name = 'Michael'

Cypher clauses:
• START: starting point in the graph
• MATCH: relationship pattern
• RETURN: what data to return to the query engine

T-SQL clauses:
• SELECT
• FROM
• WHERE MATCH
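For comparison with the MATCH pattern, here is a hedged sketch of the same one-hop friend lookup written as plain relational joins. It runs in Python against sqlite3 rather than SQL Server; the id, from_id and to_id columns and every name except 'Michael' are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE friend (from_id INTEGER, to_id INTEGER);
INSERT INTO Person VALUES (1, 'Michael'), (2, 'Anna'), (3, 'Luca');
INSERT INTO friend VALUES (1, 2), (1, 3), (2, 3);
""")

# Person1 -(friend)-> Person2 spelled out as two explicit joins.
friends = [row[0] for row in conn.execute("""
    SELECT p2.name AS FriendName
    FROM Person AS p1
    JOIN friend AS f  ON f.from_id = p1.id
    JOIN Person AS p2 ON p2.id = f.to_id
    WHERE p1.name = 'Michael'
    ORDER BY p2.name
""")]
```

Each hop in the pattern costs two join conditions here, which is the bookkeeping the arrow notation hides.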
T-SQL Extensions
New system functions to extract information from the generated
columns.
• OBJECT_ID_FROM_NODE_ID
• GRAPH_ID_FROM_NODE_ID
• NODE_ID_FROM_PARTS

• OBJECT_ID_FROM_EDGE_ID
• GRAPH_ID_FROM_EDGE_ID
• EDGE_ID_FROM_PARTS
Rethink your code: how to query a graph database
Demo: social relationship
Relational DB
Tables (linked by 1:∞ relationships):
• Person (PersonID, PersonName)
• City (CityID, CityName)
• Restaurant (RestaurantID, RestaurantName)
• LivesIn (PersonID, CityID)
• FriendOf (PersonID1, PersonID2)
• Likes (PersonID, RestaurantID, Rating)
• LocatedIn (CityID, RestaurantID)
Demo: social relationship
Graph DB

Nodes
• Person
• City
• Restaurant

Edges
• LivesIn
• Likes
• LocatedIn
• Friends

https://docs.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-sample?view=sql-server-2017
Demo: social relationship
Search for all the restaurants that John’s friends like
Relational: Person1 (John) 1:∞ FriendOf ∞:1 Person2 (Mary) 1:∞ Likes (rating 9) ∞:1 Restaurant (Pizza Hut)
Graph: Person -(FriendOf)-> Person -(Likes)-> Restaurant
Demo: social relationship
Search for all the restaurants that John’s friends like
Relational: Person1 1:∞ FriendOf ∞:1 Person2 1:∞ Likes ∞:1 Restaurant
Graph: Person1 -(FriendOf)-> Person2 -(Likes)-> Restaurant
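The relational chain for this search can be sketched as follows, again in Python with sqlite3 standing in for the database engine. Table and column names follow the relational demo schema; the rows other than John, Mary, Alice and Pizza Hut (for instance the 'Noodle Bar' restaurant) are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person     (PersonID INTEGER PRIMARY KEY, PersonName TEXT);
CREATE TABLE Restaurant (RestaurantID INTEGER PRIMARY KEY, RestaurantName TEXT);
CREATE TABLE FriendOf   (PersonID1 INTEGER, PersonID2 INTEGER);
CREATE TABLE Likes      (PersonID INTEGER, RestaurantID INTEGER, Rating INTEGER);
INSERT INTO Person     VALUES (1, 'John'), (2, 'Mary'), (3, 'Alice');
INSERT INTO Restaurant VALUES (1, 'Pizza Hut'), (2, 'Noodle Bar');
INSERT INTO FriendOf   VALUES (1, 2), (1, 3);
INSERT INTO Likes      VALUES (2, 1, 9), (3, 1, 7), (1, 2, 8);
""")

# Person1 -> FriendOf -> Person2 -> Likes -> Restaurant: four joins for two hops.
query = """
    SELECT DISTINCT r.RestaurantName
    FROM Person     AS p1
    JOIN FriendOf   AS f  ON f.PersonID1 = p1.PersonID
    JOIN Person     AS p2 ON p2.PersonID = f.PersonID2
    JOIN Likes      AS l  ON l.PersonID = p2.PersonID
    JOIN Restaurant AS r  ON r.RestaurantID = l.RestaurantID
    WHERE p1.PersonName = ?
    ORDER BY r.RestaurantName
"""
restaurants = [row[0] for row in conn.execute(query, ("John",))]
```

John's own preference is not returned, because the query only follows the Likes rows of his friends.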
Hierarchies
Implement hierarchies in a database
• All columns in one table
• Different related tables
• Self-Join table
• Graph DB
❖ Nodes are related via Edges
❖ No redundancy
❖ Handle many-to-many relationships
❖ Handle multiple hierarchies
❖ Easy to write and read
Hierarchies
[Figure: Employee nodes (Alice, Ken, Peter) connected by "Is Manager" and "Is Report" edges]
Hierarchies - Limitations
• Complex traversals are currently not supported via MATCH. You cannot
browse up to the first level of the hierarchy in a single pattern
• You must wrap the query in a loop or a function (not a recursive CTE)
to perform transitive closure on your hierarchy
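The loop-based workaround can be pictured with a small Python sketch: each pass of the loop performs one single-hop lookup (the equivalent of one MATCH query) and stops when no new level is found. The edge list, its direction (employee reports to manager) and the extra employee 'Mia' are assumptions for the example.

```python
# Hypothetical "reports to" edges: (employee, manager).
reports_to = [("Peter", "Ken"), ("Ken", "Alice"), ("Mia", "Ken")]


def managers_above(employee, edges):
    """Collect every manager above `employee`, one hierarchy level per pass."""
    found = []
    frontier = {employee}
    while frontier:
        # One pass = one single-hop lookup from the current level.
        next_level = {mgr for emp, mgr in edges if emp in frontier}
        next_level -= set(found)            # guard against cycles
        found.extend(sorted(next_level))
        frontier = next_level
    return found
```

`managers_above("Peter", reports_to)` walks two levels up and returns Ken followed by Alice; for Alice, who has no manager, the loop ends after the first pass with an empty list.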
Graph Reporting in Power BI
Use the Force-Directed Graph custom visual in Power BI to visualize
connections between data
Merge relational and graph queries
Graph database features are integrated with the database
engine. You can work with graph and relational data side by
side, for instance by merging graph tables into an existing relational
structure.

Nodes
• Stores
• SalesReps
• Products
• Vendors

Edges
• Purchases
• Sells
• Supplies

https://www.red-gate.com/simple-talk/sql/sql-development/sql-server-graph-databases-part-1-introduction/
Route calculation and shortest path
[Figure: Global Post parcel network. Source: Robinson, Webber, Eifrem, ”Graph Databases”, O’Reilly]
Route calculation and shortest path
Graph database model associated with the network.
Two types of connections:
• CONNECTED_TO
• DELIVERY_ROUTE
Source: Robinson, Webber, Eifrem, ”Graph Databases”, O’Reilly
Route calculation and shortest path
Route calculations involve finding the cheapest route between two locations.
The requirement is that the calculated route must go via at least one
parcel center in the part of the graph.
Source: Robinson, Webber, Eifrem, ”Graph Databases”, O’Reilly
Route calculation and shortest path
Cypher query to calculate parcel route:
START s=node:location(name={startLocation}),
      e=node:location(name={endLocation})
MATCH upLeg = (s)<-[:DELIVERY_ROUTE*1..2]-(db1)
WHERE all(r in relationships(upLeg)
          WHERE r.start_date <= {intervalStart}
          AND r.end_date >= {intervalEnd})
WITH e, upLeg, db1
MATCH downLeg = (db2)-[:DELIVERY_ROUTE*1..2]->(e)
WHERE all(r in relationships(downLeg)
          WHERE r.start_date <= {intervalStart}
          AND r.end_date >= {intervalEnd})
WITH db1, db2, upLeg, downLeg
MATCH topRoute = (db1)<-[:CONNECTED_TO]-()-[:CONNECTED_TO*1..3]-(db2)
WHERE all(r in relationships(topRoute)
          WHERE r.start_date <= {intervalStart}
          AND r.end_date >= {intervalEnd})
WITH upLeg, downLeg, topRoute,
     reduce(weight=0, r in relationships(topRoute) : weight+r.cost) AS score
ORDER BY score ASC
LIMIT 1
RETURN (nodes(upLeg) + tail(nodes(topRoute)) + tail(nodes(downLeg))) AS n
Source: Robinson, Webber, Eifrem, ”Graph Databases”, O’Reilly
Route calculation and shortest path
A traversal-based implementation of the route calculation engine must solve
two problems:
finding shortest paths, and filtering paths based on time period.

The real solution for calculating these routes has been implemented using a
traversal framework engine available in a native graph database (Neo4J).

This kind of calculation cannot be replicated in the current version of SQL Server.
The database engine does not allow traversal functions and some clauses are
missing.

Source: Robinson, Webber, Eifrem, ”Graph Databases”, O’Reilly
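To make the two problems concrete, here is a minimal Python sketch that combines them: it first drops the edges that are not valid for the whole requested period, then runs a textbook Dijkstra search for the cheapest route. It is a generic illustration, not the Neo4J traversal framework solution from the book, and it ignores the parcel-center requirement; the edge tuple layout, the integer dates and the sample network are assumptions.

```python
import heapq


def cheapest_route(edges, start, end, interval_start, interval_end):
    """Dijkstra over the edges valid for the whole requested period.

    edges: iterable of (source, target, cost, start_date, end_date), directed.
    Returns (total_cost, [nodes]) or None when no valid route exists.
    """
    graph = {}
    for src, dst, cost, valid_from, valid_to in edges:
        # Time-period filter, same condition as in the Cypher query.
        if valid_from <= interval_start and valid_to >= interval_end:
            graph.setdefault(src, []).append((dst, cost))

    queue = [(0, start, [start])]
    visited = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == end:
            return total, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, cost in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (total + cost, neighbour, path + [neighbour]))
    return None


# Invented network; dates are plain day numbers.
edges = [
    ("A", "B", 5, 1, 10),
    ("B", "D", 5, 1, 10),
    ("A", "C", 2, 1, 10),
    ("C", "D", 3, 1, 4),    # cheap, but only valid up to day 4
]
```

With the period 1 to 3 the search goes through C for a total cost of 5; with the period 1 to 6 the C to D leg is filtered out and the route falls back to B at cost 10.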
Summary
What is ready
• Get rid of redundancy
• Write simple queries with MATCH pattern
• Handle highly connected data
• Handle many-to-many relationships
• Handle hierarchical data with multiple parents

What is missing
• Using nodes as derived tables in complex queries
• No Update for edges
• Edge constraints
• OR and NOT operators in MATCH pattern
• Traversal functions
• Routing
• Shortest path
Summary
What is coming (SQL Server 2019)
• Edge constraints
❖ Check on node types
❖ Check on existing nodes
❖ Force referential integrity
References
https://cloudblogs.microsoft.com/sqlserver/2017/04/20/graph-data-processing-with-sql-server-2017/
https://docs.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-overview?view=sql-server-2017
https://docs.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-architecture
https://www.red-gate.com/simple-talk/sql/sql-development/sql-server-graph-databases-part-1-introduction/
