Apache Jackrabbit Oak on MongoDB

Marcel Reutegger | Senior Software Engineer

© 2014 Adobe Systems Incorporated. All Rights Reserved.


Apache Jackrabbit Oak

About me
Software Engineer
At Day / Adobe since 2002
JCR API Specification
Apache member
Apache Jackrabbit / Oak



Contents

Adobe Experience Manager


Java Content Repository
Apache Jackrabbit Oak
Multiversion Concurrency Control
Transactions
Content Addressable Storage
Q&A



Adobe Experience Manager



Adobe Experience Manager – Technology Stack

OSGi Container

Web Framework

Java Content Repository





Adobe Experience Manager – Technology Stack

Java Content Repository / Apache Jackrabbit Oak

Storage backends: Tar, MongoDB, RDBMS





Java Content Repository – Why?

“Build me a web content management system!”

Easy: LAMP stack (Linux, Apache, MySQL, PHP)

Done in 2 weeks

Developer happiness
Java Content Repository – Why?

“Nice, but I want to organize my pages in a hierarchy.”

Apply a well-known hierarchical database model and update the application.

Done in 4 weeks

Developer happiness
Java Content Repository – Why?

“Can you please add structured and fulltext searches?”

Integrate with Apache Solr or Elasticsearch

Done in 4 weeks

Developer happiness
Java Content Repository – Why?

“I accidentally deleted the product page. We need to version our content.”

Introduce new tables and rewrite the application.

Done in 8 weeks

Developer happiness
Java Content Repository – Why?

“We cannot publish financial results unless the system has fine-grained access control.”

Introduce more tables and integrate with a directory server.

I’ll get back to you next year!

Developer happiness
Java Content Repository – Features

JSR-283 – JCR 2.0 released 2009

Hierarchical – Structured and binary data

Query – SQL, XPath and Java language binding

Access Control on Node and Property level

Versioning – Modeled after WebDAV DeltaV (RFC 3253)

Locking – Shallow or deep

Asynchronous Observation





Apache Jackrabbit Oak – Design

Cacheable

Customizable

Scalable: Support NoSQL Storage, Support Sharding



Apache Jackrabbit Oak – Design

Cacheable

Customizable: Pluggable Storage, Custom Index Definitions

Scalable



Apache Jackrabbit Oak – Design

Cacheable: Copy-On-Write, Multiversion Concurrency, Content Addressable Storage

Customizable

Scalable





Apache Jackrabbit Oak – MVCC & Copy-On-Write

[Figure: Rev 1 tree with nodes /a (children /a/1 and /a/2) and /b (child /b/1).]



Apache Jackrabbit Oak – MVCC & Copy-On-Write

[Figure: Rev 1 tree as before; a new node /b/2 is added under /b.]



Apache Jackrabbit Oak – MVCC & Copy-On-Write

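[Figure: Rev 2 is created by copying the parents of the new node /b/2: / is copied to /' and /b is copied to /b'. The nodes /a, /a/1, /a/2, /b/1 and /b/2 are not copied.]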



Apache Jackrabbit Oak – MVCC & Copy-On-Write

[Figure: Rev 1 (root /, with /b) and Rev 2 (root /', with /b') side by side; /a, /a/1, /a/2, /b/1 and /b/2 appear once.]

Concurrent access to Rev 1 and Rev 2
Apache Jackrabbit Oak – MVCC & Copy-On-Write

[Figure: collect garbage. The Rev 1 nodes / and /b are gone; Rev 2 remains with /', /a, /b', /a/1, /a/2, /b/1 and /b/2.]



Apache Jackrabbit Oak – MVCC & Copy-On-Write

[Figure: compact. Rev 2 with /', /a, /b', /a/1, /a/2, /b/1 and /b/2.]



Apache Jackrabbit Oak – MVCC & Copy-On-Write

⊕ Stable snapshot view of data
⊕ Writes do not block reads

⊖ Higher storage cost
⊖ Garbage collection
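
The copy-on-write steps from the preceding slides can be sketched in a few lines of Python. This is only an in-memory illustration, not Oak's implementation: the `Node` class and the `add_node` helper are hypothetical names introduced here. Adding /b/2 creates new copies of / and /b only, while every untouched node is shared between the two revisions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Immutable tree node (hypothetical); children are (name, Node) pairs."""
    children: Tuple[Tuple[str, "Node"], ...] = ()

    def child(self, name: str) -> Optional["Node"]:
        return dict(self.children).get(name)

    def with_child(self, name: str, node: "Node") -> "Node":
        """Return a copy of this node with one child added or replaced."""
        updated = dict(self.children)
        updated[name] = node
        return Node(tuple(sorted(updated.items(), key=lambda kv: kv[0])))


def add_node(root: Node, path: str) -> Node:
    """Return a new root that contains `path`; only the parents are copied."""
    names = [n for n in path.split("/") if n]

    def rec(node: Node, remaining):
        if not remaining:
            return node
        head, rest = remaining[0], remaining[1:]
        existing = node.child(head)
        if existing is None:
            existing = Node()
        return node.with_child(head, rec(existing, rest))

    return rec(root, names)


# Rev 1 as on the slides: /a/1, /a/2, /b/1
rev1 = Node()
for p in ["/a/1", "/a/2", "/b/1"]:
    rev1 = add_node(rev1, p)

# Rev 2 adds /b/2; Rev 1 stays readable and unchanged.
rev2 = add_node(rev1, "/b/2")
```

A reader holding `rev1` keeps a stable snapshot while a writer produces `rev2`. Once nothing references `rev1` any more, the old copies of / and /b become garbage, which is the storage cost listed above.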



Apache Jackrabbit Oak – The Data Model

{
  _id : "/home/john",
  name : "john",
  email : "john@example.com"
}

{
  _id : "2:/home/john",
  name : { "r14979e4b424-0-1" : "john" },
  email : { "r14979e4b424-0-1" : "john@example.com" },
  _deleted : { "r14979e4b424-0-1" : "false" },
  _revisions : { "r14979e4b424-0-1" : "c" }
}




Apache Jackrabbit Oak – The Data Model

{
  _id : "2:/home/john",
  name : { "r14979e4b424-0-1" : "john" },
  email : { "r14979e4b424-0-1" : "john@example.com" },
  _deleted : { "r14979e4b424-0-1" : "false" },
  _revisions : { "r14979e4b424-0-1" : "c" }
}

Revision r14979e4b424-0-1: Timestamp (14979e4b424), Counter (0), Cluster ID (1)
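
A minimal sketch of splitting a revision identifier into the three parts named on the slide. The `Revision` class is a hypothetical helper. It assumes all three parts are hexadecimal and that the timestamp counts milliseconds since the Unix epoch; the slide itself only names the fields.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Revision:
    timestamp_ms: int  # assumed: milliseconds since the Unix epoch
    counter: int       # distinguishes revisions created in the same millisecond
    cluster_id: int    # identifies the cluster node that created the revision

    @classmethod
    def parse(cls, text: str) -> "Revision":
        """Parse a string of the form r<timestamp>-<counter>-<cluster id>."""
        if not text.startswith("r"):
            raise ValueError(f"not a revision: {text!r}")
        timestamp, counter, cluster_id = text[1:].split("-")
        # Assumption: every part is written in hexadecimal.
        return cls(int(timestamp, 16), int(counter, 16), int(cluster_id, 16))

    def created(self) -> datetime:
        """Creation time in UTC, under the epoch-milliseconds assumption."""
        return datetime.fromtimestamp(self.timestamp_ms / 1000, tz=timezone.utc)


rev = Revision.parse("r14979e4b424-0-1")
```

Because the timestamp comes first, revisions from one cluster node sort by creation time when compared as `(timestamp_ms, counter)`.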



Apache Jackrabbit Oak – The Data Model

{
  _id : "2:/home/john",
  name : { "r14979e4b424-0-1" : "john" },
  email : {
    "r14979e4b424-0-1" : "john@example.com",
    "r14979e6a941-0-1" : "john.doe@example.com"
  },
  _deleted : { "r14979e4b424-0-1" : "false" },
  _revisions : {
    "r14979e4b424-0-1" : "c",
    "r14979e6a941-0-1" : "c"
  }
}

Change email to "john.doe@example.com"
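
To illustrate how such a document yields a stable snapshot, here is a simplified sketch of reading a property as of a given revision. The helpers `rev_key` and `read_property` are hypothetical, and the rule used (newest value whose revision is marked "c" in `_revisions` and is not newer than the read revision) is an assumption for illustration. It ignores cluster visibility and commit roots on other documents.

```python
def rev_key(rev: str):
    """Sort key for a revision string; assumes hexadecimal parts."""
    timestamp, counter, cluster_id = rev[1:].split("-")
    return (int(timestamp, 16), int(counter, 16), int(cluster_id, 16))


def read_property(doc: dict, name: str, read_rev: str):
    """Return the value of `name` visible at `read_rev`, or None."""
    committed = doc.get("_revisions", {})
    visible = [
        rev for rev in doc.get(name, {})
        if committed.get(rev) == "c" and rev_key(rev) <= rev_key(read_rev)
    ]
    if not visible:
        return None
    return doc[name][max(visible, key=rev_key)]


# The document from the slide above.
doc = {
    "_id": "2:/home/john",
    "name": {"r14979e4b424-0-1": "john"},
    "email": {
        "r14979e4b424-0-1": "john@example.com",
        "r14979e6a941-0-1": "john.doe@example.com",
    },
    "_deleted": {"r14979e4b424-0-1": "false"},
    "_revisions": {"r14979e4b424-0-1": "c", "r14979e6a941-0-1": "c"},
}

old_email = read_property(doc, "email", "r14979e4b424-0-1")
new_email = read_property(doc, "email", "r14979e6a941-0-1")
```

A reader that started at the first revision keeps seeing the old address even after the change is written, because the old value is never overwritten.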





Apache Jackrabbit Oak – Transactions

{
  _id : "2:/home/john",
  name : { "r14979e4b424-0-1" : "john" },
  _deleted : { "r14979e4b424-0-1" : "false" },
  _commitRoot : { "r14979e4b424-0-1" : "1" }
}

{
  _id : "3:/home/john/profile",
  avatar : { "r14979e4b424-0-1" : <bin> },
  _deleted : { "r14979e4b424-0-1" : "false" },
  _commitRoot : { "r14979e4b424-0-1" : "1" }
}
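
The two documents above share one revision and point to a commit root instead of carrying their own `_revisions` entry. The sketch below assumes that the `_commitRoot` value is the depth of the ancestor whose document records the commit, and that `_id` is built as `<depth>:<path>`. Both readings are inferred from the examples on these slides, and the helper names are hypothetical.

```python
def doc_id(path: str) -> str:
    """Build a document id of the assumed form <depth>:<path>."""
    depth = 0 if path == "/" else path.count("/")
    return f"{depth}:{path}"


def commit_root_id(document_id: str, commit_root_depth: int) -> str:
    """Id of the ancestor document at the given depth (assumed meaning)."""
    path = document_id.split(":", 1)[1]
    names = [n for n in path.split("/") if n]
    ancestor = "/" + "/".join(names[:commit_root_depth])
    return doc_id(ancestor)


root_for_john = commit_root_id("2:/home/john", 1)
root_for_profile = commit_root_id("3:/home/john/profile", 1)
```

Under these assumptions both changes resolve to the same document, so a single update on that document can make the whole multi-document change visible at once.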



Apache Jackrabbit Oak – Transactions

{
  _id : "1:/home",
  _deleted : { "r14979e1b312-0-1" : "false" },
  _revisions : {
    "r14979e1b312-0-1" : "c",
    "r14979e4b424-0-1" : "c"
  }
}

Conditional update for commit:

{
  _id : "1:/home",
  "_collisions.r14979e4b424-0-1" : { $exists : false }
}



Apache Jackrabbit Oak – Transactions

{
  _id : "3:/home/john/profile",
  avatar : {
    "r14979e4b424-0-1" : <bin>
  },
  _deleted : {
    "r14979e4b424-0-1" : "false"
  },
  _commitRoot : {
    "r14979e4b424-0-1" : "1"
  }
}



Apache Jackrabbit Oak – Transactions

T-1

{
  _id : "3:/home/john/profile",
  avatar : {
    "r14979e4b424-0-1" : <bin>,
    "r14979e6c7a2-0-1" : <bin>
  },
  _deleted : {
    "r14979e4b424-0-1" : "false"
  },
  _commitRoot : {
    "r14979e4b424-0-1" : "1",
    "r14979e6c7a2-0-1" : "1"
  }
}



Apache Jackrabbit Oak – Transactions

T-1 T-2

{
  _id : "3:/home/john/profile",
  avatar : {
    "r14979e4b424-0-1" : <bin>,
    "r14979e6c7a2-0-1" : <bin>,
    "r14979e6c7a3-0-1" : <bin>
  },
  _deleted : {
    "r14979e4b424-0-1" : "false"
  },
  _commitRoot : {
    "r14979e4b424-0-1" : "1",
    "r14979e6c7a2-0-1" : "1",
    "r14979e6c7a3-0-1" : "1"
  }
}



Apache Jackrabbit Oak – Transactions

T-2

{
  _id : "1:/home",
  _deleted : {
    "r14979e1b312-0-1" : "false"
  },
  _revisions : {
    "r14979e1b312-0-1" : "c"
  },
  _collisions : {
    "r14979e6c7a2-0-1" : "true"
  }
}

Conditional update for collision marker:

{
  _id : "1:/home",
  "_revisions.r14979e6c7a2-0-1" : { $exists : false }
}



Apache Jackrabbit Oak – Transactions

T-1 T-2

{
  _id : "1:/home",
  _deleted : {
    "r14979e1b312-0-1" : "false"
  },
  _revisions : {
    "r14979e1b312-0-1" : "c"    ✗
  },
  _collisions : {
    "r14979e6c7a2-0-1" : "true"
  }
}

Conditional update for commit:

{
  _id : "1:/home",
  "_collisions.r14979e6c7a2-0-1" : { $exists : false }
}



Apache Jackrabbit Oak – Transactions

T-1 T-2

{
  _id : "1:/home",
  _deleted : {
    "r14979e1b312-0-1" : "false"
  },
  _revisions : {
    "r14979e1b312-0-1" : "c",
    "r14979e6c7a3-0-1" : "c"
  },
  _collisions : {
    "r14979e6c7a2-0-1" : "true"
  }
}

Conditional update for commit:

{
  _id : "1:/home",
  "_collisions.r14979e6c7a3-0-1" : { $exists : false }
}
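
The interplay of the two conditional updates can be replayed with a small in-memory simulation. This is a sketch under stated assumptions, not Oak code: `Store`, `try_commit` and `mark_collision` are hypothetical names, and the check-then-set here is only safe single-threaded. The slides rely on the database applying each conditional update to one document atomically.

```python
class Store:
    """Hypothetical in-memory stand-in for the document collection."""

    def __init__(self):
        self.docs = {}

    def update_if_absent(self, doc_id, absent_field, key, set_field, value):
        """Set doc[set_field][key] = value unless doc[absent_field][key] exists."""
        doc = self.docs[doc_id]
        if key in doc.get(absent_field, {}):
            return False  # condition { $exists : false } not met
        doc.setdefault(set_field, {})[key] = value
        return True


def try_commit(store, root_id, rev):
    """Commit `rev` on the commit root unless a collision marker exists."""
    return store.update_if_absent(root_id, "_collisions", rev, "_revisions", "c")


def mark_collision(store, root_id, other_rev):
    """Mark `other_rev` as colliding unless it is already committed."""
    return store.update_if_absent(root_id, "_revisions", other_rev, "_collisions", "true")


store = Store()
store.docs["1:/home"] = {
    "_id": "1:/home",
    "_deleted": {"r14979e1b312-0-1": "false"},
    "_revisions": {"r14979e1b312-0-1": "c"},
}
t1, t2 = "r14979e6c7a2-0-1", "r14979e6c7a3-0-1"

marked = mark_collision(store, "1:/home", t1)   # T-2 detects T-1's change
t1_committed = try_commit(store, "1:/home", t1)  # T-1 finds the marker
t2_committed = try_commit(store, "1:/home", t2)  # T-2 has no marker against it
```

The two conditions are mirror images: a revision can be committed only if it has no collision marker, and it can be marked only if it is not yet committed. Whichever update reaches the commit root first decides the conflict.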



Apache Jackrabbit Oak – Transactions

T-1

{
  _id : "3:/home/john/profile",
  avatar : {
    "r14979e4b424-0-1" : <bin>,
    "r14979e6c7a2-0-1" : <bin>,
    "r14979e6c7a3-0-1" : <bin>
  },
  _deleted : {
    "r14979e4b424-0-1" : "false"
  },
  _commitRoot : {
    "r14979e4b424-0-1" : "1",
    "r14979e6c7a2-0-1" : "1",
    "r14979e6c7a3-0-1" : "1"
  }
}





Apache Jackrabbit Oak – Content Addressable Storage

[Figure: GridFS stores a file as a sequence of chunks: chunk 0, chunk 1, ..., chunk N.]



Apache Jackrabbit Oak – Content Addressable Storage

[Figure: a Binary is handed to Oak, which is backed by a Node Storage and a Binary Storage. The Binary Storage holds chunks 0x38a7, 0x8f91 and 0xc92a.]



Apache Jackrabbit Oak – Content Addressable Storage

[Figure: Binary, Hash: 0x38a7. Oak records 0x38a7. The Binary Storage holds chunks 0x38a7, 0x8f91 and 0xc92a.]



Apache Jackrabbit Oak – Content Addressable Storage

[Figure: Binary, Hash: 0x8f91. Oak records 0x38a7 and 0x8f91. The Binary Storage holds chunks 0x38a7, 0x8f91 and 0xc92a.]



Apache Jackrabbit Oak – Content Addressable Storage

[Figure: Binary, Hash: 0x52f1. Oak records 0x38a7, 0x8f91 and 0x52f1. The Binary Storage now holds chunks 0x38a7, 0x8f91, 0x52f1 and 0xc92a.]



Apache Jackrabbit Oak – Content Addressable Storage

[Figure: Oak holds the hashes 0x38a7, 0x8f91 and 0x52f1. The Binary Storage holds chunks 0x38a7, 0x8f91, 0x52f1 and 0xc92a.]



Apache Jackrabbit Oak – Content Addressable Storage

⊕ De-duplication on chunk level
⊕ Chunks are immutable
⊕ Shared Storage

⊖ Garbage collection
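
Chunk-level de-duplication can be sketched as a store keyed by content hash. The `ChunkStore` class is hypothetical, and both the hash function (SHA-256) and the fixed chunk size are assumptions chosen for the example; the slides do not state which hash or chunk size Oak uses.

```python
import hashlib


class ChunkStore:
    """Hypothetical content-addressable store: chunk id = hash of its bytes."""

    def __init__(self, chunk_size: int = 4):
        self.chunk_size = chunk_size  # tiny on purpose, for the example
        self.chunks = {}              # chunk id -> immutable bytes

    def put(self, data: bytes):
        """Store `data` and return the list of chunk ids that describe it."""
        ids = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            chunk_id = hashlib.sha256(chunk).hexdigest()
            # An existing id means identical content, so nothing is rewritten.
            self.chunks.setdefault(chunk_id, chunk)
            ids.append(chunk_id)
        return ids

    def get(self, ids) -> bytes:
        """Reassemble a binary from its chunk ids."""
        return b"".join(self.chunks[chunk_id] for chunk_id in ids)


store = ChunkStore(chunk_size=4)
first = store.put(b"aaaabbbbcccc")
second = store.put(b"aaaabbbbdddd")  # shares its first two chunks with `first`
```

Only the list of chunk ids has to be kept with the node. Since a chunk is never modified after it is written, several binaries can reference it safely; the price is that unreferenced chunks must be found and removed by garbage collection.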



Summary

Adobe Experience Manager


Java Content Repository
Apache Jackrabbit Oak
Multiversion Concurrency Control
Transactions
Content Addressable Storage
Q&A



Q&A