Inner Architecture of A Social Networking System: Petr Kunc, Jaroslav Škrabálek, Tomáš Pitner
Who am I?
Master's student at FI MU
Member of LaSArIS
Webtops
Modern web applications
Cloud (and distributed) solutions
Table of contents
What and why?
Takeplace
Which way?
Hadoop
HBase
Memcached
How?
Architecture and design
Was it worth it?
Testing
Takeplace
Functional requirements
Entities can create asymmetric relations
Posts
Walls and news feed
Comments and likes
Technology requirements
Linux and Cloud
Data-oriented application
High throughput
Heavy loads
Concurrent requests
Caching tool
Relational databases
Fixed schema, ACID, indexes, joins
Problems
Scaling up dataset size
Read/write concurrency
HBase
Example
{
  aa : {
    cf : {
      c1 : data
      c2 : data
    }
    cf2 : {
      anyByteArray : true
    }
  },
  ab : { }
}
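The nesting in the example above (row key, then column family, then column, then value) can be mimicked with nested sorted maps. The following is a minimal sketch for illustration only and is not the HBase client API: the class SparseTableSketch and its methods are hypothetical, and String keys are an assumption made for readability (HBase itself uses byte arrays for keys and also versions each cell by timestamp, which is omitted here).

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

// Illustrative stand-in for the data model only; NOT the HBase client API.
class SparseTableSketch {
    // row key -> column family -> column -> value; rows are kept sorted by key
    private final TreeMap<String, Map<String, Map<String, byte[]>>> rows = new TreeMap<>();

    void put(String row, String family, String column, byte[] value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family, f -> new TreeMap<>())
            .put(column, value);
    }

    // Returns null when the row, family, or column is absent (the table is sparse).
    byte[] get(String row, String family, String column) {
        Map<String, Map<String, byte[]>> families = rows.get(row);
        if (families == null) {
            return null;
        }
        Map<String, byte[]> columns = families.get(family);
        return columns == null ? null : columns.get(column);
    }

    // Because rows are sorted, a key-range scan is just a sub-map view.
    Iterable<String> scanRowKeys(String startInclusive, String stopExclusive) {
        return rows.subMap(startInclusive, stopExclusive).keySet();
    }

    public static void main(String[] args) {
        SparseTableSketch table = new SparseTableSketch();
        byte[] data = "data".getBytes(StandardCharsets.UTF_8);
        table.put("aa", "cf", "c1", data);
        table.put("aa", "cf", "c2", data);
        table.put("aa", "cf2", "anyByteArray", new byte[] {1});
        System.out.println(new String(table.get("aa", "cf", "c1"), StandardCharsets.UTF_8));
        System.out.println(table.get("ab", "cf", "c1") == null);
    }
}
```

Any byte array can serve as a column name inside a family, which is why cf2 in the example can hold a column called anyByteArray without a schema change.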
Hadoop
Software framework, the backbone of a distributed environment
MapReduce
HDFS
HBase
No real indexes
Automatic partitioning
Scale linearly and automatically
Parallel
Cheap
Not for everyone
Write once, read many
Built on top of Hadoop
Memcached
Distributed cache
Typical usage
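// Cache-aside read: try Memcached first, fall back to the database on a miss.
public Data getData(String query) {
    Data data = memcached.get(query);
    if (data == null) {
        // Cache miss: load from the database and populate the cache.
        data = database.get(query);
        if (data != null) {
            memcached.set(query, data);
        }
    }
    return data;
}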
Architecture
Architecture (2)
Architecture (3)
User ID
transformation
Data!
Three tables
Entities
Followers, Following, Blocked, Count,
News
Walls
Info, text, likes
Storing data
News feed
One by one (slow)
OR
Store news at each profile (great redundancy)
MEMCACHED!
Post put in DB => search followers => store minimized post in Memcached => links to news feed => 1 normal query and 1 batch query to Memcached
TTL (LRU)
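One possible reading of the flow above is sketched below. It is an interpretation, not the authors' implementation: plain in-memory maps stand in for HBase and Memcached, and the names NewsFeedSketch, multiGet, minimize, and evict are hypothetical. The sketch assumes that "minimized" means a shortened copy of the post, that a feed holds only post IDs as links, and that a cache entry lost to TTL or LRU eviction is reloaded from the database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// In-memory stand-ins only (assumption): plain maps replace HBase and Memcached.
class NewsFeedSketch {
    private final Map<String, String> database = new HashMap<>();           // postId -> full post
    private final Map<String, Set<String>> followers = new HashMap<>();     // author -> followers
    private final Map<String, String> cachedPosts = new HashMap<>();        // postId -> minimized post
    private final Map<String, List<String>> cachedFeeds = new HashMap<>();  // user -> post IDs, newest first

    void follow(String follower, String author) {
        followers.computeIfAbsent(author, a -> new LinkedHashSet<>()).add(follower);
    }

    void publish(String author, String postId, String text) {
        database.put(postId, text);               // post is put in the DB
        cachedPosts.put(postId, minimize(text));  // minimized copy goes to the cache
        for (String follower : followers.getOrDefault(author, Set.of())) {
            // Each follower's feed stores only a link (the post ID), not the post itself.
            cachedFeeds.computeIfAbsent(follower, f -> new ArrayList<>()).add(0, postId);
        }
    }

    List<String> readFeed(String user) {
        List<String> links = cachedFeeds.getOrDefault(user, List.of());  // one normal query
        Map<String, String> hits = multiGet(links);                      // one batch query
        List<String> feed = new ArrayList<>();
        for (String postId : links) {
            String post = hits.get(postId);
            // An entry may have expired (TTL) or been evicted (LRU): fall back to the DB.
            feed.add(post != null ? post : minimize(database.get(postId)));
        }
        return feed;
    }

    // Simulates expiry or eviction of a cached post.
    void evict(String postId) {
        cachedPosts.remove(postId);
    }

    // Hypothetical batch lookup: returns only the keys that are present in the cache.
    private Map<String, String> multiGet(List<String> postIds) {
        Map<String, String> found = new LinkedHashMap<>();
        for (String id : postIds) {
            String value = cachedPosts.get(id);
            if (value != null) {
                found.put(id, value);
            }
        }
        return found;
    }

    // Hypothetical "minimized" form: the first 40 characters of the post.
    private static String minimize(String text) {
        return text.length() <= 40 ? text : text.substring(0, 40);
    }

    public static void main(String[] args) {
        NewsFeedSketch feed = new NewsFeedSketch();
        feed.follow("bob", "alice");
        feed.publish("alice", "p1", "hello");
        feed.publish("alice", "p2", "world");
        System.out.println(feed.readFeed("bob"));
    }
}
```

Storing only links in each feed avoids the redundancy of copying every post to every follower's profile, while the batch lookup keeps a feed read at two cache round trips regardless of feed length.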
Conclusion
Pros
High volume data distribution
Scalability
High throughput
Heavy data load (write once, read many)
Cons
Losing relations, indexes, triggers
Responsibility for consistent data
Still not sure how it will behave when deployed in production