
Visualizing Graphs with Elasticsearch and KeyLines

 
What is Elasticsearch?
Graph: The Elasticsearch Graph Engine
Kibana: The Elasticsearch visualization tool
Logstash – a data management tool
Why visualize Elasticsearch with KeyLines?
A KeyLines / Elasticsearch Architecture
Getting started with KeyLines and Elasticsearch
Step 1: Download your files
Step 2: Set up your file structure
Step 3: Load data into Elasticsearch
Step 4: Embed KeyLines in your webpage
Step 5: Fetch some data from Elastic Graph API
Step 6: Parse our result in the KeyLines format
Step 7: Visualize the data in KeyLines
Step 8: Performing more sophisticated searches
Next steps: Extending the UI
Try it yourself!

Who should read this white paper?

This white paper is aimed at:

• Project managers and non-technical staff looking for a detailed introduction to visualizing
data from Elasticsearch with KeyLines.

• Developers and technical staff seeking a non-technical introduction to visualizing data
from Elasticsearch with KeyLines.

If you require more information we recommend contacting us to discuss your project.

What is Elasticsearch?
Elasticsearch is a fast and scalable open source search engine.

Its power and out-of-the-box simplicity have made it a popular option for organizations needing a
way to search very large volumes of data. It can support near-real-time searching of data on a
petabyte scale, using a system of sharding and routing to scale outwards from the beginning.

The Elasticsearch engine itself is built on the Apache Lucene software library. Lucene is a high-
performance technology for searching and indexing data, but it is also very complex.
Elasticsearch makes the power of Lucene more readily useable by pre-selecting some sensible
defaults and providing a more intuitive REST API.

Elasticsearch powers the search functionality of some very data-rich organizations, including
Facebook, Wikimedia and Stack Exchange. It is also increasingly popular with KeyLines
developers looking for a powerful and scalable back-end technology for their graph applications.

In this Getting Started guide we are going to explain how you can use the KeyLines toolkit to
build a UI for your Elasticsearch server.

Throughout this document, we will refer to a number of different technologies in the Elastic Stack:

• Elasticsearch – the core search technology

• Graph – a new API for performing searches across connected data

• Kibana – an open source visualization web application

• Logstash – a tool for streaming, munging and loading data into Elasticsearch

Elastic Graph: The Elasticsearch graph engine


Released in Elasticsearch v2.3, Elastic Graph provides a way to discover and understand
connections in your Elasticsearch index. It is able to infer two data attributes: vertices (nodes) and
connections (links). The Elastic Graph API then allows you to query and explore these vertices and
connections as a graph.

Out of the box, Elastic Graph uses relevance scoring to help identify the most meaningful
connections. This simple analysis can be enhanced with KeyLines visual graph analysis
functionality, making it easier for users to understand complex network trends and uncover
outliers.

Kibana: The Elasticsearch visualization tool


Kibana is Elasticsearch’s open source data visualization platform. It provides a dashboard of
charts and maps to help users interpret their data and search results:

A screengrab of a Kibana dashboard, via http://elastic.co

Kibana includes a Graph plugin, allowing users to visually explore data connections:

A screengrab of the Kibana graph plugin, via http://elastic.co

As both Kibana and KeyLines are web technologies, they complement each other perfectly.

Logstash – a data management tool
The easiest way to load data into Elasticsearch is using Logstash, a command-line tool. With this
approach you can supply your data as a CSV file and leave Logstash to parse the dataset and load
it into your Elasticsearch instance.

Why visualize Elasticsearch with KeyLines?


Graph visualization is a great way to make large and complex connected data easy to understand.
A well-designed visualization means users can:

• Find and interpret patterns and outliers

• Explore connections in an intuitive way

• Answer questions more quickly using data insight.

Extending Kibana’s graph visualization functionality with KeyLines provides access to an
extensive library of powerful functionality for even greater graph insight, including:

• Social network analysis

• Automated graph layouts

• The KeyLines Time Bar and dynamic network visualization

• KeyLines Geospatial to view network data on geographic maps

• WebGL for faster and more powerful visualization

In this Getting Started guide we are going to follow the steps required to build a simple KeyLines
component to visualize and explore your Elasticsearch graph data.

Let’s get started…

A KeyLines / Elasticsearch Architecture
Elasticsearch provides a REST API and works with the JSON data structure, so the KeyLines
integration architecture is very simple:

In this scenario, users interact with KeyLines, which runs in the web browser. Their interactions
with the chart raise events (e.g. click, hover, right-click), which in turn send requests to the
Elasticsearch REST API. Elasticsearch returns the data as a JSON object, which is then styled and
presented in KeyLines.
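
The request and response loop described above can be sketched as a small piece of wiring code. This is a minimal sketch rather than the application's actual code: createSearchHandler and the three functions passed to it (search, toItems, render) are hypothetical names, standing in for the Elasticsearch request, the JSON conversion and the KeyLines chart update respectively.

```javascript
// Hypothetical wiring for the architecture above: a user interaction
// triggers a search, and the JSON response is converted and drawn.
function createSearchHandler(deps) {
  // deps.search(term)  -> Promise resolving to the Elasticsearch JSON response
  // deps.toItems(json) -> array of chart items
  // deps.render(items) -> draws the items in the chart
  return function handleSearch(term) {
    return deps.search(term).then(function (json) {
      var items = deps.toItems(json);
      deps.render(items);
      return items;
    });
  };
}
```

Passing the three steps in as functions keeps the browser event code separate from the server code, so each part can be replaced or tested on its own.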

Getting started with KeyLines and Elasticsearch
In this tutorial, we will create a simple KeyLines application to perform a search of our
Elasticsearch data. This is just the starting point. Once you have a functioning integration, you can
incorporate additional KeyLines visualization and analysis functionality.

If you have any problems following these instructions, get in touch.

Step 1: Download your files


To build a KeyLines-Elasticsearch integration you will need the following files:

• Keylines.js – request trial account

• Elasticsearch – we used v2.3.3 – installation guide

• Elastic Graph API plugin – installation guide

• Logstash – installation guide

Step 2: Set up your file structure


For our KeyLines/Elasticsearch JavaScript app, we will use the following structure:

• App.js contains the main functions to initialize KeyLines and controllers for Elasticsearch and
app-graph-search.js.

• Elasticsearch.js will contain the functions required to send queries to, and generally interact
with, the server.

• App-graph-search.js will contain the controller for our search function with the Graph API.

• Index.htm will contain the KeyLines chart and some customization code to describe the general UI.

Step 3: Load data into Elasticsearch


This step can be omitted if your instance is pre-populated.

We used a random data generator to produce a fake dataset of users. We then imported the
generated users into Elasticsearch with Logstash, using a “user” type inside a “users” index.

Users have the following structure:


user: {
    id: "number",
    firstname: "string",
    lastname: "string",
    gender: "string",
    company: "string",
    eyes_color: "string"
}
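
The import in this guide was done with Logstash. As an alternative for small datasets, records with the structure above could also be sent through the Elasticsearch Bulk API, which expects a newline-delimited body. The sketch below only builds that body. The function name buildBulkBody and the sample record are illustrative assumptions (the gender value in particular is made up), while the “users” index and “user” type come from the text above.

```javascript
// Build a newline-delimited body for the Elasticsearch Bulk API.
// Each document needs an action line followed by the document itself.
function buildBulkBody(users) {
  if (users.length === 0) {
    return "";
  }
  var lines = [];
  users.forEach(function (user) {
    lines.push(JSON.stringify({
      index: { _index: "users", _type: "user", _id: String(user.id) }
    }));
    lines.push(JSON.stringify(user));
  });
  // The Bulk API requires the body to end with a newline.
  return lines.join("\n") + "\n";
}

// Sample record following the user structure above (values are illustrative).
var sampleUsers = [{
  id: 100,
  firstname: "Noelle",
  lastname: "Frye",
  gender: "female",
  company: "Sodales Purus In Company",
  eyes_color: "gray"
}];

console.log(buildBulkBody(sampleUsers));
```

The resulting string would be sent as the body of a POST request to the _bulk endpoint of your Elasticsearch server.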

Step 4: Embed KeyLines in your webpage


We won’t go into detail on this, but you can find sample applications on the KeyLines SDK
website, or create your own using the Getting Started guide in the SDK documentation.

To give you some idea of how this works, here is some of the HTML we would need on our page
to load the KeyLines component:

We have to include KeyLines:


 
<link rel="stylesheet" type="text/css" href="css/keylines.css">
<link rel="stylesheet" type="text/css" href="css/style.css">
 
And we also need a container to start KeyLines within it:

<!-- This is the HTML element that will be used to render the KeyLines component -->
<div id="kl"></div>
 
After that, the rest will be UI to interact with Elasticsearch.

Our KeyLines chart with some UI

Step 5: Fetch some data from Elastic Graph API


The Graph API is a REST service, so we request data through its “_graph/explore” action. Our
endpoint is therefore http://localhost:9200/users/_graph/explore

Because we imported the data with Logstash, each user has an extra field: message. It contains
the raw line used for the import, and looks like this:

100|Noelle|Frye|Sodales  Purus  In  Company|gray

We will use this field to search with the Graph API.

For a graph search for the term ‘brown’, our data query would look like this:
 
{
    "query": {
        "query_string": {
            "default_field": "_all",
            "query": "brown"
        }
    },
    "controls": {
        "use_significance": true,
        "sample_size": 2000,
        "timeout": 5000
    },
    "connections": {
        "vertices": [
            {
                "field": "message",
                "size": 20,
                "min_doc_count": 3
            }
        ]
    },
    "vertices": [
        {
            "field": "message",
            "size": 20,
            "min_doc_count": 3
        }
    ]
}

In response to this we would receive a JSON object, which we can parse into KeyLines’ own JSON
format.
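
One way to send this query from the browser is sketched below. It is a minimal sketch, not the code used in the example application: the function names buildExploreQuery and exploreGraph are hypothetical, and the use of the standard fetch function is an assumption. The endpoint, field name and control values are copied from the query above.

```javascript
// Endpoint taken from the text above; adjust host and index as needed.
var EXPLORE_URL = "http://localhost:9200/users/_graph/explore";

// Build the Graph API request body for a search term.
// The field name and control values mirror the example query above.
function buildExploreQuery(term) {
  var vertexDefinition = { field: "message", size: 20, min_doc_count: 3 };
  return {
    query: { query_string: { default_field: "_all", query: term } },
    controls: { use_significance: true, sample_size: 2000, timeout: 5000 },
    connections: { vertices: [Object.assign({}, vertexDefinition)] },
    vertices: [Object.assign({}, vertexDefinition)]
  };
}

// Send the query and resolve with the parsed JSON response.
// fetchImpl is injectable so the function can be tested without a server;
// it defaults to the standard fetch function where one is available.
function exploreGraph(term, fetchImpl) {
  var doFetch = fetchImpl || fetch;
  return doFetch(EXPLORE_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildExploreQuery(term))
  }).then(function (response) {
    if (!response.ok) {
      throw new Error("Graph API request failed with status " + response.status);
    }
    return response.json();
  });
}
```

A call such as exploreGraph("brown") would then resolve with the response object described in the next step.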

Step 6: Parse our result in the KeyLines format
The Elasticsearch response contains all the information we need to create a KeyLines input, so
parsing your JSON is a relatively simple process.

The search results are received in the following structure:

{
    connections: [],
    failures: [],
    timed_out: false,
    took: 0,
    vertices: []
}

Inside the connections attribute, we will find the links, for example:

{
   doc_count:  14,
   source:  10,
   target:  2,
   weight:  0.005304290380952548
}
 
The source and target attributes are indexes into the vertices array.

Inside the vertices attribute, we will find the object itself, for example:

{
   depth:  0,
   field:  "message",
   term:  "blue",
   weight:  0.8421388547845717
}  

More details are in the documentation.

To build our KeyLines input, we use the makeNode() and makeLink() functions, e.g.:

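// Build a KeyLines node from a Graph API vertex.
// getNodeWidth() is a helper that is not shown here.
var makeNode = function (index, item) {
    var e = getNodeWidth(item);

    return {
        id: item.term,
        type: "node",
        t: item.term,               // label
        e: e,                       // enlargement factor
        c: "green",                 // fill colour
        d: Object.assign({}, item)  // keep a copy of the vertex as custom data
    };
};

// Build a KeyLines link from a Graph API connection.
// nodes is the vertices array from the response; item.source and
// item.target are indexes into it. getLinkWidth() is not shown here.
var makeLink = function (index, item, nodes) {
    var w = getLinkWidth(item);
    var node1 = nodes[item.source];
    var node2 = nodes[item.target];

    return {
        type: "link",
        id: "link_" + node1.term + "_" + node2.term,
        id1: node1.term,
        id2: node2.term,
        w: w,                       // link width
        d: Object.assign({}, item)  // keep a copy of the connection as custom data
    };
};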
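
The two functions above each handle a single element. A whole response could be converted along the following lines. This is a sketch under assumptions: toKeyLinesItems is a hypothetical name, and makeNode and makeLink are passed in as arguments so the block does not depend on how the width helpers are implemented.

```javascript
// Convert a Graph API response into a flat array of KeyLines items.
// makeNode and makeLink are passed in so this sketch stays independent
// of how the width helpers are implemented.
function toKeyLinesItems(response, makeNode, makeLink) {
  var vertices = response.vertices || [];
  var connections = response.connections || [];

  var nodes = vertices.map(function (vertex, index) {
    return makeNode(index, vertex);
  });

  var links = connections
    // Skip connections whose indexes do not point at a known vertex.
    .filter(function (connection) {
      return vertices[connection.source] !== undefined &&
             vertices[connection.target] !== undefined;
    })
    .map(function (connection, index) {
      return makeLink(index, connection, vertices);
    });

  return nodes.concat(links);
}
```

Filtering out connections with unknown indexes is a defensive choice: it avoids a runtime error in makeLink if a response is incomplete.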

Step 7: Visualize the data in KeyLines


Now that we have our JSON object, we can load it into KeyLines and run a layout in the callback, like this:

function loadChart(items) {
    chart.load({
        type: 'LinkChart',
        items: items
    }, function () {
        chart.layout("standard");
    });
}

Success!

In this example, we have added another request to count users returned in our search result. This
allows us to scale nodes and weight the links.
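
The count request itself is not shown in this guide. One possible approach, sketched below as an assumption, is to POST a match query on the message field to the _count endpoint of the users index, then map the returned count to a node enlargement value. The names buildCountQuery and scaleEnlargement, and the choice of square-root scaling, are illustrative only.

```javascript
// Body for a count request: how many users mention the term
// in the "message" field.
function buildCountQuery(term) {
  return { query: { match: { message: term } } };
}

// Map a document count to a node enlargement factor between min and max.
// Square-root scaling (an assumption) keeps very common terms from
// dwarfing rare ones.
function scaleEnlargement(count, maxCount, min, max) {
  if (maxCount <= 0) {
    return min;
  }
  var clamped = Math.min(Math.max(count, 0), maxCount);
  var ratio = Math.sqrt(clamped / maxCount);
  return min + ratio * (max - min);
}
```

For example, with a maximum count of 100 and a range of 1 to 3, a term that matches 25 users would get an enlargement of 2.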

Step 8: Performing more sophisticated searches


Our example above is just the starting point. Now that your infrastructure is working, you can
begin to perform more sophisticated searches.

For example, you may want to pull in nodes with their full relationships. This would be managed
by performing another server request asking for all elements in the relationships found. You will
also ask it to omit any related nodes – otherwise you will keep returning the original node over
and over.
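
A follow-up request of this kind might be built as in the sketch below. It assumes that a Graph API vertex definition accepts include and exclude term lists, which you should check against the Graph API documentation for your Elasticsearch version. The function name buildExpandQuery is hypothetical, and the field, size and min_doc_count values are reused from the earlier query.

```javascript
// Build a follow-up Graph API request that expands from one known term.
// "include" restricts the starting vertex to the selected term, and
// "exclude" keeps the selected term and terms already on the chart
// out of the connected results.
function buildExpandQuery(term, knownTerms) {
  var exclude = [term];
  (knownTerms || []).forEach(function (known) {
    if (exclude.indexOf(known) === -1) {
      exclude.push(known);
    }
  });

  return {
    vertices: [
      { field: "message", include: [term] }
    ],
    connections: {
      vertices: [
        { field: "message", exclude: exclude, size: 20, min_doc_count: 3 }
      ]
    }
  };
}
```

The known terms could be collected from the ids of the nodes currently in the chart, since makeNode() uses each vertex term as the node id.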

Next steps: Extending the UI


In our example, we included some controls to run KeyLines’ automatic layouts and a selection
detail tool to show information about the selected elements.

You should now be ready to extend these with other functionality to help users explore and
understand their data. The KeyLines SDK site contains a fully documented API of functionality
for you to incorporate.

Try it yourself!
To find out more about KeyLines, or to start a free trial, just get in touch:
http://cambridge-intelligence.com/contact.
