
Table of Contents

Chapter 1 Introduction
1.1 Problem Definition
1.2 NoSQL Database
Chapter 2 Graph Database
Chapter 3 Neo4J
3.1 Neo4J AuraDB
Chapter 4 Tive platform
Chapter 5 Implementation
Chapter 6 Observation and Results
6.1 Creating Instance
6.2 Creating Nodes
6.3 Creating Relationships
6.4 Graph traversals and Shortest path
Chapter 7 Conclusion
Neo4J AuraDB

Chapter 1

Introduction
In recent years, NoSQL databases have emerged as a popular alternative to traditional SQL databases
for handling large and complex data sets. While SQL databases are based on a relational data model, NoSQL
databases use a variety of data models, including document-oriented, key-value, graph, and column-family
models. This makes them more flexible and better suited to handling unstructured and semi-structured data,
such as customer reviews, social media posts, and sensor data.

NoSQL databases are designed to scale horizontally, meaning that capacity grows by adding servers
rather than by upgrading a single machine. This makes them well suited to big data workloads that can
overwhelm a single-server SQL database. Many NoSQL databases also replicate data across servers, which
provides high availability and fault tolerance: the system can remain accessible during hardware failures
or network outages.

NoSQL databases are designed to handle large volumes of unstructured and semi-structured data.
Unlike traditional relational databases, NoSQL databases do not rely on a fixed schema and can store data
in a variety of formats, including key-value pairs, document-oriented data, and graph-based models. This
flexibility makes NoSQL databases well-suited for handling healthcare data, which can be highly variable
and complex.

In addition to the flexibility provided by NoSQL databases, they also offer other advantages over
traditional relational databases. NoSQL databases can handle data that is distributed across multiple servers,
providing greater scalability and availability. They can also provide faster read and write performance,
making them ideal for real-time applications such as patient monitoring and alert systems.

Dept. of CSE 2022-2023 1|Page



1.1 Problem Definition


We use Neo4j, a graph database, as our NoSQL database. Our problem statement comes from the retail
industry, which is highly competitive and fast-paced, with a constant demand for new products and fast
delivery. However, the complexity of the retail supply chain can make it challenging for retailers to
meet these demands. Some of the challenges retailers face in their supply chain include:

Inventory Management: Retailers need to manage large amounts of inventory across multiple locations,
including warehouses, stores, and distribution centers.

Supply Chain Visibility: Retailers need visibility into their supply chain to track shipments and monitor the
movement and condition of products.

Customer Experience: Retailers need to deliver products to customers quickly and efficiently while ensuring
that they are delivered in good condition.

Hence, our problem statement is to convert a standard RDBMS-style database into a graph database
using Neo4j.

1.2 NoSQL Databases


NoSQL (Not Only SQL) databases are a type of database management system that differs from traditional
relational databases in the way data is stored and queried. NoSQL databases are designed to handle large
amounts of unstructured or semi-structured data, and are particularly well-suited for use cases that require high
scalability and performance.

One of the key differences between NoSQL databases and relational databases is the way data is organized. In
a relational database, data is organized into tables with predefined columns and rows, which must conform to
a schema. This schema enforces strict data consistency and relationships between tables. In contrast, NoSQL
databases use various data models, such as key-value, document, column-family, and graph, which allow for
more flexible data storage.

One of the most popular types of NoSQL databases is the document database, which stores data as documents
in a hierarchical format, often using JSON or BSON formats. Each document can have a different structure,
allowing for greater flexibility in data storage. Document databases are well-suited for use cases such as
content management systems, e-commerce, and social networking.


Chapter 2

Graph Databases

A graph database is a type of database management system that stores data in the form of nodes and edges,
which are used to represent relationships between entities. Graph databases are designed to handle highly
interconnected data and are particularly useful for applications that require complex queries and analysis.

Nodes in a graph database represent entities, such as people, places, or things, and edges represent
relationships between them. For example, in a social network, nodes might represent users, while edges
represent connections between users, such as friend relationships.
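
The node-and-edge model described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the class names, the sample users, and the FRIEND relationship type are hypothetical and are not taken from any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity in the graph, for example a user."""
    node_id: int
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A typed relationship between two nodes."""
    source: int
    target: int
    rel_type: str


# Hypothetical sample data: three users and two friendships.
nodes = {
    1: Node(1, "User", {"name": "Asha"}),
    2: Node(2, "User", {"name": "Ben"}),
    3: Node(3, "User", {"name": "Chen"}),
}
edges = [
    Edge(1, 2, "FRIEND"),
    Edge(2, 3, "FRIEND"),
]


def friends_of(node_id: int) -> list[str]:
    """Return the names of users joined to node_id by a FRIEND edge.

    Friendship is treated as undirected, so both ends of each edge are checked.
    """
    names = []
    for edge in edges:
        if edge.rel_type != "FRIEND":
            continue
        if edge.source == node_id:
            names.append(nodes[edge.target].properties["name"])
        elif edge.target == node_id:
            names.append(nodes[edge.source].properties["name"])
    return names


print(friends_of(2))
```

Storing the relationship as its own record, rather than as a column in a table, is what lets a graph database follow connections without a join.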

One of the key advantages of graph databases is their ability to handle complex queries that involve
multiple relationships. For example, a query might ask for all users who are connected to a particular user
through a friend relationship, who have also visited a certain location, and who have posted about a particular
topic on social media. In a traditional relational database, such a query typically requires several joins and
can be slow to execute, whereas a graph database can answer it by following relationships directly.

Another advantage of graph databases is their ability to represent and handle data that changes over time.
For example, in a supply chain management system, nodes might represent products, while edges represent the
flow of those products through the supply chain. As products move through the supply chain, the edges can be
updated to reflect changes in location, status, or other attributes.

Graph databases are also highly scalable, as they can easily handle large amounts of data and complex
relationships. They are often used in applications such as social networks, recommendation engines, fraud
detection, and knowledge management systems.

There are several different types of graph databases, including property graphs, which store properties
or attributes on nodes and edges, and RDF (Resource Description Framework) databases, which store data as
subject-predicate-object triples. Popular graph database management systems include Neo4j, Amazon
Neptune, and Microsoft Azure Cosmos DB.
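
To make the difference between the two models concrete, the sketch below writes the same fact in both styles using plain Python structures. The supplier, product, the "since" attribute, and the "ex:" prefixes are hypothetical names chosen for illustration. The extra "ex:supply1" node is one common way to attach an attribute to a relationship in plain triples, not the only one.

```python
# Property-graph style: the relationship itself carries attributes.
property_edge = {
    "source": {"label": "Supplier", "name": "Acme Foods"},
    "type": "SUPPLIES",
    "target": {"label": "Product", "name": "Green Tea"},
    "properties": {"since": 2021},
}

# RDF style: every fact is a (subject, predicate, object) triple.
# Here the "since" attribute is attached to an extra node (ex:supply1)
# that stands for the supply relationship.
triples = [
    ("ex:AcmeFoods", "rdf:type", "ex:Supplier"),
    ("ex:GreenTea", "rdf:type", "ex:Product"),
    ("ex:AcmeFoods", "ex:supplies", "ex:GreenTea"),
    ("ex:supply1", "ex:supplier", "ex:AcmeFoods"),
    ("ex:supply1", "ex:product", "ex:GreenTea"),
    ("ex:supply1", "ex:since", "2021"),
]


def objects_of(subject: str, predicate: str) -> list[str]:
    """Return every object that appears with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects_of("ex:AcmeFoods", "ex:supplies"))
print(property_edge["properties"]["since"])
```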

In summary, a graph database is a powerful tool for managing and analyzing highly interconnected
data. By representing data as nodes and edges, graph databases enable complex queries and analysis, handle
changing data over time, and are highly scalable. They are commonly used in applications such as social
networks, recommendation engines, and knowledge management systems.


Chapter 3

Neo4J

Neo4j is a popular NoSQL graph database that is designed to store and manage complex, highly connected
data. Its graph-based approach allows for the representation and management of relationships between data
points, making it particularly well-suited for use in applications such as social networking, recommendation
systems, and electronic health records.

At its core, Neo4j is a database management system that uses a graph data model to represent data. A
graph consists of nodes (also known as vertices) and edges (also known as relationships). Nodes represent
entities, while edges represent the relationships between those entities. For example, in a social network, a
node might represent a user, while an edge might represent the fact that one user is friends with another user.

One of the key benefits of using a graph data model is that it allows for the representation of highly
interconnected data. This is particularly important in applications such as social networking, where users are
often connected to many other users through various relationships. It is also important in recommendation
systems, where data about users' preferences and behaviors can be used to make recommendations based on
similarities between users.

In addition to its graph-based approach, Neo4j offers a number of other features that make it a powerful
tool for managing complex data. Neo4j is designed to handle large volumes of data and complex queries,
making it a good choice for applications that require real-time data processing and analysis. Additionally,
Neo4j is highly optimized for graph-based data models, allowing it to provide fast query results even when
working with highly interconnected data.


Another important feature of Neo4j is its scalability. Neo4j can be scaled horizontally across multiple
machines, allowing it to handle even larger volumes of data as needed. This makes it a good choice for
applications that need to handle large and growing data sets, as well as for applications that need to handle
spikes in data volume.

Neo4j is also highly flexible, supporting a variety of data types, including unstructured and semi-
structured data. This makes it well-suited for use in applications with diverse data sets, where data may be in
different formats or stored in different systems. Additionally, Neo4j provides a flexible data model that can be
easily extended as needed, allowing developers to add new data types and relationships to the graph.

Security is also a key feature of Neo4j. Neo4j offers a number of security features, including
authentication, access control, and encryption, making it a good choice for applications that handle sensitive
data. Additionally, Neo4j provides fine-grained access control, allowing administrators to control access to
specific parts of the graph based on user roles and permissions.

Neo4j uses a declarative query language called Cypher, which is designed to be intuitive and easy to
use. Cypher allows for complex queries and filtering, making it easier for users to extract meaningful insights
from data. Additionally, Cypher provides a number of built-in functions and operators, allowing users to easily
perform operations such as sorting, grouping, and aggregation.
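
As a rough illustration of grouping, aggregation, and sorting, the sketch below shows a Cypher query in a comment and then computes the same result in plain Python over a small in-memory list. The Supplier and Product labels, the SUPPLIES relationship type, the "name" property, and the sample pairs are our assumptions, not output from a real database.

```python
from collections import Counter

# Cypher sketch (labels and property names are assumed for illustration):
#   MATCH (s:Supplier)-[:SUPPLIES]->(p:Product)
#   RETURN s.name AS supplier, count(p) AS products
#   ORDER BY products DESC, supplier

# Hypothetical (supplier, product) pairs standing in for SUPPLIES relationships.
supplies = [
    ("Acme Foods", "Green Tea"),
    ("Acme Foods", "Black Tea"),
    ("Brew Co", "Coffee"),
]


def products_per_supplier(pairs):
    """Group by supplier, count products, and sort by count (descending), then name."""
    counts = Counter(supplier for supplier, _ in pairs)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


print(products_per_supplier(supplies))
```

In Cypher the grouping is implicit: the non-aggregated column in RETURN (here the supplier name) acts as the grouping key.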

Neo4j is designed to integrate easily with a variety of programming languages and tools, making it a
versatile tool for developers. It provides APIs and drivers for popular languages and runtimes such as
Java, Python, and Node.js, as well as integrations with tools such as Apache Spark and Elasticsearch.
Additionally, Neo4j provides a number of tools for data modeling, visualization, and analysis, making it
easier for developers to build and maintain graph-based applications.

In conclusion, Neo4j is a powerful tool for managing complex, highly interconnected data. Its graph-
based approach allows for the representation and management of relationships between data points, making it
particularly well-suited for use in applications such as social networking, recommendation systems, and
electronic health records. Additionally, its high performance, scalability, flexibility, security, and query
language make it a versatile tool for managing a variety of data sets. Finally, its easy integration with a variety
of programming languages and tools makes it a great choice for developers who need to build and maintain
graph-based applications.


3.1 Neo4J AuraDB

Neo4j AuraDB is a fully managed cloud database service provided by Neo4j, the creators of the Neo4j graph
database. It allows users to easily deploy and operate a Neo4j database in the cloud without the need for
managing infrastructure or configuring complex database settings.

With Neo4j AuraDB, users can scale their databases up or down, depending on their needs, without the hassle
of managing their own servers. It also provides automatic backups and updates, so that users' databases
stay up to date and secure.

Neo4j AuraDB offers several benefits, including high availability, disaster recovery, and a pay-as-you-go
pricing model. It also provides enterprise-level security features such as SSL encryption, LDAP authentication,
and role-based access control.

Users can access Neo4j AuraDB through the Neo4j Browser, REST API, and various drivers for programming
languages such as Java, Python, and JavaScript. Additionally, users can take advantage of Neo4j's built-in
graph algorithms and visualization tools to analyze and visualize their data.

Overall, Neo4j AuraDB is a convenient and efficient way for businesses and organizations to manage their
graph databases in the cloud, without the need for in-house database administrators or complex infrastructure
management.


Chapter 4

Tive platform
In the retail industry, supplier management is crucial for ensuring the timely and efficient delivery of
products to customers. The Tive platform and Neo4j AuraDB can be used together to provide a comprehensive
solution for managing suppliers in retail. In this chapter, we discuss how the two can be combined for
supplier management in retail.

The Tive platform is a cloud-based software platform that provides real-time visibility into the
movement and condition of products during transport. The platform is designed to provide retailers with a
comprehensive view of their supply chain, enabling them to identify issues before they become problems. The
Tive platform uses IoT-enabled sensors that are placed on products during transport to track their location,
temperature, humidity, shock, and light exposure. The data collected by these sensors is sent to the Tive
platform, where it is analyzed and presented in real-time.

Neo4j AuraDB is a cloud-based graph database that provides a scalable and reliable solution for storing
and analyzing large amounts of data. The graph database model used by Neo4j AuraDB is ideal for modeling
complex relationships between data points. The database uses nodes to represent data points and relationships
to represent the connections between them. This model is particularly well-suited for analyzing supply chain
data, as it allows retailers to easily visualize the connections between suppliers, products, and carriers.

One of the main benefits of using the Tive platform and Neo4j AuraDB in supplier management in
retail is that they provide real-time visibility into the supply chain. This allows retailers to identify issues
before they become problems, enabling them to take proactive measures to mitigate them. For example, if a
product is exposed to high temperatures during transport, the Tive platform can send an alert to the retailer,
who can then take steps to ensure that the product is not damaged. Neo4j AuraDB can be used to store and
analyze the data collected by the Tive platform, providing a complete view of the supply chain.
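
The temperature alert described above can be sketched as a simple threshold check. This is a hypothetical illustration and does not use the Tive API: the Reading class, the shipment identifiers, and the 8.0 °C limit are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One sensor reading for a shipment (hypothetical structure)."""
    shipment_id: str
    temperature_c: float


# Assumed limit for a chilled product; a real limit depends on the goods.
MAX_TEMPERATURE_C = 8.0


def find_alerts(readings, limit=MAX_TEMPERATURE_C):
    """Return shipment IDs with at least one reading above the limit.

    IDs are returned once each, in the order they first exceed the limit.
    """
    alerted = []
    for reading in readings:
        if reading.temperature_c > limit and reading.shipment_id not in alerted:
            alerted.append(reading.shipment_id)
    return alerted


readings = [
    Reading("SHP-1", 4.5),
    Reading("SHP-2", 9.1),
    Reading("SHP-1", 5.0),
    Reading("SHP-2", 10.3),
]
print(find_alerts(readings))
```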

Another benefit of using the Tive platform and Neo4j AuraDB in supplier management is that they
enable retailers to monitor the performance of their suppliers. The Tive platform can be used to track the
delivery of products from suppliers, providing real-time updates on their progress. Neo4j AuraDB can be used
to store and analyze this data, providing a comprehensive view of supplier performance. This allows retailers
to identify suppliers who are consistently delivering products on time and in good condition, as well as those
who may be experiencing issues.


In addition to monitoring supplier performance, the Tive platform and Neo4j AuraDB can be used to
improve collaboration between retailers and suppliers. The Tive platform provides collaboration tools that
enable retailers and suppliers to share data and insights, improving communication and reducing friction in
the supply chain. Neo4j AuraDB can be used to store and analyze this data, providing a single source of truth
for supply chain data. This allows retailers and suppliers to work together more effectively, improving the
overall efficiency of the supply chain.

Risk management is another area where the Tive platform and Neo4j AuraDB can be beneficial in
supplier management in retail. The Tive platform can be used to identify potential risks in the supply chain,
such as delays or damage to products. Neo4j AuraDB can be used to model the relationships between suppliers,
carriers, and products, allowing retailers to identify potential areas of risk and take proactive measures to
mitigate them. This can help retailers to minimize the impact of supply chain disruptions and ensure the timely
delivery of products to customers.

Finally, the Tive platform and Neo4j AuraDB can be used to improve product quality in supplier
management in retail. The Tive platform can be used to monitor the condition of products during transport,
ensuring that they are not damaged or spoiled. Neo4j AuraDB


Chapter 5

Implementation
Neo4j AuraDB is a fully managed cloud graph database service.

Built to leverage relationships in data, AuraDB enables lightning-fast queries for real-time analytics and
insights. AuraDB is reliable, secure, and fully automated, enabling you to focus on building graph applications
without worrying about database administration.

We created an instance in Neo4j Aura using a Free account. The details of our free-tier instance are:

• Neo4j version: 5
• Nodes: 1104 / 200000 (1%) - The number of nodes in use out of the number of nodes we are allowed to
create. A free account allows up to 200000 nodes.
• Relationships: 4909 / 400000 (1%) - The number of relationships (edges) among the nodes in the database
out of the number allowed. A free account in Neo4j AuraDB allows up to 400000 relationships.
• Region: Singapore (asia-southeast1), GCP - The location where the database is hosted.

5.1 NEO4J CYPHER SHELL


You can connect to an AuraDB instance using the Neo4j Cypher Shell command-line interface (CLI) and run
Cypher commands against your instance from the command-line.

To connect to an instance using Neo4j Cypher Shell:

1. Navigate to the Neo4j Aura Console in your browser.


2. Copy the Connection URI of the instance you want to connect to. The URI is below the instance status
indicator.
3. Open a terminal and navigate to the folder where you have installed Cypher Shell.
4. Run the following cypher-shell command replacing:
• <connection_uri> with the URI you copied in step 2.
• <username> with the username for your instance.
• <password> with the password for your instance.


./cypher-shell -a <connection_uri> -u <username> -p <password>

Once connected, you can run :help for a list of available commands.

Available Commands:

:begin - Open a transaction
:commit - Commit the currently open transaction
:exit - Exit the logger
:help - Show this help message
:history - Print a list of the last commands executed
:param - Set the value of a query parameter
:params - Print all currently set query parameters and their values
:rollback - Rollback the currently open transaction
:source - Interactively executes cypher statements from a file
:use - Set the active instance

For help on a specific command, type :help command

5.2 IMPORTING DATA

There are two ways you can import data from a .csv file into an AuraDB instance:

• Load CSV - A Cypher statement that you run from Neo4j Browser or Neo4j Cypher Shell.
• Neo4j Data Importer - A visual application that you launch from the Console.

Load CSV
The LOAD CSV Cypher statement can be used from within Neo4j Browser and Cypher Shell.
There are some limitations to consider when using this method to load a .csv file into an AuraDB instance:

• For security reasons, you must host your .csv file on a publicly accessible HTTP or HTTPS server.
Examples of such servers include GitHub, Google Drive, and Dropbox.
• The LOAD CSV command is built to handle small to medium-sized data sets, such as anything up to
10 million nodes and relationships. You should avoid using this command for any data sets exceeding
this limit.
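
To show what a header-based CSV load does to each row, the sketch below reads a small CSV in Python and turns every row into a map keyed by column name, which is roughly how LOAD CSV WITH HEADERS exposes rows to a Cypher statement. The file contents, the Supplier label, and the property names are hypothetical. The Cypher text is shown only as a string and is not executed here.

```python
import csv
import io

# Hypothetical CSV content; a real file would be fetched from an HTTP(S) URL.
CSV_TEXT = """supplierID,companyName,country
1,Acme Foods,India
2,Brew Co,Singapore
"""

# Cypher statement each row could be passed to as parameters
# (label and property names are assumed for illustration).
MERGE_SUPPLIER = (
    "MERGE (s:Supplier {supplierID: $supplierID}) "
    "SET s.companyName = $companyName, s.country = $country"
)


def rows_as_parameters(text: str) -> list[dict]:
    """Parse CSV text with a header row into one dict per data row.

    Every value stays a string, so numeric columns need explicit conversion.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


rows = rows_as_parameters(CSV_TEXT)
for row in rows:
    print(MERGE_SUPPLIER, row)
```

Using MERGE rather than CREATE in the statement means that re-running the import does not duplicate suppliers that already exist.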


Neo4j Data Importer:


Neo4j Data Importer is a no-code tool that lets you:
1. Load data from flat files (.csv and .tsv).
2. Define a graph model and map data to it.
3. Import the data into an AuraDB instance.

To load data with Neo4j Data Importer:

1. Navigate to the Neo4j Aura Console in your browser.


2. Select the Import button on the instance you want to open.

IMPORT EXISTING DATABASE

To import a .dump file under 4GB:

• Navigate to the Neo4j Aura Console in your browser.
• Select the instance into which you want to import the data.
• Select the Import Database tab.
• Drag and drop your .dump file into the provided window, or choose Select a .dump file and pick
your file.
• Select Upload.

When the upload is complete, the instance goes into a Loading state as the dump is applied. Once this has
finished, the instance returns to its Running state; and the data is ready.


Chapter 6

Observations and Results


6.1 Creation of an instance:

The instance takes a few seconds to be created. Every instance comes with its own limits,
which depend on the pricing tier chosen for the database resources.

6.2 Creation of Nodes:

We created the graph by importing an existing company database, exported as CSV files. We then defined
the relationships among the data in those files, which let us build a graph from the schema of the
original database.
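
The conversion from relational tables to a graph can be sketched as follows: each row becomes a node, and each foreign-key value becomes a relationship. The table contents, column names, and the SUPPLIES relationship type are assumptions for illustration and are not copied from our actual data set.

```python
# Hypothetical rows from two relational tables.
# supplierID in the products table is a foreign key to the suppliers table.
suppliers = [
    {"supplierID": 1, "companyName": "Acme Foods"},
]
products = [
    {"productID": 10, "productName": "Green Tea", "supplierID": 1},
    {"productID": 11, "productName": "Black Tea", "supplierID": 1},
]


def to_graph(supplier_rows, product_rows):
    """Turn table rows into (label, key, properties) nodes and typed relationships."""
    nodes = []
    relationships = []
    for row in supplier_rows:
        nodes.append(("Supplier", row["supplierID"], {"companyName": row["companyName"]}))
    for row in product_rows:
        nodes.append(("Product", row["productID"], {"productName": row["productName"]}))
        # The foreign key is not stored as a property; it becomes an edge instead.
        relationships.append(
            (("Supplier", row["supplierID"]), "SUPPLIES", ("Product", row["productID"]))
        )
    return nodes, relationships


graph_nodes, graph_relationships = to_graph(suppliers, products)
print(len(graph_nodes), "nodes,", len(graph_relationships), "relationships")
```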

The given figure is a graph consisting of the following nodes: Supplier, Product, Category, Order,
Customer, Shipper, Region, Territory, Employee


[Figure: graph schema. Node labels: Supplier, Product, Category, Order, Customer, Shipper, Region,
Territory, Employee. Relationship types: Supplies, Part_of, Orders, Purchases, Ships, Sold, In_Region,
In_Territory, Reports_To.]

6.3 Creation of Relationships


6.4 Graph Traversal and Shortest Paths
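
As an illustration of graph traversal, the sketch below finds a shortest path between two node labels of our schema using breadth-first search in plain Python. The edge list is our reading of the schema figure and is an assumption. Relationship direction is ignored, and the self-referencing Reports_To relationship is left out because it cannot shorten a path. Cypher also offers a built-in shortestPath function for this kind of query, which this sketch does not use.

```python
from collections import deque

# Assumed connections between node labels, treated as undirected.
EDGES = [
    ("Supplier", "Product"),
    ("Product", "Category"),
    ("Order", "Product"),
    ("Customer", "Order"),
    ("Shipper", "Order"),
    ("Employee", "Order"),
    ("Employee", "Territory"),
    ("Territory", "Region"),
]


def build_adjacency(edges):
    """Build an undirected adjacency list from (a, b) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search: return one shortest path as a list, or None."""
    if start not in adjacency or goal not in adjacency:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in adjacency[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


adjacency = build_adjacency(EDGES)
print(shortest_path(adjacency, "Supplier", "Customer"))
```

Breadth-first search visits nodes in order of distance from the start, so the first time it reaches the goal the path has the fewest possible hops.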


Chapter 7

Conclusion
Graph databases are often a better fit than relational databases when the data model must stay flexible and
the relationships in the data are complex. Relational databases are based on a table structure, which is
difficult to change once the data is in the database. Graph databases, on the other hand, are based on a graph
structure, which is easy to extend. As data becomes increasingly complex, graph databases are the stronger
choice for use cases where manipulating complex relationships is a priority, such as social networks or
financial data. However, graph databases are less well suited to data that is easily represented in a tabular
format, such as product catalogs or customer orders.

Understanding the differences between graph and relational databases is the first step to building effective
data models that provide valuable insights into connected data. It is also important to note that the two are
not competing alternatives; each serves a different purpose. The key point to bear in mind is that graph
databases are better suited to applications that involve many relationships between data points, while
relational databases are better for applications with less complex data structures.

Graph databases are well suited for applications that require the storage and retrieval of data that can be
represented as a graph, such as social networks, maps, and other networks. They are also well suited for applications
that require the analysis of data that is connected in complex ways, such as fraud detection and
recommendation engines.

For example, in a social network, a user's friends have friends of their own. A graph database can quickly
find all the friends of a user's friends. In contrast, a relational database would need to perform multiple joins
to find the same information. By prioritizing relationships, graph databases can provide greater insight into
data. In general, any application that would benefit from being able to represent data as a network of
interconnected nodes would be a good candidate for a graph database.
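
The friends-of-friends lookup mentioned above can be sketched as a two-step traversal over an adjacency map. The names and friendships are hypothetical sample data, and the code is plain Python rather than a query against a graph database.

```python
# Hypothetical friendships, stored as an adjacency map (each pair appears both ways).
FRIENDS = {
    "Asha": {"Ben", "Chen"},
    "Ben": {"Asha", "Dev"},
    "Chen": {"Asha", "Dev", "Esha"},
    "Dev": {"Ben", "Chen"},
    "Esha": {"Chen"},
}


def friends_of_friends(graph, user):
    """Return people exactly two hops from user, excluding direct friends and user."""
    direct = graph.get(user, set())
    two_hops = set()
    for friend in direct:
        # Follow each friendship one more step.
        two_hops |= graph.get(friend, set())
    return sorted(two_hops - direct - {user})


print(friends_of_friends(FRIENDS, "Asha"))
```

Each hop is a direct lookup from a node to its neighbours, whereas the relational version would join the friendship table to itself once per hop.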
