
Table of Contents

Preface
I. Introduction
1. Neo4j Highlights
2. Graph Database Concepts
3. The Neo4j Graph Database
II. Tutorials
4. Using Neo4j Embedded in Java Applications
5. Neo4j Remote Client Libraries
6. The Traversal Framework
7. Data Modeling Examples
8. Language Guides
9. Using Neo4j from Python
10. Extending the Neo4j Server
III. Reference
11. Performance
12. Transaction Management
13. Data Import
14. Indexing
15. The Cypher Query Language
16. Graph Algorithms
17. Neo4j Server
18. REST API
19. Using Neo4j Embedded in Python
IV. Operations
20. Installation and Deployment
21. Configuration and Tuning
22. High Availability
23. Backup
24. Security
25. Monitoring the Server
V. Tools
26. Web Interface: the web-based Neo4j administration tool
27. Neo4j Shell
VI. Community
28. Community Support
29. Contributing to Neo4j
A. Manpages
A.1. neo4j
A.2. neo4j-shell
A.3. neo4j-backup
A.4. neo4j-coordinator
A.5. neo4j-coordinator-shell
B. Frequently Asked Questions

Chapter 1. Neo4j Highlights

As a robust, scalable, high-performance database, Neo4j is suitable both for full enterprise deployment and as a subset of a full server in a lightweight project.

Its notable features include:

- full ACID support
- high availability
- easy scaling to billions of nodes and relationships
- high-speed retrieval of data through the traversal framework

Proper ACID behavior is the foundation of data consistency. Neo4j makes sure that the operations within a transaction happen together, keeping the data consistent. This applies whether Neo4j runs embedded or as a multi-server cluster. For details, see the chapter on transactions.

A reliable graph store can be integrated into any application with little effort. As an application grows in production, performance problems tend to surface sooner or later; Neo4j, however, is limited only by the hardware it runs on, not by the shape of the business data itself. A single Neo4j server can host graphs with billions of nodes and relationships. When a single machine can no longer hold the data, a distributed high-availability cluster can be deployed; for details, see the chapter on ha. Storing highly connected data is where a graph database has its biggest advantage: with the traversal tools Neo4j provides, data retrieval is extremely efficient, reaching hundreds of millions of retrievals per second. A retrieval step is comparable to a _join_ operation in an RDBMS.

Chapter 2. Graph Database Concepts

This chapter contains an introduction to the graph data model and a comparison with other data persistence models we commonly encounter.

2.1. What is a graph database?

A graph database stores data in a graph, one of the most performance-friendly data structures for storing data. Let's follow the diagrams below to explain the related concepts; we read each diagram by following the arrows to understand what the graph expresses.

2.1.1. A graph consists of nodes and relationships

"A Graph - records data in -> Nodes - which have -> Properties"

The simplest possible graph is a single node: one record holding some properties. A node can start with a single property and grow to hold many millions, though that gets a little unwieldy. At some point it makes sense to distribute the data across multiple nodes, connected by relationships.

2.1.2. Relationships organize the graph

"Nodes - are organized by -> Relationships - which also have -> Properties"

Relationships organize nodes into arbitrary structures, allowing a graph to be shaped like a list, a tree, a map, or a compound entity - one that is itself composed of complex, highly connected structures.

2.1.3. Query the database with a Traversal

"A Traversal - navigates -> a Graph; it - identifies -> Paths - which order -> Nodes"

A traversal is how you query a graph: starting from one or more nodes, an algorithm navigates to the nodes related to them, answering questions like "what music do my friends like that I don't yet own?" or "if this power supply goes down, which services on that server will be affected?"

2.1.4. Indexing nodes and relationships

"An Index - maps from -> Properties - to either -> Nodes or Relationships"

Often you want to find a node or relationship by a given property value. Rather than traversing the whole graph to find it, using an index is far more efficient - for example, "find the user with the username tony".
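As a plain-Java analogy (not the Neo4j API; the class, methods and data here are made up for illustration), an index is essentially a reverse map from a property value to the node that carries it, replacing a scan over every node with a single lookup:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class IndexLookup
{
    // A tiny "node store": node id -> username property (made-up data).
    static final Map<Long, String> NODES = new LinkedHashMap<>();
    // The "index": property value -> node id.
    static final Map<String, Long> NAME_INDEX = new HashMap<>();
    static
    {
        NODES.put( 1L, "tony" );
        NODES.put( 2L, "maria" );
        NODES.put( 3L, "neo" );
        for ( Map.Entry<Long, String> e : NODES.entrySet() )
        {
            NAME_INDEX.put( e.getValue(), e.getKey() );
        }
    }

    // Without an index: visit every node and compare the property.
    public static Long findByScan( String name )
    {
        for ( Map.Entry<Long, String> e : NODES.entrySet() )
        {
            if ( e.getValue().equals( name ) )
            {
                return e.getKey();
            }
        }
        return null;
    }

    // With an index: a single map lookup, no visit of the other nodes.
    public static Long findByIndex( String name )
    {
        return NAME_INDEX.get( name );
    }

    public static void main( String[] args )
    {
        System.out.println( findByScan( "tony" ) + " == " + findByIndex( "tony" ) );
    }
}
```

Both lookups return the same node; the index simply trades a little extra storage and write-time work for constant-time reads.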

2.1.5. Neo4j is a graph database

"A Graph Database - manages a -> Graph and - also manages related -> Indexes"

Neo4j is a commercially supported open-source graph database. It was designed as a data store for fast-growing, highly connected data, replacing traditional table layouts with an efficient graph structure. With Neo4j, your application gets all the expressive power of the graph, along with the reliability you expect from an enterprise-strength database.

2.2. Comparing database models

A graph database persists our data structures by storing nodes and relationships in a graph. How does that compare with other persistence models? Because a graph is such a general-purpose data structure, let's compare a few common models against it.

2.2.1. From the graph model to the relational model

Picture all the data laid out in vertical stacks of records while keeping the connections between them, and you get the figure below. An RDBMS is optimized for aggregated data; Neo4j is optimized for highly connected data.

Figure 2.1. RDBMS

Figure 2.2. The RDBMS model expressed as a graph

2.2.2. From the graph model to the key-value model

The key-value model works well for simple records and lists. When the data becomes more interconnected, you increasingly need a graph model. Neo4j lets you evolve simple data structures into complex, richly connected data.

Figure 2.3. Key-value store model

K* represents a key, V* a value. Note that some keys point to other keys as well as to plain values.

Figure 2.4. The key-value model expressed as a graph

2.2.3. From the graph model to the column-family model

Column-family (BigTable) stores are an evolution of the key-value model, using "families" to allow grouping of rows. Stored as a graph, the table structure would appear in the hierarchy, with the relationships made explicit.

2.2.4. From the graph model to the document model

A document database carves its data into documents for hierarchy, and such a free-form data layout is easily represented as a tree. To grow into a full graph, the references between documents need a more expressive data structure to store them, and in Neo4j those relationships are handled naturally.

Figure 2.5. Document database

D = document, S = subdocument, V = value, D2/S2 = reference to a (different) document.

Figure 2.6. The document model expressed as a graph

Chapter 3. The Neo4j Graph Database

This chapter goes into more detail on the data model and behavior of Neo4j.

3.1. Nodes

The fundamental units that form a graph are nodes and relationships. In Neo4j, both nodes and relationships can contain properties.

Nodes are often used to represent _entities_, but depending on the domain, relationships may be used for that purpose as well.

Let's look at the simplest possible node: it has a single property, with the key name and the value Marko:

3.2. Relationships

Relationships between nodes are a key part of a graph database. Through relationships you can find lots of connected data - sets of nodes, sets of relationships, and their properties.

A relationship connects two nodes: it must have a start node and an end node.

Because relationships are always directed, a node sees its relationships as either incoming or outgoing, a feature that is very useful when traversing the graph:

Relationships can be traversed in either direction. This means there is no need to add duplicate relationships in the opposite direction. While a relationship always has a direction, you can simply ignore the direction when it isn't meaningful to your application.

Note in particular that a node can have a relationship pointing to itself:

To further organize the graph for traversal, every relationship is given a relationship type. Note that the word type may be misleading here - you can simply think of it as a label.

The following example is a minimal social network graph with two relationship types.

Table 3.1. Relationships and relationship types used

What                            How
get who a person follows        outgoing follows relationships, depth one
get the followers of a person   incoming follows relationships, depth one
get who a person blocks         outgoing blocks relationships, depth one
get who a person is blocked by  incoming blocks relationships, depth one

The following example is a simple file system, including some symbolic soft links:

Depending on what you are looking for, you will use the direction and type of the relationships during traversal.

What                                                    How
get the full path of a file                             incoming file relationships
get all paths for a file                                incoming file and symbolic link relationships
get all files in a directory                            outgoing file and symbolic link relationships, depth one
get all files in a directory, excluding symbolic links  outgoing file relationships, depth one
get all files in a directory, recursively               outgoing file and symbolic link relationships

3.3. Properties

Both nodes and relationships can have properties.

A property is a key-value pair where the key is a string. A property value is either a primitive value or an array of one primitive type. For example String, int and int[] are all valid values.

Note

null is not a valid property value. Nulls can instead be modeled by the absence of the key.
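This rule can be captured in a small plain-Java sketch (a hypothetical helper written for illustration, not part of the Neo4j API): primitives, Strings and arrays of either are accepted, while null is rejected.

```java
public class PropertyValues
{
    // Returns true if the value would be a valid property value according
    // to the rules above: a primitive, a String, or an array of either.
    // null is never valid. (Hypothetical helper, for illustration only.)
    public static boolean isValid( Object value )
    {
        if ( value == null )
        {
            return false;
        }
        Class<?> type = value.getClass();
        if ( type.isArray() )
        {
            // For arrays, the component type decides validity.
            type = type.getComponentType();
        }
        return type.isPrimitive()
                || type == String.class
                || type == Boolean.class || type == Byte.class
                || type == Short.class || type == Integer.class
                || type == Long.class || type == Float.class
                || type == Double.class || type == Character.class;
    }

    public static void main( String[] args )
    {
        System.out.println( isValid( "a string" ) );        // true
        System.out.println( isValid( 42 ) );                // true (int)
        System.out.println( isValid( new int[]{ 1, 2 } ) ); // true (int[])
        System.out.println( isValid( null ) );              // false
    }
}
```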
Table 3.2. Property value types

Type     Description                                               Value range
boolean  true/false
byte     8-bit integer                                             -128 to 127, inclusive
short    16-bit integer                                            -32768 to 32767, inclusive
int      32-bit integer                                            -2147483648 to 2147483647, inclusive
long     64-bit integer                                            -9223372036854775808 to 9223372036854775807, inclusive
float    32-bit IEEE 754 floating-point number
double   64-bit IEEE 754 floating-point number
char     16-bit unsigned integers representing Unicode characters  u0000 to uffff (0 to 65535)
String   sequence of Unicode characters

For further details on float/double values, see the Java Language Specification.

3.4. Paths

A path consists of at least one node, and possibly several relationships connecting the nodes; paths are typically returned as the result of a query or a traversal.

The shortest possible path has length zero and looks like this:

A path of length one looks like this:
3.5. Traversal

Traversing a graph means visiting its nodes, following relationships according to some rules. In most cases only a subgraph is visited, as you already know which parts of the graph - which nodes or relationships - are interesting to you.

Neo4j comes with a traversal API which lets you specify the traversal rules. At the most basic level there's a choice between traversing breadth-first or depth-first.

For an in-depth introduction to the traversal framework, see the chapter tutorial-traversal.

For more Java code examples, see tutorials-java-embedded-traversal.

Other ways of querying the graph are Cypher (cypher-query-lang) and Gremlin (gremlin-plugin).

Part II. Tutorials

The tutorials walk you through setting up your environment for development with Neo4j, from the simplest Hello World to advanced usage of the graph database.

Chapter 4. Using Neo4j Embedded in Java Applications

It's very easy to use Neo4j embedded in Java. In this chapter you will find everything needed - from setting up the environment to doing something useful with your data.

4.1. Including Neo4j in your project

After selecting the edition appropriate for your platform (see editions), just include the Neo4j jar files on your project's build path and you can use the Neo4j database in your project. The following sections show how to do this, either by altering the build path directly or by using dependency management.

4.1.1. Adding the Neo4j libraries to the build path

The jar files needed can be obtained in either of the following ways:

- Unpack the downloaded Neo4j distribution; the jars we need are all in the lib directory.
- Use the jars directly from Maven Central.

Include the jars in your project:

JDK tools
  Append to the -classpath.
Eclipse
  - Right-click the project and select Build Path -> Configure Build Path. In the dialog, choose Add External JARs, browse to Neo4j's 'lib/' directory and select all the jar files.
  - Another option is to use User Libraries.
IntelliJ IDEA
  See Libraries, Global Libraries, and the Configure Library dialog for details.
NetBeans
  - Right-click on the project's Libraries node, select Add JAR/Folder, browse to Neo4j's 'lib/' directory and select all the jar files there.
  - You can also manage library files from the project node; for details, see managing a project's classpath.

4.1.2. Adding Neo4j as a dependency

For an overview of the main Neo4j artifacts, see editions. The artifacts listed there are top-level artifacts that include the actual Neo4j implementation.

You can either use the top-level artifact or include the individual components directly. The examples here use the top-level artifact approach.

Maven

Maven dependency.

<project>
...
 <dependencies>
  <dependency>
   <groupId>org.neo4j</groupId>
   <artifactId>neo4j</artifactId>
   <version>1.8</version>
  </dependency>
  ...
 </dependencies>
...
</project>

Where the artifactId is found in editions.

Eclipse and Maven

For development in Eclipse, it is recommended to install the m2e plugin and let Maven manage the project classpath instead of the above approach. This way you can both build the project from the Maven command line and have Maven automatically generate an Eclipse workspace for development.

Ivy

Make sure to resolve dependencies from Maven Central; for example, use this configuration in your 'ivysettings.xml' file:

<ivysettings>
  <settings defaultResolver="main"/>
  <resolvers>
    <chain name="main">
      <filesystem name="local">
        <artifact pattern="${ivy.settings.dir}/repository/[artifact]-[revision].[ext]" />
      </filesystem>
      <ibiblio name="maven_central" root="http://repo1.maven.org/maven2/" m2compatible="true"/>
    </chain>
  </resolvers>
</ivysettings>

With that in place you can include Neo4j by adding the following to your 'ivy.xml':

..
<dependencies>
..
  <dependency org="org.neo4j" name="neo4j" rev="1.8"/>
..
</dependencies>
..

Where the name is found in editions.

Gradle

The following example demonstrates a Gradle build script for including the Neo4j libraries.

def neo4jVersion = "1.8"
apply plugin: 'java'
repositories {
   mavenCentral()
}
dependencies {
   compile "org.neo4j:neo4j:${neo4jVersion}"
}

Where the coordinates (org.neo4j:neo4j in the example) are found in editions.

4.1.3. Starting and stopping

To create a new database or open an existing one, you instantiate an EmbeddedGraphDatabase:

graphDb = new GraphDatabaseFactory().newEmbeddedDatabase( DB_PATH );
registerShutdownHook( graphDb );

Note

The EmbeddedGraphDatabase instance can be shared among multiple threads. Note however that you can't create multiple instances pointing to the same database.

To stop the database, call its shutdown() method:

graphDb.shutdown();

To make sure Neo4j is shut down properly, you can add a shutdown hook for it:

private static void registerShutdownHook( final GraphDatabaseService graphDb )
{
    // Registers a shutdown hook for the Neo4j instance so that it
    // shuts down nicely when the VM exits (even if you "Ctrl-C" the
    // running example before it's completed)
    Runtime.getRuntime().addShutdownHook( new Thread()
    {
        @Override
        public void run()
        {
            graphDb.shutdown();
        }
    } );
}

If you only want to browse the database in read-only mode, use EmbeddedReadOnlyGraphDatabase.

To start Neo4j with configuration settings, a Neo4j properties file can be loaded like this:

GraphDatabaseService graphDb = new GraphDatabaseFactory().
    newEmbeddedDatabaseBuilder( "target/database/location" ).
    loadPropertiesFromFile( pathToConfig + "neo4j.properties" ).
    newGraphDatabase();

Or you can programmatically create your own Map<String, String> instead.

For more details on configuration settings, see embedded-configuration.

4.2. Hello World

Here you can learn how to create and access nodes and relationships. For information on project setup, see Section 4.1, "Including Neo4j in your project".

Remember from Section 2.1, "What is a graph database?", that a Neo4j graph consists of:

- nodes that are connected by
- relationships, with
- properties on both nodes and relationships.

All relationships have a type. For example, if the graph database instance represents a social network, a relationship type could be KNOWS. If a relationship of type KNOWS connects two nodes, it probably represents two people that know each other. A lot of the semantics of a graph database is encoded into the relationship types of the application. And although relationships are directed, they can be traversed regardless of their direction.

Tip

The source code of this example can be downloaded from:
EmbeddedNeo4j.java

4.2.1. Preparing the graph database

Relationship types can be created with an enum. In this example we only need a single relationship type. This is how we define it:

private static enum RelTypes implements RelationshipType
{
    KNOWS
}
We also prepare some variables to use:

GraphDatabaseService graphDb;
Node firstNode;
Node secondNode;
Relationship relationship;

The next step is to start the database server. Note that if the directory given for the database doesn't already exist, it will be created.

graphDb = new GraphDatabaseFactory().newEmbeddedDatabase( DB_PATH );
registerShutdownHook( graphDb );

Note that starting a graph database is an expensive (resource-heavy) operation, so don't start a new instance every time you need to interact with the database. The instance can be shared by multiple threads; transactions are thread confined.

As seen above, we register a shutdown hook that makes sure the database is shut down when the JVM exits. Now it's time to interact with the database.

4.2.2. Wrapping write operations in a transaction

All write operations (creating, deleting and updating) are performed in a transaction. This is an intentional design decision, since we believe transaction demarcation to be an important part of working with a real enterprise database. Now, transaction handling in Neo4j is very easy:

Transaction tx = graphDb.beginTx();
try
{
    // Updating operations go here
    tx.success();
}
finally
{
    tx.finish();
}

For more information on transactions, see transactions and the Java API for the Transaction interface.

4.2.3. Creating a small graph

Now, let's create a few nodes. The API is very intuitive. Feel free to have a look at the JavaDocs at http://components.neo4j.org/neo4j/1.8/apidocs/. They're included in the distribution, as well. Here's how to create a small graph consisting of two nodes, connected with one relationship, plus some properties on the nodes and the relationship:

firstNode = graphDb.createNode();
firstNode.setProperty( "message", "Hello, " );
secondNode = graphDb.createNode();
secondNode.setProperty( "message", "World!" );

relationship = firstNode.createRelationshipTo( secondNode, RelTypes.KNOWS );
relationship.setProperty( "message", "brave Neo4j " );

We now have a graph that looks like this:

Figure 4.1. Hello World graph

4.2.4. Printing the result

After we've created our graph, let's read from it and print the result.

System.out.print( firstNode.getProperty( "message" ) );
System.out.print( relationship.getProperty( "message" ) );
System.out.print( secondNode.getProperty( "message" ) );

Which will output:

Hello, brave Neo4j World!
4.2.5. Removing the data

In this case we'll remove the data before committing:

// let's remove the data
firstNode.getSingleRelationship( RelTypes.KNOWS, Direction.OUTGOING ).delete();
firstNode.delete();
secondNode.delete();

Note that deleting a node which still has relationships will fail when the transaction commits. This is to make sure relationships always have a start node and an end node.
4.2.6. Shutting down the database server

Finally, shut down the database server when the application finishes:

graphDb.shutdown();

4.3. User database with index

You have a user database, and want to retrieve users by name. First, this is the structure of the database we want to create:

Figure 4.2. Node space view of users

Here, the reference node is connected to a users-reference node, and all the actual users are in turn connected to the users-reference node.

Tip

The source code of this example can be downloaded from:
EmbeddedNeo4jWithIndexing.java

First, we define the relationship types we will use:

private static enum RelTypes implements RelationshipType
{
    USERS_REFERENCE,
    USER
}
Then we create two helper methods to handle user names and to add users to the database:

private static String idToUserName( final int id )
{
    return "user" + id + "@neo4j.org";
}

private static Node createAndIndexUser( final String username )
{
    Node node = graphDb.createNode();
    node.setProperty( USERNAME_KEY, username );
    nodeIndex.add( node, USERNAME_KEY, username );
    return node;
}
The next step is to start the database server:

graphDb = new GraphDatabaseFactory().newEmbeddedDatabase( DB_PATH );
nodeIndex = graphDb.index().forNodes( "nodes" );
registerShutdownHook();
It's time to add the users:

Transaction tx = graphDb.beginTx();
try
{
    // Create users sub reference node
    Node usersReferenceNode = graphDb.createNode();
    graphDb.getReferenceNode().createRelationshipTo(
        usersReferenceNode, RelTypes.USERS_REFERENCE );
    // Create some users and index their names with the IndexService
    for ( int id = 0; id < 100; id++ )
    {
        Node userNode = createAndIndexUser( idToUserName( id ) );
        usersReferenceNode.createRelationshipTo( userNode, RelTypes.USER );
    }
    tx.success();
}
finally
{
    tx.finish();
}
And here's how to find a user by id:

int idToFind = 45;
Node foundUser = nodeIndex.get( USERNAME_KEY,
    idToUserName( idToFind ) ).getSingle();
System.out.println( "The username of user " + idToFind + " is "
    + foundUser.getProperty( USERNAME_KEY ) );
4.4. Basic unit testing

The basic pattern of unit testing with Neo4j is illustrated by the following example.

To access the Neo4j testing facilities, you should have the neo4j-kernel 'tests.jar' on the classpath. You can download the jars needed from Maven Central: org.neo4j:neo4j-kernel.

Using Maven as a dependency manager, you would typically add this dependency to your pom.xml:

Maven dependency.

<project>
...
 <dependencies>
  <dependency>
   <groupId>org.neo4j</groupId>
   <artifactId>neo4j-kernel</artifactId>
   <version>${neo4j-version}</version>
   <type>test-jar</type>
   <scope>test</scope>
  </dependency>
  ...
 </dependencies>
...
</project>

_ ${neo4j-version} is the Neo4j version in use. _

With that done, we're ready to code our unit tests.

Tip

The source code of this example can be downloaded from:
Neo4jBasicTest.java
Before each unit test, create a fresh database:

@Before
public void prepareTestDatabase()
{
    graphDb = new TestGraphDatabaseFactory().newImpermanentDatabaseBuilder().newGraphDatabase();
}
在测试完成之后,请关闭数据库:

@After
1
public void
2
destroyTestDatabase()
3
{
4
graphDb.shutdown();
5
}
During a test, create nodes and check to see that they are there, wrapping write operations in a transaction.

Transaction tx = graphDb.beginTx();

Node n = null;
try
{
    n = graphDb.createNode();
    n.setProperty( "name", "Nancy" );
    tx.success();
}
catch ( Exception e )
{
    tx.failure();
}
finally
{
    tx.finish();
}

// The node should have an id greater than 0, which is the id of the
// reference node.
assertThat( n.getId(), is( greaterThan( 0l ) ) );

// Retrieve a node by using the id of the created node. The id's and
// property should match.
Node foundNode = graphDb.getNodeById( n.getId() );
assertThat( foundNode.getId(), is( n.getId() ) );
assertThat( (String) foundNode.getProperty( "name" ), is( "Nancy" ) );
If you want to set configuration parameters at database creation, it's done like this:

Map<String, String> config = new HashMap<String, String>();
config.put( "neostore.nodestore.db.mapped_memory", "10M" );
config.put( "string_block_size", "60" );
config.put( "array_block_size", "300" );
GraphDatabaseService db = new ImpermanentGraphDatabase( config );

4.5. Traversal

For more information on traversals, see tutorial-traversal.

For more examples of traversals, see Chapter 7, Data Modeling Examples.

4.5.1. The Matrix

The traversals of the Matrix example above, this time using the new traversal API:

Tip

The source code of the examples can be downloaded from:
NewMatrix.java

Friends and friends of friends.

private static Traverser getFriends(
    final Node person )
{
    TraversalDescription td = Traversal.description()
        .breadthFirst()
        .relationships( RelTypes.KNOWS, Direction.OUTGOING )
        .evaluator( Evaluators.excludeStartPosition() );
    return td.traverse( person );
}
Let's perform the actual traversal and print the results:

int numberOfFriends = 0;
String output = neoNode.getProperty( "name" ) + "'s friends:\n";
Traverser friendsTraverser = getFriends( neoNode );
for ( Path friendPath : friendsTraverser )
{
    output += "At depth " + friendPath.length() + " => "
        + friendPath.endNode().getProperty( "name" ) + "\n";
    numberOfFriends++;
}
output += "Number of friends found: " + numberOfFriends + "\n";

Which will output:

Thomas Anderson's friends:
At depth 1 => Trinity
At depth 1 => Morpheus
At depth 2 => Cypher
At depth 3 => Agent Smith
Number of friends found: 4

Who coded The Matrix?.

private static Traverser findHackers( final Node startNode )
{
    TraversalDescription td = Traversal.description()
        .breadthFirst()
        .relationships( RelTypes.CODED_BY, Direction.OUTGOING )
        .relationships( RelTypes.KNOWS, Direction.OUTGOING )
        .evaluator(
            Evaluators.includeWhereLastRelationshipTypeIs( RelTypes.CODED_BY ) );
    return td.traverse( startNode );
}

Print the result:

String output = "Hackers:\n";
int numberOfHackers = 0;
Traverser traverser = findHackers( getNeoNode() );
for ( Path hackerPath : traverser )
{
    output += "At depth " + hackerPath.length() + " => "
        + hackerPath.endNode().getProperty( "name" ) + "\n";
    numberOfHackers++;
}
output += "Number of hackers found: " + numberOfHackers + "\n";

Now we know who coded The Matrix:

Hackers:
At depth 4 => The Architect
Number of hackers found: 1

Walking an ordered path

This example shows how to use a path context to control how a path is walked.

Tip

The source code of this example can be downloaded from:
OrderedPath.java

Create a graph.

Node A = db.createNode();
Node B = db.createNode();
Node C = db.createNode();
Node D = db.createNode();
A.createRelationshipTo( B, REL1 );
B.createRelationshipTo( C, REL2 );
C.createRelationshipTo( D, REL3 );
A.createRelationshipTo( C, REL2 );

Now the order of relationships ( REL1 -> REL2 -> REL3 ) is stored in an ArrayList. Upon traversal, the Evaluator can check against it to ensure that only paths are included and returned that have the predefined order of relationships:

Define how to walk the path.

final ArrayList<RelationshipType> orderedPathContext = new ArrayList<RelationshipType>();
orderedPathContext.add( REL1 );
orderedPathContext.add( withName( "REL2" ) );
orderedPathContext.add( withName( "REL3" ) );
TraversalDescription td = Traversal.description()
    .evaluator( new Evaluator()
    {
        @Override
        public Evaluation evaluate( final Path path )
        {
            if ( path.length() == 0 )
            {
                return Evaluation.EXCLUDE_AND_CONTINUE;
            }
            RelationshipType expectedType = orderedPathContext.get( path.length() - 1 );
            boolean isExpectedType = path.lastRelationship()
                .isType( expectedType );
            boolean included = path.length() == orderedPathContext.size()
                && isExpectedType;
            boolean continued = path.length() < orderedPathContext.size()
                && isExpectedType;
            return Evaluation.of( included, continued );
        }
    } );
Perform the traversal and print the result.

Traverser traverser = td.traverse( A );
PathPrinter pathPrinter = new PathPrinter( "name" );
for ( Path path : traverser )
{
    output += Traversal.pathToString( path, pathPrinter );
}

Which will output:

(A)--[REL1]-->(B)--[REL2]-->(C)--[REL3]-->(D)

In this case we use a custom class to format the path output. This is how it's done:

static class PathPrinter implements Traversal.PathDescriptor<Path>
{
    private final String nodePropertyKey;

    public PathPrinter( String nodePropertyKey )
    {
        this.nodePropertyKey = nodePropertyKey;
    }

    @Override
    public String nodeRepresentation( Path path, Node node )
    {
        return "(" + node.getProperty( nodePropertyKey, "" ) + ")";
    }

    @Override
    public String relationshipRepresentation( Path path, Node from,
        Relationship relationship )
    {
        String prefix = "--", suffix = "--";
        if ( from.equals( relationship.getEndNode() ) )
        {
            prefix = "<--";
        }
        else
        {
            suffix = "-->";
        }
        return prefix + "[" + relationship.getType().name() + "]" + suffix;
    }
}

For options regarding the output of a Path, see the Traversal class.

Note

The following examples use a deprecated traversal API. It shares the underlying implementation with the new traversal API, so performance-wise they are the same; the functionality it provides is, however, more limited in comparison.

4.5.2. The old traversal API

This is the first graph we want to traverse:

Figure 4.3. Node space view of The Matrix

Tip

The source code of the examples can be downloaded from: Matrix.java

Friends and friends of friends.

private static Traverser getFriends( final Node person )
{
    return person.traverse( Order.BREADTH_FIRST,
        StopEvaluator.END_OF_GRAPH,
        ReturnableEvaluator.ALL_BUT_START_NODE,
        RelTypes.KNOWS,
        Direction.OUTGOING );
}
Let's perform the actual traversal and print the results:

int numberOfFriends = 0;
String output = neoNode.getProperty( "name" ) + "'s friends:\n";
Traverser friendsTraverser = getFriends( neoNode );
for ( Node friendNode : friendsTraverser )
{
    output += "At depth " +
        friendsTraverser.currentPosition().depth() +
        " => " +
        friendNode.getProperty( "name" ) + "\n";
    numberOfFriends++;
}
output += "Number of friends found: " + numberOfFriends + "\n";

Which will output:

Thomas Anderson's friends:
At depth 1 => Trinity
At depth 1 => Morpheus
At depth 2 => Cypher
At depth 3 => Agent Smith
Number of friends found: 4

Who coded The Matrix?

private static Traverser findHackers( final Node startNode )
{
    return startNode.traverse( Order.BREADTH_FIRST,
        StopEvaluator.END_OF_GRAPH, new ReturnableEvaluator()
        {
            @Override
            public boolean isReturnableNode(
                final TraversalPosition currentPos )
            {
                return !currentPos.isStartNode()
                    && currentPos.lastRelationshipTraversed()
                        .isType( RelTypes.CODED_BY );
            }
        }, RelTypes.CODED_BY, Direction.OUTGOING, RelTypes.KNOWS,
        Direction.OUTGOING );
}
Print the result:

String output = "Hackers:\n";
int numberOfHackers = 0;
Traverser traverser = findHackers( getNeoNode() );
for ( Node hackerNode : traverser )
{
    output += "At depth " +
        traverser.currentPosition().depth() +
        " => " +
        hackerNode.getProperty( "name" ) + "\n";
    numberOfHackers++;
}
output += "Number of hackers found: " + numberOfHackers + "\n";

Now we know who coded The Matrix:

Hackers:
At depth 4 => The Architect
Number of hackers found: 1

4.5.3. Uniqueness of paths in traversals

This example demonstrates the use of node uniqueness. Below is an imaginary domain graph with principals that own pets, and pets that descend from other pets.

Figure 4.4. Descendants example graph

In order to return all descendants of Pet0 which have the relation owns to Principal1 (in fact only Pet1 and Pet3), the Uniqueness of the traversal needs to be set to NODE_PATH rather than the default NODE_GLOBAL, so that nodes can be traversed more than once, and paths that have different nodes but can have some nodes in common (like the start and end node) can be returned.

final Node target = data.get().get( "Principal1" );
TraversalDescription td = Traversal.description()
    .uniqueness( Uniqueness.NODE_PATH )
    .evaluator( new Evaluator()
    {
        @Override
        public Evaluation evaluate( Path path )
        {
            if ( path.endNode().equals( target ) )
            {
                return Evaluation.INCLUDE_AND_PRUNE;
            }
            return Evaluation.EXCLUDE_AND_CONTINUE;
        }
    } );

Traverser results = td.traverse( start );
This will return the following paths:

(3)--[descendant,0]-->(1)<--[owns,3]--(5)
(3)--[descendant,2]-->(4)<--[owns,5]--(5)

In the default path.toString() implementation, (1)--[knows,2]-->(4) means that a node with ID=1 is connected to a node with ID=4 through a relationship with ID=2 and type knows.

Let's create a new TraversalDescription from the old one, having NODE_GLOBAL uniqueness, to see the difference.

Tip

The TraversalDescription object is immutable, so we have to use the new instance returned with the new uniqueness setting.

TraversalDescription nodeGlobalTd = td.uniqueness( Uniqueness.NODE_GLOBAL );
results = nodeGlobalTd.traverse( start );

Now only one path is returned:

(3)--[descendant,0]-->(1)<--[owns,3]--(5)

4.5.4. Social network

Note: the following example uses the experimental traversal API.

Social networks (also known as social graphs on the web) are natural to model with a graph. The following example shows a very simple social model that connects friends and keeps track of status updates.

Tip

The source code of the example can be downloaded from:
socnet

Simple social model

Figure 4.5. Social network data model

The data model for a social network is pretty simple: Persons with names and StatusUpdates with timestamped text. These entities are then connected by specific relationships.

- Person
  o friend: relates two distinct Person instances (cannot relate to itself)
  o status: connects to the most recent StatusUpdate
- StatusUpdate
  o next: points to the next StatusUpdate in the chain, which was posted before the current one

Status graph instance

The StatusUpdate list for a Person is a linked list. The head of the list (the most recent status) is found by following status. Each subsequent StatusUpdate is connected by next.

Here's an example where Andreas Kollegger micro-blogged his way to work in the morning:

To read the status updates, we can create a traversal, like so:

TraversalDescription traversal = Traversal.description().
    depthFirst().
    relationships( NEXT );

This gives us a traverser that will start at one StatusUpdate and will follow the chain of updates until they run out. Traversers are lazy loading, so it's performant even when dealing with thousands of statuses - they are not loaded until we actually consume them.

Activity stream

Once we have friends, and they have status messages, we might want to read our friends' status messages, in reverse time order - the latest first. To achieve this, we go through these steps:

1. Grab all friends' status updates into a list - the latest first.
2. Sort the list.
3. Return the first item in the list.
4. If the first iterator is exhausted, remove it from the list. Otherwise, get the next item in that iterator.
5. Go to step 2 until there are no more items left in the list.

The sequence looks like this.
The code looks like this:

PositionedIterator<StatusUpdate> first = statuses.get(0);
StatusUpdate returnVal = first.current();

if ( !first.hasNext() )
{
    statuses.remove( 0 );
}
else
{
    first.next();
    sort();
}

return returnVal;
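The polling loop above can be sketched end-to-end in plain Java (an illustrative stand-in, not the socnet source: the class, method names and the use of bare timestamps instead of StatusUpdate objects are assumptions for the sketch). We keep one cursor per friend's already-sorted, newest-first status list and repeatedly take from whichever cursor currently holds the newest head:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ActivityStreamMerge
{
    // Merge per-friend status lists (each sorted newest-first; here the
    // statuses are just timestamps) into one newest-first stream.
    public static List<Long> merge( List<List<Long>> perFriend )
    {
        // One "positioned iterator" per friend: [friendIndex, position].
        List<int[]> cursors = new ArrayList<>();
        for ( int i = 0; i < perFriend.size(); i++ )
        {
            if ( !perFriend.get( i ).isEmpty() )
            {
                cursors.add( new int[]{ i, 0 } );
            }
        }
        List<Long> result = new ArrayList<>();
        while ( !cursors.isEmpty() )
        {
            // Steps 2-3: sort the heads newest-first, take the first one.
            cursors.sort( ( a, b ) -> Long.compare(
                perFriend.get( b[0] ).get( b[1] ),
                perFriend.get( a[0] ).get( a[1] ) ) );
            int[] first = cursors.get( 0 );
            result.add( perFriend.get( first[0] ).get( first[1] ) );
            // Step 4: advance this iterator, or drop it when exhausted.
            if ( ++first[1] >= perFriend.get( first[0] ).size() )
            {
                cursors.remove( 0 );
            }
        }
        return result;
    }

    public static void main( String[] args )
    {
        List<List<Long>> friends = Arrays.asList(
            Arrays.asList( 50L, 30L, 10L ),
            Arrays.asList( 40L, 20L ) );
        System.out.println( merge( friends ) );  // [50, 40, 30, 20, 10]
    }
}
```

A priority queue over the cursor heads would avoid re-sorting on every step, but the list-plus-sort form mirrors the steps described above most directly.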

4.6. Domain entities

This section demonstrates one way to handle domain entities when using Neo4j. The principle at use is to wrap the entities around a node (the same approach can be applied to relationships as well).

Tip

The source code of the example can be downloaded from:
Person.java

First off, store the node and make it accessible inside the package:

private final Node underlyingNode;

Person( Node personNode )
{
    this.underlyingNode = personNode;
}

protected Node getUnderlyingNode()
{
    return underlyingNode;
}
Delegate the properties to the node:

public String getName()
{
    return (String)underlyingNode.getProperty( NAME );
}
Make sure to override these methods:

@Override
public int hashCode()
{
    return underlyingNode.hashCode();
}

@Override
public boolean equals( Object o )
{
    return o instanceof Person &&
        underlyingNode.equals( ( (Person)o ).getUnderlyingNode() );
}

@Override
public String toString()
{
    return "Person[" + getName() + "]";
}

4.7. Graph algorithm examples

Tip

The source code used in the example can be downloaded from:
PathFindingExamplesTest.java

Calculating the shortest path (least number of relationships) between two nodes:

Node startNode = graphDb.createNode();
Node middleNode1 = graphDb.createNode();
Node middleNode2 = graphDb.createNode();
Node middleNode3 = graphDb.createNode();
Node endNode = graphDb.createNode();
createRelationshipsBetween( startNode, middleNode1, endNode );
createRelationshipsBetween( startNode, middleNode2, middleNode3, endNode );

// Will find the shortest path between startNode and endNode via
// "MY_TYPE" relationships (in OUTGOING direction), like f.ex:
//
// (startNode)-->(middleNode1)-->(endNode)
//
PathFinder<Path> finder = GraphAlgoFactory.shortestPath(
    Traversal.expanderForTypes( ExampleTypes.MY_TYPE, Direction.OUTGOING ), 15 );
Iterable<Path> paths = finder.findAllPaths( startNode, endNode );
Using Dijkstra's algorithm to solve the problem of finding the cheapest path between two nodes in a directed graph:

PathFinder<WeightedPath> finder = GraphAlgoFactory.dijkstra(
    Traversal.expanderForTypes( ExampleTypes.MY_TYPE, Direction.BOTH ), "cost" );

WeightedPath path = finder.findSinglePath( nodeA, nodeB );

// Get the weight for the found path
path.weight();

The A* algorithm is one of the most effective ways of finding the shortest path in a static road network.

Here is our example graph:

Node nodeA = createNode( "name", "A", "x", 0d, "y", 0d );
Node nodeB = createNode( "name", "B", "x", 7d, "y", 0d );
Node nodeC = createNode( "name", "C", "x", 2d, "y", 1d );
Relationship relAB = createRelationship( nodeA, nodeC, "length", 2d );
Relationship relBC = createRelationship( nodeC, nodeB, "length", 3d );
Relationship relAC = createRelationship( nodeA, nodeB, "length", 10d );

EstimateEvaluator<Double> estimateEvaluator = new EstimateEvaluator<Double>()
{
    public Double getCost( final Node node, final Node goal )
    {
        double dx = (Double) node.getProperty( "x" ) - (Double) goal.getProperty( "x" );
        double dy = (Double) node.getProperty( "y" ) - (Double) goal.getProperty( "y" );
        double result = Math.sqrt( Math.pow( dx, 2 ) + Math.pow( dy, 2 ) );
        return result;
    }
};
PathFinder<WeightedPath> astar = GraphAlgoFactory.aStar(
    Traversal.expanderForAllTypes(),
    CommonEvaluators.doubleCostEvaluator( "length" ), estimateEvaluator );
WeightedPath path = astar.findSinglePath( nodeA, nodeB );

4.8. Reading a management attribute

The EmbeddedGraphDatabase class includes a convenience method for getting instances of Neo4j management beans:
http://components.neo4j.org/neo4j/1.8/apidocs/org/neo4j/kernel/EmbeddedGraphDatabase.html#getManagementBean%28java.lang.Class%29

The common JMX service can be used as well, but from your own code you probably rather want to use the approach outlined here.

Tip

The source code of the example can be downloaded from:
JmxTest.java

This example shows how to get the start time of a database:

private static Date getStartTimeFromManagementBean(
    GraphDatabaseService graphDbService )
{
    GraphDatabaseAPI graphDb = (GraphDatabaseAPI) graphDbService;
    Kernel kernel = graphDb.getSingleManagementBean( Kernel.class );
    Date startTime = kernel.getKernelStartTime();
    return startTime;
}

Depending on which Neo4j edition you are running, different sets of management beans are available.

- For all editions, see the org.neo4j.jmx package.
- For the Advanced and Enterprise editions, see the org.neo4j.management package as well.

4.9. OSGi configuration

In OSGi contexts such as large application servers (e.g. Glassfish) and Eclipse-based systems, Neo4j can be set up explicitly instead of being discovered through the Java Service Loader mechanism.

4.9.1. Simple OSGi Activator example

As seen in the example below, instead of relying on class loading from the Neo4j kernel, the Neo4j bundles are used as library bundles, and services like IndexProviders and CacheProviders are instantiated, configured and registered explicitly. Just make sure the necessary jars are in place so that all required classes are exported and visible to this Activator.
public class Neo4jActivator implements BundleActivator
{
    private static GraphDatabaseService db;
    private ServiceRegistration serviceRegistration;
    private ServiceRegistration indexServiceRegistration;

    @Override
    public void start( BundleContext context ) throws Exception
    {
        //the cache providers
        ArrayList<CacheProvider> cacheList = new ArrayList<CacheProvider>();
        cacheList.add( new SoftCacheProvider() );

        //the index providers
        IndexProvider lucene = new LuceneIndexProvider();
        ArrayList<IndexProvider> provs = new ArrayList<IndexProvider>();
        provs.add( lucene );
        ListIndexIterable providers = new ListIndexIterable();
        providers.setIndexProviders( provs );

        //the database setup
        GraphDatabaseFactory gdbf = new GraphDatabaseFactory();
        gdbf.setIndexProviders( providers );
        gdbf.setCacheProviders( cacheList );
        db = gdbf.newEmbeddedDatabase( "target/db" );

        //the OSGi registration
        serviceRegistration = context.registerService(
            GraphDatabaseService.class.getName(), db, new Hashtable<String,String>() );
        System.out.println( "registered " + serviceRegistration.getReference() );
        indexServiceRegistration = context.registerService(
            Index.class.getName(), db.index().forNodes( "nodes" ),
            new Hashtable<String,String>() );
        Transaction tx = db.beginTx();
        try
        {
            Node firstNode = db.createNode();
            Node secondNode = db.createNode();
            Relationship relationship = firstNode.createRelationshipTo( secondNode,
                DynamicRelationshipType.withName( "KNOWS" ) );
            firstNode.setProperty( "message", "Hello, " );
            secondNode.setProperty( "message", "world!" );
            relationship.setProperty( "message", "brave Neo4j " );
            db.index().forNodes( "nodes" ).add( firstNode, "message", "Hello" );
            tx.success();
        }
        catch ( Exception e )
        {
            e.printStackTrace();
            throw new RuntimeException( e );
        }
        finally
        {
            tx.finish();
        }
    }

    @Override
    public void stop( BundleContext context ) throws Exception
    {
        serviceRegistration.unregister();
        indexServiceRegistration.unregister();
        db.shutdown();
    }
}

4.10. Executing Cypher queries from Java

Tip

The source code of the example can be downloaded from:
JavaQuery.java

In Java, you can use the Cypher query language (see cypher-query-lang) like this:

GraphDatabaseService db = new GraphDatabaseFactory().newEmbeddedDatabase( DB_PATH );
// add some data first
Transaction tx = db.beginTx();
try
{
    Node refNode = db.getReferenceNode();
    refNode.setProperty( "name", "reference node" );
    tx.success();
}
finally
{
    tx.finish();
}

// let's execute a query now
ExecutionEngine engine = new ExecutionEngine( db );
ExecutionResult result = engine.execute( "start n=node(0) return n, n.name" );
System.out.println( result );
Which will output:

+---------------------------------------------------+
| n                              | n.name           |
+---------------------------------------------------+
| Node[0]{name:"reference node"} | "reference node" |
+---------------------------------------------------+
1 row
0 ms

Note: the classes used here are from the org.neo4j.cypher.javacompat package, not org.neo4j.cypher; see the Java API link below.

You can get a list of the columns in the result:

List<String> columns = result.columns();
System.out.println( columns );

Which will output:

[n, n.name]
To fetch the result items in a single column, do like this:

Iterator<Node> n_column = result.columnAs( "n" );
for ( Node node : IteratorUtil.asIterable( n_column ) )
{
    // note: we're grabbing the name property from the node,
    // not from the n.name in this case.
    nodeResult = node + ": " + node.getProperty( "name" );
    System.out.println( nodeResult );
}

In this case there's only one row in the result:

Node[0]: reference node
To get all columns, do like this instead:

for ( Map<String, Object> row : result )
{
    for ( Entry<String, Object> column : row.entrySet() )
    {
        rows += column.getKey() + ": " + column.getValue() + "; ";
    }
    rows += "\n";
}
System.out.println( rows );

Which will output:

n.name: reference node; n: Node[0];

For more information on the Java interface to Cypher, see the Java API.

For more information on and examples of Cypher, see cypher-query-lang and data-modeling-examples.

Chapter 5. Neo4j Remote Client Libraries

The included Java example shows a lower-level way of using the Neo4j REST API from Java.

For more information, see the pointers below.

5.1. Community-contributed Neo4j REST clients

Table 5.1. Neo4j REST clients contributed by the community.

Name               Language / framework   URL
Java-Rest-Binding  Java                   https://github.com/neo4j/java-rest-binding/
Neo4jClient        .NET                   http://hg.readify.net/neo4jclient/
Neo4jRestNet       .NET                   https://github.com/SepiaGroup/Neo4jRestNet
py2neo             Python                 http://py2neo.org/
Bulbflow           Python                 http://bulbflow.com/
neo4jrestclient    Python                 https://github.com/versae/neo4j-rest-client
neo4django         Django                 https://github.com/scholrly/neo4django
Neo4jPHP           PHP                    https://github.com/jadell/Neo4jPHP
neography          Ruby                   https://github.com/maxdemarzi/neography
Neoid              Ruby                   https://github.com/elado/neoid
node-neo4j         JavaScript             https://github.com/thingdom/node-neo4j
Neocons            Clojure                https://github.com/michaelklishin/neocons

5.2. 在 Java 中如何使用 REST API

5.2.1. 通过 REST API 创建一个图数据库


REST API 使用 HTTP 协议和 JSON 数据格式,因此它能用于多种语言和平台。当准备开始使用的时候,看一些可以被重用的模式也是非常有帮助的。在这个简短的概述中,我们将为你展示如何使用 REST API 创建和维护一个简单的图数据库,以及如何从中查询数据。

对于这些范例,我们选择了 Jersey 客户端组件,它可以很容易地通过 Maven 下载。

5.2.2. 启动图数据库服务器
在我们对服务器做任何操作之前,我们需要启动它。了解服务器安装的详细信息,请
参考:server-installation。

WebResource resource = Client.create()
        .resource( SERVER_ROOT_URI );
ClientResponse response = resource.get( ClientResponse.class );

System.out.println( String.format( "GET on [%s], status code [%d]",
        SERVER_ROOT_URI, response.getStatus() ) );
response.close();
如果返回状态码是 200 OK,那我们知道服务器运行良好,可以继续了。如果连接服务器失败,请参考:server。

注意:如果你得到任何不是 200 OK 的返回码(特别是 4xx 和 5xx),那么请检查你的配置,并且查看目录 'data/log' 中的日志文件。

5.2.3. 创建一个节点
REST API 使用 POST 方式创建节点。在 Java 中,用 Jersey 客户端封装它是很简单的:

final String nodeEntryPointUri = SERVER_ROOT_URI + "node";
// http://localhost:7474/db/data/node

WebResource resource = Client.create()
        .resource( nodeEntryPointUri );
// POST {} to the node entry point URI
ClientResponse response = resource.accept( MediaType.APPLICATION_JSON )
        .type( MediaType.APPLICATION_JSON )
        .entity( "{}" )
        .post( ClientResponse.class );

final URI location = response.getLocation();
System.out.println( String.format(
        "POST to [%s], status code [%d], location header [%s]",
        nodeEntryPointUri, response.getStatus(), location.toString() ) );
response.close();

return location;
如果请求成功完成,它会在后台发送一个包含 JSON 格式数据的 HTTP 请求到图数据库服务器。服务器将会在数据库中创建一个新的节点,并且返回状态码 201 Created 和一个包含新节点地址的 Location 头信息。

在我们的范例中,我们将调用两次这个功能以便在我们的数据库中创建两个节点。

5.2.4. 增加属性
一旦我们在数据库中有了节点,就能用它们存储有用的数据。在这个例子中,我们将在数据库中存储关于音乐的信息。让我们先看看创建节点和增加属性的代码。这里我们增加了一个节点用来表示 “Joe Strummer”,以及一个乐队 “The Clash”。

URI firstNode = createNode();
addProperty( firstNode, "name", "Joe Strummer" );
URI secondNode = createNode();
addProperty( secondNode, "band", "The Clash" );

在 addProperty 方法内部,我们确定了表示节点属性的资源以及这个属性的名称,然后把那个属性的值 PUT 到服务器。

String propertyUri = nodeUri.toString() + "/properties/" + propertyName;
// http://localhost:7474/db/data/node/{node_id}/properties/{property_name}

WebResource resource = Client.create()
        .resource( propertyUri );
ClientResponse response = resource.accept( MediaType.APPLICATION_JSON )
        .type( MediaType.APPLICATION_JSON )
        .entity( "\"" + propertyValue + "\"" )
        .put( ClientResponse.class );

System.out.println( String.format( "PUT to [%s], status code [%d]",
        propertyUri, response.getStatus() ) );
response.close();
如果一切运行正常,我们将得到一个 204 No Content 的返回码,表示服务器已经处理了我们的请求,但不会回显属性的值。

5.2.5. 增加关系
现在我们有了表示 Joe Strummer 和 The Clash 的节点,我们将给他们建立关系。REST API 支持通过一个 POST 请求来为节点间建立关系。在 Java 中与此相对应,我们 POST 一些 JSON 数据到表示 Joe Strummer 的节点的地址上面,来确定该节点和表示 The Clash 的节点之间的关系。

URI relationshipUri = addRelationship( firstNode, secondNode, "singer",
        "{ \"from\" : \"1976\", \"until\" : \"1986\" }" );

在 addRelationship 方法内部,我们确定了节点 Joe Strummer 的关系的 URI,然后 POST 了一个 JSON 数据到服务器。这个 JSON 数据包括目标节点、关系类型以及任何其他属性。

private static URI addRelationship( URI startNode, URI endNode,
        String relationshipType, String jsonAttributes )
        throws URISyntaxException
{
    URI fromUri = new URI( startNode.toString() + "/relationships" );
    String relationshipJson = generateJsonRelationship( endNode,
            relationshipType, jsonAttributes );

    WebResource resource = Client.create()
            .resource( fromUri );
    // POST JSON to the relationships URI
    ClientResponse response = resource.accept( MediaType.APPLICATION_JSON )
            .type( MediaType.APPLICATION_JSON )
            .entity( relationshipJson )
            .post( ClientResponse.class );

    final URI location = response.getLocation();
    System.out.println( String.format(
            "POST to [%s], status code [%d], location header [%s]",
            fromUri, response.getStatus(), location.toString() ) );

    response.close();
    return location;
}
如果一切运行正常,我们将收到状态码 201 Created,以及 HTTP 头 Location 中我们刚创建的关系的 URI。
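上面的代码引用了一个没有在文中给出的辅助方法 generateJsonRelationship。下面是一个仅作示意的最小实现(类名与实现细节均为本文假设,并非官方代码;属性以现成的 JSON 片段字符串传入):

```java
// 仅作示意:拼出 REST API 创建关系所需的 JSON 请求体,
// 形如 { "to" : <目标节点 URI>, "type" : <关系类型>, "data" : <属性> }
public class JsonRelationshipExample
{
    public static String generateJsonRelationship( java.net.URI endNode,
            String relationshipType, String... jsonAttributes )
    {
        StringBuilder sb = new StringBuilder();
        sb.append( "{ \"to\" : \"" ).append( endNode.toString() ).append( "\"" );
        sb.append( ", \"type\" : \"" ).append( relationshipType ).append( "\"" );
        if ( jsonAttributes != null && jsonAttributes.length > 0 )
        {
            // 属性作为现成的 JSON 片段直接拼接
            sb.append( ", \"data\" : " ).append( jsonAttributes[0] );
        }
        sb.append( " }" );
        return sb.toString();
    }

    public static void main( String[] args )
    {
        System.out.println( generateJsonRelationship(
                java.net.URI.create( "http://localhost:7474/db/data/node/2" ),
                "singer", "{ \"from\" : \"1976\", \"until\" : \"1986\" }" ) );
    }
}
```

这样拼出来的字符串就可以作为 addRelationship 方法中 POST 的请求实体发送。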

5.2.6. 给关系增加属性
像节点一样,关系也可以有属性。因为我们是 Joe Strummer 和 The Clash 的超级大粉丝,我们将给这个关系增加一个评价属性,这样其他人就能看到他是这个乐队的 5 星级歌手。

addMetadataToProperty( relationshipUri, "stars", "5" );

在 addMetadataToProperty 方法内部,我们确定关系属性的 URI,并且把我们的新值 PUT 到服务器(因为是 PUT,它总是会覆盖已经存在的值,所以一定要小心)。

private static void addMetadataToProperty( URI relationshipUri,
        String name, String value ) throws URISyntaxException
{
    URI propertyUri = new URI( relationshipUri.toString() + "/properties" );
    String entity = toJsonNameValuePairCollection( name, value );
    WebResource resource = Client.create()
            .resource( propertyUri );
    ClientResponse response = resource.accept( MediaType.APPLICATION_JSON )
            .type( MediaType.APPLICATION_JSON )
            .entity( entity )
            .put( ClientResponse.class );

    System.out.println( String.format(
            "PUT [%s] to [%s], status code [%d]", entity, propertyUri,
            response.getStatus() ) );
    response.close();
}
假设一切运行正常,我们将得到一个 200 OK 返回码(可以通过调用 ClientResponse.getStatus() 来获取),现在我们可以从这个小型图数据库中查询数据了。

5.2.7. 从图数据库中查询数据
和嵌入模式的图数据库一样,Neo4j 服务器使用图遍历在图中查询数据。当前 Neo4j 服务器期望通过 POST 发送一个 JSON 数据来进行遍历查询(虽然这也可能会改成 GET 的方式)。

要启动这个过程,我们用一个简单的类来封装 JSON 数据并通过 POST 发送到服务器。在这个例子中,我们硬编码了遍历查询,用来查找所有沿输出方向的 singer 关系可达的节点。

// TraversalDescription turns into JSON to send to the Server
TraversalDescription t = new TraversalDescription();
t.setOrder( TraversalDescription.DEPTH_FIRST );
t.setUniqueness( TraversalDescription.NODE );
t.setMaxDepth( 10 );
t.setReturnFilter( TraversalDescription.ALL );
t.setRelationships( new Relationship( "singer", Relationship.OUT ) );

一旦我们定义了遍历查询所需的参数,只需要把它发出去。我们先确定起始节点遍历查询的 URI,然后 POST 遍历查询的 JSON 数据来完成这个需求。

URI traverserUri = new URI( startNode.toString() + "/traverse/node" );
WebResource resource = Client.create()
        .resource( traverserUri );
String jsonTraverserPayload = t.toJson();
ClientResponse response = resource.accept( MediaType.APPLICATION_JSON )
        .type( MediaType.APPLICATION_JSON )
        .entity( jsonTraverserPayload )
        .post( ClientResponse.class );

System.out.println( String.format(
        "POST [%s] to [%s], status code [%d], returned data: "
                + System.getProperty( "line.separator" ) + "%s",
        jsonTraverserPayload, traverserUri, response.getStatus(),
        response.getEntity( String.class ) ) );
response.close();
一旦请求被完成,我们将得到歌手的数据集以及他们所属的乐队:

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/82/relationships/out",
  "data" : {
    "band" : "The Clash",
    "name" : "Joe Strummer"
  },
  "traverse" : "http://localhost:7474/db/data/node/82/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/82/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/82/properties/{key}",
  "all_relationships" : "http://localhost:7474/db/data/node/82/relationships/all",
  "self" : "http://localhost:7474/db/data/node/82",
  "properties" : "http://localhost:7474/db/data/node/82/properties",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/82/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://localhost:7474/db/data/node/82/relationships/in",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/82/relationships/in/{-list|&|types}",
  "create_relationship" : "http://localhost:7474/db/data/node/82/relationships"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/83/relationships/out",
  "data" : {
  },
  "traverse" : "http://localhost:7474/db/data/node/83/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/83/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/83/properties/{key}",
  "all_relationships" : "http://localhost:7474/db/data/node/83/relationships/all",
  "self" : "http://localhost:7474/db/data/node/83",
  "properties" : "http://localhost:7474/db/data/node/83/properties",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/83/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://localhost:7474/db/data/node/83/relationships/in",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/83/relationships/in/{-list|&|types}",
  "create_relationship" : "http://localhost:7474/db/data/node/83/relationships"
} ]

5.2.8. 喔,是这样吗?
以上就是我们用 REST API 能做的事情的一个缩影。自然而然地,我们提交到服务器的任何 HTTP 语义都很容易被封装,包括通过 DELETE 来移除节点和关系。不过,如果你已经走到了这一步,那么在 Jersey 客户端中把 .post() 换成 .delete() 是非常容易的。
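作为示意,删除操作只需对节点或关系自身的 URI 发送 DELETE 请求。下面用 JDK 11+ 自带的 java.net.http 构造这样一个请求(只构造、不实际发送;URL 为假设值,并非文中的 Jersey 代码):

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class DeleteRequestExample
{
    // 构造删除某个资源(节点或关系)的 DELETE 请求
    public static HttpRequest buildDelete( String resourceUri )
    {
        return HttpRequest.newBuilder( URI.create( resourceUri ) )
                .DELETE()
                .build();
    }

    public static void main( String[] args )
    {
        HttpRequest req = buildDelete( "http://localhost:7474/db/data/node/82" );
        // 打印:DELETE http://localhost:7474/db/data/node/82
        System.out.println( req.method() + " " + req.uri() );
    }
}
```

把构造好的请求交给 HttpClient.send(...) 即可真正发出;注意对一个仍有关系连接的节点执行 DELETE 会失败,需要先删除它的关系。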

5.2.9. 下一步计划是什么呢?
REST API 为实现客户端库提供了良好的基础,它本身也是一个优秀的基于 HTTP 的接口。我们计划在将来为常用语言提供基于 REST API 的客户端绑定,提供友好的、语言级的开发体验,就像嵌入模式那样容易使用。要了解当前各种语言的 Neo4j REST 客户端实现以及嵌入模式的封装,请参考:http://www.delicious.com/neo4j/drivers 。

5.2.10. 附录:代码

• CreateSimpleGraph.java

• Relationship.java

• TraversalDescription.java

第 6 章 遍历查询框架
在 Java 中的 Neo4j Traversal API 是基于回调机制、懒加载执行的。一些遍历查询范例收集在这里:第 4.5 节 “遍历查询”。

在 Neo4j 中其他一些遍历和查询方式还有 Cypher(参考 cypher-query-lang)和 Gremlin(参考 gremlin-plugin)。

6.1. 主要观点
下面对各种修改遍历描述对象的方法进行一个简短的说明。

• Expanders — 定义遍历的内容,特别是关系的方向和类型。

• Order — 比如宽度优先或者深度优先。

• Uniqueness — 只访问一次节点(关系,路径)。

• Evaluator — 决定返回的内容,以及是否越过当前位置继续遍历。

• Starting nodes — 决定遍历的起点。

更多细节请参考第 6.2 节 “遍历查询框架 Java API”。

6.2. 遍历查询框架 Java API


除了 Node 和 Relationship,遍历查询框架还包括一些主要的接口:TraversalDescription、Evaluator、Traverser 和 Uniqueness。Path 接口在遍历中还有一个特殊的用途,因为当评估某个位置时,它常被用来表示在图中的一个位置。此外,PathExpander(代替 RelationshipExpander)和 Expander 接口是遍历框架中的核心,但 API 的用户很少需要自己去实现它们。对于高级应用,当需要精确控制遍历的顺序的时候,还可以采用这些接口:BranchSelector、BranchOrderingPolicy 和 TraversalBranch。

6.2.1. TraversalDescription
TraversalDescription 是用来定义和初始化遍历查询的最重要的接口。它并不是要用户去实现的,而是由框架提供实现,作为用户描述遍历条件的一种方式。TraversalDescription 实例是不可变的,它的方法会返回一个新的 TraversalDescription 实例,该实例相对于被调用的对象,按照方法参数做了相应的修改。

Relationships

增加一个关系类型到遍历的关系类型列表中。默认情况下,这个列表是空的,意味着默认会返回所有类型的关系,而不考虑类型。如果有关系类型被加入到这个列表中,那就意味着只有列表中的关系类型才会被遍历。有两个方法,一个包括方向,另外一个不包括方向,后者会沿两个方向遍历该类型的关系。

6.2.2. Evaluator
Evaluator 用来在每一个位置(用一个 Path 表示)决定:是否应该继续遍历,以及该节点是否要包括在结果中。对于一个给定的 Path,它要求对遍历查询分支采用下面四个动作中的一种:

• Evaluation.INCLUDE_AND_CONTINUE:把这个节点包括在结果中,并且继续遍历。

• Evaluation.INCLUDE_AND_PRUNE:把这个节点包括在结果中,但不再继续遍历。

• Evaluation.EXCLUDE_AND_CONTINUE:把这个节点排除在结果外,并且继续遍历。

• Evaluation.EXCLUDE_AND_PRUNE:把这个节点排除在结果外,并且不再继续遍历。

可以加入多个 Evaluator。注意 Evaluator 将被遍历过程中遇到的每一个位置所调用,甚至包括起点节点。

6.2.3. Traverser
Traverser 对象是调用一个 TraversalDescription 的 traverse() 方法返回的结果。它表示在图数据库中遍历出的位置的集合,以及结果的格式规范。实际的遍历是懒加载执行的,只有当我们调用 Traverser 的 next() 方法时才会真正执行。

6.2.4. Uniqueness
在 Uniqueness 中定义了一个规则,决定在遍历期间如何对待已经访问过的位置。默认规则是 NODE_GLOBAL。

可以给 TraversalDescription 提供一个 Uniqueness,用来决定一个遍历是否可以重新访问相同的位置。可以使用下面几种策略级别:

• NONE:图中任何位置都可以被重访。

• NODE_GLOBAL:图中每个节点只能被访问一次。这可能消耗大量内存,因为需要在内存中维护一个保存所有已访问节点的数据结构。

• RELATIONSHIP_GLOBAL:图中每个关系只能被访问一次。由于图中关系的数量一般远大于节点的数量,这种级别的内存开销会增长得更快。

• NODE_PATH:一个节点不能在当前遍历路径中出现过。

• RELATIONSHIP_PATH:一个关系不能在当前遍历路径中出现过。

• NODE_RECENT:这是 NODE_GLOBAL 的简化版,同样会在每个位置核对一个已访问节点的集合。这种级别不会消耗大量内存,因为这个集合只包含最近访问过的节点。集合的大小可以通过 TraversalDescription.uniqueness() 方法的第二个参数来指定。

• RELATIONSHIP_RECENT:与 NODE_RECENT 类似,只是作用于关系而已。

Depth First / Breadth First

有便捷方法可以把排序策略(BranchSelector/ordering)设置为深度优先或宽度优先。相同的结果也可以通过调用 Traversal 工厂的 order 方法,或者创建自己的 BranchSelector/BranchOrderingPolicy 并传入来实现。

6.2.5. Order — 如何穿过分支呢?


depthFirst/breadthFirst 方法的普通版本是允许一个任意的 BranchOrderingPolicy 注入
到 TraversalDescription。

6.2.6. BranchSelector
一个 BranchSelector 是用来定义如何选择遍历下一个分支。这被用来实现遍历顺序。
遍历框架提供了一些基本的顺序实现:

• Traversal.preorderDepthFirst(): 深度优先,在访问的子节点之前访问每
一个节点。
• Traversal.postorderDepthFirst(): 深度优先,在访问的子节点之后访问每
一个节点。
• Traversal.preorderBreadthFirst(): 宽度优先,在访问的子节点之前访问
每一个节点。
• Traversal.postorderBreadthFirst(): 宽度优先,在访问的子节点之后访问
每一个节点。

注意:请注意宽度优先遍历策略比深度优先策略消耗更多的内存。

BranchSelector 带有状态信息,因此在每一次遍历时都需要实例化一个新的。因此,它通过 BranchOrderingPolicy 接口提供给 TraversalDescription,而后者是一个 BranchSelector 的实例化工厂。

遍历查询框架的用户很少需要实现自己的 BranchSelector 和 BranchOrderingPolicy,它们是为了让图算法的实现者能提供自己的遍历顺序。Neo4j 图算法包包含了一个最好优先(BestFirst)的 BranchSelector/BranchOrderingPolicy 实现,用于 A* 和 Dijkstra 算法中。

BranchOrderingPolicy

它是一个工厂,用来创建 BranchSelector 以决定哪些分支需要返回(一个分支的位置常用 Path 来表示)。常见的策略是 depth-first 和 breadth-first。举个例子,调用 TraversalDescription#depthFirst() 等价于:

description.order( Traversal.preorderDepthFirst() );

TraversalBranch

被 BranchSelector 使用的一个对象,用来从一个给定的分支获取更多分支。本质上,它由一个路径和一个 RelationshipExpander 组成,后者能被用来从当前的分支获取新的 TraversalBranch。

6.2.7. Path
Path 是 Neo4j API 中的一个普通接口(参考 http://components.neo4j.org/neo4j/1.8/apidocs/org/neo4j/graphdb/Path.html )。在 Neo4j 遍历查询 API 中,Path 的用法有两方面:Traverser 可以以 Path 的形式返回它们在图中被标记为要返回的结果;Path 对象也可以用于在图中进行位置评估,决定一个遍历在某个点是否继续,以及某个点是否被包含在结果中。

6.2.8. PathExpander/RelationshipExpander
遍历查询框架用 PathExpander(取代 RelationshipExpander)来确定在遍历中,从某个特定路径继续扩展分支时应当跟随哪些关系。

6.2.9. Expander
Expander 是一个更通用的版本,通过注入 RelationshipExpander 来定义任意给定节点上所有要被遍历的关系。默认情况下会使用一个 default expander(参考 http://components.neo4j.org/neo4j/1.8/apidocs/org/neo4j/kernel/Traversal.html#emptyExpander ),它不考虑关系的方向。

另外还有一个实现,可以保证按照关系类型的顺序进行遍历。

Expander 接口是 RelationshipExpander 接口的一个扩展,允许自定义 Expander。TraversalDescription 用它来提供定义遍历关系类型的方法,这也是 API 用户常用的方式:不需要自己定义一个 RelationshipExpander,而是在 TraversalDescription 内部构造它。

通过 Neo4j 遍历查询框架提供的所有 RelationshipExpander 也都实现了 Expander 接口:RelationshipExpander 只包含一个从路径/节点获取关系的方法,而 Expander 接口增加的方法只用于构建新的 Expander。

6.2.10. 如何使用遍历查询框架
与 Node#traverse 不同,这里先构造一个遍历描述(traversal description),再由它产生遍历器(traverser)。
图 6.1. 遍历查询范例数据库

RelationshipType 的定义如下:

private enum Rels implements RelationshipType
{
    LIKES, KNOWS
}
从 ‘Joe’ 节点开始,可以用下面范例中的遍历器遍历图数据库:

for ( Path position : Traversal.description()
        .depthFirst()
        .relationships( Rels.KNOWS )
        .relationships( Rels.LIKES, Direction.INCOMING )
        .evaluator( Evaluators.toDepth( 5 ) )
        .traverse( node ) )
{
    output += position + "\n";
}

遍历后输出结果:

(7)
(7)<--[LIKES,1]--(4)
(7)<--[LIKES,1]--(4)--[KNOWS,6]-->(1)
(7)<--[LIKES,1]--(4)--[KNOWS,6]-->(1)--[KNOWS,4]-->(6)
(7)<--[LIKES,1]--(4)--[KNOWS,6]-->(1)--[KNOWS,4]-->(6)--[KNOWS,3]-->(5)
(7)<--[LIKES,1]--(4)--[KNOWS,6]-->(1)--[KNOWS,4]-->(6)--[KNOWS,3]-->(5)--[KNOWS,2]-->(2)
(7)<--[LIKES,1]--(4)--[KNOWS,6]-->(1)<--[KNOWS,5]--(3)
因为 TraversalDescription 是不可变的,所以创建一个描述模板,在不同的遍历器中共享,是非常有用的。比如,让遍历器从这个描述开始:

final TraversalDescription FRIENDS_TRAVERSAL = Traversal.description()
        .depthFirst()
        .relationships( Rels.KNOWS )
        .uniqueness( Uniqueness.RELATIONSHIP_GLOBAL );

这个遍历器将会输出下面的结果(我们始终保持从节点 ‘Joe’ 开始):

(7)
(7)--[KNOWS,0]-->(2)
(7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)
(7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)<--[KNOWS,3]--(6)
(7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)<--[KNOWS,3]--(6)<--[KNOWS,4]--(1)
(7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)<--[KNOWS,3]--(6)<--[KNOWS,4]--(1)<--[KNOWS,5]--(3)
(7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)<--[KNOWS,3]--(6)<--[KNOWS,4]--(1)<--[KNOWS,6]--(4)
现在让我们基于它创建一个新的遍历器,把深度限制为 3:

for ( Path path : FRIENDS_TRAVERSAL
        .evaluator( Evaluators.toDepth( 3 ) )
        .traverse( node ) )
{
    output += path + "\n";
}

这会返回这样的结果:

(7)
(7)--[KNOWS,0]-->(2)
(7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)
(7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)<--[KNOWS,3]--(6)
或者把深度限制在 2 到 4 之间又会怎么样呢?下面是我们的测试:

for ( Path path : FRIENDS_TRAVERSAL
        .evaluator( Evaluators.fromDepth( 2 ) )
        .evaluator( Evaluators.toDepth( 4 ) )
        .traverse( node ) )
{
    output += path + "\n";
}

这个遍历器会返回这样的结果:

(7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)
(7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)<--[KNOWS,3]--(6)
(7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)<--[KNOWS,3]--(6)<--[KNOWS,4]--(1)
要获取各种不同的、有用的 Evaluator,请参考:Evaluators Java API;或者自己简单实现 Evaluator 接口。

如果你对 Path 没有兴趣,但对 Node 有兴趣,你可以把遍历器转换成一个节点的迭代器,像下面这样:

for ( Node currentNode : FRIENDS_TRAVERSAL
        .traverse( node )
        .nodes() )
{
    output += currentNode.getProperty( "name" ) + "\n";
}

在这种情况下我们用它来获取名称:

Joe
Sara
Peter
Dirk
Lars
Ed
Lisa
Relationship 也同样可以这样获取:

for ( Relationship relationship : FRIENDS_TRAVERSAL
        .traverse( node )
        .relationships() )
{
    output += relationship.getType() + "\n";
}

这儿输出的是关系类型:

KNOWS
KNOWS
KNOWS
KNOWS
KNOWS
KNOWS
在这个范例中的遍历器的源代码下载地址: TraversalExample.java。

第 7 章 数据模型范例
下面的章节包括了在不同领域使用 Neo4j 的简单范例。它们并不是完整的范例,而是用来演示使用节点、关系、图数据模式以及查询中数据局部性的各种可能方法。

这些范例使用了大量的 Cypher 查询,要了解更多信息,请参考:cypher-query-lang。

7.1. 在图数据库中的用户角色模型
7.1.1. 得到管理员
7.1.2. 得到一个用户的组成员
7.1.3. 获取所有的用户组
7.1.4. 找到所有用户
这个范例展示了一个角色的层级结构。有趣的是,树结构不足以存储这样的结构,原因见下文。

这个范例是 Kemal Erdogan 撰写的文章 A Model to Represent Directed Acyclic Graphs (DAG) on SQL Databases 中模型的一个实现。这篇文章讨论了如何在基于 SQL 的数据库中存储有向无环图(directed acyclic graphs,DAG)。DAG 很像树结构,但有一点不同:它们可能通过不同的路径到达相同的节点;树结构在这一点上有严格的限制,这保证了它们更容易控制。在我们的例子中,"Ali" 和 "Engin" 既是管理员又是普通用户,因此可以通过多条路径经由各自的组节点到达。现实往往就是这样,而树结构无法处理这种情况。
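树和 DAG 的这个区别(同一节点可以经多条路径到达)可以用一段与 Neo4j 无关的 Java 小例子来示意(纯示意代码,邻接表内容是对上文结构的假设性简化):

```java
import java.util.List;
import java.util.Map;

public class DagPaths
{
    // 统计有向无环图中从 from 到 to 的不同路径数
    public static int countPaths( Map<String, List<String>> adj, String from, String to )
    {
        if ( from.equals( to ) )
        {
            return 1;
        }
        int total = 0;
        for ( String next : adj.getOrDefault( from, List.of() ) )
        {
            total += countPaths( adj, next, to );
        }
        return total;
    }

    public static void main( String[] args )
    {
        // 简化示意:"Ali" 同时属于 Admins 和 Users 两个组
        Map<String, List<String>> adj = Map.of(
                "Root", List.of( "Admins", "Users" ),
                "Admins", List.of( "Ali" ),
                "Users", List.of( "Ali" ) );
        // 树中任意两点间至多一条路径;这里输出 2,说明它是 DAG 而不是树
        System.out.println( countPaths( adj, "Root", "Ali" ) );
    }
}
```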

文章中用一个 SQL 存储过程给出了解决方案。其主要思路(也得到了一些学术研究的支持)是预先计算所有可能的路径。这种方法的优缺点是:

• 提升读性能

• 降低插入性能

• 浪费大量的空间

• 依赖存储过程

在 Neo4j 中存储这样的角色结构则毫不费力。在这个例子中,我们使用 PART_OF(绿边)关系来对用户组层次建模,而 MEMBER_OF(蓝边)用来表示组中的成员。我们也把顶级组节点通过关系 ROOT 连接到参考节点(第一个节点)。这给我们提供了一个非常有用的分割方法。Neo4j 并没有预先定义关系类型,你可以自由地创建任何关系类型,并且赋予它们任何你希望的语义。

现在让我们看下如何从图数据库获取信息。Java 代码使用 Neo4j 遍历相关的 API(参考:tutorial-traversal-java-api),查询使用 Cypher(参考:cypher-query-lang)。

7.1.1. 得到管理员
Node admins = getNodeByName( "Admins" );
Traverser traverser = admins.traverse(
        Traverser.Order.BREADTH_FIRST,
        StopEvaluator.END_OF_GRAPH,
        ReturnableEvaluator.ALL_BUT_START_NODE,
        RoleRels.PART_OF, Direction.INCOMING,
        RoleRels.MEMBER_OF, Direction.INCOMING );
结果是:

Found: Ali at depth: 0
Found: HelpDesk at depth: 0
Found: Engin at depth: 1
Found: Demet at depth: 1
结果是从遍历器收集的:

String output = "";
for ( Node node : traverser )
{
    output += "Found: " + node.getProperty( NAME ) + " at depth: "
              + ( traverser.currentPosition().depth() - 1 ) + "\n";
}
在 Cypher 中,一个简单的查询如下:

START admins=node(14)
MATCH admins<-[:PART_OF*0..]-group<-[:MEMBER_OF]-user
RETURN user.name, group.name
输出结果:

user.name  group.name
"Ali"      "Admins"
"Engin"    "HelpDesk"
"Demet"    "HelpDesk"
3 rows
4 ms

7.1.2. 得到一个用户的组成员
使用 Neo4j Java 遍历 API,这个查询像这样:

Node jale = getNodeByName( "Jale" );
traverser = jale.traverse(
        Traverser.Order.DEPTH_FIRST,
        StopEvaluator.END_OF_GRAPH,
        ReturnableEvaluator.ALL_BUT_START_NODE,
        RoleRels.MEMBER_OF, Direction.OUTGOING,
        RoleRels.PART_OF, Direction.OUTGOING );
输出结果:

Found: ABCTechnicians at depth: 0
Found: Technicians at depth: 1
Found: Users at depth: 2
在 Cypher 中:

START jale=node(10)
MATCH jale-[:MEMBER_OF]->()-[:PART_OF*0..]->group
RETURN group.name
group.name
"ABCTechnicians"
"Technicians"
"Users"
3 rows
1 ms

7.1.3. 获取所有的用户组
在 Java 中:

Node referenceNode = getNodeByName( "Reference_Node" );
traverser = referenceNode.traverse(
        Traverser.Order.BREADTH_FIRST,
        StopEvaluator.END_OF_GRAPH,
        ReturnableEvaluator.ALL_BUT_START_NODE,
        RoleRels.ROOT, Direction.INCOMING,
        RoleRels.PART_OF, Direction.INCOMING );
输出结果:

Found: Admins at depth: 0
Found: Users at depth: 0
Found: HelpDesk at depth: 1
Found: Managers at depth: 1
Found: Technicians at depth: 1
Found: ABCTechnicians at depth: 2
在 Cypher 中:

START refNode=node(16)
MATCH refNode<-[:ROOT]->()<-[:PART_OF*0..]-group
RETURN group.name
group.name
"Admins"
"HelpDesk"
"Users"
"Managers"
"Technicians"
"ABCTechnicians"
6 rows
2 ms

7.1.4. 找到所有用户
现在,让我们试图找到在系统中属于任何用户组的所有用户。

在 Java 中:

traverser = referenceNode.traverse(
        Traverser.Order.BREADTH_FIRST,
        StopEvaluator.END_OF_GRAPH,
        new ReturnableEvaluator()
        {
            @Override
            public boolean isReturnableNode( TraversalPosition currentPos )
            {
                if ( currentPos.isStartNode() )
                {
                    return false;
                }
                Relationship rel = currentPos.lastRelationshipTraversed();
                return rel.isType( RoleRels.MEMBER_OF );
            }
        },
        RoleRels.ROOT, Direction.INCOMING,
        RoleRels.PART_OF, Direction.INCOMING,
        RoleRels.MEMBER_OF, Direction.INCOMING );

输出结果:

Found: Ali at depth: 1
Found: Engin at depth: 1
Found: Burcu at depth: 1
Found: Can at depth: 1
Found: Demet at depth: 2
Found: Gul at depth: 2
Found: Fuat at depth: 2
Found: Hakan at depth: 2
Found: Irmak at depth: 2
Found: Jale at depth: 3
在 Cypher 中像这样:

START refNode=node(16)
MATCH refNode<-[:ROOT]->root,
      p=root<-[PART_OF*0..]-()<-[:MEMBER_OF]-user
RETURN user.name, min(length(p))
ORDER BY min(length(p)), user.name
输出结果:

user.name  min(length(p))
"Ali"      1
"Burcu"    1
"Can"      1
"Engin"    1
"Demet"    2
"Fuat"     2
"Gul"      2
"Hakan"    2
"Irmak"    2
"Jale"     3
10 rows
33 ms

可以看到,与 Java 中的构建方式相比,Cypher 以及其他查询机制能用更简短的语句实现语义更复杂的查询。

7.2. 在图数据库中的 ACL 结构模型


这个范例概述了在图数据库中管理 ACL 的一个通用方法,并给出了一个具体查询的简单范例。

7.2.1. 通用方法
在许多场景中,应用需要对某些受管理的对象进行安全控制。这个范例描述了一个管理这类需求的模式:在图中为任何受管理的对象建立一个完整的权限结构,从而得到一个基于位置和受管理对象上下文的动态结构。

这个结果是一个很容易在图结构中实现的复杂安全规划,支持权限覆盖、主体和内容的组合,而不用复制数据。

技术
就像范例图布局看到的一样,在这个领域模型中有如下一些要点:

• 受管理的内容(文件和目录)通过 HAS_CHILD_CONTENT 关系连接。

• 主体子树通过 PRINCIPAL 关系指向能作为 ACL 成员的主体。

• 主体可以聚合成组,通过 IS_MEMBER_OF 关系连接;一个主体可以同时属于多个组。

• SECURITY:连接内容组件到主体组件的关系,附带授予的权限作为属性(比如 "+RW")。
构建 ACL
对任何给定的受 ACL 管理的节点(内容)计算有效权限(如读、写、执行)时,遍历都将遵循下面这一系列规则。

自顶向下的遍历
这个方案让你可以在内容根节点上定义一个通用的权限模式,然后在更具体的子内容节点上针对具体的主体进行覆盖。

1. 从内容节点开始向上移动直到内容根节点,以找到它的路径。
2. 从一个“全部允许”的乐观权限列表开始(111 是以比特位编码表示的读、写、执行)。
3. 从最上面的内容节点开始,查找它上面是否有任何 SECURITY 关系。
4. 如果找到了,查看相关的主体是否属于该 SECURITY 关系另一端的主体。
5. 如果是,把 "+" 权限修饰符加入现有的权限模式,并从模式中撤销 "-" 权限修饰符。
6. 如果两个主体节点连接到同一个内容节点,先应用更通用的主体的修饰符。
7. 沿路径向下一直重复这个安全修饰符查找,直到目标内容节点;这样,离目标节点更近的节点上的设置会覆盖更通用的权限。

相同的算法也适用于自下而上的方案,基本上只是从目标内容节点开始向上遍历,并在遍历器向上移动时应用动态安全修饰符。
范例
现在,在一个自顶向下的方案中,要得到访问权限的结果,比如用户 "user 1" 对文件 "My File.pdf" 的权限,基于上图中的模型,过程如下:

1. 向上遍历,我们从 "Root folder" 开始,初始权限设为 11(这里只考虑读和写)。
2. 有两个 SECURITY 关系连接到该目录。User 1 同时属于两者,但 "root" 更通用,所以先应用它,然后是 "All principals":+W +R → 11。
3. "Home" 没有 SECURITY 指令,继续。
4. "user1 Home" 有 SECURITY。先应用 "Regular Users"(-R -W)→ 00,然后是 "user 1"(+R +W)→ 11。
5. 目标节点 "My File.pdf" 上没有 SECURITY 修饰符,所以 "User 1" 对 "My File.pdf" 的有效权限是读写 → 11。

7.2.2. 读取权限范例
在这个范例中,我们将演示一个目录(directories)和文件(files)的树结构,也包括拥有这些文件的用户以及这些用户的角色。角色可以作用于目录或者文件结构上(相对于完整的 rwx Unix 权限,这里我们只考虑 canRead),而且可以被继承。一个更完整的定义 ACL 架构的范例可以在这里找到:如何在 SQL 中建立基于角色控制的权限系统。

在目录结构中查找所有的文件
为了找到包含在这个结构中的所有文件,我们需要一个可变长度的查询,沿着 contains 关系查找节点,并返回 leaf 关系另外一端的节点。

START root=node:node_auto_index(name = 'FileRoot')
MATCH root-[:contains*0..]->(parentDir)-[:leaf]->file
RETURN file
输出结果:

file
Node[11]{name:"File1"}
Node[10]{name:"File2"}
2 行
163 毫秒

谁拥有哪些文件?
如果我们引入文件所有权的概念,就可以找到刚才查到的文件的拥有者,即通过 owns 关系连接到文件节点的用户。

START root=node:node_auto_index(name = 'FileRoot')
MATCH root-[:contains*0..]->()-[:leaf]->file<-[:owns]-user
RETURN file, user
返回在 FileRoot 节点下面的所有文件的拥有者。

file                    user
Node[11]{name:"File1"}  Node[8]{name:"User1"}
Node[10]{name:"File2"}  Node[7]{name:"User2"}
2 行
3 毫秒

谁可以访问这个文件?
如果我们现在想检查哪些用户可以读取所有文件,可以把我们的 ACL 定义如下:

• 根目录默认不授予任何访问权限。

• 如果某个用户的角色在一个文件的任一上级目录上被授予了 canRead,该用户就可以读取这个文件。

为了找到可以读取上面文件的、在父文件夹层次结构任何部分拥有权限的用户,Cypher 提供了可选的可变长度路径。

START file=node:node_auto_index('name:File*')
MATCH file<-[:leaf]-()<-[:contains*0..]-dir<-[?:canRead]-role-[:member]->readUser
RETURN file.name, dir.name, role.name, readUser.name
这将返回 file、拥有 canRead 权限的目录,以及用户本身和他们的 role。

file.name  dir.name    role.name  readUser.name
"File2"    "Desktop"   <null>     <null>
"File2"    "HomeU2"    <null>     <null>
"File2"    "Home"      <null>     <null>
"File2"    "FileRoot"  "SUDOers"  "Admin1"
"File2"    "FileRoot"  "SUDOers"  "Admin2"
"File1"    "HomeU1"    <null>     <null>
"File1"    "Home"      <null>     <null>
"File1"    "FileRoot"  "SUDOers"  "Admin1"
"File1"    "FileRoot"  "SUDOers"  "Admin2"
9 行
84 毫秒
结果中列出了包含 null 值的路径片段;可以通过细化查询,或者只返回真正需要的值,来避免这种情况。

7.3. 链表
图数据库的一个强大之处在于,你可以创建自己的图数据结构,比如链表。

这个数据结构使用一个单独的节点作为列表的引用。引用节点有一个指向列表头部的传出关系,以及一个来自列表最后一个元素的传入关系。如果列表为空,引用节点将指向它自己。就像这样:
Graph

要初始化一个空链表,我们只需创建一个节点,并使其链接到自身。
Query

CREATE root-[:LINK]->root // no ‘value’ property assigned to root
RETURN root
添加一个值时,先找到新值应当插入位置处的关系,然后用一个新节点以及连向它的两个关系来替换这个关系。

Query

START root=node:node_auto_index(name = "ROOT")
MATCH root-[:LINK*0..]->before,// before could be same as root
      after-[:LINK*0..]->root, // after could be same as root
      before-[old:LINK]->after
WHERE before.value? < 25 // This is the value, which would normally
  AND 25 < after.value?  // be supplied through a parameter.
CREATE before-[:LINK]->({value:25})-[:LINK]->after
DELETE old
反过来,删除一个值时,先找到带有该值的节点以及它一进一出的两个关系,然后用一个新的关系替换它们。

Query

START root=node:node_auto_index(name = "ROOT")
MATCH root-[:LINK*0..]->before,
      before-[delBefore:LINK]->del-[delAfter:LINK]->after,
      after-[:LINK*0..]->root
WHERE del.value = 10
CREATE before-[:LINK]->after
DELETE del, delBefore, delAfter
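上面两个 Cypher 查询的逻辑(在带哨兵节点的有序循环链表中插入和删除)可以用一段普通的 Java 代码来示意(纯示意,与 Neo4j API 无关,类名为假设):

```java
public class SortedCircularList
{
    static class Node
    {
        Integer value; // root 哨兵节点没有 value,对应查询中 root 没有 value 属性
        Node next;
    }

    final Node root = new Node();

    public SortedCircularList()
    {
        root.next = root; // 空链表:root 链接到自身
    }

    // 对应插入查询:找到满足 before.value < v < after.value 的位置,替换旧“关系”
    public void insert( int v )
    {
        Node before = root;
        while ( before.next != root && before.next.value < v )
        {
            before = before.next;
        }
        Node n = new Node();
        n.value = v;
        n.next = before.next;
        before.next = n;
    }

    // 对应删除查询:找到带该值的节点,用一条新“关系”绕过它
    public void delete( int v )
    {
        for ( Node before = root; before.next != root; before = before.next )
        {
            if ( before.next.value == v )
            {
                before.next = before.next.next;
                return;
            }
        }
    }

    public String values()
    {
        StringBuilder sb = new StringBuilder();
        for ( Node n = root.next; n != root; n = n.next )
        {
            if ( sb.length() > 0 )
            {
                sb.append( "," );
            }
            sb.append( n.value );
        }
        return sb.toString();
    }

    public static void main( String[] args )
    {
        SortedCircularList list = new SortedCircularList();
        list.insert( 10 );
        list.insert( 25 );
        list.delete( 10 );
        System.out.println( list.values() ); // 打印 25
    }
}
```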

7.4. 超边
假设用户是不同组的成员,并且在不同的组中可以拥有不同的角色。用户、组和角色三者之间的这种关联可以称为超边(HyperEdge)。不过,它可以很容易地在属性图中建模:用一个节点来捕获这个多元关系,如下图中的 U1G2R1 节点所示。
Graph
7.4.1. 查找组

要找出一个用户在某个特定的组(这里是 Group2)中拥有哪些角色,下面的查询可以遍历这个超边节点并给出答案。

Query

START n=node:node_auto_index(name = "User1")
MATCH n-[:hasRoleInGroup]->hyperEdge-[:hasGroup]->group,
      hyperEdge-[:hasRole]->role
WHERE group.name = "Group2"
RETURN role.name

返回 User1 的角色:

role.name
"Role1"
1 row
0 ms

7.4.2. 查找一个用户的所有组和角色

这里查找一个用户所在的所有组和拥有的角色,按角色名称排序。

Query

START n=node:node_auto_index(name = "User1")
MATCH n-[:hasRoleInGroup]->hyperEdge-[:hasGroup]->group,
      hyperEdge-[:hasRole]->role
RETURN role.name, group.name
ORDER BY role.name asc

返回 User1 的组和角色:

role.name  group.name
"Role1"    "Group2"
"Role2"    "Group1"
2 rows
0 ms

7.4.3. 基于共同角色查找共同的组

假设一个更复杂的图:

1. 两个用户节点 User1、User2。
2. User1 属于 Group1、Group2、Group3。
3. User1 在 Group1 中有 Role1、Role2;在 Group2 中有 Role2、Role3;在 Group3 中有 Role3、Role4(超边)。
4. User2 属于 Group1、Group2、Group3。
5. User2 在 Group1 中有 Role2、Role5;在 Group2 中有 Role3、Role4;在 Group3 中有 Role5、Role6(超边)。

这个图看起来如下(像 U1G2R23 这样的节点表示超边):

Graph

User1 和 User2 在 Group1 和 Group2 中至少共享一个共同角色,要返回这两个组,查询如下:

Query

START u1=node:node_auto_index(name = "User1"),u2=node:node_auto_index(name = "User2")
MATCH u1-[:hasRoleInGroup]->hyperEdge1-[:hasGroup]->group,
      hyperEdge1-[:hasRole]->role,
      u2-[:hasRoleInGroup]->hyperEdge2-[:hasGroup]->group,
      hyperEdge2-[:hasRole]->role
RETURN group.name, count(role)
ORDER BY group.name ASC

User1 和 User2 至少共享一个共同角色的组:

group.name  count(role)
"Group1"    1
"Group2"    1
2 rows
0 ms

7.5. 基于社交邻居的朋友查找
想象一个像下面这样的范例图:

Graph

要找出 Joe 的朋友的朋友中还不是他朋友的人,查询如下:

Query

START joe=node:node_auto_index(name = "Joe")
MATCH joe-[:knows*2..2]-friend_of_friend
WHERE not(joe-[:knows]-friend_of_friend)
RETURN friend_of_friend.name, COUNT(*)
ORDER BY COUNT(*) DESC, friend_of_friend.name

返回朋友的朋友的列表,首先按与他们的连接数排序,其次按名字排序。

friend_of_friend.name  COUNT(*)
"Ian"                  2
"Derrick"              1
"Jill"                 1
3 rows
0 ms

7.6. 共同收藏的地方

Graph

7.6.1. 共同收藏的地方:喜欢 x 的用户也喜欢 y

找出收藏了某个地方的人还喜欢哪些地方:

• 确定谁收藏了地方 x。

• 他们还收藏了哪些不是地方 x 的东西。

Query

START place=node:node_auto_index(name = "CoffeeShop1")
MATCH place<-[:favorite]-person-[:favorite]->stuff
RETURN stuff.name, count(*)
ORDER BY count(*) DESC, stuff.name

收藏了起始地方的人所收藏的其他地方列表:

stuff.name    count(*)
"MelsPlace"   2
"CoffeShop2"  1
"SaunaX"      1
3 rows
0 ms

7.6.2. 共同标签的地方:通过标签关联的地方

找出被打上相同标签的地方:

• 确定地方 x 的标签。

• 还有哪些不是 x 的东西被打上了和 x 相同的标签。

Query

START place=node:node_auto_index(name = "CoffeeShop1")
MATCH place-[:tagged]->tag<-[:tagged]-otherPlace
RETURN otherPlace.name, collect(tag.name)
ORDER BY length(collect(tag.name)) DESC, otherPlace.name

这个查询返回与 CoffeeShop1 共享相同标签的其他地方,按共享标签的数量排名:

otherPlace.name  collect(tag.name)
"MelsPlace"      ["Cool","Cosy"]
"CoffeeShop2"    ["Cool"]
"CoffeeShop3"    ["Cosy"]
3 rows
0 ms

7.7. 基于相似收藏查找用户

Graph

要基于喜欢的东西与提问者相似这一点来找出可能的新朋友,可以使用这样的查询:

Query

START me=node:node_auto_index(name = "Joe")
MATCH me-[:favorite]->stuff<-[:favorite]-person
WHERE NOT(me-[:friend]-person)
RETURN person.name, count(stuff)
ORDER BY count(stuff) DESC

返回按喜欢的相似东西数量排名的、还不是朋友的可能朋友列表:

person.name  count(stuff)
"Derrick"    2
"Jill"       1
2 rows
0 ms

7.8. Find people based on mutual friends and


groups
Graph

80
In this scenario, the problem is to determine mutual friends and groups, if any, between persons.
If no mutual groups or friends are found, there should be a 0 returned.

Query

START me=node(5), other=node(4, 3)


1
MATCH
2
pGroups=me-[?:member_of_group]->mg<-[?:member_of_group]-other,
3
pMutualFriends=me-[?:knows]->mf<-[?:knows]-other
4
RETURN other.name as name,
5
count(distinct pGroups) AS mutualGroups,
6
count(distinct pMutualFriends) AS mutualFriends
7
ORDER BY mutualFriends DESC
The question we are asking is — how many unique paths are there between me and Jill, the
paths being common group memberships, and common friends. If the paths are mandatory, no results
will be returned if me and Bob lack any common friends, and we don't want that. To make a path
optional, you have to make at least one of its relationships optional. That makes the whole path
optional.

表 .

name     mutualGroups   mutualFriends
"Jill"   1              1
"Bob"    1              0

2 rows
0 ms

7.9. Find friends based on similar tagging


Graph

To find people similar to me based on the taggings of their favorited items, one approach
could be:

• Determine the tags associated with what I favorite.

• What else is tagged with those tags?

• Who favorites items tagged with the same tags?

• Sort the result by how many of the same things these people like.

Query

START me=node:node_auto_index(name = "Joe")
MATCH me-[:favorite]->myFavorites-[:tagged]->tag<-[:tagged]-theirFavorites<-[:favorite]-people
WHERE NOT(me=people)
RETURN people.name as name, count(*) as similar_favs
ORDER BY similar_favs DESC
The query returns the list of possible friends ranked by them liking similar stuff that are not
yet friends.

表 .

name        similar_favs
"Sara"      2
"Derrick"   1

2 rows
0 ms

7.10. Multirelational (social) graphs


Graph

This example shows a multi-relational network between persons and things they like. A
multi-relational graph is a graph with more than one kind of relationship between nodes.

Query

START me=node:node_auto_index(name = 'Joe')
MATCH me-[r1:FOLLOWS|LOVES]->other-[r2]->me
WHERE type(r1)=type(r2)
RETURN other.name, type(r1)
The query returns people that FOLLOWS or LOVES Joe back.

表 .

other.name   type(r1)
"Sara"       "FOLLOWS"
"Maria"      "FOLLOWS"
"Maria"      "LOVES"

3 rows
0 ms

7.11. Boosting recommendation results


Graph

This query finds the recommended friends for the origin that are working at the same place as
the origin, or know a person that the origin knows; also, the origin should not already know the target.
This recommendation is weighted by the weight of the relationship r2, and boosted with a factor of 2
if there is an activity property on that relationship.

Query

START origin=node:node_auto_index(name = "Clark Kent")
MATCH origin-[r1:KNOWS|WORKSAT]-(c)-[r2:KNOWS|WORKSAT]-candidate
WHERE type(r1)=type(r2) AND (NOT (origin-[:KNOWS]-candidate))
RETURN origin.name as origin, candidate.name as candidate,
       SUM(ROUND(r2.weight + (COALESCE(r2.activity?, 0) * 2))) as boost
ORDER BY boost desc
LIMIT 10
This returns the recommended friends for the origin nodes and their recommendation score.

表 .

origin         candidate           boost
"Clark Kent"   "Perry White"       22
"Clark Kent"   "Anderson Cooper"   4

2 rows
0 ms

7.12. Calculating the clustering coefficient of a network

Graph
In this example, adapted from Niko Gamulin's blog post on Neo4j for Social Network Analysis,
the graph in question shows the 2-hop relationships of a sample person as nodes with KNOWS
relationships.

The clustering coefficient of a selected node is defined as the probability that two randomly
selected neighbors are connected to each other. With the number of neighbors as n and the number of
mutual connections between the neighbors r, the calculation is: C = r / (n(n-1)/2).

The number of possible connections between two neighbors is n!/(2!(n-2)!) =
4!/(2!(4-2)!) = 24/4 = 6, where n is the number of neighbors (n = 4) and the actual number r
of connections is 1. Therefore the clustering coefficient of node 1 is 1/6.
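The calculation above can be cross-checked with a few lines of Python (not part of the manual, just a sketch of the arithmetic):

```python
from math import factorial

def clustering_coefficient(n, r):
    # Possible connections between n neighbors: n!/(2!(n-2)!)
    possible = factorial(n) // (2 * factorial(n - 2))
    return float(r) / possible

# The example above: n = 4 neighbors, r = 1 actual connection.
print(clustering_coefficient(4, 1))  # → 0.1666..., i.e. 1/6
```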

n and r are quite simple to retrieve via the following query:

Query

START a = node(1)
MATCH (a)--(b)
WITH a, count(distinct b) as n
MATCH (a)--()-[r]-()--(a)
RETURN n, count(distinct r) as r
This returns n and r for the above calculations.

表 .

n   r
4   1

1 row
0 ms

7.13. Pretty graphs


7.13.1. Star graph
7.13.2. Wheel graph
7.13.3. Complete graph
7.13.4. Friendship graph
This section shows how to create some of the named pretty graphs on Wikipedia.

7.13.1. Star graph

The graph is created by first creating a center node, and then, once per element in the range,
creating a leaf node and connecting it to the center.

Query

CREATE center
foreach( x in range(1,6) :
  CREATE leaf, center-[:X]->leaf
)
RETURN id(center) as id;
The query returns the id of the center node.

表 .

id

1 row
Nodes created: 7
Relationships created: 6
2 ms

Graph

7.13.2. Wheel graph

This graph is created in a number of steps:

• Create a center node.

• Once per element in the range, create a leaf and connect it to the center.

• Select 2 leafs from the center node and connect them.

• Find the minimum and maximum leaf and connect these.

• Return the id of the center node.

Query

CREATE center
foreach( x in range(1,6) :
  CREATE leaf={count:x}, center-[:X]->leaf
)
==== center ====
MATCH large_leaf<--center-->small_leaf
WHERE large_leaf.count = small_leaf.count + 1
CREATE small_leaf-[:X]->large_leaf
==== center, min(small_leaf.count) as min, max(large_leaf.count) as max ====
MATCH first_leaf<--center-->last_leaf
WHERE first_leaf.count = min AND last_leaf.count = max
CREATE last_leaf-[:X]->first_leaf
RETURN id(center) as id
The query returns the id of the center node.

表 .

id

1 row
Nodes created: 7
Relationships created: 12
Properties set: 6
14 ms

Graph

7.13.3. Complete graph

For this graph, a root node is created, and used to hang a number of nodes from. Then, two
nodes are selected, hanging from the center, with the requirement that the id of the first is less
than the id of the next. This is to prevent double relationships and self relationships. Using said
match, relationships between all these nodes are created. Lastly, the center node and all
relationships connected to it are removed.
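The requirement id(leaf1)<id(leaf2) is the usual trick for enumerating each unordered pair exactly once; a quick Python sketch of the same idea (the ids below are made up for illustration):

```python
# Hypothetical ids of the six leaf nodes hanging off the center.
leaf_ids = [1, 2, 3, 4, 5, 6]

# Requiring the first id to be smaller than the second avoids both
# self-pairs (a, a) and duplicate pairs (b, a) after (a, b).
pairs = [(a, b) for a in leaf_ids for b in leaf_ids if a < b]

n = len(leaf_ids)
print(len(pairs))  # → 15, i.e. n*(n-1)/2 relationships between the leaves
```

Together with the 6 relationships from the center (deleted at the end), this accounts for the 21 relationships created in the result below.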

Query

CREATE center
foreach( x in range(1,6) :
  CREATE leaf={count : x}, center-[:X]->leaf
)
==== center ====
MATCH leaf1<--center-->leaf2
WHERE id(leaf1)<id(leaf2)
CREATE leaf1-[:X]->leaf2
==== center ====
MATCH center-[r]->()
DELETE center,r;
Nothing is returned by this query.

表 .

Nodes created: 7
Relationships created: 21
Properties set: 6
Nodes deleted: 1
Relationships deleted: 6
19 ms

(empty result)

Graph

7.13.4. Friendship graph

This query first creates a center node, and then, once per element in the range, creates a
cycle graph and connects it to the center.

Query

CREATE center
foreach( x in range(1,3) :
  CREATE leaf1, leaf2, center-[:X]->leaf1, center-[:X]->leaf2, leaf1-[:X]->leaf2
)
RETURN ID(center) as id
The id of the center node is returned by the query.

表 .

id

1 row
Nodes created: 7
Relationships created: 9
12 ms
Graph

7.14. A multilevel indexing structure (path tree)


In this example, a multi-level tree structure is used to index event nodes (here Event1, Event2
and Event3), in this case with a YEAR-MONTH-DAY granularity, making this a timeline indexing
structure. However, this approach should work for a wide range of multi-level ranges.

The structure follows a couple of rules:

• Events can be indexed multiple times by connecting the indexing structure leafs with
the events via a VALUE relationship.
• The querying is done in a path-range fashion. That is, the start and end paths from the
indexing root to the start and end leafs in the tree are calculated.
• Using Cypher, the queries following different strategies can be expressed as path
sections and put together using one single query.

The graph below depicts a structure with 3 Events being attached to an index structure at
different leafs.

Graph

7.14.1. Return zero range

Here, only the events indexed under one leaf (2010-12-31) are returned. The query only needs
one path segment rootPath (color Green) through the index.

Graph

Query

START root=node:node_auto_index(name = 'Root')
MATCH rootPath=root-[:`2010`]->()-[:`12`]->()-[:`31`]->leaf,
      leaf-[:VALUE]->event
RETURN event.name
ORDER BY event.name ASC
Returning all events on the date 2010-12-31, in this case Event1 and Event2.

表 .

event.name
"Event1"
"Event2"

2 rows
0 ms

7.14.2. Return the full range
In this case, the range goes from the first to the last leaf of the index tree. Here, startPath
(color Greenyellow) and endPath (color Green) span up the range, valuePath (color Blue) is
then connecting the leafs, and the values can be read from the middle node, hanging off the values
(color Red) path.

Graph

Query

START root=node:node_auto_index(name = 'Root')
MATCH startPath=root-[:`2010`]->()-[:`12`]->()-[:`31`]->startLeaf,
      endPath=root-[:`2011`]->()-[:`01`]->()-[:`03`]->endLeaf,
      valuePath=startLeaf-[:NEXT*0..]->middle-[:NEXT*0..]->endLeaf,
      values=middle-[:VALUE]->event
RETURN event.name
ORDER BY event.name ASC
Returning all events between 2010-12-31 and 2011-01-03, in this case all events.

表 .

event.name
"Event1"
"Event2"
"Event2"
"Event3"

4 rows
0 ms

7.14.3. Return partly shared path ranges

Here, the query range results in partly shared paths when querying the index, making the
introduction of a common path segment commonPath (color Black) necessary before spanning
up startPath (color Greenyellow) and endPath (color Darkgreen). After that, valuePath
(color Blue) connects the leafs and the indexed values are returned off the values (color Red) path.

Graph

Query

START root=node:node_auto_index(name = 'Root')
MATCH commonPath=root-[:`2011`]->()-[:`01`]->commonRootEnd,
      startPath=commonRootEnd-[:`01`]->startLeaf,
      endPath=commonRootEnd-[:`03`]->endLeaf,
      valuePath=startLeaf-[:NEXT*0..]->middle-[:NEXT*0..]->endLeaf,
      values=middle-[:VALUE]->event
RETURN event.name
ORDER BY event.name ASC
Returning all events between 2011-01-01 and 2011-01-03, in this case Event2 and Event3.

表 .

event.name
"Event2"
"Event3"

2 rows
0 ms

7.15. Complex similarity computations


7.15.1. Calculate similarities by complex calculations

7.15.1. Calculate similarities by complex calculations

Here, a similarity between two players in a game is calculated by the number of times they
have eaten the same food.

Query

START me=node:node_auto_index(name = "me")
MATCH me-[r1:ATE]->food<-[r2:ATE]-you
==== me,count(distinct r1) as H1,count(distinct r2) as H2,you ====
MATCH me-[r1:ATE]->food<-[r2:ATE]-you
RETURN sum((1-ABS(r1.times/H1-r2.times/H2))*(r1.times+r2.times)/(H1+H2)) as similarity
The two players and their similarity measure.

表 .

similarity
-30.0

1 row
0 ms
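The RETURN expression can be replayed offline with a small Python sketch. The eating counts below are hypothetical, chosen so that one shared food eaten 10 and 5 times reproduces the -30.0 from the table:

```python
def similarity(meals_me, meals_you):
    # meals_*: dict mapping food -> times eaten; only shared foods count,
    # mirroring the MATCH on both ATE relationships in the query above.
    shared = set(meals_me) & set(meals_you)
    h1 = h2 = len(shared)  # count(distinct r1) / count(distinct r2)
    total = 0.0
    for food in shared:
        t1, t2 = meals_me[food], meals_you[food]
        total += (1 - abs(float(t1) / h1 - float(t2) / h2)) * (t1 + t2) / (h1 + h2)
    return total

print(similarity({'meat': 10}, {'meat': 5}))  # → -30.0
```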

Graph

第 8 章 多语言支持
下表列出了由社区贡献的、用于以嵌入模式使用 Neo4j 的不同语言和框架的驱动。

表 8.1. 由社区贡献的 Neo4j 嵌入驱动

name          language / framework   URL
Neo4j.rb      JRuby                  https://github.com/andreasronge/neo4j
Neo4django    Python, Django         https://github.com/scholrly/neo4django
Neo4js        JavaScript             https://github.com/neo4j/neo4js
Gremlin       Java, Groovy           gremlin-plugin, https://github.com/tinkerpop/gremlin/wiki
Neo4j-Scala   Scala                  https://github.com/FaKod/neo4j-scala
Borneo        Clojure                https://github.com/wagjo/borneo

第 9 章 在 Python 应用中使用 Neo4j

目录

9.1. 你好,世界!
9.2. 一个使用遍历查询和索引的范例应用
要获取关于 Python 语言绑定的通用信息,请参考:python-embedded-installation。要获
取关于如何在 Python 下安装 Neo4j 驱动的信息,请参考:python-embedded。

9.1. 你好,世界!
这是一个让你能开始使用的简单范例。

from neo4j import GraphDatabase

# Create a database
db = GraphDatabase(folder_to_put_db_in)

# All write operations happen in a transaction
with db.transaction:
    firstNode = db.node(name='Hello')
    secondNode = db.node(name='world!')

    # Create a relationship with type 'knows'
    relationship = firstNode.knows(secondNode, name='graphy')

# Read operations can happen anywhere
message = ' '.join([firstNode['name'], relationship['name'], secondNode['name']])

print message

# Delete the data
with db.transaction:
    firstNode.knows.single.delete()
    firstNode.delete()
    secondNode.delete()

# Always shut down your database when your application exits
db.shutdown()

9.2. 一个使用遍历查询和索引的范例应用
9.2.1. 域逻辑
9.2.2. 创建数据并返回他们
要了解这里用到的一些概念的详细文档,请参考: python-embedded-reference-indexes
和 python-embedded-reference-traversal 。

这个范例展示了如何使用 Neo4j 构建一个简单的缴费单跟踪程序。

我们首先引入 Neo4j,并创建一些用来组织实际数据的元数据。


from neo4j import GraphDatabase, INCOMING, Evaluation

# Create a database
db = GraphDatabase(folder_to_put_db_in)

# All write operations happen in a transaction
with db.transaction:
    # A node to connect customers to
    customers = db.node()

    # A node to connect invoices to
    invoices = db.node()

    # Connected to the reference node, so
    # that we can always find them.
    db.reference_node.CUSTOMERS(customers)
    db.reference_node.INVOICES(invoices)

    # An index, helps us rapidly look up customers
    customer_idx = db.node.indexes.create('customers')

9.2.1. 域逻辑
然后我们定义一些我们的应用能执行的域逻辑。我们的应用有两个域对象:客户和缴
费单。让我们创建一些方法来新增客户和缴费单。

def create_customer(name):
    with db.transaction:
        customer = db.node(name=name)
        customer.INSTANCE_OF(customers)

        # Index the customer by name
        customer_idx['name'][name] = customer
    return customer

def create_invoice(customer, amount):
    with db.transaction:
        invoice = db.node(amount=amount)
        invoice.INSTANCE_OF(invoices)
        invoice.RECIPIENT(customer)
    return customer
在客户这方面,我们创建一个新的节点来表示客户,并把它连接到 客户集 节点。这会帮助
我们稍后查找客户,以及判定某个节点是否表示一个客户。

我们也用客户的名称做索引,为了能快速的通过客户的名字找到客户。

在缴费单这方面,我们也做相同的动作,除了没有索引。我们也连接一个新的缴费单到发
送这个缴费单的客户节点上面,并定义这个关系的类型为:SENT_TO 。

下一步,我们希望能检索到新增的客户和缴费单。因为我们用客户的名称做了索引,
所以找到他们非常简单。

def get_customer(name):
    return customer_idx['name'][name].single
假设我们还想找到给定客户的所有金额大于某一数值的缴费单。这可以通过编写一个
遍历查询来完成,像下面这样:

def get_invoices_with_amount_over(customer, min_sum):

    def evaluator(path):
        node = path.end
        if node.has_key('amount') and node['amount'] > min_sum:
            return Evaluation.INCLUDE_AND_CONTINUE
        return Evaluation.EXCLUDE_AND_CONTINUE

    return db.traversal()\
        .relationships('RECIPIENT', INCOMING)\
        .evaluator(evaluator)\
        .traverse(customer)\
        .nodes

9.2.2. 创建数据并返回他们
把这些放在一起,我们就能创建客户和缴费单,并使用我们写的查询方法找到他们。

for name in ['Acme Inc.', 'Example Ltd.']:
    create_customer(name)

# Loop through customers
for relationship in customers.INSTANCE_OF:
    customer = relationship.start
    for i in range(1,12):
        create_invoice(customer, 100 * i)

# Finding large invoices
large_invoices = get_invoices_with_amount_over(get_customer('Acme Inc.'), 500)

# Getting all invoices per customer:
for relationship in get_customer('Acme Inc.').RECIPIENT.incoming:
    invoice = relationship.start

第 10 章 扩展 Neo4j 服务器

目录

10.1. 服务器插件
10.2. 非托管扩展
Neo4j 服务器可以通过插件或者非托管扩展来增强。为了获取更多关于服务器的信息,请
参考:第 17 章 Neo4j 服务器。

10.1. 服务器插件
内容提示
• 服务器的功能可以通过增加插件的方式来增强。
• 插件是用户自己编码完成的,以便增强数据库,节点以及属性的功能。

• Neo4j 服务器在与客户端通过 HTTP 方式进行交互时使用这些自定义插件。

插件提供了一种为 Neo4j REST API 增加新功能的很好的方式,而不需要重新发明你自
己的 API。插件是运行在服务端的代码,可以访问和处理节点、关系、路径、属性
以及索引。

提示

如果你想完全控制你的 API,你需要多花心思,并理解这样做的风险,而 Neo4j


服务端也可以通过 JAX-RS 提供非托管的扩展。
涉及到的类在 jar 包 org.neo4j:server-api 中,看看下载页面的链接地址,介绍了如何引入。
对于一个 Maven 工程,新增 服务端 API 依赖到你的工程中,你可以修改 pom.xml 像下面这
样:

<dependency>
   <groupId>org.neo4j</groupId>
   <artifactId>server-api</artifactId>
   <version>${neo4j-version}</version>
</dependency>
其中 ${neo4j-version} 是具体的版本号。

为了创建插件,你的代码必须继承 ServerPlugin 类。你的插件必须做到下面几点:

• 确保插件返回一个(或一组)节点、关系或路径,或者任何 Java 原生类型以及字符串。

• 可以指定参数。

• 可以指定一个扩展点。

• 包含应用逻辑。

• 确保 @PluginTarget 和 @Source 参数中的类型一致。

下面是一个插件的范例,参数是数据库(也可以是节点或者关系):

获取所有节点和关系的插件.

@Description( "An extension to the Neo4j Server for getting all nodes or relationships" )
public class GetAll extends ServerPlugin
{
    @Name( "get_all_nodes" )
    @Description( "Get all nodes from the Neo4j graph database" )
    @PluginTarget( GraphDatabaseService.class )
    public Iterable<Node> getAllNodes( @Source GraphDatabaseService graphDb )
    {
        return GlobalGraphOperations.at( graphDb ).getAllNodes();
    }

    @Description( "Get all relationships from the Neo4j graph database" )
    @PluginTarget( GraphDatabaseService.class )
    public Iterable<Relationship> getAllRelationships( @Source GraphDatabaseService graphDb )
    {
        return GlobalGraphOperations.at( graphDb ).getAllRelationships();
    }
}
完整源代码下载地址: GetAll.java

找到两点之间最短距离的插件.

public class ShortestPath extends ServerPlugin
{
    @Description( "Find the shortest path between two nodes." )
    @PluginTarget( Node.class )
    public Iterable<Path> shortestPath(
            @Source Node source,
            @Description( "The node to find the shortest path to." )
            @Parameter( name = "target" ) Node target,
            @Description( "The relationship types to follow when searching for the shortest path(s). " +
                    "Order is insignificant, if omitted all types are followed." )
            @Parameter( name = "types", optional = true ) String[] types,
            @Description( "The maximum path length to search for, default value (if omitted) is 4." )
            @Parameter( name = "depth", optional = true ) Integer depth )
    {
        Expander expander;
        if ( types == null )
        {
            expander = Traversal.expanderForAllTypes();
        }
        else
        {
            expander = Traversal.emptyExpander();
            for ( int i = 0; i < types.length; i++ )
            {
                expander = expander.add(
                        DynamicRelationshipType.withName( types[i] ) );
            }
        }
        PathFinder<Path> shortestPath = GraphAlgoFactory.shortestPath(
                expander, depth == null ? 4 : depth.intValue() );
        return shortestPath.findAllPaths( source, target );
    }
}
完整源代码下载地址: ShortestPath.java

为了部署代码,可以简单的编译成.jar 文件并把他放到服务器的 classpath 下面(按照


惯例,插件目录在 Neo4j 服务端的主目录下面)。

提示

确保通过默认的 Maven 构建,或者用 `jar -cvf myext.jar *` 打包时,目录列表被保留
在 jar 文件中,也就是要对整个目录打包,而不是逐个指定单独的文件。

'.jar' 文件必须包括 'META-INF/services/org.neo4j.server.plugins.ServerPlugin' 文件。

这是一个范例,有多个入口,每一个都在下面独立的一行:
org.neo4j.examples.server.plugins.GetAll
org.neo4j.examples.server.plugins.DepthTwo
org.neo4j.examples.server.plugins.ShortestPath
上面的代码通过 @PluginTarget 注解,让该扩展作用于整个数据库。只要把
@PluginTarget 参数改成 Node.class 或者 Relationship.class,就可以轻松地把扩展切换到
其他数据模型上。插件提供的功能扩展会自动对客户端可见。比如,客户端可以在与服务端交互
时,从收到的响应中发现这些扩展。比如在默认数据库 URI 上发送一个 GET 请求:

curl -v http://localhost:7474/db/data/
默认情况下,该 GET 请求的响应是一段 JSON 数据,其中有一个属性 "extensions",列
出了所有可用的插件。在下面的例子中,我们只把 GetAll 插件注册到了服务端,所以只
有它的扩展功能可以使用。如果没有通过 @Name 指定扩展名,扩展名会基于方法名自动分配。

{
  "extensions-info" : "http://localhost:7474/db/data/ext",
  "node" : "http://localhost:7474/db/data/node",
  "node_index" : "http://localhost:7474/db/data/index/node",
  "relationship_index" : "http://localhost:7474/db/data/index/relationship",
  "reference_node" : "http://localhost:7474/db/data/node/0",
  "extensions_info" : "http://localhost:7474/db/data/ext",
  "extensions" : {
    "GetAll" : {
      "get_all_nodes" : "http://localhost:7474/db/data/ext/GetAll/graphdb/get_all_nodes",
      "get_all_relationships" : "http://localhost:7474/db/data/ext/GetAll/graphdb/getAllRelationships"
    }
  }
}
在两个扩展 URI 上面做 GET 请求获取 meta 信息:

curl http://localhost:7474/db/data/ext/GetAll/graphdb/get_all_nodes

{
  "extends" : "graphdb",
  "description" : "Get all nodes from the Neo4j graph database",
  "name" : "get_all_nodes",
  "parameters" : [ ]
}
为了使用它,需要 POST 到这个地址,并带上指定的参数和 JSON 数据。比如我们调用
shortest path 扩展(URI 通过发送 GET 请求到 http://localhost:7474/db/data/node/123 获
取 ):

curl -X POST http://localhost:7474/db/data/ext/GetAll/node/123/shortestPath \
  -H "Content-Type: application/json" \
  -d '{"target":"http://localhost:7474/db/data/node/456&depth=5"}'
如果一切顺利,将返回状态码 200 以及零个或多个结果。如果没有任何结果返回,将
返回状态码 204。如果扩展出现异常,将返回状态码 500 以及异常错误信息。
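同样的调用也可以在 Python 中构造。下面是一个示意草图:服务器地址和节点 id(123、456)都是假设的示例值,实际发送还需要一个正在运行的 Neo4j 服务端:

```python
import json

# 假设的服务器地址与节点 id,仅作演示。
url = "http://localhost:7474/db/data/ext/GetAll/node/123/shortestPath"
payload = json.dumps({"target": "http://localhost:7474/db/data/node/456",
                      "depth": 5})
headers = {"Content-Type": "application/json"}

# 把 url、headers 与 payload 交给任意 HTTP 客户端(curl、urllib 等)
# 发送 POST,即等价于上面的 curl 命令。
print(payload)
```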

那些需要进行写操作的扩展必须管理它们自己的事务,事务不会被自动管理。

通过这个模型,任何插件都能自然地融入 Neo4j 提供的超媒体风格 API。这意味着使用节
点、关系和路径这些抽象的客户端拥有一条简单的升级路径:当服务端被插件增强后,它们可以
直接受益,而不关心这些扩展的老客户端则不受影响。

10.2. 非托管扩展
内容提示
危险:施工重地(Men at Work)!非托管扩展是一种将任意 JAX-RS 代码部署进 Neo4j
服务器的方式。

非托管扩展更准确的说是:非托管。 如果你把没有经过完整单元测试的代码放入
了服务器,有很高的几率会降低服务器性能,所以请小心使用。

一些项目希望对它们的服务端代码有更细粒度的控制。为此,我们引入了非托管扩展 API。

警告

这是一把双刃剑:它允许用户将任意 JAX-RS 类部署到服务端,因此你要小心使用。
特别是你应该明白,这很容易消耗服务端大量的堆空间,拖慢服务器性能。

如果你了解这些,那么你需要做的只是在代码中加入 @Context 注解,把你的 JAX-RS 类
打包成 jar 文件,并把它加入到 Neo4j 服务端的运行时路径(classpath)中(只需要把它放到
Neo4j 服务端的插件目录中)。你可以通过 org.neo4j.server.logging.Logger 使用宿主
Neo4j 服务端环境的日志功能。

在你的代码中,你可以通过 @Context 访问 GraphDatabaseService :

public MyCoolService( @Context GraphDatabaseService database )
{
    // Have fun here, but be safe!
}
记住,非托管 API 是一把非常锋利的工具。用这种方式部署代码很容易危及服务器,所以
请先考虑是否有其他替代方案,不要优先使用非托管扩展。不过,它也为你提供了许多上下文参
数,比如数据库引用。

为了指定你的扩展的挂载点,完整的类应该像这样:

非托管扩展范例.

@Path( "/helloworld" )
public class HelloWorldResource
{
    private final GraphDatabaseService database;

    public HelloWorldResource( @Context GraphDatabaseService database )
    {
        this.database = database;
    }

    @GET
    @Produces( MediaType.TEXT_PLAIN )
    @Path( "/{nodeId}" )
    public Response hello( @PathParam( "nodeId" ) long nodeId )
    {
        // Do stuff with the database
        return Response.status( Status.OK ).entity(
                ( "Hello World, nodeId=" + nodeId ).getBytes() ).build();
    }
}
范例源代码下载地址: HelloWorldResource.java

编译这些代码,把生成的 jar 文件(以及其他任何依赖包)放到
$NEO4J_SERVER_HOME/plugins 目录中,并在 neo4j-server.properties 文件中注册包含
这些类的包,像下面这样:

提示

确保通过默认的 Maven 构建,或者用 `jar -cvf myext.jar *` 打包时,目录列表被保留
在 jar 文件中,也就是要对整个目录打包,而不是逐个指定单独的文件。

# 包含 JAX-RS 资源的包(逗号分隔),每个包名对应一个挂载点。
org.neo4j.server.thirdparty_jaxrs_classes=org.neo4j.examples.server.unmanaged=/examples/unmanaged
这样,hello 方法就挂载到了 GET 地址:
http://{neo4j_server}:{neo4j_port}/examples/unmanaged/helloworld/{nodeId}

curl http://localhost:7474/examples/unmanaged/helloworld/123
输出结果:

Hello World, nodeId=123

部分 III. 参考
The reference part is the authoritative source for details on Neo4j usage. It covers details on
capabilities, transactions, indexing and queries among other topics.

目录

11. 性能
11.1. 数据安全性
11.2. 数据完整性
11.3. 数据集成
11.4. 可用性和可靠性
11.5. 容量
12. 事务管理
12.1. Interaction cycle
12.2. Isolation levels
12.3. Default locking behavior
12.4. Deadlocks
12.5. Delete semantics
12.6. Creating unique nodes
12.7. Transaction events

13. 数据导入
13.1. 批量插入
14. 索引
14.1. Introduction
14.2. Create
14.3. Delete
14.4. Add
14.5. Remove
14.6. Update
14.7. Search
14.8. Relationship indexes
14.9. Scores
14.10. Configuration and fulltext indexes
14.11. Extra features for Lucene indexes
14.12. Automatic Indexing
15. Cypher 查询语言
15.1. 操作符
15.2. 表达式
15.3. 参数
15.4. 标识符
15.5. 备注
15.6. 更新图数据库
15.7. 事务
15.8. Patterns
16. 图形算法
16.1. Introduction
17. Neo4j 服务器
17.1. 服务器安装
17.2. 服务器配置
17.3. 设置远程调试
17.4. 使用 Neo4j 服务器(带 Web 管理控制台)连接一个 Neo4j 嵌入数据库
17.5. 服务器性能优化
17.6. 在云计算环境中的服务器安装
17.7. Heroku
18. REST API
18.1. Service root
18.2. Streaming
18.3. Cypher queries
18.4. Nodes
18.5. Relationships
18.6. Relationship types
18.7. Node properties
18.8. Relationship properties
18.9. Indexes
18.10. Unique Indexes
18.11. Automatic Indexes
18.12. Configurable Automatic Indexing
18.13. Traversals
18.14. Built-in Graph Algorithms
18.15. Batch operations
18.16. WADL Support
18.17. Cypher 插件
18.18. Gremlin Plugin
19. 在 Python 中使用 Neo4j 嵌入模式
19.1. 安装
19.2. Core API
19.3. 索引
19.4. Cypher 查询
19.5. 遍历查询

第 11 章 性能

目录

11.1. 数据安全性
11.2. 数据完整性
11.3. 数据集成
11.4. 可用性和可靠性
11.5. 容量

11.1. 数据安全性
某些数据可能需要防止未经授权的访问(如窃取、篡改)。Neo4j 本身并不处理数据加密,
但支持使用 Java 语言和 JVM 内置的各种手段,在数据存储之前对其加密。此外,也可以把数据
存储区运行在文件系统级加密之上,从而轻松地保护数据。最后,还应该在上层的周边系统中考
虑数据保护,以防止数据抓取、恶意数据插入以及其他威胁。

11.2. 数据完整性
11.2.1. 核心图引擎
11.2.2. 不同的数据源
In order to keep data consistent, there needs to be mechanisms and structures that guarantee
the integrity of all stored data. In Neo4j, data integrity is maintained for the core graph engine
together with other data sources - see below.

11.2.1. 核心图引擎

In Neo4j, the whole data model is stored as a graph on disk and persisted as part of every
committed transaction. In the storage layer, Relationships, Nodes, and Properties have direct
pointers to each other. This maintains integrity without the need for data duplication between the
different backend store files.

11.2.2. 不同的数据源

在一些情况下,为了让非图形式的查找获得最佳性能,核心图引擎会与其他系统结合使用,
例如用于索引查找的 Lucene(参见第 14 章 索引)。

11.3. 数据集成
11.3.1. Event-based Synchronization
11.3.2. Periodic Synchronization
11.3.3. Periodic Full Export/Import of Data

大多数企业主要依靠关系数据库来存储它们的数据,但这可能带来性能上的限制。在这种情
况下,Neo4j 可以作为扩展系统使用,来补充搜索/查找速度更快的场景。然而,当多个数据仓库
中包含相同的数据时,同步就会成为一个问题。在某些应用中,搜索平台与关系数据库之间存在
轻微的数据滞后是可以接受的;在另一些应用中,则要求(例如 Neo4j 与 RDBMS 之间)严格的
数据一致性。通常,还需要区别对待实时的数据变更和 RDBMS 中发生的大批量数据变更。下面
是几种数据同步的集成策略。

11.3.1. Event-based Synchronization

In this scenario, all data stores, both RDBMS and Neo4j, are fed with domain-specific
events via an event bus. Thus, the data held in the different backends is not actually
synchronized but rather replicated.

11.3.2. Periodic Synchronization

另一个可行的方案是定期将 RDBMS 中最新的变更通过某种形式的 SQL 查询导出到 Neo4j。

这会在同步过程中引入少量延迟,但优势是可以把 RDBMS 作为所有数据的主数据源。反过来,
以 Neo4j 作为主数据源时,也可以采用相同的流程。

11.3.3. Periodic Full Export/Import of Data

Using the Batch Inserter tools for Neo4j, even large amounts of data can be imported into
the database in very short times. Thus, a full export from the RDBMS and import into Neo4j
becomes possible. If the propagation lag between the RDBMS and Neo4j is not a big issue, this
is a very viable solution.

11.4. 可用性和可靠性
11.4.1. Operational Availability
11.4.2. Disaster Recovery/ Resiliency
Most mission-critical systems require the database subsystem to be accessible at all times.
Neo4j ensures availability and reliability through a few different strategies.

11.4.1. Operational Availability

In order not to create a single point of failure, Neo4j supports different approaches which
provide transparent fallback and/or recovery from failures.

Online backup (Cold spare)


In this approach, a single instance of the master database is used, with Online Backup
enabled. In case of a failure, the backup files can be mounted onto a new Neo4j instance and
reintegrated into the application.

Online Backup High Availability (Hot spare)


Here, a Neo4j "backup" instance listens to online transfers of changes from the master. In
the event of a failure of the master, the backup is already running and can directly take over the
load.

High Availability cluster


This approach uses a cluster of database instances, with one (read/write) master and a
number of (read-only) slaves. Failing slaves can simply be restarted and brought back online.
Alternatively, a new slave may be added by cloning an existing one. Should the master instance
fail, a new master will be elected by the remaining cluster nodes.

11.4.2. Disaster Recovery/ Resiliency

In cases of a breakdown of major part of the IT infrastructure, there need to be mechanisms


in place that enable the fast recovery and regrouping of the remaining services and servers. In
Neo4j, there are different components that are suitable to be part of a disaster recovery strategy.

Prevention
• Online Backup High Availability to other locations outside the current data center.

• Online Backup to different file system locations: this is a simpler form of backup,
applying changes directly to backup files; it is thus more suited for local backup scenarios.
• Neo4j High Availability cluster: a cluster of one write-master Neo4j server and a
number of read-slaves, getting transaction logs from the master. Write-master failover is
handled by quorum election among the read-slaves for a new master.

Detection
• SNMP and JMX monitoring can be used for the Neo4j database.

Correction
• Online Backup: A new Neo4j server can be started directly on the backed-up files and
take over new requests.
• Neo4j High Availability cluster: A broken Neo4j read slave can be reinserted into the
cluster, getting the latest updates from the master. Alternatively, a new server can be inserted by
copying an existing server and applying the latest updates to it.

11.5. 容量

11.5.1. File Sizes


Neo4j 的所有文件处理都依赖 Java 的非阻塞 I/O 子系统。此外,虽然存储文件的布局是针
对互相关联的数据优化的,Neo4j 并不需要裸设备。因此,文件大小只受限于底层操作系统处理大
文件的能力;Neo4j 本身对它能处理的文件没有内置的限制。Neo4j 会尝试把底层存储文件尽可能
多地做内存映射。如果可用的 RAM 不足以容纳全部数据,Neo4j 会使用缓冲区,某些情况下还会
动态地把内存映射的高性能 I/O 窗口重新分配给 I/O 活动最频繁的区域。因此,当 RAM 成为制
约因素时,ACID 操作的速度会平缓地下降。

11.5.2. Read speed


企业希望优化硬件的使用,从可用资源中获得最大的业务价值。Neo4j 的读取方式能最
佳地利用所有可用的硬件资源。Neo4j 的读操作不会阻塞,也不会加任何锁;因此读取操作
没有死锁的风险,也不需要为读取开启事务。多个线程可以同时对数据库进行读访问,查询可
以在所有可用的处理器上并发运行。这使得通过更大的服务器进行纵向扩展的方案能很好地伸缩。

11.5.3.写入速度

写入速度是很多企业应用需要考虑的问题。这里有两种不同的场景:持续的连续写入,
和批量写入(例如备份、初始加载或批量加载)。为了支持这些场景的不同要求,Neo4j
支持两种写入存储层的模式。在事务性的、符合 ACID 的正常操作中,隔离级别得到保证,读写
操作可以同时进行;每次提交时数据被持久化到磁盘,系统故障后可以恢复到一致的状态。
这需要磁盘写入权限,并要求数据被真正刷写到磁盘。因此,在连续写入模式下,单台服务器上
Neo4j 的写入速度受限于硬件的 I/O 能力。所以,在生产方案中强烈建议使用快速的固态硬盘
(SSD)。

Neo4j has a Batch Inserter that operates directly on the store files. This mode does not
provide transactional security, so it can only be used when there is a single write thread. Because
data is written sequentially, and never flushed to the logical logs, huge performance boosts are
achieved. The Batch Inserter is optimized for non-transactional bulk import of large amounts of
data.

11.5.4. Data size

在 Neo4j 中,数据大小主要受节点、关系、属性和关系类型(RelationshipTypes)的主键
地址空间限制。目前的地址空间如下:

• 节点:2^35(约 340 亿)
• 关系:2^35(约 340 亿)
• 属性:2^36(约 680 亿)
• 关系类型:2^15(约 32 000)
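上面的数值可以用几行 Python 直接展开验证(2^35 等上限来自正文):

```python
# 正文给出的地址空间上限,逐一展开成十进制数值。
limits = {
    "nodes": 2 ** 35,               # 34359738368,约 340 亿
    "relationships": 2 ** 35,       # 同上
    "properties": 2 ** 36,          # 68719476736,约 680 亿
    "relationship_types": 2 ** 15,  # 32768,约 32 000
}
for name in sorted(limits):
    print(name, limits[name])
```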

第 12 章 事务管理

为了完全保持数据的完整性并确保良好的事务行为,Neo4j 支持 ACID 特性:

• 原子性:如果事务的任何部分失败,数据库状态保持不变。
• 一致性:任何事务都会让数据库处于一致的状态。
• 隔离性:在一个事务中修改的数据,在提交之前对其他操作不可见。
• 持久性:DBMS 总是可以恢复已提交事务的结果。

特别地:

• 对 Neo4j 中数据的所有修改都必须包装在事务中。
• 默认隔离级别是 READ_COMMITTED。
• 通过遍历检索到的数据不受其他事务修改的保护。
• 可能发生不可重复读(即只有写锁会被获取,并一直保持到事务结束)。
• 可以手动在节点和关系上获取写锁,以实现更高的隔离级别(如可序列化)。
• 锁在节点和关系级别上获取。
• 死锁检测内置于核心事务管理中。

12.1. Interaction cycle


All write operations that work with the graph must be performed in a transaction.
Transactions are thread confined and can be nested as “flat nested transactions”. Flat nested
transactions means that all nested transactions are added to the scope of the top level transaction.
A nested transaction can mark the top level transaction for rollback, meaning the entire
transaction will be rolled back. To only rollback changes made in a nested transaction is not
possible.

When working with transactions the interaction cycle looks like this:

1. Begin a transaction.
2. Operate on the graph performing write operations.
3. Mark the transaction as successful or not.
4. Finish the transaction.

It is very important to finish each transaction. The transaction will not release the locks or
memory it has acquired until it has been finished. The idiomatic use of transactions in Neo4j is to use
a try-finally block, starting the transaction and then try to perform the write operations. The last
operation in the try block should mark the transaction as successful while the finally block should
finish the transaction. Finishing the transaction will perform commit or rollback depending on the
success status.
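The idiomatic try-finally cycle can be illustrated with a plain-Python sketch; the Transaction class below is a made-up stand-in for illustration only, not the Neo4j API:

```python
class Transaction(object):
    """Toy stand-in for a Neo4j transaction, for illustration only."""
    def __init__(self):
        self.successful = False
        self.outcome = None

    def success(self):
        # 3. Mark the transaction as successful.
        self.successful = True

    def finish(self):
        # 4. Finish: commit or rollback depending on the success status.
        self.outcome = "commit" if self.successful else "rollback"

tx = Transaction()      # 1. Begin a transaction.
try:
    pass                # 2. Operate on the graph performing write operations.
    tx.success()
finally:
    tx.finish()         # Always finish, releasing locks and memory.

print(tx.outcome)  # → commit
```

If an exception is raised before success() is called, the finally block still runs and finish() rolls the work back.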

小心

All modifications performed in a transaction are kept in memory. This means that very
large updates have to be split into several top level transactions to avoid running out of
memory. It must be a top level transaction since splitting up the work in many nested
transactions will just add all the work to the top level transaction.
In an environment that makes use of thread pooling other errors may occur when failing to finish
a transaction properly. Consider a leaked transaction that did not get finished properly. It will be tied
to a thread and when that thread gets scheduled to perform work starting a new (what looks to be a)
top level transaction it will actually be a nested transaction. If the leaked transaction state is “marked
for rollback” (which will happen if a deadlock was detected) no more work can be performed on that
transaction. Trying to do so will result in error on each call to a write operation.

12.2. Isolation levels


By default a read operation will read the last committed value unless a local modification within
the current transaction exists. The default isolation level is very similar to READ_COMMITTED: reads
do not block or take any locks, so non-repeatable reads can occur. It is possible to achieve a stronger
isolation level (such as REPEATABLE_READ and SERIALIZABLE) by manually acquiring read and
write locks.

12.3. Default locking behavior


• When adding, changing or removing a property on a node or relationship a write lock
will be taken on the specific node or relationship.
• When creating or deleting a node a write lock will be taken for the specific node.

• When creating or deleting a relationship a write lock will be taken on the specific
relationship and both its nodes.

The locks will be added to the transaction and released when the transaction finishes.

12.4. Deadlocks
Since locks are used it is possible for deadlocks to happen. Neo4j will, however, detect any
deadlock (caused by acquiring a lock) before it happens and throw an exception. Before the
exception is thrown the transaction is marked for rollback. All locks acquired by the transaction
are still held, but will be released when the transaction is finished (in the finally block as
pointed out earlier). Once the locks are released, other transactions that were waiting for locks
held by the transaction causing the deadlock can proceed. The work performed by the transaction
causing the deadlock can then be retried by the user if needed.

Experiencing frequent deadlocks is an indication of concurrent write requests happening in
such a way that it is not possible to execute them while at the same time living up to the intended
isolation and consistency. The solution is to make sure concurrent updates happen in a
reasonable way. For example, given two specific nodes (A and B), adding or deleting
relationships to both of these nodes in random order for each transaction will result in deadlocks
when there are two or more transactions doing that concurrently. One solution is to make sure
that updates always happen in the same order (first A then B). Another solution is to make sure
that each thread/transaction does not have any writes conflicting with those of some other
concurrent transaction to a node or relationship. This can for example be achieved by letting a
single thread do all updates of a specific type.
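Retrying the work after a deadlock can be done with a simple loop around the transaction. This is only a sketch: `MAX_ATTEMPTS` and `doUpdate` are hypothetical names standing in for your own retry limit and write operations.

```java
// Retry the unit of work when Neo4j detects a deadlock.
for ( int attempt = 0; attempt < MAX_ATTEMPTS; attempt++ )
{
    Transaction tx = graphDb.beginTx();
    try
    {
        doUpdate( graphDb ); // the conflicting write operations
        tx.success();
        break; // done, no deadlock this time
    }
    catch ( DeadlockDetectedException e )
    {
        // the transaction is marked for rollback; finishing it in the
        // finally block releases its locks, then we loop and retry
    }
    finally
    {
        tx.finish();
    }
}
```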
117
Important

Deadlocks caused by the use of other synchronization than the locks managed by Neo4j
can still happen. Since all operations in the Neo4j API are thread safe unless specified
otherwise, there is no need for external synchronization. Other code that requires
synchronization should be synchronized in such a way that it never performs any Neo4j
operation in the synchronized block.

12.5. Delete semantics


When deleting a node or a relationship all properties for that entity will be automatically
removed but the relationships of a node will not be removed.

Caution

Neo4j enforces a constraint (upon commit) that all relationships must have a valid start
node and end node. In effect this means that trying to delete a node that still has
relationships attached to it will throw an exception upon commit. It is however possible to
choose in which order to delete the node and the attached relationships as long as no
relationships exist when the transaction is committed.

The delete semantics can be summarized in the following bullets:

• All properties of a node or relationship will be removed when it is deleted.

• A deleted node can not have any attached relationships when the transaction commits.

• It is possible to acquire a reference to a deleted relationship or node that has not yet
been committed.
• Any write operation on a node or relationship after it has been deleted (but not yet
committed) will throw an exception.
• After commit, trying to acquire a new reference to, or work with an old reference to, a
deleted node or relationship will throw an exception.
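Putting these rules together, a node that still has relationships can be deleted by first deleting those relationships within the same transaction; a minimal sketch, assuming `graphDb` and `node` already exist:

```java
// Delete all of the node's relationships first, then the node itself,
// so the commit-time constraint on relationships is satisfied.
Transaction tx = graphDb.beginTx();
try
{
    for ( Relationship relationship : node.getRelationships() )
    {
        relationship.delete();
    }
    node.delete();
    tx.success();
}
finally
{
    tx.finish();
}
```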

12.6. Creating unique nodes


12.6.1. Single thread
12.6.2. Get or create
12.6.3. Pessimistic locking
In many use cases, a certain level of uniqueness is desired among entities. You could for
instance imagine that only one user with a certain e-mail address may exist in a system. If
multiple concurrent threads naively try to create the user, duplicates will be created. There are
three main strategies for ensuring uniqueness, and they all work across HA and single-instance
deployments.

12.6.1. Single thread

By using a single thread, no two threads will even try to create a particular entity
simultaneously. On HA, an external single-threaded client can perform the operations on the
cluster.

12.6.2. Get or create


By using put-if-absent functionality, entity uniqueness can be guaranteed using an index.
Here the index acts as the lock and will only lock the smallest part needed to guarantee uniqueness
across threads and transactions. To get the more high-level get-or-create functionality, make use
of UniqueFactory as seen in the example below.

Example code:

public Node getOrCreateUserWithUniqueFactory( String username,
        GraphDatabaseService graphDb )
{
    UniqueFactory<Node> factory =
            new UniqueFactory.UniqueNodeFactory( graphDb, "users" )
    {
        @Override
        protected void initialize( Node created, Map<String, Object> properties )
        {
            created.setProperty( "name", properties.get( "name" ) );
        }
    };

    return factory.getOrCreate( "name", username );
}

12.6.3. Pessimistic locking


Important

While this is a working solution, please consider using the preferred approach described
in 第 12.6.2 节 “Get or create” instead.

By using explicit, pessimistic locking, unique creation of entities can be achieved in a


multi-threaded environment. It is most commonly done by locking on a single or a set of
common nodes.

One might be tempted to use Java synchronization for this, but it is dangerous. By mixing
locks in the Neo4j kernel and in the Java runtime, it is easy to produce deadlocks that are not
detectable by Neo4j. As long as all locking is done by Neo4j, all deadlocks will be detected and
avoided. Also, a solution using manual synchronization doesn’t ensure uniqueness in an HA
environment.

Example code:

public Node getOrCreateUserPessimistically( String username,
        GraphDatabaseService graphDb, Node lockNode )
{
    Index<Node> usersIndex = graphDb.index().forNodes( "users" );
    Node userNode = usersIndex.get( "name", username ).getSingle();
    if ( userNode != null ) return userNode;
    Transaction tx = graphDb.beginTx();
    try
    {
        tx.acquireWriteLock( lockNode );
        userNode = usersIndex.get( "name", username ).getSingle();
        if ( userNode == null )
        {
            userNode = graphDb.createNode();
            userNode.setProperty( "name", username );
            usersIndex.add( userNode, "name", username );
        }
        tx.success();
        return userNode;
    }
    finally
    {
        tx.finish();
    }
}

12.7. Transaction events
Transaction event handlers can be registered to receive Neo4j transaction events. Once it has
been registered at a GraphDatabaseService instance it will receive events about what has
happened in each transaction which is about to be committed. Handlers won’t get notified about
transactions which haven’t performed any write operation or won’t be committed (either because
Transaction#success() hasn’t been called or because the transaction has been marked as failed with
Transaction#failure()). Right before a transaction is about to be committed the
beforeCommit method is called with the entire diff of modifications made in the transaction. At
this point the transaction is still running, so changes can still be made. However, there’s no guarantee
that other handlers will see such changes, since the order in which handlers are executed is undefined.
This method can also throw an exception and will, in such a case, prevent the transaction from being
committed (a call to afterRollback will follow). If beforeCommit is successfully
executed the transaction will be committed and the afterCommit method will be called with the
same transaction data as well as the object returned from beforeCommit. This assumes that all
other handlers (if more were registered) also executed beforeCommit successfully.
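A handler registration can be sketched like this. The generic type parameter is the state object passed from beforeCommit to the after* methods; counting created nodes is just an illustration, not part of the API.

```java
// Register a handler that counts created nodes per committed transaction.
graphDb.registerTransactionEventHandler( new TransactionEventHandler<Integer>()
{
    @Override
    public Integer beforeCommit( TransactionData data ) throws Exception
    {
        // throwing here would prevent the commit
        int created = 0;
        for ( Node node : data.createdNodes() )
        {
            created++;
        }
        return created;
    }

    @Override
    public void afterCommit( TransactionData data, Integer created )
    {
        System.out.println( created + " nodes created" );
    }

    @Override
    public void afterRollback( TransactionData data, Integer created )
    {
        // called if beforeCommit threw or the commit failed
    }
} );
```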

第 13 章 Data Import
For high-performance data import, the batch insertion facilities described in this chapter are
recommended. Other ways of importing data into Neo4j include using the Gremlin graph import
(see 第 18.18.2 节 “Load a sample graph”) or using the Geoff notation
(see http://geoff.nigelsmall.net/).

13.1. Batch Insertion
13.1.1. Batch Inserter Examples
13.1.2. Batch Graph Database
13.1.3. Index Batch Insertion
Neo4j has a batch insertion facility intended for initial imports, which bypasses transactions
and other checks in favor of performance. This is useful when you have a big dataset that needs
to be loaded once.

Batch insertion is included in the neo4j-kernel component, which is part of all Neo4j
distributions and editions.

Be aware of the following points when using batch insertion:

• The intended use is for initial import of data.

• Batch insertion is not thread safe.

• Batch insertion is non-transactional.

• Unless shutdown is successfully invoked at the end of the import, the database files
will be corrupt.

Warning

Always perform batch insertion in a single thread (or use synchronization to make
only one thread at a time access the batch inserter) and invoke shutdown when finished.

13.1.1. Batch Inserter Examples

Creating a batch inserter is similar to how you normally create data in the database, but in this
case the low-level BatchInserter interface is used. As we have already pointed out, you can’t have
multiple threads using the batch inserter concurrently without external synchronization.

Tip

The source code of the examples is found here:


BatchInsertExampleTest.java
To get hold of a BatchInserter, use BatchInserters and then go from there:

BatchInserter inserter =
        BatchInserters.inserter( "target/batchinserter-example" );
Map<String, Object> properties = new HashMap<String, Object>();
properties.put( "name", "Mattias" );
long mattiasNode = inserter.createNode( properties );
properties.put( "name", "Chris" );
long chrisNode = inserter.createNode( properties );
RelationshipType knows = DynamicRelationshipType.withName( "KNOWS" );
// To set properties on the relationship, use a properties map
// instead of null as the last parameter.
inserter.createRelationship( mattiasNode, chrisNode, knows, null );
inserter.shutdown();
To gain good performance you probably want to set some configuration settings for the batch
inserter. Read 第 21.9.2 节 “Batch insert example” for information on configuring a batch inserter.
This is how to start a batch inserter with configuration options:

Map<String, String> config = new HashMap<String, String>();
config.put( "neostore.nodestore.db.mapped_memory", "90M" );
BatchInserter inserter = BatchInserters.inserter(
        "target/batchinserter-example-config", config );
// Insert data here ... and then shut down:
inserter.shutdown();
In case you have stored the configuration in a file, you can load it like this:

Map<String, String> config =
        MapUtil.load( new File( "target/batchinsert-config" ) );
BatchInserter inserter = BatchInserters.inserter(
        "target/batchinserter-example-config", config );
// Insert data here ... and then shut down:
inserter.shutdown();

13.1.2. Batch Graph Database


If you already have code for importing data written against the normal Neo4j API, you
could consider using a batch inserter exposing that same API.

Note

This will not perform as well as using the BatchInserter API directly.

Also be aware of the following:

• Starting a transaction or invoking Transaction.finish() or
Transaction.success() will do nothing.

• Invoking the Transaction.failure() method will generate a
NotInTransaction exception.

• Node.delete() and Node.traverse() are not supported.

• Relationship.delete() is not supported.

• Event handlers and indexes are not supported.

• GraphDatabaseService.getRelationshipTypes(), getAllNodes() and
getAllRelationships() are not supported.

With these precautions in mind, this is how to do it:

GraphDatabaseService batchDb =
        BatchInserters.batchDatabase( "target/batchdb-example" );
Node mattiasNode = batchDb.createNode();
mattiasNode.setProperty( "name", "Mattias" );
Node chrisNode = batchDb.createNode();
chrisNode.setProperty( "name", "Chris" );
RelationshipType knows = DynamicRelationshipType.withName( "KNOWS" );
mattiasNode.createRelationshipTo( chrisNode, knows );
batchDb.shutdown();
Tip

The source code of the example is found here: BatchInsertExampleTest.java

13.1.3. Index Batch Insertion
For general notes on batch insertion, see 第 13.1 节 “Batch Insertion”.

Indexing during batch insertion is done using BatchInserterIndex which are provided via
BatchInserterIndexProvider. An example:

BatchInserter inserter =
        BatchInserters.inserter( "target/neo4jdb-batchinsert" );
BatchInserterIndexProvider indexProvider =
        new LuceneBatchInserterIndexProvider( inserter );
BatchInserterIndex actors =
        indexProvider.nodeIndex( "actors", MapUtil.stringMap( "type", "exact" ) );
actors.setCacheCapacity( "name", 100000 );

Map<String, Object> properties = MapUtil.map( "name", "Keanu Reeves" );
long node = inserter.createNode( properties );
actors.add( node, properties );

// make the changes visible for reading, use this sparsely, requires IO!
actors.flush();

// Make sure to shut down the index provider as well
indexProvider.shutdown();
inserter.shutdown();
The configuration parameters are the same as described in 第 14.10 节 “Configuration and fulltext indexes”.

Best practices
Here are some pointers to get the most performance out of BatchInserterIndex:

• Try to avoid flushing too often because each flush will result in all additions (since last
flush) to be visible to the querying methods, and publishing those changes can be a performance
penalty.
• Have (as big as possible) phases where one phase is either only writes or only reads, and
don’t forget to flush after a write phase so that those changes becomes visible to the querying
methods.

• Enable caching for keys you know you’re going to do lookups for later on to increase
performance significantly (though insertion performance may degrade slightly).

Note

Changes to the index are available for reading only after they have been flushed to disk.
Thus, for optimal performance, keep read and lookup operations to a minimum during
batch insertion, since they involve IO and impact speed negatively.

第 14 章 Indexing

Contents

14.1. Introduction
14.2. Create
14.3. Delete
14.4. Add
14.5. Remove
14.6. Update
14.7. Search
14.8. Relationship indexes
14.9. Scores
14.10. Configuration and fulltext indexes
14.11. Extra features for Lucene indexes
14.12. Automatic Indexing
Indexing in Neo4j can be done in two different ways:

1. The database itself is a natural index consisting of its relationships of different types
between nodes. For example a tree structure can be layered on top of the data and used for index
lookups performed by a traverser.
2. Separate index engines can be used, with Apache Lucene being the default backend
included with Neo4j.

This chapter demonstrates how to use the second type of indexing, focusing on Lucene.

14.1. Introduction
Indexing operations are part of the Neo4j index API.

Each index is tied to a unique, user-specified name (for example "first_name" or "books") and
can index either nodes or relationships.

The default index implementation is provided by the neo4j-lucene-index component,


which is included in the standard Neo4j download. It can also be downloaded separately from
http://repo1.maven.org/maven2/org/neo4j/neo4j-lucene-index/ . For Maven users, the
neo4j-lucene-index component has the coordinates org.neo4j:neo4j-lucene-index
and should be used with the same version of org.neo4j:neo4j-kernel. Different versions of the
index and kernel components are not compatible in the general case. Both components are included
transitively by the org.neo4j:neo4j:pom artifact which makes it simple to keep the versions in
sync.

For initial import of data using indexes, see 第 13.1.3 节 “Index Batch Insertion”.

14.2. Create
An index is created if it doesn’t exist when you ask for it. Unless you give it a custom
configuration, it will be created with default configuration and backend.

To set the stage for our examples, let’s create some indexes to begin with:

IndexManager index = graphDb.index();
Index<Node> actors = index.forNodes( "actors" );
Index<Node> movies = index.forNodes( "movies" );
RelationshipIndex roles = index.forRelationships( "roles" );
This will create two node indexes and one relationship index with default configuration. See
第 14.8 节 “Relationship indexes” for more information specific to relationship indexes.

See 第 14.10 节 “Configuration and fulltext indexes” for how to create fulltext indexes.

You can also check if an index exists like this:

IndexManager index = graphDb.index();
boolean indexExists = index.existsForNodes( "actors" );

14.3. Delete
Indexes can be deleted. When deleting, the entire contents of the index will be removed as
well as its associated configuration. A new index can be created with the same name at a later
point in time.

IndexManager index = graphDb.index();
Index<Node> actors = index.forNodes( "actors" );
actors.delete();
Note that the actual deletion of the index is made during the commit of the surrounding
transaction. Calls made to such an index instance after delete() has been called are invalid inside that

transaction as well as outside (if the transaction is successful), but will become valid again if the
transaction is rolled back.

14.4. Add
Each index supports associating any number of key-value pairs with any number of entities
(nodes or relationships), where each association between entity and key-value pair is performed
individually. To begin with, let’s add a few nodes to the indexes:

// Actors
Node reeves = graphDb.createNode();
reeves.setProperty( "name", "Keanu Reeves" );
actors.add( reeves, "name", reeves.getProperty( "name" ) );
Node bellucci = graphDb.createNode();
bellucci.setProperty( "name", "Monica Bellucci" );
actors.add( bellucci, "name", bellucci.getProperty( "name" ) );
// multiple values for a field, in this case for search only
// and not stored as a property.
actors.add( bellucci, "name", "La Bellucci" );
// Movies
Node theMatrix = graphDb.createNode();
theMatrix.setProperty( "title", "The Matrix" );
theMatrix.setProperty( "year", 1999 );
movies.add( theMatrix, "title", theMatrix.getProperty( "title" ) );
movies.add( theMatrix, "year", theMatrix.getProperty( "year" ) );
Node theMatrixReloaded = graphDb.createNode();
theMatrixReloaded.setProperty( "title", "The Matrix Reloaded" );
theMatrixReloaded.setProperty( "year", 2003 );
movies.add( theMatrixReloaded, "title", theMatrixReloaded.getProperty( "title" ) );
movies.add( theMatrixReloaded, "year", 2003 );
Node malena = graphDb.createNode();
malena.setProperty( "title", "Malèna" );
malena.setProperty( "year", 2000 );
movies.add( malena, "title", malena.getProperty( "title" ) );
movies.add( malena, "year", malena.getProperty( "year" ) );
Note that there can be multiple values associated with the same entity and key.

Next up, we’ll create relationships and index them as well:

// we need a relationship type
DynamicRelationshipType ACTS_IN =
        DynamicRelationshipType.withName( "ACTS_IN" );
// create relationships
Relationship role1 = reeves.createRelationshipTo( theMatrix, ACTS_IN );
role1.setProperty( "name", "Neo" );
roles.add( role1, "name", role1.getProperty( "name" ) );
Relationship role2 = reeves.createRelationshipTo( theMatrixReloaded, ACTS_IN );
role2.setProperty( "name", "Neo" );
roles.add( role2, "name", role2.getProperty( "name" ) );
Relationship role3 = bellucci.createRelationshipTo( theMatrixReloaded, ACTS_IN );
role3.setProperty( "name", "Persephone" );
roles.add( role3, "name", role3.getProperty( "name" ) );
Relationship role4 = bellucci.createRelationshipTo( malena, ACTS_IN );
role4.setProperty( "name", "Malèna Scordia" );
roles.add( role4, "name", role4.getProperty( "name" ) );
After these operations, our example graph looks like this:

Figure 14.1. Movie and Actor Graph

14.5. Remove
Removing from an index is similar to adding, but can be done by supplying one of the following
combinations of arguments:

• entity

• entity, key

• entity, key, value

// completely remove bellucci from the actors index
actors.remove( bellucci );
// remove any "name" entry of bellucci from the actors index
actors.remove( bellucci, "name" );
// remove the "name" -> "La Bellucci" entry of bellucci
actors.remove( bellucci, "name", "La Bellucci" );

14.6. Update
Important

To update an index entry, the old one must be removed and a new one added. For
details on removing index entries, see 第 14.5 节 “Remove”.

Remember that a node or relationship can be associated with any number of key-value pairs in
an index. This means that you can index a node or relationship with many key-value pairs that have
the same key. In the case where a property value changes and you’d like to update the index, it’s not
enough to just index the new value — you’ll have to remove the old value as well.

Here’s a code example that demonstrates how it’s done:

// create a node with a property
// so we have something to update later on
Node fishburn = graphDb.createNode();
fishburn.setProperty( "name", "Fishburn" );
// index it
actors.add( fishburn, "name", fishburn.getProperty( "name" ) );
// update the index entry
// when the property value changes
actors.remove( fishburn, "name", fishburn.getProperty( "name" ) );
fishburn.setProperty( "name", "Laurence Fishburn" );
actors.add( fishburn, "name", fishburn.getProperty( "name" ) );

14.7. Search
14.7.1. Get
14.7.2. Query

An index can be searched in two ways, get and query. The get method will return exact matches
to the given key-value pair, whereas query exposes querying capabilities directly from the backend
used by the index. For example the Lucene query syntax can be used directly with the default
indexing backend.

14.7.1. Get

This is how to search for a single exact match:

IndexHits<Node> hits = actors.get( "name", "Keanu Reeves" );
Node reeves = hits.getSingle();
IndexHits is an Iterable with some additional useful methods. For example getSingle()
returns the first and only item from the result iterator, or null if there isn’t any hit.

Here’s how to get a single relationship by exact matching and retrieve its start and end
nodes:

Relationship persephone =
        roles.get( "name", "Persephone" ).getSingle();
Node actor = persephone.getStartNode();
Node movie = persephone.getEndNode();
Finally, we can iterate over all exact matches from a relationship index:

for ( Relationship role : roles.get( "name", "Neo" ) )
{
    // this will give us Reeves twice
    Node reeves = role.getStartNode();
}
Important

In case you don’t iterate through all the hits, IndexHits.close() must be called
explicitly.

14.7.2. Query
There are two query methods: one uses a key-value signature, where the value represents a query
for values with the given key only; the other is more generic and supports querying for multiple
key-value pairs in the same query.
Here's an example using the key-query option:

for ( Node actor : actors.query( "name", "*e*" ) )
{
    // This will return Reeves and Bellucci
}
In the following example the query uses multiple keys:

for ( Node movie : movies.query( "title:*Matrix* AND year:1999" ) )
{
    // This will return "The Matrix" from 1999 only.
}
Note

Beginning a wildcard search with "*" or "?" is discouraged by Lucene, but will
nevertheless work.
Caution

You can’t have any whitespace in the search term with this syntax. See 第 14.11.3 节
“Querying with Lucene Query objects” for how to do that.

14.8. Relationship indexes


An index for relationships is just like an index for nodes, extended by providing support to
constrain a search to relationships with a specific start and/or end node. These extra methods reside in
the RelationshipIndex interface, which extends Index<Relationship>.

Example of querying a relationship index:

// find relationships filtering on start node
// using exact matches
IndexHits<Relationship> reevesAsNeoHits;
reevesAsNeoHits = roles.get( "name", "Neo", reeves, null );
Relationship reevesAsNeo = reevesAsNeoHits.iterator().next();
reevesAsNeoHits.close();
// find relationships filtering on end node
// using a query
IndexHits<Relationship> matrixNeoHits;
matrixNeoHits = roles.query( "name", "*eo", null, theMatrix );
Relationship matrixNeo = matrixNeoHits.iterator().next();
matrixNeoHits.close();
And here’s an example for the special case of searching for a specific relationship type:

// find relationships filtering on end node
// using a relationship type.
// this is how to add it to the index:
roles.add( reevesAsNeo, "type", reevesAsNeo.getType().name() );
// Note that to use a compound query, we can't combine committed
// and uncommitted index entries, so we'll commit before querying:
tx.success();
tx.finish();
// and now we can search for it:
IndexHits<Relationship> typeHits;
typeHits = roles.query( "type:ACTS_IN AND name:Neo", null, theMatrix );
Relationship typeNeo = typeHits.iterator().next();
typeHits.close();
Such an index can be useful if your domain has nodes with a very large number of
relationships between them, since it reduces the search time for a relationship between two nodes.
A good example where this approach pays dividends is in time series data, where we have
readings represented as a relationship per occurrence.

14.9. Scores
The IndexHits interface exposes scoring so that the index can communicate scores for the hits.
Note that the result is not sorted by the score unless you explicitly specify that. See 第 14.11.2 节
“Sorting” for how to sort by score.

IndexHits<Node> hits = movies.query( "title", "The*" );
for ( Node movie : hits )
{
    System.out.println( movie.getProperty( "title" ) + " " + hits.currentScore() );
}

14.10. Configuration and fulltext indexes


At the time of creation extra configuration can be specified to control the behavior of the
index and which backend to use. For example to create a Lucene fulltext index:

IndexManager index = graphDb.index();
Index<Node> fulltextMovies = index.forNodes( "movies-fulltext",
        MapUtil.stringMap( IndexManager.PROVIDER, "lucene", "type", "fulltext" ) );
fulltextMovies.add( theMatrix, "title", "The Matrix" );
fulltextMovies.add( theMatrixReloaded, "title", "The Matrix Reloaded" );
// search in the fulltext index
Node found = fulltextMovies.query( "title", "reloAdEd" ).getSingle();
Here’s an example of how to create an exact index which is case-insensitive:
Index<Node> index = graphDb.index().forNodes( "exact-case-insensitive",
        stringMap( "type", "exact", "to_lower_case", "true" ) );
Node node = graphDb.createNode();
index.add( node, "name", "Thomas Anderson" );
assertContains( index.query( "name", "\"Thomas Anderson\"" ), node );
assertContains( index.query( "name", "\"thoMas ANDerson\"" ), node );
Tip

In order to search for tokenized words, the query method has to be used. The get
method will only match the full string value, not the tokens.
The configuration of the index is persisted once the index has been created. The provider
configuration key is interpreted by Neo4j, but any other configuration is passed onto the backend
index (e.g. Lucene) to interpret.

Table 14.1. Lucene indexing configuration parameters

Parameter       Possible values                       Effect

type            exact, fulltext                       exact is the default and uses a Lucene keyword
                                                      analyzer. fulltext uses a white-space tokenizer
                                                      in its analyzer.

to_lower_case   true, false                           This parameter goes together with type: fulltext
                                                      and converts values to lower case during both
                                                      additions and querying, making the index case
                                                      insensitive. Defaults to true.

analyzer        the full class name of an Analyzer    Overrides the type so that a custom analyzer can
                                                      be used. Note: to_lower_case still affects
                                                      lowercasing of string queries. If the custom
                                                      analyzer uppercases the indexed tokens, string
                                                      queries will not match as expected.

14.11. Extra features for Lucene indexes


14.11.1. Numeric ranges
14.11.2. Sorting
14.11.3. Querying with Lucene Query objects
14.11.4. Compound queries
14.11.5. Default operator
14.11.6. Caching

14.11.1. Numeric ranges

Lucene supports smart indexing of numbers, querying for ranges and sorting such results, and so
does its backend for Neo4j. To mark a value so that it is indexed as a numeric value, we can make use
of the ValueContext class, like this:

movies.add( theMatrix, "year-numeric",
        new ValueContext( 1999 ).indexNumeric() );
movies.add( theMatrixReloaded, "year-numeric",
        new ValueContext( 2003 ).indexNumeric() );
movies.add( malena, "year-numeric",
        new ValueContext( 2000 ).indexNumeric() );

int from = 1997;
int to = 1999;
hits = movies.query( QueryContext.numericRange( "year-numeric", from, to ) );
Note

The same type must be used for indexing and querying. That is, you can’t index a value
as a Long and then query the index using an Integer.
By giving null as from/to argument, an open ended query is created. In the following example
we are doing that, and have added sorting to the query as well:

hits = movies.query(
        QueryContext.numericRange( "year-numeric", from, null )
                .sortNumeric( "year-numeric", false ) );
From/to in the ranges defaults to be inclusive, but you can change this behavior by using two
extra parameters:

movies.add( theMatrix, "score", new ValueContext( 8.7 ).indexNumeric() );
movies.add( theMatrixReloaded, "score", new ValueContext( 7.1 ).indexNumeric() );
movies.add( malena, "score", new ValueContext( 7.4 ).indexNumeric() );

// include 8.0, exclude 9.0
hits = movies.query( QueryContext.numericRange( "score", 8.0, 9.0, true, false ) );

14.11.2. Sorting

Lucene performs sorting very well, and that is also exposed in the index backend, through the
QueryContext class:
hits = movies.query( "title", new QueryContext( "*" ).sort( "title" ) );
for ( Node hit : hits )
{
    // all movies with a title in the index, ordered by title
}
// or
hits = movies.query( new QueryContext( "title:*" ).sort( "year", "title" ) );
for ( Node hit : hits )
{
    // all movies with a title in the index, ordered by year, then title
}
We sort the results by relevance (score) like this:

hits = movies.query( "title", new QueryContext( "The*" ).sortByScore() );
for ( Node movie : hits )
{
    // hits sorted by relevance (score)
}

14.11.3. Querying with Lucene Query objects

Instead of passing in Lucene query syntax queries, you can instantiate such queries
programmatically and pass in as argument, for example:

// a TermQuery will give exact matches
Node actor = actors.query( new TermQuery(
        new Term( "name", "Keanu Reeves" ) ) ).getSingle();
Note that the TermQuery is basically the same thing as using the get method on the index.

This is how to perform wildcard searches using Lucene Query Objects:

hits = movies.query( new WildcardQuery( new Term( "title", "The Matrix*" ) ) );
for ( Node movie : hits )
{
    System.out.println( movie.getProperty( "title" ) );
}
Note that this allows for whitespace in the search string.

14.11.4. Compound queries

Lucene supports querying for multiple terms in the same query, like so:
hits = movies.query( "title:*Matrix* AND year:1999" );
Caution

Compound queries can’t search across committed index entries and those that haven’t
been committed yet at the same time.

14.11.5. Default operator


The default operator (that is whether AND or OR is used in between different terms) in a query is
OR. Changing that behavior is also done via the QueryContext class:

QueryContext query = new QueryContext( "title:*Matrix* year:1999" )
        .defaultOperator( Operator.AND );
hits = movies.query( query );

14.11.6. Caching

If your index lookups become a performance bottleneck, caching can be enabled for certain
keys in certain indexes (key locations) to speed up get requests. The caching is implemented with an
LRU cache so that only the most recently accessed results are cached (with "results" meaning a query
result of a get request, not a single entity). You can control the size of the cache (the maximum
number of results) per index key.

Index<Node> index = graphDb.index().forNodes( "actors" );
( (LuceneIndex<Node>) index ).setCacheCapacity( "name", 300000 );
Caution

This setting is not persisted after shutting down the database. This means: set this value
after each startup of the database if you want to keep it.

14.12. Automatic Indexing


14.12.1. Configuration
14.12.2. Search
14.12.3. Runtime Configuration
14.12.4. Updating the Automatic Index
Neo4j provides a single index for nodes and one for relationships in each database that
automatically follow property values as they are added, deleted and changed on database primitives.
This functionality is called auto indexing and is controlled both from the database configuration Map
and through its own API.

14.12.1. Configuration

By default Auto Indexing is off for both Nodes and Relationships. To enable it on database
startup set the configuration options Config.NODE_AUTO_INDEXING and
Config.RELATIONSHIP_AUTO_INDEXING to the string "true".

If you just enable auto indexing as above, then still no property will be auto indexed. To define
which property names you want the auto indexer to monitor, set the
Config.{NODE,RELATIONSHIP}_KEYS_INDEXABLE option to a String that is a comma-separated
concatenation of the property names you want auto indexed.

/*
 * Creating the configuration, adding nodeProp1 and nodeProp2 as
 * auto indexed properties for Nodes and relProp1 and relProp2 as
 * auto indexed properties for Relationships. Only those will be
 * indexed. We also have to enable auto indexing for both these
 * primitives explicitly.
 */
GraphDatabaseService graphDb = new GraphDatabaseFactory().
        newEmbeddedDatabaseBuilder( storeDirectory ).
        setConfig( GraphDatabaseSettings.node_keys_indexable, "nodeProp1,nodeProp2" ).
        setConfig( GraphDatabaseSettings.relationship_keys_indexable, "relProp1,relProp2" ).
        setConfig( GraphDatabaseSettings.node_auto_indexing, "true" ).
        setConfig( GraphDatabaseSettings.relationship_auto_indexing, "true" ).
        newGraphDatabase();

Transaction tx = graphDb.beginTx();
Node node1 = null, node2 = null;
Relationship rel = null;
try
{
    // Create the primitives
    node1 = graphDb.createNode();
    node2 = graphDb.createNode();
    rel = node1.createRelationshipTo( node2,
            DynamicRelationshipType.withName( "DYNAMIC" ) );

    // Add indexable and non-indexable properties
    node1.setProperty( "nodeProp1", "nodeProp1Value" );
    node2.setProperty( "nodeProp2", "nodeProp2Value" );
    node1.setProperty( "nonIndexed", "nodeProp2NonIndexedValue" );
    rel.setProperty( "relProp1", "relProp1Value" );
    rel.setProperty( "relPropNonIndexed", "relPropValueNonIndexed" );

    // Make things persistent
    tx.success();
}
catch ( Exception e )
{
    tx.failure();
}
finally
{
    tx.finish();
}

14.12.2. Search

The usefulness of the auto indexing functionality comes of course from the ability to actually
query the index and retrieve results. To that end, you can acquire a ReadableIndex object from the
AutoIndexer that exposes all the query and get methods of a full Index with exactly the same
functionality. Continuing from the previous example, accessing the index is done like this:

// Get the Node auto index
ReadableIndex<Node> autoNodeIndex = graphDb.index()
        .getNodeAutoIndexer()
        .getAutoIndex();
// node1 and node2 both had auto indexed properties, get them
assertEquals( node1,
        autoNodeIndex.get( "nodeProp1", "nodeProp1Value" ).getSingle() );
assertEquals( node2,
        autoNodeIndex.get( "nodeProp2", "nodeProp2Value" ).getSingle() );
// node2 also had a property that should be ignored.
assertFalse( autoNodeIndex.get( "nonIndexed", "nodeProp2NonIndexedValue" ).hasNext() );

// Get the relationship auto index
ReadableIndex<Relationship> autoRelIndex = graphDb.index()
        .getRelationshipAutoIndexer()
        .getAutoIndex();
// One property was set for auto indexing
assertEquals( rel,
        autoRelIndex.get( "relProp1", "relProp1Value" ).getSingle() );
// The rest should be ignored
assertFalse( autoRelIndex.get( "relPropNonIndexed", "relPropValueNonIndexed" ).hasNext() );

14.12.3. Runtime Configuration

The same options that are available during database creation via the configuration can also be set
during runtime via the AutoIndexer API.

Gaining access to the AutoIndexer API and adding two Node properties and one Relationship
property to auto index is done like so:

// Start without any configuration
GraphDatabaseService graphDb = new GraphDatabaseFactory().
        newEmbeddedDatabase( storeDirectory );

// Get the Node AutoIndexer, set nodeProp1 and nodeProp2 as auto
// indexed.
AutoIndexer<Node> nodeAutoIndexer = graphDb.index()
        .getNodeAutoIndexer();
nodeAutoIndexer.startAutoIndexingProperty( "nodeProp1" );
nodeAutoIndexer.startAutoIndexingProperty( "nodeProp2" );

// Get the Relationship AutoIndexer
AutoIndexer<Relationship> relAutoIndexer = graphDb.index()
        .getRelationshipAutoIndexer();
relAutoIndexer.startAutoIndexingProperty( "relProp1" );

// None of the AutoIndexers are enabled so far. Do that now
nodeAutoIndexer.setEnabled( true );
relAutoIndexer.setEnabled( true );
Parameters to the AutoIndexers passed through the Configuration and settings made
through the API are cumulative. So you can set some beforehand known settings, do runtime
checks to augment the initial configuration and then enable the desired auto indexers - the final
configuration is the same regardless of the method used to reach it.

14.12.4. Updating the Automatic Index

Updates to the auto indexed properties happen of course automatically as you update them.
Removal of properties from the auto index happens for two reasons: either you actually
removed the property, or you stopped auto indexing it. When the latter happens, any primitive
you touch that has that property will have it removed from the auto index, regardless of the
operation performed on the property. Starting or stopping auto indexing on a property currently
triggers no automatic update operation. If you need to change the set of auto indexed
properties and have them re-indexed, you currently have to do this by hand. An example will
illustrate the above better:

/*
 * Creating the configuration
 */
GraphDatabaseService graphDb = new GraphDatabaseFactory().
        newEmbeddedDatabaseBuilder( storeDirectory ).
        setConfig( GraphDatabaseSettings.node_keys_indexable, "nodeProp1,nodeProp2" ).
        setConfig( GraphDatabaseSettings.node_auto_indexing, "true" ).
        newGraphDatabase();

Transaction tx = graphDb.beginTx();
Node node1 = null, node2 = null, node3 = null, node4 = null;
try
{
    // Create the primitives
    node1 = graphDb.createNode();
    node2 = graphDb.createNode();
    node3 = graphDb.createNode();
    node4 = graphDb.createNode();

    // Add indexable and non-indexable properties
    node1.setProperty( "nodeProp1", "nodeProp1Value" );
    node2.setProperty( "nodeProp2", "nodeProp2Value" );
    node3.setProperty( "nodeProp1", "nodeProp3Value" );
    node4.setProperty( "nodeProp2", "nodeProp4Value" );

    // Make things persistent
    tx.success();
}
catch ( Exception e )
{
    tx.failure();
}
finally
{
    tx.finish();
}

/*
 * Here both nodes are indexed. To demonstrate removal, we stop
 * autoindexing nodeProp1.
 */
AutoIndexer<Node> nodeAutoIndexer = graphDb.index().getNodeAutoIndexer();
nodeAutoIndexer.stopAutoIndexingProperty( "nodeProp1" );

tx = graphDb.beginTx();
try
{
    /*
     * nodeProp1 is no longer auto indexed. It will be
     * removed regardless. Note that node3 will remain.
     */
    node1.setProperty( "nodeProp1", "nodeProp1Value2" );
    /*
     * node2 will be auto updated
     */
    node2.setProperty( "nodeProp2", "nodeProp2Value2" );
    /*
     * remove node4 property nodeProp2 from index.
     */
    node4.removeProperty( "nodeProp2" );
    // Make things persistent
    tx.success();
}
catch ( Exception e )
{
    tx.failure();
}
finally
{
    tx.finish();
}

// Verify
ReadableIndex<Node> nodeAutoIndex = nodeAutoIndexer.getAutoIndex();
// node1 is completely gone
assertFalse( nodeAutoIndex.get( "nodeProp1", "nodeProp1Value" ).hasNext() );
assertFalse( nodeAutoIndex.get( "nodeProp1", "nodeProp1Value2" ).hasNext() );
// node2 is updated
assertFalse( nodeAutoIndex.get( "nodeProp2", "nodeProp2Value" ).hasNext() );
assertEquals( node2,
        nodeAutoIndex.get( "nodeProp2", "nodeProp2Value2" ).getSingle() );
/*
 * node3 is still there, despite its nodeProp1 property not being monitored
 * any more, because it was not touched, in contrast with node1.
 */
assertEquals( node3,
        nodeAutoIndex.get( "nodeProp1", "nodeProp3Value" ).getSingle() );
// Finally, node4 is removed because the property was removed.
assertFalse( nodeAutoIndex.get( "nodeProp2", "nodeProp4Value" ).hasNext() );
小心

If you start the database with auto indexing enabled but different auto indexed properties
than the last run, then already auto-indexed entities will be deleted as you work with them.
Make sure that the monitored set is what you want before enabling the functionality.

第 15 章 Cypher 查询语言

目录

15.1. 操作符
15.2. 表达式
15.3. 参数
15.4. 标识符
15.5. 备注
15.6. 更新图数据库
15.7. 事务
15.8. Patterns
Cypher is a declarative graph query language that allows for expressive and efficient querying
and updating of the graph store without having to write traversals through the graph structure in code.
Cypher is still growing and maturing, and that means that there probably will be breaking syntax
changes. It also means that it has not undergone the same rigorous performance testing as other Neo4j
components.

Cypher is designed to be a humane query language, suitable for both developers and
(importantly, we think) operations professionals who want to make ad-hoc queries on the
database. Our guiding goal is to make the simple things simple, and the complex things possible.
Its constructs are based on English prose and neat iconography, which helps to make it
(somewhat) self-explanatory.

Cypher is inspired by a number of different approaches and builds upon established practices for
expressive querying. Most of the keywords like WHERE and ORDER BY are inspired by SQL. Pattern
matching borrows expression approaches from SPARQL.

Being a declarative language, Cypher focuses on the clarity of expressing what to retrieve from a
graph, not how to do it, in contrast to imperative languages like Java, and scripting languages like

Gremlin (supported via the 第 18.18 节 “Gremlin Plugin”) and the JRuby Neo4j bindings. This
makes the concern of how to optimize queries an implementation detail not exposed to the user.

The query language is comprised of several distinct clauses.

• START: Starting points in the graph, obtained via index lookups or by element IDs.

• MATCH: The graph pattern to match, bound to the starting points in START.

• WHERE: Filtering criteria.

• RETURN: What to return.

• CREATE: Creates nodes and relationships.

• DELETE: Removes nodes, relationships and properties.

• SET: Set values to properties.

• FOREACH: Performs updating actions once per element in a list.

• WITH: Divides a query into multiple, distinct parts.

Let’s see three of them in action.

Imagine an example graph like the following one:

图 15.1. Example Graph

For example, here is a query which finds a user called John in an index and then traverses
the graph looking for friends of John's friends (though not his direct friends) before returning
both John and any friends-of-friends that are found.

START john=node:node_auto_index(name = 'John')
MATCH john-[:friend]->()-[:friend]->fof
RETURN john, fof

Resulting in:

john                 | fof
Node[4]{name:"John"} | Node[2]{name:"Maria"}
Node[4]{name:"John"} | Node[3]{name:"Steve"}
2 rows, 3 ms
Next up we will add filtering to set more parts in motion:

In this next example, we take a list of users (by node ID) and traverse the graph looking for those
other users that have an outgoing friend relationship, returning only those followed users who have
a name property starting with S.

START user=node(5,4,1,2,3)
MATCH user-[:friend]->follower
WHERE follower.name =~ 'S.*'
RETURN user, follower.name

Resulting in:

user                 | follower.name
Node[5]{name:"Joe"}  | "Steve"
Node[4]{name:"John"} | "Sara"
2 rows, 1 ms

To use Cypher from Java, see 第 4.10 节 “在 Java 中执行 Cypher 查询”. For more Cypher
examples, see 第 7 章 数据模型范例 as well.
15.1. 操作符
Operators in Cypher are of three different varieties — mathematical, equality and relationships.

The mathematical operators are +, -, *, / and %. Of these, only the plus-sign works on strings
and collections.

The equality operators are =, <>, <, >, <=, >=.

Since Neo4j is a schema-free graph database, Cypher has two special operators — ? and !.

They are used on properties, to deal with missing values. A comparison on a
property that does not exist would normally cause an error. Instead of having to always check if
the property exists before comparing its value with something else, the question mark makes the
comparison always return true if the property is missing, and the exclamation mark makes the
comparison return false.

This predicate will evaluate to true if n.prop is missing.

WHERE n.prop? = "foo"

This predicate will evaluate to false if n.prop is missing.

WHERE n.prop! = "foo"

警告

Mixing the two in the same comparison will lead to unpredictable


results.
This is really syntactic sugar that expands to this:

WHERE n.prop? = "foo" ⇒ WHERE (not(has(n.prop)) OR n.prop = "foo")

WHERE n.prop! = "foo" ⇒ WHERE (has(n.prop) AND n.prop = "foo")

15.2. 表达式
15.2.1. Note on string literals
An expression in Cypher can be:

• A numeric literal (integer or double): 13, 40000, 3.14.

• A string literal: "Hello", 'World'.

• A boolean literal: true, false, TRUE, FALSE.

• An identifier: n, x, rel, myFancyIdentifier, `A name with weird stuff in it[]!`.

• A property: n.prop, x.prop, rel.thisProperty, myFancyIdentifier.`(weird property name)`.

• A nullable property: it’s a property, with a question mark or exclamation mark — n.prop?, rel.thisProperty!.

• A parameter: {param}, {0}

• A collection of expressions: ["a", "b"], [1,2,3], ["a", 2, n.property, {param}], [ ].

• A function call: length(p), nodes(p).

• An aggregate function: avg(x.prop), count(*).

• Relationship types: :REL_TYPE, :`REL TYPE`, :REL1|REL2.

• A path-pattern: a-->()<--b.
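Several of these expression kinds can be combined in one query. The following sketch (the node id and the name property are hypothetical) returns a property, a function call on a path, and a collection literal:

```cypher
START n=node(1)
MATCH p = n-->()
RETURN n.name, length(p), ["a", 2, n.name]
```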

15.2.1. Note on string literals

String literals can contain these escape sequences.

Escape sequence | Character
\t              | Tab
\b              | Backspace
\n              | Newline
\r              | Carriage return
\f              | Form feed
\'              | Single quote
\"              | Double quote
\\              | Backslash
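For instance, a small sketch (the node id and value are hypothetical) of an escaped single quote inside a string literal:

```cypher
START n=node(1)
WHERE n.name = 'Emil\'s'
RETURN n
```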

15.3. 参数
Cypher supports querying with parameters. This allows developers to avoid building query
strings by hand, and it also makes caching of execution plans much easier for
Cypher.

Parameters can be used for literals and expressions in the WHERE clause, for the index key and
index value in the START clause, for index queries, and finally for node/relationship ids. Parameters
can not be used for property names, since property notation is part of the query structure that is
compiled into a query plan.

Accepted names for parameters are letters and numbers, and any combination of these.

Here follow a few examples of how you can use parameters from Java.

Parameter for node id.

Map<String, Object> params = new HashMap<String, Object>();
params.put( "id", 0 );
ExecutionResult result = engine.execute( "start n=node({id}) return n.name", params );
Parameter for node object.

Map<String, Object> params = new HashMap<String, Object>();
params.put( "node", andreasNode );
ExecutionResult result = engine.execute( "start n=node({node}) return n.name", params );
Parameter for multiple node ids.

Map<String, Object> params = new HashMap<String, Object>();
params.put( "id", Arrays.asList( 0, 1, 2 ) );
ExecutionResult result = engine.execute( "start n=node({id}) return n.name", params );
Parameter for string literal.

Map<String, Object> params = new HashMap<String, Object>();
params.put( "name", "Johan" );
ExecutionResult result =
    engine.execute( "start n=node(0,1,2) where n.name = {name} return n", params );
Parameter for index key and value.

Map<String, Object> params = new HashMap<String, Object>();
params.put( "key", "name" );
params.put( "value", "Michaela" );
ExecutionResult result =
    engine.execute( "start n=node:people({key} = {value}) return n", params );
Parameter for index query.

Map<String, Object> params = new HashMap<String, Object>();
params.put( "query", "name:Andreas" );
ExecutionResult result =
    engine.execute( "start n=node:people({query}) return n", params );
Numeric parameters for SKIP and LIMIT.

Map<String, Object> params = new HashMap<String, Object>();
params.put( "s", 1 );
params.put( "l", 1 );
ExecutionResult result =
    engine.execute( "start n=node(0,1,2) return n.name skip {s} limit {l}", params );
Parameter for regular expression.

Map<String, Object> params = new HashMap<String, Object>();
params.put( "regex", ".*h.*" );
ExecutionResult result =
    engine.execute( "start n=node(0,1,2) where n.name =~ {regex} return n.name", params );

15.4. 标识符
When you reference parts of the pattern, you do so by naming them. The names you give
the different parts are called identifiers.

In this example:

START n=node(1) MATCH n-->b
RETURN b

The identifiers are n and b.

Identifier names are case sensitive, and can contain underscores and alphanumeric characters
(a-z, 0-9), but must start with a letter. If other characters are needed, you can quote the identifier
using backquote (`) signs.

The same rules apply to property names.
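As a sketch (the node id and property are hypothetical), an identifier containing a space must be backquoted everywhere it appears:

```cypher
START `my node`=node(1)
RETURN `my node`.name
```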

15.5. 备注
To add comments to your queries, use double slash. Examples:

START n=node(1) RETURN b //This is an end of line comment

START n=node(1)
//This is a whole line comment
RETURN b

START n=node(1) WHERE n.property = "//This is NOT a comment" RETURN b

15.6. 更新图数据库
15.6.1. The Structure of Updating Queries
15.6.2. Returning data
Cypher can be used for both querying and updating your graph.

15.6.1. The Structure of Updating Queries

Quick info
• A Cypher query part can’t both match and update the graph at the same time.
• Every part can either read and match on the graph, or make updates on it.

If you read from the graph, and then update the graph, your query implicitly has two
parts — the reading is the first part, and the writing is the second. If your query is read-only, Cypher
will be lazy, and not actually pattern match until you ask for the results. Here, the semantics are that
all the reading will be done before any writing actually happens. This is very important — without
this it’s easy to find cases where the pattern matcher runs into data that is being created by the very
same query, and all bets are off. That road leads to Heisenbugs, Brownian motion and cats that are
dead and alive at the same time.

First reading, and then writing, is the only pattern where the query parts are implicit — any
other order and you have to be explicit about your query parts. The parts are separated using the WITH
statement. WITH is like the event horizon — it’s a barrier between a plan and the finished execution
of that plan.

When you want to filter using aggregated data, you have to chain together two reading query
parts — the first one does the aggregating, and the second query filters on the results coming from
the first one.

START n=node(...)
MATCH n-[:friend]-friend
WITH n, count(friend) as friendsCount
WHERE friendsCount > 3
RETURN n, friendsCount
Using WITH, you specify how you want the aggregation to happen, and that the aggregation has
to be finished before Cypher can start filtering.

You can chain together as many query parts as you have JVM heap for.

15.6.2. Returning data


Any query can return data. If your query only reads, it has to return data — it serves no purpose
if it doesn’t, and it is not a valid Cypher query. Queries that update the graph don’t have to return
anything, but they can.

After all the parts of the query comes one final RETURN statement. RETURN is not part of any
query part — it is a period symbol after an eloquent statement. When RETURN is legal, it’s also legal
to use SKIP/LIMIT and ORDER BY.

If you return graph elements from a query that has just deleted them — beware, you are holding
a pointer that is no longer valid. Operations on that node might fail mysteriously and unpredictably.

15.7. 事务
Any query that updates the graph will run in a transaction. An updating query will always
either fully succeed, or not succeed at all.

Cypher will either create a new transaction and commit it once the query finishes, or, if a
transaction already exists in the running context, run the query inside it; nothing will be
persisted to disk until that transaction is successfully committed.

This can be used to have multiple queries be committed as a single transaction:

1. Open a transaction,
2. run multiple updating Cypher queries,
3. and commit all of them in one go.
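Using the embedded Java API that appears elsewhere in this manual, the three steps above can be sketched as follows. This is a sketch only: it assumes an already opened GraphDatabaseService (graphDb) and Cypher ExecutionEngine (engine), and the two queries are placeholders.

```java
// Sketch only: graphDb and engine are assumed to exist already.
Transaction tx = graphDb.beginTx();        // 1. open a transaction
try
{
    // 2. run multiple updating Cypher queries
    engine.execute( "start n=node(0) set n.name = 'first'" );
    engine.execute( "start n=node(0) set n.visited = true" );
    tx.success();                          // mark the transaction as successful
}
finally
{
    tx.finish();                           // 3. commit all of them in one go
}
```

If tx.success() is never called (for example because a query throws), finishing the transaction rolls back both updates together.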

Note that a query will hold the changes in heap until the whole query has finished executing.
A large query will consequently need a JVM with lots of heap space.

15.8. Patterns
Patterns are at the very core of Cypher, and are used in a lot of different places. Using patterns,
you describe the shape of the data that you are looking for. Patterns are used in the MATCH clause.
Path patterns are expressions. Since these expressions are collections, they can also be used as
predicates (a non-empty collection signifies true). They are also used to CREATE/CREATE UNIQUE
the graph.

So, understanding patterns is important for being effective with Cypher.

You describe the pattern, and Cypher will figure out how to get that data for you. The idea
is for you to draw your query on a whiteboard, naming the interesting parts of the pattern, so you
can then use values from these parts to create the result set you are looking for.

Patterns have bound points, or starting points. They are the parts of the pattern that are already
“bound” to a set of graph nodes or relationships. All parts of the pattern must be directly or indirectly
connected to a starting point — a pattern where parts of the pattern are not reachable from any
starting point will be rejected.

Clause        | Optional | Multiple rel. types | Varlength | Paths | Maps
Match         | Yes      | Yes                 | Yes       | Yes   | -
Create        | -        | -                   | -         | Yes   | Yes
Create Unique | -        | -                   | -         | Yes   | Yes
Expressions   | -        | Yes                 | Yes       | -     | -

15.8.1. Patterns for related nodes

The description of the pattern is made up of one or more paths, separated by commas. A
path is a sequence of nodes and relationships that always start and end in nodes. An example
path would be:

(a)-->(b)

This is a path starting from the pattern node a, with an outgoing relationship from it to pattern
node b.

Paths can be of arbitrary length, and the same node may appear in multiple places in the
path.

Node identifiers can be used with or without surrounding parenthesis. The following match is
semantically identical to the one we saw above — the difference is purely aesthetic.

a-->b

If you don’t care about a node, you don’t need to name it. Empty parenthesis are used for
these nodes, like so:

a-->()<--b

15.8.2. Working with relationships


If you need to work with the relationship between two nodes, you can name it.

a-[r]->b

If you don’t care about the direction of the relationship, you can omit the arrow at either
end of the relationship, like this:

a--b

Relationships have types. When you are only interested in a specific relationship type, you
can specify this like so:

a-[:REL_TYPE]->b

If multiple relationship types are acceptable, you can list them, separating them with the pipe
symbol | like this:

a-[r:TYPE1|TYPE2]->b

This pattern matches a relationship of type TYPE1 or TYPE2, going from a to b. The
relationship is named r. Multiple relationship types can not be used with CREATE or CREATE
UNIQUE.

15.8.3. Optional relationships


An optional relationship is matched when it is found, but replaced by a null otherwise.
Normally, if no matching relationship is found, that sub-graph is not matched. Optional relationships
could be called the Cypher equivalent of the outer join in SQL.

They can only be used in MATCH.

Optional relationships are marked with a question mark. They allow you to write queries
like this one:

Query

START me=node(*)
MATCH me-->friend-[?]->friend_of_friend
RETURN friend, friend_of_friend
The query above says “for every person, give me all their friends, and their friends' friends,
if they have any.”

Optionality is transitive — if a part of the pattern can only be reached from a bound point
through an optional relationship, that part is also optional. In the pattern above, the only bound point
in the pattern is me. Since the relationship between friend and friend_of_friend is optional,
friend_of_friend is an optional part of the graph.

Also, named paths that contain optional parts are also optional — if any part of the path is
null, the whole path is null.

In the following examples, b and p are all optional and can contain null:

Query

START a=node(4)
MATCH p = a-[?]->b
RETURN b

Query

START a=node(4)
MATCH p = a-[?*]->b
RETURN b

Query

START a=node(4)
MATCH p = a-[?]->x-->b
RETURN b

Query

START a=node(4), x=node(3)
MATCH p = shortestPath( a-[?*]->x )
RETURN p

15.8.4. Controlling depth

A pattern relationship can span multiple graph relationships. These are called variable length
relationships, and are marked as such using an asterisk (*):

(a)-[*]->(b)

This signifies a path starting on the pattern node a, following only outgoing relationships, until it
reaches pattern node b. Any number of relationships can be followed searching for a path to b, so this
can be a very expensive query, depending on what your graph looks like.

You can set a minimum number of steps that can be taken, and/or the maximum number of
steps:

(a)-[*3..5]->(b)

This is a variable length relationship containing at least three graph relationships, and at
most five.

Variable length relationships can not be used with CREATE and CREATE UNIQUE.

As a simple example, let’s take the query below:

Query

START me=node(3)
MATCH me-[:KNOWS*1..2]-remote_friend
RETURN remote_friend

remote_friend
Node[1]{name:"Dilshad"}
Node[4]{name:"Anders"}
2 rows, 0 ms

This query starts from one node, follows KNOWS relationships one or two steps out, and
then stops.

15.8.5. Assigning to path identifiers

In a graph database, a path is a very important concept. A path is a collection of nodes and
relationships, that describe a path in the graph. To assign a path to a path identifier, you simply
assign a path pattern to an identifier, like so:

p = (a)-[*3..5]->(b)
You can do this in MATCH, CREATE and CREATE UNIQUE, but not when using patterns as
expressions. Example of the three in a single query:

Query

START me=node(3)
MATCH p1 = me-[*2]-friendOfFriend
CREATE p2 = me-[:MARRIED_TO]-(wife {name:"Gunhild"})
CREATE UNIQUE p3 = wife-[:KNOWS]-friendOfFriend
RETURN p1,p2,p3

15.8.6. Setting properties

Nodes and relationships are important, but Neo4j uses properties on both of these to allow
for far denser graph models.

Properties are expressed in patterns using the map-construct, which is simply curly brackets
surrounding a number of key-expression pairs, separated by commas, e.g. { name: "Andres",
sport: "BJJ" }. If the map is supplied through a parameter, the normal parameter expression is
used: { paramName }.

Maps are only used by CREATE and CREATE UNIQUE. In CREATE they are used to set the
properties on the newly created nodes and relationships.

When used with CREATE UNIQUE, they are used to try to match a pattern element against the
corresponding graph element. The match is successful if the properties on the pattern element can be
matched exactly against properties on the graph element. The graph element can have additional
properties, which do not affect the match. If Neo4j fails to find a matching graph element, the map
is used to set the properties on the newly created element.
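As a sketch in the style of the earlier examples (the node id, relationship type and property value are hypothetical), a map setting properties on a node created inside a CREATE pattern:

```cypher
START me=node(3)
CREATE me-[:PLAYS]->(sport {name: "BJJ"})
RETURN sport
```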

15.8.7. Start

Every query describes a pattern, and in that pattern one can have multiple starting points. A
starting point is a relationship or a node where a pattern is anchored. You can either introduce
starting points by id, or by index lookups. Note that trying to use an index that doesn’t exist will
throw an exception.

Graph

Node by id
Binding a node as a starting point is done with the node(*) function.

Query

START n=node(1)
RETURN n

The corresponding node is returned.

n
Node[1]{name:"A"}
1 row, 0 ms

Relationship by id
Binding a relationship as a starting point is done with the relationship(*) function, which
can also be abbreviated rel(*).

Query

START r=relationship(0)
RETURN r

The relationship with id 0 is returned.

r
:KNOWS[0] {}
1 row, 0 ms

Multiple nodes by id
Multiple nodes are selected by listing them separated by commas.

Query

START n=node(1, 2, 3)
RETURN n

This returns the nodes listed in the START statement.

n
Node[1]{name:"A"}
Node[2]{name:"B"}
Node[3]{name:"C"}
3 rows, 0 ms

All nodes
To get all the nodes, use an asterisk. This can be done with relationships as well.

Query

START n=node(*)
RETURN n

This query returns all the nodes in the graph.

n
Node[1]{name:"A"}
Node[2]{name:"B"}
Node[3]{name:"C"}
3 rows, 0 ms

Node by index lookup


When the starting point can be found by using index lookups, it can be done like this:
node:index-name(key = "value"). In this example, there exists a node index named nodes.

Query

START n=node:nodes(name = "A")
RETURN n

The query returns the node indexed with the name "A".

n
Node[1]{name:"A"}
1 row, 0 ms

Relationship by index lookup


When the starting point can be found by using index lookups, it can be done like this:
relationship:index-name(key = "value").

Query

START r=relationship:rels(name = "Andrés")
RETURN r

The relationship indexed with the name property set to "Andrés" is returned by the query.

r
:KNOWS[0] {name:"Andrés"}
1 row, 0 ms
Node by index query


When the starting point can be found by more complex Lucene queries, this is the syntax to use:
node:index-name("query"). This allows you to write more advanced index queries.

Query

START n=node:nodes("name:A")
RETURN n

The node indexed with name "A" is returned by the query.

n
Node[1]{name:"A"}
1 row, 0 ms

Multiple starting points


Sometimes you want to bind multiple starting points. Just list them separated by commas.

Query

START a=node(1), b=node(2)
RETURN a,b

Both the nodes A and B are returned.

a                 | b
Node[1]{name:"A"} | Node[2]{name:"B"}
1 row, 0 ms

15.8.8. Match

Introduction
提示

In the MATCH clause, patterns are used a lot. Read 第 15.8 节 “Patterns” for an
introduction.

The following graph is used for the examples below:

Graph

Related nodes
The symbol -- means related to, without regard to type or direction.

Query

START n=node(3)
MATCH (n)--(x)
RETURN x

All nodes related to A (Anders) are returned by the query.

x
Node[4]{name:"Bossman"}
Node[1]{name:"David"}
Node[5]{name:"Cesar"}
3 rows, 0 ms

Outgoing relationships
When the direction of a relationship is interesting, it is shown by using --> or <--, like this:

Query

START n=node(3)
MATCH (n)-->(x)
RETURN x

All nodes that A has outgoing relationships to are returned.

x
Node[4]{name:"Bossman"}
Node[5]{name:"Cesar"}
2 rows, 0 ms

Directed relationships and identifier
If an identifier is needed, either for filtering on properties of the relationship, or to return
the relationship, this is how you introduce the identifier.

Query

START n=node(3)
MATCH (n)-[r]->()
RETURN r

The query returns all outgoing relationships from node A.

r
:KNOWS[0] {}
:BLOCKS[1] {}
2 rows, 0 ms

Match by relationship type


When you know the relationship type you want to match on, you can specify it by using a
colon together with the relationship type.

Query

START n=node(3)
MATCH (n)-[:BLOCKS]->(x)
RETURN x

All nodes that are BLOCKed by A are returned by this query.

Result

x
1 row

Node[5]{name:"Cesar"}

Match by multiple relationship types


To match on one of multiple types, you can specify this by chaining them together with the pipe
symbol |.

Query

START n=node(3)
MATCH (n)-[:BLOCKS|KNOWS]->(x)
RETURN x

All nodes with a BLOCKS or KNOWS relationship to A are returned.

Result

x
2 rows

Node[5]{name:"Cesar"}
Node[4]{name:"Bossman"}

Match by relationship type and use an identifier


If you both want to introduce an identifier to hold the relationship, and specify the
relationship type you want, just add them both, like this.

Query

START n=node(3)
MATCH (n)-[r:BLOCKS]->()
RETURN r

All BLOCKS relationships going out from A are returned.

Result

r
1 row

:BLOCKS[1] {}

Relationship types with uncommon characters


Sometimes your database will have types with non-letter characters, or with spaces in them. Use ` (backtick) to quote these.

Query

START n=node(3)
MATCH (n)-[r:`TYPE THAT HAS SPACE IN IT`]->()
RETURN r

This query returns a relationship of a type with spaces in it.

Result

r
1 row

:TYPE THAT HAS SPACE IN IT[6] {}

Multiple relationships
Relationships can be expressed by using multiple statements in the form of ()--(), or they can
be strung together, like this:

Query

START a=node(3)
MATCH (a)-[:KNOWS]->(b)-[:KNOWS]->(c)
RETURN a,b,c

The three nodes in the path are returned by the query.

Result

a | b | c
1 row

Node[3]{name:"Anders"} | Node[4]{name:"Bossman"} | Node[2]{name:"Emil"}

Variable length relationships


Nodes that are a variable number of relationship→node hops away can be found using the
following syntax: -[:TYPE*minHops..maxHops]->. minHops and maxHops are optional and
default to 1 and infinity respectively. When no bounds are given the dots may be omitted.

Query

START a=node(3), x=node(2, 4)
MATCH a-[:KNOWS*1..3]->x
RETURN a,x

This query returns the start and end point, if there is a path between 1 and 3 relationships away.

Result

a | x
2 rows

Node[3]{name:"Anders"} | Node[2]{name:"Emil"}
Node[3]{name:"Anders"} | Node[4]{name:"Bossman"}

Relationship identifier in variable length relationships


When the connection between two nodes is of variable length, a relationship identifier becomes a collection of relationships.

Query

START a=node(3), x=node(2, 4)
MATCH a-[r:KNOWS*1..3]->x
RETURN r

The query returns the relationships, if there is a path between 1 and 3 relationships away.

Result

r
2 rows

[:KNOWS[0] {},:KNOWS[3] {}]
[:KNOWS[0] {}]

Zero length paths


Using variable length paths that have the lower bound zero means that two identifiers can point to the same node. If the distance between two nodes is zero, they are by definition the same node.

Query

START a=node(3)
MATCH p1=a-[:KNOWS*0..1]->b, p2=b-[:BLOCKS*0..1]->c
RETURN a,b,c, length(p1), length(p2)

This query will return four paths, some of which have length zero.

Result

a | b | c | length(p1) | length(p2)
4 rows

Node[3]{name:"Anders"} | Node[3]{name:"Anders"} | Node[3]{name:"Anders"} | 0 | 0
Node[3]{name:"Anders"} | Node[3]{name:"Anders"} | Node[5]{name:"Cesar"} | 0 | 1
Node[3]{name:"Anders"} | Node[4]{name:"Bossman"} | Node[4]{name:"Bossman"} | 1 | 0
Node[3]{name:"Anders"} | Node[4]{name:"Bossman"} | Node[1]{name:"David"} | 1 | 1

Optional relationship
If a relationship is optional, it can be marked with a question mark. This is similar to how a SQL outer join works. If the relationship is there, it is returned. If it's not, null is returned in its place. Remember that anything hanging off an optional relationship is in turn optional, unless it is connected with a bound node through some other path.

Query

START a=node(2)
MATCH a-[?]->x
RETURN a,x

A node, and null are returned, since the node has no outgoing relationships.

Result

a | x
1 row

Node[2]{name:"Emil"} | <null>

Optional typed and named relationship


Just as with a normal relationship, you can decide which identifier it goes into, and what relationship type you need.

Query

START a=node(3)
MATCH a-[r?:LOVES]->()
RETURN a,r

This returns a node, and null, since the node has no outgoing LOVES relationships.

Result

a | r
1 row

Node[3]{name:"Anders"} | <null>

Properties on optional elements


Returning a property from an optional element that is null will also return null.

Query

START a=node(2)
MATCH a-[?]->x
RETURN x, x.name

This returns the element x (null in this query), and null as its name.

Result

x | x.name
1 row

<null> | <null>

Complex matching
Using Cypher, you can also express more complex patterns to match on, like a diamond shape pattern.

Query

START a=node(3)
MATCH (a)-[:KNOWS]->(b)-[:KNOWS]->(c), (a)-[:BLOCKS]-(d)-[:KNOWS]-(c)
RETURN a,b,c,d

This returns the four nodes in the paths.

Result

a | b | c | d
1 row

Node[3]{name:"Anders"} | Node[4]{name:"Bossman"} | Node[2]{name:"Emil"} | Node[5]{name:"Cesar"}

Shortest path
Finding a single shortest path between two nodes is as easy as using the shortestPath
function. It’s done like this:

Query

START d=node(1), e=node(2)
MATCH p = shortestPath( d-[*..15]->e )
RETURN p

This means: find a single shortest path between two nodes, as long as the path is max 15 relationships long. Inside of the parenthesis you define a single link of a path — the starting node, the connecting relationship and the end node. Characteristics describing the relationship like relationship type, max hops and direction are all used when finding the shortest path. You can also mark the path as optional.

Result

p
1 row

[Node[1]{name:"David"},:KNOWS[2] {},Node[3]{name:"Anders"},:KNOWS[0] {},Node[4]{name:"Bossman"},:KNOWS[3] {},Node[2]{name:"Emil"}]

All shortest paths


Finds all the shortest paths between two nodes.

Query

START d=node(1), e=node(2)
MATCH p = allShortestPaths( d-[*..15]->e )
RETURN p

This example will find the two directed paths between David and Emil.

Result

p
2 rows

[Node[1]{name:"David"},:KNOWS[2] {},Node[3]{name:"Anders"},:KNOWS[0] {},Node[4]{name:"Bossman"},:KNOWS[3] {},Node[2]{name:"Emil"}]
[Node[1]{name:"David"},:KNOWS[2] {},Node[3]{name:"Anders"},:BLOCKS[1] {},Node[5]{name:"Cesar"},:KNOWS[4] {},Node[2]{name:"Emil"}]

Named path
If you want to return or filter on a path in your pattern graph, you can introduce a named path.

Query

START a=node(3)
MATCH p = a-->b
RETURN p

This returns the two paths starting from the first node.

Result

p
2 rows

[Node[3]{name:"Anders"},:KNOWS[0] {},Node[4]{name:"Bossman"}]
[Node[3]{name:"Anders"},:BLOCKS[1] {},Node[5]{name:"Cesar"}]

Matching on a bound relationship


When your pattern contains a bound relationship, and that relationship pattern doesn't specify direction, Cypher will try to match the relationship where the connected nodes switch sides.

Query

START r=rel(0)
MATCH a-[r]-b
RETURN a,b

This returns the two connected nodes, once as the start node, and once as the end node.

Result

a | b
2 rows

Node[3]{name:"Anders"} | Node[4]{name:"Bossman"}
Node[4]{name:"Bossman"} | Node[3]{name:"Anders"}

Match with OR
Strictly speaking, you can't do OR in your MATCH. It's still possible to form a query that works a lot like OR.

Query

START a=node(3), b=node(2)
MATCH a-[?:KNOWS]-x-[?:KNOWS]-b
RETURN x

This query is saying: give me the nodes that are connected to a, or b, or both.

Result

x
3 rows

Node[4]{name:"Bossman"}
Node[5]{name:"Cesar"}
Node[1]{name:"David"}

15.8.9. Where

If you need filtering apart from the pattern of the data that you are looking for, you can add
clauses in the WHERE part of the query.

Graph

Boolean operations
You can use the expected boolean operators AND and OR, and also the boolean function NOT().

Query

START n=node(3, 1)
WHERE (n.age < 30 and n.name = "Tobias") or not(n.name = "Tobias")
RETURN n

This will return both nodes in the start clause.

Result

n
2 rows

Node[3]{name:"Andres",age:36,belt:"white"}
Node[1]{name:"Tobias",age:25}

Filter on node property
To filter on a property, write your clause after the WHERE keyword. Filtering on relationship properties works just the same way.

Query

START n=node(3, 1)
WHERE n.age < 30
RETURN n

The "Tobias" node will be returned.

Result

n
1 row

Node[1]{name:"Tobias",age:25}

Regular expressions
You can match on regular expressions by using =~ "regexp", like this:

Query

START n=node(3, 1)
WHERE n.name =~ 'Tob.*'
RETURN n

The "Tobias" node will be returned.

Result

n
1 row

Node[1]{name:"Tobias",age:25}

Escaping in regular expressions


If you need a forward slash inside of your regular expression, escape it. Remember that the backslash needs to be escaped in string literals.

Query

START n=node(3, 1)
WHERE n.name =~ 'Some\\/thing'
RETURN n

No nodes match this regular expression.

Result

n
0 rows

(empty result)

Case insensitive regular expressions


By pre-pending a regular expression with (?i), the whole expression becomes case insensitive.

Query

START n=node(3, 1)
WHERE n.name =~ '(?i)ANDR.*'
RETURN n

The node with name "Andres" is returned.

Result

n
1 row

Node[3]{name:"Andres",age:36,belt:"white"}

Filtering on relationship type


You can put the exact relationship type in the MATCH pattern, but sometimes you want to be able to do more advanced filtering on the type. You can use the type() function to compare the type with something else. In this example, the query does a regular expression comparison with the name of the relationship type.

Query

START n=node(3)
MATCH (n)-[r]->()
WHERE type(r) =~ 'K.*'
RETURN r

This returns relationships that have a type whose name starts with K.

Result

r
2 rows

:KNOWS[0] {}
:KNOWS[1] {}

Property exists
To only include nodes/relationships that have a property, use the HAS() function and just write out the identifier and the property you expect it to have.

Query

START n=node(3, 1)
WHERE has(n.belt)
RETURN n

The node named "Andres" is returned.

Result

n
1 row

Node[3]{name:"Andres",age:36,belt:"white"}

Default true if property is missing


If you want to compare a property on a graph element, but only if it exists, use the nullable property syntax. You can use a question mark if you want a missing property to return true, like this:

Query

START n=node(3, 1)
WHERE n.belt? = 'white'
RETURN n

This returns all nodes, even those without the belt property.

Result

n
2 rows

Node[3]{name:"Andres",age:36,belt:"white"}
Node[1]{name:"Tobias",age:25}

Default false if property is missing


When you need a missing property to evaluate to false, use the exclamation mark.

Query

START n=node(3, 1)
WHERE n.belt! = 'white'
RETURN n

Only nodes with the belt property are returned.

Result

n
1 row

Node[3]{name:"Andres",age:36,belt:"white"}

Filter on null values


Sometimes you might want to test if a value or an identifier is null. This is done just like SQL does it, with IS NULL. Also like SQL, the negative is IS NOT NULL, although NOT(x IS NULL) also works.

Query

START a=node(1), b=node(3, 2)
MATCH a<-[r?]-b
WHERE r is null
RETURN b

Nodes that Tobias is not connected to are returned.

Result

b
1 row

Node[2]{name:"Peter",age:34}
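The positive form works the same way. A hedged sketch using the same start points (an illustration, not an example from the original manual):

```cypher
START a=node(1), b=node(3, 2)
MATCH a<-[r?]-b
WHERE r IS NOT NULL
RETURN b
```

This keeps only the nodes that actually have an outgoing relationship to the "Tobias" node.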

Filter on patterns
Patterns are expressions in Cypher, expressions that return a collection of paths. Collection expressions are also predicates — an empty collection represents false, and a non-empty represents true.

So, patterns are not only expressions, they are also predicates. The only limitation to your pattern is that you must be able to express it in a single path. You can not use commas between multiple paths like you do in MATCH. You can achieve the same effect by combining multiple patterns with AND.

Note that you can not introduce new identifiers here. Although it might look very similar to the MATCH patterns, the WHERE clause is all about eliminating matched subgraphs. MATCH a-[*]->b is very different from WHERE a-[*]->b; the first will produce a subgraph for every path it can find between a and b, and the latter will eliminate any matched subgraphs where a and b do not have a directed relationship chain between them.

Query

START tobias=node(1), others=node(3, 2)
WHERE tobias<--others
RETURN others

Nodes that have an outgoing relationship to the "Tobias" node are returned.

Result

others
1 row

Node[3]{name:"Andres",age:36,belt:"white"}

Filter on patterns using NOT


The NOT() function can be used to exclude a pattern.

Query

START persons=node(*), peter=node(2)
WHERE not(persons-->peter)
RETURN persons

Nodes that do not have an outgoing relationship to the "Peter" node are returned.

Result

persons
2 rows

Node[1]{name:"Tobias",age:25}
Node[2]{name:"Peter",age:34}

IN operator
To check if an element exists in a collection, you can use the IN operator.

Query

START a=node(3, 1, 2)
WHERE a.name IN ["Peter", "Tobias"]
RETURN a

This query shows how to check if a property exists in a literal collection.

Result

a
2 rows

Node[1]{name:"Tobias",age:25}
Node[2]{name:"Peter",age:34}

15.8.10. Return
In the RETURN part of your query, you define which parts of the pattern you are interested in. It
can be nodes, relationships, or properties on these.

Graph

Return nodes
To return a node, list it in the RETURN statement.

Query

START n=node(2)
RETURN n

The example will return the node.

Result

n
1 row

Node[2]{name:"B"}

Return relationships
To return a relationship, just include it in the RETURN list.

Query

START n=node(1)
MATCH (n)-[r:KNOWS]->(c)
RETURN r

The relationship is returned by the example.

Result

r
1 row

:KNOWS[0] {}

Return property
To return a property, use the dot separator, like this:

Query

START n=node(1)
RETURN n.name

The value of the property name gets returned.

Result

n.name
1 row

"A"

Return all elements


When you want to return all nodes, relationships and paths found in a query, you can use the * symbol.

Query

START a=node(1)
MATCH p=a-[r]->b
RETURN *

This returns the two nodes, the relationship and the path used in the query.

Result

a | b | r | p
2 rows

Node[1]{name:"A",happy:"Yes!",age:55} | Node[2]{name:"B"} | :KNOWS[0] {} | [Node[1]{name:"A",happy:"Yes!",age:55},:KNOWS[0] {},Node[2]{name:"B"}]
Node[1]{name:"A",happy:"Yes!",age:55} | Node[2]{name:"B"} | :BLOCKS[1] {} | [Node[1]{name:"A",happy:"Yes!",age:55},:BLOCKS[1] {},Node[2]{name:"B"}]

Identifier with uncommon characters


To introduce a placeholder that is made up of characters that are outside of the English alphabet, you can use the ` to enclose the identifier, like this:

Query

START `This isn't a common identifier`=node(1)
RETURN `This isn't a common identifier`.happy

The happy property of the node is returned.

Result

This isn't a common identifier.happy
1 row

"Yes!"

Column alias
If the name of the column should be different from the expression used, you can rename it by
using AS <new name>.

Query

START a=node(1)
RETURN a.age AS SomethingTotallyDifferent

Returns the age property of a node, but renames the column.

Result

SomethingTotallyDifferent
1 row

55

Optional properties
If a property might or might not be there, you can select it optionally by adding a question mark to the identifier, like this:

Query

START n=node(1, 2)
RETURN n.age?

This example returns the age when the node has that property, or null if the property is not there.

Result

n.age?
2 rows

55
<null>

Unique results
DISTINCT retrieves only unique rows depending on the columns that have been selected to
output.

Query

START a=node(1)
MATCH (a)-->(b)
RETURN distinct b

The node named B is returned by the query, but only once.

Result

b
1 row

Node[2]{name:"B"}

15.8.11. Aggregation

Introduction
To calculate aggregated data, Cypher offers aggregation, much like SQL’s GROUP BY.

Aggregate functions take multiple input values and calculate an aggregated value from them.
Examples are AVG that calculate the average of multiple numeric values, or MIN that finds the
smallest numeric value in a set of values.

Aggregation can be done over all the matching sub graphs, or it can be further divided by
introducing key values. These are non-aggregate expressions, that are used to group the values going
into the aggregate functions.

So, if the return statement looks something like this:

RETURN n, count(*)

We have two return expressions — n, and count(*). The first, n, is not an aggregate function, and so it will be the grouping key. The latter, count(*), is an aggregate expression. So the matching subgraphs will be divided into different buckets, depending on the grouping key. The aggregate function will then run on these buckets, calculating the aggregate values.

The last piece of the puzzle is the DISTINCT keyword. It is used to make all values unique before running them through an aggregate function.

An example might be helpful:

Query

START me=node(1)
MATCH me-->friend-->friend_of_friend
RETURN count(distinct friend_of_friend), count(friend_of_friend)

In this example we are trying to find all our friends of friends, and count them. The first aggregate function, count(distinct friend_of_friend), will only see a friend_of_friend once — DISTINCT removes the duplicates. The latter aggregate function, count(friend_of_friend), might very well see the same friend_of_friend multiple times. Since there is no real data in this case, an empty result is returned. See the sections below for real data.

Result

count(distinct friend_of_friend) | count(friend_of_friend)
1 row

0 | 0

The following examples are assuming the example graph structure below.

Graph

COUNT
COUNT is used to count the number of rows. COUNT can be used in two forms — COUNT(*)
which just counts the number of matching rows, and COUNT(<identifier>), which counts the
number of non-null values in <identifier>.

Count nodes
To count the number of nodes, for example the number of nodes connected to one node, you can
use count(*).

Query

START n=node(2)
MATCH (n)-->(x)
RETURN n, count(*)

This returns the start node and the count of related nodes.

Result

n | count(*)
1 row

Node[2]{name:"A",property:13} | 3

Group Count Relationship Types


To count the groups of relationship types, return the types and count them with count(*).

Query

START n=node(2)
MATCH (n)-[r]->()
RETURN type(r), count(*)

The relationship types and their group count is returned by the query.

Result

type(r) | count(*)
1 row

"KNOWS" | 3

Count entities
Instead of counting the number of results with count(*), it might be more expressive to include the name of the identifier you care about.

Query

START n=node(2)
MATCH (n)-->(x)
RETURN count(x)

The example query returns the number of connected nodes from the start node.

Result

count(x)
1 row

3
Count non-null values


You can count the non-null values by using count(<identifier>).

Query

START n=node(2,3,4,1)
RETURN count(n.property?)

The count of nodes with the property property set is returned by the query.

Result

count(n.property?)
1 row

3

SUM
The SUM aggregation function simply sums all the numeric values it encounters. Nulls are silently dropped. This is an example of how you can use SUM.

Query

START n=node(2,3,4)
RETURN sum(n.property)

This returns the sum of all the values in the property property.

Result

sum(n.property)
1 row

90

AVG
AVG calculates the average of a numeric column.

Query

START n=node(2,3,4)
RETURN avg(n.property)

The average of all the values in the property property is returned by the example query.

Result

avg(n.property)
1 row

30.0

MAX
MAX finds the largest value in a numeric column.

Query

START n=node(2,3,4)
RETURN max(n.property)

The largest of all the values in the property property is returned.

Result

max(n.property)
1 row

44

MIN
MIN takes a numeric property as input, and returns the smallest value in that column.

Query

START n=node(2,3,4)
RETURN min(n.property)

This returns the smallest of all the values in the property property.

Result

min(n.property)
1 row

13

COLLECT
COLLECT collects all the values into a list.

Query

START n=node(2,3,4)
RETURN collect(n.property)

Returns a single row, with all the values collected.

Result

collect(n.property)
1 row

[13,33,44]
DISTINCT
All aggregation functions also take the DISTINCT modifier, which removes duplicates from the values. So, to count the number of unique eye colors from nodes related to a, this query can be used:

Query

START a=node(2)
MATCH a-->b
RETURN count(distinct b.eyes)

Returns the number of eye colors.

Result

count(distinct b.eyes)
1 row
15.8.12. Order by
To sort the output, use the ORDER BY clause. Note that you can not sort on nodes or relationships,
just on properties on these.

Graph

Order nodes by property
ORDER BY is used to sort the output.

Query

START n=node(3,1,2)
RETURN n
ORDER BY n.name

The nodes are returned, sorted by their name.

Result

n
3 rows

Node[1]{name:"A",age:34,length:170}
Node[2]{name:"B",age:34}
Node[3]{name:"C",age:32,length:185}

Order nodes by multiple properties

You can order by multiple properties by stating each identifier in the ORDER BY clause. Cypher will sort the result by the first identifier listed, and for equal values, go to the next property in the ORDER BY clause, and so on.

Query

START n=node(3,1,2)
RETURN n
ORDER BY n.age, n.name

This returns the nodes, sorted first by their age, and then by their name.

Result

n
3 rows

Node[3]{name:"C",age:32,length:185}
Node[1]{name:"A",age:34,length:170}
Node[2]{name:"B",age:34}

Order nodes in descending order

By adding DESC[ENDING] after the identifier to sort on, the sort will be done in reverse order.

Query

START n=node(3,1,2)
RETURN n
ORDER BY n.name DESC

The example returns the nodes, sorted by their name in reverse order.

Result

n
3 rows

Node[3]{name:"C",age:32,length:185}
Node[2]{name:"B",age:34}
Node[1]{name:"A",age:34,length:170}

Ordering null
When sorting the result set, null will always come at the end of the result set for ascending sorting, and first when doing descending sort.

Query

START n=node(3,1,2)
RETURN n.length?, n
ORDER BY n.length?

The nodes are returned sorted by the length property, with a node without that property last.

Result

n.length? | n
3 rows

170 | Node[1]{name:"A",age:34,length:170}
185 | Node[3]{name:"C",age:32,length:185}
<null> | Node[2]{name:"B",age:34}

15.8.13. Limit
LIMIT enables the return of only subsets of the total result.

Graph

Return first part
To return a subset of the result, starting from the top, use this syntax:

Query

START n=node(3, 4, 5, 1, 2)
RETURN n
LIMIT 3

The top three items are returned by the example query.

Result

n
3 rows

Node[3]{name:"A"}
Node[4]{name:"B"}
Node[5]{name:"C"}

15.8.14. Skip
SKIP enables the return of only subsets of the total result. By using SKIP, the result set will get
trimmed from the top. Please note that no guarantees are made on the order of the result unless the
query specifies the ORDER BY clause.

Graph

Skip first three
To return a subset of the result, starting from the fourth result, use the following syntax:

Query

START n=node(3, 4, 5, 1, 2)
RETURN n
ORDER BY n.name
SKIP 3

The first three nodes are skipped, and only the last two are returned in the result.

Result

n
2 rows

Node[1]{name:"D"}
Node[2]{name:"E"}

Return middle two


To return a subset of the result, starting from somewhere in the middle, use this syntax:

Query

START n=node(3, 4, 5, 1, 2)
RETURN n
ORDER BY n.name
SKIP 1
LIMIT 2

Two nodes from the middle are returned.

Result

n
2 rows

Node[4]{name:"B"}
Node[5]{name:"C"}

15.8.15. With
The ability to chain queries together allows for powerful constructs. In Cypher, the WITH clause
is used to pipe the result from one query to the next.

WITH is also used to separate reading from updating of the graph. Every sub-query of a query
must be either read-only or write-only.

Graph

Filter on aggregate function results
Aggregated results have to pass through a WITH clause to be able to filter on them.

Query

START david=node(1)
MATCH david--otherPerson-->()
WITH otherPerson, count(*) as foaf
WHERE foaf > 1
RETURN otherPerson

The people connected to David with more than one outgoing relationship will be returned by the query.

Result

otherPerson
1 row

Node[3]{name:"Anders"}

Alternative syntax of WITH
If you prefer a more visual way of writing your query, you can use equal signs as delimiters before and after the column list. Use at least three before the column list, and at least three after.

Query

START david=node(1)
MATCH david--otherPerson-->()
========== otherPerson, count(*) as foaf
==========
SET otherPerson.connection_count = foaf

For persons connected to David, the connection_count property is set to their number of outgoing relationships.

Result

Properties set: 2
(empty result)

15.8.16. Create
Creating graph elements — nodes and relationships, is done with CREATE.

提示

In the CREATE clause, patterns are used a lot. Read 第 15.8 节 “Patterns” for an
introduction.

15.8.17. Create single node

Creating a single node is done by issuing the following query.

Query

CREATE n

Nothing is returned from this query, except the count of affected nodes.

Result

Nodes created: 1
(empty result)

15.8.18. Create single node and set properties

The values for the properties can be any scalar expressions.

Query

CREATE n = {name : 'Andres', title : 'Developer'}

Nothing is returned from this query.

Result

Nodes created: 1
Properties set: 2
(empty result)

15.8.19. Return created node

Creating a single node is done by issuing the following query.

Query

CREATE (a {name : 'Andres'})
RETURN a

The newly created node is returned. This query uses the alternative syntax for single node creation.

Result

a
1 row
Nodes created: 1
Properties set: 1

Node[1]{name:"Andres"}

15.8.20. Create a relationship between two nodes

To create a relationship between two nodes, we first get the two nodes. Once the nodes are
loaded, we simply create a relationship between them.

Query

START a=node(1), b=node(2)
CREATE a-[r:RELTYPE]->b
RETURN r

The created relationship is returned by the query.

Result

r
1 row
Relationships created: 1

:RELTYPE[0] {}

15.8.21. Create a relationship and set properties

Setting properties on relationships is done in a similar manner to how it’s done when creating
nodes. Note that the values can be any expression.

Query

START a=node(1), b=node(2)
CREATE a-[r:RELTYPE {name : a.name + '<->' + b.name }]->b
RETURN r

The newly created relationship is returned by the example query.

Result

r
1 row
Relationships created: 1
Properties set: 1

:RELTYPE[0] {name:"Andres<->Michael"}

15.8.22. Create a full path


When you use CREATE and a pattern, all parts of the pattern that are not already in scope at this
time will be created.

Query

CREATE p = (andres {name:'Andres'})-[:WORKS_AT]->neo<-[:WORKS_AT]-(michael {name:'Michael'})
RETURN p

This query creates three nodes and two relationships in one go, assigns it to a path identifier, and returns it.

Result

p
1 row
Nodes created: 3
Relationships created: 2
Properties set: 2

[Node[1]{name:"Andres"},:WORKS_AT[0] {},Node[2]{},:WORKS_AT[1] {},Node[3]{name:"Michael"}]

15.8.23. Create single node from map


You can also create a graph entity from a Map<String,Object> map. All the key/value pairs
in the map will be set as properties on the created relationship or node.

Query

create ({props})

This query can be used in the following fashion:

Map<String, Object> props = new HashMap<String, Object>();
props.put( "name", "Andres" );
props.put( "position", "Developer" );

Map<String, Object> params = new HashMap<String, Object>();
params.put( "props", props );
engine.execute( "create ({props})", params );

15.8.24. Create multiple nodes from maps


By providing an iterable of maps (Iterable<Map<String,Object>>), Cypher will create a
node for each map in the iterable. When you do this, you can’t create anything else in the same create
statement.

Query

create (n {props})
return n

This query can be used in the following fashion:

Map<String, Object> n1 = new HashMap<String, Object>();
n1.put( "name", "Andres" );
n1.put( "position", "Developer" );

Map<String, Object> n2 = new HashMap<String, Object>();
n2.put( "name", "Michael" );
n2.put( "position", "Developer" );

Map<String, Object> params = new HashMap<String, Object>();
List<Map<String, Object>> maps = Arrays.asList(n1, n2);
params.put( "props", maps);
engine.execute("create (n {props}) return n", params);
15.8.25. Create Unique

CREATE UNIQUE is in the middle of MATCH and CREATE — it will match what it can, and
create what is missing. CREATE UNIQUE will always make the least change possible to the
graph — if it can use parts of the existing graph, it will.

Another difference to MATCH is that CREATE UNIQUE assumes the pattern to be unique. If
multiple matching subgraphs are found an exception will be thrown.

提示

In the CREATE UNIQUE clause, patterns are used a lot. Read 第 15.8 节 “Patterns” for an
introduction.

15.8.26. Create relationship if it is missing


CREATE UNIQUE is used to describe the pattern that should be found or created.

Query

START left=node(1), right=node(3,4)
CREATE UNIQUE left-[r:KNOWS]->right
RETURN r

The left node is matched against the two right nodes. One relationship already exists and can be matched, and the other relationship is created before it is returned.

Result

r
2 rows
Relationships created: 1

:KNOWS[4] {}
:KNOWS[3] {}

15.8.27. Create node if missing

If the pattern described needs a node, and it can’t be matched, a new node will be created.

Query

START root=node(2)
CREATE UNIQUE root-[:LOVES]-someone
RETURN someone

The root node doesn't have any LOVES relationships, and so a node is created, and also a relationship to that node.

Result

someone
1 row
Nodes created: 1
Relationships created: 1

Node[5]{}

15.8.28. Create nodes with values

The pattern described can also contain values on the node. These are given using the following
syntax: prop : <expression>.

Query

START root=node(2)
CREATE UNIQUE root-[:X]-(leaf {name:'D'} )
RETURN leaf

No node connected with the root node has the name D, and so a new node is created to match the pattern.

Result

leaf
1 row
Nodes created: 1
Relationships created: 1
Properties set: 1

Node[5]{name:"D"}

15.8.29. Create relationship with values

Relationships to be created can also be matched on values.

Query

START root=node(2)
CREATE UNIQUE root-[r:X {since:'forever'}]-()
RETURN r
In this example, we want the relationship to have a value, and since no such relationship can be
found, a new node and relationship are created. Note that since we are not interested in the created
node, we don’t name it.

Result

r
:X[4] {since:"forever"}
1 row
Nodes created: 1
Relationships created: 1
Properties set: 1
1 ms

15.8.30. Describe complex pattern


The pattern described by CREATE UNIQUE can be separated by commas, just like in MATCH and
CREATE.

Query

START root=node(2)
CREATE UNIQUE root-[:FOO]->x, root-[:BAR]->x
RETURN x
This example pattern uses two paths, separated by a comma.

Result

x
Node[5]{}
1 row
Nodes created: 1
Relationships created: 2
10 ms

15.8.31. Set
Updating properties on nodes and relationships is done with the SET clause.

15.8.32. Set a property


To set a property on a node or relationship, use SET.

Query

START n = node(2)
SET n.surname = 'Taylor'
RETURN n
The newly changed node is returned by the query.

Result

n
Node[2]{name:"Andres",age:36,surname:"Taylor"}
1 row
Properties set: 1
1 ms

15.8.33. Delete
Removing graph elements — nodes, relationships and properties, is done with DELETE.

15.8.34. Delete single node

To remove a node from the graph, you can delete it with the DELETE clause.

Query

START n = node(4)
DELETE n
Nothing is returned from this query, except the count of affected nodes.

Result

(empty result)
Nodes deleted: 1
15 ms

15.8.35. Remove a node and connected relationships

If you are trying to remove a node with relationships on it, you have to remove these as well.

Query

START n = node(3)
MATCH n-[r]-()
DELETE n, r
Nothing is returned from this query, except the count of affected nodes.

Result

(empty result)
Nodes deleted: 1
Relationships deleted: 2
1 ms

15.8.36. Remove a property


Neo4j doesn’t allow storing null in properties. Instead, if no value exists, the property is just
not there. So, to remove a property value on a node or a relationship, is also done with DELETE.

Query

START andres = node(3)
DELETE andres.age
RETURN andres
The node is returned, and no property age exists on it.

Result

andres
Node[3]{name:"Andres"}
1 row
Properties set: 1
2 ms

15.8.37. Foreach

Collections and paths are key concepts in Cypher. To use them for updating data, you can use
the FOREACH construct. It allows you to do updating commands on elements in a collection — a
path, or a collection created by aggregation.

The identifier context inside of the FOREACH parentheses is separate from the one outside it, i.e. if
you CREATE a node identifier inside of a FOREACH, you will not be able to use it outside of the
FOREACH statement, unless you match to find it.

Inside of the FOREACH parentheses, you can do any updating commands — CREATE, CREATE
UNIQUE, DELETE, and FOREACH.
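The scoping rule above can be pictured with a plain loop. A minimal Python sketch of FOREACH(n in nodes(p) : SET n.marked = true) (illustrative only — nodes are modelled as dicts, not Neo4j entities):

```python
# A "path" modelled as a list of node dicts.
path_nodes = [{"name": "A"}, {"name": "B"}, {"name": "C"}]

# FOREACH(n in nodes(p) : SET n.marked = true):
# n is only visible inside the loop body, just like the identifier
# inside the FOREACH parentheses.
for n in path_nodes:
    n["marked"] = True
```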

15.8.38. Mark all nodes along a path


This query will set the property marked to true on all nodes along a path.

Query

START begin = node(2), end = node(1)
MATCH p = begin -[*]-> end
FOREACH(n in nodes(p) : SET n.marked = true)
Nothing is returned from this query.

Result

(empty result)
Properties set: 4
2 ms

15.8.39. Functions
Most functions in Cypher will return null if the input parameter is null.

Here is a list of the functions in Cypher, separated into four different sections: Predicates, Scalar
functions, Collection functions and Mathematical functions.


15.8.40. Predicates

Predicates are boolean functions that return true or false for a given set of input. They are most
commonly used to filter out subgraphs in the WHERE part of a query.

ALL
Tests whether a predicate holds for all elements of the collection.
Syntax: ALL(identifier in collection WHERE predicate)

Arguments:

• collection: An expression that returns a collection

• identifier: This is the identifier that can be used from the predicate.

• predicate: A predicate that is tested against all items in the collection.

Query

START a=node(3), b=node(1)
MATCH p=a-[*1..3]->b
WHERE all(x in nodes(p) WHERE x.age > 30)
RETURN p
All nodes in the returned paths will have an age property of at least 30.

Result

p
[Node[3]{name:"A",age:38,eyes:"brown"},:KNOWS[1] {},Node[5]{name:"C",age:53,eyes:"green"},:KNOWS[3] {},Node[1]{name:"D",age:54,eyes:"brown"}]
1 row
0 ms

ANY
Tests whether a predicate holds for at least one element in the collection.

Syntax: ANY(identifier in collection WHERE predicate)

Arguments:

• collection: An expression that returns a collection

• identifier: This is the identifier that can be used from the predicate.

• predicate: A predicate that is tested against all items in the collection.

Query

START a=node(2)
WHERE any(x in a.array WHERE x = "one")
RETURN a
All nodes returned have at least one value "one" in the array property named array.

Result

a
Node[2]{name:"E",age:41,eyes:"blue",array:["one","two","three"]}
1 row
0 ms

NONE
Returns true if the predicate holds for no element in the collection.

Syntax: NONE(identifier in collection WHERE predicate)

Arguments:

• collection: An expression that returns a collection

• identifier: This is the identifier that can be used from the predicate.

• predicate: A predicate that is tested against all items in the collection.

Query

START n=node(3)
MATCH p=n-[*1..3]->b
WHERE NONE(x in nodes(p) WHERE x.age = 25)
RETURN p
No node in the returned paths has an age property set to 25.

Result

p
[Node[3]{name:"A",age:38,eyes:"brown"},:KNOWS[1] {},Node[5]{name:"C",age:53,eyes:"green"}]
[Node[3]{name:"A",age:38,eyes:"brown"},:KNOWS[1] {},Node[5]{name:"C",age:53,eyes:"green"},:KNOWS[3] {},Node[1]{name:"D",age:54,eyes:"brown"}]
2 rows
0 ms

SINGLE
Returns true if the predicate holds for exactly one of the elements in the collection.

Syntax: SINGLE(identifier in collection WHERE predicate)

Arguments:

• collection: An expression that returns a collection

• identifier: This is the identifier that can be used from the predicate.

• predicate: A predicate that is tested against all items in the collection.

Query

START n=node(3)
MATCH p=n-->b
WHERE SINGLE(var in nodes(p) WHERE var.eyes = "blue")
RETURN p
Exactly one node in every returned path will have the eyes property set to "blue".

Result

p
[Node[3]{name:"A",age:38,eyes:"brown"},:KNOWS[0] {},Node[4]{name:"B",age:25,eyes:"blue"}]
1 row
0 ms
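The four predicates correspond to familiar quantifiers. A Python analogy (illustrative only — in Cypher these run over collections inside the query, not in client code):

```python
ages = [38, 53, 54]

all_over_30  = all(x > 30 for x in ages)             # ALL(x in ages WHERE x > 30)
any_is_54    = any(x == 54 for x in ages)            # ANY(...)
none_is_25   = not any(x == 25 for x in ages)        # NONE(...)
single_is_53 = sum(1 for x in ages if x == 53) == 1  # SINGLE(...): exactly one match
```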

15.8.41. Scalar functions

Scalar functions return a single value.

LENGTH
To return or filter on the length of a collection, use the LENGTH() function.
Syntax: LENGTH( collection )

Arguments:

• collection: An expression that returns a collection

Query

START a=node(3)
MATCH p=a-->b-->c
RETURN length(p)
The length of the path p is returned by the query.

Result

length(p)
3 rows
0 ms

TYPE
Returns a string representation of the relationship type.

Syntax: TYPE( relationship )

Arguments:

• relationship: A relationship.

Query

START n=node(3)
MATCH (n)-[r]->()
RETURN type(r)
The relationship type of r is returned by the query.

Result

type(r)
"KNOWS"
"KNOWS"
2 rows
0 ms

ID
Returns the id of the relationship or node.

Syntax: ID( property-container )

Arguments:

• property-container: A node or a relationship.

Query

START a=node(3, 4, 5)
RETURN ID(a)
This returns the node id for three nodes.

Result

ID(a)
3
4
5
3 rows
0 ms

COALESCE
Returns the first non-null value in the list of expressions passed to it.

Syntax: COALESCE( expression [, expression]* )

Arguments:

• expression: The expression that might return null.

Query

START a=node(3)
RETURN coalesce(a.hairColour?, a.eyes?)
Result

coalesce(a.hairColour?, a.eyes?)
"brown"
1 row
0 ms

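COALESCE simply returns the first of its arguments that is not null. A one-line Python equivalent (illustrative; None stands in for Cypher’s null):

```python
def coalesce(*values):
    # return the first value that is not None, or None if all are
    return next((v for v in values if v is not None), None)
```

So coalesce(None, "brown") yields "brown", mirroring the query above, where a.hairColour is missing and a.eyes is used instead.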
HEAD
HEAD returns the first element in a collection.

Syntax: HEAD( expression )

Arguments:

• expression: This expression should return a collection of some kind.

Query

START a=node(2)
RETURN a.array, head(a.array)
The first element of the array is returned.
Result

a.array   head(a.array)
["one","two","three"]   "one"
1 row
0 ms

LAST
LAST returns the last element in a collection.

Syntax: LAST( expression )

Arguments:

• expression: This expression should return a collection of some kind.

Query

START a=node(2)
RETURN a.array, last(a.array)
The last element of the array is returned.

Result

a.array   last(a.array)
["one","two","three"]   "three"
1 row
0 ms
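HEAD and LAST correspond to taking the first and last element of a sequence; in Python terms (illustrative only):

```python
array = ["one", "two", "three"]

head = array[0]    # HEAD(a.array)
last = array[-1]   # LAST(a.array)
```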

15.8.42. Collection functions


Collection functions return collections of things — nodes in a path, and so on.

NODES
Returns all nodes in a path.

Syntax: NODES( path )


Arguments:

• path: A path.

Query

START a=node(3), c=node(2)
MATCH p=a-->b-->c
RETURN NODES(p)
All the nodes in the path p are returned by the example query.

Result

NODES(p)
[Node[3]{name:"A",age:38,eyes:"brown"},Node[4]{name:"B",age:25,eyes:"blue"},Node[2]{name:"E",age:41,eyes:"blue",array:["one","two","three"]}]
1 row
0 ms

RELATIONSHIPS
Returns all relationships in a path.

Syntax: RELATIONSHIPS( path )

Arguments:

• path: A path.

Query

START a=node(3), c=node(2)
MATCH p=a-->b-->c
RETURN RELATIONSHIPS(p)
All the relationships in the path p are returned.

Result

RELATIONSHIPS(p)
[:KNOWS[0] {},:MARRIED[4] {}]
1 row
0 ms

EXTRACT
To return a single property, or the value of a function, from a collection of nodes or relationships,
you can use EXTRACT. It will go through a collection, run an expression on every element, and return
the results in a collection with these values. It works like the map method in functional languages
such as Lisp and Scala.

Syntax: EXTRACT( identifier in collection : expression )

Arguments:

• collection: An expression that returns a collection

• identifier: The closure will have an identifier introduced in its context. Here you decide
which identifier to use.
• expression: This expression will run once per value in the collection, and produces the
result collection.

Query

START a=node(3), b=node(4), c=node(1)
MATCH p=a-->b-->c
RETURN extract(n in nodes(p) : n.age)
The age properties of all nodes in the path are returned.

Result

extract(n in nodes(p) : n.age)
[38,25,54]
1 row
0 ms
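Since EXTRACT works like map, the closest Python counterpart is a list comprehension (illustrative — the node dicts are assumptions, not Neo4j entities):

```python
# nodes(p) modelled as dicts with an age property
nodes = [{"age": 38}, {"age": 25}, {"age": 54}]

# extract(n in nodes(p) : n.age)
ages = [n["age"] for n in nodes]
```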

FILTER
FILTER returns all the elements in a collection that comply to a predicate.

Syntax: FILTER(identifier in collection : predicate)


Arguments:

• collection: An expression that returns a collection

• identifier: This is the identifier that can be used from the predicate.

• predicate: A predicate that is tested against all items in the collection.

Query

START a=node(2)
RETURN a.array, filter(x in a.array : length(x) = 3)
This returns the property named array and a list of values in it, which have the length 3.

Result

a.array   filter(x in a.array : length(x) = 3)
["one","two","three"]   ["one","two"]
1 row
0 ms
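FILTER is the collection counterpart of a WHERE predicate; in Python it reads as a filtering comprehension (illustrative only):

```python
array = ["one", "two", "three"]

# filter(x in a.array : length(x) = 3)
three_letter = [x for x in array if len(x) == 3]
```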

TAIL
TAIL returns all but the first element in a collection.

Syntax: TAIL( expression )

Arguments:

• expression: This expression should return a collection of some kind.

Query

START a=node(2)
RETURN a.array, tail(a.array)
This returns the property named array and all elements of that property except the first one.

Result

a.array   tail(a.array)
["one","two","three"]   ["two","three"]
1 row
0 ms

RANGE
Returns numerical values in a range, with a non-zero step value. The range is inclusive at both
ends.

Syntax: RANGE( start, end [, step] )

Arguments:

• start: A numerical expression.

• end: A numerical expression.

• step: A numerical expression.

Query

START n=node(1)
RETURN range(0,10), range(2,18,3)
Two lists of numbers are returned.

Result

range(0,10)   range(2,18,3)
[0,1,2,3,4,5,6,7,8,9,10]   [2,5,8,11,14,17]
1 row
0 ms
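Because RANGE is inclusive at both ends, it differs from Python’s half-open range. A sketch of the correspondence (illustrative; the cypher_range helper is an assumption, not Neo4j API):

```python
def cypher_range(start, end, step=1):
    # RANGE(start, end, step) includes both endpoints, so extend
    # Python's half-open range by one step in the step's direction.
    return list(range(start, end + (1 if step > 0 else -1), step))
```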

15.8.43. Mathematical functions

These functions all operate on numerical expressions only, and will return an error if used on any
other values.

ABS
ABS returns the absolute value of a number.
Syntax: ABS( expression )

Arguments:

• expression: A numeric expression.

Query

START a=node(3), c=node(2)
RETURN a.age, c.age, abs(a.age - c.age)
The absolute value of the age difference is returned.

Result

a.age   c.age   abs(a.age - c.age)
38   41   3.0
1 row
0 ms

ROUND
ROUND returns the numerical expression, rounded to the nearest integer.

Syntax: ROUND( expression )

Arguments:

• expression: A numerical expression.

Query

START a=node(1)
RETURN round(3.141592)
Result

round(3.141592)
3.0
1 row
0 ms
SQRT
SQRT returns the square root of a number.

Syntax: SQRT( expression )

Arguments:

• expression: A numerical expression

Query

START
a=node(1)
1
2 RETURN
sqrt(256)
Result

sqrt(256)
16.0
1 row
0 ms

SIGN
SIGN returns the signum of a number — zero if the expression is zero, -1 for any negative
number, and 1 for any positive number.

Syntax: SIGN( expression )

Arguments:

• expression: A numerical expression

Query

START n=node(1)
1
RETURN sign(-17),
2
sign(0.1)
Result

sign(-17)   sign(0.1)
-1.0   1.0
1 row
0 ms
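The signum logic is easy to state directly. A Python sketch matching the description above (illustrative; Cypher returns these values as floating-point numbers):

```python
def sign(x):
    # zero for zero, -1 for negatives, 1 for positives
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0
```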

15.8.44. Compatibility
Cypher is still changing rather rapidly. Parts of the changes are internal — we add new pattern
matchers, aggregators and other optimizations, which hopefully makes your queries run faster.

Other changes are directly visible to our users — the syntax is still changing. New concepts are
being added and old ones changed to fit into new possibilities. To guard you from having to keep up
with our syntax changes, Cypher allows you to use an older parser, but still gain the speed from new
optimizations.

There are two ways you can select which parser to use. You can configure your database with
the configuration parameter cypher_parser_version, and enter which parser you’d like to use
(1.6, 1.7 and 1.8 are supported now). Any Cypher query that doesn’t explicitly say anything else, will
get the parser you have configured.

The other way is on a query by query basis. By simply pre-pending your query with "CYPHER
1.6", that particular query will be parsed with the 1.6 version of the parser. Example:

CYPHER 1.6 START n=node(0)
WHERE n.foo = "bar"
RETURN n

15.8.45. From SQL to Cypher

This guide is for people who understand SQL. You can use that prior knowledge to quickly get
going with Cypher and start exploring Neo4j.

Start
SQL starts with the result you want — we SELECT what we want and then declare how to
source it. In Cypher, the START clause is quite a different concept which specifies starting points in
the graph from which the query will execute.

From a SQL point of view, the identifiers in START are like table names that point to a set of
nodes or relationships. The set can be listed literally, come via parameters, or as I show in the
following example, be defined by an index look-up.

So in fact rather than being SELECT-like, the START clause is somewhere between the FROM
and the WHERE clause in SQL.

SQL Query.

SELECT *
FROM "Person"
WHERE name = 'Anakin'
Result

NAME   ID   AGE   HAIR
Anakin   1   20   blonde
1 row

Cypher Query.

START person=node:Person(name = 'Anakin')
RETURN person
Result

person
Node[1]{name:"Anakin",id:1,age:20,hair:"blonde"}
1 row
1 ms

Cypher allows multiple starting points. This should not be strange from a SQL
perspective — every table in the FROM clause is another starting point.

Match
Unlike SQL which operates on sets, Cypher predominantly works on sub-graphs. The relational
equivalent is the current set of tuples being evaluated during a SELECT query.

The shape of the sub-graph is specified in the MATCH clause. The MATCH clause is analogous to
the JOIN in SQL. A normal a→b relationship is an inner join between nodes a and b — both sides
have to have at least one match, or nothing is returned.

We’ll start with a simple example, where we find all email addresses that are connected to the
person “Anakin”. This is an ordinary one-to-many relationship.

SQL Query.

SELECT "Email".*
FROM "Person"
JOIN "Email" ON "Person".id = "Email".person_id
WHERE "Person".name = 'Anakin'
Result

ADDRESS   COMMENT   PERSON_ID
anakin@example.com   home   1
anakin@example.org   work   1
2 rows

Cypher Query.

START person=node:Person(name = 'Anakin')
MATCH person-[:email]->email
RETURN email
Result

email
Node[7]{address:"anakin@example.com",comment:"home"}
Node[8]{address:"anakin@example.org",comment:"work"}
2 rows
11 ms

Here there is no join table, but if one is necessary, the next example shows how to handle it.
Writing the relationship in the pattern like this: -[r:belongs_to]-> introduces the (equivalent
of a) join table, available as the variable r. In reality this is a named relationship in Cypher,
so we are saying "join Person to Group via belongs_to". To illustrate this, consider the image
below, comparing the SQL model with the Neo4j/Cypher one.

And here are example queries:

SQL Query.

SELECT "Group".*, "Person_Group".*
FROM "Person"
JOIN "Person_Group" ON "Person".id = "Person_Group".person_id
JOIN "Group" ON "Person_Group".Group_id = "Group".id
WHERE "Person".name = 'Bridget'
Result

NAME   ID   BELONGS_TO_GROUP_ID   PERSON_ID   GROUP_ID
Admin   4   3   2   4
1 row

Cypher Query.

START person=node:Person(name = 'Bridget')
MATCH person-[r:belongs_to]->group
RETURN group, r
Result

group   r
Node[6]{name:"Admin",id:4}   :belongs_to[0] {}
1 row
0 ms

An outer join is just as easy. Add a question mark -[?:KNOWS]-> and it’s an optional
relationship between nodes — the outer join of Cypher.

Whether it’s a left outer join, or a right outer join is defined by which side of the pattern has a
starting point. This example is a left outer join, because the bound node is on the left side:

SQL Query.

SELECT "Person".name, "Email".address
FROM "Person" LEFT JOIN "Email" ON "Person".id = "Email".person_id
Result

NAME   ADDRESS
Anakin   anakin@example.com
Anakin   anakin@example.org
Bridget   <null>
3 rows

Cypher Query.

START person=node:Person('name: *')
MATCH person-[?:email]->email
RETURN person.name, email.address?
Result

person.name   email.address?
"Anakin"   "anakin@example.com"
"Anakin"   "anakin@example.org"
"Bridget"   <null>
3 rows
3 ms

Relationships in Neo4j are first class citizens — it’s like the SQL tables are pre-joined with
each other. So, naturally, Cypher is designed to be able to handle highly connected data easily.

One such domain is tree structures — anyone that has tried storing tree structures in SQL
knows that you have to work hard to get around the limitations of the relational model. There are even
books on the subject.

To find all the groups and sub-groups that Bridget belongs to, this query is enough in Cypher:

Cypher Query.

START person=node:Person('name: Bridget')
MATCH person-[:belongs_to*]->group
RETURN person.name, group.name
Result

person.name   group.name
"Bridget"   "Admin"
"Bridget"   "Technichian"
"Bridget"   "User"
3 rows
4 ms

The * after the relationship type means that there can be multiple hops across belongs_to
relationships between group and user. Some SQL dialects have recursive abilities that allow the
expression of queries like this, but you may have a hard time wrapping your head around those.
Expressing something like this in SQL is hugely impractical, if not practically impossible.

Where
This is the easiest thing to understand — it’s the same animal in both languages. It filters out
result sets/subgraphs. Not all predicates have an equivalent in the other language, but the concept is
the same.

SQL Query.

SELECT *
FROM "Person"
WHERE "Person".age > 35 AND "Person".hair = 'blonde'
Result

NAME   ID   AGE   HAIR
Bridget   2   40   blonde
1 row

Cypher Query.

START person=node:Person('name: *')
WHERE person.age > 35 AND person.hair = 'blonde'
RETURN person
Result

person
Node[2]{name:"Bridget",id:2,age:40,hair:"blonde"}
1 row
3 ms

Return
This is SQL’s SELECT. We just put it in the end because it felt better to have it there — you do
a lot of matching and filtering, and finally, you return something.

Aggregate queries work just like they do in SQL, apart from the fact that there is no explicit
GROUP BY clause. Everything in the return clause that is not an aggregate function will be used as the
grouping columns.

SQL Query.

SELECT "Person".name, count(*)
FROM "Person"
GROUP BY "Person".name
ORDER BY "Person".name
Result

NAME   count(*)
Anakin   1
Bridget   1
2 rows

Cypher Query.

START person=node:Person('name: *')
RETURN person.name, count(*)
ORDER BY person.name
Result

person.name   count(*)
"Anakin"   1
"Bridget"   1
2 rows
1 ms

Order by is the same in both languages — ORDER BY expression ASC/DESC. Nothing weird
here.

Chapter 16. Graph Algorithms

Contents

16.1. Introduction
Neo4j graph algorithms is a component that contains Neo4j implementations of some common
algorithms for graphs. It includes algorithms like:

• Shortest paths,

• all paths,

• all simple paths,

• Dijkstra and

• A*.

16.1. Introduction
The graph algorithms are found in the neo4j-graph-algo component, which is included in
the standard Neo4j download.

• Javadocs

• Download

• Source code

For information on how to use neo4j-graph-algo as a dependency with Maven and other
dependency management tools, see org.neo4j:neo4j-graph-algo. Note that it should be used
with the same version of org.neo4j:neo4j-kernel. Different versions of the graph-algo and
kernel components are not compatible in the general case. Both components are included transitively
by the org.neo4j:neo4j artifact, which makes it simple to keep the versions in sync.

The starting point to find and use graph algorithms is GraphAlgoFactory.

For examples, see 第 4.7 节 “图算法范例” (embedded database) and 第 18.14 节 “Built-in
Graph Algorithms” (REST API).

Chapter 17. Neo4j Server

17.1. Server Installation
Neo4j can be installed as a server, running either as a headless application or as a system service.

1. Download the latest release from http://neo4j.org/download
2. Select the appropriate version for your platform
3. Extract the contents of the archive
   • refer to the top-level extracted directory as NEO4J_HOME
4. Use the scripts in the bin directory
   • for Linux/MacOS, run $NEO4J_HOME/bin/neo4j start
   • for Windows, double-click on %NEO4J_HOME%\bin\Neo4j.bat
5. Refer to the packaged information in the doc directory for details

For High Availability, please refer to Chapter 22, High Availability.

17.1.1. As a Windows service


Neo4j can be installed as a Windows service, with Administrator privileges.

1. Click Start → All Programs → Accessories
2. Right-click Command Prompt → Run as Administrator
3. Provide authorization and/or the Administrator password
4. Navigate to %NEO4J_HOME%
5. Run bin\Neo4j.bat install

To uninstall, run bin\Neo4j.bat remove as Administrator.

To query the status of the service, run bin\Neo4j.bat status

To start the service from the command prompt, run bin\Neo4j.bat start

To stop the service from the command prompt, run bin\Neo4j.bat stop

Note

Some users have reported problems on Windows when using the ZoneAlarm firewall. If you are
having problems getting large responses from the server, or if Webadmin does not work, please
try disabling ZoneAlarm. Contact ZoneAlarm support to get information on how to resolve this.

17.1.2. Linux Service

Neo4j can participate in the normal system startup and shutdown process. The following
procedure should work on most popular Linux distributions:

1. cd $NEO4J_HOME
2. sudo ./bin/neo4j install

if asked, enter your password to gain super-user privileges


3. service neo4j-service status

should indicate that the server is not running

4. service neo4j-service start

will start the server

During installation you will be given the option to select the user Neo4j will run as. You will be
asked to supply a username (defaulting to neo4j) and if that user is not present on the system it will
be created as a system account and the $NEO4J_HOME/data directory will be chown'ed to that user.

You are encouraged to create a dedicated user for running the service and for that reason it is
suggested that you unpack the distribution package under /opt or your site specific optional packages
directory.

After installation you may have to do some platform specific configuration and performance
tuning. For that, refer to 第 21.11 节 “Linux 特有的注意事项”.

Finally, note that if you chose to create a new user account, on uninstall you will be prompted to
remove it from the system.

17.1.4. Multiple Server instances on one machine

Neo4j can be set up to run as several instances on one machine, providing for instance several
databases for development. To configure, install two instances of the Neo4j Server in two different
directories following the steps outlined below.

First instance
First, create a directory to hold both database instances, and unpack the development instance:

1. cd $INSTANCE_ROOT
2. mkdir -p neo4j
3. cd neo4j
4. tar -xvzf /path/to/neo4j-community.tar.gz
5. mv neo4j-community dev

Next, configure the instance by changing the following values in
dev/conf/neo4j-server.properties, see also Section 24.1, "Securing access to the Neo4j Server":

org.neo4j.server.webserver.port=7474

# Uncomment the following if the instance will be accessed from a host other than localhost.
org.neo4j.server.webserver.address=0.0.0.0
Before running the Windows install or startup, change the following in
dev/conf/neo4j-wrapper.properties:

# Name of the service for the first instance
wrapper.name=neo4j_1
Start the instance:

dev/bin/neo4j start

Check that instance is available by browsing to http://localhost:7474/webadmin/

Second instance (testing, development)


In many cases during application development, it is desirable to have one development database
set up, and another against which to run unit tests. For the following example, we are assuming that
both databases will run on the same host.

Now create the unit testing second instance:

1. cd $INSTANCE_ROOT/neo4j
2. tar -xvzf /path/to/neo4j-community.tar.gz
3. mv neo4j-community test

Next, configure the instance by changing the following values in
test/conf/neo4j-server.properties to change the server port to 7475:

# Note the different port number from the development instance
org.neo4j.server.webserver.port=7475

# Uncomment the following if the instance will be accessed from a host other than localhost
org.neo4j.server.webserver.address=0.0.0.0
Differentiate the instance from the development instance by modifying
test/conf/neo4j-wrapper.properties.

wrapper.name=neo4j-test
On Windows, you also need to change the name of the service in bin\neo4j.bat to be able to run
it together with the first instance.

set serviceName=Neo4j-Server-test
set serviceDisplayName=Neo4j-Server-test
Start the instance:
test/bin/neo4j start

Check that instance is available by browsing to http://localhost:7475/webadmin/

17.2. Server Configuration
Quick info
• The server’s primary configuration file is found under conf/neo4j-server.properties
• The conf/log4j.properties file contains the default server logging configuration
• Low-level performance tuning parameters are found in conf/neo4j.properties
• Configuration of the daemonizing wrapper is found in conf/neo4j-wrapper.properties
• HTTP logging configuration is found in conf/neo4j-http-logging.xml

17.2.1. Important server configuration parameters
The server’s primary configuration file can be found at conf/neo4j-server.properties. This file
contains several important settings, and while the defaults are sensible, administrators might
choose to make changes (especially to the port settings).

Set the location on disk of the database directory like this:

org.neo4j.server.database.location=data/graph.db
Note

On Windows systems, absolute locations including drive letters need to read "c:/data/db".
Specify the HTTP server port supporting data, administrative, and UI access:

org.neo4j.server.webserver.port=7474
Specify the client accept pattern for the webserver (default is 127.0.0.1, localhost only):

# allow any client to connect
org.neo4j.server.webserver.address=0.0.0.0
For securing the Neo4j Server, see also security-server

Set the location of the round-robin database directory which gathers metrics on the running
server instance:

org.neo4j.server.webadmin.rrdb.location=data/graph.db/../rrd
Set the URI path for the REST data API through which the database is accessed. This should be
a relative path.

org.neo4j.server.webadmin.data.uri=/db/data/
Setting the management URI for the administration API that the Webadmin tool uses. This
should be a relative path.

org.neo4j.server.webadmin.management.uri=/db/manage
To force the server to use IPv4 network addresses, add a new parameter in conf/neo4j-wrapper.conf
under the section Java Additional Parameters:

wrapper.java.additional.3=-Djava.net.preferIPv4Stack=true
Low-level performance tuning parameters can be explicitly set by referring to the following
property:

org.neo4j.server.db.tuning.properties=neo4j.properties
If this property isn’t set, the server will look for a file called neo4j.properties in the same
directory as the neo4j-server.properties file.

If this property isn’t set, and there is no neo4j.properties file in the default configuration
directory, then the server will log a warning. Subsequently at runtime the database engine will
attempt to tune itself based on the prevailing conditions.

17.2.2. Neo4j database performance configuration

The fine-tuning of the low-level Neo4j graph database engine is specified in a separate
properties file, conf/neo4j.properties.

The graph database engine has a range of performance tuning options, enumerated in the server
performance section. Note that factors other than Neo4j tuning should be considered when tuning
a server, including general server load, memory and file contention, and even garbage collection
penalties on the JVM, though such considerations are beyond the scope of this configuration
document.

17.2.3. Server logging configuration
Application events within Neo4j server are processed with java.util.logging and
configured in the file conf/logging.properties.

By default it is setup to print INFO level messages both on screen and in a rolling file in data/log.
Most deployments will choose to use their own configuration here to meet local standards. During
development, much useful information can be found in the logs so some form of logging to disk is
well worth keeping. On the other hand, if you want to completely silence the console output, set:

java.util.logging.ConsoleHandler.level=OFF
By default log files are rotated at approximately 10Mb and named consecutively
neo4j.<id>.<rotation sequence #>.log. To change the naming scheme, rotation frequency and backlog
size, modify

java.util.logging.FileHandler.pattern
java.util.logging.FileHandler.limit
java.util.logging.FileHandler.count
respectively to your needs. Details are available at the Javadoc for
java.util.logging.FileHandler.

Apart from log statements originating from the Neo4j server, other libraries report their
messages through various frameworks.

Zookeeper is hardwired to use the log4j logging framework. The bundled conf/log4j.properties
applies for this use only and uses a rolling appender and outputs logs by default to the data/log
directory.

17.2.4. HTTP logging configuration

As well as logging events happening within the Neo4j server, it is possible to log the HTTP
requests and responses that the server consumes and produces. Configuring HTTP logging requires
operators to enable and configure the logger and where it will log; and then to optionally configure
the log format.

Warning

By default the HTTP logger uses Common Log Format, meaning that most Web server
tooling can automatically consume such logs. In general users should only enable HTTP
logging, select an output directory, and if necessary alter the rollover and retention policies.

To enable HTTP logging, edit the conf/neo4j-server.properties file to resemble the following:

org.neo4j.server.http.log.enabled=true
org.neo4j.server.http.log.config=conf/neo4j-http-logging.xml
org.neo4j.server.http.log.enabled=true tells the server that HTTP logging is enabled. HTTP
logging can be totally disabled by setting this property to false.
org.neo4j.server.http.log.config=conf/neo4j-http-logging.xml specifies the logging format and

rollover policy file that governs how HTTP log output is presented and archived. The defaults
provided with Neo4j server uses an hourly log rotation and Common Log Format.

If logging is set up to use log files then the server will check that the log file directory exists
and is writable. If this check fails, the server will not start up and will report the failure on
another available channel, like standard out.

17.2.5. Other configuration options

Enabling logging from the garbage collector
To get garbage collection logging output you have to pass the corresponding option to the server
JVM executable by setting in conf/neo4j-wrapper.conf the value

wrapper.java.additional.3=-Xloggc:data/log/neo4j-gc.log
This line is already present and needs uncommenting. Note also that logging is not directed to
console ; You will find the logging statements in data/log/ne4j-gc.log or whatever directory you set at
the option.

Disabling console types in Webadmin
You may, for security reasons, want to disable the Gremlin console and/or the Neo4j Shell in
Webadmin. Both of them allow arbitrary code execution, and so they could constitute a security risk
if you do not trust all users of your Neo4j Server.

In the conf/neo4j-server.properties file:

# To disable both Neo4j Shell and Gremlin:
org.neo4j.server.manage.console_engines=

# To enable only the Neo4j Shell:
org.neo4j.server.manage.console_engines=shell

# To enable both:
org.neo4j.server.manage.console_engines=gremlin,shell

17.3. Setting up remote debugging
In order to configure the Neo4j server for remote debugging sessions, the Java debugging
parameters need to be passed to the Java process through the configuration. They live in the
conf/neo4j-wrapper.conf file.

In order to specify the parameters, add a line for the additional Java arguments like this:

# Java Additional Parameters
wrapper.java.additional.1=-Dorg.neo4j.server.properties=conf/neo4j-server.properties
wrapper.java.additional.2=-Dlog4j.configuration=file:conf/log4j.properties
wrapper.java.additional.3=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005
This configuration will start a Neo4j server ready for remote debugging attachment at localhost
on port 5005. Use these parameters to attach to the process from Eclipse, IntelliJ IDEA, or your
remote debugger of choice after starting the server.

17.4. Using the Neo4j server (with the web administration console) with an embedded Neo4j database
17.4.1. Getting the libraries
17.4.2. Starting the Server from Java
17.4.3. Providing custom configuration
Even if you are using the Neo4j Java API directly, for instance via EmbeddedGraphDatabase or
HighlyAvailableGraphDatabase, you can still use the features the server provides.

17.4.1. Getting the libraries

From the Neo4j Server installation

To run the server, all the libraries you need are in the system/lib/ directory
of the download package. For further instructions, see Section 4.1, "Include Neo4j in your
project". The only difference to the embedded setup is that
system/lib/ should be added as well, not only the lib/ directory.
Via Maven

For users of dependency management, an example for Apache Maven follows. Note
that the web resources are in a different artifact.

Maven pom.xml snippet.

<dependencies>
  <dependency>
    <groupId>org.neo4j.app</groupId>
    <artifactId>neo4j-server</artifactId>
    <version>1.8</version>
  </dependency>
  <dependency>
    <groupId>org.neo4j.app</groupId>
    <artifactId>neo4j-server</artifactId>
    <classifier>static-web</classifier>
    <version>1.8</version>
  </dependency>
</dependencies>
<repositories>
  <repository>
    <id>neo4j-snapshot-repository</id>
    <name>Neo4j Maven 2 snapshot repository</name>
    <url>http://m2.neo4j.org/content/repositories/snapshots/</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
</repositories>

Via Scala SBT / Ivy

In order to pull in the dependencies with SBT and configure the underlying
Ivy dependency manager, you can use a setup like the following in your build.sbt:

organization := "your.org"

name := "your.name"

version := "your.version"

/** Deps for embedding the Neo4j Admin server. */
libraryDependencies ++= Seq(
  "org.neo4j.app" % "neo4j-server" % "1.8" classifier "static-web" classifier "",
  "com.sun.jersey" % "jersey-core" % "1.9")

/** Repos for Neo4j Admin server dep */
resolvers ++= Seq(
  "maven-central" at "http://repo1.maven.org/maven2",
  "neo4j-public-repository" at "http://m2.neo4j.org/content/groups/public")
17.4.2. Starting the Server from Java

The Neo4j server exposes a class called
WrappingNeoServerBootstrapper, which is capable of starting a Neo4j server in the same process as
your application. It uses an AbstractGraphDatabase instance that you provide.

This gives your application, among other things, the REST API, statistics gathering and the web
interface that comes with the server.

Usage example.

InternalAbstractGraphDatabase graphdb = getGraphDb();

WrappingNeoServerBootstrapper srv;
srv = new WrappingNeoServerBootstrapper( graphdb );
srv.start();
// The server is now running
// until we stop it:
srv.stop();

Once you have the server up and running, see Chapter 26, Web-based Neo4j graph database
administration tool and Chapter 18, REST API for how to use it!
17.4.3. Providing custom configuration

You can modify the server settings programmatically and,
within reason, the same settings are available to you here as those outlined in the server
configuration section.

The settings that are not available (or rather, that are ignored) are those that concern the
underlying database, such as database location and database configuration path.

Custom configuration example.

// let the database accept remote neo4j-shell connections
GraphDatabaseAPI graphdb = (GraphDatabaseAPI) new GraphDatabaseFactory()
    .newEmbeddedDatabaseBuilder( "target/configDb" )
    .setConfig( ShellSettings.remote_shell_enabled, GraphDatabaseSetting.TRUE )
    .newGraphDatabase();

ServerConfigurator config;
config = new ServerConfigurator( graphdb );
// let the server endpoint be on a custom port
config.configuration().setProperty( Configurator.WEBSERVER_PORT_PROPERTY_KEY, 7575 );

WrappingNeoServerBootstrapper srv;
srv = new WrappingNeoServerBootstrapper( graphdb, config );
srv.start();

17.5. Server performance tuning
17.5.1. Specifying Neo4j tuning properties
17.5.2. Specifying JVM tuning properties
At the heart of the Neo4j server is a regular Neo4j storage engine instance. That engine can be
tuned in the same way as the other embedded configurations, using the same file format. The only
difference is that the server must be told where to find the fine-tuning configuration.
Quick info
• The neo4j.properties file is a standard configuration file that databases load in order to
tune their memory use and caching strategies.
• See Section 21.4, "Caches in Neo4j" for more information.

17.5.1. Specifying Neo4j tuning properties


The conf/neo4j-server.properties file in the server distribution is the main
configuration file for the server. In this file we can specify a second properties file that contains the
database tuning settings (that is, the neo4j.properties file). This is done by setting a single
property to point to a valid neo4j.properties file:

org.neo4j.server.db.tuning.properties={neo4j.properties file}
On restarting the server the tuning enhancements specified in the neo4j.properties file will
be loaded and configured into the underlying database engine.
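As an illustration, a minimal neo4j.properties tuning file might contain memory-mapping settings such as the following. The property names are the standard 1.x store mapped-memory settings; the values here are invented placeholders, not recommendations:

```properties
# neo4j.properties -- database tuning settings (illustrative values only)
neostore.nodestore.db.mapped_memory=25M
neostore.relationshipstore.db.mapped_memory=50M
neostore.propertystore.db.mapped_memory=90M
```

See the caching chapter referenced above for guidance on choosing real values for your data set.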

17.5.2. Specifying JVM tuning properties


Tuning the standalone server is achieved by editing the neo4j-wrapper.conf file in the
conf directory of NEO4J_HOME.

Edit the following properties:

Table 17.1. neo4j-wrapper.conf JVM tuning properties

Property name                Meaning
wrapper.java.initmemory      initial heap size (in MB)
wrapper.java.maxmemory       maximum heap size (in MB)
wrapper.java.additional.N    additional literal JVM parameter, where N is a number for each

For more information on the tuning properties, see Section 21.6, "JVM Settings".

17.6. Server installation in cloud environments

Neo4j on the various cloud services is either set up by users themselves or provided by the
Neo Technology cloud. An introduction follows below.

17.7. Heroku
For a basic installation and setup, see the Heroku quick start tutorial.

To add Neo4j to your Heroku app, do the following:

heroku addons:add neo4j

第 18 章 REST API
The Neo4j REST API is designed with discoverability in mind, so that you can start with a GET
on the Section 18.1, "Service root" and from there discover URIs to perform other requests. The
examples below use concrete URIs; these are subject to change in the future, so to be future-proof,
discover URIs where possible instead of relying on the current layout. The default
representation is json, both for responses and for data sent with POST/PUT requests.

Below follows a listing of ways to interact with the REST API. For language bindings to the
REST API, see Chapter 5, Neo4j Remote Client Libraries.

To interact with the JSON interface you must explicitly set the request header
Accept:application/json for those requests that respond with data. You should also set the
header Content-Type:application/json if your request sends data, for example when you're
creating a relationship. The examples include the relevant request and response headers.

The server supports streaming results, with better performance and lower memory overhead. See
Section 18.2, "Streaming" for more information.

18.1. Service root


18.1.1. Get service root

18.1.1. Get service root

The service root is your starting point to discover the REST API. It contains the basic starting
points for the database, and some version and extension information. The reference_node entry
will only be present if there is a reference node set and that node actually exists in the database.

Figure 18.1. Final Graph
Example request

• GET http://localhost:7474/db/data/

• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

{
  "extensions" : {
  },
  "node" : "http://localhost:7474/db/data/node",
  "reference_node" : "http://localhost:7474/db/data/node/193",
  "node_index" : "http://localhost:7474/db/data/index/node",
  "relationship_index" : "http://localhost:7474/db/data/index/relationship",
  "extensions_info" : "http://localhost:7474/db/data/ext",
  "relationship_types" : "http://localhost:7474/db/data/relationship/types",
  "batch" : "http://localhost:7474/db/data/batch",
  "cypher" : "http://localhost:7474/db/data/cypher",
  "neo4j_version" : "1.8.M07-98-ge9ef235"
}

18.2. Streaming
The whole REST API can be transmitted as JSON streams, resulting in better performance and
lower memory overhead on the server side. To use it, adjust the request headers for every call, see the
example below for details.

Caution
This feature is new, and you should make yourself comfortable with the streamed
response style versus the non-streamed API where results are delivered in a single large
response. Expect future releases to have streaming enabled by default since it is a far more
efficient mechanism for both client and server.

Figure 18.2. Final Graph

Example request

• GET http://localhost:7474/db/data/

• Accept: application/json

• X-Stream: true

Example response

• 200: OK

• Content-Type: application/json; stream=true

{
  "extensions" : {
  },
  "node" : "http://localhost:7474/db/data/node",
  "reference_node" : "http://localhost:7474/db/data/node/195",
  "node_index" : "http://localhost:7474/db/data/index/node",
  "relationship_index" : "http://localhost:7474/db/data/index/relationship",
  "extensions_info" : "http://localhost:7474/db/data/ext",
  "relationship_types" : "http://localhost:7474/db/data/relationship/types",
  "batch" : "http://localhost:7474/db/data/batch",
  "cypher" : "http://localhost:7474/db/data/cypher",
  "neo4j_version" : "1.8.M07-98-ge9ef235"
}
18.3. Cypher queries
18.3.1. Send queries with parameters
18.3.2. Send a Query
18.3.3. Return paths
18.3.4. Nested results
18.3.5. Server errors
The Neo4j REST API allows querying with Cypher, see Chapter 15, Cypher Query Language. The
results are returned as a list of string headers (columns), and a data part, consisting of a list of all
rows, every row consisting of a list of REST representations of the field value: Node,
Relationship, Path or any simple value like String.

Tip

In order to speed up queries in repeated scenarios, try not to use literals; replace
them with parameters wherever possible in order to let the server cache query plans, see
Section 18.3.1, "Send queries with parameters" for details.
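As a sketch of what such a request body looks like when built programmatically (Python here purely for illustration; any HTTP client works), the literals move into the params map while the query string stays constant and therefore cacheable:

```python
import json

# Build the body of a parameterized Cypher POST request. The query text is
# constant -- only the "params" map changes between calls -- so the server
# can cache and reuse the query plan.
payload = {
    "query": "start x = node:node_auto_index(name={startName}) "
             "match path = (x-[r]-friend) where friend.name = {name} "
             "return TYPE(r)",
    "params": {"startName": "I", "name": "you"},
}

# This string is what gets sent with Content-Type: application/json
body = json.dumps(payload)
```

POSTing this body to the cypher endpoint with the headers shown in the examples yields the columns/data response described above.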

18.3.1. Send queries with parameters

Cypher supports queries with parameters which are submitted as a JSON map.

START x = node:node_auto_index(name={startName})
MATCH path = (x-[r]-friend)
WHERE friend.name = {name}
RETURN TYPE(r)

Figure 18.3. Final Graph

Example request

• POST http://localhost:7474/db/data/cypher

• Accept: application/json

• Content-Type: application/json

{
  "query" : "start x = node:node_auto_index(name={startName}) match path = (x-[r]-friend) where friend.name = {name} return TYPE(r)",
  "params" : {
    "startName" : "I",
    "name" : "you"
  }
}
Example response

• 200: OK

• Content-Type: application/json

{
  "columns" : [ "TYPE(r)" ],
  "data" : [ [ "know" ] ]
}

18.3.2. Send a Query

A simple query returning all nodes connected to node 10, returning the node and the name
property, if it exists, otherwise null:

START x = node(10)
MATCH x -[r]-> n
RETURN type(r), n.name?, n.age?

Figure 18.4. Final Graph

Example request

• POST http://localhost:7474/db/data/cypher

• Accept: application/json
• Content-Type: application/json

{
  "query" : "start x = node(10) match x -[r]-> n return type(r), n.name?, n.age?",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

{
  "columns" : [ "type(r)", "n.name?", "n.age?" ],
  "data" : [ [ "know", "him", 25 ], [ "know", "you", null ] ]
}

18.3.3. Return paths

Paths can be returned together with other return types by simply including them in the RETURN clause.

START x = node(17)
MATCH path = (x--friend)
RETURN path, friend.name

Figure 18.5. Final Graph

Example request

• POST http://localhost:7474/db/data/cypher

• Accept: application/json

• Content-Type: application/json

{
  "query" : "start x = node(17) match path = (x--friend) return path, friend.name",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

{
  "columns" : [ "path", "friend.name" ],
  "data" : [ [ {
    "start" : "http://localhost:7474/db/data/node/17",
    "nodes" : [ "http://localhost:7474/db/data/node/17", "http://localhost:7474/db/data/node/16" ],
    "length" : 1,
    "relationships" : [ "http://localhost:7474/db/data/relationship/9" ],
    "end" : "http://localhost:7474/db/data/node/16"
  }, "you" ] ]
}

18.3.4. Nested results

When sending queries that return nested results like list and maps, these will get serialized into
nested JSON representations according to their types.

START n = node(26,25)
RETURN collect(n.name)

Figure 18.6. Final Graph

Example request
• POST http://localhost:7474/db/data/cypher

• Accept: application/json

• Content-Type: application/json

{
  "query" : "start n = node(26,25) return collect(n.name)",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

{
  "columns" : [ "collect(n.name)" ],
  "data" : [ [ [ "I", "you" ] ] ]
}
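The nested shape can be consumed with any ordinary JSON decoder. A small Python sketch, using a response body of the same shape as the collect() example above:

```python
import json

# A nested Cypher result: nested lists in the result are serialized
# as nested JSON arrays.
response_body = '{ "columns" : [ "collect(n.name)" ], "data" : [ [ [ "I", "you" ] ] ] }'
result = json.loads(response_body)

# "data" is a list of rows; each row is a list of column values; here the
# single column value is itself the list built by collect().
names = result["data"][0][0]
print(names)  # ['I', 'you']
```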

18.3.5. Server errors

Errors on the server will be reported as a JSON-formatted stacktrace and message.

START x = node(15)
RETURN x.dummy

Figure 18.7. Final Graph

Example request

• POST http://localhost:7474/db/data/cypher

• Accept: application/json

• Content-Type: application/json

{
  "query" : "start x = node(15) return x.dummy",
  "params" : {
  }
}
Example response

• 400: Bad Request

• Content-Type: application/json

{
  "message" : "The property 'dummy' does not exist on Node[15]",
  "exception" : "BadInputException",
  "stacktrace" : [
    "org.neo4j.server.rest.repr.RepresentationExceptionHandlingIterable.exceptionOnHasNext(RepresentationExceptionHandlingIterable.java:51)",
    "org.neo4j.helpers.collection.ExceptionHandlingIterable$1.hasNext(ExceptionHandlingIterable.java:61)",
    "org.neo4j.helpers.collection.IteratorWrapper.hasNext(IteratorWrapper.java:42)",
    "org.neo4j.server.rest.repr.ListRepresentation.serialize(ListRepresentation.java:58)",
    "org.neo4j.server.rest.repr.Serializer.serialize(Serializer.java:75)",
    "org.neo4j.server.rest.repr.MappingSerializer.putList(MappingSerializer.java:61)",
    "org.neo4j.server.rest.repr.CypherResultRepresentation.serialize(CypherResultRepresentation.java:50)",
    "org.neo4j.server.rest.repr.MappingRepresentation.serialize(MappingRepresentation.java:42)",
    "org.neo4j.server.rest.repr.OutputFormat.format(OutputFormat.java:170)",
    "org.neo4j.server.rest.repr.OutputFormat.formatRepresentation(OutputFormat.java:120)",
    "org.neo4j.server.rest.repr.OutputFormat.response(OutputFormat.java:107)",
    "org.neo4j.server.rest.repr.OutputFormat.ok(OutputFormat.java:55)",
    "org.neo4j.server.rest.web.CypherService.cypher(CypherService.java:80)",
    "java.lang.reflect.Method.invoke(Method.java:597)" ]
}

18.4. Nodes

18.4.1. Create Node


Figure 18.8. Final Graph

Example request

• POST http://localhost:7474/db/data/node

• Accept: application/json

Example response

• 201: Created

• Content-Type: application/json

• Location: http://localhost:7474/db/data/node/26

{
  "extensions" : {
  },
  "paged_traverse" : "http://localhost:7474/db/data/node/26/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "outgoing_relationships" : "http://localhost:7474/db/data/node/26/relationships/out",
  "traverse" : "http://localhost:7474/db/data/node/26/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/26/relationships/all/{-list|&|types}",
  "all_relationships" : "http://localhost:7474/db/data/node/26/relationships/all",
  "property" : "http://localhost:7474/db/data/node/26/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/26",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/26/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/26/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/26/relationships/in",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/26/relationships/in/{-list|&|types}",
  "create_relationship" : "http://localhost:7474/db/data/node/26/relationships",
  "data" : {
  }
}

18.4.2. Create Node with properties

Figure 18.9. Final Graph

Example request

• POST http://localhost:7474/db/data/node

• Accept: application/json

• Content-Type: application/json

{
  "foo" : "bar"
}
Example response

• 201: Created

• Content-Length: 1108

• Content-Type: application/json

• Location: http://localhost:7474/db/data/node/27

{
  "extensions" : {
  },
  "paged_traverse" : "http://localhost:7474/db/data/node/27/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "outgoing_relationships" : "http://localhost:7474/db/data/node/27/relationships/out",
  "traverse" : "http://localhost:7474/db/data/node/27/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/27/relationships/all/{-list|&|types}",
  "all_relationships" : "http://localhost:7474/db/data/node/27/relationships/all",
  "property" : "http://localhost:7474/db/data/node/27/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/27",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/27/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/27/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/27/relationships/in",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/27/relationships/in/{-list|&|types}",
  "create_relationship" : "http://localhost:7474/db/data/node/27/relationships",
  "data" : {
    "foo" : "bar"
  }
}

18.4.3. Get node

Note that the response contains URI/templates for the available operations for getting properties
and relationships.

Figure 18.10. Final Graph

Example request

• GET http://localhost:7474/db/data/node/199

• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

{
  "extensions" : {
  },
  "paged_traverse" : "http://localhost:7474/db/data/node/199/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "outgoing_relationships" : "http://localhost:7474/db/data/node/199/relationships/out",
  "traverse" : "http://localhost:7474/db/data/node/199/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/199/relationships/all/{-list|&|types}",
  "all_relationships" : "http://localhost:7474/db/data/node/199/relationships/all",
  "property" : "http://localhost:7474/db/data/node/199/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/199",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/199/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/199/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/199/relationships/in",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/199/relationships/in/{-list|&|types}",
  "create_relationship" : "http://localhost:7474/db/data/node/199/relationships",
  "data" : {
  }
}

18.4.4. Get non-existent node

Figure 18.11. Final Graph

Example request

• GET http://localhost:7474/db/data/node/20300000

• Accept: application/json

Example response
• 404: Not Found

• Content-Type: application/json

{
  "message" : "Cannot find node with id [20300000] in database.",
  "exception" : "NodeNotFoundException",
  "stacktrace" : [
    "org.neo4j.server.rest.web.DatabaseActions.node(DatabaseActions.java:123)",
    "org.neo4j.server.rest.web.DatabaseActions.getNode(DatabaseActions.java:234)",
    "org.neo4j.server.rest.web.RestfulGraphDatabase.getNode(RestfulGraphDatabase.java:225)",
    "java.lang.reflect.Method.invoke(Method.java:597)" ]
}

18.4.5. Delete node

Figure 18.12. Final Graph

Example request

• DELETE http://localhost:7474/db/data/node/35

• Accept: application/json

Example response

• 204: No Content

18.4.6. Nodes with relationships can not be deleted

The relationships on a node have to be deleted before the node can be deleted.

Figure 18.13. Final Graph
Example request

• DELETE http://localhost:7474/db/data/node/36

• Accept: application/json

Example response

• 409: Conflict

• Content-Type: application/json

{
  "message" : "The node with id 36 cannot be deleted. Check that the node is orphaned before deletion.",
  "exception" : "OperationFailureException",
  "stacktrace" : [
    "org.neo4j.server.rest.web.DatabaseActions.deleteNode(DatabaseActions.java:255)",
    "org.neo4j.server.rest.web.RestfulGraphDatabase.deleteNode(RestfulGraphDatabase.java:239)",
    "java.lang.reflect.Method.invoke(Method.java:597)" ]
}

18.5. Relationships
Relationships are a first class citizen in the Neo4j REST API. They can be accessed either
stand-alone or through the nodes they are attached to.

The general pattern to get relationships from a node is:

GET http://localhost:7474/db/data/node/123/relationships/{dir}/{-list|&|types}
Where dir is one of all, in, out and types is an ampersand-separated list of types. See the
examples below for more information.
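The pattern can be filled in mechanically. The following Python sketch (the helper function is our own, purely illustrative, not part of any Neo4j client library) also percent-encodes the ampersand separator, which must appear as %26 inside a URL path:

```python
from urllib.parse import quote

def relationship_url(base, node_id, direction, types=()):
    """Build a relationship-listing URL for the pattern above.

    direction is one of "all", "in", "out"; types are joined with "&",
    which has to be percent-encoded as %26 inside the path.
    """
    url = "%s/node/%d/relationships/%s" % (base, node_id, direction)
    if types:
        url += "/" + quote("&".join(types), safe="")
    return url

print(relationship_url("http://localhost:7474/db/data", 123, "all",
                       ["LIKES", "HATES"]))
# http://localhost:7474/db/data/node/123/relationships/all/LIKES%26HATES
```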

18.5.1. Get Relationship by ID

Figure 18.14. Final Graph

Example request

• GET http://localhost:7474/db/data/relationship/20

• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

{
  "extensions" : {
  },
  "start" : "http://localhost:7474/db/data/node/44",
  "property" : "http://localhost:7474/db/data/relationship/20/properties/{key}",
  "self" : "http://localhost:7474/db/data/relationship/20",
  "properties" : "http://localhost:7474/db/data/relationship/20/properties",
  "type" : "know",
  "end" : "http://localhost:7474/db/data/node/43",
  "data" : {
  }
}

18.5.2. Create relationship

Upon successful creation of a relationship, the new relationship is returned.

Figure 18.15. Final Graph
Example request

• POST http://localhost:7474/db/data/node/66/relationships

• Accept: application/json

• Content-Type: application/json

{
  "to" : "http://localhost:7474/db/data/node/65",
  "type" : "LOVES"
}
Example response

• 201: Created

• Content-Type: application/json

• Location: http://localhost:7474/db/data/relationship/35

{
  "extensions" : {
  },
  "start" : "http://localhost:7474/db/data/node/66",
  "property" : "http://localhost:7474/db/data/relationship/35/properties/{key}",
  "self" : "http://localhost:7474/db/data/relationship/35",
  "properties" : "http://localhost:7474/db/data/relationship/35/properties",
  "type" : "LOVES",
  "end" : "http://localhost:7474/db/data/node/65",
  "data" : {
  }
}

18.5.3. Create a relationship with properties

Upon successful creation of a relationship, the new relationship is returned.

Figure 18.16. Starting Graph

Figure 18.17. Final Graph

Example request

• POST http://localhost:7474/db/data/node/68/relationships

• Accept: application/json

• Content-Type: application/json

{
  "to" : "http://localhost:7474/db/data/node/67",
  "type" : "LOVES",
  "data" : {
    "foo" : "bar"
  }
}
Example response

• 201: Created

• Content-Type: application/json

• Location: http://localhost:7474/db/data/relationship/37

{
  "extensions" : {
  },
  "start" : "http://localhost:7474/db/data/node/68",
  "property" : "http://localhost:7474/db/data/relationship/37/properties/{key}",
  "self" : "http://localhost:7474/db/data/relationship/37",
  "properties" : "http://localhost:7474/db/data/relationship/37/properties",
  "type" : "LOVES",
  "end" : "http://localhost:7474/db/data/node/67",
  "data" : {
    "foo" : "bar"
  }
}

18.5.4. Delete relationship

Figure 18.18. Starting Graph

Figure 18.19. Final Graph

Example request

• DELETE http://localhost:7474/db/data/relationship/26

• Accept: application/json

Example response

• 204: No Content

18.5.5. Get all properties on a relationship

Figure 18.20. Final Graph

Example request

• GET http://localhost:7474/db/data/relationship/30/properties

• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

{
  "since" : "1day",
  "cost" : "high"
}

18.5.6. Set all properties on a relationship

Figure 18.21. Starting Graph
Figure 18.22. Final Graph

Example request

• PUT http://localhost:7474/db/data/relationship/29/properties

• Accept: application/json

• Content-Type: application/json

{
  "happy" : false
}
Example response

• 204: No Content

18.5.7. Get single property on a relationship

Figure 18.23. Final Graph
Example request

• GET http://localhost:7474/db/data/relationship/27/properties/cost
• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

"high"

18.5.8. Set single property on a relationship

Figure 18.24. Starting Graph

Figure 18.25. Final Graph
Example request

• PUT http://localhost:7474/db/data/relationship/28/properties/cost
• Accept: application/json

• Content-Type: application/json

"deadly"
Example response

• 204: No Content

18.5.9. Get all relationships

Figure 18.26. Final Graph
Example request

• GET http://localhost:7474/db/data/node/25/relationships/all

• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

[ {
  "start" : "http://localhost:7474/db/data/node/25",
  "data" : {
  },
  "self" : "http://localhost:7474/db/data/relationship/9",
  "property" : "http://localhost:7474/db/data/relationship/9/properties/{key}",
  "properties" : "http://localhost:7474/db/data/relationship/9/properties",
  "type" : "LIKES",
  "extensions" : {
  },
  "end" : "http://localhost:7474/db/data/node/26"
}, {
  "start" : "http://localhost:7474/db/data/node/27",
  "data" : {
  },
  "self" : "http://localhost:7474/db/data/relationship/10",
  "property" : "http://localhost:7474/db/data/relationship/10/properties/{key}",
  "properties" : "http://localhost:7474/db/data/relationship/10/properties",
  "type" : "LIKES",
  "extensions" : {
  },
  "end" : "http://localhost:7474/db/data/node/25"
}, {
  "start" : "http://localhost:7474/db/data/node/25",
  "data" : {
  },
  "self" : "http://localhost:7474/db/data/relationship/11",
  "property" : "http://localhost:7474/db/data/relationship/11/properties/{key}",
  "properties" : "http://localhost:7474/db/data/relationship/11/properties",
  "type" : "HATES",
  "extensions" : {
  },
  "end" : "http://localhost:7474/db/data/node/28"
} ]

18.5.10. Get incoming relationships

Figure 18.27. Final Graph

Example request

• GET http://localhost:7474/db/data/node/35/relationships/in

• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

[ {
  "start" : "http://localhost:7474/db/data/node/37",
  "data" : {
  },
  "self" : "http://localhost:7474/db/data/relationship/16",
  "property" : "http://localhost:7474/db/data/relationship/16/properties/{key}",
  "properties" : "http://localhost:7474/db/data/relationship/16/properties",
  "type" : "LIKES",
  "extensions" : {
  },
  "end" : "http://localhost:7474/db/data/node/35"
} ]

18.5.11. Get outgoing relationships

Figure 18.28. Final Graph

Example request

• GET http://localhost:7474/db/data/node/40/relationships/out

• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

[ {
  "start" : "http://localhost:7474/db/data/node/40",
  "data" : {
  },
  "self" : "http://localhost:7474/db/data/relationship/18",
  "property" : "http://localhost:7474/db/data/relationship/18/properties/{key}",
  "properties" : "http://localhost:7474/db/data/relationship/18/properties",
  "type" : "LIKES",
  "extensions" : {
  },
  "end" : "http://localhost:7474/db/data/node/41"
}, {
  "start" : "http://localhost:7474/db/data/node/40",
  "data" : {
  },
  "self" : "http://localhost:7474/db/data/relationship/20",
  "property" : "http://localhost:7474/db/data/relationship/20/properties/{key}",
  "properties" : "http://localhost:7474/db/data/relationship/20/properties",
  "type" : "HATES",
  "extensions" : {
  },
  "end" : "http://localhost:7474/db/data/node/43"
} ]

18.5.12. Get typed relationships


Note that the "&" needs to be encoded like "%26" for example when using cURL from the
terminal.

Figure 18.29. Final Graph
Example request

• GET http://localhost:7474/db/data/node/45/relationships/all/LIKES&HATES
• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

[ {
  "start" : "http://localhost:7474/db/data/node/45",
  "data" : {
  },
  "self" : "http://localhost:7474/db/data/relationship/21",
  "property" : "http://localhost:7474/db/data/relationship/21/properties/{key}",
  "properties" : "http://localhost:7474/db/data/relationship/21/properties",
  "type" : "LIKES",
  "extensions" : {
  },
  "end" : "http://localhost:7474/db/data/node/46"
}, {
  "start" : "http://localhost:7474/db/data/node/47",
  "data" : {
  },
  "self" : "http://localhost:7474/db/data/relationship/22",
  "property" : "http://localhost:7474/db/data/relationship/22/properties/{key}",
  "properties" : "http://localhost:7474/db/data/relationship/22/properties",
  "type" : "LIKES",
  "extensions" : {
  },
  "end" : "http://localhost:7474/db/data/node/45"
}, {
  "start" : "http://localhost:7474/db/data/node/45",
  "data" : {
  },
  "self" : "http://localhost:7474/db/data/relationship/23",
  "property" : "http://localhost:7474/db/data/relationship/23/properties/{key}",
  "properties" : "http://localhost:7474/db/data/relationship/23/properties",
  "type" : "HATES",
  "extensions" : {
  },
  "end" : "http://localhost:7474/db/data/node/48"
} ]

18.5.13. Get relationships on a node without relationships

Figure 18.30. Final Graph
Example request

• GET http://localhost:7474/db/data/node/64/relationships/all

• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

[ ]

18.6. Relationship types


18.6.1. Get relationship types

18.6.1. Get relationship types

Figure 18.31. Final Graph
Example request

• GET http://localhost:7474/db/data/relationship/types

• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

[ "knows", "likes", "KNOWS", "foo", "bar", "know", "has" ]

18.7. Node properties

18.7.1. Set property on node

Setting different properties will retain the existing ones for this node. Note that a single value is submitted not as a map but just as a plain value (which is valid JSON), as in the example below.

图 18.32. Final Graph

Example request

• PUT http://localhost:7474/db/data/node/7/properties/foo

• Accept: application/json

• Content-Type: application/json

"bar"
Example response

• 204: No Content
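The request body above is the bare JSON string "bar"; in Python, for instance, `json.dumps` of a plain value produces exactly that payload (a minimal sketch):

```python
import json

# A single property value is submitted as a bare JSON value, not as a map.
body = json.dumps("bar")
print(body)  # "bar" -- the quotes are part of the payload
```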

18.7.2. Update node properties

This will replace all existing properties on the node with the new set of attributes.

图 18.33. Final Graph

Example request

• PUT http://localhost:7474/db/data/node/1/properties

• Accept: application/json

• Content-Type: application/json

{
  "age" : "18"
}
Example response

• 204: No Content

18.7.3. Get properties for node

图 18.34. Final Graph

Example request

• GET http://localhost:7474/db/data/node/4/properties

• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

{
  "foo" : "bar"
}

18.7.4. Property values can not be null


This example shows the response you get when trying to set a property to null.

图 18.35. Final Graph

Example request

• POST http://localhost:7474/db/data/node

• Accept: application/json

• Content-Type: application/json

{
  "foo" : null
}
Example response

• 400: Bad Request

• Content-Type: application/json

{
  "message" : "Could not set property \"foo\", unsupported type: null",
  "exception" : "PropertyValueException",
  "stacktrace" : [ "org.neo4j.server.rest.web.DatabaseActions.set(DatabaseActions.java:155)", "org.neo4j.server.rest.web.DatabaseActions.createNode(DatabaseActions.java:213)", "org.neo4j.server.rest.web.RestfulGraphDatabase.createNode(RestfulGraphDatabase.java:195)", "java.lang.reflect.Method.invoke(Method.java:597)" ]
}
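A hedged client-side sketch (the helper name is made up here, not part of the API): dropping null-valued keys before creating a node avoids this 400 response.

```python
import json

# Hypothetical client-side guard: the server rejects null property values
# with 400 Bad Request, so strip them before building the request body.
def clean_properties(props):
    return {k: v for k, v in props.items() if v is not None}

body = json.dumps(clean_properties({"foo": None, "name": "bar"}))
print(body)  # {"name": "bar"}
```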

18.7.5. Property values can not be nested

Nesting properties is not supported. You could for example store the nested JSON as a string
instead.
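One way to apply the workaround mentioned above — hypothetical client code, not part of the API — is to serialize the nested part to a string before sending:

```python
import json

# Nested maps are rejected as property values; serialize the nested part
# to a JSON string and store that string instead.
nested = {"bar": "baz"}
body = json.dumps({"foo": json.dumps(nested)})
print(body)

# Reading it back, the client parses the stored string again.
restored = json.loads(json.loads(body)["foo"])
```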

图 18.36. Final Graph


Example request
• POST http://localhost:7474/db/data/node/

• Accept: application/json

• Content-Type: application/json

{
  "foo" : {
    "bar" : "baz"
  }
}
Example response

• 400: Bad Request

• Content-Type: application/json

{
  "message" : "Could not set property \"foo\", unsupported type: {bar=baz}",
  "exception" : "PropertyValueException",
  "stacktrace" : [ "org.neo4j.server.rest.web.DatabaseActions.set(DatabaseActions.java:155)", "org.neo4j.server.rest.web.DatabaseActions.createNode(DatabaseActions.java:213)", "org.neo4j.server.rest.web.RestfulGraphDatabase.createNode(RestfulGraphDatabase.java:195)", "java.lang.reflect.Method.invoke(Method.java:597)" ]
}

18.7.6. Delete all properties from node

图 18.37. Final Graph

Example request

• DELETE http://localhost:7474/db/data/node/5/properties

• Accept: application/json

Example response

• 204: No Content

18.7.7. Delete a named property from a node

To delete a single property from a node, see the example below.

图 18.38. Starting Graph

图 18.39. Final Graph

Example request

• DELETE http://localhost:7474/db/data/node/6/properties/name

• Accept: application/json

Example response

• 204: No Content

18.8. Relationship properties

18.8.1. Update relationship properties

图 18.40. Final Graph

Example request
• PUT http://localhost:7474/db/data/relationship/131/properties

• Accept: application/json

• Content-Type: application/json

{
  "jim" : "tobias"
}
Example response

• 204: No Content

18.8.2. Remove properties from a relationship

图 18.41. Final Graph

Example request

• DELETE http://localhost:7474/db/data/relationship/19/properties

• Accept: application/json

Example response

• 204: No Content

18.8.3. Remove property from a relationship

See the example request below.

图 18.42. Starting Graph

图 18.43. Final Graph

Example request

• DELETE http://localhost:7474/db/data/relationship/21/properties/cost
• Accept: application/json

Example response

• 204: No Content

18.8.4. Remove non-existent property from a relationship

Attempting to remove a property that doesn’t exist results in an error.

图 18.44. Final Graph

Example request

• DELETE http://localhost:7474/db/data/relationship/22/properties/non-existent
• Accept: application/json

Example response

• 404: Not Found

• Content-Type: application/json

{
  "message" : "Relationship[22] does not have a property \"non-existent\"",
  "exception" : "NoSuchPropertyException",
  "stacktrace" : [ "org.neo4j.server.rest.web.DatabaseActions.removeRelationshipProperty(DatabaseActions.java:729)", "org.neo4j.server.rest.web.RestfulGraphDatabase.deleteRelationshipProperty(RestfulGraphDatabase.java:595)", "java.lang.reflect.Method.invoke(Method.java:597)" ]
}

18.8.5. Remove properties from a non-existing relationship

Attempting to remove all properties from a relationship which doesn’t exist results in an error.

图 18.45. Final Graph

Example request

• DELETE http://localhost:7474/db/data/relationship/1234/properties
• Accept: application/json

Example response

• 404: Not Found

• Content-Type: application/json

{
  "exception" : "RelationshipNotFoundException",
  "stacktrace" : [ "org.neo4j.server.rest.web.DatabaseActions.relationship(DatabaseActions.java:137)", "org.neo4j.server.rest.web.DatabaseActions.removeAllRelationshipProperties(DatabaseActions.java:707)", "org.neo4j.server.rest.web.RestfulGraphDatabase.deleteAllRelationshipProperties(RestfulGraphDatabase.java:579)", "java.lang.reflect.Method.invoke(Method.java:597)" ]
}

18.8.6. Remove property from a non-existing relationship

Attempting to remove a property from a relationship which doesn’t exist results in an error.

图 18.46. Final Graph

Example request

• DELETE http://localhost:7474/db/data/relationship/1234/properties/cost
• Accept: application/json

Example response

• 404: Not Found

• Content-Type: application/json

{
  "exception" : "RelationshipNotFoundException",
  "stacktrace" : [ "org.neo4j.server.rest.web.DatabaseActions.relationship(DatabaseActions.java:137)", "org.neo4j.server.rest.web.DatabaseActions.removeRelationshipProperty(DatabaseActions.java:723)", "org.neo4j.server.rest.web.RestfulGraphDatabase.deleteRelationshipProperty(RestfulGraphDatabase.java:595)", "java.lang.reflect.Method.invoke(Method.java:597)" ]
}

18.9. Indexes
An index can contain either nodes or relationships.

注意

To create an index with default configuration, simply start using it by adding nodes/relationships to it. It will then be automatically created for you.

What default configuration means depends on how you have configured your database. If you haven’t changed any indexing configuration, it means the indexes will be using a Lucene-based backend.

All the examples below show you how to do operations on node indexes, but all of them are just as applicable to relationship indexes. Simply change the "node" part of the URL to "relationship".

If you want to customize the index settings, see 第 18.9.2 节 “Create node index with
configuration”.

18.9.1. Create node index


注意

Instead of creating the index this way, you can simply start to use it, and it will be
created automatically with default configuration.

图 18.47. Final Graph


Example request
• POST http://localhost:7474/db/data/index/node/

• Accept: application/json

• Content-Type: application/json

{
  "name" : "favorites"
}
Example response

• 201: Created

• Content-Type: application/json

• Location: http://localhost:7474/db/data/index/node/favorites/

{
  "template" : "http://localhost:7474/db/data/index/node/favorites/{key}/{value}"
}

18.9.2. Create node index with configuration

This request is only necessary if you want to customize the index settings. If you are happy with
the defaults, you can just start indexing nodes/relationships, as non-existent indexes will
automatically be created as you do. See 第 14.10 节 “Configuration and fulltext indexes” for more
information on index configuration.

图 18.48. Final Graph

Example request

• POST http://localhost:7474/db/data/index/node/

• Accept: application/json

• Content-Type: application/json

{
  "name" : "fulltext",
  "config" : {
    "type" : "fulltext",
    "provider" : "lucene"
  }
}
Example response

• 201: Created

• Content-Type: application/json

• Location: http://localhost:7474/db/data/index/node/fulltext/

{
  "template" : "http://localhost:7474/db/data/index/node/fulltext/{key}/{value}",
  "type" : "fulltext",
  "provider" : "lucene"
}

18.9.3. Delete node index

图 18.49. Final Graph

Example request

• DELETE http://localhost:7474/db/data/index/node/kvnode

• Accept: application/json

Example response

• 204: No Content

18.9.4. List node indexes

图 18.50. Final Graph

Example request

• GET http://localhost:7474/db/data/index/node/

• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

{
  "node_auto_index" : {
    "template" : "http://localhost:7474/db/data/index/node/node_auto_index/{key}/{value}",
    "provider" : "lucene",
    "type" : "exact"
  },
  "favorites" : {
    "template" : "http://localhost:7474/db/data/index/node/favorites/{key}/{value}",
    "provider" : "lucene",
    "type" : "exact"
  }
}

18.9.5. Add node to index

Associates a node with the given key/value pair in the given index.

注意

Spaces in the URI have to be encoded as %20.

小心

This does not overwrite previous entries. If you index the same key/value/item
combination twice, two index entries are created. To do update-type operations, you need to
delete the old entry before adding a new one.
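A sketch of the delete-then-add sequence described above, using Python's urllib (hypothetical URLs and values; the Request objects are only built here, never sent):

```python
import json
import urllib.request

base = "http://localhost:7474/db/data/index/node/favorites"

# Step 1: remove the stale entry (node 99 under some-key/old%20value);
# POSTing a new entry does not overwrite an old one by itself.
delete_old = urllib.request.Request(
    base + "/some-key/old%20value/99", method="DELETE")

# Step 2: add the replacement entry.
add_new = urllib.request.Request(
    base,
    data=json.dumps({
        "key": "some-key",
        "value": "new value",
        "uri": "http://localhost:7474/db/data/node/99",
    }).encode(),
    headers={"Content-Type": "application/json"},
    method="POST")

# Only constructed, not sent.
print(delete_old.get_method(), delete_old.full_url)
print(add_new.get_method(), add_new.full_url)
```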

图 18.51. Final Graph

Example request

• POST http://localhost:7474/db/data/index/node/favorites

• Accept: application/json

• Content-Type: application/json

{
  "value" : "some value",
  "uri" : "http://localhost:7474/db/data/node/99",
  "key" : "some-key"
}
Example response

• 201: Created

• Content-Type: application/json

• Location: http://localhost:7474/db/data/index/node/favorites/some-key/some%20value/99

{
  "extensions" : {
  },
  "paged_traverse" : "http://localhost:7474/db/data/node/99/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "outgoing_relationships" : "http://localhost:7474/db/data/node/99/relationships/out",
  "traverse" : "http://localhost:7474/db/data/node/99/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/99/relationships/all/{-list|&|types}",
  "all_relationships" : "http://localhost:7474/db/data/node/99/relationships/all",
  "property" : "http://localhost:7474/db/data/node/99/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/99",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/99/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/99/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/99/relationships/in",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/99/relationships/in/{-list|&|types}",
  "create_relationship" : "http://localhost:7474/db/data/node/99/relationships",
  "data" : {
  },
  "indexed" : "http://localhost:7474/db/data/index/node/favorites/some-key/some%20value/99"
}

18.9.6. Remove all entries with a given node from an index

图 18.52. Final Graph

Example request

• DELETE http://localhost:7474/db/data/index/node/kvnode/109

• Accept: application/json

Example response

• 204: No Content

18.9.7. Remove all entries with a given node and key from an index

图 18.53. Final Graph

Example request

• DELETE http://localhost:7474/db/data/index/node/kvnode/kvkey2/110
• Accept: application/json

Example response

• 204: No Content

18.9.8. Remove all entries with a given node, key and value from an index

图 18.54. Final Graph

Example request

• DELETE
http://localhost:7474/db/data/index/node/kvnode/kvkey1/value1/111
• Accept: application/json

Example response

• 204: No Content

18.9.9. Find node by exact match


注意

Spaces in the URI have to be encoded as %20.

图 18.55. Final Graph

Example request

• GET http://localhost:7474/db/data/index/node/favorites/key/the%2520value
• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

[ {
  "indexed" : "http://localhost:7474/db/data/index/node/favorites/key/the%2520value/100",
  "outgoing_relationships" : "http://localhost:7474/db/data/node/100/relationships/out",
  "data" : {
  },
  "traverse" : "http://localhost:7474/db/data/node/100/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/100/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/100/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/100",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/100/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/100/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/100/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/100/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/100/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/100/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/100/relationships/in/{-list|&|types}"
} ]

18.9.10. Find node by query

The query language used here depends on what type of index you are querying. The default index type is Lucene, in which case you should use the Lucene query language. Below is an example of a fuzzy search over multiple keys.

See: http://lucene.apache.org/java/3_5_0/queryparsersyntax.html

Getting the results with a predefined ordering requires adding the parameter

order=ordering

where ordering is one of index, relevance or score. In this case an additional field will be added to each result, named score, that holds the float value that is the score reported by the query result.

图 18.56. Final Graph

Example request

• GET http://localhost:7474/db/data/index/node/bobTheIndex?query=Name:Build~0.1%20AND%20Gender:Male
• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/101/relationships/out",
  "data" : {
    "Name" : "Builder"
  },
  "traverse" : "http://localhost:7474/db/data/node/101/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/101/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/101/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/101",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/101/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/101/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/101/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/101/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/101/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/101/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/101/relationships/in/{-list|&|types}"
} ]

18.10. Unique Indexes

18.10.1. Create a unique node in an index

图 18.57. Final Graph

Example request

• POST http://localhost:7474/db/data/index/node/people?unique

• Accept: application/json

• Content-Type: application/json

{
  "key" : "name",
  "value" : "Tobias",
  "properties" : {
    "name" : "Tobias",
    "sequence" : 1
  }
}
Example response

• 201: Created

• Content-Type: application/json

• Location: http://localhost:7474/db/data/index/node/people/name/Tobias/112

{
  "extensions" : {
  },
  "paged_traverse" : "http://localhost:7474/db/data/node/112/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "outgoing_relationships" : "http://localhost:7474/db/data/node/112/relationships/out",
  "traverse" : "http://localhost:7474/db/data/node/112/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/112/relationships/all/{-list|&|types}",
  "all_relationships" : "http://localhost:7474/db/data/node/112/relationships/all",
  "property" : "http://localhost:7474/db/data/node/112/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/112",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/112/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/112/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/112/relationships/in",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/112/relationships/in/{-list|&|types}",
  "create_relationship" : "http://localhost:7474/db/data/node/112/relationships",
  "data" : {
    "sequence" : 1,
    "name" : "Tobias"
  },
  "indexed" : "http://localhost:7474/db/data/index/node/people/name/Tobias/112"
}

18.10.2. Create a unique node in an index (the case where it exists)

图 18.58. Final Graph

Example request

• POST http://localhost:7474/db/data/index/node/people?unique

• Accept: application/json

• Content-Type: application/json

{
  "key" : "name",
  "value" : "Peter",
  "properties" : {
    "name" : "Peter",
    "sequence" : 2
  }
}
Example response

• 200: OK

• Content-Type: application/json

{
  "extensions" : {
  },
  "paged_traverse" : "http://localhost:7474/db/data/node/113/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "outgoing_relationships" : "http://localhost:7474/db/data/node/113/relationships/out",
  "traverse" : "http://localhost:7474/db/data/node/113/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/113/relationships/all/{-list|&|types}",
  "all_relationships" : "http://localhost:7474/db/data/node/113/relationships/all",
  "property" : "http://localhost:7474/db/data/node/113/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/113",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/113/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/113/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/113/relationships/in",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/113/relationships/in/{-list|&|types}",
  "create_relationship" : "http://localhost:7474/db/data/node/113/relationships",
  "data" : {
    "sequence" : 1,
    "name" : "Peter"
  },
  "indexed" : "http://localhost:7474/db/data/index/node/people/name/Peter/113"
}
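A sketch of this get-or-create request in Python (the Request object is only constructed here, not sent): a 201 response means the node was created, while a 200 response, as above, means a node already existed for the mapping.

```python
import json
import urllib.request

# Build the unique-creation request; the response status distinguishes the
# two outcomes: 201 Created (new node) vs. 200 OK (already existed).
req = urllib.request.Request(
    "http://localhost:7474/db/data/index/node/people?unique",
    data=json.dumps({
        "key": "name",
        "value": "Peter",
        "properties": {"name": "Peter", "sequence": 2},
    }).encode(),
    headers={"Content-Type": "application/json", "Accept": "application/json"},
    method="POST")

print(req.get_method(), req.full_url)
```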

18.10.3. Add a node to an index unless a node already exists for the given mapping

图 18.59. Final Graph

Example request

• POST http://localhost:7474/db/data/index/node/people?unique

• Accept: application/json

• Content-Type: application/json

{
  "key" : "name",
  "value" : "Mattias",
  "uri" : "http://localhost:7474/db/data/node/114"
}
Example response

• 201: Created

• Content-Type: application/json

• Location: http://localhost:7474/db/data/index/node/people/name/Mattias/114

{
  "extensions" : {
  },
  "paged_traverse" : "http://localhost:7474/db/data/node/114/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "outgoing_relationships" : "http://localhost:7474/db/data/node/114/relationships/out",
  "traverse" : "http://localhost:7474/db/data/node/114/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/114/relationships/all/{-list|&|types}",
  "all_relationships" : "http://localhost:7474/db/data/node/114/relationships/all",
  "property" : "http://localhost:7474/db/data/node/114/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/114",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/114/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/114/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/114/relationships/in",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/114/relationships/in/{-list|&|types}",
  "create_relationship" : "http://localhost:7474/db/data/node/114/relationships",
  "data" : {
  },
  "indexed" : "http://localhost:7474/db/data/index/node/people/name/Mattias/114"
}

18.10.4. Create a unique relationship in an index

图 18.60. Final Graph

Example request

• POST http://localhost:7474/db/data/index/relationship/knowledge/?unique
• Accept: application/json

• Content-Type: application/json

{
  "key" : "name",
  "value" : "Tobias",
  "start" : "http://localhost:7474/db/data/node/61",
  "end" : "http://localhost:7474/db/data/node/62",
  "type" : "knowledge"
}
Example response

• 201: Created

• Content-Type: application/json

• Location: http://localhost:7474/db/data/index/relationship/knowledge/name/Tobias/32

{
  "extensions" : {
  },
  "start" : "http://localhost:7474/db/data/node/61",
  "property" : "http://localhost:7474/db/data/relationship/32/properties/{key}",
  "self" : "http://localhost:7474/db/data/relationship/32",
  "properties" : "http://localhost:7474/db/data/relationship/32/properties",
  "type" : "knowledge",
  "end" : "http://localhost:7474/db/data/node/62",
  "data" : {
    "name" : "Tobias"
  },
  "indexed" : "http://localhost:7474/db/data/index/relationship/knowledge/name/Tobias/32"
}

18.10.5. Add a relationship to an index unless a relationship already exists for the given mapping

图 18.61. Final Graph

Example request

• POST http://localhost:7474/db/data/index/relationship/knowledge/?unique
• Accept: application/json

• Content-Type: application/json

{
  "key" : "name",
  "value" : "Mattias",
  "uri" : "http://localhost:7474/db/data/relationship/33"
}
Example response

• 201: Created

• Content-Type: application/json

• Location: http://localhost:7474/db/data/index/relationship/knowledge/name/Mattias/33

{
  "extensions" : {
  },
  "start" : "http://localhost:7474/db/data/node/63",
  "property" : "http://localhost:7474/db/data/relationship/33/properties/{key}",
  "self" : "http://localhost:7474/db/data/relationship/33",
  "properties" : "http://localhost:7474/db/data/relationship/33/properties",
  "type" : "knowledge",
  "end" : "http://localhost:7474/db/data/node/64",
  "data" : {
  },
  "indexed" : "http://localhost:7474/db/data/index/relationship/knowledge/name/Mattias/33"
}

18.11. Automatic Indexes


18.11.1. Find node by exact match from an automatic index
18.11.2. Find node by query from an automatic index
To enable automatic indexes in Neo4j, the database has to be set up for it; see 第 14.12.1 节 “Configuration”. With this feature enabled, you can then index and query nodes in these indexes.

18.11.1. Find node by exact match from an automatic index

Automatic index nodes can be found via exact lookups with normal Index REST syntax.

图 18.62. Final Graph

Example request

• GET http://localhost:7474/db/data/index/auto/node/name/I

• Accept: application/json

Example response
• 200: OK

• Content-Type: application/json

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/14/relationships/out",
  "data" : {
    "name" : "I"
  },
  "traverse" : "http://localhost:7474/db/data/node/14/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/14/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/14/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/14",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/14/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/14/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/14/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/14/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/14/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/14/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/14/relationships/in/{-list|&|types}"
} ]

18.11.2. Find node by query from an automatic index

See Find node by query for the actual query syntax.

图 18.63. Final Graph

Example request

• GET http://localhost:7474/db/data/index/auto/node/?query=name:I
• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/13/relationships/out",
  "data" : {
    "name" : "I"
  },
  "traverse" : "http://localhost:7474/db/data/node/13/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/13/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/13/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/13",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/13/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/13/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/13/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/13/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/13/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/13/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/13/relationships/in/{-list|&|types}"
} ]
18.12. Configurable Automatic Indexing
18.12.1. Create an auto index for nodes with specific configuration
18.12.2. Create an auto index for relationships with specific configuration
18.12.3. Get current status for autoindexing on nodes
18.12.4. Enable node autoindexing
18.12.5. Lookup list of properties being autoindexed
18.12.6. Add a property for autoindexing on nodes
18.12.7. Remove a property for autoindexing on nodes
Out of the box, auto-indexing supports exact matches, since auto-indexes are created with the default configuration (see 第 14.12 节 “Automatic Indexing”) the first time you access them. However, it is possible to intervene in the lifecycle of the server before any auto-indexes are created to change their configuration.

警告

This approach cannot be used on databases that already have auto-indexes established. To change the auto-index configuration, the existing indexes would have to be deleted first, so be careful!
小心

This technique works, but it is not particularly pleasant. Future versions of Neo4j may
remove this loophole in favour of a better structured feature for managing auto-indexing
configurations.

Auto-indexing must be enabled through configuration before auto-indexes can be created or configured. First, ensure that you’ve added some config like this to your server’s neo4j.properties file:

node_auto_indexing=true
relationship_auto_indexing=true
node_keys_indexable=name,phone
relationship_keys_indexable=since
The node_auto_indexing and relationship_auto_indexing settings turn
auto-indexing on for nodes and relationships respectively. The node_keys_indexable key allows
you to specify a comma-separated list of node property keys to be indexed. The
relationship_keys_indexable does the same for relationship property keys.

Next start the server as usual by invoking the start script as described in 第 17.1 节 “服务器安装”.
Next we have to pre-empt the creation of an auto-index, by telling the server to create an
apparently manual index which has the same name as the node (or relationship) auto-index. For
example, in this case we’ll create a node auto index whose name is node_auto_index, like so:

18.12.1. Create an auto index for nodes with specific configuration

Example request

• POST http://localhost:7474/db/data/index/node/

• Accept: application/json

• Content-Type: application/json

{
  "name" : "node_auto_index",
  "config" : {
    "type" : "fulltext",
    "provider" : "lucene"
  }
}
Example response

• 201: Created

• Content-Type: application/json

• Location:
http://localhost:7474/db/data/index/node/node_auto_index/

{
  "template" : "http://localhost:7474/db/data/index/node/node_auto_index/{key}/{value}",
  "type" : "fulltext",
  "provider" : "lucene"
}
If you require configured auto-indexes for relationships, the approach is similar:

18.12.2. Create an auto index for relationships with specific

configuration

Example request

• POST http://localhost:7474/db/data/index/relationship/

• Accept: application/json

• Content-Type: application/json

{
  "name" : "relationship_auto_index",
  "config" : {
    "type" : "fulltext",
    "provider" : "lucene"
  }
}
Example response

• 201: Created

• Content-Type: application/json

• Location:
http://localhost:7474/db/data/index/relationship/relationship_auto_
index/

{
  "template" : "http://localhost:7474/db/data/index/relationship/relationship_auto_index/{key}/{value}",
  "type" : "fulltext",
  "provider" : "lucene"
}
In case you’re curious how this works: on the server side it triggers the creation of an index which happens to have the same name as the auto-index that the database would create for itself. Now when we interact with the database, it thinks the index is already created, so the state machine skips over that step and just gets on with normal day-to-day auto-indexing.

小心

You have to do this early in your server lifecycle, before any normal auto indexes are
created.
There are a few REST calls providing an interface to the AutoIndexer component. The following calls work for both nodes and relationships; simply change the respective part of the URL.

18.12.3. Get current status for autoindexing on nodes

图 18.64. Final Graph

Example request

• GET http://localhost:7474/db/data/index/auto/node/status

• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

false

18.12.4. Enable node autoindexing

图 18.65. Final Graph


Example request
• PUT http://localhost:7474/db/data/index/auto/node/status

• Accept: application/json

• Content-Type: application/json

true
Example response

• 204: No Content

18.12.5. Lookup list of properties being autoindexed

图 18.66. Final Graph

Example request

• GET http://localhost:7474/db/data/index/auto/node/properties

• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

[ "some-property" ]

18.12.6. Add a property for autoindexing on nodes

图 18.67. Final Graph

Example request

• POST http://localhost:7474/db/data/index/auto/node/properties

• Accept: application/json

• Content-Type: application/json

myProperty1
Example response

• 204: No Content

18.12.7. Remove a property for autoindexing on nodes

图 18.68. Final Graph

Example request

• DELETE http://localhost:7474/db/data/index/auto/node/properties/myProperty1
• Accept: application/json

Example response

• 204: No Content

18.13. Traversals
警告

The Traversal REST Endpoint executes arbitrary Groovy code under the hood as part of the evaluator definitions. In hosted and open environments, this can constitute a security risk. In these cases, consider using declarative approaches like 第 15 章 Cypher 查询语言, write your own server side plugin executing the interesting traversals with the Java API (see 第 10.1 节 “服务器插件”), or secure your server, see 第 24.1 节 “安全访问 Neo4j 服务器”.

Traversals are performed from a start node. The traversal is controlled by the URI and the body
sent with the request.

returnType
The kind of objects in the response is determined by traverse/{returnType} in the URL.
returnType can have one of these values:

• node
• relationship
• path: contains full representations of start and end node, the rest are URIs.
• fullpath: contains full representations of all nodes and relationships.

To decide how the graph should be traversed you can use these parameters in the request body:

order
Decides in which order to visit nodes. Possible values:

• breadth_first: see Breadth-first search.
• depth_first: see Depth-first search.

relationships
Decides which relationship types and directions should be followed. The direction can be
one of:

• all
• in
• out

uniqueness
Decides how uniqueness should be calculated. For details on different uniqueness values
see the Java API on Uniqueness. Possible values:

• node_global
• none
• relationship_global
• node_path
• relationship_path
prune_evaluator
Decides whether the traverser should continue down a path or prune it, so that the
traverser won't continue down that path. You can write your own prune evaluator
(see Section 18.13.1, "Traversal using a return filter") or use the built-in
none prune evaluator.
return_filter
Decides whether the current position should be included in the result. You can provide
your own code for this (see Section 18.13.1, "Traversal using a return filter"), or use one of the
built-in filters:

• all
• all_but_start_node

max_depth
Is a short-hand way of specifying a prune evaluator which prunes after a certain
depth. If not specified, a max depth of 1 is used; if a prune_evaluator is specified
instead of a max_depth, no max depth limit is set.
The position object in the body of the return_filter and prune_evaluator is a Path
object representing the path from the start node to the current traversal position.

Out of the box, the REST API supports JavaScript code in filters and evaluators. The script body
will be executed in a Java context which has access to the full Neo4j Java API. See the examples for
the exact syntax of the request.
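The request-body parameters described above compose into a single JSON document. A minimal sketch of assembling one in Python follows; the helper function is ours, not part of the REST API, and only the keys the caller supplies are emitted, so the server defaults described above (such as a max depth of 1) still apply for the rest.

```python
import json

def traversal_description(order="breadth_first", relationships=None,
                          uniqueness="node_global", max_depth=None,
                          return_filter=None, prune_evaluator=None):
    """Assemble the JSON body for POST /db/data/node/{id}/traverse/{returnType}."""
    body = {"order": order, "uniqueness": uniqueness}
    if relationships:
        body["relationships"] = relationships
    if max_depth is not None:
        body["max_depth"] = max_depth
    if return_filter:
        body["return_filter"] = return_filter
    if prune_evaluator:
        body["prune_evaluator"] = prune_evaluator
    return json.dumps(body)

# Rebuild the request body of the first traversal example in this section:
payload = traversal_description(
    order="breadth_first",
    relationships=[{"direction": "all", "type": "knows"},
                   {"direction": "all", "type": "loves"}],
    uniqueness="node_global",
    max_depth=3,
    return_filter={"language": "javascript",
                   "body": "position.endNode().getProperty('name')"
                           ".toLowerCase().contains('t')"})
```

The resulting string is what gets POSTed with `Content-Type: application/json`.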

18.13.1. Traversal using a return filter


In this example, the none prune evaluator is used and a return filter is supplied in order to return
all names containing "t". The result is to be returned as nodes and the max depth is set to 3.

Figure 18.69. Final Graph

Example request

• POST http://localhost:7474/db/data/node/13/traverse/node

• Accept: application/json

• Content-Type: application/json

{
  "order" : "breadth_first",
  "return_filter" : {
    "body" : "position.endNode().getProperty('name').toLowerCase().contains('t')",
    "language" : "javascript"
  },
  "prune_evaluator" : {
    "body" : "position.length() > 10",
    "language" : "javascript"
  },
  "uniqueness" : "node_global",
  "relationships" : [ {
    "direction" : "all",
    "type" : "knows"
  }, {
    "direction" : "all",
    "type" : "loves"
  } ],
  "max_depth" : 3
}
Example response

• 200: OK

• Content-Type: application/json

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/13/relationships/out",
  "data" : {
    "name" : "Root"
  },
  "traverse" : "http://localhost:7474/db/data/node/13/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/13/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/13/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/13",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/13/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/13/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/13/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/13/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/13/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/13/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/13/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/16/relationships/out",
  "data" : {
    "name" : "Mattias"
  },
  "traverse" : "http://localhost:7474/db/data/node/16/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/16/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/16/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/16",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/16/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/16/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/16/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/16/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/16/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/16/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/16/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/15/relationships/out",
  "data" : {
    "name" : "Peter"
  },
  "traverse" : "http://localhost:7474/db/data/node/15/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/15/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/15/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/15",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/15/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/15/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/15/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/15/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/15/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/15/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/15/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/14/relationships/out",
  "data" : {
    "name" : "Tobias"
  },
  "traverse" : "http://localhost:7474/db/data/node/14/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/14/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/14/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/14",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/14/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/14/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/14/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/14/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/14/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/14/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/14/relationships/in/{-list|&|types}"
} ]

18.13.2. Return relationships from a traversal

Figure 18.70. Final Graph

Example request

• POST http://localhost:7474/db/data/node/6/traverse/relationship
• Accept: application/json

• Content-Type: application/json

{
  "order" : "breadth_first",
  "uniqueness" : "none",
  "return_filter" : {
    "language" : "builtin",
    "name" : "all"
  }
}
Example response

• 200: OK

• Content-Type: application/json

[ {
  "start" : "http://localhost:7474/db/data/node/6",
  "data" : {
  },
  "self" : "http://localhost:7474/db/data/relationship/1",
  "property" : "http://localhost:7474/db/data/relationship/1/properties/{key}",
  "properties" : "http://localhost:7474/db/data/relationship/1/properties",
  "type" : "know",
  "extensions" : {
  },
  "end" : "http://localhost:7474/db/data/node/5"
}, {
  "start" : "http://localhost:7474/db/data/node/6",
  "data" : {
  },
  "self" : "http://localhost:7474/db/data/relationship/2",
  "property" : "http://localhost:7474/db/data/relationship/2/properties/{key}",
  "properties" : "http://localhost:7474/db/data/relationship/2/properties",
  "type" : "own",
  "extensions" : {
  },
  "end" : "http://localhost:7474/db/data/node/4"
} ]

18.13.3. Return paths from a traversal

Figure 18.71. Final Graph

Example request

• POST http://localhost:7474/db/data/node/9/traverse/path

• Accept: application/json

• Content-Type: application/json

{
  "order" : "breadth_first",
  "uniqueness" : "none",
  "return_filter" : {
    "language" : "builtin",
    "name" : "all"
  }
}
Example response

• 200: OK

• Content-Type: application/json

[ {
  "start" : "http://localhost:7474/db/data/node/9",
  "nodes" : [ "http://localhost:7474/db/data/node/9" ],
  "length" : 0,
  "relationships" : [ ],
  "end" : "http://localhost:7474/db/data/node/9"
}, {
  "start" : "http://localhost:7474/db/data/node/9",
  "nodes" : [ "http://localhost:7474/db/data/node/9", "http://localhost:7474/db/data/node/8" ],
  "length" : 1,
  "relationships" : [ "http://localhost:7474/db/data/relationship/3" ],
  "end" : "http://localhost:7474/db/data/node/8"
}, {
  "start" : "http://localhost:7474/db/data/node/9",
  "nodes" : [ "http://localhost:7474/db/data/node/9", "http://localhost:7474/db/data/node/7" ],
  "length" : 1,
  "relationships" : [ "http://localhost:7474/db/data/relationship/4" ],
  "end" : "http://localhost:7474/db/data/node/7"
} ]

18.13.4. Traversal returning nodes below a certain depth

Here, all nodes at a traversal depth below 3 are returned.

Figure 18.72. Final Graph

Example request

• POST http://localhost:7474/db/data/node/20/traverse/node

• Accept: application/json

• Content-Type: application/json

{
  "return_filter" : {
    "body" : "position.length()<3;",
    "language" : "javascript"
  },
  "prune_evaluator" : {
    "name" : "none",
    "language" : "builtin"
  }
}
Example response
• 200: OK

• Content-Type: application/json

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/20/relationships/out",
  "data" : {
    "name" : "Root"
  },
  "traverse" : "http://localhost:7474/db/data/node/20/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/20/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/20/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/20",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/20/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/20/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/20/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/20/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/20/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/20/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/20/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/23/relationships/out",
  "data" : {
    "name" : "Mattias"
  },
  "traverse" : "http://localhost:7474/db/data/node/23/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/23/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/23/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/23",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/23/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/23/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/23/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/23/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/23/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/23/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/23/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/18/relationships/out",
  "data" : {
    "name" : "Johan"
  },
  "traverse" : "http://localhost:7474/db/data/node/18/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/18/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/18/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/18",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/18/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/18/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/18/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/18/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/18/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/18/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/18/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/19/relationships/out",
  "data" : {
    "name" : "Emil"
  },
  "traverse" : "http://localhost:7474/db/data/node/19/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/19/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/19/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/19",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/19/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/19/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/19/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/19/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/19/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/19/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/19/relationships/in/{-list|&|types}"
} ]
18.13.5. Creating a paged traverser
Paged traversers are created by POSTing a traversal description to the link identified by the
paged_traverse key in a node representation. When creating a paged traverser, the same options
apply as for a regular traverser, meaning that node, path, or fullpath can be targeted.

Example request

• POST http://localhost:7474/db/data/node/67/paged/traverse/node

• Accept: application/json

• Content-Type: application/json

{
  "prune_evaluator" : {
    "language" : "builtin",
    "name" : "none"
  },
  "return_filter" : {
    "language" : "javascript",
    "body" : "position.endNode().getProperty('name').contains('1');"
  },
  "order" : "depth_first",
  "relationships" : {
    "type" : "NEXT",
    "direction" : "out"
  }
}
Example response

• 201: Created

• Content-Type: application/json

• Location: http://localhost:7474/db/data/node/67/paged/traverse/node/1ff2c67ce33a4b85bcea5ab3f4b60ee8

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/68/relationships/out",
  "data" : {
    "name" : "1"
  },
  "traverse" : "http://localhost:7474/db/data/node/68/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/68/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/68/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/68",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/68/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/68/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/68/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/68/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/68/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/68/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/68/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/77/relationships/out",
  "data" : {
    "name" : "10"
  },
  "traverse" : "http://localhost:7474/db/data/node/77/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/77/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/77/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/77",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/77/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/77/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/77/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/77/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/77/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/77/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/77/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/78/relationships/out",
  "data" : {
    "name" : "11"
  },
  "traverse" : "http://localhost:7474/db/data/node/78/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/78/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/78/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/78",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/78/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/78/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/78/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/78/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/78/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/78/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/78/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/79/relationships/out",
  "data" : {
    "name" : "12"
  },
  "traverse" : "http://localhost:7474/db/data/node/79/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/79/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/79/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/79",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/79/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/79/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/79/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/79/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/79/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/79/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/79/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/80/relationships/out",
  "data" : {
    "name" : "13"
  },
  "traverse" : "http://localhost:7474/db/data/node/80/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/80/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/80/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/80",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/80/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/80/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/80/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/80/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/80/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/80/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/80/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/81/relationships/out",
  "data" : {
    "name" : "14"
  },
  "traverse" : "http://localhost:7474/db/data/node/81/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/81/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/81/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/81",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/81/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/81/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/81/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/81/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/81/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/81/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/81/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/82/relationships/out",
  "data" : {
    "name" : "15"
  },
  "traverse" : "http://localhost:7474/db/data/node/82/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/82/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/82/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/82",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/82/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/82/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/82/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/82/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/82/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/82/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/82/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/83/relationships/out",
  "data" : {
    "name" : "16"
  },
  "traverse" : "http://localhost:7474/db/data/node/83/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/83/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/83/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/83",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/83/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/83/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/83/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/83/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/83/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/83/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/83/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/84/relationships/out",
  "data" : {
    "name" : "17"
  },
  "traverse" : "http://localhost:7474/db/data/node/84/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/84/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/84/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/84",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/84/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/84/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/84/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/84/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/84/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/84/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/84/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/85/relationships/out",
  "data" : {
    "name" : "18"
  },
  "traverse" : "http://localhost:7474/db/data/node/85/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/85/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/85/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/85",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/85/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/85/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/85/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/85/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/85/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/85/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/85/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/86/relationships/out",
  "data" : {
    "name" : "19"
  },
  "traverse" : "http://localhost:7474/db/data/node/86/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/86/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/86/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/86",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/86/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/86/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/86/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/86/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/86/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/86/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/86/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/88/relationships/out",
  "data" : {
    "name" : "21"
  },
  "traverse" : "http://localhost:7474/db/data/node/88/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/88/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/88/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/88",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/88/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/88/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/88/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/88/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/88/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/88/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/88/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/98/relationships/out",
  "data" : {
    "name" : "31"
  },
  "traverse" : "http://localhost:7474/db/data/node/98/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/98/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/98/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/98",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/98/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/98/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/98/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/98/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/98/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/98/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/98/relationships/in/{-list|&|types}"
} ]

18.13.6. Paging through the results of a paged traverser

Paged traversers hold state on the server and allow clients to page through the results of a
traversal. To progress to the next page of traversal results, the client issues an HTTP GET request on
the paged traversal URI, which causes the traversal to fill the next page (or partially fill it if
insufficient results are available).

Note that if a traverser expires through inactivity it will cause a 404 response on the next GET
request. Traversers' leases are renewed on every successful access for the same amount of time as
originally specified.

When the paged traverser reaches the end of its results, the client can expect a 404 response as
the traverser is disposed by the server.
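The paging protocol above can be driven by a small client loop: keep issuing GET on the traverser URI until the server answers 404. A minimal sketch follows, with the HTTP call abstracted behind a `fetch` callable of our own (returning a status code and the decoded page) so the loop itself contains no network code.

```python
def drain_paged_traverser(fetch, traverser_uri):
    """Collect all pages from a paged traverser.

    `fetch(uri)` must return a (status_code, results) tuple. A 404 means
    the traverser is exhausted and has been disposed by the server, or
    its lease expired; either way there is nothing more to read.
    """
    pages = []
    while True:
        status, results = fetch(traverser_uri)
        if status == 404:  # end of results or expired lease
            return pages
        pages.append(results)
```

In a real client, `fetch` would perform the GET with `Accept: application/json`. Note that an expired lease is indistinguishable from a completed traversal, since both produce a 404.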

Example request

• GET http://localhost:7474/db/data/node/100/paged/traverse/node/ea5c026574e44cc6a3222927b103bea2
• Accept: application/json

Example response

• 200: OK

• Content-Type: application/json

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/431/relationships/out",
  "data" : {
    "name" : "331"
  },
  "traverse" : "http://localhost:7474/db/data/node/431/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/431/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/431/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/431",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/431/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/431/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/431/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/431/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/431/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/431/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/431/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/441/relationships/out",
  "data" : {
    "name" : "341"
  },
  "traverse" : "http://localhost:7474/db/data/node/441/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/441/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/441/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/441",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/441/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/441/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/441/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/441/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/441/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/441/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/441/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/451/relationships/out",
  "data" : {
    "name" : "351"
  },
  "traverse" : "http://localhost:7474/db/data/node/451/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/451/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/451/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/451",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/451/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/451/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/451/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/451/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/451/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/451/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/451/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/461/relationships/out",
  "data" : {
    "name" : "361"
  },
  "traverse" : "http://localhost:7474/db/data/node/461/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/461/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/461/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/461",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/461/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/461/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/461/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/461/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/461/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/461/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/461/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/471/relationships/out",
  "data" : {
    "name" : "371"
  },
  "traverse" : "http://localhost:7474/db/data/node/471/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/471/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/471/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/471",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/471/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/471/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/471/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/471/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/471/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/471/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/471/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/481/relationships/out",
  "data" : {
    "name" : "381"
  },
  "traverse" : "http://localhost:7474/db/data/node/481/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/481/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/481/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/481",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/481/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/481/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/481/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/481/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/481/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/481/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/481/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/491/relationships/out",
  "data" : {
    "name" : "391"
  },
  "traverse" : "http://localhost:7474/db/data/node/491/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/491/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/491/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/491",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/491/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/491/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/491/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/491/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/491/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/491/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/491/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/501/relationships/out",
  "data" : {
    "name" : "401"
  },
  "traverse" : "http://localhost:7474/db/data/node/501/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/501/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/501/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/501",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/501/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/501/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/501/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/501/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/501/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/501/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/501/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/510/relationships/out",
  "data" : {
    "name" : "410"
  },
  "traverse" : "http://localhost:7474/db/data/node/510/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/510/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/510/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/510",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/510/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/510/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/510/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/510/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/510/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/510/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/510/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/511/relationships/out",
  "data" : {
    "name" : "411"
  },
  "traverse" : "http://localhost:7474/db/data/node/511/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/511/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/511/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/511",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/511/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/511/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/511/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/511/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/511/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/511/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/511/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/512/relationships/out",
  "data" : {
    "name" : "412"
  },
  "traverse" : "http://localhost:7474/db/data/node/512/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/512/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/512/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/512",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/512/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/512/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/512/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/512/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/512/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/512/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/512/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/513/relationships/out",
  "data" : {
    "name" : "413"
  },
  "traverse" : "http://localhost:7474/db/data/node/513/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/513/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/513/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/513",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/513/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/513/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/513/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/513/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/513/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/513/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/513/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/514/relationships/out",
  "data" : {
    "name" : "414"
  },
  "traverse" : "http://localhost:7474/db/data/node/514/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/514/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/514/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/514",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/514/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/514/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/514/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/514/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/514/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/514/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/514/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/515/relationships/out",
  "data" : {
    "name" : "415"
  },
  "traverse" : "http://localhost:7474/db/data/node/515/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/515/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/515/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/515",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/515/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/515/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/515/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/515/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/515/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/515/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/515/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/516/relationships/out",
  "data" : {
    "name" : "416"
  },
  "traverse" : "http://localhost:7474/db/data/node/516/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/516/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/516/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/516",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/516/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/516/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/516/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/516/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/516/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/516/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/516/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/517/relationships/out",
  "data" : {
    "name" : "417"
  },
  "traverse" : "http://localhost:7474/db/data/node/517/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/517/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/517/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/517",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/517/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/517/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/517/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/517/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/517/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/517/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/517/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/518/relationships/out",
  "data" : {
    "name" : "418"
  },
  "traverse" : "http://localhost:7474/db/data/node/518/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/518/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/518/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/518",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/518/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/518/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/518/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/518/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/518/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/518/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/518/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/519/relationships/out",
  "data" : {
    "name" : "419"
  },
  "traverse" : "http://localhost:7474/db/data/node/519/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/519/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/519/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/519",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/519/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/519/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/519/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/519/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/519/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/519/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/519/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/521/relationships/out",
  "data" : {
    "name" : "421"
  },
  "traverse" : "http://localhost:7474/db/data/node/521/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/521/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/521/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/521",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/521/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/521/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/521/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/521/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/521/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/521/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/521/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/531/relationships/out",
  "data" : {
    "name" : "431"
  },
  "traverse" : "http://localhost:7474/db/data/node/531/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/531/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/531/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/531",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/531/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/531/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/531/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/531/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/531/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/531/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/531/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/541/relationships/out",
  "data" : {
    "name" : "441"
  },
  "traverse" : "http://localhost:7474/db/data/node/541/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/541/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/541/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/541",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/541/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/541/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/541/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/541/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/541/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/541/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/541/relationships/in/{-list|&|types}"
} ]

18.13.7. Paged traverser page size

The default page size is 50 items, but depending on the application, larger or smaller page sizes might be appropriate. This can be set by adding a pageSize query parameter.

Example request

• POST http://localhost:7474/db/data/node/577/paged/traverse/node?pageSize=1
• Accept: application/json

• Content-Type: application/json

{
  "prune_evaluator" : {
    "language" : "builtin",
    "name" : "none"
  },
  "return_filter" : {
    "language" : "javascript",
    "body" : "position.endNode().getProperty('name').contains('1');"
  },
  "order" : "depth_first",
  "relationships" : {
    "type" : "NEXT",
    "direction" : "out"
  }
}
Example response

• 201: Created

• Content-Type: application/json

• Location: http://localhost:7474/db/data/node/577/paged/traverse/node/2c4fa996b58443108a945ca9ce5a83d4

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/578/relationships/out",
  "data" : {
    "name" : "1"
  },
  "traverse" : "http://localhost:7474/db/data/node/578/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/578/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/578/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/578",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/578/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/578/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/578/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/578/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/578/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/578/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/578/relationships/in/{-list|&|types}"
} ]

18.13.8. Paged traverser timeout

The default timeout for a paged traverser is 60 seconds, but depending on the application, larger or smaller timeouts might be appropriate. This can be set by adding a leaseTime query parameter with the number of seconds the paged traverser should last.
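Both tuning knobs — pageSize (Section 18.13.7) and leaseTime — are plain query parameters on the POST that creates the traverser, alongside the usual traversal description in the body. Below is a small sketch of assembling such a request; the helper name `build_create_request` is illustrative and not part of any client library, and the body mirrors the traversal description used in the examples.

```python
import json
from urllib.parse import urlencode

def build_create_request(base, node_id, page_size=None, lease_time=None):
    """Return (uri, body) for creating a paged traverser.

    pageSize defaults to 50 items and leaseTime to 60 seconds on
    the server, so each parameter is only appended to the URI when
    explicitly requested.
    """
    params = {}
    if page_size is not None:
        params["pageSize"] = page_size
    if lease_time is not None:
        params["leaseTime"] = lease_time
    uri = "%s/node/%d/paged/traverse/node" % (base, node_id)
    if params:
        uri += "?" + urlencode(params)
    body = {
        "prune_evaluator": {"language": "builtin", "name": "none"},
        "return_filter": {
            "language": "javascript",
            "body": "position.endNode().getProperty('name').contains('1');",
        },
        "order": "depth_first",
        "relationships": {"type": "NEXT", "direction": "out"},
    }
    return uri, json.dumps(body)

# Matches the example request in this section: a 10-second lease.
uri, payload = build_create_request("http://localhost:7474/db/data", 610, lease_time=10)
```

The 201 response's Location header then carries the traverser URI that subsequent GET requests page through.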

Example request

• POST http://localhost:7474/db/data/node/610/paged/traverse/node?leaseTime=10
• Accept: application/json

• Content-Type: application/json

{
  "prune_evaluator" : {
    "language" : "builtin",
    "name" : "none"
  },
  "return_filter" : {
    "language" : "javascript",
    "body" : "position.endNode().getProperty('name').contains('1');"
  },
  "order" : "depth_first",
  "relationships" : {
    "type" : "NEXT",
    "direction" : "out"
  }
}
Example response

• 201: Created

• Content-Type: application/json

• Location: http://localhost:7474/db/data/node/610/paged/traverse/node/868b071c0dd74df78e45d40b39aaf03c

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/611/relationships/out",
  "data" : {
    "name" : "1"
  },
  "traverse" : "http://localhost:7474/db/data/node/611/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/611/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/611/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/611",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/611/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/611/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/611/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/611/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/611/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/611/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/611/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/620/relationships/out",
  "data" : {
    "name" : "10"
  },
  "traverse" : "http://localhost:7474/db/data/node/620/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/620/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/620/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/620",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/620/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/620/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/620/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/620/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/620/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/620/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/620/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/621/relationships/out",
  "data" : {
    "name" : "11"
  },
  "traverse" : "http://localhost:7474/db/data/node/621/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/621/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/621/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/621",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/621/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/621/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/621/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/621/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/621/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/621/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/621/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/622/relationships/out",
  "data" : {
    "name" : "12"
  },
  "traverse" : "http://localhost:7474/db/data/node/622/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/622/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/622/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/622",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/622/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/622/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/622/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/622/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/622/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/622/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/622/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/623/relationships/out",
  "data" : {
    "name" : "13"
  },
  "traverse" : "http://localhost:7474/db/data/node/623/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/623/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/623/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/623",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/623/relationships/out/{-list|&|types}",
  "properties" : "http://localhost:7474/db/data/node/623/properties",
  "incoming_relationships" : "http://localhost:7474/db/data/node/623/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/623/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/623/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/623/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/623/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/624/relationships/out",
  "data" : {
    "name" : "14"
  },
  "traverse" : "http://localhost:7474/db/data/node/624/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/624/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/624/properties/{key}"
206 ,
207 "self" : "http://localhost:7474/db/data/node/624",
208 "outgoing_typed_relationships" :
209 "http://localhost:7474/db/data/node/624/relationships/out
210 /{-list|&|types}",
211 "properties" :
212 "http://localhost:7474/db/data/node/624/properties",
213 "incoming_relationships" :
214 "http://localhost:7474/db/data/node/624/relationships/in"
215 ,
216 "extensions" : {
217 },
218 "create_relationship" :
219 "http://localhost:7474/db/data/node/624/relationships",
220 "paged_traverse" :
221 "http://localhost:7474/db/data/node/624/paged/traverse/{r
222 eturnType}{?pageSize,leaseTime}",
223 "all_relationships" :
224 "http://localhost:7474/db/data/node/624/relationships/all
225 ",
226 "incoming_typed_relationships" :
227 "http://localhost:7474/db/data/node/624/relationships/in/
228 {-list|&|types}"
229 }, {
354
230 "outgoing_relationships" :
231 "http://localhost:7474/db/data/node/625/relationships/out
232 ",
233 "data" : {
234 "name" : "15"
235 },
"traverse" :
"http://localhost:7474/db/data/node/625/traverse/{returnT
ype}",
"all_typed_relationships" :
"http://localhost:7474/db/data/node/625/relationships/all
/{-list|&|types}",
"property" :
"http://localhost:7474/db/data/node/625/properties/{key}"
,
"self" : "http://localhost:7474/db/data/node/625",
"outgoing_typed_relationships" :
"http://localhost:7474/db/data/node/625/relationships/out
/{-list|&|types}",
"properties" :
"http://localhost:7474/db/data/node/625/properties",
"incoming_relationships" :
"http://localhost:7474/db/data/node/625/relationships/in"
,
"extensions" : {
},
"create_relationship" :
"http://localhost:7474/db/data/node/625/relationships",
"paged_traverse" :
"http://localhost:7474/db/data/node/625/paged/traverse/{r
eturnType}{?pageSize,leaseTime}",
"all_relationships" :
"http://localhost:7474/db/data/node/625/relationships/all
",
"incoming_typed_relationships" :
"http://localhost:7474/db/data/node/625/relationships/in/
{-list|&|types}"
}, {
"outgoing_relationships" :
"http://localhost:7474/db/data/node/626/relationships/out
",
"data" : {
"name" : "16"
},
"traverse" :
"http://localhost:7474/db/data/node/626/traverse/{returnT
ype}",
355
"all_typed_relationships" :
"http://localhost:7474/db/data/node/626/relationships/all
/{-list|&|types}",
"property" :
"http://localhost:7474/db/data/node/626/properties/{key}"
,
"self" : "http://localhost:7474/db/data/node/626",
"outgoing_typed_relationships" :
"http://localhost:7474/db/data/node/626/relationships/out
/{-list|&|types}",
"properties" :
"http://localhost:7474/db/data/node/626/properties",
"incoming_relationships" :
"http://localhost:7474/db/data/node/626/relationships/in"
,
"extensions" : {
},
"create_relationship" :
"http://localhost:7474/db/data/node/626/relationships",
"paged_traverse" :
"http://localhost:7474/db/data/node/626/paged/traverse/{r
eturnType}{?pageSize,leaseTime}",
"all_relationships" :
"http://localhost:7474/db/data/node/626/relationships/all
",
"incoming_typed_relationships" :
"http://localhost:7474/db/data/node/626/relationships/in/
{-list|&|types}"
}, {
"outgoing_relationships" :
"http://localhost:7474/db/data/node/627/relationships/out
",
"data" : {
"name" : "17"
},
"traverse" :
"http://localhost:7474/db/data/node/627/traverse/{returnT
ype}",
"all_typed_relationships" :
"http://localhost:7474/db/data/node/627/relationships/all
/{-list|&|types}",
"property" :
"http://localhost:7474/db/data/node/627/properties/{key}"
,
"self" : "http://localhost:7474/db/data/node/627",
"outgoing_typed_relationships" :
"http://localhost:7474/db/data/node/627/relationships/out
356
/{-list|&|types}",
"properties" :
"http://localhost:7474/db/data/node/627/properties",
"incoming_relationships" :
"http://localhost:7474/db/data/node/627/relationships/in"
,
"extensions" : {
},
"create_relationship" :
"http://localhost:7474/db/data/node/627/relationships",
"paged_traverse" :
"http://localhost:7474/db/data/node/627/paged/traverse/{r
eturnType}{?pageSize,leaseTime}",
"all_relationships" :
"http://localhost:7474/db/data/node/627/relationships/all
",
"incoming_typed_relationships" :
"http://localhost:7474/db/data/node/627/relationships/in/
{-list|&|types}"
}, {
"outgoing_relationships" :
"http://localhost:7474/db/data/node/628/relationships/out
",
"data" : {
"name" : "18"
},
"traverse" :
"http://localhost:7474/db/data/node/628/traverse/{returnT
ype}",
"all_typed_relationships" :
"http://localhost:7474/db/data/node/628/relationships/all
/{-list|&|types}",
"property" :
"http://localhost:7474/db/data/node/628/properties/{key}"
,
"self" : "http://localhost:7474/db/data/node/628",
"outgoing_typed_relationships" :
"http://localhost:7474/db/data/node/628/relationships/out
/{-list|&|types}",
"properties" :
"http://localhost:7474/db/data/node/628/properties",
"incoming_relationships" :
"http://localhost:7474/db/data/node/628/relationships/in"
,
"extensions" : {
},
"create_relationship" :
357
"http://localhost:7474/db/data/node/628/relationships",
"paged_traverse" :
"http://localhost:7474/db/data/node/628/paged/traverse/{r
eturnType}{?pageSize,leaseTime}",
"all_relationships" :
"http://localhost:7474/db/data/node/628/relationships/all
",
"incoming_typed_relationships" :
"http://localhost:7474/db/data/node/628/relationships/in/
{-list|&|types}"
}, {
"outgoing_relationships" :
"http://localhost:7474/db/data/node/629/relationships/out
",
"data" : {
"name" : "19"
},
"traverse" :
"http://localhost:7474/db/data/node/629/traverse/{returnT
ype}",
"all_typed_relationships" :
"http://localhost:7474/db/data/node/629/relationships/all
/{-list|&|types}",
"property" :
"http://localhost:7474/db/data/node/629/properties/{key}"
,
"self" : "http://localhost:7474/db/data/node/629",
"outgoing_typed_relationships" :
"http://localhost:7474/db/data/node/629/relationships/out
/{-list|&|types}",
"properties" :
"http://localhost:7474/db/data/node/629/properties",
"incoming_relationships" :
"http://localhost:7474/db/data/node/629/relationships/in"
,
"extensions" : {
},
"create_relationship" :
"http://localhost:7474/db/data/node/629/relationships",
"paged_traverse" :
"http://localhost:7474/db/data/node/629/paged/traverse/{r
eturnType}{?pageSize,leaseTime}",
"all_relationships" :
"http://localhost:7474/db/data/node/629/relationships/all
",
"incoming_typed_relationships" :
"http://localhost:7474/db/data/node/629/relationships/in/
358
{-list|&|types}"
}, {
"outgoing_relationships" :
"http://localhost:7474/db/data/node/631/relationships/out
",
"data" : {
"name" : "21"
},
"traverse" :
"http://localhost:7474/db/data/node/631/traverse/{returnT
ype}",
"all_typed_relationships" :
"http://localhost:7474/db/data/node/631/relationships/all
/{-list|&|types}",
"property" :
"http://localhost:7474/db/data/node/631/properties/{key}"
,
"self" : "http://localhost:7474/db/data/node/631",
"outgoing_typed_relationships" :
"http://localhost:7474/db/data/node/631/relationships/out
/{-list|&|types}",
"properties" :
"http://localhost:7474/db/data/node/631/properties",
"incoming_relationships" :
"http://localhost:7474/db/data/node/631/relationships/in"
,
"extensions" : {
},
"create_relationship" :
"http://localhost:7474/db/data/node/631/relationships",
"paged_traverse" :
"http://localhost:7474/db/data/node/631/paged/traverse/{r
eturnType}{?pageSize,leaseTime}",
"all_relationships" :
"http://localhost:7474/db/data/node/631/relationships/all
",
"incoming_typed_relationships" :
"http://localhost:7474/db/data/node/631/relationships/in/
{-list|&|types}"
}, {
"outgoing_relationships" :
"http://localhost:7474/db/data/node/641/relationships/out
",
"data" : {
"name" : "31"
},
"traverse" :
359
"http://localhost:7474/db/data/node/641/traverse/{returnT
ype}",
"all_typed_relationships" :
"http://localhost:7474/db/data/node/641/relationships/all
/{-list|&|types}",
"property" :
"http://localhost:7474/db/data/node/641/properties/{key}"
,
"self" : "http://localhost:7474/db/data/node/641",
"outgoing_typed_relationships" :
"http://localhost:7474/db/data/node/641/relationships/out
/{-list|&|types}",
"properties" :
"http://localhost:7474/db/data/node/641/properties",
"incoming_relationships" :
"http://localhost:7474/db/data/node/641/relationships/in"
,
"extensions" : {
},
"create_relationship" :
"http://localhost:7474/db/data/node/641/relationships",
"paged_traverse" :
"http://localhost:7474/db/data/node/641/paged/traverse/{r
eturnType}{?pageSize,leaseTime}",
"all_relationships" :
"http://localhost:7474/db/data/node/641/relationships/all
",
"incoming_typed_relationships" :
"http://localhost:7474/db/data/node/641/relationships/in/
{-list|&|types}"
} ]
18.14. Built-in Graph Algorithms


Neo4j comes with a number of built-in graph algorithms. They are performed from a start node. The traversal is controlled by the URI and the body sent with the request.

algorithm
The algorithm to choose. If not set, default is shortestPath. algorithm can have
one of these values:

• shortestPath
• allSimplePaths
• allPaths
• dijkstra (optional with cost_property and default_cost parameters)

max_depth
The maximum depth as an integer for the algorithms like shortestPath, where applicable. Default is 1.

18.14.1. Find all shortest paths
The shortestPath algorithm can find multiple paths between the same nodes, like in this
example.

Figure 18.73. Final Graph

Example request

• POST http://localhost:7474/db/data/node/125/paths

• Accept: application/json

• Content-Type: application/json

{
  "to" : "http://localhost:7474/db/data/node/120",
  "max_depth" : 3,
  "relationships" : {
    "type" : "to",
    "direction" : "out"
  },
  "algorithm" : "shortestPath"
}
Example response

• 200: OK

• Content-Type: application/json

[ {
  "start" : "http://localhost:7474/db/data/node/125",
  "nodes" : [ "http://localhost:7474/db/data/node/125", "http://localhost:7474/db/data/node/124", "http://localhost:7474/db/data/node/120" ],
  "length" : 2,
  "relationships" : [ "http://localhost:7474/db/data/relationship/73", "http://localhost:7474/db/data/relationship/82" ],
  "end" : "http://localhost:7474/db/data/node/120"
}, {
  "start" : "http://localhost:7474/db/data/node/125",
  "nodes" : [ "http://localhost:7474/db/data/node/125", "http://localhost:7474/db/data/node/121", "http://localhost:7474/db/data/node/120" ],
  "length" : 2,
  "relationships" : [ "http://localhost:7474/db/data/relationship/74", "http://localhost:7474/db/data/relationship/80" ],
  "end" : "http://localhost:7474/db/data/node/120"
} ]
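The request and response shapes above are plain JSON, so a client can assemble and inspect them with a few lines of code. The sketch below (Python; the helper names are our own, and the endpoint itself is not called) builds a `/paths` request body with the documented defaults and pulls the `length` field out of each returned path:

```python
import json

def build_path_request(to_node_uri, algorithm="shortestPath", max_depth=1,
                       rel_type="to", direction="out"):
    # Mirrors the request body shown above; defaults follow the documented
    # behaviour (shortestPath, max_depth 1).
    return json.dumps({
        "to": to_node_uri,
        "max_depth": max_depth,
        "relationships": {"type": rel_type, "direction": direction},
        "algorithm": algorithm,
    })

def path_lengths(response_body):
    # Extract the 'length' of every path in a /paths response.
    return [p["length"] for p in json.loads(response_body)]

body = build_path_request("http://localhost:7474/db/data/node/120", max_depth=3)
# Trimmed-down stand-in for the response shown above:
sample = '[{"length": 2}, {"length": 2}]'
print(path_lengths(sample))  # → [2, 2]
```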

18.14.2. Find one of the shortest paths between nodes
If no path algorithm is specified, a shortestPath algorithm with a max depth of 1 will be chosen. In this example, the max_depth is set to 3, to find the shortest path between nodes that are up to 3 links apart.

Figure 18.74. Final Graph

Example request

• POST http://localhost:7474/db/data/node/132/path

• Accept: application/json

• Content-Type: application/json

{
  "to" : "http://localhost:7474/db/data/node/127",
  "max_depth" : 3,
  "relationships" : {
    "type" : "to",
    "direction" : "out"
  },
  "algorithm" : "shortestPath"
}
Example response

• 200: OK

• Content-Type: application/json

{
  "start" : "http://localhost:7474/db/data/node/132",
  "nodes" : [ "http://localhost:7474/db/data/node/132", "http://localhost:7474/db/data/node/128", "http://localhost:7474/db/data/node/127" ],
  "length" : 2,
  "relationships" : [ "http://localhost:7474/db/data/relationship/84", "http://localhost:7474/db/data/relationship/90" ],
  "end" : "http://localhost:7474/db/data/node/127"
}

18.14.3. Execute a Dijkstra algorithm with similar weights on relationships

Figure 18.75. Final Graph

Example request

• POST http://localhost:7474/db/data/node/147/path

• Accept: application/json
• Content-Type: application/json

{
  "to" : "http://localhost:7474/db/data/node/150",
  "cost_property" : "cost",
  "relationships" : {
    "type" : "to",
    "direction" : "out"
  },
  "algorithm" : "dijkstra"
}
Example response

• 200: OK

• Content-Type: application/json

{
  "weight" : 2.0,
  "start" : "http://localhost:7474/db/data/node/147",
  "nodes" : [ "http://localhost:7474/db/data/node/147", "http://localhost:7474/db/data/node/148", "http://localhost:7474/db/data/node/150" ],
  "length" : 2,
  "relationships" : [ "http://localhost:7474/db/data/relationship/106", "http://localhost:7474/db/data/relationship/107" ],
  "end" : "http://localhost:7474/db/data/node/150"
}

18.14.4. Execute a Dijkstra algorithm with weights on relationships

Figure 18.76. Final Graph

Example request

• POST http://localhost:7474/db/data/node/138/path

• Accept: application/json
• Content-Type: application/json

{
  "to" : "http://localhost:7474/db/data/node/141",
  "cost_property" : "cost",
  "relationships" : {
    "type" : "to",
    "direction" : "out"
  },
  "algorithm" : "dijkstra"
}
Example response

• 200: OK

• Content-Type: application/json

{
  "weight" : 6.0,
  "start" : "http://localhost:7474/db/data/node/138",
  "nodes" : [ "http://localhost:7474/db/data/node/138", "http://localhost:7474/db/data/node/139", "http://localhost:7474/db/data/node/136", "http://localhost:7474/db/data/node/137", "http://localhost:7474/db/data/node/134", "http://localhost:7474/db/data/node/135", "http://localhost:7474/db/data/node/141" ],
  "length" : 6,
  "relationships" : [ "http://localhost:7474/db/data/relationship/93", "http://localhost:7474/db/data/relationship/95", "http://localhost:7474/db/data/relationship/97", "http://localhost:7474/db/data/relationship/100", "http://localhost:7474/db/data/relationship/102", "http://localhost:7474/db/data/relationship/103" ],
  "end" : "http://localhost:7474/db/data/node/141"
}
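For intuition, the `weight` returned by the `dijkstra` algorithm is the sum of the chosen `cost_property` along the cheapest path. A minimal client-side sketch of the same idea on a made-up toy graph (illustrative Python, not the server's implementation):

```python
import heapq

def dijkstra(graph, start, goal):
    # Plain Dijkstra over an adjacency dict {node: [(neighbor, cost), ...]}.
    # Returns (total_weight, path), analogous to the 'weight' and 'nodes'
    # fields of the REST response above.
    queue = [(0, start, [start])]
    seen = set()
    while queue:
        weight, node, path = heapq.heappop(queue)
        if node == goal:
            return weight, path
        if node in seen:
            continue
        seen.add(node)
        for neighbor, cost in graph.get(node, []):
            if neighbor not in seen:
                heapq.heappush(queue, (weight + cost, neighbor, path + [neighbor]))
    return None

# Invented example graph, not the server's data:
graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 1), ("d", 5)],
    "c": [("d", 1)],
}
print(dijkstra(graph, "a", "d"))  # → (3, ['a', 'b', 'c', 'd'])
```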

18.15. Batch operations

18.15.1. Execute multiple operations in batch

This lets you execute multiple API calls through a single HTTP call, significantly improving performance for large insert and update operations. The batch service expects an array of job descriptions as input, each job description describing an action to be performed via the normal server API. This service is transactional: if any of the operations fails (returns a non-2xx HTTP status code), the transaction will be rolled back and all changes will be undone.

Each job description should contain a to attribute, with a value relative to the data API root (so
http://localhost:7474/db/data/node becomes just /node), and a method attribute containing HTTP
verb to use.
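A sketch of assembling such job descriptions (Python; `job` and `DATA_ROOT` are our own names, and the server address is an assumption). Note how a full data-API URI is rewritten to the relative form the batch endpoint expects:

```python
import json

DATA_ROOT = "http://localhost:7474/db/data"  # assumed server address

def job(method, to, job_id, body=None):
    # One batch job description. 'to' may be given as a full data-API URI;
    # it is rewritten relative to the data root, as the batch service expects.
    if to.startswith(DATA_ROOT):
        to = to[len(DATA_ROOT):]
    desc = {"method": method, "to": to, "id": job_id}
    if body is not None:
        desc["body"] = body
    return desc

batch = [
    job("PUT", DATA_ROOT + "/node/27/properties", 0, {"age": 1}),
    job("GET", DATA_ROOT + "/node/27", 1),
    job("POST", DATA_ROOT + "/node", 2, {"age": 1}),
]
print(batch[0]["to"])  # → /node/27/properties
payload = json.dumps(batch)  # body for POST /db/data/batch
```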

Optionally, you can provide a body and an id attribute to help you keep track of responses, although responses are guaranteed to be returned in the same order as the job descriptions are received. The figure below outlines the different parts of the job descriptions:

Figure 18.77. Final Graph

Example request

• POST http://localhost:7474/db/data/batch

• Accept: application/json

• Content-Type: application/json

[ {
  "method" : "PUT",
  "to" : "/node/27/properties",
  "body" : {
    "age" : 1
  },
  "id" : 0
}, {
  "method" : "GET",
  "to" : "/node/27",
  "id" : 1
}, {
  "method" : "POST",
  "to" : "/node",
  "body" : {
    "age" : 1
  },
  "id" : 2
}, {
  "method" : "POST",
  "to" : "/node",
  "body" : {
    "age" : 1
  },
  "id" : 3
} ]
Example response

• 200: OK

• Content-Type: application/json

[ {
  "id" : 0,
  "from" : "/node/27/properties"
}, {
  "id" : 1,
  "body" : {
    "extensions" : {
    },
    "paged_traverse" : "http://localhost:7474/db/data/node/27/paged/traverse/{returnType}{?pageSize,leaseTime}",
    "outgoing_relationships" : "http://localhost:7474/db/data/node/27/relationships/out",
    "traverse" : "http://localhost:7474/db/data/node/27/traverse/{returnType}",
    "all_typed_relationships" : "http://localhost:7474/db/data/node/27/relationships/all/{-list|&|types}",
    "all_relationships" : "http://localhost:7474/db/data/node/27/relationships/all",
    "property" : "http://localhost:7474/db/data/node/27/properties/{key}",
    "self" : "http://localhost:7474/db/data/node/27",
    "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/27/relationships/out/{-list|&|types}",
    "properties" : "http://localhost:7474/db/data/node/27/properties",
    "incoming_relationships" : "http://localhost:7474/db/data/node/27/relationships/in",
    "incoming_typed_relationships" : "http://localhost:7474/db/data/node/27/relationships/in/{-list|&|types}",
    "create_relationship" : "http://localhost:7474/db/data/node/27/relationships",
    "data" : {
      "age" : 1
    }
  },
  "from" : "/node/27"
}, {
  "id" : 2,
  "location" : "http://localhost:7474/db/data/node/28",
  "body" : {
    "extensions" : {
    },
    "paged_traverse" : "http://localhost:7474/db/data/node/28/paged/traverse/{returnType}{?pageSize,leaseTime}",
    "outgoing_relationships" : "http://localhost:7474/db/data/node/28/relationships/out",
    "traverse" : "http://localhost:7474/db/data/node/28/traverse/{returnType}",
    "all_typed_relationships" : "http://localhost:7474/db/data/node/28/relationships/all/{-list|&|types}",
    "all_relationships" : "http://localhost:7474/db/data/node/28/relationships/all",
    "property" : "http://localhost:7474/db/data/node/28/properties/{key}",
    "self" : "http://localhost:7474/db/data/node/28",
    "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/28/relationships/out/{-list|&|types}",
    "properties" : "http://localhost:7474/db/data/node/28/properties",
    "incoming_relationships" : "http://localhost:7474/db/data/node/28/relationships/in",
    "incoming_typed_relationships" : "http://localhost:7474/db/data/node/28/relationships/in/{-list|&|types}",
    "create_relationship" : "http://localhost:7474/db/data/node/28/relationships",
    "data" : {
      "age" : 1
    }
  },
  "from" : "/node"
}, {
  "id" : 3,
  "location" : "http://localhost:7474/db/data/node/29",
  "body" : {
    "extensions" : {
    },
    "paged_traverse" : "http://localhost:7474/db/data/node/29/paged/traverse/{returnType}{?pageSize,leaseTime}",
    "outgoing_relationships" : "http://localhost:7474/db/data/node/29/relationships/out",
    "traverse" : "http://localhost:7474/db/data/node/29/traverse/{returnType}",
    "all_typed_relationships" : "http://localhost:7474/db/data/node/29/relationships/all/{-list|&|types}",
    "all_relationships" : "http://localhost:7474/db/data/node/29/relationships/all",
    "property" : "http://localhost:7474/db/data/node/29/properties/{key}",
    "self" : "http://localhost:7474/db/data/node/29",
    "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/29/relationships/out/{-list|&|types}",
    "properties" : "http://localhost:7474/db/data/node/29/properties",
    "incoming_relationships" : "http://localhost:7474/db/data/node/29/relationships/in",
    "incoming_typed_relationships" : "http://localhost:7474/db/data/node/29/relationships/in/{-list|&|types}",
    "create_relationship" : "http://localhost:7474/db/data/node/29/relationships",
    "data" : {
      "age" : 1
    }
  },
  "from" : "/node"
} ]

18.15.2. Refer to items created earlier in the same batch job

The batch operation API allows you to refer to the URI returned from a created resource in subsequent job descriptions, within the same batch call. Use the {[JOB ID]} special syntax to inject URIs from created resources into JSON strings in subsequent job descriptions.
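The substitution itself is performed by the server, but its semantics can be illustrated client-side. A sketch (Python, hypothetical helper name) that replaces each `{N}` marker with the Location URI returned for job N:

```python
import re

def resolve_placeholders(text, locations):
    # Replace {N} markers with the Location URI returned for job N.
    # The server performs this substitution itself; this sketch only
    # illustrates the semantics of the {[job id]} syntax.
    return re.sub(r"\{(\d+)\}", lambda m: locations[int(m.group(1))], text)

# Locations as they might come back for jobs 0 and 1 (made-up ids):
locations = {
    0: "http://localhost:7474/db/data/node/30",
    1: "http://localhost:7474/db/data/node/31",
}
print(resolve_placeholders("{0}/relationships", locations))
# → http://localhost:7474/db/data/node/30/relationships
```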

Figure 18.78. Final Graph

Example request

• POST http://localhost:7474/db/data/batch

• Accept: application/json

• Content-Type: application/json

[ {
  "method" : "POST",
  "to" : "/node",
  "id" : 0,
  "body" : {
    "name" : "bob"
  }
}, {
  "method" : "POST",
  "to" : "/node",
  "id" : 1,
  "body" : {
    "age" : 12
  }
}, {
  "method" : "POST",
  "to" : "{0}/relationships",
  "id" : 3,
  "body" : {
    "to" : "{1}",
    "data" : {
      "since" : "2010"
    },
    "type" : "KNOWS"
  }
}, {
  "method" : "POST",
  "to" : "/index/relationship/my_rels",
  "id" : 4,
  "body" : {
    "key" : "since",
    "value" : "2010",
    "uri" : "{3}"
  }
} ]
Example response

• 200: OK

• Content-Type: application/json

[ {
  "id" : 0,
  "location" : "http://localhost:7474/db/data/node/30",
  "body" : {
    "extensions" : {
    },
    "paged_traverse" : "http://localhost:7474/db/data/node/30/paged/traverse/{returnType}{?pageSize,leaseTime}",
    "outgoing_relationships" : "http://localhost:7474/db/data/node/30/relationships/out",
    "traverse" : "http://localhost:7474/db/data/node/30/traverse/{returnType}",
    "all_typed_relationships" : "http://localhost:7474/db/data/node/30/relationships/all/{-list|&|types}",
    "all_relationships" : "http://localhost:7474/db/data/node/30/relationships/all",
    "property" : "http://localhost:7474/db/data/node/30/properties/{key}",
    "self" : "http://localhost:7474/db/data/node/30",
    "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/30/relationships/out/{-list|&|types}",
    "properties" : "http://localhost:7474/db/data/node/30/properties",
    "incoming_relationships" : "http://localhost:7474/db/data/node/30/relationships/in",
    "incoming_typed_relationships" : "http://localhost:7474/db/data/node/30/relationships/in/{-list|&|types}",
    "create_relationship" : "http://localhost:7474/db/data/node/30/relationships",
    "data" : {
      "name" : "bob"
    }
  },
  "from" : "/node"
}, {
  "id" : 1,
  "location" : "http://localhost:7474/db/data/node/31",
  "body" : {
    "extensions" : {
    },
    "paged_traverse" : "http://localhost:7474/db/data/node/31/paged/traverse/{returnType}{?pageSize,leaseTime}",
    "outgoing_relationships" : "http://localhost:7474/db/data/node/31/relationships/out",
    "traverse" : "http://localhost:7474/db/data/node/31/traverse/{returnType}",
    "all_typed_relationships" : "http://localhost:7474/db/data/node/31/relationships/all/{-list|&|types}",
    "all_relationships" : "http://localhost:7474/db/data/node/31/relationships/all",
    "property" : "http://localhost:7474/db/data/node/31/properties/{key}",
    "self" : "http://localhost:7474/db/data/node/31",
    "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/31/relationships/out/{-list|&|types}",
    "properties" : "http://localhost:7474/db/data/node/31/properties",
    "incoming_relationships" : "http://localhost:7474/db/data/node/31/relationships/in",
    "incoming_typed_relationships" : "http://localhost:7474/db/data/node/31/relationships/in/{-list|&|types}",
    "create_relationship" : "http://localhost:7474/db/data/node/31/relationships",
    "data" : {
      "age" : 12
    }
  },
  "from" : "/node"
}, {
  "id" : 3,
  "location" : "http://localhost:7474/db/data/relationship/10",
  "body" : {
    "extensions" : {
    },
    "start" : "http://localhost:7474/db/data/node/30",
    "property" : "http://localhost:7474/db/data/relationship/10/properties/{key}",
    "self" : "http://localhost:7474/db/data/relationship/10",
    "properties" : "http://localhost:7474/db/data/relationship/10/properties",
    "type" : "KNOWS",
    "end" : "http://localhost:7474/db/data/node/31",
    "data" : {
      "since" : "2010"
    }
  },
  "from" : "http://localhost:7474/db/data/node/30/relationships"
}, {
  "id" : 4,
  "location" : "http://localhost:7474/db/data/index/relationship/my_rels/since/2010/10",
  "body" : {
    "extensions" : {
    },
    "start" : "http://localhost:7474/db/data/node/30",
    "property" : "http://localhost:7474/db/data/relationship/10/properties/{key}",
    "self" : "http://localhost:7474/db/data/relationship/10",
    "properties" : "http://localhost:7474/db/data/relationship/10/properties",
    "type" : "KNOWS",
    "end" : "http://localhost:7474/db/data/node/31",
    "data" : {
      "since" : "2010"
    },
    "indexed" : "http://localhost:7474/db/data/index/relationship/my_rels/since/2010/10"
  },
  "from" : "/index/relationship/my_rels"
} ]

18.15.3. Execute multiple operations in batch streaming

Figure 18.79. Final Graph

Example request

• POST http://localhost:7474/db/data/batch

• Accept: application/json

• Content-Type: application/json

• X-Stream: true

[ {
  "method" : "PUT",
  "to" : "/node/145/properties",
  "body" : {
    "age" : 1
  },
  "id" : 0
}, {
  "method" : "GET",
  "to" : "/node/145",
  "id" : 1
}, {
  "method" : "POST",
  "to" : "/node",
  "body" : {
    "age" : 1
  },
  "id" : 2
}, {
  "method" : "POST",
  "to" : "/node",
  "body" : {
    "age" : 1
  },
  "id" : 3
} ]
Example response

• 200: OK

• Content-Type: application/json

[ {
  "id" : 0,
  "from" : "/node/145/properties",
  "body" : null,
  "status" : 204
}, {
  "id" : 1,
  "from" : "/node/145",
  "body" : {
    "extensions" : {
    },
    "paged_traverse" : "http://localhost:7474/db/data/node/145/paged/traverse/{returnType}{?pageSize,leaseTime}",
    "outgoing_relationships" : "http://localhost:7474/db/data/node/145/relationships/out",
    "traverse" : "http://localhost:7474/db/data/node/145/traverse/{returnType}",
    "all_typed_relationships" : "http://localhost:7474/db/data/node/145/relationships/all/{-list|&|types}",
    "all_relationships" : "http://localhost:7474/db/data/node/145/relationships/all",
    "property" : "http://localhost:7474/db/data/node/145/properties/{key}",
    "self" : "http://localhost:7474/db/data/node/145",
    "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/145/relationships/out/{-list|&|types}",
    "properties" : "http://localhost:7474/db/data/node/145/properties",
    "incoming_relationships" : "http://localhost:7474/db/data/node/145/relationships/in",
    "incoming_typed_relationships" : "http://localhost:7474/db/data/node/145/relationships/in/{-list|&|types}",
    "create_relationship" : "http://localhost:7474/db/data/node/145/relationships",
    "data" : {
      "age" : 1
    }
  },
  "status" : 200
}, {
  "id" : 2,
  "from" : "/node",
  "body" : {
    "extensions" : {
    },
    "paged_traverse" : "http://localhost:7474/db/data/node/146/paged/traverse/{returnType}{?pageSize,leaseTime}",
    "outgoing_relationships" : "http://localhost:7474/db/data/node/146/relationships/out",
    "traverse" : "http://localhost:7474/db/data/node/146/traverse/{returnType}",
    "all_typed_relationships" : "http://localhost:7474/db/data/node/146/relationships/all/{-list|&|types}",
    "all_relationships" : "http://localhost:7474/db/data/node/146/relationships/all",
    "property" : "http://localhost:7474/db/data/node/146/properties/{key}",
    "self" : "http://localhost:7474/db/data/node/146",
    "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/146/relationships/out/{-list|&|types}",
    "properties" : "http://localhost:7474/db/data/node/146/properties",
    "incoming_relationships" : "http://localhost:7474/db/data/node/146/relationships/in",
    "incoming_typed_relationships" : "http://localhost:7474/db/data/node/146/relationships/in/{-list|&|types}",
    "create_relationship" : "http://localhost:7474/db/data/node/146/relationships",
    "data" : {
      "age" : 1
    }
  },
  "location" : "http://localhost:7474/db/data/node/146",
  "status" : 201
}, {
  "id" : 3,
  "from" : "/node",
  "body" : {
    "extensions" : {
    },
    "paged_traverse" : "http://localhost:7474/db/data/node/147/paged/traverse/{returnType}{?pageSize,leaseTime}",
    "outgoing_relationships" : "http://localhost:7474/db/data/node/147/relationships/out",
    "traverse" : "http://localhost:7474/db/data/node/147/traverse/{returnType}",
    "all_typed_relationships" : "http://localhost:7474/db/data/node/147/relationships/all/{-list|&|types}",
    "all_relationships" : "http://localhost:7474/db/data/node/147/relationships/all",
    "property" : "http://localhost:7474/db/data/node/147/properties/{key}",
    "self" : "http://localhost:7474/db/data/node/147",
    "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/147/relationships/out/{-list|&|types}",
    "properties" : "http://localhost:7474/db/data/node/147/properties",
    "incoming_relationships" : "http://localhost:7474/db/data/node/147/relationships/in",
    "incoming_typed_relationships" : "http://localhost:7474/db/data/node/147/relationships/in/{-list|&|types}",
    "create_relationship" : "http://localhost:7474/db/data/node/147/relationships",
    "data" : {
      "age" : 1
    }
  },
  "location" : "http://localhost:7474/db/data/node/147",
  "status" : 201
} ]
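Because each streamed job result carries its own `status` field, a client has to check them individually rather than relying on a single response code. A minimal sketch (Python, hypothetical helper name; the failing job below is invented for illustration):

```python
import json

def failed_jobs(response_body):
    # Return the ids of jobs whose HTTP status was not 2xx.
    # Works on the streamed batch format above, where every job result
    # carries its own 'status' field.
    results = json.loads(response_body)
    return [r["id"] for r in results if not 200 <= r.get("status", 0) < 300]

sample = json.dumps([
    {"id": 0, "status": 204},
    {"id": 1, "status": 200},
    {"id": 2, "status": 500},  # hypothetical failure
])
print(failed_jobs(sample))  # → [2]
```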

18.16. WADL Support


Neo4j 休息 API 是一个真正 rest 风格的界面,依靠超媒体控件 (链接) 广告对用户的
允许操作。超媒体是一种动态界面样式声明构造 (语义标记) 用于告知他们下一步的法律选
择的客户端,只是在时间.

Caution

RESTful APIs cannot be modelled by static interface description languages like WSDL or WADL.
However, for some use cases developers may wish to expose WADL descriptions for the Neo4j REST API, in particular when using tooling that expects this. In those cases WADL generation may be enabled by adding to your server's neo4j.properties file:

unsupported_wadl_generation_enabled=true
Caution

WADL is not an officially supported part of the Neo4j server
API because WADL is insufficiently expressive to capture the set
of potential interactions a client can drive with Neo4j server.
Expect the WADL description to be incomplete, and in some
cases contradictory to the real API. In any cases where the WADL
description disagrees with the REST API, the REST API should
be considered authoritative. WADL generation may be withdrawn
at any point in the Neo4j release cycle.

18.17. Cypher Plugin
Warning

This functionality is now provided by the core REST API. The plugin will continue to work for some time, but as of Neo4j 1.6 it is deprecated. See Section 18.3, "Cypher queries" for documentation on the built-in Cypher support.

The Neo4j Cypher Plugin enables querying with the Cypher query language (Chapter 15). The results are returned as a list of string headers (columns), and a data part consisting of a list of all rows, every row consisting of a list of REST representations of the field value: Node, Relationship, or any simple value like String.

18.17.1. Send a Query

A simple query returning all nodes connected to node 3, returning the node and the name property, if it exists, otherwise null:

START x = node(3)
MATCH x -[r]-> n
RETURN type(r), n.name?, n.age?

Figure 18.80. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/CypherPlugin/graphdb/execute_query
• Accept: application/json

• Content-Type: application/json

{
  "query" : "start x = node(3) match x -[r]-> n return type(r), n.name?, n.age?",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

{
  "columns" : [ "type(r)", "n.name?", "n.age?" ],
  "data" : [ [ "know", "him", 25 ], [ "know", "you", null ] ]
}
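The `columns` header lines up positionally with every row in `data`, so results can be turned into dictionaries with a simple zip. A sketch (Python, hypothetical helper name) applied to the response above:

```python
def rows_as_dicts(result):
    # Zip the 'columns' header with each row of 'data'.
    return [dict(zip(result["columns"], row)) for row in result["data"]]

# The response shown above (JSON null becomes Python None):
result = {
    "columns": ["type(r)", "n.name?", "n.age?"],
    "data": [["know", "him", 25], ["know", "you", None]],
}
print(rows_as_dicts(result)[0]["n.name?"])  # → him
```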

18.17.2. Return paths

Paths can be returned together with other return types by just specifying returns.

START x = node(7)
MATCH path = (x--friend)
RETURN path, friend.name

Figure 18.81. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/CypherPlugin/graphdb/execute_query
• Accept: application/json

• Content-Type: application/json

{
  "query" : "start x = node(7) match path = (x--friend) return path, friend.name",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

{
  "columns" : [ "path", "friend.name" ],
  "data" : [ [ {
    "start" : "http://localhost:7474/db/data/node/7",
    "nodes" : [ "http://localhost:7474/db/data/node/7", "http://localhost:7474/db/data/node/6" ],
    "length" : 1,
    "relationships" : [ "http://localhost:7474/db/data/relationship/3" ],
    "end" : "http://localhost:7474/db/data/node/6"
  }, "you" ] ]
}
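A path representation identifies nodes by their URIs; the numeric ids can be recovered from the last URI segment. A sketch (Python, hypothetical helper name) using the path from the response above:

```python
def node_ids(path):
    # Pull the numeric node ids out of a REST path representation
    # by taking the last segment of each node URI.
    return [int(uri.rsplit("/", 1)[1]) for uri in path["nodes"]]

# Path representation as returned in the response above:
path = {
    "start": "http://localhost:7474/db/data/node/7",
    "nodes": ["http://localhost:7474/db/data/node/7",
              "http://localhost:7474/db/data/node/6"],
    "length": 1,
    "end": "http://localhost:7474/db/data/node/6",
}
print(node_ids(path))  # → [7, 6]
```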

18.17.3. Send queries with parameters

Cypher supports queries with parameters which are submitted as a JSON map.

START x = node:node_auto_index(name={startName})
MATCH path = (x-[r]-friend)
WHERE friend.name = {name}
RETURN TYPE(r)

Figure 18.82. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/CypherPlugin/graphdb/execute_query
• Accept: application/json
• Content-Type: application/json

{
  "query" : "start x = node:node_auto_index(name={startName}) match path = (x-[r]-friend) where friend.name = {name} return TYPE(r)",
  "params" : {
    "startName" : "I",
    "name" : "you"
  }
}
Example response

• 200: OK

• Content-Type: application/json

{
  "columns" : [ "TYPE(r)" ],
  "data" : [ [ "know" ] ]
}
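The params map can be assembled client-side instead of splicing values into the query text, which avoids quoting problems and keeps the query text constant across calls. A small Python sketch; the helper name is hypothetical:

```python
import json

def parameterized_query(query, **params):
    # Parameter values stay out of the query string; the {name}
    # placeholders are resolved by the server from the "params" map.
    return json.dumps({"query": query, "params": params})

payload = parameterized_query(
    "start x = node:node_auto_index(name={startName}) "
    "match path = (x-[r]-friend) where friend.name = {name} return TYPE(r)",
    startName="I", name="you")
```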

18.17.4. Server errors

Errors on the server will be reported as a JSON-formatted stacktrace and message.

START x = node(5)
RETURN x.dummy

Figure 18.83. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/CypherPlugin/graphdb/execute_query
• Accept: application/json
• Content-Type: application/json

{
  "query" : "start x = node(5) return x.dummy",
  "params" : {
  }
}
Example response

• 400: Bad Request

• Content-Type: application/json

{
  "message" : "The property 'dummy' does not exist on Node[5]",
  "exception" : "BadInputException",
  "stacktrace" : [
    "org.neo4j.server.rest.repr.RepresentationExceptionHandlingIterable.exceptionOnHasNext(RepresentationExceptionHandlingIterable.java:51)",
    "org.neo4j.helpers.collection.ExceptionHandlingIterable$1.hasNext(ExceptionHandlingIterable.java:61)",
    "org.neo4j.helpers.collection.IteratorWrapper.hasNext(IteratorWrapper.java:42)",
    "org.neo4j.server.rest.repr.ListRepresentation.serialize(ListRepresentation.java:58)",
    "org.neo4j.server.rest.repr.Serializer.serialize(Serializer.java:75)",
    "org.neo4j.server.rest.repr.MappingSerializer.putList(MappingSerializer.java:61)",
    "org.neo4j.server.rest.repr.CypherResultRepresentation.serialize(CypherResultRepresentation.java:50)",
    "org.neo4j.server.rest.repr.MappingRepresentation.serialize(MappingRepresentation.java:42)",
    "org.neo4j.server.rest.repr.OutputFormat.format(OutputFormat.java:170)",
    "org.neo4j.server.rest.repr.OutputFormat.formatRepresentation(OutputFormat.java:120)",
    "org.neo4j.server.rest.repr.OutputFormat.response(OutputFormat.java:107)",
    "org.neo4j.server.rest.repr.OutputFormat.ok(OutputFormat.java:55)",
    "org.neo4j.server.rest.web.ExtensionService.invokeGraphDatabaseExtension(ExtensionService.java:122)",
    "java.lang.reflect.Method.invoke(Method.java:597)" ]
}
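A client can branch on the status code and surface the message and exception fields when they are present. A small Python sketch, assuming a (status, body) pair obtained from any HTTP client; the function name is hypothetical:

```python
import json

def check_cypher_response(status, body):
    # The plugin reports errors as a JSON document with "message",
    # "exception" and "stacktrace" fields alongside a non-200 status.
    doc = json.loads(body)
    if status != 200:
        raise RuntimeError("%s: %s" % (doc.get("exception"), doc.get("message")))
    return doc
```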

18.18. Gremlin Plugin

Gremlin is a Groovy-based graph traversal language. It provides a very expressive way of
explicitly scripting traversals through a Neo4j graph.

The Neo4j Gremlin Plugin provides an endpoint to send Gremlin scripts to the Neo4j Server.
The scripts are executed on the server database and the results are returned as Neo4j Node and
Relationship representations. This keeps the types throughout the REST API consistent. The results
are quite verbose when returning Neo4j Node, Relationship or Graph representations. On the
other hand, you can return just properties, as in the Section 18.18.4, “Send a Gremlin Script - JSON encoded
with table results” example, for responses tailored to specific needs.

Warning

The Gremlin plugin lets you execute arbitrary Groovy code under the hood. In hosted
and open environments, this can constitute a security risk. In these cases, consider using
declarative approaches like the Cypher query language (Chapter 15), or write your own server-side
plugin executing the interesting Gremlin or Java routines (see Section 10.1, “Server Plugins”), or
secure your server (see Section 24.1, “Securing access to the Neo4j Server”).
Tip

When returning results from pipes like g.v(0).in(), make sure to iterate through the
results so that the pipe's content is returned rather than the pipe object itself, as in
g.v(0).in().iterate(). For more caveats, see Gremlin Troubleshooting.
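A Gremlin script travels in the same kind of JSON envelope as a Cypher query. A minimal Python sketch for building the body (the helper name is hypothetical); note that the script text itself ends in .iterate(), per the tip above:

```python
import json

def gremlin_payload(script, params=None):
    # Body for POST .../ext/GremlinPlugin/graphdb/execute_script
    return json.dumps({"script": script, "params": params or {}})

payload = gremlin_payload("g.v(0).in().iterate()")
```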

18.18.1. Send a Gremlin Script - URL encoded

Scripts can be sent as URL-encoded form parameters. In this example, the graph has been auto-indexed
by Neo4j, so we can look up the name property on nodes.

Raw script source

g.idx('node_auto_index')[[name:'I']].out

Figure 18.84. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/x-www-form-urlencoded

script=g.idx%28%27node_auto_index%27%29%5B%5Bname%3A%27I%27%5D%5D.out
Example response

• 200: OK

• Content-Type: application/json

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/1/relationships/out",
  "data" : {
    "name" : "you"
  },
  "traverse" : "http://localhost:7474/db/data/node/1/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/1/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/1/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/1",
  "properties" : "http://localhost:7474/db/data/node/1/properties",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/1/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://localhost:7474/db/data/node/1/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/1/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/1/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/1/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/1/relationships/in/{-list|&|types}"
} ]

18.18.2. Load a sample graph

Importing a graph from a GraphML file can be achieved through the Gremlin GraphMLReader. The
following script imports a small GraphML file from a URL into Neo4j, resulting in the depicted
graph. The underlying database is auto-indexed (see Section 14.12, “Automatic Indexing”), so the script
can return the imported node by an index lookup.

Raw script source

g.clear()
g.loadGraphML('https://raw.github.com/neo4j/gremlin-plugin/master/src/data/graphml1.xml')
g.idx('node_auto_index')[[name:'you']]

Figure 18.85. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "g.clear();g.loadGraphML('https://raw.github.com/neo4j/gremlin-plugin/master/src/data/graphml1.xml');g.idx('node_auto_index')[[name:'you']];",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/10/relationships/out",
  "data" : {
    "name" : "you"
  },
  "traverse" : "http://localhost:7474/db/data/node/10/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/10/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/10/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/10",
  "properties" : "http://localhost:7474/db/data/node/10/properties",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/10/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://localhost:7474/db/data/node/10/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/10/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/10/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/10/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/10/relationships/in/{-list|&|types}"
} ]

18.18.3. Sort a result using raw Groovy operations

The following script returns a sorted list of all nodes connected via outgoing relationships to
node 1, sorted by their name-property.

Raw script source

g.idx('node_auto_index')[[name:'I']].out.sort{it.name}

Figure 18.86. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "g.idx('node_auto_index')[[name:'I']].out.sort{it.name}",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/16/relationships/out",
  "data" : {
    "name" : "him"
  },
  "traverse" : "http://localhost:7474/db/data/node/16/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/16/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/16/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/16",
  "properties" : "http://localhost:7474/db/data/node/16/properties",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/16/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://localhost:7474/db/data/node/16/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/16/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/16/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/16/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/16/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/15/relationships/out",
  "data" : {
    "name" : "you"
  },
  "traverse" : "http://localhost:7474/db/data/node/15/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/15/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/15/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/15",
  "properties" : "http://localhost:7474/db/data/node/15/properties",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/15/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://localhost:7474/db/data/node/15/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/15/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/15/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/15/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/15/relationships/in/{-list|&|types}"
} ]

18.18.4. Send a Gremlin Script - JSON encoded with table results

To send a script JSON encoded, set the payload Content-Type header to application/json. In this
example, we find all the things that my friends like and return a table with two columns, listing my
friends by name alongside the names of the things they like, ignoring the third named step variable I.
Remember that everything in Gremlin is an iterator: in order to populate the result table t, iterate
through the pipes with iterate().
Raw script source

t = new Table()
g.v(23).as('I').out('know').as('friend').out('like').as('likes').table(t,['friend','likes']){it.name}{it.name}.iterate()
t

Figure 18.87. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "t= new Table();g.v(23).as('I').out('know').as('friend').out('like').as('likes').table(t,['friend','likes']){it.name}{it.name}.iterate();t;",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

{
  "columns" : [ "friend", "likes" ],
  "data" : [ [ "Joe", "cats" ], [ "Joe", "dogs" ] ]
}

18.18.5. Returning nested pipes

Raw script source

g.v(27).as('I').out('know').as('friend').out('like').as('likes').table(new Table()){it.name}{it.name}.cap

Figure 18.88. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "g.v(27).as('I').out('know').as('friend').out('like').as('likes').table(new Table()){it.name}{it.name}.cap;",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

[ [ {
  "data" : [ [ "I", "Joe", "cats" ], [ "I", "Joe", "dogs" ] ],
  "columns" : [ "I", "friend", "likes" ]
} ] ]

18.18.6. Set script variables

To set variables in the bindings for the Gremlin Script Engine on the server, you can include a
params parameter with a String representing a JSON map of variables to set to initial values. These
can then be accessed as normal variables within the script.

Raw script source

meaning_of_life

Figure 18.89. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "meaning_of_life",
  "params" : {
    "meaning_of_life" : 42.0
  }
}
Example response

• 200: OK

• Content-Type: application/json

42.0
18.18.7. Send a Gremlin Script with variables in a JSON Map

Send a Gremlin script as a JSON payload, with additional parameters passed alongside it.

Raw script source

g.v(me).out

Figure 18.90. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "g.v(me).out",
  "params" : {
    "me" : "6"
  }
}
Example response

• 200: OK

• Content-Type: application/json

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/5/relationships/out",
  "data" : {
    "name" : "you"
  },
  "traverse" : "http://localhost:7474/db/data/node/5/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/5/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/5/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/5",
  "properties" : "http://localhost:7474/db/data/node/5/properties",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/5/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://localhost:7474/db/data/node/5/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/5/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/5/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/5/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/5/relationships/in/{-list|&|types}"
} ]

18.18.8. Return paths from a Gremlin script

The following script returns paths. Paths in Gremlin consist of the pipes that make up the path,
starting from the start pipes. The server returns JSON representations of their content as a nested list.

Raw script source

g.v(20).out.name.paths

Figure 18.91. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "g.v(20).out.name.paths",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

[ [ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/20/relationships/out",
  "data" : {
    "name" : "I"
  },
  "traverse" : "http://localhost:7474/db/data/node/20/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/20/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/20/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/20",
  "properties" : "http://localhost:7474/db/data/node/20/properties",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/20/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://localhost:7474/db/data/node/20/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/20/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/20/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/20/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/20/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/18/relationships/out",
  "data" : {
    "name" : "you"
  },
  "traverse" : "http://localhost:7474/db/data/node/18/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/18/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/18/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/18",
  "properties" : "http://localhost:7474/db/data/node/18/properties",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/18/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://localhost:7474/db/data/node/18/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/18/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/18/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/18/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/18/relationships/in/{-list|&|types}"
}, "you" ], [ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/20/relationships/out",
  "data" : {
    "name" : "I"
  },
  "traverse" : "http://localhost:7474/db/data/node/20/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/20/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/20/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/20",
  "properties" : "http://localhost:7474/db/data/node/20/properties",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/20/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://localhost:7474/db/data/node/20/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/20/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/20/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/20/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/20/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/19/relationships/out",
  "data" : {
    "name" : "him"
  },
  "traverse" : "http://localhost:7474/db/data/node/19/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/19/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/19/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/19",
  "properties" : "http://localhost:7474/db/data/node/19/properties",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/19/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://localhost:7474/db/data/node/19/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/19/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/19/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/19/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/19/relationships/in/{-list|&|types}"
}, "him" ] ]

18.18.9. Send an arbitrary Groovy script - Lucene sorting

This example demonstrates that, via the Groovy runtime embedded in the server, you have full
access to all of the server's Java APIs. The script below creates nodes in the database via both the
Blueprints and the Neo4j APIs, indexes the nodes via the native Neo4j indexing API, constructs a
custom Lucene sorting and searching, and returns a Neo4j IndexHits result iterator.

Raw script source

'******** Additional imports *********'
import org.neo4j.graphdb.index.*
import org.neo4j.graphdb.*
import org.neo4j.index.lucene.*
import org.apache.lucene.search.*

'**** Blueprints API methods on the injected Neo4jGraph at variable g ****'
meVertex = g.addVertex([name:'me'])
meNode = meVertex.getRawVertex()

'*** get the Neo4j raw instance ***'
neo4j = g.getRawGraph()

'******** Neo4j API methods: *********'
tx = neo4j.beginTx()
youNode = neo4j.createNode()
youNode.setProperty('name','you')
youNode.createRelationshipTo(meNode,DynamicRelationshipType.withName('knows'))

'*** index using Neo4j APIs ***'
idxManager = neo4j.index()
personIndex = idxManager.forNodes('persons')
personIndex.add(meNode,'name',meNode.getProperty('name'))
personIndex.add(youNode,'name',youNode.getProperty('name'))
tx.success()
tx.finish()

'*** Prepare a custom Lucene query context with Neo4j API ***'
query = new QueryContext( 'name:*' ).sort( new Sort(new SortField( 'name',SortField.STRING, true ) ) )
results = personIndex.query( query )

Figure 18.92. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "'******** Additional imports *********';import org.neo4j.graphdb.index.*;import org.neo4j.graphdb.*;import org.neo4j.index.lucene.*;import org.apache.lucene.search.*;;'**** Blueprints API methods on the injected Neo4jGraph at variable g ****';meVertex = g.addVertex([name:'me']);meNode = meVertex.getRawVertex();;'*** get the Neo4j raw instance ***';neo4j = g.getRawGraph();;;'******** Neo4j API methods: *********';tx = neo4j.beginTx(); youNode = neo4j.createNode(); youNode.setProperty('name','you'); youNode.createRelationshipTo(meNode,DynamicRelationshipType.withName('knows'));;'*** index using Neo4j APIs ***'; idxManager = neo4j.index(); personIndex = idxManager.forNodes('persons'); personIndex.add(meNode,'name',meNode.getProperty('name')); personIndex.add(youNode,'name',youNode.getProperty('name'));tx.success();tx.finish();;;'*** Prepare a custom Lucene query context with Neo4j API ***';query = new QueryContext( 'name:*' ).sort( new Sort(new SortField( 'name',SortField.STRING, true ) ) );results = personIndex.query( query );",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/30/relationships/out",
  "data" : {
    "name" : "you"
  },
  "traverse" : "http://localhost:7474/db/data/node/30/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/30/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/30/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/30",
  "properties" : "http://localhost:7474/db/data/node/30/properties",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/30/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://localhost:7474/db/data/node/30/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/30/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/30/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/30/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/30/relationships/in/{-list|&|types}"
}, {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/29/relationships/out",
  "data" : {
    "name" : "me"
  },
  "traverse" : "http://localhost:7474/db/data/node/29/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/29/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/29/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/29",
  "properties" : "http://localhost:7474/db/data/node/29/properties",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/29/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://localhost:7474/db/data/node/29/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/29/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/29/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/29/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/29/relationships/in/{-list|&|types}"
} ]
18.18.10. Emit a sample graph

Exporting a graph can be done by simply emitting the appropriate String.

Raw script source

writer = new GraphMLWriter(g)
out = new java.io.ByteArrayOutputStream()
writer.outputGraph(out)
result = out.toString()

Figure 18.93. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "writer = new GraphMLWriter(g);out = new java.io.ByteArrayOutputStream();writer.outputGraph(out);result = out.toString();",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

"<?xml version=\"1.0\" ?><graphml xmlns=\"http://graphml.graphdrawing.org/xmlns\"><key id=\"name\" for=\"node\" attr.name=\"name\" attr.type=\"string\"></key><graph id=\"G\" edgedefault=\"directed\"><node id=\"12\"><data key=\"name\">you</data></node><node id=\"13\"><data key=\"name\">him</data></node><node id=\"14\"><data key=\"name\">I</data></node><edge id=\"6\" source=\"14\" target=\"12\" label=\"know\"></edge><edge id=\"7\" source=\"14\" target=\"13\" label=\"know\"></edge></graph></graphml>"

18.18.11. HyperEdges - find user roles in groups

Imagine a user being part of different groups. A group can have different roles, and a user can be
part of different groups. A user can also have different roles in different groups apart from the
membership. The association of a User, a Group, and a Role can be referred to as a HyperEdge.
However, it can easily be modeled in a property graph as a node that captures this n-ary relationship,
as depicted below in the U1G2R1 node.

To find out in what roles a user is for a particular group (here Group2), the following script can
traverse this HyperEdge node and provide answers.

Raw script source

g.v(37).out('hasRoleInGroup').as('hyperedge').out('hasGroup').filter{it.name=='Group2'}.back('hyperedge').out('hasRole').name

Figure 18.94. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "g.v(37).out('hasRoleInGroup').as('hyperedge').out('hasGroup').filter{it.name=='Group2'}.back('hyperedge').out('hasRole').name",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

[ "Role 1" ]
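The same hyperedge walk can be mirrored outside the database with plain dictionaries, which makes the traversal's shape easy to see. A Python sketch; the data layout is illustrative, not a Neo4j API:

```python
def roles_in_group(user, group_name):
    # Walk user -> hyperedge -> group, filter on the group name,
    # then step from the surviving hyperedges to their roles.
    return [edge["hasRole"]["name"]
            for edge in user["hasRoleInGroup"]
            if edge["hasGroup"]["name"] == group_name]

user = {"hasRoleInGroup": [
    {"hasGroup": {"name": "Group1"}, "hasRole": {"name": "Role 2"}},
    {"hasGroup": {"name": "Group2"}, "hasRole": {"name": "Role 1"}},
]}
```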

18.18.12. Group count

This example shows a group count in Gremlin: counting the different relationship types
connected to the start node. The result is collected into a variable that is then returned.

Raw script source

m = [:]
g.v(41).bothE().label.groupCount(m).iterate()
m

Figure 18.95. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "m = [:];g.v(41).bothE().label.groupCount(m).iterate();m",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

{
  "knows" : 2,
  "likes" : 1
}
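groupCount is essentially a multiset counter keyed by the traversed element. The same bookkeeping in Python, over an illustrative edge list (not a Neo4j API):

```python
from collections import Counter

def group_count_labels(edges, node):
    # Count relationship types over all edges incident to `node`,
    # mirroring bothE().label.groupCount(m).
    return Counter(e["label"] for e in edges
                   if node in (e["from"], e["to"]))

edges = [
    {"from": "I", "to": "you", "label": "knows"},
    {"from": "I", "to": "him", "label": "knows"},
    {"from": "I", "to": "cats", "label": "likes"},
]
```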

18.18.13. Collect multiple traversal results

Multiple traversals can be combined into a single result, using splitting and merging pipes in a
lazy fashion.

Raw script source

g.idx('node_auto_index')[[name:'Peter']].copySplit(_().out('knows'), _().in('likes')).fairMerge.name

Figure 18.96. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "g.idx('node_auto_index')[[name:'Peter']].copySplit(_().out('knows'), _().in('likes')).fairMerge.name",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

[ "Ian", "Marie" ]

18.18.14. Collaborative filtering

This example demonstrates basic collaborative filtering: ordering a traversal by occurrence
counts and subtracting objects that are not interesting from the final result.

Here, we are finding friends-of-friends that are not already Joe's friends. The same can be
applied to graphs of users that LIKE things, and so on.

Raw script source

x=[]
fof=[:]
g.v(63).out('knows').aggregate(x).out('knows').except(x).groupCount(fof).iterate()
fof.sort{a,b -> b.value <=> a.value}

Figure 18.97. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "x=[];fof=[:];g.v(63).out('knows').aggregate(x).out('knows').except(x).groupCount(fof).iterate();fof.sort{a,b -> b.value <=> a.value}",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

{
  "v[61]" : 2,
  "v[60]" : 1,
  "v[62]" : 1
}
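The aggregate/except/groupCount pattern is ordinary friends-of-friends counting. A self-contained Python sketch of the same logic, over an illustrative adjacency map:

```python
from collections import Counter

def friends_of_friends(knows, person):
    # aggregate(x): remember direct friends; except(x): skip them when
    # counting the people those friends know; groupCount: tally the rest.
    direct = set(knows.get(person, ()))
    fof = Counter()
    for friend in direct:
        for candidate in knows.get(friend, ()):
            if candidate not in direct and candidate != person:
                fof[candidate] += 1
    return fof.most_common()  # descending count, like the sort step

knows = {"Joe": ["Sara", "Ian"], "Sara": ["Ian", "Marie"], "Ian": ["Marie"]}
```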

18.18.15. Chunking and offsetting in Gremlin

Raw script source

g.v(51).out('knows').filter{it.name == 'Sara'}[0..100]

Figure 18.98. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "g.v(51).out('knows').filter{it.name == 'Sara'}[0..100]",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

[ {
  "outgoing_relationships" : "http://localhost:7474/db/data/node/49/relationships/out",
  "data" : {
    "name" : "Sara"
  },
  "traverse" : "http://localhost:7474/db/data/node/49/traverse/{returnType}",
  "all_typed_relationships" : "http://localhost:7474/db/data/node/49/relationships/all/{-list|&|types}",
  "property" : "http://localhost:7474/db/data/node/49/properties/{key}",
  "self" : "http://localhost:7474/db/data/node/49",
  "properties" : "http://localhost:7474/db/data/node/49/properties",
  "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/49/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://localhost:7474/db/data/node/49/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://localhost:7474/db/data/node/49/relationships",
  "paged_traverse" : "http://localhost:7474/db/data/node/49/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://localhost:7474/db/data/node/49/relationships/all",
  "incoming_typed_relationships" : "http://localhost:7474/db/data/node/49/relationships/in/{-list|&|types}"
} ]
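The [0..100] step is Groovy's inclusive range applied to a pipe, so it returns elements 0 through 100. The equivalent offset-and-limit in Python slicing (note the end index must be shifted by one):

```python
def range_filter(items, start, end):
    # Groovy's [start..end] includes both endpoints; Python slices
    # exclude the upper bound, hence the +1.
    return items[start:end + 1]
```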

18.18.16. Modify the graph while traversing

This example shows how to modify the graph while traversing it: here, all relationships
connected to the start node are removed during the traversal.

Figure 18.99. Starting Graph

Raw script source

g.v(44).bothE.each{g.removeEdge(it)}

Figure 18.100. Final Graph

Example request

• POST http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json
• Content-Type: application/json

{
  "script" : "g.v(44).bothE.each{g.removeEdge(it)}",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

[ ]

18.18.17. Flow algorithms with Gremlin

This is a basic stub example for implementing flow algorithms in for instance Flow Networks
with a pipes-based approach and scripting, here between source and sink using the capacity
property on relationships as the base for the flow function and modifying the graph during
calculation.

图 18.101. Starting Graph

Raw script source

source = g.v(72)
sink = g.v(73)
maxFlow = 0
source.outE.inV.loop(2){!it.object.equals(sink)}.paths.each{
  flow = it.capacity.min()
  maxFlow += flow
  it.findAll{it.capacity}.each{it.capacity -= flow}}
maxFlow

图 18.102. Final Graph

Example request

• POST
http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json

• Content-Type: application/json

{
  "script" : "source=g.v(72);sink=g.v(73);maxFlow = 0;source.outE.inV.loop(2){!it.object.equals(sink)}.paths.each{flow = it.capacity.min(); maxFlow += flow;it.findAll{it.capacity}.each{it.capacity -= flow}};maxFlow",
  "params" : {
  }
}
Example response

• 200: OK

• Content-Type: application/json

4

18.18.18. Script execution errors

Script errors will result in an HTTP error response code.

图 18.103. Final Graph

Example request

• POST
http://localhost:7474/db/data/ext/GremlinPlugin/graphdb/execute_script
• Accept: application/json

• Content-Type: application/json

{
  "script" : "g.addVertex([name:{}])"
}
Example response

• 400: Bad Request

• Content-Type: application/json

{
  "message" : "javax.script.ScriptException: java.lang.IllegalArgumentException: Unknown property type on: Script25$_run_closure1@6160722b, class Script25$_run_closure1",
  "exception" : "BadInputException",
  "stacktrace" : [ "org.neo4j.server.plugin.gremlin.GremlinPlugin.executeScript(GremlinPlugin.java:88)", "java.lang.reflect.Method.invoke(Method.java:597)", "org.neo4j.server.plugins.PluginMethod.invoke(PluginMethod.java:57)", "org.neo4j.server.plugins.PluginManager.invoke(PluginManager.java:168)", "org.neo4j.server.rest.web.ExtensionService.invokeGraphDatabaseExtension(ExtensionService.java:300)", "org.neo4j.server.rest.web.ExtensionService.invokeGraphDatabaseExtension(ExtensionService.java:122)", "java.lang.reflect.Method.invoke(Method.java:597)" ]
}
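作为补充,下面用 Python 标准库给出一个假想的客户端示意:演示如何组装上文 execute_script 请求的 JSON 请求体,以及如何解析 400 错误响应里的字段。它不会真正向 http://localhost:7474 发起请求,字段名取自上文范例,仅作说明之用。

```python
import json

# 组装与上文范例相同的请求体(script 与空的 params)
payload = json.dumps({
    "script": "g.v(44).bothE.each{g.removeEdge(it)}",
    "params": {},
})

# 脚本出错时服务器返回 400,响应体是一个包含
# message 与 exception 字段的 JSON 对象(见上文范例)
error_body = '{"message": "javax.script.ScriptException: ...", "exception": "BadInputException"}'
error = json.loads(error_body)
print(error["exception"])  # BadInputException
```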

第 19 章 在 Python 中使用 Neo4j 嵌入模式


本章描述 _neo4j-embedded_,一个让你在 Python 应用中嵌入 Neo4j 数据库的 Python 库。要了解如何通过 REST 接口在 Python 中访问 Neo4j 服务器,请参考:第 9 章 在 Python 应用中使用 Neo4j。

除了参考文档和本章的安装介绍之外,你还可以参考:

这个工程在 GitHub 上面的源代码地址: https://github.com/neo4j/python-embedded

19.1. 安装
注意: Neo4j 数据库本身(来自社区版)就包含在 neo4j-embedded 发行版中。

19.1.1. 安装到 OSX/Linux

先决条件
特别注意:确保使用的整个堆栈要么是 64 位的,要么是 32 位(默认就是 32 位)。这都
是为了正常使用 JVM,Python 和 JPype。

首先,安装 JPype:

1. 从 http://sourceforge.net/projects/jpype/files/JPype/下载 JPype 的最新版本。


2. 解压下载压缩包。
3. 打开控制台,并进入压缩包目录。
4. 运行命令:`sudo python setup.py install`。

JPype 在 Debian 的源里面也有:

sudo apt-get install python-jpype
然后,确保 JAVA_HOME 环境变量指向你的 jre 或 jdk 目录,以保证 JPype 能找到 JVM。

注意: 在 OSX 上面安装是有问题的。看下面在 Stack Overflow 的讨论来获取更多的帮助:


http://stackoverflow.com/questions/8525193/cannot-install-jpype-on-os-x-lion-to-use-with-neo4j 。

安装 neo4j-embedded
你可以用你的 Python 包管理工具安装 neo4j-embedded:

sudo pip install neo4j-embedded

或者:

sudo easy_install neo4j-embedded
或者手工安装:

1. 从 http://pypi.python.org/pypi/neo4j-embedded/下载最新发行版。


2. 解压下载的压缩文件。
3. 打开控制台并进入到解压目录。
4. 运行命令: sudo python setup.py install

19.1.2. 安装到 Windows

先决条件
警告: 确保使用的整个堆栈要么是 64 位的,要么是 32 位(默认就是 32 位)。这都是为
了正常使用 JVM,Python,JPype 和所有额外的 DLL。

首先,安装 JPype:

注意

注意 JPype 只能在 Python 2.6 和 2.7 上工作。还要注意,应该下载哪个安装包取决于你使用的 Python 版本。

1. 从 32 位, 64 位下载 JPype 最新版本。


2. 运行安装程序。

然后,确保 JAVA_HOME 环境变量指向你的 jre 或 jdk 目录。要了解详细的环境变量设置方法,请参考:“解决缺失 DLL 文件的问题”一节。

注意:如果有 JPype 需要的 DLL 文件缺失,请参考: “解决缺失 DLL 文件的问题”一节 的


介绍来修复它。

安装 neo4j-embedded
1. 从 http://pypi.python.org/pypi/neo4j-embedded/下载最新版本。
2. 运行安装程序。

解决缺失 DLL 文件的问题


Windows 的某些版本缺少通过编程方式启动一个 JVM 所需的 DLL 文件。你需要保证 IEShims.dll 和某些调试用的 DLL 文件在系统中可用。

IEShims.dll 一般在 Internet Explorer 的安装目录里。要让这个文件出现在你的系统路径下,你需要把 IE 的安装目录加入到 PATH 中。

1. 右键点击 "我的电脑" 或者 "电脑"。


2. 选择 "属性"。
3. 点击 "高级" 或者 "高级系统设置"。
4. 点击 "环境变量" 按钮。
5. 找到 path 变量,并增加路径 C:\Program Files\Internet Explorer 到他的后面(如果
你的 IE 在其他目录,请用其他目录代替)。

调试涉及到的 DLL 文件都在 Microsoft Visual C++ Redistributable 库里面。

• 32bit Windows:
http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=5555
• 64bit Windows:
http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=14632

如果你依然还有问题,你可以使用 http://www.dependencywalker.com/ 打开你的 jvm.dll


(在 JAVA_HOME/bin/client/ 或者 JAVA_HOME/bin/server/ 下面),然后他会告诉你是否缺失
文件。

19.2. Core API


这一节描述了如何搭建并运行环境,以及如何完成一些基本操作。

19.2.1. 开始

创建一个数据库
from neo4j import GraphDatabase

# Create db
db = GraphDatabase(folder_to_put_db_in)

# Always shut down your database
db.shutdown()

用配置创建一个数据库
要了解你能使用的配置选项,请参考:第 21 章 配置和调优 。

from neo4j import GraphDatabase

# Example configuration parameters
db = GraphDatabase(folder_to_put_db_in, string_block_size=200, array_block_size=240)

db.shutdown()

JPype JVM 配置
你能使用 NEO4J_PYTHON_JVMARGS 环境变量来设置扩展参数,以便传递给 JVM。这是
可以使用的,比如增加数据库的最大内存。

注意你必须在引入 neo4j 包之前设置这个,要么在你启动 python 之前设置,要么在你的应


用中通过程序来设置。

import os
os.environ['NEO4J_PYTHON_JVMARGS'] = '-Xms128M -Xmx512M'
import neo4j
你可以通过使用环境变量 NEO4J_PYTHON_CLASSPATH 来重载 neo4j-embedded 要使用的
类路径。

19.2.2. 事务
所有写到数据库的操作都必须在一个事务中执行。这确保了你的数据库不会处于数据不一致的状态。

了解关于如何在 Neo4j 中控制事务的细节,请参考:第 12 章 事务管理。

我们用 Python 的 with 声明来定义一个事务。如果你使用较老的 Python 版本,你可能必须先引入 with 声明:

from __future__ import with_statement
无论哪种方式,这就是进入事务的方法:

# Start a transaction
with db.transaction:
    # This is inside the transactional
    # context. All work done here
    # will either entirely succeed,
    # or no changes will be applied at all.

    # Create a node
    node = db.node()

    # Give it a name
    node['name'] = 'Cat Stevens'

# The transaction is automatically
# committed when you exit the with
# block.

19.2.3. 节点
本节描述节点对象特有的一些操作。要了解如何操作节点和关系上的属性,请参考:第 19.2.5 节 “属性” 。

创建一个节点
with db.transaction:
    # Create a node
    thomas = db.node(name='Thomas Anderson', age=42)

通过 Id 找到一个节点
# You don't have to be in a transaction
# to do read operations.
a_node = db.node[some_node_id]

# Ids on nodes and relationships are available via the "id"
# property, eg.:
node_id = a_node.id

找到参考节点
reference = db.reference_node

移除一个节点
with db.transaction:
    node = db.node()
    node.delete()
提示

也可以参考:第 12.5 节 “Delete


semantics”。

通过 id 移除一个节点
with db.transaction:
    del db.node[some_node_id]

从一个节点上访问他的关系
要获取关于你在关系对象上面你能做什么操作的细节,请参考: 第 19.2.4 节 “关系”。

# All relationships on a node
for rel in a_node.relationships:
    pass

# Incoming relationships
for rel in a_node.relationships.incoming:
    pass

# Outgoing relationships
for rel in a_node.relationships.outgoing:
    pass

# Relationships of a specific type
for rel in a_node.mayor_of:
    pass

# Incoming relationships of a specific type
for rel in a_node.mayor_of.incoming:
    pass

# Outgoing relationships of a specific type
for rel in a_node.mayor_of.outgoing:
    pass

获取并计算节点个数
使用这个必须小心,在大型数据集中它会变得非常慢。

for node in db.nodes:
    pass

# Shorthand for iterating through
# and counting all nodes
number_of_nodes = len(db.nodes)

19.2.4. 关系
这节描述了关系对象的一些操作。要获取关于如何控制节点和关系上面的属性,请参考:
第 19.2.5 节 “属性”。

创建一个关系
with db.transaction:
    # Nodes to create a relationship between
    steven = self.graphdb.node(name='Steve Brook')
    poplar_bluff = self.graphdb.node(name='Poplar Bluff')

    # Create a relationship of type "mayor_of"
    relationship = steven.mayor_of(poplar_bluff, since="12th of July 2012")

    # Or, to create relationship types with names
    # that would not be possible with the above
    # method.
    steven.relationships.create('mayor_of', poplar_bluff, since="12th of July 2012")
2

通过 Id 找到一个关系
the_relationship = db.relationship[a_relationship_id]

移除一个关系
with db.transaction:
    # Create a relationship
    source = db.node()
    target = db.node()
    rel = source.Knows(target)

    # Delete it
    rel.delete()
提示

也可以参考:第 12.5 节 “Delete


semantics”。

通过 Id 移除一个关系
with db.transaction:
    del db.relationship[some_relationship_id]

关系的起点,终点和关系类型
relationship_type = relationship.type

start_node = relationship.start
end_node = relationship.end

获取所有的关系以及数量
使用这个必须小心,在大型数据集中它会变得非常慢。

for rel in db.relationships:
    pass

# Shorthand for iterating through
# and counting all relationships
number_of_rels = len(db.relationships)

19.2.5. 属性
节点和关系都可以有属性,所以本部分的内容同时适用于节点和关系。属性允许的值类型包括:字符串、数字、布尔型以及数组。在每一个数组内部,所有值的类型都必须相同。

设置属性值
with db.transaction:
    node_or_rel['name'] = 'Thomas Anderson'
    node_or_rel['age'] = 42
    node_or_rel['favourite_numbers'] = [1,2,3]
    node_or_rel['favourite_words'] = ['banana','blue']

获取属性值
numbers = node_or_rel['favourite_numbers']

移除属性
with db.transaction:
    del node_or_rel['favourite_numbers']

遍历属性
# Loop key and value at the same time
for key, value in node_or_rel.items():
    pass

# Loop property keys
for key in node_or_rel.keys():
    pass

# Loop property values
for value in node_or_rel.values():
    pass

19.2.6. 路径
一个路径对象表示在一个图中的两个节点之间的一个路径。路径因此至少包括两个节点和
一个关系,但不限长度。这个对象在 API 的不同部分使用,最多的地方请参考: 遍历查询。

访问开始节点和结束节点
start_node = path.start
end_node = path.end

访问关系
last_relationship = path.last_relationship

遍历整个路径
你可以直接遍历路径中的所有元素,也可以只遍历其中的节点或关系。遍历所有元素时,第一个元素是起始节点,第二个是第一个关系,第三个是该关系到达的节点,以此类推。

for item in path:
    # Item is either a Relationship,
    # or a Node
    pass

for node in path.nodes:
    # All nodes in a path
    pass

for rel in path.relationships:
    # All relationships in a path
    pass

19.3. 索引

为了快速通过属性找到节点或者关系,Neo4j 支持索引。这个常被用来找到 traversals 用
的起始节点。

默认情况下,相关的索引是由 Apache Lucene 提供的。但也能使用其他索引实现来提供。

您可以创建任意数量的命名索引。每个索引控制节点或者关系,而每个索引都通过
key/value/object 三个参数来工作。其中 object 要么是一个节点,要么是一个关系,取决于索引
类型。

19.3.1. 索引管理
与 API 的其他部分一样,所有对一个索引的写操作都必须在一个事务中完成。

创建一个索引
创建一个带配置的索引

with db.transaction:
    # Create a relationship index
    rel_idx = db.relationship.indexes.create('my_rels')

    # Create a node index, passing optional
    # arguments to the index provider.
    # In this case, enable full-text indexing.
    node_idx = db.node.indexes.create('my_nodes', type='fulltext')

获取一个已经存在的索引
with db.transaction:
    node_idx = db.node.indexes.get('my_nodes')

    rel_idx = db.relationship.indexes.get('my_rels')

移除索引
with db.transaction:
    node_idx = db.node.indexes.get('my_nodes')
    node_idx.delete()

    rel_idx = db.relationship.indexes.get('my_rels')
    rel_idx.delete()

检查一个索引是否存在
exists = db.node.indexes.exists('my_nodes')

19.3.2. 被索引的东西
增加节点或者关系到索引
with db.transaction:
    # Indexing nodes
    a_node = db.node()
    node_idx = db.node.indexes.create('my_nodes')

    # Add the node to the index
    node_idx['akey']['avalue'] = a_node

    # Indexing relationships
    a_relationship = a_node.knows(db.node())
    rel_idx = db.relationship.indexes.create('my_rels')

    # Add the relationship to the index
    rel_idx['akey']['avalue'] = a_relationship

移除索引的条目
在不同的层面移除索引的条目。看下面的范例了解信息。

# Remove specific key/value/item triplet
del idx['akey']['avalue'][item]

# Remove all instances under a certain
# key
del idx['akey'][item]

# Remove all instances all together
del idx[item]

19.3.3. 查询一个索引
你可以通过两种方式获取索引中的条目:要么做一次直接查找,要么执行一个查询。直接查找在不同的索引服务中都是一样的,而查询的语法则取决于你使用的索引服务。像之前提到的,Lucene 是默认的索引服务,也是使用最多的。要了解 Lucene 的查询语法,请参考: Lucene query language。

一个编程生成 Lucene 查询的 python 库在这里:GitHub。
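下面是一个假设性的小示意,演示如何手工拼出一个最简单的 Lucene 查询词条并转义部分特殊字符(lucene_term 是本文虚构的辅助函数,并非上文提到的那个查询构造库,也未处理空格与短语查询):

```python
# 示意:拼出 field:value 形式的 Lucene 查询词条,
# 并对常见特殊字符做最简单的转义(仅演示思路)。
def lucene_term(field, value):
    specials = '+-&|!(){}[]^"~*?:\\'
    escaped = ''.join('\\' + c if c in specials else c for c in value)
    return '%s:%s' % (field, escaped)

print(lucene_term('name', 'tony'))    # name:tony
print(lucene_term('title', 'what?'))  # title:what\?
```

生成的字符串可以直接传给上文的 idx.query(...) 使用。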

重要

除非遍历整个索引结果,否则当你用完它后你必须关闭结果。如果你没有关闭,
数据库不知道他什么时候能释放结果资源。

直接查找
hits = idx['akey']['avalue']
for item in hits:
    pass

# Always close index results when you are
# done, to free up resources.
hits.close()

查询
hits = idx.query('akey:avalue')
for item in hits:
    pass

# Always close index results when you are
# done, to free up resources.
hits.close()

19.4. Cypher 查询
19.4.1. 查询并读取结果
19.4.2. 参数化,并准备查询
你能在 neo4j-embedded 中使用 Cypher 查询语言。要阅读更多关于 cypher 的语法以及你能
使用的非常方便的工具,请参考:第 15 章 Cypher 查询语言。

19.4.1. 查询并读取结果

基本查询
执行一个文本查询,如下:

result = db.query("START n=node(0) RETURN n")

接收查询的结果
Cypher 返回表格式的结果。你可以逐行遍历结果表,也可以按给定的列读取其中的值。

这是如何逐行遍历的:

root_node = "START n=node(0) RETURN n"

# Iterate through all result rows
for row in db.query(root_node):
    node = row['n']

# We know it's a single result,
# so we could have done this as well
node = db.query(root_node).single['n']
这是按列取值的方法:

root_node = "START n=node(0) RETURN n"

# Fetch an iterator for the "n" column
column = db.query(root_node)['n']

for cell in column:
    node = cell

# Columns support "single":
column = db.query(root_node)['n']
node = column.single

列出结果中的列
你能得到列名的一个列表,如下:

result = db.query("START n=node(0) RETURN n,count(n)")

# Get a list of the column names
columns = result.keys()

19.4.2. 参数化,并准备查询

参数化查询
Cypher 支持参数化查询,请参考:cypher-parameters。下面演示了如何在 neo4j-embedded 中使用它们。

result = db.query("START n=node({id}) RETURN n", id=0)

node = result.single['n']

准备查询
准备查询(即预先获取一个 Cypher 查询的解析结果供之后使用)这一功能已经被废弃。Cypher 能识别出之前解析过的查询,不会对同一个字符串解析两次。
因此,只要一个查询被使用超过一次,实际上它就是一个预解析查询。要完全发挥这一机制的好处,请使用参数化查询——这样一个通用的查询只会被解析一次,之后每次执行时只用参数替换其中的值。
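为了说明“同一查询字符串只解析一次”的思想,下面用一个极简的解析缓存做示意(纯 Python 演示,并非 Cypher 引擎的真实实现):

```python
# 示意:同一个查询字符串只解析一次,之后每次执行只替换参数。
parse_count = {"n": 0}
_parsed = {}

def parse(query):
    parse_count["n"] += 1
    return query.split()          # 假装这是解析后的执行计划

def execute(query, **params):
    if query not in _parsed:      # 只有第一次见到的字符串才会被解析
        _parsed[query] = parse(query)
    return (_parsed[query], params)

query = "START n=node({id}) RETURN n"
execute(query, id=0)
execute(query, id=1)              # 第二次执行复用缓存的解析结果
print(parse_count["n"])           # 1
```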

19.5. 遍历查询
警告

在 neo4j-embedded for python 中支持的遍历查询从 Neo4j 1.7 GA 起已经废弃。请参考 第 19.4 节 “Cypher 查询”,或者用核心 API 代替。因为遍历查询框架要求在 JVM 和 Python 之间有一个紧密的耦合,为了提升性能,我们需要打破这个耦合。
下面的文档在 neo4j-embedded 1.8 中被移除了,而对遍历的支持在 neo4j-embedded 1.9 已经
被删除了。

在这使用的遍历查询 API 本质上和 Java API 中的是一样的,略有一点修改。

遍历查询开始于一个给定的节点而使用大量的规则在图中移动以便找到我们想要的部分。

19.5.1. 基本的遍历查询

跟随一个关系
最基本的遍历查询简单地跟随某个关系类型,并返回沿途遇到的一切。默认情况下,每个节点只会被访问一次,因此没有死循环的风险。

traverser = db.traversal()\
    .relationships('related_to')\
    .traverse(start_node)

# The graph is traversed as
# you loop through the result.
for node in traverser.nodes:
    pass

在一个特定的方向跟随一个关系
你可以告诉遍历查询你只跟随某一方向的关系

from neo4j import OUTGOING, INCOMING, ANY

traverser = db.traversal()\
    .relationships('related_to', OUTGOING)\
    .traverse(start_node)

跟随多个关系类型
你能指定任意数量的关系类型及其方向来跟随。

from neo4j import OUTGOING, INCOMING, ANY

traverser = db.traversal()\
    .relationships('related_to', INCOMING)\
    .relationships('likes')\
    .traverse(start_node)

19.5.2. 遍历查询结果
一个遍历查询能给你三个不同的结果之一:nodes, relationships 或者 paths。

遍历查询是懒加载执行的,意味着只有当你轮循结果集时才会真正的去遍历。

traverser = db.traversal()\
    .relationships('related_to')\
    .traverse(start_node)

# Get each possible path
for path in traverser:
    pass

# Get each node
for node in traverser.nodes:
    pass

# Get each relationship
for relationship in traverser.relationships:
    pass

19.5.3. 唯一性
为了避免无限死循环,定义遍历中哪些部分可以被重新访问是非常重要的。默认情况下,唯一性参数设置为 NODE_GLOBAL,这意味着每个节点只能被访问一次。

这儿有一些其他设置可以使用。

from neo4j import Uniqueness

# Available options are:

Uniqueness.NONE
# Any position in the graph may be revisited.

Uniqueness.NODE_GLOBAL
# Default option
# No node in the entire graph may be visited
# more than once. This could potentially
# consume a lot of memory since it requires
# keeping an in-memory data structure
# remembering all the visited nodes.

Uniqueness.RELATIONSHIP_GLOBAL
# No relationship in the entire graph may be
# visited more than once. For the same
# reasons as NODE_GLOBAL uniqueness, this
# could use up a lot of memory. But since
# graphs typically have a larger number of
# relationships than nodes, the memory
# overhead of this uniqueness level could
# grow even quicker.

Uniqueness.NODE_PATH
# A node may not occur previously in the
# path reaching up to it.

Uniqueness.RELATIONSHIP_PATH
# A relationship may not occur previously in
# the path reaching up to it.

Uniqueness.NODE_RECENT
# Similar to NODE_GLOBAL uniqueness in that
# there is a global collection of visited
# nodes each position is checked against.
# This uniqueness level does however have a
# cap on how much memory it may consume in
# the form of a collection that only
# contains the most recently visited nodes.
# The size of this collection can be
# specified by providing a number as the
# second argument to the
# uniqueness()-method along with the
# uniqueness level.

Uniqueness.RELATIONSHIP_RECENT
# works like NODE_RECENT uniqueness, but
# with relationships instead of nodes.

traverser = db.traversal()\
    .uniqueness(Uniqueness.NODE_PATH)\
    .traverse(start_node)

19.5.4. 顺序
你能通过宽度优先或者深度优先遍历。深度优先是默认使用的,因为他消耗更少的内存。

# Depth first traversal, this
# is the default.
traverser = db.traversal()\
    .depthFirst()\
    .traverse(self.source)

# Breadth first traversal
traverser = db.traversal()\
    .breadthFirst()\
    .traverse(start_node)

19.5.5. 评估器 - 高级过滤器


为了能基于其他条件进行遍历,比如节点属性,或者更复杂的条件比如邻居节点或模式,我们需要使用评估器。评估器是一个普通的 Python 方法,它以一个路径作为参数,返回一个对下一步动作的描述。

路径参数是遍历器当前的位置,而我们下一步能做的事情是下面四种之一,就像下面的范
例一样。

from neo4j import Evaluation

# Evaluation contains the four
# options that an evaluator can
# return. They are:

Evaluation.INCLUDE_AND_CONTINUE
# Include this node in the result and
# continue the traversal

Evaluation.INCLUDE_AND_PRUNE
# Include this node in the result, but don't
# continue the traversal

Evaluation.EXCLUDE_AND_CONTINUE
# Exclude this node from the result, but
# continue the traversal

Evaluation.EXCLUDE_AND_PRUNE
# Exclude this node from the result and
# don't continue the traversal

# An evaluator
def my_evaluator(path):
    # Filter on end node property
    if path.end['message'] == 'world':
        return Evaluation.INCLUDE_AND_CONTINUE

    # Filter on last relationship type
    if path.last_relationship.type.name() == 'related_to':
        return Evaluation.INCLUDE_AND_PRUNE

    # You can do even more complex things here, like subtraversals.

    return Evaluation.EXCLUDE_AND_CONTINUE

# Use the evaluator
traverser = db.traversal()\
    .evaluator(my_evaluator)\
    .traverse(start_node)

部分 IV. 操作
这个部分将介绍如何安装和维护一个 Neo4j 安装程序。这也包括类似备份数据库和监控服
务器健康状况等话题的介绍。

第 20 章 安装和部署

20.1. 部署方案
Neo4j 可以嵌入到你的应用中,作为一个独立服务器运行,或者部署成 HA 集群模式提供高可用的高性能服务。

表 20.1. Neo4j 部署选项

            Single Instance         Multiple Instances
Embedded    EmbeddedGraphDatabase   HighlyAvailableGraphDatabase
Standalone  Neo4j Server            Neo4j Server high availability mode

20.1.1. 服务器模式
Neo4j 通常作为一个独立服务器访问,要么直接通过 REST 接口,要么通过特定语言的驱动。关于 Neo4j 服务器的信息,请参考: 第 17 章 Neo4j 服务器。要以 HA 模式运行服务器,请参考: 第 22 章 高可用性模式。

20.1.2. 嵌入模式

通过引入正确的 Java 库,就可以将 Neo4j 嵌入到你的应用中。

20.2. 系统要求
内存限制了图数据库的大小,磁盘 I/O 限制了读写的性能。

20.2.1. 中央处理器
对于大型图数据库,性能瓶颈通常在内存或 I/O 上;对于能完全放入内存的图数据库,瓶颈则在计算上。

最小要求
Intel 486
推荐配置
Intel Core i7

20.2.2. 内存
更多内存允许更大的图数据库,但也会增加更多的垃圾回收操作。

最小要求
1GB
推荐配置
4-8GB

20.2.3. 磁盘
在选择存储设备时,除了容量的考虑,磁盘性能是最重要的。

最小要求
SCSI, EIDE
推荐配置
SSD w/ SATA

20.2.4. 文件系统
为了 ACID 行为的正确完成,文件系统必须支持 flush(fsync,fdatasync)。

最小要求
ext3 (或者更简单的)
推荐配置
ext4, ZFS

20.2.5. 软件
Neo4j 是基于 Java 开发的。

Java
1.6+
操作系统
Linux, Windows XP, Mac OS X

20.2.6. JDK 版本
Neo4j 运行时一直都是使用下面的运行时库测试的:

• Oracle Java Runtime Environment JRE 1.6

20.3. 安装
Neo4j 可以被安装作为一个独立的服务器,以一个无头程序或者系统服务运行。对于 Java
开发者,也能将 Neo4j 作为一个库使用,嵌入到你的应用中。

关于安装 Neo4j 作为一个独立服务器的更多信息,请参考: 第 17.1 节 “服务器安装”.

下面的表格呈现了可以使用的版本以及它们使用的依赖管理工具名称。

提示

参考下面表格中的连接查看依赖的 Apache Maven, Apache Buildr, Apache Ivy 和


Groovy Grape 的配置细节。

表 20.2. Neo4j 各版本

Edition     Dependency                  Description                                                   License
Community   org.neo4j:neo4j             a high performance, fully ACID transactional graph database   GPLv3
Advanced    org.neo4j:neo4j-advanced    adding advanced monitoring                                    AGPLv3
Enterprise  org.neo4j:neo4j-enterprise  adding online backup and High Availability clustering         AGPLv3

注意

列出的依赖并不包含具体实现,但会以传递依赖的方式把它们引入。

关于许可的一般信息,请参考: Licensing Guide。

20.3.1. 嵌入模式安装

最新的发行版总是可以从 http://neo4j.org/download 下载,包括 Neo4j 下载包的各个部分。
在选择了适合你的平台的版本后,通过引入 jar 包到你的应用中来完成嵌入 Neo4j。 jar 文件可
以直接从下载包的目录 lib/ 获取,也可以从 Maven Central Repository [1] 获取,包括发行版和
里程碑版本。

要获取关于如何使用 Neo4j 作为 Maven 或者其他依赖管理工具的一个依赖,请参考下面的


表格:

注意

列出的依赖并不包含具体实现,但会以传递依赖的方式把它们引入。
Maven 依赖.

<project>
...
 <dependencies>
  <dependency>
   <groupId>org.neo4j</groupId>
   <artifactId>neo4j</artifactId>
   <version>${neo4j-version}</version>
  </dependency>
  ...
 </dependencies>
...
</project>
其中 ${neo4j-version} 是期望的版本,artifactId 是 neo4j, neo4j-advanced, neo4j-enterprise 三者之一。

20.3.2. 服务器安装
更多细节,请参考: 第 17 章 Neo4j 服务器 和 第 17.1 节 “服务器安装”。

[1]
http://repo1.maven.org/maven2/org/neo4j/

20.4. 升级

20.4.1. 自动升级
执行一次普通升级(对于数据库存储有少量变更):

1. 下载更新版本的 Neo4j。
2. 如果要升级的数据库在运行,先关闭。
3. 使用新版本的 Neo4j 启动数据库。

20.4.2. 显式升级
执行一次显式升级(适用于数据库存储有显著变更的情况):

1. 确保你要升级的数据库已经被明确关闭。
2. 在你的配置文件 neo4j.properties 或者嵌入配置参数中设置参数:
"allow_store_upgrade=true" 。
3. 启动数据库。
4. 当数据库被成功启动后,升级会自动发生。
5. "allow_store_upgrade=true" 配置参数应该被移除或者设置为 "false",或
者被注释掉。
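上面第 2 步对应的 neo4j.properties 片段大致如下(示意;升级成功后应按第 5 步移除或改回 false):

```properties
# 允许一次性的存储升级
allow_store_upgrade=true
```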

20.5. 使用数据收集器
Neo4j 使用数据收集器(UDC)是一个收集使用数据的子系统,它把报告提交到位于 udc.neo4j.org 的 UDC 服务器。它很容易关闭,并且不会收集任何机密数据。关于发送数据的详细细节,请参考下面的章节。

Neo4j 团队把这些信息作为来自 Neo4j 社区的一种自动化的、高效的反馈。我们希望通过把使用统计和下载统计进行对比,来验证我们所做的工作是有价值的,并观察每次发行后服务器软件的留存情况。

收集的数据清楚地统计在这。如果这个系统将来的任何版本收集额外的数据,我们将清楚
地宣布哪些变更。

Neo4j 团队非常尊重你的隐私。我们不会公布您的任何隐私信息。

20.5.1. 技术信息
为了收集关于 Neo4j 使用的更好的统计数据,UDC 收集下面这些信息:

• 内核版本:构建编号。

• 存储编号: 在一个数据库被创建时随机生成的一个全局唯一标识符。

• PING 的次数:UDC 内部维持一个 PING 的计数器,在内核重启后重置。

• 来源:有 "neo4j" or "maven"。如果你是从网站下载的 Neo4j,那么就是 "neo4j" ; 如


果你是使用 Maven 获取的 Neo4j,那么就是 "maven" 。
• Java 版本:使用的 Java 的版本号

• MAC 地址

• 注册编号:注册服务器实例的编号
• 关于执行环境上下文的一些标签(比如:test, language, web-container, app-container,
spring, ejb)。
• Neo4j 发行版本(community, advanced, enterprise)。

• 当前集群名称的一个 hash(如果存在集群名称的话)。

• Linux 的发行信息(rpm, dpkg, unknown)。

• 追踪 REST 客户端驱动的 User-Agent 头信息。

在启动后,UDC 在发送第一个 PING 之前会等待 10 分钟。这基于两方面原因:首先,我们不希望 UDC 降低启动速度;其次,我们希望把自动化测试触发的 PING 降到最少。
到 UDC 服务器的 PING 是通过一个 HTTP GET 方式完成的。

20.5.2. 如何关闭 UDC 功能


我们试图让大家能很容易地关闭 UDC。实际上,UDC 的代码并没有集成在内核里,而是一个完全独立的模块。

你可以通过下面两种方式关闭 UDC:

1. 最容易的方式是移除 neo4j-udc-*.jar 文件,这样内核就加载不了 UDC,从而不会自


动发送 PING 请求。
2. 如果你使用 Maven,并且希望 UDC 不再安装到你的系统,你可以如下配置:

<dependency>
 <groupId>org.neo4j</groupId>
 <artifactId>neo4j</artifactId>
 <version>${neo4j-version}</version>
 <exclusions>
  <exclusion>
   <groupId>org.neo4j</groupId>
   <artifactId>neo4j-udc</artifactId>
  </exclusion>
 </exclusions>
</dependency>
其中 ${neo4j-version} 是 Neo4j 的版本。
3. 最后,如果你正在使用 Neo4j 的某个封装版本而又不想改变任何 jar 文件,你可以通过系统属性 -Dneo4j.ext.udc.disable=true 来关闭 UDC。

第 21 章 配置和调优
为了获得 Neo4j 最佳性能,Neo4j 有一些可以调整的参数。可以配置的两个主要组件是 Neo4j
缓存和 Neo4j 运行的 JVM。下面的章节描述了如何调整它们。

21.1. 介绍
21.1.1. 如何增加配置设置
为了获得更好的性能,下面的事情是我们首先需要做的:

• 确保 JVM 没有把太多时间花在垃圾收集上。监视使用 Neo4j 的应用的堆使用情况可能有点令人困惑,因为内存充裕时 Neo4j 会增加缓存,内存紧张时会减少缓存。目标是有一个足够大的堆,确保重负载不会触发完整的垃圾收集(一旦触发,性能可能降低多达两个数量级)。
• 用 -server 标记和一个大小适当的堆启动 JVM(参考:第 21.6 节 “JVM 设置”)。太大的堆也会损害性能,因此可以反复尝试不同的堆大小。
• 使用 parallel/concurrent 垃圾收集器(我们发现 -XX:+UseConcMarkSweepGC 在许多情况下工作良好)。

21.1.1. 如何增加配置设置
当创建一个嵌入 Neo4j 实例时,可以传递包含 Key-Value 的 Map 作为参数。

Map<String, String> config = new HashMap<String, String>();
config.put( "neostore.nodestore.db.mapped_memory", "10M" );
config.put( "string_block_size", "60" );
config.put( "array_block_size", "300" );
GraphDatabaseService db = new ImpermanentGraphDatabase( config );
如果没有配置提供,数据库核心将试图通过 JVM 配置和操作系统探测适合的配置信息。

JVM 的配置是通过在启动 JVM 时传递命令行参数。对于 Neo4j 最重要的配置参数是控制


内存和垃圾收集,但对于一些实时编译使用到的参数也是非常有趣的。

比如我们要在一个 64 位系统,堆空间为 1G 的服务器启动你的应用的主类:

java -d64 -server -Xmx1024m -cp /path/to/neo4j-kernel.jar:/path/to/jta.jar:/path/to/your-application.jar com.example.yourapp.MainClass
看上面的范例,你也可以留意到最基本命令行参数之一:指定类路径。类路径是 JVM 搜
寻你的类的路径。它经常是一个 jar 文件列表。指定类路径通过标志 -cp (或者 -classpath)
完成后面跟类路径的值。对于 Neo4j 应用来说,至少应该包括 Neo4j neo4j-kernel.jar 和
Java 事务 API (jta.jar) 以及你自己的应用需要加载的类的路径。

提示

在 Linux,Unix 和 Mac OS X 上,在路径列表上面的每一个元素都被一个冒号符


号 (:)分隔,在 Windows 上面,则使用分号 (;)分隔。

当使用 Neo4j REST 服务器时,请参考 server-configuration 了解如何为服务器增加数据库配置。

21.2. 性能向导
21.2.1. 首先尝试
21.2.2. Neo4j 基础元素的生命周期
21.2.3. 配置 Neo4j
这是 Neo4j 性能优化向导。它将引导你如何使用 Neo4j 来达到最佳性能。

21.2.1. 首先尝试
首先需要做的事情就是确保 JVM 运行良好而没有浪费大量的时间来进行垃圾收集。监视
使用 Neo4j 的一个应用的堆使用可能有点混乱,因为当内存充裕时 Neo4j 会增加缓存,而相反
会减少缓存。目标就是有一个足够大的堆来确保一个重型加载不会导致调用 GC 收集(如果导
致 GC 收集,那么性能将降低高达两个数量级)。

使用 -server 标记和 -Xmx<适当的堆大小>(例如 -Xmx512M 表示 512MB 内存,-Xmx3G 表示 3GB 内存)来启动 JVM。太大的堆也会损害性能,因此可以反复尝试不同的堆大小。使用 parallel/concurrent 垃圾收集器(我们发现 -XX:+UseConcMarkSweepGC 在许多情况下工作良好)。

最后,确保操作系统还留有内存用于适当的文件系统缓存。这意味着,如果你的机器有 8G 内存,就不要把全部内存都分配给堆(除非你关闭了内存映射缓冲区),而要给系统留出大小合适的内存。要了解更多详情,请参考: 第 21 章 配置和调优。

对于 Linux 特有的调优,请参考: 第 21.10 节 “Linux 性能向导”。

21.2.2. Neo4j 基础元素的生命周期


Neo4j 根据你使用 Neo4j 的情况来管理它的基础元素(节点,关系和属性)。比如,如果你从不访问某个节点或关系的属性,它的属性就不会被加载到内存。当一个节点或关系被加载后,第一次访问它的任意一个属性时,它的所有属性都会被加载;但如果其中某个属性包含一个超过少量元素的数组,或者一个长字符串,这样的值只会在被单独请求时按需加载。类似地,一个节点的关系也只在第一次被访问时才会加载。

节点和关系使用 LRU 缓存。如果你(因为一些奇怪的原因)只需要使用节点工作,那关


系缓存会变得越来越小,而节点缓存会根据需要自动增长。使用大量关系和少量节点的应用会
导致关系数据占用缓存猛增而节点占用缓存会越来越小。

Neo4j API 规范并没有描述任何关于关系的顺序,所以调用

Node.getRelationships()
会与之前的调用相比以不同顺序返回关系。这允许我们做更多的优化来返回最需要遍历的
关系。

Neo4j 中的所有元素都被设计为根据实际使用情况自动适配。(无法完全达到的)总体目标是:处理任何传入操作时,都无需下沉到文件/磁盘 I/O 层。

21.2.3. 配置 Neo4j
在第 21 章 配置和调优章节有很多关于对 Neo4j 和 JVM 配置的信息。这些设置有很多对
性能的影响。

磁盘, 内存和其他要点
一如往常,和任何持久化方案一样,性能非常依赖于持久化存储设备。更好的磁盘就有更好的性能。

如果你有多个磁盘或者其他持久化介质可用,把存储文件和事务日志分布在这些磁盘上是个不错的主意。让存储文件放在低寻道时间的磁盘上,对非缓存的读操作会有非常好的效果。如今一块常规机械磁盘的平均寻道时间约为 5ms;如果可用内存很少,或者内存映射缓存设置不当,这会导致查询或遍历操作变得非常慢。一块新的较好的 SATA 接口 SSD 的平均寻道时间少于 100 微秒,这意味着比机械磁盘快 50 倍以上。

为了避免命中磁盘你需要更多的内存。在一个标准机械磁盘上你能用 1-2GB 的内存管理差


不多几千万的 Neo4j 基础元素。 4-8GB 的内存可以管理上亿的基础元素,而如果你要管理数十
亿的话,你需要 16-32GB 的样子。然而,如果你投资一块好的 SSD,你将可以处理更大的图数
据而需要更少的内存。
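举例来说,下面是一个假设的 neo4j.properties 片段,按各存储文件的大致大小来分配内存映射。键名可参考第 21.3 节的内核配置表;其中 relationshipstore 一项未出现在本章节选的表格中,属于本文的假设,具体数值也需要按你自己的数据量调整:

```properties
# 示意:内存映射大小按存储文件大小分配(数值仅为假设)
neostore.nodestore.db.mapped_memory=25M
neostore.relationshipstore.db.mapped_memory=50M
neostore.propertystore.db.mapped_memory=90M
neostore.propertystore.db.strings.mapped_memory=130M
neostore.propertystore.db.arrays.mapped_memory=130M
```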

Neo4j 更适合 Java 1.6 的 JVM。如果你还没有使用该版本,请考虑升级;至少应该使用 -server 标记以服务器模式运行。当你的应用运行时,使用 vmstat 等工具收集信息。如果 I/O 等待很高,而运行读写事务时却没有很多数据块进出磁盘,这就是一个信号,表明你需要调整 Java 堆参数、Neo4j 缓存以及内存映射设置(也许还需要更多内存或更好的磁盘)。

写操作性能
如果你在写入一些数据(刚开始很快,然后越来越慢)后经历过慢速的写性能,这可能是
操作系统从存储文件的内存映射区域写出来脏页造成的。这些区域不需要被写入来维护一致性
因此要实现最高性能的写操作,这类行为要避免。

另外,写操作越来越慢的原因还可能与事务的大小有关。许多小事务会导致大量写到磁盘的 I/O 操作,应该避免;太大的事务则会导致内存溢出错误,因为未提交的事务数据会一直保持在内存的 Java 堆里。关于 Neo4j 事务管理的细节,请参考:第 12 章 事务管理 。
Neo4j 内核使用一些存储文件和一个逻辑日志文件来把图数据存到磁盘。存储文件包含实际的图数据,日志文件记录写操作。所有的写操作都会被追加到日志文件中,当一个事务提交时,逻辑日志会被强制(fdatasync)同步到磁盘。然而存储文件不会被强制写入磁盘,而且对它的写入也不只是追加操作:它们会以一种或大或小的随机模式写入(取决于图数据库的布局),并且在日志发生翻转或者 Neo4j 内核关闭之前,写操作不会被强制同步到磁盘。增加逻辑日志的翻转目标大小是个不错的主意;如果你在使用日志翻转功能时遇到写性能问题,可以考虑关闭日志翻转。下面是一个范例,演示如何在运行时改变日志翻转设置:

GraphDatabaseService graphDb; // ...

// get the XaDataSource for the native store
TxModule txModule = ((EmbeddedGraphDatabase) graphDb).getConfig().getTxModule();
XaDataSourceManager xaDsMgr = txModule.getXaDataSourceManager();
XaDataSource xaDs = xaDsMgr.getXaDataSource( "nioneodb" );

// 关闭日志翻转
xaDs.setAutoRotate( false );

// 或者增加日志翻转目标尺寸为 100MB (默认:10MB)
xaDs.setLogicalLogTargetSize( 100 * 1024 * 1024L );
由于对存储文件内存映射区域的随机写总会发生,重要的是:如果没有必要,不要让这些数据被写出到磁盘。一些操作系统在把脏页写出到磁盘方面有非常激进的策略。如果操作系统决定开始写出这些内存映射区域的脏页,写磁盘操作就会从顺序写变成随机写,这会大大降低性能。因此,要确保使用 Neo4j 时的最大写性能,必须确保操作系统不会因为写入存储文件内存映射区域而写出任何脏页。举个例子,如果机器有 8G 内存而存储文件一共 4G(可以完全内存映射),操作系统必须被配置为允许虚拟内存中至少 50% 的脏页,以确保不会出现随机的磁盘写操作。

Note: 关于更多的规则信息,请参考: 第 21.10 节 “Linux 性能向导” 。

二级缓存
一般在构建应用并假设“图数据总在内存里”时,有时仍有必要优化某些关键区域的性能。与纯内存数据结构相比,即使节点、关系或属性已经在缓存中,Neo4j 也会增加一点小的开销。如果这成为一个问题,请使用性能分析器找出热点,然后增加你自己的二级缓存。我们认为应该尽可能避免二级缓存,因为它会强迫你小心处理缓存失效,而这有时非常困难。但当其他办法都失败时,你只能使用它,因此这里有一个范例演示如何实现。

我们有一个 POJO,封装了一个节点并把它作为内部状态。在这个特殊的 POJO 中,我们重载了 equals 实现。
public boolean equals( Object obj )
{
    return underlyingNode.getProperty( "some_property" ).equals( obj );
}

public int hashCode()
{
    return underlyingNode.getProperty( "some_property" ).hashCode();
}
这在许多场景下都运行良好,但在这个特殊场景中,该 POJO 的许多实例被反复用于集合类的添加/删除/获取/查找操作。用性能分析器探测这个应用会发现 equals 实现被反复调用,可以被视为一个热点。在这个特殊场景下,为 equals 重载增加缓存将提升性能。

private Object cachedProperty = null;

public boolean equals( Object obj )
{
    if ( cachedProperty == null )
    {
        cachedProperty = underlyingNode.getProperty( "some_property" );
    }
    return cachedProperty.equals( obj );
}

public int hashCode()
{
    if ( cachedProperty == null )
    {
        cachedProperty = underlyingNode.getProperty( "some_property" );
    }
    return cachedProperty.hashCode();
}
现在的问题是,无论 some_property 何时发生改变,我们都需要让缓存的属性失效(在这个场景中这可能不是问题,因为用于 equals 和 hashCode 计算的状态通常不会改变)。
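上面的失效问题也可以用纯 Python 演示同样的思路(示意代码:FakeNode 并非 Neo4j API,仅用于统计底层属性被读取的次数):

```python
# 示意:二级缓存 + 显式失效。FakeNode 模拟一个带属性的节点,
# 并统计属性被真正读取的次数。
class FakeNode:
    def __init__(self, props):
        self.props = dict(props)
        self.reads = 0
    def get_property(self, key):
        self.reads += 1
        return self.props[key]

class CachedWrapper:
    def __init__(self, node):
        self.node = node
        self._cached = None
    def _prop(self):
        if self._cached is None:           # 只有第一次访问才真正读取属性
            self._cached = self.node.get_property('some_property')
        return self._cached
    def invalidate(self):
        self._cached = None                # some_property 改变时必须调用
    def __eq__(self, other):
        return self._prop() == other
    def __hash__(self):
        return hash(self._prop())

node = FakeNode({'some_property': 'neo'})
w = CachedWrapper(node)
assert w == 'neo' and hash(w) == hash('neo')
print(node.reads)   # 1 —— 两次使用只读取了一次底层属性
```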

提示

总结,尽可能的回避使用二级缓存除非你真的需要
它。

21.3. 内核配置
这些是你可能传递给 Neo4j 内核的配置选项。如果你使用嵌入数据库,你可以以一个 map
类型传递,又或者在 Neo4j 服务器中在 neo4j.properties 文件中配置。

表 21.1. Allow store upgrade

Default value: false

allow_store_upgrade

Whether to allow a store upgrade in case the current version of the database starts against an older store version. Setting this to true does not guarantee successful upgrade, just that it allows an attempt at it.

表 21.2. Array block size

array_block_size
Specifies the block size for storing arrays. This parameter is only honored when the store is created, otherwise it is ignored. The default block size is 120 bytes, and the overhead of each block is the same as for string blocks, i.e., 8 bytes.
Default value: 120 (min: 1)

表 21.3. Backup slave

Default value: false

backup_slave

Mark this database as a backup


slave.

表 21.4. Cache type

cache_type
The type of cache to use for nodes and relationships.
Default value: soft

Value    Description
soft     Provides optimal utilization of the available memory. Suitable for high performance traversal. May run into GC issues under high load if the frequently accessed parts of the graph does not fit in the cache.
weak     Use weak reference cache.
strong   Use strong references.
none     Don't use caching.

表 21.5. Cypher parser version

cypher_parser_version
Enable this to specify a parser other than the default one.

Value  Description
1.5    Cypher v1.5 syntax.
1.6    Cypher v1.6 syntax.
1.7    Cypher v1.7 syntax.

表 21.6. Dump configuration

Default value: false

dump_configuration

Print out the effective Neo4j configuration after


startup.

表 21.7. Forced kernel id

Default value:
forced_kernel_id

An identifier that uniquely identifies this graph database instance within this JVM. Defaults
to an auto-generated number depending on how many instance are started in this JVM.

表 21.8. Gc monitor threshold

Default value: 200ms

gc_monitor_threshold

The amount of time in ms the monitor thread has to be blocked before logging a message it
was blocked.

表 21.9. Gc monitor wait time

Default value: 100ms

gc_monitor_wait_time

Amount of time in ms the GC monitor thread will wait before taking another
measurement.

表 21.10. Gcr cache min log interval

Default value: 60s

gcr_cache_min_log_interval

The minimal time that must pass in between logging statistics from the cache (when using
the 'gcr' cache).

表 21.11. Grab file lock

Default value: true

grab_file_lock

Whether to grab locks on files


or not.

表 21.12. Intercept committing transactions

Default value: false

intercept_committing_transactions

Determines whether any TransactionInterceptors loaded will intercept prepared transactions


before they reach the logical log.
Table 21.13. Intercept deserialized transactions

intercept_deserialized_transactions
Default value: false
Determines whether any TransactionInterceptors loaded will intercept externally received transactions (e.g. in HA) before they reach the logical log and are applied to the store.

Table 21.14. Keep logical logs

keep_logical_logs
Default value: true
Makes Neo4j keep the logical transaction logs so that the database can be backed up. Can also be used for specifying the threshold to prune logical logs after. For example "10 days" will prune logical logs that only contain transactions older than 10 days from the current time, while "100k txs" will keep the 100k latest transactions and prune any older transactions.

Table 21.15. Logging.threshold for rotation

logging.threshold_for_rotation
Default value: 104857600 (min: 1)
Threshold in bytes for when database logs (text logs, for debugging, that is) are rotated.

Table 21.16. Logical log

logical_log
Default value: nioneo_logical.log
The base name for the logical log files, either an absolute path or relative to the store_dir setting. This should generally not be changed.

Table 21.17. Lucene searcher cache size

lucene_searcher_cache_size
Default value: 2147483647 (min: 1)
Integer value that sets the maximum number of open lucene index searchers.

Table 21.18. Neo store

neo_store
Default value: neostore
The base name for the Neo4j Store files, either an absolute path or relative to the store_dir setting. This should generally not be changed.

Table 21.19. Neostore.nodestore.db.mapped memory

neostore.nodestore.db.mapped_memory
Default value: 20M
The size to allocate for memory mapping the node store.

Table 21.20. Neostore.propertystore.db.arrays.mapped memory

neostore.propertystore.db.arrays.mapped_memory
Default value: 130M
The size to allocate for memory mapping the array property store.

Table 21.21. Neostore.propertystore.db.index.keys.mapped memory

neostore.propertystore.db.index.keys.mapped_memory
Default value: 1M
The size to allocate for memory mapping the store for property key strings.

Table 21.22. Neostore.propertystore.db.index.mapped memory

neostore.propertystore.db.index.mapped_memory
Default value: 1M
The size to allocate for memory mapping the store for property key indexes.
Table 21.23. Neostore.propertystore.db.mapped memory

neostore.propertystore.db.mapped_memory
Default value: 90M
The size to allocate for memory mapping the property value store.

Table 21.24. Neostore.propertystore.db.strings.mapped memory

neostore.propertystore.db.strings.mapped_memory
Default value: 130M
The size to allocate for memory mapping the string property store.

Table 21.25. Neostore.relationshipstore.db.mapped memory

neostore.relationshipstore.db.mapped_memory
Default value: 100M
The size to allocate for memory mapping the relationship store.

Table 21.26. Node auto indexing

node_auto_indexing
Default value: false
Controls the auto indexing feature for nodes. Setting it to false shuts it down unconditionally, while true enables it for every property, subject to restrictions in the configuration.

Table 21.27. Node cache array fraction

node_cache_array_fraction
Default value: 1.0 (min: 1.0, max: 10.0)
The fraction of the heap (1%-10%) to use for the base array in the node cache (when using the 'gcr' cache).

Table 21.28. Node cache size

node_cache_size
The amount of memory to use for the node cache (when using the 'gcr' cache).

Table 21.29. Node keys indexable

node_keys_indexable
A list of property names (comma separated) that will be indexed by default. This applies to nodes only.

Table 21.30. Read only database

read_only
Default value: false
Only allow read operations from this Neo4j instance.

Table 21.31. Rebuild idgenerators fast

rebuild_idgenerators_fast
Default value: true
Use a quick approach for rebuilding the ID generators. This gives quicker recovery time, but will limit the ability to reuse the space of deleted entities.

Table 21.32. Relationship auto indexing

relationship_auto_indexing
Default value: false
Controls the auto indexing feature for relationships. Setting it to false shuts it down unconditionally, while true enables it for every property, subject to restrictions in the configuration.

Table 21.33. Relationship cache array fraction

relationship_cache_array_fraction
Default value: 1.0 (min: 1.0, max: 10.0)
The fraction of the heap (1%-10%) to use for the base array in the relationship cache (when using the 'gcr' cache).

Table 21.34. Relationship cache size

relationship_cache_size
The amount of memory to use for the relationship cache (when using the 'gcr' cache).

Table 21.35. Relationship keys indexable

relationship_keys_indexable
A list of property names (comma separated) that will be indexed by default. This applies to relationships only.

Table 21.36. Remote logging enabled

remote_logging_enabled
Default value: false
Whether to enable logging to a remote server or not.

Table 21.37. Remote logging host

remote_logging_host
Default value: 127.0.0.1
Host for remote logging using LogBack SocketAppender.

Table 21.38. Remote logging port

remote_logging_port
Default value: 4560 (min: 1, max: 65535)
Port for remote logging using LogBack SocketAppender.

Table 21.39. Store dir

store_dir
The directory where the database files are located.

Table 21.40. String block size

string_block_size
Default value: 120 (min: 1)
Specifies the block size for storing strings. This parameter is only honored when the store is created, otherwise it is ignored. Note that each character in a string occupies two bytes, meaning that a block size of 120 (the default size) will hold a 60 character long string before overflowing into a second block. Also note that each block carries an overhead of 8 bytes. This means that if the block size is 120, the size of the stored records will be 128 bytes.
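As a worked example of the block arithmetic described above (two bytes per character, 8 bytes of per-block overhead), a small sketch; the helper function is illustrative and not part of Neo4j:

```python
# Sketch: how many dynamic-store blocks a string property needs, given
# string_block_size (default 120 bytes, 2 bytes per character, plus
# 8 bytes of per-block overhead on disk).
def string_blocks(num_chars, block_size=120):
    data_bytes = num_chars * 2
    blocks = -(-data_bytes // block_size)  # ceiling division
    on_disk = blocks * (block_size + 8)    # each block carries 8 B overhead
    return blocks, on_disk

print(string_blocks(60))  # (1, 128): fits one 120 B block, 128 B stored
print(string_blocks(61))  # (2, 256): overflows into a second block
```

This reproduces the numbers in the description: a 60 character string fills exactly one default-sized block, stored as a 128 byte record.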

Table 21.41. Tx manager impl

tx_manager_impl
The name of the Transaction Manager service to use as defined in the TM service provider constructor; defaults to native.

Table 21.42. Use memory mapped buffers

use_memory_mapped_buffers
Tell Neo4j to use memory mapped buffers for accessing the native storage layer.
21.4. Caches in Neo4j
21.4.1. File buffer cache
21.4.2. Object cache
For how to provide custom configuration to Neo4j, see Section 21.1 "Introduction".

Neo4j uses two different types of caches: a file buffer cache and an object cache. The file buffer cache caches the storage file data in the same format as it is stored on the durable storage media. The object cache caches nodes, relationships and properties in a format that is optimized for high traversal speed.

21.4.1. File buffer cache
Quick info:
* The file buffer cache is sometimes called low level cache or file system cache.
* It caches the Neo4j data as stored on the durable media.
* It uses the operating system memory mapping features when available.
* Neo4j will configure the cache automatically as long as the heap size of the JVM is configured properly.

The file buffer cache caches the Neo4j data in the same format as it is represented on the durable storage media. The purpose of this cache layer is to improve both read and write performance. It improves write performance by writing to the cache and deferring durable writes until the logical log is rotated. This behavior is safe because all transactions are always durably written to the logical log at commit time, and the log can be used to recover the store files.

Since the operation of the cache is tightly related to the data it stores, a short description of the Neo4j durable representation format is necessary background. Neo4j stores data in multiple files and relies on the underlying file system to handle this efficiently. Each Neo4j storage file contains uniform fixed size records of a particular type:

Store file    Record size   Contents
nodestore     9 B           Nodes
relstore      33 B          Relationships
propstore     41 B          Properties for nodes and relationships
stringstore   128 B         Values of string properties
arraystore    128 B         Values of array properties

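The record sizes above make back-of-the-envelope capacity planning straightforward. A sketch (the helper function and the example counts are illustrative, not part of Neo4j):

```python
# Sketch: rough store-file size estimate from the fixed record sizes
# in the table above (9 B per node, 33 B per relationship, 41 B per
# property record).
RECORD_SIZES = {"nodestore": 9, "relstore": 33, "propstore": 41}  # bytes per record

def store_sizes(nodes, relationships, properties):
    """Return estimated sizes in bytes for the three fixed-record stores."""
    return {
        "nodestore": nodes * RECORD_SIZES["nodestore"],
        "relstore": relationships * RECORD_SIZES["relstore"],
        "propstore": properties * RECORD_SIZES["propstore"],
    }

sizes = store_sizes(1_000_000, 5_000_000, 10_000_000)
print(sizes["relstore"])  # 165000000 bytes for 5 million relationships
```

String and array properties add dynamic-store blocks on top of these fixed records, so treat the result as a lower bound.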
For strings and arrays, whose data can be of variable length, the data is stored in one or more 120 B blocks, with 8 B of management overhead per block. These block sizes can be configured when the store is created, using the string_block_size and array_block_size parameters. The size of each record type can also be used to calculate the storage requirements of a Neo4j graph database, or the appropriate cache size for each file buffer cache. Note that some strings and arrays can be stored without using the dynamic string or array stores at all; see Section 21.7 "Compressed storage of short strings" and Section 21.8 "Compressed storage of short arrays".

Neo4j uses multiple file buffer caches, one for each different storage file. Each file buffer cache divides its storage file into a sequence of equally sized windows. Each cache window contains an even number of storage records. The cache holds the most active windows in memory and tracks hit versus miss ratio for the windows. When the hit ratio of an uncached window becomes higher than the miss ratio of a cached window, the cached window is evicted and the previously uncached window is cached instead.

Important

Note that the block sizes can only be configured at store creation time.

Configuration

use_memory_mapped_buffers
Possible values: true or false.
Effect: If set to true, Neo4j will use the operating system's memory mapping functionality for the file buffer cache windows. If set to false, Neo4j will use its own buffer implementation. In that case the buffers will reside in the JVM heap, which needs to be increased accordingly. The default value for this parameter is true, except on Windows.

neostore.nodestore.db.mapped_memory
Effect: The maximum amount of memory to use for the file buffer cache of the node storage file.

neostore.relationshipstore.db.mapped_memory
Effect: The maximum amount of memory to use for the file buffer cache of the relationship store file.

neostore.propertystore.db.index.keys.mapped_memory
Effect: The maximum amount of memory to use for the file buffer cache of the property key strings file.

neostore.propertystore.db.index.mapped_memory
Effect: The maximum amount of memory to use for the file buffer cache of the property key indexes file.

neostore.propertystore.db.mapped_memory
Effect: The maximum amount of memory to use for the file buffer cache of the property storage file.

neostore.propertystore.db.strings.mapped_memory
Effect: The maximum amount of memory to use for the file buffer cache of the string property storage file.

neostore.propertystore.db.arrays.mapped_memory
Effect: The maximum amount of memory to use for the file buffer cache of the array property storage file.

For all of the mapped_memory parameters above, the value is the maximum amount of memory to use for memory mapped buffers for that file buffer cache. The default unit is MiB; for other units use any of the following suffixes: B, k, M or G.

string_block_size
Possible values: the number of bytes per block.
Effect: Specifies the block size for storing strings. This parameter is only honored when the store is created, otherwise it is ignored. Note that each character in a string occupies two bytes, meaning that a block size of 120 (the default size) will hold a 60 character long string before overflowing into a second block. Also note that each block carries an overhead of 8 bytes. This means that if the block size is 120, the size of the stored records will be 128 bytes.

array_block_size
Possible values: the number of bytes per block.
Effect: Specifies the block size for storing arrays. This parameter is only honored when the store is created, otherwise it is ignored. The default block size is 120 bytes, and the overhead of each block is the same as for string blocks, i.e., 8 bytes.

dump_configuration
Possible values: true or false.
Effect: If set to true, the current configuration settings will be written to the default system output, mostly the console or the logfiles.
When memory mapped buffers are used (use_memory_mapped_buffers = true), the heap size of the JVM must be smaller than the total available memory of the computer, minus the memory used for the file buffer caches. When heap buffers are used (use_memory_mapped_buffers = false), the heap size of the JVM must be large enough to contain all of the buffers, plus the runtime heap memory requirements of the application and the object cache.

Neo4j reads the configuration parameters at startup and automatically configures those that are not specified. The cache sizes are configured based on the amount of memory available on the computer, how much of it the JVM heap will use, and how large the storage files are.

21.4.2. Object cache
Quick info:
* The object cache is sometimes called high level cache.
* It caches the Neo4j data in a format optimized for fast traversal.

The object cache caches nodes and relationships and their properties in a format that is optimized for fast traversal of the graph. There are two different categories of object cache in Neo4j.

The first is the reference cache. Here Neo4j will use as much as it can of the memory allocated to the JVM for caching objects, relying on garbage collection to evict objects from the cache in an LRU manner. Note however that Neo4j is "competing" for the heap space with other objects in the same JVM, for example an application using Neo4j in embedded mode, and Neo4j will let the application "win" by using less memory when the application needs more.

Note

The GC resistant cache described below is only available in the Neo4j Enterprise Edition.

The other is the GC resistant cache, which gets a fixed amount of space in the JVM heap and purges objects whenever it grows bigger than that. It is assigned a maximum amount of memory which the total size of the cached objects will not exceed. When the maximum size is reached, objects are evicted without relying on garbage collection decisions. This keeps the competition for heap space with other objects in check and makes GC pauses better controlled, since the cache has a maximum heap usage. Compared to the reference cache, the GC resistant cache has a smaller overhead and faster insert/lookup.

Tip

For the Java garbage collector, heap memory usage is only one aspect; depending on the cache type, releasing cached objects may require a larger heap. Assigning a very large heap to Neo4j is therefore not always the best strategy, since it may lead to long GC pauses. Instead, leave some memory for Neo4j's file buffer cache. That memory is outside the heap and under the direct control of the kernel, and is thus managed more efficiently.

The content of this cache consists of objects with a representation geared towards supporting the Neo4j API and graph traversals. Reading from this cache is 5 to 10 times faster than reading from the file buffer cache. The cache lives on the JVM heap and its size adapts to the amount of heap memory currently available.

Nodes and relationships are added to the object cache as soon as they are accessed. The cached objects are however populated lazily. The properties of a node or relationship are not loaded until they are accessed. String (and array) properties are not loaded until that particular property is accessed. The relationships of a particular node are likewise not loaded until they are accessed.
Configuration
The main configuration parameter for the object cache is the cache_type parameter. It specifies which cache implementation to use for the object cache. There are two cache instances, one for nodes and one for relationships. The available cache types are:

cache_type   Description
none         Do not use a high level cache. No objects will be cached.
soft         Provides optimal utilization of the available memory. Suitable for high performance traversal. May run into GC issues under high load if the frequently accessed parts of the graph do not fit in the cache. This is the default cache implementation.
weak         Provides short life span for cached objects. Suitable for high throughput applications where a larger portion of the graph than what can fit into memory is frequently accessed.
strong       This cache will hold on to all data that gets loaded and never release it again. Provides good performance if your graph is small enough to fit in memory.
gcr          Provides means of assigning a specific amount of memory to dedicate to caching loaded nodes and relationships. Small footprint and fast insert/lookup. Should be the best option for most scenarios. See below on how to configure it. Note that this option is only available in the Neo4j Enterprise Edition.

GC resistant cache configuration
Since the GC resistant cache operates with a maximum size within the JVM, it can be configured per use case to optimize performance. There are two aspects of the cache size.

One is the size of the array holding the indexes of the objects placed in the cache. It is specified as a fraction of the heap; for example, specifying 5 will let that array itself take up 5% of the entire heap. Increasing this figure (up to a maximum of 10) reduces the chance of hash collisions, at the cost of dedicating more heap to the array. More collisions mean more redundant loading of objects from the low level cache.

configuration option                 Description (what it controls)                                                                    Example value
node_cache_array_fraction            Fraction of the heap to dedicate to the array holding the nodes in the cache (max 10).            7
relationship_cache_array_fraction    Fraction of the heap to dedicate to the array holding the relationships in the cache (max 10).    5
The other aspect is the maximum size of all the objects in the cache. It is specified in bytes, for example 500M or 2G. Right before the maximum size is reached, a purge is performed, in which random objects are evicted from the cache until its size drops below 90% of the maximum. The optimal setting for the maximum size depends on the size of your graph. It should leave enough room for other objects to coexist in the same JVM, but at the same time be large enough to keep loading from the low level cache to a minimum. The predicted load on the JVM, as well as the layout of the domain level objects, should also be taken into consideration.

configuration option        Description (what it controls)                                              Example value
node_cache_size             Maximum size of the heap memory to dedicate to the cached nodes.            2G
relationship_cache_size     Maximum size of the heap memory to dedicate to the cached relationships.    800M

You can read about references and configuration settings for the Sun HotSpot JVM at the following locations:

• Understanding soft/weak references

• How Hotspot Decides to Clear SoftReferences

• HotSpot FAQ

Heap memory usage
The following table can be used to calculate how much memory the data in the object cache will occupy on a 64-bit JVM:

Object          Size    Comment
Node            344 B   Size for each node (not counting its relationships or properties).
                48 B    Object overhead.
                136 B   Property storage (ArrayMap 48B, HashMap 88B).
                136 B   Relationship storage (ArrayMap 48B, HashMap 88B).
                24 B    Location of first / next set of relationships.
Relationship    208 B   Size for each relationship (not counting its properties).
                48 B    Object overhead.
                136 B   Property storage (ArrayMap 48B, HashMap 88B).
Property        116 B   Size for each property of a node or relationship.
                32 B    Data element, which allows for transactional modification and keeps track of the on disk location.
                48 B    Entry in the hash table where it is stored.
                12 B    Space used in hash table, accounts for normal fill ratio.
                24 B    Property key index.
Relationships   108 B   Size for each relationship type for a node that has a relationship of that type.
                48 B    Collection of the relationships of this type.
                48 B    Entry in the hash table where it is stored.
                12 B    Space used in hash table, accounts for normal fill ratio.
Relationships   8 B     Space used by each relationship related to a particular node (both incoming and outgoing).
Primitive       24 B    Size of a primitive property value.
String          64+ B   Size of a string property value: 64 + 2*len(string) B (64 bytes, plus two bytes for each character in the string).
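To make the table concrete, here is a small sketch that applies the per-object sizes above to a hypothetical workload (the helper function and the workload numbers are illustrative, not part of Neo4j):

```python
# Sketch: rough object-cache heap estimate on a 64-bit JVM, using the
# per-object sizes from the table above (node 344 B, relationship 208 B,
# property 116 B, plus 8 B per relationship reference per end node).
NODE_B, REL_B, PROP_B, REL_REF_B = 344, 208, 116, 8

def object_cache_bytes(nodes, rels, props_per_entity=0):
    size = nodes * NODE_B + rels * REL_B
    size += (nodes + rels) * props_per_entity * PROP_B
    size += rels * 2 * REL_REF_B  # each relationship is referenced from both end nodes
    return size

# 1M nodes and 5M relationships with 2 properties each:
print(object_cache_bytes(1_000_000, 5_000_000, 2) / 1024**3)  # roughly 2.7 GiB
```

This ignores the per-relationship-type overhead (108 B entries) and string payloads, so real usage will be somewhat higher.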

21.5. Logical logs
Logical logs in Neo4j are the journal of which operations have happened, and are the source of truth in scenarios where the database needs to be recovered after a crash or similar. Logs are rotated every now and then (by default when they surpass 25 Mb in size) and the amount of legacy logs to keep can be configured. The purpose of keeping a history of logical logs includes being able to serve incremental backups as well as keeping an HA cluster running. Regardless of configuration, at least the latest non-empty logical log will be kept.

For any given configuration at least the latest non-empty logical log will be kept, but
configuration can be supplied to control how much more to keep. There are several different means of
controlling it and the format in which configuration is supplied is:

keep_logical_logs=<true/false>
keep_logical_logs=<amount> <type>
For example:

# Will keep logical logs indefinitely
keep_logical_logs=true

# Will keep only the most recent non-empty log
keep_logical_logs=false

# Will keep logical logs which contains any transaction committed within 30 days
keep_logical_logs=30 days

# Will keep logical logs which contains any of the most recent 500 000 transactions
keep_logical_logs=500k txs
Full list:

Type    Description                                                                            Example
files   Number of most recent logical log files to keep                                        "10 files"
size    Max disk size to allow log files to occupy                                             "300M size" or "1G size"
txs     Number of latest transactions to keep                                                  "250k txs" or "5M txs"
hours   Keep logs which contains any transaction committed within N hours from current time    "10 hours"
days    Keep logs which contains any transaction committed within N days from current time     "50 days"

21.6. JVM settings
21.6.1. Configuring heap size and GC
There are two main memory parameters for the JVM: one controls the heap space and the other controls the stack space. The heap space parameter is the most important one for Neo4j, since it governs how many objects you can allocate. The stack space parameter governs how deep the call stack of your application is allowed to get.

When it comes to heap space the general rule is: the larger heap space you have the better, but
make sure the heap fits in the RAM memory of the computer. If the heap is paged out to disk
performance will degrade rapidly. Having a heap that is much larger than what your application needs
is not good either, since this means that the JVM will accumulate a lot of dead objects before the
garbage collector is executed, this leads to long garbage collection pauses and undesired performance
behavior.

Having a larger heap space will mean that Neo4j can handle larger transactions and more
concurrent transactions. A large heap space will also make Neo4j run faster since it means Neo4j can
fit a larger portion of the graph in its caches, meaning that the nodes and relationships your
application uses frequently are always available quickly. The default heap size for a 32bit JVM is
64MB (and 30% larger for 64bit), which is too small for most real applications.

Neo4j works fine with the default stack space configuration, but if your application implements some recursive behavior it is a good idea to increase the stack size. Note that the stack size setting applies to all threads, so if your application is running a lot of concurrent threads it is a good idea to increase the stack size.

• The heap size is set by specifying the -Xmx???m parameter to hotspot, where ??? is
the heap size in megabytes. Default heap size is 64MB for 32bit JVMs, 30% larger (appr. 83MB)
for 64bit JVMs.
• The stack size is set by specifying the -Xss???m parameter to hotspot, where ??? is
the stack size in megabytes. Default stack size is 512kB for 32bit JVMs on Solaris, 320kB for
32bit JVMs on Linux (and Windows), and 1024kB for 64bit JVMs.
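When running the Neo4j server, these flags are typically supplied through conf/neo4j-wrapper.conf rather than on the command line. A sketch, under the assumption that the standard Tanuki service wrapper keys are in use (the sizes are illustrative, not recommendations):

```properties
# Heap size in MiB (corresponds to -Xms/-Xmx):
wrapper.java.initmemory=2048
wrapper.java.maxmemory=2048
# Additional JVM flags, e.g. a larger thread stack:
wrapper.java.additional.1=-Xss2m
```

For embedded use, pass the same -Xmx/-Xss flags directly to the java command that launches your application.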

Most modern CPUs implement a Non-Uniform Memory Access (NUMA) architecture, where different parts of the memory have different access speeds. Sun's Hotspot JVM is able to allocate objects with awareness of the NUMA structure as of version 1.6.0 update 18. When enabled, this can give up to 40% performance improvements. To enable NUMA awareness, specify the -XX:+UseNUMA parameter. It works only when using the Parallel Scavenger garbage collector (the default, or -XX:+UseParallelGC), not the concurrent mark and sweep one.

Properly configuring memory utilization of the JVM is crucial for optimal performance. As an
example, a poorly configured JVM could spend all CPU time performing garbage collection
(blocking all threads from performing any work). Requirements such as latency, total throughput and
available hardware have to be considered to find the right setup. In production, Neo4j should run on a
multi core/CPU platform with the JVM in server mode.

21.6.1. Configuring heap size and GC


A large heap allows for larger node and relationship caches — which is a good thing — but
large heaps can also lead to latency problems caused by full garbage collection. The different high
level cache implementations available in Neo4j together with a suitable JVM configuration of heap
size and garbage collection (GC) should be able to handle most workloads.

The default cache (soft reference based LRU cache) works best with a heap that never gets full: a
graph where the most used nodes and relationships can be cached. If the heap gets too full there is a
risk that a full GC will be triggered; the larger the heap, the longer it can take to determine what soft
references should be cleared.

Using the strong reference cache means that all the nodes and relationships being used must fit in the available heap. Otherwise there is a risk of getting out-of-memory exceptions. The soft reference and strong reference caches are well suited for applications where the overall throughput is important.

The weak reference cache basically needs enough heap to handle the peak load of the application: peak load multiplied by the average memory required per request. It is well suited for low latency requirements where GC interruptions are not acceptable.

Important

When running Neo4j on Windows, keep in mind that the memory mapped buffers are
allocated on heap by default, so they need to be taken into account when determining heap
size.

Table 21.43. Guidelines for heap size

Number of primitives   RAM size     Heap configuration   Reserved RAM for the OS
10M                    2GB          512MB                the rest
100M                   8GB+         1-4GB                1-2GB
1B+                    16GB-32GB+   4GB+                 1-2GB

Tip

The recommended garbage collector to use when running Neo4j in production is the
Concurrent Mark and Sweep Compactor turned on by supplying -XX:+UseConcMarkSweepGC
as a JVM parameter.

Once the heap size is well configured, the next step in tuning the garbage collector for your application is to specify the sizes of the different generations of the heap. The default settings are well tuned for "normal" applications, and work quite well for most applications, but if you have an application with either a really high allocation rate, or a lot of long lived objects, you might want to consider tuning the sizes of the heap generations. The ratio between
the young and tenured generation of the heap is specified by using the -XX:NewRatio=# command
line option (where # is replaced by a number). The default ratio is 1:12 for client mode JVM, and 1:8
for server mode JVM. You can also specify the size of the young generation explicitly using the
-Xmn command line option, which works just like the -Xmx option that specifies the total heap space.

GC shortname          Generation   Command line parameter    Comment
Copy                  Young        -XX:+UseSerialGC          The Copying collector
MarkSweepCompact      Tenured      -XX:+UseSerialGC          The Mark and Sweep Compactor
ConcurrentMarkSweep   Tenured      -XX:+UseConcMarkSweepGC   The Concurrent Mark and Sweep Compactor
ParNew                Young        -XX:+UseParNewGC          The parallel Young Generation Collector; can only be used with the Concurrent Mark and Sweep Compactor.
PS Scavenge           Young        -XX:+UseParallelGC        The parallel object scavenger
PS MarkSweep          Tenured      -XX:+UseParallelGC        The parallel mark and sweep collector

These are the default configurations on some platforms according to our non-exhaustive research:

JVM                                               -d32 -client                     -d32 -server                   -d64 -client                     -d64 -server
Mac OS X Snow Leopard, 64-bit, Hotspot 1.6.0_17   ParNew and ConcurrentMarkSweep   PS Scavenge and PS MarkSweep   ParNew and ConcurrentMarkSweep   PS Scavenge and PS MarkSweep
Ubuntu, 32-bit, Hotspot 1.6.0_16                  Copy and MarkSweepCompact        Copy and MarkSweepCompact      N/A                              N/A

21.7. Compressed storage of short strings
Neo4j will try to classify your strings into a short string class, and if it succeeds it will treat the string accordingly. In that case the string is stored without indirection in the property store, inlined in the property record itself, meaning that the dynamic string store is not involved in storing that value. This reduces the disk footprint. Additionally, when no separate string record is needed to store the property, it can be read and written in a single lookup, leading to improved performance.

The various classes for short strings are:

• Numerical, consisting of digits 0..9 and the punctuation space, period, dash, plus,
comma and apostrophe.
• Date, consisting of digits 0..9 and the punctuation space dash, colon, slash, plus and
comma.
• Uppercase, consisting of uppercase letters A..Z, and the punctuation space, underscore,
period, dash, colon and slash.
• Lowercase, like upper but with lowercase letters a..z instead of uppercase

• E-mail, consisting of lowercase letters a..z and the punctuation comma, underscore,
period, dash, plus and the at sign (@).
• URI, consisting of lowercase letters a..z, digits 0..9 and most punctuation available.

• Alphanumerical, consisting of both upper and lowercase letters a..zA..Z, digits 0..9 and punctuation space and underscore.
• Alphasymbolical, consisting of both upper and lowercase letters a..zA..Z and the
punctuation space, underscore, period, dash, colon, slash, plus, comma, apostrophe, at sign, pipe
and semicolon.
• European, consisting of most accented european characters and digits plus punctuation
space, dash, underscore and period — like latin1 but with less punctuation.
• Latin 1.

• UTF-8.

In addition to the string’s contents, the number of characters also determines if the string can be
inlined or not. Each class has its own character count limits, which are

Table 21.44. Character count limits

String class                              Character count limit
Numerical and Date                        54
Uppercase, Lowercase and E-mail           43
URI, Alphanumerical and Alphasymbolical   36
European                                  31
Latin1                                    27
UTF-8                                     14

That means that the largest inline-able string is 54 characters long and must be of the Numerical
class and also that all Strings of size 14 or less will always be inlined.
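As an illustration of how such a classification check might look for a single class (a sketch only; the real classifier lives inside Neo4j and handles all of the classes and encodings):

```python
# Sketch: can a string be inlined as a "Numerical" short string?
# Per the class list above: digits 0..9 plus space, period, dash,
# plus, comma and apostrophe, with a 54 character limit.
NUMERICAL = set("0123456789 .-+,'")

def inlinable_numerical(s, limit=54):
    return len(s) <= limit and all(c in NUMERICAL for c in s)

print(inlinable_numerical("+46 123-45 678"))  # True
print(inlinable_numerical("hello"))           # False: letters are not in the class
```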

Also note that the above limits are for the default 41 byte PropertyRecord layout — if that
parameter is changed via editing the source and recompiling, the above have to be recalculated.

21.8. Compressed storage of short arrays
Neo4j will try to store your primitive arrays in a compressed way, so as to save disk space and
possibly an I/O operation. To do that, it employs a "bit-shaving" algorithm that tries to reduce the
number of bits required for storing the members of the array. In particular:

1. For each member of the array, it determines the position of leftmost set bit.
2. Determines the largest such position among all members of the array
3. It reduces all members to that number of bits
4. Stores those values, prefixed by a small header.
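The steps above can be sketched as follows. This is a simplification that only models the bit counting and the inlining criteria, not the actual header layout:

```python
# Sketch of the "bit-shaving" idea: find the highest set-bit position
# across the array, shave every member down to that many bits, and
# check the inlining criteria (< 24 bytes after compression, fewer
# than 64 members). A negative value forces the full natural size.
def shaved_bits(values):
    if any(v < 0 for v in values):
        return 64                                  # natural size of a Java long
    return max(max(v.bit_length(), 1) for v in values)

def can_inline(values):
    bits = shaved_bits(values)
    total_bytes = (len(values) * bits + 7) // 8    # round up to whole bytes
    return len(values) < 64 and total_bytes < 24

print(can_inline([0, 1, 2, 4]))   # True: 4 members * 3 bits = 12 bits
print(can_inline([-1, 1, 2, 4]))  # False: -1 forces 64 bits per member
```

This reproduces the worked example below: {0, 1, 2, 4} shaves to 3 bits per member, while a single -1 forces 64-bit members and pushes the array into the dynamic store.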

That means that when even a single negative value is included in the array then the natural size
of the primitives will be used.

There is a possibility that the result can be inlined in the property record if:

• It is less than 24 bytes after compression

• It has less than 64 members

For example, an array long[] {0L, 1L, 2L, 4L} will be inlined, as the largest entry (4) requires 3 bits to store, so the whole array will be stored in 4*3 = 12 bits. The array long[] {-1L, 1L, 2L, 4L} however will require the full 64 bits for the -1 entry, so it needs 4*64 = 256 bits = 32 bytes, and it will end up in the dynamic store.

21.9. Memory mapped I/O settings
21.9.1. Optimizing for traversal speed example
21.9.2. Batch insert example
Each file in the Neo4j store can use memory mapped I/O for reading/writing. Best performance
is achieved if the full file can be memory mapped but if there isn’t enough memory for that Neo4j
will try and make the best use of the memory it gets (regions of the file that get accessed often will
more likely be memory mapped).

Important

Neo4j makes heavy use of the java.nio package. Native I/O will result in memory
being allocated outside the normal Java heap so that memory usage needs to be taken into
consideration. Other processes running on the OS will impact the availability of such
memory. Neo4j will require all of the heap memory of the JVM plus the memory to be used
for memory mapping to be available as physical memory. Other processes may thus not use
more than what is available after the configured memory allocation is made for Neo4j.

A well configured OS with large disk caches will help a lot once we get cache misses in the node
and relationship caches. Therefore it is not a good idea to use all available memory as Java heap.

If you look into the directory of your Neo4j database, you will find its store files, all prefixed by
neostore:

• nodestore stores information about nodes

• relationshipstore holds all the relationships

• propertystore stores information of properties and all simple properties such as


primitive types (both for relationships and nodes)
• propertystore strings stores all string properties

• propertystore arrays stores all array properties

There are other files there as well, but they are normally not interesting in this context.

This is how the default memory mapping configuration looks:

neostore.nodestore.db.mapped_memory=25M
neostore.relationshipstore.db.mapped_memory=50M
neostore.propertystore.db.mapped_memory=90M
neostore.propertystore.db.strings.mapped_memory=130M
neostore.propertystore.db.arrays.mapped_memory=130M

21.9.1. Optimizing for traversal speed example

To tune the memory mapping settings start by investigating the size of the different store files
found in the directory of your Neo4j database. Here is an example of some of the files and sizes in a
Neo4j database:
14M   neostore.nodestore.db
510M  neostore.propertystore.db
1.2G  neostore.propertystore.db.strings
304M  neostore.relationshipstore.db
In this example the application is running on a machine with 4GB of RAM. We’ve reserved
about 2GB for the OS and other programs. The Java heap is set to 1.5GB, that leaves about 500MB of
RAM that can be used for memory mapping.

Tip

If traversal speed is the highest priority it is good to memory map as much as possible
of the node- and relationship stores.

An example configuration on the example machine focusing on traversal speed would then look
something like:

neostore.nodestore.db.mapped_memory=15M
neostore.relationshipstore.db.mapped_memory=285M
neostore.propertystore.db.mapped_memory=100M
neostore.propertystore.db.strings.mapped_memory=100M
neostore.propertystore.db.arrays.mapped_memory=0M

21.9.2. Batch insert example

Read general information on batch insertion in batchinsert.

The configuration should suit the data set you are about to inject using BatchInsert. Let's say we
have a random-like graph with 10M nodes and 100M relationships. Each node (and maybe some
relationships) have different properties of string and Java primitive types (but no arrays). The
important thing with a random graph will be to give lots of memory to the relationship and node
store:

neostore.nodestore.db.mapped_memory=90M
neostore.relationshipstore.db.mapped_memory=3G
neostore.propertystore.db.mapped_memory=50M
neostore.propertystore.db.strings.mapped_memory=100M
neostore.propertystore.db.arrays.mapped_memory=0M
The configuration above will fit the entire graph (with exception to properties) in memory.

A rough formula to calculate the memory needed for the nodes:

number_of_nodes * 9 bytes

and for relationships:

number_of_relationships * 33 bytes
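Applying the two formulas above to the 10M-node / 100M-relationship batch insert example gives numbers that line up with the configuration shown earlier (the helper function name is illustrative):

```python
# Sketch: memory needed to map the node and relationship stores,
# using 9 B per node record and 33 B per relationship record.
def mapped_memory_estimate(num_nodes, num_rels):
    return num_nodes * 9, num_rels * 33  # bytes

node_bytes, rel_bytes = mapped_memory_estimate(10_000_000, 100_000_000)
print(node_bytes / 1024**2)  # about 86 MiB, matching the 90M setting above
print(rel_bytes / 1024**3)   # about 3.1 GiB, matching the 3G setting above
```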
Properties will typically only be injected once and never read so a few megabytes for the
property store and string store is usually enough. If you have very large strings or arrays you may
want to increase the amount of memory assigned to the string and array store files.

An important thing to remember is that the above configuration will need a Java heap of 3.3G+, since in batch inserter mode normal Java buffers that get allocated on the heap are used instead of memory mapped ones.

21.10. Linux performance guide


21.10.1. Setup
21.10.2. Running the benchmark
21.10.3. Fixing the problem
The key to achieve good performance on reads and writes is to have lots of RAM since disks are
so slow. This guide will focus on achieving good write performance on a Linux kernel based
operating system.

If you have not already read the information available in Chapter 21, Configuration and tuning, do that now to get some basic knowledge on memory mapping and store files with Neo4j.

This section will guide you through how to set up a file system benchmark and use it to
configure your system in a better way.

21.10.1. Setup

Create a large file with random data. The file should fit in RAM so if your machine has 4GB of
RAM a 1-2GB file with random data will be enough. After the file has been created we will read the
file sequentially a few times to make sure it is cached.

$ dd if=/dev/urandom of=store bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 263.53 s, 4.0 MB/s
$
$ dd if=store of=/dev/null bs=100M
10+0 records in
10+0 records out
1048576000 bytes (1.0 GB) copied, 38.6809 s, 27.1 MB/s
$
$ dd if=store of=/dev/null bs=100M
10+0 records in
10+0 records out
1048576000 bytes (1.0 GB) copied, 1.52365 s, 688 MB/s
$ dd if=store of=/dev/null bs=100M
10+0 records in
10+0 records out
1048576000 bytes (1.0 GB) copied, 0.776044 s, 1.4 GB/s
If you have a standard hard drive in the machine you may know that it is not capable of transfer
speeds as high as 1.4GB/s. What is measured is how fast we can read a file that is cached for us by
the operating system.

Next we will use a small utility that simulates the Neo4j kernel behavior to benchmark write
speed of the system.

$ git clone git@github.com:neo4j/tooling.git
...
$ cd tooling/write-test/
$ mvn compile
[INFO] Scanning for projects...
...
$ ./run
Usage: <large file> <log file> <[record size] [min tx size] [max tx size] [tx count] <[--nosync | --nowritelog | --nowritestore | --noread | --nomemorymap]>>
The utility will be given a store file (large file we just created) and a name of a log file. Then a
record size in bytes, min tx size, max tx size and transaction count must be set. When started the
utility will map the large store file entirely in memory and read (transaction size) records from it
randomly and then write them sequentially to the log file. The log file will then force changes to disk
and finally the records will be written back to the store file.
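The per-transaction behaviour described above (random reads from the memory mapped store, a sequential log append forced to disk, then writes back to the store) can be sketched as follows. This is a simplified illustration of the access pattern, not the actual utility:

```python
import os
import random

RECORD_SIZE = 33  # same record size as the relationship store

def run_tx(store, log, tx_size, store_records):
    """One simulated transaction: read tx_size records from random
    offsets, append them sequentially to the log, force the log to
    disk, then write the records back to the store."""
    offsets = [random.randrange(store_records) * RECORD_SIZE
               for _ in range(tx_size)]
    records = []
    for off in offsets:
        store.seek(off)
        records.append(store.read(RECORD_SIZE))
    for rec in records:          # sequential log append
        log.write(rec)
    log.flush()
    os.fsync(log.fileno())       # the fdatasync the utility counts
    for off, rec in zip(offsets, records):
        store.seek(off)
        store.write(rec)         # random writes back to the store
    return len(records)

# Build a tiny store file and run one transaction against it
with open("store.tmp", "wb") as f:
    f.write(os.urandom(1000 * RECORD_SIZE))
with open("store.tmp", "r+b") as store, open("logfile.tmp", "wb") as log:
    written = run_tx(store, log, tx_size=100, store_records=1000)
```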

21.10.2. Running the benchmark

Lets try to benchmark 100 transactions of size 100-500 with a record size of 33 bytes (same
record size used by the relationship store).

$ ./run store logfile 33 100 500 100

tx_count[100] records[30759] fdatasyncs[100] read[0.96802425 MB] wrote[1.9360485 MB]
Time was: 4.973
20.108585 tx/s, 6185.2 records/s, 20.108585 fdatasyncs/s, 199.32773 kB/s on reads, 398.65546 kB/s on writes
We see that we get about 6185 record updates/s and 20 transactions/s with the current transaction size. We can make the transactions larger, for example writing 10 transactions of 1000-5000 records each:

$ ./run store logfile 33 1000 5000 10

tx_count[10] records[24511] fdatasyncs[10] read[0.77139187 MB] wrote[1.5427837 MB]
Time was: 0.792
12.626263 tx/s, 30948.232 records/s, 12.626263 fdatasyncs/s, 997.35516 kB/s on reads, 1994.7103 kB/s on writes
With larger transactions we will do fewer of them per second, but record throughput will increase.
Let's see if it scales: if 10 transactions run in under 1 second, then 100 of them should execute in about 10 seconds:

$ ./run store logfile 33 1000 5000 100

tx_count[100] records[308814] fdatasyncs[100] read[9.718763 MB] wrote[19.437527 MB]
Time was: 65.115
1.5357445 tx/s, 4742.594 records/s, 1.5357445 fdatasyncs/s, 152.83751 kB/s on reads, 305.67502 kB/s on writes
This is not very linear scaling. We modified a bit more than 10x records in total but the time
jumped up almost 100x. Running the benchmark watching vmstat output will reveal that something is
not as it should be:

$ vmstat 3
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff   cache  si   so    bi    bo    in    cs us sy id wa
 0  1  47660 298884 136036 2650324   0    0     0 10239  1167  2268  5  7 46 42
 0  1  47660 302728 136044 2646060   0    0     0  7389  1267  2627  6  7 47 40
 0  1  47660 302408 136044 2646024   0    0     0 11707  1861  2016  8  5 48 39
 0  2  47660 302472 136060 2646432   0    0     0 10011  1704  1878  4  7 49 40
 0  1  47660 303420 136068 2645788   0    0     0 13807  1406  1601  4  5 44 47
There are a lot of blocks going out to IO, way more than expected for the write speed we are
seeing in the benchmark. Another observation that can be made is that the Linux kernel has spawned
a process called "flush-x:x" (run top) that seems to be consuming a lot of resources.
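To put numbers on the non-linearity, compare the 10-transaction and 100-transaction runs above (a small sketch; the values are copied from the benchmark output):

```python
# Values copied from the two larger ./run invocations above
run_10tx = {"records": 24511, "seconds": 0.792}
run_100tx = {"records": 308814, "seconds": 65.115}

record_ratio = run_100tx["records"] / run_10tx["records"]  # ~12.6x the records
time_ratio = run_100tx["seconds"] / run_10tx["seconds"]    # ~82x the time

# Per-record cost grew by a factor of ~6.5, so throughput collapsed
slowdown = time_ratio / record_ratio
```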

The problem here is that the Linux kernel is trying to be smart and write out dirty pages from the
virtual memory. As the benchmark will memory map a 1GB file and do random writes, it is likely that
this will result in 1/4 of the memory pages available on the system being marked as dirty. The Neo4j
kernel is not sending any system calls to the Linux kernel to write out these pages to disk, but the
Linux kernel decides to start doing so anyway, and that is a very bad decision. The result is that instead
of doing sequential-like writes down to disk (the logical log file) we are now doing random writes,
writing regions of the memory mapped file to disk.

It is possible to observe this behavior in more detail by looking at /proc/vmstat "nr_dirty" and
"nr_writeback" values. By default the Linux kernel will start writing out pages at a very low ratio of
dirty pages (10%).

$ sync
$ watch grep -A 1 dirty /proc/vmstat
...
nr_dirty 22
nr_writeback 0
The "sync" command will write out all data (that needs writing) from memory to disk. The
second command will watch the "nr_dirty" and "nr_writeback" count from vmstat. Now start the
benchmark again and observe the numbers:

nr_dirty 124947
nr_writeback 232
The "nr_dirty" pages will quickly start to rise and after a while the "nr_writeback" will also
increase meaning the Linux kernel is scheduling a lot of pages to write out to disk.

21.10.3. Fixing the problem

As we have 4GB RAM on the machine and memory map a 1GB file that does not need its
content written to disk (until we tell it to do so because of logical log rotation or Neo4j kernel
shutdown) it should be possible to do endless random writes to that memory with high throughput.
All we have to do is to tell the Linux kernel to stop trying to be smart. Edit the /etc/sysctl.conf (need
root access) and add the following lines:

vm.dirty_background_ratio = 50
vm.dirty_ratio = 80
Then (as root) execute:

# sysctl -p
The "vm.dirty_background_ratio" setting tells at what ratio of dirty pages the Linux kernel should start
the background task of writing them out. We increased this from the default 10% to 50%, which should
cover the 1GB memory mapped file. The "vm.dirty_ratio" setting tells at what ratio all IO writes become
synchronous, meaning that we cannot do IO calls without waiting for the underlying device to
complete them (which is something you never want to happen).
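A quick back-of-the-envelope check (a sketch under the assumptions stated above: 4GB of RAM and a 1GB memory mapped file) shows why 50% is a safe background ratio while the default 10% is not:

```python
ram_bytes = 4 * 1024**3     # 4GB of RAM on the machine
mapped_bytes = 1 * 1024**3  # the 1GB memory mapped store file

# Fraction of all pages that random writes can dirty in the worst case
dirty_fraction = mapped_bytes / ram_bytes  # 0.25

default_background_ratio = 0.10  # kernel default: starts flushing far too early
chosen_background_ratio = 0.50   # covers the mapped file with headroom
```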

Rerun the benchmark:

$ ./run store logfile 33 1000 5000 100

tx_count[100] records[265624] fdatasyncs[100] read[8.35952 MB] wrote[16.71904 MB]
Time was: 6.781
14.7470875 tx/s, 39171.805 records/s, 14.7470875 fdatasyncs/s, 1262.3726 kB/s on reads, 2524.745 kB/s on writes
Results are now more in line with what can be expected, 10x more records modified results in
10x longer execution time. The vmstat utility will not report any absurd amount of IO blocks going
out (it reports the ones caused by the fdatasync to the logical log) and Linux kernel will not spawn a
"flush-x:x" background process writing out dirty pages caused by writes to the memory mapped store
file.

21.11. Linux 特有的注意事项


21.11.1. File system tuning for high IO
21.11.2. Setting the number of open files

21.11.1. File system tuning for high IO

In order to support the high IO load of small transactions from a database, the underlying file
system should be tuned. Symptoms for this are low CPU load with high iowait. In this case, there are
a couple of tweaks possible on Linux systems:

• Disable access-time updates: noatime,nodiratime flags for the disk mount command, or in the /etc/fstab for the database disk volume mount.
• Tune the IO scheduler for high disk IO on the database disk.

21.11.2. Setting the number of open files

Linux platforms impose an upper limit on the number of concurrent files a user may have open.
This number is reported for the current user and session with the command

user@localhost:~$ ulimit -n
1024
The usual default of 1024 is often not enough, especially when many indexes are used or a
server installation sees too many connections (network sockets count against that limit as well). Users
are therefore encouraged to increase that limit to a healthy value of 40000 or more, depending on
usage patterns. Setting this value via the ulimit command is possible only for the root user, and only
for that session. To set the value system wide you have to follow the instructions for your
platform.

What follows is the procedure to set the open file descriptor limit to 40k for user neo4j under
Ubuntu 10.04 and later. If you opted to run the neo4j service as a different user, change the first field
in step 2 accordingly.

1. Become root since all operations that follow require editing protected system files.

user@localhost:~$ sudo su -
Password:
root@localhost:~$

2. Edit /etc/security/limits.conf and add these two lines:

neo4j soft nofile 40000
neo4j hard nofile 40000

3. Edit /etc/pam.d/su and uncomment or add the following line:

session required pam_limits.so
4. A restart is required for the settings to take effect.

After the above procedure, the neo4j user will have a limit of 40000 simultaneous open files. If
you continue experiencing exceptions on Too many open files or Could not stat()
directory then you may have to raise that limit further.

第 22 章 高可用性模式
注意

The High Availability features are only available in the Neo4j Enterprise
Edition.

Neo4j High Availability or “Neo4j HA” provides the following two main features:

1. It enables a fault-tolerant database architecture, where several Neo4j slave databases


can be configured to be exact replicas of a single Neo4j master database. This allows the
end-user system to be fully functional and both read and write to the database in the event of
hardware failure.
2. It enables a horizontally scaling read-mostly architecture that enables the system to
handle more read load than a single Neo4j database instance can handle.

22.1. 架构
Neo4j HA has been designed to make the transition from single machine to multi machine
operation simple, by not having to change the already existing application.

Consider an existing application with Neo4j embedded and running on a single machine. To
deploy such an application in a multi machine setup the only required change is to switch the creation
of the GraphDatabaseService from EmbeddedGraphDatabase to
HighlyAvailableGraphDatabase. Since both implement the same interface, no additional
changes are required.

图 22.1. Typical setup when running multiple Neo4j instances in HA mode

When running Neo4j in HA mode there is always a single master and zero or more slaves.
Compared to other master-slave replication setups Neo4j HA can handle writes on a slave so there is
no need to redirect writes to the master.

A slave will handle writes by synchronizing with the master to preserve consistency. Writes to
master can be configured to be optimistically pushed to 0 or more slaves. By optimistically we mean
the master will try to push to slaves before the transaction completes but if it fails the transaction will
still be successful (different from normal replication factor). All updates will however propagate from
the master to other slaves eventually so a write from one slave may not be immediately visible on all
other slaves. This is the only difference between multiple machines running in HA mode compared to
single machine operation. All other ACID characteristics are the same.

22.2. 安装和配置
Neo4j HA can be set up to accommodate differing requirements for load, fault tolerance and
available hardware.

Within a cluster, Neo4j HA uses Apache ZooKeeper [2] for master election and propagation of
general cluster and machine status information. ZooKeeper can be seen as a distributed coordination
service. Neo4j HA requires a coordinator service for initial master election, new master election
(current master failing) and to publish general status information about the current Neo4j HA cluster
(for example when a server joined or left the cluster). Read operations through the
GraphDatabaseService API will always work and even writes can survive coordinator failures if
a master is present.

ZooKeeper requires a majority of the ZooKeeper instances to be available to operate properly.


This means that the number of ZooKeeper instances should always be an odd number since that will
make best use of available hardware.
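The majority rule can be made concrete with a short sketch (my illustration, not from the manual):

```python
def tolerated_failures(ensemble_size):
    # A ZooKeeper ensemble keeps operating as long as a strict
    # majority of its instances is still alive.
    return (ensemble_size - 1) // 2

# 3 instances tolerate 1 failure, 5 tolerate 2, 7 tolerate 3.
# An even ensemble buys nothing extra: 4 instances also tolerate
# only 1 failure, which is why odd sizes make the best use of hardware.
```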

To further clarify the fault tolerance characteristics of Neo4j HA here are a few example setups:

22.2.1. Small

• 3 physical (or virtual) machines

• 1 Coordinator running on each machine

• 1 Neo4j HA instance running on each machine

This setup is conservative in the use of hardware while being able to handle moderate read load.
It can fully operate when at least 2 of the coordinator instances are running. Since the coordinator and
Neo4j HA are running together on each machine this will in most scenarios mean that only one server
is allowed to go down.

22.2.2. Medium

• 5-7+ machines

• Coordinator running on 3, 5 or 7 machines

• Neo4j HA can run on 5+ machines

This setup may mean that two different machine setups have to be managed (some machines run
both coordinator and Neo4j HA). The fault tolerance will depend on how many machines there are
that are running coordinators. With 3 coordinators the cluster can survive one coordinator going down,
with 5 it can survive 2 and with 7 it can handle 3 coordinators failing. The number of Neo4j HA
instances that can fail for normal operations is theoretically all but 1 (but for each required master
election the coordinator must be available).

22.2.3. Large

• 8+ total machines

• 3+ Neo4j HA machines

• 5+ Coordinators, on separate dedicated machines

In this setup all coordinators are running on separate machines as a dedicated service. The
dedicated coordinator cluster can handle half of the instances, minus 1, going down. The Neo4j HA
cluster will be able to operate with at least a single live machine. Adding more Neo4j HA instances is
very easy in this setup since the coordinator cluster is operating as a separate service.

22.2.4. Installation Notes


For installation instructions of a High Availability cluster see 第 22.5 节 “HA 安装向导”.

Note that while the HighlyAvailableGraphDatabase supports the same API as the
EmbeddedGraphDatabase, it does have additional configuration parameters.

表 22.1. HighlyAvailableGraphDatabase configuration parameters


• ha.server_id (required): integer >= 0 and has to be unique. Example value: 1
• ha.server: (auto-discovered) host & port to bind when acting as master. Example value: my-domain.com:6001
• ha.coordinators (required): comma delimited coordinator connections. Example value: localhost:2181,localhost:2182,localhost:2183
• ha.cluster_name: name of the cluster to participate in. Example value: neo4j.ha
• ha.pull_interval: interval for polling master from a slave, in seconds. Example value: 30
• ha.slave_coordinator_update_mode: creates a slave-only instance that will never become a master (sync, async, none). Example value: none
• ha.read_timeout: how long a slave will wait for a response from the master before giving up (default 20). Example value: 20
• ha.lock_read_timeout: how long a slave lock acquisition request will wait for a response from the master before giving up (defaults to what ha.read_timeout is, or its default if absent). Example value: 40
• ha.max_concurrent_channels_per_slave: max number of concurrent communication channels each slave has to its master. Increase if there’s high contention on few nodes. Example value: 100
• ha.branched_data_policy: what to do with the db that is considered branched and will be replaced with a fresh copy from the master {keep_all (default), keep_last, keep_none, shutdown}. Example value: keep_none
• ha.zk_session_timeout: how long (in milliseconds) before a non-reachable instance has its session expired from the ZooKeeper cluster and its ephemeral nodes removed, probably leading to a master election. Example value: 5000
• ha.tx_push_factor: amount of slaves a tx will be pushed to whenever the master commits a transaction. Example value: 1 (default)
• ha.tx_push_strategy: either "fixed" (default) or "round_robin"; fixed will push to the slaves with the highest server id. Example value: fixed

小心

Neo4j’s HA setup depends on ZooKeeper a.k.a. Coordinator which makes certain


assumptions about the state of the underlying operating system. In particular ZooKeeper
expects that the system time on each machine is set correctly, synchronized with respect to
each other. If this is not true, then Neo4j HA will appear to misbehave, caused by seemingly
random ZooKeeper hiccups.
小心

Neo4j uses the Coordinator cluster to store information representative of the


deployment’s state, including key fields of the database itself. Since that information is
permanently stored in the cluster, you cannot reuse it for multiple deployments of different
databases. In particular, removing the Neo4j servers, replacing the database and restarting
them using the same coordinator instances will result in errors mentioning the existing HA
deployment. To reset the Coordinator cluster to a clean state, you have to shutdown all
instances, remove the data/coordinator/version-2/* data files and restart the
Coordinators.

22.3. How Neo4j HA operates


A Neo4j HA cluster operates cooperatively, coordinating activity through Zookeeper.

On startup a Neo4j HA instance will connect to the coordinator service (ZooKeeper) to register
itself and ask, "who is master?" If some other machine is master, the new instance will start as slave
and connect to that master. If the machine starting up was the first to register — or should become
master according to the master election algorithm — it will start as master.

When performing a write transaction on a slave each write operation will be synchronized with
the master (locks will be acquired on both master and slave). When the transaction commits it will
first occur on the master. If the master commit is successful the transaction will be committed on the
slave as well. To ensure consistency, a slave has to be up to date with the master before performing a
write operation. This is built into the communication protocol between the slave and master, so that
updates will happen automatically if needed.

You can make a database instance permanently slave-only by including the


ha.slave_coordinator_update_mode=none configuration parameter in its configuration.

Such instances will never become a master during fail-over elections though otherwise they
behave identically to any other slaves, including the ability to write-through permanent slaves to the
master.

When performing a write on the master it will execute in the same way as running in normal
embedded mode. Currently the master will by default try to push the transaction to one slave. This is
done optimistically, meaning that if the push fails the transaction will still be successful. This push is not
like a replication factor that would cause the transaction to fail. The push factor (the number of slaves to
try to push a transaction to) can be configured from 0 (higher write performance) up to the number of
machines available in the cluster minus one.

Slaves can also be configured to update asynchronously by setting a pull interval.

Whenever a server running a neo4j database becomes unavailable the coordinator service will
detect that and remove it from the cluster. If the master goes down a new master will automatically be
elected. Normally a new master is elected and started within just a few seconds and during this time
no writes can take place (the write will throw an exception). A machine that becomes available after
being unavailable will automatically reconnect to the cluster. The only time this is not true is when an
old master had changes that did not get replicated to any other machine. If the new master is elected
and performs changes before the old master recovers, there will be two different versions of the data.
The old master will move away the branched database and download a full copy from the new master.

All this can be summarized as:

• Slaves can handle write transactions.


• Updates to slaves are eventually consistent but can be configured to optimistically be
pushed from master during commit.
• Neo4j HA is fault tolerant and (depending on ZooKeeper setup) can continue to operate
from X machines down to a single machine.
• Slaves will be automatically synchronized with the master on a write operation.

• If the master fails a new master will be elected automatically.

• Machines will be reconnected automatically to the cluster whenever the issue that
caused the outage (network, maintenance) is resolved.
• Transactions are atomic, consistent and durable but eventually propagated out to other
slaves.
• If the master goes down any running write transaction will be rolled back and during
master election no write can take place.
• Reads are highly available.

22.4. Upgrading a Neo4j HA Cluster


22.4.1. Overview
22.4.2. Step 1: On each slave perform the upgrade
22.4.3. Step 2: Upgrade the master, complete the procedure
This document describes the steps required to upgrade a Neo4j cluster from a previous version to
1.8 without disrupting its operation, a process referred to as a rolling upgrade. The starting
assumptions are that there exists a cluster running Neo4j version 1.5.3 or newer with the
corresponding ZooKeeper instances and that the machine which is currently the master is known. It is
also assumed that on each machine the Neo4j service and the neo4j coordinator service is installed
under a directory which from here on is assumed to be /opt/old-neo4j

22.4.1. Overview

The process consists of upgrading each machine in turn by removing it from the cluster, moving
over the database and starting it back up again. Configuration settings also have to be transferred. It is
important to note that the last machine to be upgraded must be the master. In general, the "cluster
version" is defined by the version of the master: provided the master is of the older version, the
cluster as a whole can operate (with the 1.8 instances running in compatibility mode). When a 1.8 instance
is elected master, however, the older instances are not capable of communicating with it, so we have
to make sure that the last machine upgraded is the old master. The upgrade process is detected
automatically by the joining 1.8 instances and they will not participate in a master election while
even a single old instance is part of the cluster.

22.4.2. Step 1: On each slave perform the upgrade

Download and unpack the new version. Copy over any configuration settings you run your
instances with, taking care of deprecated settings and API changes that can occur between versions.
Also, ensure that newly introduced settings have proper values (see 第 22.2 节 “安装和配置”).
Apart from the files under conf/ you should also set a proper value in data/coordinator/myid (copying
over the file from the old instance is sufficient). The most important setting is allow_store_upgrade
in neo4j.properties, which must be set to true, otherwise the instance will be unable to start. Finally,
don’t forget to copy over any server plugins you may have. Shut down first the Neo4j instance and
then the coordinator with:

service neo4j-service stop
service neo4j-coordinator stop
Next, uninstall both services

service neo4j-service remove
service neo4j-coordinator remove
Now you can copy over the database. Assuming the old instance is at /opt/old-neo4j and the
newly unpacked under /opt/neo4j-enterprise-1.8 the proper command would be

cp -R /opt/old-neo4j/data/graph.db /opt/neo4j-enterprise-1.8/data/
Next install neo4j and the coordinator services, which also starts them

/opt/neo4j-enterprise-1.8/bin/neo4j-coordinator install
/opt/neo4j-enterprise-1.8/bin/neo4j install
Done. Now check that the services are running and that webadmin reports version 1.8.
Transactions should also be applied from the master as usual.

22.4.3. Step 2: Upgrade the master, complete the procedure

Go to the current master and execute Step 1. The moment it is stopped, another instance will
take over, transitioning the cluster to 1.8. Finish Step 1 on this machine as well and you will have
completed the process.

22.5. HA 安装向导
22.5.1. Background
22.5.2. Setup and start the Coordinator cluster
22.5.3. Start the Neo4j Servers in HA mode
22.5.4. Start Neo4j Embedded in HA mode

This is a guide to set up a Neo4j HA cluster and run embedded Neo4j or Neo4j Server instances
participating as cluster nodes.

22.5.1. Background
The members of the HA cluster (see 第 22 章 高可用性模式) use a Coordinator cluster to
manage themselves and coordinate lifecycle activity like electing a master. When running a Neo4j
HA cluster, a Coordinator cluster is used for cluster collaboration and must be installed and
configured before working with the Neo4j database HA instances.

提示

Neo4j Server (see 第 17 章 Neo4j 服务器) and Neo4j Embedded (see 第 21.1 节 “介
绍”) can both be used as nodes in the same HA cluster. This opens up scenarios where one
application can insert and update data via a Java or JVM language based application, and
other instances can run Neo4j Server and expose the data via the REST API (rest-api).

Below, there will be 3 coordinator instances set up on one local machine.

Download and unpack Neo4j Enterprise


Download and unpack three installations of Neo4j Enterprise (called $NEO4J_HOME1,
$NEO4J_HOME2, $NEO4J_HOME3) from the Neo4j download site.

22.5.2. Setup and start the Coordinator cluster


Now, in the NEO4J_HOME1/conf/coord.cfg file, adjust the coordinator clientPort and let
the coordinator search for other coordinator cluster members at the localhost port ranges:

#$NEO4J_HOME1/conf/coord.cfg
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
clientPort=2181
The other two config files in $NEO4J_HOME2 and $NEO4J_HOME3 will have a different
clientPort set but the other parameters identical to the first one:

#$NEO4J_HOME2/conf/coord.cfg
...
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
...
clientPort=2182

#$NEO4J_HOME3/conf/coord.cfg
...
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
...
clientPort=2183
Next we need to create a file in each data directory called "myid" that contains an id for each
server equal to the number in server.1, server.2 and server.3 from the configuration files.

neo4j_home1$ echo '1' > data/coordinator/myid
neo4j_home2$ echo '2' > data/coordinator/myid
neo4j_home3$ echo '3' > data/coordinator/myid
We are now ready to start the Coordinator instances:

neo4j_home1$ ./bin/neo4j-coordinator start
neo4j_home2$ ./bin/neo4j-coordinator start
neo4j_home3$ ./bin/neo4j-coordinator start

22.5.3. Start the Neo4j Servers in HA mode

In your conf/neo4j.properties file, enable HA by setting the necessary parameters for all 3
installations, adjusting the ha.server_id for all instances:

#$NEO4J_HOME1/conf/neo4j.properties
#unique server id for this graph database
#can not be negative id and must be unique
ha.server_id = 1

#ip and port for this instance to bind to
ha.server = localhost:6001

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183

#$NEO4J_HOME2/conf/neo4j.properties
#unique server id for this graph database
#can not be negative id and must be unique
ha.server_id = 2

#ip and port for this instance to bind to
ha.server = localhost:6002

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183

#$NEO4J_HOME3/conf/neo4j.properties
#unique server id for this graph database
#can not be negative id and must be unique
ha.server_id = 3

#ip and port for this instance to bind to
ha.server = localhost:6003

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183
To avoid port clashes when starting the servers, adjust the ports for the REST endpoints in all
instances under conf/neo4j-server.properties and enable HA mode:

#$NEO4J_HOME1/conf/neo4j-server.properties
...
# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7474
...
# https port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=7473
...
# Allowed values:
# HA - High Availability
# SINGLE - Single mode, default.
# To run in High Availability mode, configure the coord.cfg file, and the
# neo4j.properties config file, then uncomment this line:
org.neo4j.server.database.mode=HA

#$NEO4J_HOME2/conf/neo4j-server.properties
...
# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7475
...
# https port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=7472
...
# Allowed values:
# HA - High Availability
# SINGLE - Single mode, default.
# To run in High Availability mode, configure the coord.cfg file, and the
# neo4j.properties config file, then uncomment this line:
org.neo4j.server.database.mode=HA

#$NEO4J_HOME3/conf/neo4j-server.properties
...
# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7476
...
# https port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=7471
...
# Allowed values:
# HA - High Availability
# SINGLE - Single mode, default.
# To run in High Availability mode, configure the coord.cfg file, and the
# neo4j.properties config file, then uncomment this line:
org.neo4j.server.database.mode=HA
To avoid JMX port clashes adjust the assigned ports for all instances under
conf/neo4j-wrapper.properties:

#$NEO4J_HOME1/conf/neo4j-wrapper.properties
...
# Remote JMX monitoring, adjust the following lines if needed.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html
wrapper.java.additional.4=-Dcom.sun.management.jmxremote.port=3637
...

#$NEO4J_HOME2/conf/neo4j-wrapper.properties
...
# Remote JMX monitoring, adjust the following lines if needed.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html
wrapper.java.additional.4=-Dcom.sun.management.jmxremote.port=3638
...

#$NEO4J_HOME3/conf/neo4j-wrapper.properties
...
# Remote JMX monitoring, adjust the following lines if needed.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html
wrapper.java.additional.4=-Dcom.sun.management.jmxremote.port=3639
...
Now, start all three server instances.

neo4j_home1$ ./bin/neo4j start
neo4j_home2$ ./bin/neo4j start
neo4j_home3$ ./bin/neo4j start
Now, you should be able to access the 3 servers (the first one being elected as master since it
was started first) at http://localhost:7474/webadmin/#/info/org.neo4j/High%20Availability/,
http://localhost:7475/webadmin/#/info/org.neo4j/High%20Availability/ and
http://localhost:7476/webadmin/#/info/org.neo4j/High%20Availability/ and check the status of the
HA configuration. Alternatively, the REST API is exposing JMX, so you can check the HA JMX
bean with e.g.

curl -H "Content-Type:application/json" -d '["org.neo4j:*"]' \
  http://localhost:7474/db/manage/server/jmx/query
And find in the response:

"description" : "Information about all instances in this cluster",
"name" : "InstancesInCluster",
"value" : [ {
  "description" : "org.neo4j.management.InstanceInfo",
  "value" : [ {
    "description" : "address",
    "name" : "address"
  }, {
    "description" : "instanceId",
    "name" : "instanceId"
  }, {
    "description" : "lastCommittedTransactionId",
    "name" : "lastCommittedTransactionId",
    "value" : 1
  }, {
    "description" : "machineId",
    "name" : "machineId",
    "value" : 1
  }, {
    "description" : "master",
    "name" : "master",
    "value" : true
  } ],
  "type" : "org.neo4j.management.InstanceInfo"
} ]
22.5.4. Start Neo4j Embedded in HA mode

If you are using Maven and Neo4j Embedded, simply add the following dependency to your
project:

<dependency>
  <groupId>org.neo4j</groupId>
  <artifactId>neo4j-ha</artifactId>
  <version>${neo4j-version}</version>
</dependency>
Where ${neo4j-version} is the Neo4j version used.

If you prefer to download the jar files manually, they are included in the Neo4j distribution.

The difference in code when using Neo4j-HA is the creation of the graph database service.

GraphDatabaseService db = new HighlyAvailableGraphDatabase( path, config );
The configuration can contain the standard configuration parameters (provided as part of the
config above or in neo4j.properties) but will also have to contain:

#HA instance1
#unique machine id for this graph database
#can not be negative id and must be unique
ha.server_id = 1

#ip and port for this instance to bind to
ha.server = localhost:6001

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183

enable_remote_shell = port=1331
First we need to create a database that can be used for replication. This is easiest done by
starting a normal embedded graph database, pointing out a path, and shutting it down.

Map<String,String> config = HighlyAvailableGraphDatabase.loadConfigurations( configFile );
GraphDatabaseService db = new HighlyAvailableGraphDatabase( path, config );

We created a config file with machine id=1 and enabled remote shell. The main method will
expect the path to the db as the first parameter and the configuration file as the second parameter.

It should now be possible to connect to the instance using 第 27 章 Neo4j 命令行:

neo4j_home1$ ./bin/neo4j-shell -port 1331
NOTE: Remote Neo4j graph database service 'shell' at port 1331
Welcome to the Neo4j Shell! Enter 'help' for a list of commands

neo4j-sh (0)$ hainfo
I'm currently master
Connected slaves:
Since it is the first instance to join the cluster it is elected master. Starting another instance
would require a second configuration and another path to the db.

#HA instance2
#unique machine id for this graph database
#can not be negative id and must be unique
ha.server_id = 2

#ip and port for this instance to bind to
ha.server = localhost:6002

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183

enable_remote_shell = port=1332
Now start the shell connecting to port 1332:

neo4j_home1$ ./bin/neo4j-shell -port 1332
NOTE: Remote Neo4j graph database service 'shell' at port 1332
Welcome to the Neo4j Shell! Enter 'help' for a list of commands

neo4j-sh (0)$ hainfo
I'm currently slave

22.6. 安装 HAProxy 作为一个负载均衡器


22.6.1. Installing HAProxy
22.6.2. Configuring HAProxy
22.6.3. Configuring separate sets for master and slaves
22.6.4. Cache-based sharding with HAProxy
In the Neo4j HA architecture, the cluster is typically fronted by a load balancer. In this section
we will explore how to set up HAProxy to perform load balancing across the HA cluster.

22.6.1. Installing HAProxy

For this tutorial we will assume a Linux environment. We will also be installing HAProxy from
source, and we’ll be using version 1.4.18. You need to ensure that your Linux server has a
development environment set up. On Ubuntu/apt systems, simply do:

aptitude install build-essential

And on CentOS/yum systems do:

yum -y groupinstall 'Development Tools'
Then download the tarball from the HAProxy website. Once you’ve downloaded it, simply build
and install HAProxy:

tar -zvxf haproxy-1.4.18.tar.gz
cd haproxy-1.4.18
make
cp haproxy /usr/sbin/haproxy
Or specify a target for make (TARGET=linux26 for Linux kernel 2.6 or above, or linux24 for a 2.4
kernel):

tar -zvxf haproxy-1.4.18.tar.gz
cd haproxy-1.4.18
make TARGET=linux26
cp haproxy /usr/sbin/haproxy

22.6.2. Configuring HAProxy

HAProxy can be configured in many ways. The full documentation is available at their website.

For this example, we will configure HAProxy to load balance requests to three HA servers.
Simply write the following configuration to /etc/haproxy.cfg:

global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    default_backend neo4j

backend neo4j
    server s1 10.0.1.10:7474 maxconn 32
    server s2 10.0.1.11:7474 maxconn 32
    server s3 10.0.1.12:7474 maxconn 32

listen admin
    bind *:8080
    stats enable
HAProxy can now be started by running:

/usr/sbin/haproxy -f /etc/haproxy.cfg
You can connect to http://<ha-proxy-ip>:8080/haproxy?stats to view the status dashboard. This
dashboard can be moved to run on port 80, and authentication can also be added. See the HAProxy
documentation for details on this.

22.6.3. Configuring separate sets for master and slaves

It is possible to set up HAProxy backends that only include slaves or only the master. For example,
it may be desired to only send writes to the slaves. To accomplish this, you need a small extension
on the server that can report whether or not the machine is the master via HTTP response codes.
In this example, the extension exposes two URLs:

• /hastatus/master, which returns 200 if the machine is the master, and 404 if the
machine is a slave
• /hastatus/slave, which returns 200 if the machine is a slave, and 404 if the
machine is the master

The following example excludes the master from the set of machines. Request will only be sent
to the slaves.

global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    default_backend neo4j-slaves

backend neo4j-slaves
    option httpchk GET /hastatus/slave
    server s1 10.0.1.10:7474 maxconn 32 check
    server s2 10.0.1.11:7474 maxconn 32 check
    server s3 10.0.1.12:7474 maxconn 32 check

listen admin
    bind *:8080
    stats enable
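By symmetry, a backend containing only the master could be sketched like this. This fragment is an assumption built from the /hastatus/master URL described above, not taken from the original manual; only the machine whose health check returns 200 (the master) would receive traffic:

```
backend neo4j-master
    option httpchk GET /hastatus/master
    server s1 10.0.1.10:7474 maxconn 32 check
    server s2 10.0.1.11:7474 maxconn 32 check
    server s3 10.0.1.12:7474 maxconn 32 check
```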

22.6.4. Cache-based sharding with HAProxy

Neo4j HA enables what is called cache-based sharding. If the dataset is too big to fit into the
cache of any single machine, then by applying a consistent routing algorithm to requests, the caches
on each machine will actually cache different parts of the graph. A typical routing key could be user
ID.

In this example, the user ID is a query parameter in the URL being requested. This will route the
same user to the same machine for each request.
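The effect of a consistent routing key can be illustrated with a toy hash. This is only a sketch of the idea, not HAProxy's actual algorithm; `cksum` stands in for the real hash function and 3 backend servers are assumed:

```shell
# Map a user id deterministically onto one of 3 backend indices.
route() {
  printf '%s' "$1" | cksum | awk '{print $1 % 3}'
}
route user42
route user42   # the same user id always yields the same backend index
```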

global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    default_backend neo4j-slaves

backend neo4j-slaves
    balance url_param user_id
    server s1 10.0.1.10:7474 maxconn 32
    server s2 10.0.1.11:7474 maxconn 32
    server s3 10.0.1.12:7474 maxconn 32

listen admin
    bind *:8080
    stats enable
Naturally, the health check and query-parameter-based routing can be combined to route requests
only to slaves, sharded by user ID. Other load balancing algorithms are also available, such as
routing by source IP (source), by URI (uri) or by HTTP header (hdr()).
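A sketch of such a combined backend, assembled from the two examples above (an illustration, not taken from the original manual):

```
backend neo4j-slaves
    option httpchk GET /hastatus/slave
    balance url_param user_id
    server s1 10.0.1.10:7474 maxconn 32 check
    server s2 10.0.1.11:7474 maxconn 32 check
    server s3 10.0.1.12:7474 maxconn 32 check
```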

第 23 章 备份
注意

备份服务只在 Neo4j 企业版中可以使


用。
备份是通过网络将一个运行的图数据库备份到本地的操作。有两种类型的备份:完整备份
和增量备份。

一个 完整备份 在不加任何锁的情况下拷贝数据库文件,因此不会阻塞源数据库上的任何操作。
这也意味着备份进行时事务会继续执行,存储内容也会随之改变。基于这个原因,从备份开始那
一刻起的事务日志会被一并记录下来。当备份操作完成后,所有在备份开始之后发生的事务都会
在备份文件上重新执行一遍。这确保了备份数据的一致性,并且数据存储的快照是最新的。

相比之下,一个 增量备份 不会拷贝存储文件 — 而是拷贝自上一次完整备份或者增量备份之后
发生的事务日志,然后在已经存在的备份上执行一遍。这使得增量备份比完整备份更有效率,
但它要求在进行增量备份之前必须已经存在一个 完整备份 。

不管采用什么模式,一个备份一旦被创建,结果文件都表示一个持久化的数据库快照,都
可以用来启动一个 Neo4j 数据库实例。

备份的数据库可以通过一个 URI 地址来指定:

<running mode>://<host>[:port]{,<host>[:port]*}

运行模式必须指定,并且只能是 single 或者 ha 。 <host>[:port] 部分指定了运行数据库的主机;
如果端口不是默认端口,请用 port 指定。额外的 host:port 参数用来指定更多的协调器实例。

重要

只有配置了参数 online_backup_enabled=true 的数据库才可以进行备份。这将使
备份服务在默认端口(6362)可用。为了在自定义端口提供备份服务,可以使用参数
online_backup_port=9999 进行配置。
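综合上面的说明,被备份的数据库一侧的配置可以草拟如下(示意片段,参数名来自上文):

```
# neo4j.properties(被备份的数据库)
# 启用在线备份服务(默认端口 6362)
online_backup_enabled=true

# 可选:在自定义端口提供备份服务
online_backup_port=9999
```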

23.1. 嵌入数据库和独立数据库服务器
为了对一个运行中的嵌入数据库或者远程服务器执行一次备份,如下:

# 执行一次完整备份
./neo4j-backup -full -from single://192.168.1.34 -to /mnt/backup/neo4j-backup

# 执行一次增量备份
./neo4j-backup -incremental -from single://192.168.1.34 -to /mnt/backup/neo4j-backup

# 当服务注册在一个自定义端口时执行一次增量备份
./neo4j-backup -incremental -from single://192.168.1.34:9999 -to /mnt/backup/neo4j-backup

23.2. 在 Java 中进行在线备份


为了以编程方式从 JVM 中对你的数据进行完整或者增量备份,你需要编写类似如下的代码:

OnlineBackup backup = OnlineBackup.from( "localhost" );
backup.full( backupPath );
backup.incremental( backupPath );
获取更多信息,请参考: the Javadocs for OnlineBackup

23.3. 高可用性模式
要对一个 HA 模式的集群进行备份,你需要指定一个或者多个管理这个集群的协调器。

注意

如果你为你的 HA 集群指定了一个集群名称,那么备份时也需要指定这个名称,
这样备份系统才知道要备份哪个集群。增加一个配置参数: -cluster
my_custom_cluster_name

# 执行一次从 HA 集群的完整备份,指定两个可能的协调器
./neo4j-backup -full -from ha://192.168.1.15:2181,192.168.1.16:2181 -to /mnt/backup/neo4j-backup

# 执行一次从 HA 集群的增量备份,只指定一个协调器
./neo4j-backup -incremental -from ha://192.168.1.15:2181 -to /mnt/backup/neo4j-backup

# 用一个指定的集群名称(通过 Neo4j 配置的 'ha.cluster_name' 指定)在 HA 集群上执行一次增量备份
./neo4j-backup -incremental -from ha://192.168.1.15:2181 -to /mnt/backup/neo4j-backup -cluster my-cluster

23.4. 恢复你的数据
Neo4j 的备份就是一个功能完整的数据库。要使用备份的数据,你需要做的仅仅是
用备份目录取代你正在使用的数据库目录即可。
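上面的恢复步骤可以用一个最小的 shell 草图来演示。注意这只是示意,使用临时目录模拟数据库目录和备份目录;实际操作前请先停止 Neo4j 服务,下面的路径和文件名均为演示用的假设:

```shell
# 示意:用备份目录整体替换当前的数据库目录(全部使用临时目录演示)
DB_DIR=$(mktemp -d)/graph.db
BACKUP_DIR=$(mktemp -d)/neo4j-backup
mkdir -p "$DB_DIR" "$BACKUP_DIR"
echo "old-store" > "$DB_DIR/neostore"        # 模拟正在使用的数据库文件
echo "backup-store" > "$BACKUP_DIR/neostore" # 模拟备份出来的数据库文件

# 恢复:删除旧目录,把备份拷贝到数据库目录的位置
rm -rf "$DB_DIR"
cp -r "$BACKUP_DIR" "$DB_DIR"

cat "$DB_DIR/neostore"   # 现在读到的是备份的内容
```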

第 24 章 安全
Neo4j 自身并不在数据层面强制安全性。然而,在不同场景使用 Neo4j 时,应该考虑
下面这些方面。

24.1. 安全访问 Neo4j 服务器

24.1.1. 加强端口和远程客户端连接请求的安全
默认情况下,Neo4j 服务端会捆绑一个 Web 服务器,服务在 7474 端口,通过地址:
http://localhost:7474/ 访问,不过只能从本地访问。

在配置文件 conf/neo4j-server.properties 中:

# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7474

#let the webserver only listen on the specified IP. Default
#is localhost (only accept local connections). Uncomment to allow
#any connection.
#org.neo4j.server.webserver.address=0.0.0.0
如果你需要能从外部主机访问,请在 conf/neo4j-server.properties 中设置
org.neo4j.server.webserver.address=0.0.0.0 来启用。

24.1.2. 任意的代码执行
默认情况下,Neo4j 服务端是可以执行任意代码的,比如通过 Gremlin Plugin 和 Traversal
的 REST endpoint。为了更安全一些,要么从服务端的类路径中完全移除这些插件,要么通
过代理或者授权规则来保护对这些地址的访问。另外, Java Security Manager 也可以用来保
护代码执行的安全。

24.1.3. HTTPS 支持
Neo4j 服务端内建支持通过 HTTPS 进行 SSL 加密通讯。服务端第一次启动时,会自动生成
一个自签名 SSL 证书和一个私钥。因为这个证书是自签名的,在生产环境使用是不安全的,
你应该为生产服务器单独生成证书。

为了用你自己的 KEY 和证书取代自动生成的,请修改 neo4j-server.properties 中的配置,
指定你的证书和 KEY 的位置:

# Certificate location (auto generated if the file does not exist)
org.neo4j.server.webserver.https.cert.location=ssl/snakeoil.cert

# Private key location (auto generated if the file does not exist)
org.neo4j.server.webserver.https.key.location=ssl/snakeoil.key
注意这个 KEY 应该是未加密的。确保你给私钥设置正确的权限,这样只有 Neo4j 服务端
才有权限读/写它。

你可以在配置文件中设置 HTTPS 连接端口,以及打开/关闭 HTTPS 支持:

# Turn https-support on/off
org.neo4j.server.webserver.https.enabled=true

# https port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=443

24.1.4. 服务端授权规则
除了 IP 层面的限制,管理员可能还需要在 Web 服务器上实施更高级的安全策略。Neo4j
服务端支持根据用户提供的凭证来允许或者禁止对特定功能的访问。

为了在 Neo4j 服务端实现特定的授权策略,可以实现安全规则并注册到服务端上。这使得
基于用户和角色的安全控制,以及依托外部查询服务进行认证等场景成为可能。

加强服务端授权规则

在这个范例中,注册了一个永久失败的安全规则来拒绝对服务器所有 URI 的访问,通过在
neo4j-server.properties 中列举规则类实现:

org.neo4j.server.rest.security_rules=my.rules.PermanentlyFailingSecurityRule

规则源代码:

public class PermanentlyFailingSecurityRule implements SecurityRule
{
    public static final String REALM = "WallyWorld"; // as per RFC2617 :-)

    @Override
    public boolean isAuthorized( HttpServletRequest request )
    {
        return false; // always fails - a production implementation performs
                      // deployment-specific authorization logic here
    }

    @Override
    public String forUriPath()
    {
        return "/*";
    }

    @Override
    public String wwwAuthenticateHeader()
    {
        return SecurityFilter.basicAuthenticationResponse(REALM);
    }
}

当这个规则被注册后,任何访问服务器的请求都会被拒绝。在一个生产级别的实现中,这
个规则可能会到第三方目录服务(比如 LDAP)或者一个授权用户的本地数据库中查找凭证。

Example request

• POST http://localhost:7474/db/data/node

• Accept: application/json

Example response

• 401: Unauthorized

• WWW-Authenticate: Basic realm="WallyWorld"

使用通配符设置安全规则
在这个范例中,注册了一个永久失败的安全规则来拒绝对服务端的访问,通过在
neo4j-server.properties 中列举规则类实现。在这个范例中,规则通过一个通配符 URI 路径
定义(* 用来匹配路径中的任何部分)。比如 /users* 匹配以 /users 开头的所有路径,
/users*type* 则可以匹配类似 /users/fred/type/premium 这样的地址。

org.neo4j.server.rest.security_rules=my.rules.PermanentlyFailingSecurityRuleWithWildcardPath
规则的源代码是:

public String forUriPath()
{
    return "/protected/*";
}
这个规则注册后,任何匹配该路径的请求都会被拒绝。使用通配符可以灵活地设置安全规则
的作用目标,这会影响到服务端的 API、任何非托管的扩展以及已经注册的管理插件。

Example request

• GET
http://localhost:7474/protected/wildcard_replacement/x/y/z/somethin
g/else/more_wildcard_replacement/a/b/c/final/bit/more/stuff
• Accept: text/plain

Example response

• 401: Unauthorized

• WWW-Authenticate: Basic realm="WallyWorld"

24.1.5. 基于主机的脚本支持
重要

Neo4j 服务端默认暴露的远程脚本功能允许对底层系统的完全访问。如果不在其之上实现
一个安全层就暴露你的服务端,会构成重大的安全漏洞。

24.1.6. 深度安全
虽然 Neo4j 服务端有大量的内建安全特性(见上面的章节),但对于安全要求更高的部署,
使用像 Apache 的 mod_proxy [3] 这样的代理服务器来代理外部请求会更加安全。

这样有很大的优势:

• 限制特定 IP 地址、URL 模式以及 IP 范围对 Neo4j 服务端的访问。比如可以保证只
有 /db/data 命名空间可以被非本地客户端访问,而 /db/admin 空间只开放给特定的 IP
地址。

<Proxy *>
  Order Deny,Allow
  Deny from all
  Allow from 192.168.0
</Proxy>
虽然用 Neo4j 的 SecurityRule 插件也能实现相同的功能,但使用像 Apache 这样久经考验
的服务器进行配置,往往比开发插件更可靠。当然,两种方法也可以同时使用,共同对代理
服务器和 SecurityRule 插件的访问提供一致的控制。

• 在 Linux/Unix 系统中,要在小于 1000 的端口(比如端口 80)上以非 ROOT 用户运行
Neo4j 服务端,使用:

ProxyPass /neo4jdb/data http://localhost:7474/db/data
ProxyPassReverse /neo4jdb/data http://localhost:7474/db/data
• 在一个集群环境中,使用 Apache mod_proxy_balancer [4] 实现简单的负载均衡。

<Proxy balancer://mycluster>
  BalancerMember http://192.168.1.50:80
  BalancerMember http://192.168.1.51:80
</Proxy>
ProxyPass /test balancer://mycluster

24.1.7. 用一个代理重写 URL 规则

当 Neo4j 服务端安装在代理服务器后面时,你需要启用 JSON 调用的 URL 重写规则,否则
它们将指向服务端自身的地址(一般是 http://localhost:7474 )。

为了实现这个,你可以使用 Apache mod_substitute:

ProxyPass / http://localhost:7474/
ProxyPassReverse / http://localhost:7474/
<Location />
  AddOutputFilterByType SUBSTITUTE application/json
  AddOutputFilterByType SUBSTITUTE text/html
  Substitute s/localhost:7474/myproxy.example.com/n
  Substitute s/http/https/n
</Location>

[3]
http://httpd.apache.org/docs/2.2/mod/mod_proxy.html

[4]
http://httpd.apache.org/docs/2.2/mod/mod_proxy_balancer.html

第 25 章 监视服务器
注意

许多监视特性只在 Neo4j 高级版和企业版中才可以使用。

为了掌握 Neo4j 数据库的健康状况,可以采用不同级别的监控手段。这些功能大多是通过
JMX 呈现出来的。

25.1. 调整远程 JMX 访问 Neo4j 的服务器


默认情况下,Neo4j 高级版和企业版都不允许远程 JMX 连接,因为 conf/neo4j-wrapper.conf
配置文件中的相关配置被注释掉了。为了启用该功能,你必须去掉 com.sun.management.jmxremote
相关配置前面的 # 。

取消注释后,默认配置允许以特定角色进行远程 JMX 连接;查看 conf/jmx.password、
conf/jmx.access 和 conf/neo4j-wrapper.conf 文件了解细节。

确保 conf/jmx.password 文件有正确的文件权限。文件的所有者必须是运行服务的用户,
而且只能被该用户读取。在 Unix 系统中,权限码应该是: 0600 。
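下面是一个最小的 shell 草图,演示 0600 权限的设置与检查(使用临时文件代替 conf/jmx.password 做演示;`stat -c` 为 GNU coreutils 的用法):

```shell
# 示意:将文件权限设置为 0600(仅所有者可读写),并检查结果
F=$(mktemp)          # 用临时文件代替 conf/jmx.password
chmod 0600 "$F"
stat -c %a "$F"      # 打印 600
```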

在 Windows 中,请按照
http://docs.oracle.com/javase/6/docs/technotes/guides/management/security-windows.html 设置正确
的权限。如果你以本地系统账户运行服务,文件的所有者和访问者必须是 SYSTEM。

使用这个配置,通过 <IP-OF-SERVER>:3637 ,用户名 monitor ,密码 Neo4j ,
你应该能连接到 Neo4j 服务端的 JMX 监控。

特别注意,你可能需要修改 conf/jmx.password 和 conf/jmx.access 文件的权限或者拥有者,
查看 conf/neo4j-wrapper.conf 文件了解细节。

警告

为了最大的安全性,请至少修改 conf/jmx.password 文件中的密码设置。

要获取更多细节,请查看:
http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html

25.2. 如何使用 JMX 和 JConsole 连接到一个 Neo4j 实例

首先,启动你的嵌入数据库或者 Neo4j 服务端:

$NEO4J_HOME/bin/neo4j start
现在启动 JConsole:

$JAVA_HOME/bin/jconsole
连接到你的 Neo4j 数据库实例的进程:

图 25.1. 连接 JConsole 到 Neo4j 的 Java 进程

现在,除了 JVM 自身暴露的 MBeans 之外,你将在 MBeans 选项卡中看到一个 org.neo4j
部分。通过它,你可以访问 Neo4j 所有的监控信息。

为了打开 JMX 的远程监控访问,请查看第 25.1 节 “调整远程 JMX 访问 Neo4j 的服务器”
和 the JMX documentation。当使用 Neo4j 嵌入模式时,确保把
com.sun.management.jmxremote.port=portNum 以及其他需要的配置作为 JVM 参数
传递给你运行的 Java 进程。

图 25.2. Neo4j MBeans 浏览

25.3. 如何以编程方式连接到 JMX 监视器

为了以编程方式连接到 Neo4j 的 JMX 服务,Neo4j 管理组件中有一些方法可以帮助你找到
最常用的监控属性。范例请查看:第 4.8 节 “读取一个管理配置属性”。

一旦你获取了这些信息,你就可以使用它,比如把属性值暴露给 SNMP 或者其他监控系统。

部分 V. 工具集
工具集部分描述了可以使用的 Neo4j 工具以及如何使用它们。

第 26 章 基于 Web 的 Neo4j 图数据库管理工具


基于 Web 的 Neo4j 图数据库管理工具是与 Neo4j 数据库交互的主要用户接口。使用它,
你可以:

• 监控 Neo4j 服务器

• 维护和浏览数据

• 通过控制台与数据库直接进行交互操作

• 浏览管理对象(JMX MBeans)

在你安装了 Neo4j 服务器后,这个工具可以通过地址 http://127.0.0.1:7474/ 访问。要将它
与嵌入模式的 Neo4j 图数据库一起使用,请参考:server-embedded。
26.1. 仪表盘选项
仪表盘提供了一个对正在运行的 Neo4j 实例的总览。

图 26.1. Web 管理工具之仪表盘

26.1.1. 实体图
图表展示了随着时间的推移的实体数量:节点,关系和属性。

图 26.2. 实体图

26.1.2. 监视器状态
在实体图下面是一组状态面板,显示了当前的资源使用情况。

图 26.3. 状态指示面板

26.2. 数据选项
使用数据选项浏览,新增和修改节点,关系以及它们的属性。

图 26.4. 浏览和维护数据

图 26.5. 编辑属性

26.3. 控制台选项
控制台可以让你:

• 通过 Gremlin 脚本引擎访问数据库。

• 通过 Cypher 查询语言查询数据。

• 通过 HTTP 控制台进行 HTTP 请求。

图 26.6. 使用 Gremlin 遍历数据

图 26.7. 使用 Cypher 进行查询

图 26.8. 通过 HTTP 进行交互

26.4. 服务器信息面板
服务器信息面板提供访问所有的管理对象(详细的细节请参考:operations-monitoring)。

图 26.9. JMX 属性

第 27 章 Neo4j 命令行
Neo4j shell 是一个用来浏览图数据库的命令行 shell,就像在 Unix shell 中你可以使用 cd、
ls 和 pwd 来浏览你的本地文件系统一样。

它包括两个部分:

• 通过 RMI 发送命令的轻量级客户端

• 处理这些命令并返回结果的服务端

它是用于开发和调试非常好的工具。这个向导将给你展示如何使用它。

27.1. 启动 shell
当 Neo4j 作为一个服务器已经启动时,你只需要简单的输入下面的命令:

./bin/neo4j-shell
要了解完整的选项,请参考:Shell 手册。

为了连接到一个正在运行的 Neo4j 图数据库:连接本地数据库请使用只读模式;连接远程
数据库请参考:第 27.1.1 节 “使能 Shell 服务器”。

当你启动你的 Neo4j 实例时,你需要确保 shell 的 jar 文件在类路径中。

27.1.1. 使能 Shell 服务器


Shell 从 Neo4j 内核配置中设置是否启动,请参考:server-configuration,服务器配置。这儿
是一个范例配置:

# 使用默认配置
enable_remote_shell = true
# ...或者指定自定义端口,其他配置使用默认值
enable_remote_shell = port=1234
当使用 Neo4j 服务器时,看 server-configuration,服务器配置了解如何在这种情况下新增配
置设置。

有两种方法启动 Shell:要么连接到一个远程 Shell 服务器,要么直接指向一个 Neo4j 的
存储路径。

27.1.2. 连接到 Shell 服务器


为了启动 Shell 并连接到一个运行的服务器,请运行:

neo4j-shell
另外还支持 -port 和 -name 选项,具体取决于远程 Shell 服务器是如何启动的。然后
你将得到一个像下面这样的 shell 提示符:

neo4j-sh (0)$

27.1.3. 把 shell 指向一个路径


通过指定到一个你运行 shell jar 文件的 Neo4j 存储路径来启动 shell。
neo4j-kernel-<version>.jar 和 jta jar 文件与你的 neo4j-shell-<version>.jar 在同一个目录,你可以
用下面的命令启动它:

$ neo4j-shell -path path/to/neo4j-db

27.1.4. 只读模式
在启动一个带存储路径的 shell 时指定 -readonly 开关,那么在会话期间就不能进行任何
写操作。
$ neo4j-shell -readonly -path path/to/neo4j-db

27.1.5. 运行一个命令然后退出
可以告诉 shell 只是启动,执行一个命令然后就退出。

这个功能适合后台任务和输出量很大的命令,比如 ls:你可以通过管道将它的输出传给
less 或者其他你选择的阅读工具,甚至是一个文件。

下面是用法范例:

$ neo4j-shell -c "cd -a 24 && set name Mattias"
$ neo4j-shell -c "trav -r KNOWS" | less

27.2. 传递选项和参数
给你的命令传递选项和参数,与在 *nix 环境下使用 CLI 命令非常类似。选项带有 - 前缀,
而且多个选项可以合并。一些选项需要一个值。参数就是没有 - 前缀的字符串。让我们以
ls 为例:

ls -r -f KNOWS:out -v 12345 将生成节点 12345 的带有输出关系 KNOWS 的详细列表。
节点 12345 是 ls 的一个参数,告诉它要显示这个指定的节点而不是当前节点(默认是当前
节点,使用命令 pwd 查看当前节点)。

这也有一个简短的版本:ls -rfv KNOWS:out 12345 。这里三个选项写在一个单独的
- 前缀后面合并在一起。虽然 f 在中间,它也知道 KNOWS:out 是它需要的值。这是因为
ls 命令的选项 r 和 v 并不期望任何值,因此,它能推断出正确的选项和对应的值。

27.3. 枚举选项
一些选项的值要求是某个枚举中的一个,比如关系的方向: INCOMING 、 OUTGOING
和 BOTH 。提供这些值时可以使用简写,只要写出值的开头部分,解释器就能推断出你真正
的意思。比如 in 、 i 都表示 INCOMING 。

27.4. 过滤器
一些命令使用过滤器来达到各种目的,比如 ls 和 trav 中的 -f 。过滤器以一个 json
对象的形式提供,键和值都可以包含正则表达式,以表达更复杂的匹配。一个过滤器的范例
是: .*url.*:http.*neo4j.*,name:Neo4j 。过滤器选项还可以伴随 -i 和 -l 选项,
分别表示 忽略 大小写和 宽松的匹配 (即使过滤器的值只匹配一部分也认为匹配,而不要求
完整匹配)。所以如果要进行不考虑大小写的模糊匹配,你可以使用 -f -i -l 或者它们的
简短版本: -fil 。

27.5. 节点标题标签
为了能更好地浏览你的图数据库,shell 可以为每一个节点显示标题。比如 ls -r 会显示
关系以及关系另一端的节点,标题和每一个节点显示在一起。如果你站在一个有两个 KNOWS
关系指向其他节点的节点上,没有标题你很难知道谁是谁的朋友。

标题特性读取一个属性键的列表,取列表中第一个在节点上存在的属性的值作为节点的标题
显示。比如对于列表 name,title.*,caption ,每个节点的标题将是列表中第一个存在的
属性的值。该列表通过 TITLE_KEYS 环境变量 来定义,默认是: .*name.*,.*title.* 。

27.6. 如何使用私有命令
shell 是模仿 Unix shell(比如你用来浏览本地文件系统的 bash)建立的。它们有一些相同
的命令,比如 cd 和 ls 。当你第一次启动 shell 时,你将获得所有可用命令的一个列表。
使用 man <命令> 获取某个命令的更多用法信息。

27.6.1. 当前节点/关系以及路径

你有一个当前节点/关系和一个 “当前路径”(就像 bash 中的当前工作目录),它们是你遍历
的依据。你从参考节点开始,然后 cd 到你想去的地方(任何时候都可以通过命令 pwd
查看当前路径)。 cd 可以以不同的方式使用:

• cd <node-id> 将遍历一个关系到达指定 ID 的节点。该节点必须有一个直接关系连接
到当前节点。
• cd -a <node-id> 将做一次绝对路径切换,意味着给定的节点不需要与当前节点有
直接关系。
• cd -r <relationship-id> 将遍历到一个关系而不是一个节点。该关系必须以当
前节点作为起点或者终点。在节点上使用命令 ls -vr 可以查看关系的 id。
• cd -ar <relationship-id> 将做一次绝对路径切换,意味着给定的关系可以
是图中的任意一个关系。
• cd 将把你带回参考节点处,也就是你开始的地方。
• cd .. 将带你返回路径中的上一步。
• cd start _(只适用于当前位置是一个关系时)_。遍历到关系的开始节点处。
• cd end _(只适用于当前位置是一个关系时)_。遍历到关系的结束节点处。
27.6.2. 列出一个节点/关系的内容
使用命令 ls 列举当前节点/关系的内容(当然也可以指定其他节点)。请注意,如果当前
节点/关系没有任何属性或者关系, ls 将返回空(比如在一个新数据库上)。 ls 可以接受
一个节点 id 作为参数,也可以使用过滤器,细节请参考:第 27.4 节 “过滤器”;关于如何指定
方向,请参考:第 27.3 节 “枚举选项”。使用 man ls 获取更多帮助信息。

27.6.3. 创建节点和关系
你可以创建一个新节点,并通过关系将它连接到当前节点。比如 mkrel -t
A_RELATIONSHIP_TYPE -d OUTGOING -c 将创建一个新节点( -c ),并通过方向为
OUTGOING 、类型为 A_RELATIONSHIP_TYPE 的关系连接到当前节点。如果你已经有两个
节点,你可以这样: mkrel -t A_RELATIONSHIP_TYPE -d OUTGOING -n <other-node-id> 。
它将只创建关系而不创建节点。

27.6.4. 设置,重命名和移除属性
属性操作通过 set 、 mv 和 rm 来完成。这些命令操作当前的节点/关系。

• set <key> <value> ,可以带 -t 选项指定类型,用来设置一个属性值。属性值支持
Neo4j 支持的任何类型。比如 int 类型:

$ set -t int age 29

设置一个 double[] 属性的范例:

$ set -t double[] my_values [1.4,12.2,13]

设置一个包含 JSON 字符串的 String 属性的范例:

mkrel -c -d i -t DOMAIN_OF --np "{'app':'foobar'}"


• rm <key> 移除一个属性。

• mv <key> <new-key> 重命名一个已经存在的属性。

27.6.5. 移除节点和关系
删除节点和关系通过命令 rmnode 和 rmrel 来完成。

rmnode 用来删除节点;如果被删除的节点还有关系,你可以通过附加选项 -f 来强制
删除。 rmrel 用来删除关系;它默认会保证图的连通性,但也可以通过 -f 选项强制删除。
rmrel 还可以在被删除关系另一端的节点不再有任何其他关系时,一并删除那个节点;查看
选项 -d 了解细节。

27.6.6. 环境变量
shell 使用类似 bash 的环境变量来保存会话信息,比如当前路径等等。相关命令模仿了 bash
的 export 和 env 。比如,你可以在任何时候执行 export STACKTRACES=true 来将
STACKTRACES 环境变量设置为 true 。这样当异常或者错误发生后,堆栈信息会被打印
出来。使用 env 列举环境变量。

27.6.7. 执行 groovy/python 脚本
shell 支持执行脚本,比如 Groovy 和 Python(通过 Jython)。脚本(*.groovy、*.py)必须
保存在服务器上,通过客户端调用,比如: gsh --renamePerson 1234 "Mathias"
"Mattias" --doSomethingElse ,此时脚本 renamePerson.groovy 和 doSomethingElse.groovy
必须存在于服务端通过环境变量 GSH_PATH 指定的目录中(默认是 .:src:src/script)。该变量
与 Java 类路径一样,通过 : 分割。Python 脚本使用 .py 扩展名,对应的路径环境变量是
JSH_PATH 。

书写脚本时,可以使用变量 args (一个 String[]),它包含传入的参数。在上面
renamePerson 的范例中,数组会包含 ["1234", "Mathias", "Mattias"] 。请把输出
写到变量 out ,比如 out.println( "My tracing text" ) ,这样结果会在客户端打印
而不是在服务端。

27.6.8. 遍历查询
trav 命令允许你从当前节点进行遍历查询。你能提供关系类型(可以使用正则匹配)、
方向以及属性过滤器来匹配节点。另外,你能提供一个命令,为每一次匹配执行。比如:
trav -o depth -r KNOWS:both,HAS_.*:incoming -c "ls $n" 。表示:深度优先
遍历, KNOWS 关系不考虑方向, HAS_.* 关系只考虑输入方向,并且为每一个匹配的节点
执行命令: ls <matching node> 。节点过滤器也通过 -f 选项提供,请参考:
第 27.4 节 “过滤器”。了解遍历顺序选项,请参考:第 27.3 节 “枚举选项”。关系类型/方向
也可以用与过滤器相同的格式提供。

27.6.9. 通过 Cypher 查询
你能使用 Cypher 查询数据库,查询以 start 开头。

提示

Cypher 查询需要以 ; 作为结尾符号。

• start n = (0) return n; 将返回 ID=0 的节点

27.6.10. 索引

通过 index 命令可以查询和维护索引。比如: index -i persons name (将当前节点
或者关系的 name 属性加入索引 "persons")。
• -g 将在索引中进行精确查询并且显示结果。你能使用 -c 带一个命令为每一个结
果执行一次命令。
• -q 将请求一个索引,查询并且显示结果。你能使用 -c 带一个命令为每一个结果
执行一次命令。
• --cd 将改变当前位置到查询到的位置。它只是适用于使用 -c 选项。

• --ls 会为每一个结果查询内容列表。它只适用于使用 -c 选项。

• -i 将为当前节点/关系在一个索引中创建一个 key-value 对。如果没有给定值,将使用
当前节点/关系上该键对应的属性值。
• -r 将为当前节点/关系在一个索引中移除一个 key-value 对。 键和值是可选的。

• -t 将设置要操作的索引类型,比如:`index -t Relationship --delete friends` 将删除
名为 friends 的关系索引。

27.6.11. 事务
能够先尝试修改,满意时提交、失败时回滚,是非常有用的。

事务是可以嵌套的。在嵌套事务里,只有顶层事务的提交才会真正把数据写到磁盘上。而
回滚不管事务的层级,它将回滚所有打开的事务。

• begin transaction 启动一个事务。

• commit 提交一个事务。

• rollback 回滚所有打开的事务。

27.7. 扩展 Shell: 增加你自己的命令


shell 是可以扩展的,它有一个本身几乎没有功能的通用核心,只内建了少量命令。

假设你想启动一个 Neo4j 图数据库,打开远程 shell 开关,并且增加你自己的命令(app),
让它和标准的 Neo4j 命令共存,这里是一个范例:

import org.neo4j.helpers.Service;
import org.neo4j.shell.kernel.apps.GraphDatabaseApp;

@Service.Implementation( App.class )
public class LsRelTypes extends GraphDatabaseApp
{
    @Override
    protected String exec( AppCommandParser parser, Session session, Output out )
            throws ShellException, RemoteException
    {
        GraphDatabaseService graphDb = getServer().getDb();
        out.println( "Types:" );
        for ( RelationshipType type : graphDb.getRelationshipTypes() )
        {
            out.println( type.name() );
        }
        return null;
    }
}
现在你可以在 shell 里输入 lsreltypes(命令名基于类名)来使用它,前提是 getName
没有被重载。

如果你想在使用命令 help 时显示更友好的信息,请重载 getDescription 方法;并使用
addValueType 方法增加对你的命令所支持选项的描述。

要知道这些命令(app)是运行在服务端的。所以如果你有一个正在运行的服务器,而你从
另外一个 JVM 启动了一个远程客户端,你是不能在客户端增加你的命令的。

27.8. 一个 Shell 会话范例

# 我们在哪儿?
neo4j-sh (0)$ pwd
当前在:(0)
(0)

# 在当前节点上,设置属性 "name" 的值为 "Jon"
neo4j-sh (0)$ set name "Jon"

# 发送一个 cypher 查询
neo4j-sh (Jon,0)$ start n=node(0) return n;
+---------------------+
| n                   |
+---------------------+
| Node[0]{name:"Jon"} |
+---------------------+
1 row
0 ms

# 创建一个类型为 LIKES 的输入关系,并且创建关系另一端的节点
neo4j-sh (Jon,0)$ mkrel -c -d i -t LIKES --np "{'app':'foobar'}"

# 我们在哪儿?
neo4j-sh (Jon,0)$ ls
*name =[Jon]
(me)<-[:LIKES]-(1)

# 切换到新创建的节点
neo4j-sh (Jon,0)$ cd 1

# 列举关系,包括关系的 Id
neo4j-sh (1)$ ls -avr
(me)-[:LIKES,0]->(Jon,0)

# 再创建一个 KNOWS 关系以及它另一端的节点
neo4j-sh (1)$ mkrel -c -d i -t KNOWS --np "{'name':'Bob'}"

# 输出当前位置
neo4j-sh (1)$ pwd
Current is (1)
(Jon,0)-->(1)

# 列举关系
neo4j-sh (1)$ ls -avr
(me)-[:LIKES,0]->(Jon,0)
(me)<-[:KNOWS,1]-(Bob,2)

27.9. 黑客帝国范例

这个范例通过 shell 创建了黑客帝国中人物的图数据库,然后执行 Cypher 查询数据:

图 27.1. 基于 Shell 的黑客帝国范例

Neo4j 配置成自动索引,在 Neo4j 配置文件中如下:

node_auto_indexing=true
node_keys_indexable=name,age
relationship_auto_indexing=true
relationship_keys_indexable=ROOT,KNOWS,CODED_BY

下面的范例展示了如何通过 Shell 会话创建黑客帝国的图数据库并查询数据。

# 创建 Thomas Andersson 节点
neo4j-sh (0)$ mkrel -t ROOT -c -v
Node (1) created
Relationship [:ROOT,0] created

# 切换到新节点
neo4j-sh (0)$ cd 1

# 设置节点属性:name
neo4j-sh (1)$ set name "Thomas Andersson"

# 创建 Thomas 直接的朋友
neo4j-sh (Thomas Andersson,1)$ mkrel -t KNOWS -cv
Node (2) created
Relationship [:KNOWS,1] created

# 切换到新节点
neo4j-sh (Thomas Andersson,1)$ cd 2

# 设置节点属性:name
neo4j-sh (2)$ set name "Trinity"

# 返回上级
neo4j-sh (Trinity,2)$ cd ..

# 创建 Thomas 直接的朋友
neo4j-sh (Thomas Andersson,1)$ mkrel -t KNOWS -cv
Node (3) created
Relationship [:KNOWS,2] created

# 切换到新节点
neo4j-sh (Thomas Andersson,1)$ cd 3

# 设置节点属性:name
neo4j-sh (3)$ set name "Morpheus"

# 创建到 Trinity 的关系
neo4j-sh (Morpheus,3)$ mkrel -t KNOWS 2

# 列出节点 3 的关系
neo4j-sh (Morpheus,3)$ ls -rv
(me)-[:KNOWS,3]->(Trinity,2)
(me)<-[:KNOWS,2]-(Thomas Andersson,1)

# 切换当前位置到关系 #2
neo4j-sh (Morpheus,3)$ cd -r 2

# 设置关系的属性:age
neo4j-sh [:KNOWS,2]$ set -t int age 3

# 回到 Morpheus
neo4j-sh [:KNOWS,2]$ cd ..

# 前往下一个关系
neo4j-sh (Morpheus,3)$ cd -r 3

# 设置关系的属性:age
neo4j-sh [:KNOWS,3]$ set -t int age 90

# 切换当前位置到当前关系的开始节点处
neo4j-sh [:KNOWS,3]$ cd start

# 创建新节点
neo4j-sh (Morpheus,3)$ mkrel -t KNOWS -c

# 列出当前节点的所有关系
neo4j-sh (Morpheus,3)$ ls -r
(me)-[:KNOWS]->(Trinity,2)
(me)-[:KNOWS]->(4)
(me)<-[:KNOWS]-(Thomas Andersson,1)

# 切换到 Cypher 节点
neo4j-sh (Morpheus,3)$ cd 4

# 设置属性:name
neo4j-sh (4)$ set name Cypher

# 从 Cypher 处创建新节点
neo4j-sh (Cypher,4)$ mkrel -ct KNOWS

# 列出关系
neo4j-sh (Cypher,4)$ ls -r
(me)-[:KNOWS]->(5)
(me)<-[:KNOWS]-(Morpheus,3)

# 切换到 Agent Smith 节点
neo4j-sh (Cypher,4)$ cd 5

# 设置节点属性:name
neo4j-sh (5)$ set name "Agent Smith"

# 创建输出关系和新节点
neo4j-sh (Agent Smith,5)$ mkrel -cvt CODED_BY
Node (6) created
Relationship [:CODED_BY,6] created

# 跳到这个节点
neo4j-sh (Agent Smith,5)$ cd 6

# 设置节点属性:name
neo4j-sh (6)$ set name "The Architect"

# 跳回参考节点处
neo4j-sh (The Architect,6)$ cd

# Morpheus 的朋友,通过 Neo4j 自动生成的索引 name 查询 Morpheus
neo4j-sh (0)$ start morpheus = node:node_auto_index(name='Morpheus') match morpheus-[:KNOWS]-zionist return zionist.name;
+--------------------+
| zionist.name       |
+--------------------+
| "Trinity"          |
| "Cypher"           |
| "Thomas Andersson" |
+--------------------+
3 rows
19 ms

# 同样的查询,显式指定 Cypher 1.6 版本
neo4j-sh (0)$ cypher 1.6 start morpheus = node:node_auto_index(name='Morpheus') match morpheus-[:KNOWS]-zionist return zionist.name;
+--------------------+
| zionist.name       |
+--------------------+
| "Trinity"          |
| "Cypher"           |
| "Thomas Andersson" |
+--------------------+
3 rows
1 ms

部分 VI. 社区
Neo4j 项目有一个非常强大的社区。阅读下面的信息了解如何从社区获取帮助和如何为社
区做贡献。

第 28 章 社区支持
你能在各种 活动 中学到很多关于 Neo4j 的信息。要获取 Neo4j 活动的最新信息,
可以看看下面的网站:

• http://neo4j.org/ 。

• http://neo4j.meetup.com/ 。

从 Neo4j 开源社区获取帮助; 这里是一些切入点。

• Neo4j 社区讨论组: https://groups.google.com/forum/#!forum/neo4j 。

• Twitter: http://twitter.com/neo4j 。

• IRC 频道: irc://irc.freenode.net/neo4j web chat。

报告一个 bug 或者提交一个 特性请求 :

• 通用: https://github.com/neo4j/community/issues

• 监视器: https://github.com/neo4j/advanced/issues

• 备份和高可用性: https://github.com/neo4j/enterprise/issues

关于 文档 的问题:Neo4j 手册发布在网上并开放评论,请用它提交任何问题或者评论。
请浏览: http://docs.neo4j.org/ 。

第 29 章 促进 Neo4j 发展
Neo4j 工程是一个致力于提供快速的复杂数据存储的开源工程。任何形式的帮助都会受到社区的高度
赞赏 - 你不是一个人在战斗,请参考:贡献者列表!
贡献给 Neo4j 工程的一个关键环节是 CLA 协议。
简而言之:请确保签署 CLA 协议并通过邮件发送给我们,否则 Neo4j 工程不能接收你的贡献。
注意,你也能通过贡献文档或者在文档页面提出你的反馈来给社区做贡献。基本上,在你能得到帮助
的任何地方,都有你贡献的机会。
如果你想为社区作贡献,可以从一些合适的领域开始,特别是要与社区联系,请参考:社区支持。
如果你想在文档方面作贡献,我们非常推荐你先阅读:社区文档。
29.1. 贡献者许可协议
我们要求所有托管在 Neo4j 上面的源代码都必须通过 Neo4j Contributor License Agreement (CLA)
贡献。这个协议的目的是保护代码库的完整性,反过来也保护代码库周围的社区:创始公司 Neo
Technology、Neo4j 开发社区以及 Neo4j 用户。这类贡献者协议与很多开源软件和开源工程类似
(实际上,它非常类似于被广泛签署的 Oracle Contributor Agreement)。

如果你有 CLA 协议的其他任何问题,请继续浏览下面的内容或者发送邮件到:
admins@neofourjay.org 。如果你有法律问题,请咨询律师。
29.1.2. 一般问题

我会失去我自己代码的权利吗?

不会,Neo4j CLA 只要求你共享你的权利,而不是要你放弃权利。不像一些要求你把版权转移给
另一个组织的贡献协议,CLA 并不会拿走你的知识产权。当你同意 CLA 协议时,你授予我们对你
的贡献的共同版权以及专利授权。你保留所有权利、所有权和利益,你可以把你的贡献用在任何你
希望的地方。
你们会如何处理我贡献的东西?

我们可以行使版权持有人的所有权利,包括你在 Neo4j CLA 中授予我们的权利。基于 CLA 提供的
共同版权,你对你的贡献也可以行使和我们相同的权利。
这为社区提供什么福利?

它让我们可以赞助 Neo4j 重点项目,为社区提供基础设施,并确保我们引入到软件中、发布给
客户的内容不会带来任何讨厌的惊喜。没有这个能力,我们作为一个小公司将很难把我们所有的
代码作为自由软件发布。

而且,CLA 协议让我们能在有需要时保护社区成员(包括开发者和用户)免受恶意的知识产权
诉讼。这与其他自由软件管理组织(比如 Free Software Foundation - FSF)维护项目的方式是
一样的,区别在于 FSF 不采用共同版权,而是要求把版权完全转让给 FSF。贡献者协议还包括
一份"自由软件公约",即一个承诺:贡献的内容将保持为自由软件。

最终,你仍然保留你的贡献的所有权利,而我们可以很有信心地说:我们能保护 Neo4j 社区
以及 Neo4j 公司的客户。
我们可以讨论 CLA 里面的一些条款吗?
当然可以!请告诉我们你的反馈!但请不要在邮件列表上讨论法律条文。请把你的反馈直接发送
邮件到 cla (@t) neotechnology.com,我们会尽快回复你。
我仍然不喜欢 CLA 协议。
好的,你也可以把你的代码或者文档托管到其他任何地方。不过请不要这样!我们只是在这里
说明我们提供的基础设施的规则而已。
29.1.3. 如何签署 CLA 协议

当你阅读完 CLA 后,请发送一封邮件到:cla (@t) neotechnology.com。邮件中包括下面的信息:

你的完整姓名。
你的常用电子邮件。
作为附件的 Neo4j CLA 的一份拷贝。
你同意这个协议的表述。

举个例子:

Hi. My name is John Doe (john@doe.com).
29.2. 可以贡献的领域

29.2.1. Neo4j 发布
29.2.2. 维护 Neo4j 文档
29.2.3. Neo4j 的客户端驱动以及绑定

Neo4j 是一个高速成长并有大量贡献空间的项目。你可以全情投入,具体取决于你的时间、技能
和兴趣。下面是一些你可能感兴趣的领域:

29.2.1. Neo4j 发布

Neo4j 社区版的问题讨论,适合刚开始为社区作贡献的成员。
查看 GitHub Neo4j 项目获取所有的 Neo4j 工程。

29.2.2. 维护 Neo4j 文档

文档的某些部分需要额外维护来保持更新,它们通常涉及不同的社区贡献。下面的清单里,你可以
随时研究和检查过时或缺失的内容!为文档做贡献的最简单方式是直接在 在线 HTML 版本 上发表
评论。

第 5 章 Neo4j 远程客户端库

29.2.3. Neo4j 的客户端驱动以及绑定

REST: 查看第 5 章 Neo4j 远程客户端库,获取当前活跃的社区项目。
29.3. 撰写 Neo4j 开发文档
注意:比起撰写文档,你也可以通过在 在线 HTML 版本页面提交评论来帮助我们。

要获取如何编译手册,请参考: 编译说明。

文档采用了 asciidoc 格式,请参考:

AsciiDoc 参考手册
AsciiDoc 速记卡
这个速记卡真的非常棒!

29.3.1. 总体流程
每一个项目或者子项目都有它自己的文档,会生成到一个'docs.jar'文件中。默认情况下,这个文件
从'src/docs/'目录的内容组装而成。 Asciidoc 文档的扩展名是 .txt 。

文档可以使用来自项目源代码的代码片段。相应的代码必须部署到 'sources.jar' 或者 'test-sources.jar' 文件中。

通过建立相应的单元测试,文档可以在 JavaDoc 注释中直接书写。

上面的文件都被用来构建手册(通过将它们增加为依赖)。要让内容出现在手册中,还必须在手册的文件中明确地包含它。

注意为不同场景增加文档工作的不同方法:

为了生成详细的文档,必须把文档写成单元测试的一部分(包括在 JavaDoc 中)。在这种情况下,你一般不想在文档中链接到源代码。
对于教学级的文档,最好的方式是书写一个 .txt 文件并把它包括在文本中。源码片段和输出范例都能从那里引入。在这种情况下,你一般想链接到源代码,而用户应该不需要任何额外的设置就可以运行它。
29.3.2. 在'docs.jar'中的文件结构
目录 内容
dev/
面向开发人员的内容

dev/images/
面向开发人员的内容需要的图片

ops/
面向管理员的内容

ops/images/
面向管理员的内容需要的图片

man/
联机帮助页

额外子目录被用来作为文档所必须的结构,比如'dev/tutorial/','ops/tutorial/'等等。
29.3.3. 标题和文档结构
每一个文档都带有标题并从等级 0 开始。每一个文档应该有一个 ID。在一些情况下,文档中的片段也需要它们自己的 ID,这依赖于它们是否会填充到整体结构中。要能链接到这些内容,就必须有一个 ID。在强制要求的地方如果缺少 ID,构建将会失败。

这是一个文档的开始:

[[unique-id-verbose-is-ok]]
文档标题
====

为了把标题放到正确的等级下面,当在一个文件中引入另外文件时,属性 leveloffset 将会被使用。

文档中后续的标题应该使用如下语法:

... 这是内容 ...

=== 三级标题 ===

这是内容 ...

AsciiDoc 的标题语法不止一种,但其他语法在这个项目中不会被使用。
29.3.4. 撰写
一行写一句。这让移动内容变得容易,也很容易发现需要分拆的长句。
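下面是一个小示意(内容只是假设的示例文字),展示"一行一句"的源文件写法:

```asciidoc
图数据库用图来存储数据。
节点通过关系连接起来。
每个节点和关系都可以有属性。
```

这样在修改时,每句话的改动在版本控制中都是独立的一行差异。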
29.3.5. 陷阱
一个章节不能是空的。(否则构建时做 docbook xml 合法性校验会失败。)
文档标题应该有下划线,由与标题字符个数相同的 = 组成。
在文档末尾总是留一个空行。(否则下一个文档的标题会混入前一个文档的最后一段。)
因为 {} 被 Asciidoc 用来表示属性,每一对内嵌在其中的内容都会被当作一个属性。当你必须使用这些符号的时候,请使用:+\{+。如果你不这样做,括号里面的文字会在没有任何提示或者警告的情况下被移除。
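举一个小示意(示例文字是假设的),按照上面的建议转义花括号,可以这样写:

```asciidoc
要显示字面的花括号,请写成 +\{+ 和 +\}+,而不要直接写 { 和 }。
```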
29.3.6. 链接
为了链接到手册给定 Id 的其他部分,下面展示如何链接到一个参考页面:

<<community-docs-overall-flow>>

输出结果:第 29.3.1 节 “总体流程”

注意
只需要简单写 "看 <<target-id>>" ,在大多数情况下,这应该足够。

如果你想链接到其他文档并自定义链接文本,如下操作:

<<target-id, link text that fits in the context>>

注意
大量的链接文本在 web 环境中工作得很好,但在打印版中只是普通文本,而我们的目标是两者都适用。

外部链接如下:
http://neo4j.org/[Link text here]

输出像这样: Link text here

对于一个短地址可能不需要增加链接文本,像这样:

http://neo4j.org/

输出像这样: http://neo4j.org/

注意
在链接后面紧跟一个标点符号是没有问题的,它不会被认为是链接的一部分。

29.3.7. 文本格式
_Italics_ 显示为斜体,表示强调。
*Bold* 显示为粗体,仅用于特别加强重点。
+methodName()+ 显示为 methodName(),也用于字面文字(注意:在 + 符号之间的文字会被解析)。
`command` 显示为 command(一般用于命令行)。(注意:在 ` 符号之间的文字不会被解析。)
'my/path/' 显示为 my/path/(用来表示文件名或者路径)。
``Double quoted''(即左边两个反引号,右边两个单引号)表示为 "Double quoted"。
`Single quoted'(即左边一个反引号,右边一个单引号)表示为 'Single quoted'。

29.3.9. 图片
重要
整个手册中的图片都在同一个命名空间里,请自己管理好它们的命名,避免冲突。

图片文件
为了引入一个图片文件,确保图包括在文档目录的子目录'images/'中,然后,你可以:

image::neo4j-logo.png[]

输出结果:

静态的 Graphviz/ DOT


我们使用 Graphviz/DOT 语言来描述图数据库。要获取更多的细节,请参考:http://graphviz.org/。

这是如何引入一个简单范例数据库:

["dot", "community-docs-graphdb-rels.svg"]
----
"开始节点" -> "结束节点" [label="关系"]
----

输出结果:

这儿是一个在构建中使用一些预置参数的范例:

["dot", "community-docs-graphdb-rels-overview.svg", "meta"]
----
"A Relationship" [fillcolor="NODEHIGHLIGHT"]
"Start node" [fillcolor="NODE2HIGHLIGHT"]
"A Relationship" -> "Start node" [label="has a"]
"A Relationship" -> "End node" [label="has a"]
"A Relationship" -> "Relationship type" [label="has a"]
Name [TEXTNODE]
"Relationship type" -> Name [label="uniquely identified by" color="EDGEHIGHLIGHT" fontcolor="EDGEHIGHLIGHT"]
----

输出结果:

传递给 dot 过滤器的可选第二个参数定义了使用的风格:

未定义时: 节点空间的默认样式。
neoviz: 通过 Neoviz 生成的节点空间样式。
meta: 不展示图数据库的内容,更适合展示概念。
小心
DOT 语言的关键字用于其他用途时,必须用双引号括起来。 关键字包括: node, edge, graph,
digraph, subgraph, 和 strict.
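下面是一个小示意(输出文件名是假设的),演示当把 DOT 关键字用作节点标签时,按上面的提醒加上双引号:

```asciidoc
["dot", "keyword-quoting-example.svg"]
----
"node" -> "graph" [label="关键字必须用双引号括起来"]
----
```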

29.3.10. 属性
你能在文档中使用的普通属性:

{neo4j-version} - 表示为 "1.8"
{neo4j-git-tag} - 表示为 "1.8"
{lucene-version} - 表示为 "3_5_0"
这些可以用来替换指向范例、APIDocs 或者源代码的 URL 中的某些部分。注意 neo4j-git-tag 也能指向 snapshot/master。
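比如,可以在一个外部链接中使用这些属性(下面的 URL 只是一个假设的示意,并非出自本手册):

```asciidoc
http://components.neo4j.org/neo4j/{neo4j-version}/[Neo4j 组件 {neo4j-version}]
```

这样当版本属性更新时,所有使用它的链接会一起更新。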

能被使用的 Asciidoc 范例属性:

{docdir} - 指向文档的根目录
{nbsp} - 不间断空格
29.3.11. 备注
有一个独立的构建用来引入备注。备注用一个黄色背景表示。默认它是不会被构建的,但在一次普通构建之后,你可以使用 make annotated 来构建它。生成的结果页可以用来搜索内容,它把完整手册放在一个单页面上。

下面是范例:

// this is a comment

备注在一般构建结果文档中是不可见的。备注块根本不会包括在任何构建输出里面。这是一个范例:

////
Note that includes in here will still be processed, but not make it into the output.
That is, missing includes here will still break the build!
////
29.3.12. 代码片段
在文档中明确定义
警告
尽可能少地使用代码片段。众所周知,它们很容易随着时间推移而与实际脱节。

下面是范例:

[source,cypher]
----
start n=(2, 1) where (n.age < 30 and n.name = "Tobias") or not(n.name = "Tobias") return n
----

输出结果:

start n=(2, 1) where (n.age < 30 and n.name = "Tobias") or not(n.name = "Tobias") return n

如果没有合适的高亮语法,就省略语言设置: +[source]+。
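举一个小示意(命令内容是假设的),不指定语言的代码块像这样书写:

```asciidoc
[source]
----
$ bin/neo4j start
----
```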

目前下面的语言的高亮是支持的:

Bash
Cypher
Groovy
Java
JavaScript
Python
XML
更多我们能使用的高亮语言,请参考: http://alexgorbatchev.com/SyntaxHighlighter/manual/brushes/ 。
源代码中抽取
代码可以自动地从源文件中抽取。你需要定义:

component: Maven 坐标的 artifactId。
source: 指向部署的 jar 文件内的文件路径。
classifier: sources 或者 test-sources,或者任何指向构件的分类。
tag: 用于在文件中搜索的标签。
代码语言,如果需要相应的语法高亮显示。
注意构件必须作为手册项目的 Maven 依赖被引入,以便文件可以被找到。

注意标签 "ABC" 将匹配 "ABCD"。有一个非常简单的 on/off 开关,可以让一个标签的多次出现被组装进一个单独的代码片段输出。这个行为可以在引入代码范例测试时,用来隐藏其中的断言。
这是一个定义如何包含代码片段的范例:

[snippet,java]
----
component=neo4j-examples
source=org/neo4j/examples/JmxTest.java
classifier=test-sources
tag=getStartTime
----

输出结果:

private static Date getStartTimeFromManagementBean(
GraphDatabaseService graphDbService )
{
GraphDatabaseAPI graphDb = (GraphDatabaseAPI) graphDbService;
Kernel kernel = graphDb.getSingleManagementBean( Kernel.class );
Date startTime = kernel.getKernelStartTime();
return startTime;
}
查询结果
对于 Cypher 查询结果有一个特殊的过滤器。这是如何标记一个查询结果:

.结果
[queryresult]
----
+----------------------------------+
| friend_of_friend.name | count(*) |
+----------------------------------+
| Ian |2 |
| Derrick |1 |
| Jill |1 |
+----------------------------------+
3 rows, 12 ms
----

输出结果:

.结果

friend_of_friend.name | count(*)
Ian | 2
Derrick | 1
Jill | 1

3 行, 12 毫秒

29.3.13. 基于文档测试的一个简单 Java 范例


对于 Java,有几个预置工具可以把代码和文档一起维护在 JavaDoc 和代码片段中,并由它们生成供工具链其余部分使用的 Asciidoc 文档。

为了说明这个,看看下面的文档,用文件 hello-world-title.asciidoc 生成的内容是:


[[examples-hello-world-sample-chapter]]
Hello world 范例章节
================

这是一个范例文档测试,演示了将代码和其他构件变成 Asciidoc 表单的不同方法。生成文档的标题由方法名称决定,并用 " " 取代 "+_+"。

Below you see a number of different ways to generate text from source,
inserting it into the JavaDoc documentation (really being Asciidoc markup)
via the +@@+ snippet markers and programmatic adding with runtime data
in the Java code.

- The annotated graph as http://www.graphviz.org/[GraphViz]-generated visualization:

.你好世界图
["dot", "Hello-World-Graph-hello-world-Sample-Chapter.svg", "neoviz", ""]
----
N1 [
label = "{Node\[1\]|name = \'you\'\l}"
]
N2 [
label = "{Node\[2\]|name = \'I\'\l}"
]
N2 -> N1 [
color = "#2e3436"
fontcolor = "#2e3436"
label = "know\n"
]
----

- 一个 Cypher 范例查询:

[source,cypher]
----
START n = node(2)
RETURN n
----

- 一个范例文本输出片段:

[source]
----
Hello graphy world!
----

- a generated source link to the original GIThub source for this test:

https://github.com/neo4j/community/blob/{neo4j-git-tag}/embedded-examples/src/test/java/org/neo4j/exa
mples/DocumentationTest.java[DocumentationTest.java]

- The full source for this example as a source snippet, highlighted as Java code:

[snippet,java]
----
component=neo4j-examples
source=org/neo4j/examples/DocumentationTest.java
classifier=test-sources
tag=sampleDocumentation
----

以上是这个章节的全部。

这个文件通过下面的代码被引入到这个文档中:

:leveloffset: 3
include::{docdir}/examples/dev/examples/hello-world-sample-chapter.asciidoc[]

输出结果:
29.3.14. Hello world 范例章节
这是一个范例文档测试,演示了将代码和其他构件变成 Asciidoc 表单的不同方法。生成文档的标题
由方法名称决定,并用 " " 取代 "_" 。

Below you see a number of different ways to generate text from source, inserting it into the JavaDoc
documentation (really being Asciidoc markup) via the @@ snippet markers and programmatic adding with
runtime data in the Java code.

The annotated graph as GraphViz-generated visualization:


图 29.1. 你好世界图

一个 Cypher 范例查询:
START n = node(2)
RETURN n

一个范例文本输出片段:
Hello graphy world!

a generated source link to the original GIThub source for this test:
DocumentationTest.java

The full source for this example as a source snippet, highlighted as Java code:
// START SNIPPET: _sampleDocumentation
package org.neo4j.examples;

import org.junit.Test;
import org.neo4j.kernel.impl.annotations.Documented;
import org.neo4j.test.GraphDescription.Graph;

import static org.neo4j.visualization.asciidoc.AsciidocHelper.*;

public class DocumentationTest extends AbstractJavaDocTestbase
{
/**
* This is a sample documentation test, demonstrating different ways of
* bringing code and other artifacts into Asciidoc form. The title of the
* generated document is determined from the method name, replacing "+_+" with
* " ".
*
* Below you see a number of different ways to generate text from source,
* inserting it into the JavaDoc documentation (really being Asciidoc markup)
* via the +@@+ snippet markers and programmatic adding with runtime data
* in the Java code.
*
* - The annotated graph as http://www.graphviz.org/[GraphViz]-generated visualization:
*
* @@graph
*
* - A sample Cypher query:
*
* @@cypher
*
* - A sample text output snippet:
*
* @@output
*
* - a generated source link to the original GIThub source for this test:
*
* @@github
*
* - The full source for this example as a source snippet, highlighted as Java code:
*
* @@sampleDocumentation
*
* This is the end of this chapter.
*/
@Test
// signaling this to be a documentation test
@Documented
// the graph data setup as simple statements
@Graph( "I know you" )
// title is determined from the method name
public void hello_world_Sample_Chapter()
{
// initialize the graph with the annotation data
data.get();
gen.get().addTestSourceSnippets( this.getClass(), "sampleDocumentation" );
gen.get().addGithubTestSourceLink( "github", this.getClass(), "neo4j/community",
"embedded-examples" );

gen.get().addSnippet( "output",
createOutputSnippet( "Hello graphy world!" ) );

gen.get().addSnippet(
"graph",
createGraphVizWithNodeId( "Hello World Graph", graphdb(),
gen.get().getTitle() ) );
// A cypher snippet referring to the generated graph in the start clause
gen.get().addSnippet(
"cypher",
createCypherSnippet( "start n = node(" + data.get().get( "I" ).getId()
+ ") return n" ) );
}
}
// END SNIPPET: _sampleDocumentation

以上是这个章节的全部。
29.3.15. 集成远程控制台
一个交互控制台可以被加入并出现在在线 HTML 版本中。可以增加一个可选的标题,用作按钮的文本。

这里是做法,使用 Geoff 来定义数据,使用空行来从查询中分割它们:

.与黑客帝国交互的范例
[console]
----
(A) {"name" : "Neo"};
(B) {"name" : "Trinity"};
(A)-[:LOVES]->(B)

start n = node(*)
return n
----

输出结果:
29.3.16. 工具链
当配置 docbook 工具链时一些有用的链接:

http://www.docbook.org/tdg/en/html/docbook.html
http://www.sagehill.net/docbookxsl/index.html
http://docbook.sourceforge.net/release/xsl/1.76.1/doc/html/index.html
http://docbook.sourceforge.net/release/xsl/1.76.1/doc/fo/index.html

A.1. neo4j

A.1.1. 名称
neo4j - Neo4j 服务器控制和管理

A.1.2. 语法

neo4j <command>

A.1.3. 描述
Neo4j 是一个图数据库,在高度关联的数据中拥有很好的性能。

A.1.4. 命令

console
将服务器作为一个前台进程启动运行,停止服务器请使用 `CTRL-C`。
start
以后台服务形式启动服务器。
stop
停止一个后台运行的服务器。
restart
重启服务器
status
返回当前运行的服务器状态。
install
安装服务器作为一个平台相关的系统服务。
remove
卸载系统服务。
info
显示配置信息,比如当前的 NEO4J_HOME 和 CLASSPATH。

A.1.5. 用法 - Windows
Neo4j.bat
双击 Neo4j.bat 脚本,或者在命令行中运行它来启动服务器。在命令行窗口按下 Control-C 来退出服务。

Neo4j.bat install/remove

Neo4j 可以被安装为 Windows 系统服务而不显示控制台窗口。你需要使用管理员权限来运行这个脚本。Neo4j.bat 脚本包括如下命令选项:

• Neo4j.bat install - 安装作为一个 windows 系统服务

o 将安装作为一个 windows 系统服务


• Neo4j.bat remove - 移除 Neo4j 服务

o 将停止和移除 Neo4j 服务
• Neo4j.bat start - 启动 Neo4j 系统服务

o 如果安装了 Neo4j 系统服务,则会启动,

o 否则只会启动一次基于命令行的 Neo4j 服务。

• Neo4j.bat stop - 如果 Neo4j 系统服务在运行则停止它

• Neo4j.bat restart - 如果 Neo4j 系统服务已经安装则重启它


• Neo4j.bat status - 报告 Neo4j 服务的运行状态

o 返回 RUNNING, STOPPED 或者 NOT INSTALLED

A.1.6. 涉及到的配置文件

conf/neo4j-server.properties
服务器配置。
conf/neo4j-wrapper.conf
服务封装配置

conf/neo4j.properties
数据库调优配置

A.2. neo4j-shell

A.2.1. 名称
neo4j-shell - 浏览和维护一个图数据库的命令行工具

A.2.2. 语法
neo4j-shell [远程选项]

neo4j-shell [本地选项]

A.2.3. 描述
Neo4j 命令行是一个用于浏览图数据库的命令行工具,非常像 Unix 命令行,可以使用像 cd,
ls 和 pwd 等命令浏览本地文件系统一样的浏览图数据库。命令行可以直接连接到图数据库。
要访问一个正被其他进程使用的本地数据库,请使用只读模式。

A.2.4. 远程选项

-port PORT
连接到主机的端口(默认:1337)。
-host HOST
连接到的主机的域名或者 IP 地址( 默认:localhost)。
-name NAME
RMI 名称, 比如:rmi://<host>:<port>/<name> (默认: shell)。
-readonly
以只读模式访问数据库

A.2.5. 本地选项

-path PATH
数据库的目录。 如果在这个目录没有数据库存在,则会自动创建一个新的数据
库。
-pid PID
连接到的进程编号。
-readonly
以只读模式访问数据库。
-c COMMAND
在命令行执行命令,执行完成后命令行会退出。

-config CONFIG
Neo4j 配置文件路径。

A.2.6. 范例
远程使用范例:

neo4j-shell
neo4j-shell -port 1337
neo4j-shell -host 192.168.1.234 -port 1337 -name
shell
neo4j-shell -host localhost -readonly

本地使用范例:

neo4j-shell -path /path/to/db


neo4j-shell -path /path/to/db -config
/path/to/neo4j.config
neo4j-shell -path /path/to/db -readonly

A.3. neo4j-backup

A.3.1. 名称
neo4j-backup - Neo4j 备份工具

A.3.2. 语法
neo4j-backup {-full|-incremental} -from 来源数据库地址 -to 目标目录名称 [-cluster 集群
名称]

A.3.3. 描述
它是一个通过网络将运行在线上的数据库备份到本地文件系统的工具。备份可以是完整或
者增量备份。第一次备份必须是完整备份,以后可以采用增量备份。

源地址是一个按指定格式构成的类 URI 地址,目标地址是本地文件系统中的目录。

A.3.4. 备份类型

-full
拷贝完整的数据库到一个目录。
-incremental
增量备份,只拷贝自上次备份以后的变化部分到一个已经存在的备份存储中。

A.3.5. 数据源地址
备份源地址格式如下:

<运行模式>://<主机>[:<端口>][,<主机>[:<端口>]]…

注意可以同时定义多个主机。

运行模式
'single' 或者 'ha'。'ha' 用于高可用模式,而 'single' 则适合独立数据库。
主机
在单机模式下,主机就是数据库服务器的地址;在高可用模式下,主机是协调器的地址。注意当使用高可用模式时可以指定多个主机。
端口
在单机模式下,端口是源数据库备份服务的端口;在高可用模式下,端口是一个协调器实例备份服务的端口。如果没有给定端口,默认端口是 6362。
集群名称

-cluster
如果你给你的高可用集群指定了一个集群名称,你需要在备份的时候指定它。增加这个配置参数:-cluster my_custom_cluster_name

重点

只有设置了配置参数 enable_online_backup=true 以后,才能在数据库上执行备份。这将在默认端口 (6362) 启用备份服务。如果要在其他端口启用备份服务,请通过参数 enable_online_backup=port=9999 配置。

A.3.6. 用法 - Windows

Neo4jBackup.bat 脚本的用法跟 Linux 版本是一样的。

范例

[source,shell]

# 执行一次完整备份
neo4j-backup -full -from single://192.168.1.34 -to /mnt/backup/neo4j-backup

# 执行一次增量备份
neo4j-backup -incremental -from single://freja -to /mnt/backup/neo4j-backup

# 在一个自定义端口执行一次增量备份
neo4j-backup -incremental -from single://freja:9999 -to /mnt/backup/neo4j-backup

# 指定两个可能的协调器,从高可用集群执行一次完整备份
./neo4j-backup -full -from ha://oden:2181,loke:2181 -to /mnt/backup/neo4j-backup

# 指定一个可能的协调器,从高可用集群执行一次增量备份
./neo4j-backup -incremental -from ha://oden:2181 -to /mnt/backup/neo4j-backup

# 用一个指定的集群名称从高可用集群执行一次增量备份
# (名称通过 ha.cluster_name 配置)
./neo4j-backup -incremental -from ha://balder:2181 -to /mnt/backup/neo4j-backup -cluster my-cluster

从备份中恢复

Neo4j 的备份就是一个功能完整的数据库。要使用你的备份,只需要用备份的数据库目录替换正式数据库目录即可。

A.4. neo4j-coordinator

A.4.1. 名称
neo4j-coordinator - 为 Neo4j 的高可用性集群提供的协调员

A.4.2. 语法

neo4j-coordinator <command>

A.4.3. 描述
Neo4j 协调器是一个为 Neo4j 高可用模式数据库集群提供协调管理的服务器。一个"Neo4j 协调器集群"必须在"数据库集群"之前启动。这个服务器是集群服务器之一。

A.4.4. 命令

console
启动服务器作为一个应用在前端运行,使用命令 CTRL-C 来结束运行。
start
启动服务器作为一个应用作为后台服务运行。
stop
停止一个在后台运行的服务器。
restart
重启一个正在运行的服务器。
status
返回当前服务器状态。
install
安装服务器作为系统服务运行。
remove
卸载系统服务。

A.4.5. 涉及到的相关配置文件

conf/coord.cfg
协调服务器配置。
conf/coord-wrapper.cfg
协调服务器系统服务配置。
data/coordinator/myid
协调服务器实例的标识符。

A.5. neo4j-coordinator-shell

A.5.1. 名称
neo4j-coordinator-shell - Neo4j 协调器命令行交互接口

A.5.2. 语法

neo4j-coordinator-shell -server <host:port> [<cmd> <args>]

A.5.3. 描述
Neo4j 协调器命令行交互接口提供了一个与一个运行的 Neo4j 协调服务器交互的接口。

A.5.4. 选项

-server HOST:PORT
通过提供服务器和端口连接到指定的 Neo4j 协调服务器。

附录 B. 常见问题
neo4j 数据库支持最大多少个节点?最大支持多少条边?
目前的容量上限大约是 344 亿个节点,344 亿条关系,和 687 亿条属性。

neo4j 数据库支持的最复杂的连接是什么?(比如每个节点都与其他任何一个节点
相连)
可以从上面的数字得出理论上的极限:那基本上就是一张有 262144 个节点和 34359607296 条关系的图。我们从来没有见过这样的使用情况。

在数据库中,读/写性能跟节点/边的数量有关吗?
这实际上是两个不同的问题。单次读/写操作的性能不依赖于数据库的大小。不管数据库
是有 10 个节点还是有 1 千万个都一样。 — 然而,有一个事实是如果数据库太大,你的
内存可能无法完全缓存住它,因此,你需要频繁的读写磁盘。虽然很多用户没有这样大
尺寸的数据库,但有的人却有。如果不巧你的数据库达到了这个尺寸,你可以扩展到多
台机器上以减轻缓存压力。

neo4j 数据库支持的读/写并发请求最大数量是多少呢?
在并发请求上面没有任何限制。服务器的并发量更多的是依赖于操作本身的性能(高
压写操作,简单读,复杂的遍历等等),以及使用的硬件性能。据粗略估计,在遍历最
简单路径时每毫秒可以达到 1000 次请求。在讨论了指定的用户案例后,我们能得到更好
的性能优化方案。

在数据库集群环境中数据一致性如何保证的呢?
主从复制。从服务器从主服务器拉取数据变化。拉取间隔可以在每个从服务器上进行配置,从毫秒到分钟,根据你自己的需要来定。HA 也支持通过从服务器进行写操作。当写发生时,从服务器会先追上主服务器的最新状态,然后写入在从服务器和主服务器上一起完成。其他从服务器照常同步。

当在一个数据库中发生更新操作时如何快速更新其他所有服务器呢?
拉取间隔在每个从服务器上进行配置,从几秒到几分钟不等,根据需求而定。当通过一个从服务器写入时,从服务器会在写之前立即与主服务器同步。一般的读写负载并不影响从服务器的同步工作。一次繁重的写操作会给从服务器的文件系统带来巨大压力,与此同时从服务器还要拉取同步数据。实际上,我们不认为这会成为一个值得关注的问题。

在集群环境中,随着服务器数量增加,延迟会按比例增长吗?
在集群中从服务器超过 10 台的规模时,我们预计来自从服务器的大量拉取请求会影响主服务器的性能。集群中的写操作会因此受影响,而读操作依然保持线性扩展。

支持在线扩展吗?换句话说,如果我们想新加入一台服务器到集群中需要关闭所有
服务器吗?
新的从服务器可以在不停止或者重启整个集群的情况下,加入到一个已经存在的集群中。我们的 HA 协议会接纳新加入的服务器。从服务器也可以简单地通过关闭自己来从集群中移除。

新加入一台服务器到全部同步需要多长时间?
我们推荐在加入从服务器之前,先用一个最近的数据库快照来初始化它,一般通过备份来完成。这样从服务器只需要同步最近的更新,通常只是很短时间内产生的数据。

重启需要多久呢?
如果重启的意思是关闭集群然后再启动它,那完全取决于你打字的速度,一般是 10 秒的样子。Neo4j 的缓存不会自动预热,但操作系统的文件系统缓存不会被重置。

是否有备份恢复机制?
Neo4j 企业版提供了一个在线备份(完整备份和增量备份)功能。

是否支持跨区集群?跨区集群是否比同区集群性能更低呢?
我们有用户在 AWS 上测试了多区域部署的情况。跨区域部署对集群管理的效率和协议同步有一定影响。集群管理中的高延迟会触发主服务器的频繁重选,从而拖慢整个集群。对跨区部署的支持,以后还需要大量改进。

是否有任何指定测控策略用于环境建立之类的需求?
关于这个话题我们有更深入的探讨。

写数据库是线程安全的吗?
不管在单服务器模式还是 HA 模式,数据库在更新之前都通过锁定节点和关系来保证线程安全。

从 HA 集群读数据最好的策略是什么?
1. 保持会话粘性。
2. 在响应中直接发回写入的数据,避免再用独立的请求读回它。
3. 当操作需要时,强制从服务器先从主服务器做一次拉取更新。

对于获取(如果不存在则创建)这种需求最好的策略是什么?
1. 单线程处理。
2. 如果不存在,在一个公共节点上使用悲观锁。
3. 如果不存在,乐观地创建它,之后再检查去重。

锁定是如何工作的?
悲观锁。读数据时并不需要锁,写操作也不会阻塞读操作。不需要任何显式的锁操作就可以读取数据,这一点非常重要。当一个节点或者关系被修改或者新增时,会自动获取写锁,也可以通过显式的锁设置来获取。锁被用来提供可重复读语义和保证必要的数据一致性。

数据存储占用空间如何?
Neo4j 当前并不适合存储 BLOB/CLOB。节点、关系和属性并不是保存在磁盘上的同一个地方。这方面将来会有进一步的改进。

数据库索引怎么样?
Neo4j 支持对属性建立复杂的索引。额外的索引功能超出了图本身的范围。Lucene 引擎管理独立分页的索引,并需要一些空间来存储自动索引以及受管理的具名索引(通过 API 访问)。

我如何进行数据库查询?
核心 API,Traversal API,REST API,Cypher,Gremlin。

Neo4j 有日志(在数据丢失时可以恢复数据)功能吗?
在 HA 集群环境中,通过主从服务器之间的写增量来完成。

我如何提升 Neo4j 的性能?

采用内存映射来存储 Neo4j 文件。Neo4j 的缓存策略解释如下:
• 软引用缓存:软引用会在 GC 认为需要时被清理。适合在应用负载不高时使用。
• 弱引用缓存:在任何一次 GC 中弱引用都可能被清理。适合在读取大量数据或者做遍历操作时使用。
• 强引用缓存:所有的节点和关系都会保存在内存中。高负载下 JVM 会出现长时间的 GC 暂停,比如长达半分钟。更大的堆是好事,然而 12G 或者更大的堆对于 GC 来说是不切实际的。与从磁盘获取数据相比,内存映射文件缓存会提供 100 倍的性能,而 Java 堆内缓存则是 1000 倍。

主从服务器之间的 ACID 事务是怎样的?

事务首先从发起写入的从服务器同步到主服务器,最终再从主服务器同步到其他从服务器。通过死锁探测来支持来自多个从服务器的并发事务。从数据完整性的角度看是一致的,但最终一致性需要从多个角度来考虑。

独立服务器怎么样?
REST API 是完全无状态的,但它可以通过批量操作来支持更大的事务范围。关于线程池和每个 socket 的线程:对于独立服务器和 HA 模式,Neo4j 采用 Jetty 来管理连接线程池(比如在 HA 集群中每个节点 25 个线程)。

在 HA 环境中如何使用负载均衡?
通常会写一个小型的服务器扩展,根据机器是主还是从返回 200 或者 404。负载均衡器用这个扩展来探测主从服务器的配置。只把写操作发送到从服务器,可以确保提交的事务至少存在于两个地方。

Neo4j 支持哪些监控手段?
Neo4j 目前没有内建的追踪和执行计划解释功能。JMX 是用于统计和监控的主要接口。线程转储可以用于调试。

我如何导入数据到 Neo4j 中?
Neo4j 批量插入器用于初始化一个数据库。批量插入完成后,该存储可以用于嵌入模式或者 HA 环境。与传统 SQL 服务器之间的直接数据交换目前没有官方支持。
