
ArangoDB Cluster Administration

What you will learn


‣ Anatomy

What is an ArangoDB cluster? How does it work?

‣ Behind the scenes



What are the components? How to bootstrap?

‣ Concepts 

Storage engine, databases, collections, shards, indexes, authentication

‣ Cluster maintenance

Startup, shutdown, maintenance, backup and restore, rolling upgrade

‣ Resilience

Monitoring, failover, emptying a server for maintenance

‣ Troubleshooting

Log files, disaster recovery

Copyright © ArangoDB GmbH, 2018 2


Table of Contents
‣ Anatomy (p.6)
 ‣ What is an ArangoDB cluster? (p.7)
 ‣ How does it work? (p.12)
‣ Behind the scenes (p.19)
 ‣ What are the components? (p.21)
 ‣ How to bootstrap? (p.27)
‣ Concepts (p.33)
‣ Cluster maintenance (p.40)
 ‣ Databases (p.41)
 ‣ Collection (p.47)
 ‣ Indexes (p.49)
 ‣ Backup and Restore (p.58)
 ‣ Upgrade (p.61)
‣ Resilience (p.65)
 ‣ Monitoring and failover (p.66)
 ‣ Emptying a server for maintenance (p.72)
‣ Troubleshooting (p.73)
 ‣ Logging (p.74)
 ‣ Disaster recovery (p.75)
‣ Final tasks (p.82)
 ‣ Support ArangoDB (p.83)

Copyright © ArangoDB GmbH, 2018 3


Cluster nomenclature
Some nomenclature up front

‣ Shard

A partition of collection data. Every cluster-wide collection is sharded by the collection's
_key attribute by default. Any access to a collection is routed according to its sharding
key.

‣ Agency

Arangospeak for the central configuration store and supervision process of a cluster.
The agency consists of a typically small number of ArangoDB instances, which
establish consensus by means of the RAFT* consensus protocol. The agency is the
animal brain of the cluster. 


* Diego Ongaro and John Ousterhout: In Search of an Understandable Consensus Algorithm. Proceedings of the 2014 USENIX Annual Technical Conference (USENIX ATC '14), pages 305-320
Copyright © ArangoDB GmbH, 2018 4
Cluster nomenclature
‣ Supervision

A service that resides in the agency and monitors the cluster's overall health. It
enacts both its own jobs and user-requested jobs through configuration changes, for example
during hardware failover and shard movement.

Copyright © ArangoDB GmbH, 2018 5


Anatomy

What is an ArangoDB cluster? How does it work?

Copyright © ArangoDB GmbH, 2018 6


Anatomy - Cluster scheme

Copyright © ArangoDB GmbH, 2018 7


Anatomy - Instances
‣ An ArangoDB cluster consists of multiple ArangoDB database instances, which
are specialised for their respective purposes.
‣ Database servers

Home to the databases' collection shards. This is where the cluster's data resides. DB servers are
never accessed directly from outside of the cluster. The number of DB servers can be adjusted
dynamically and there is no hard limit on their total number during the lifetime of a cluster.
‣ Coordinators

Accessible front-end of a cluster. Home to cluster-wide deployment of Foxx services. The number
of deployed coordinators can be adjusted dynamically and is not limited.
‣ Agents

The agents form the agency (see Nomenclature). Agents are never accessed from outside of the
cluster. Large agencies do not add significantly to fault tolerance and resilience. Even in
very large deployments the number of agents should stay in the single digits.

Copyright © ArangoDB GmbH, 2018 8


Anatomy - Database servers
‣ Database servers ...
‣ ... are regular ArangoDB instances.
‣ ... hold a local database for every cluster-wide database.
‣ ... hold a local collection for every cluster-wide shard that has been assigned to them.
‣ ... constantly monitor the agency for changes affecting them, identified by their UUID.
‣ ... serve every shard assigned to them, either as master or as follower.
‣ ... perform synchronous replication as master or follower, depending on their role for a shard.

Copyright © ArangoDB GmbH, 2018 9


Anatomy - Coordinators
‣ Coordinators ...
‣ ... serve the front-end of the cluster towards services / users.
‣ ... bootstrap a cluster.
‣ ... create / remove / alter cluster-wide entities such as databases, collections, indexes etc.
‣ ... analyse incoming requests to break them down to individual requests for db servers.
‣ ... compile / aggregate result sets from db servers to return to clients.
‣ ... serve and replicate cluster-wide Foxx services

Copyright © ArangoDB GmbH, 2018 10


Anatomy - Agents
‣ Agents ...
‣ ... gossip initially to find each other on the network.
‣ ... once the pool is complete, enact the RAFT consensus protocol.
‣ ... hold the state of the cluster.
‣ ... serve the internal API that exposes the state machine to coordinators and DB servers.
‣ ... supervise cluster nodes other than agents.

Copyright © ArangoDB GmbH, 2018 11


Anatomy - Deployment scenarios
‣ Budget configuration (3 servers)

3 DB servers, 3 agents, 1 coordinator. Classical setup with no particular emphasis on data size or performance.

‣ Coordinator emphasis

#coordinators > #db servers, 3 agents. This approach works best in cases where you need more CPU power for Foxx
services. Foxx services run on coordinators. By adding machines that only host coordinators, you allow for greater CPU
usage for these services.

‣ DB server emphasis

#coordinators < #db servers, 3 agents. This approach is useful in cases where you need to increase data capacity more
than you need to improve query performance. The smaller number of coordinators that receive client connections may
become a minor bottleneck.

‣ Application coordinators

#coordinators == #application servers. Coordinators run on the same machines that serve the application, separating the
coordinators from the DB servers and agents. This avoids the network hop between the application and the coordinators,
decreasing latency.
Copyright © ArangoDB GmbH, 2018 12
Anatomy - Bootstrapping a cluster
The go-to method of creating an ArangoDB cluster is the ArangoDB starter (arangodb).

‣ Initial configuration
‣ Service discovery
‣ Cluster startup
‣ Cluster shutdown
‣ Disaster recovery

Copyright © ArangoDB GmbH, 2018 13


Anatomy - Single machine test environment
Start a local cluster on one machine for testing purposes

hosta > arangodb --starter.data-dir=db1 > db1.log &



hosta > arangodb --starter.data-dir=db2 --starter.join ::1 > db2.log & 

hosta > arangodb --starter.data-dir=db3 --starter.join ::1 > db3.log &

The above will create an ArangoDB cluster with 3 instances each of agent, DB server and
coordinator. The 3 coordinators are reachable at

http://[::]:8529, http://[::]:8534 and http://[::]:8539
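
Each coordinator can be checked quickly from the shell. A minimal sketch, assuming authentication is disabled on this test cluster (otherwise add -u root:<password>):

hosta > curl http://[::1]:8529/_api/version
hosta > curl http://[::1]:8534/_api/version
hosta > curl http://[::1]:8539/_api/version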

Copyright © ArangoDB GmbH, 2018 14


Anatomy - Single machine test environment
All three coordinators would now display the following dashboard

The dashboard shows a rough overview of the current cluster and its operation

Copyright © ArangoDB GmbH, 2018 15


Anatomy - Single machine test environment
Navigating to the Node view, all but the agent nodes are accounted for.

Note that all instances are running on localhost with different ports.

Copyright © ArangoDB GmbH, 2018 16


Anatomy - Multi machine setup
3 Linux machines: Start a starter on each one:


ssh c11 "arangodb --starter.data-dir=db > db.log &"


 ssh c12 "arangodb --starter.data-dir=db --starter.join c11:8528 > db.log &" 

ssh c13 "arangodb --starter.data-dir=db --starter.join c11:8528 > db.log &"


The above will create an ArangoDB cluster with 3 instances each of agent, DB server and
coordinator, distributed evenly over the machines c11, c12 and c13. The three
coordinators are reachable at

http://c11:8529, http://c12:8529 and http://c13:8529


Please note the differences to slide #8

Copyright © ArangoDB GmbH, 2018 17


Anatomy - Multi machine setup
Navigating to the Node view, compare the endpoints with slide #11.

The endpoints are no longer on the same machine; instead they are on the same port on different machines

Copyright © ArangoDB GmbH, 2018 18


Behind the scenes

What are the components? How to bootstrap?

Copyright © ArangoDB GmbH, 2018 19


Behind the scenes
‣ Starter process starts an agent on each server
‣ Initially the agents, once started, try to gossip a complete list of peers, 3 here.
‣ Once the list is complete and locally persisted, the agents will start to enact the
RAFT consensus protocol in the following order:
‣ Leader election.
‣ Assembly of the state machine from persisted logs, if any.
‣ Establishment of the replication process to followers.
‣ Serving of the agency's REST handlers (at this point the agency is ready to serve requests).
‣ Start of the supervision after --agency.supervision-grace-period, monitoring the cluster nodes.
‣ The process above is repeated every time a leader loses communication with a
majority of its peers.
Copyright © ArangoDB GmbH, 2018 20
Behind the scenes - Agency
‣ The agency ...
‣ ... is the animal brain of the cluster
‣ ... will serve as long as a majority of the agent instances can communicate and
consequently is able to uphold the RAFT process.
‣ ... must be up and serving at all times for the coordinators and DB servers to operate.
Every serving unit that cannot communicate with the agency stops serving until
communication is reestablished.
‣ ... implements the RAFT consensus protocol to ensure fault-tolerance and resilience
towards network partitions and hardware failures.
‣ ... speaks with one voice, i.e. only the RAFT leader answers to requests. The remaining
agents always redirect to the current leader.
‣ ... is a bottleneck by design as every call is replicated from the leader to followers.

Copyright © ArangoDB GmbH, 2018 21


Behind the scenes - Coordinators
‣ Coordinators …
‣ … start up with initialisation of local database files.
‣ … attempt to access the agency in short intervals until the RAFT process is underway.
‣ … the first to access the agency bootstraps the cluster’s skeleton in the agency.
‣ … wait until synchronous replication of collections is completed.
‣ … start serving the cluster’s frontend.

Copyright © ArangoDB GmbH, 2018 22


Behind the scenes - DB Servers
‣ The starter started 3 database servers specifying endpoints to the agents. Each …


‣ … starts up with initialisation of local database files on first start.


‣ … attempts to access the agency in short intervals until the RAFT process is underway.
‣ … bootstraps information to the agency of its readiness.
‣ … starts implementing what is planned for it in the agency.
‣ Database creation/dropping
‣ Shard creation/dropping/updating
‣ Indexes
‣ Synchronous replication as leader or follower
‣ … reports readiness

Copyright © ArangoDB GmbH, 2018 23


Behind the scenes - Agency ( agent#1)
Behind the scenes, the starter (arangodb) created 9 processes (arangod), 3 on each machine, one of them an agent:


/usr/sbin/arangod \
  -c /opt/kaveh/db/agent8531/arangod.conf \
  --database.directory /opt/kaveh/db/agent8531/data \
  --javascript.startup-directory /usr/share/arangodb3/js \
  --javascript.app-path /opt/kaveh/db/agent8531/apps \
  --log.file /opt/kaveh/db/agent8531/arangod.log \
  --log.force-direct false \
  --foxx.queues false \
  --server.statistics false \
  --agency.activate true \
  --agency.my-address tcp://c11:8531 \
  --agency.size 3 \
  --agency.supervision true \
  --agency.endpoint tcp://192.168.10.12:8531 \
  --agency.endpoint tcp://192.168.10.13:8531

The key information for agents is the agency.* parameters above. Also note that --agency.endpoint and
--agency.my-address change for the next two machines on the following slides.
Copyright © ArangoDB GmbH, 2018 24
Behind the scenes - Agency (agent#2)
Note subtle differences in endpoints on c12


/usr/sbin/arangod \
  -c /opt/kaveh/db/agent8531/arangod.conf \
  --database.directory /opt/kaveh/db/agent8531/data \
  --javascript.startup-directory /usr/share/arangodb3/js \
  --javascript.app-path /opt/kaveh/db/agent8531/apps \
  --log.file /opt/kaveh/db/agent8531/arangod.log \
  --log.force-direct false \
  --foxx.queues false \
  --server.statistics false \
  --agency.activate true \
  --agency.my-address tcp://c12:8531 \
  --agency.size 3 \
  --agency.supervision true \
  --agency.endpoint tcp://192.168.10.11:8531 \
  --agency.endpoint tcp://192.168.10.13:8531


Copyright © ArangoDB GmbH, 2018 25


Behind the scenes - Agency (agent#3)
Note subtle differences in endpoints on c13


/usr/sbin/arangod \
  -c /opt/kaveh/db/agent8531/arangod.conf \
  --database.directory /opt/kaveh/db/agent8531/data \
  --javascript.startup-directory /usr/share/arangodb3/js \
  --javascript.app-path /opt/kaveh/db/agent8531/apps \
  --log.file /opt/kaveh/db/agent8531/arangod.log \
  --log.force-direct false \
  --foxx.queues false \
  --server.statistics false \
  --agency.activate true \
  --agency.my-address tcp://c13:8531 \
  --agency.size 3 \
  --agency.supervision true \
  --agency.endpoint tcp://192.168.10.11:8531 \
  --agency.endpoint tcp://192.168.10.12:8531


Copyright © ArangoDB GmbH, 2018 26


Behind the scenes - Agency startup parameters
The key parameters of agents to look for:

--agency.activate Mandatory parameter on every agent to activate the agency feature.

--agency.size Mandatory initial parameter to define the agency's size.

--agency.my-address The address of this agent on the cluster's network. This parameter must be
specified explicitly, as infrastructures often come with multiple networks and an
automated choice would often be wrong.

--agency.endpoint This parameter can be specified multiple times. It is a means to point a starting
agent to one or more of its peers. The same peer, a circular pattern or all peers
may be specified at start time. The key point is that all agents are able to
complete their list of peers through gossiping.
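
A minimal sketch of how these parameters combine for a single agent of a 3-agent agency; hostnames, the port 8531 and the data directory are placeholders taken from the slides above:

arangod --agency.activate true \
        --agency.size 3 \
        --agency.my-address tcp://c11:8531 \
        --agency.endpoint tcp://c12:8531 \
        --agency.endpoint tcp://c13:8531 \
        --server.endpoint tcp://0.0.0.0:8531 \
        --database.directory agent8531/data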

Copyright © ArangoDB GmbH, 2018 27


Behind the scenes - Agency before first election
‣ Agency’s configuration

curl [::]:8531/_api/agency/config

‣ configuration:
‣ agency size: The RAFT group’s size. When this many agents have initially found each other, the RAFT process will start.
‣ id: This agent’s id
‣ endpoint: This agent’s endpoint
‣ pool: List of all agents and their respective endpoints
‣ term: RAFT election term
‣ leaderId: The leader’s id. It is still empty here, as no election has taken place yet.

{
  "term": 0,
  "leaderId": "",
  "commitIndex": 0,
  "lastAcked": {},
  "configuration": {
    "pool": {
      "AGNT-90479f91-d21f-4c4f-be4d-b3b4ff27be0c": "tcp://localhost:8536",
      "AGNT-ad02b53f-5616-4bcd-851f-c803f8a0a82b": "tcp://localhost:8531",
      "AGNT-1864e184-a5c7-424e-8d54-85f5b78010ea": "tcp://localhost:8541"
    },
    "active": [
      "AGNT-90479f91-d21f-4c4f-be4d-b3b4ff27be0c",
      "AGNT-ad02b53f-5616-4bcd-851f-c803f8a0a82b",
      "AGNT-1864e184-a5c7-424e-8d54-85f5b78010ea"
    ],
    "id": "AGNT-ad02b53f-5616-4bcd-851f-c803f8a0a82b",
    "agency size": 3,
    "pool size": 3,
    "endpoint": "tcp://localhost:8531",
    "min ping": 1,
    "max ping": 5,
    "timeoutMult": 1,
    "supervision": true,
    "supervision frequency": 1,
    "compaction step size": 20000,
    "compaction keep size": 10000,
    "supervision grace period": 10,
    "version": 5,
    "startup": "origin"
  }
}


Copyright © ArangoDB GmbH, 2018 28
Behind the scenes - Agency after first election
‣ Agency’s configuration

curl [::]:8531/_api/agency/config

‣ configuration:
‣ agency size: The RAFT group’s size. When this many agents have initially found each other, the RAFT process will start.
‣ id: This agent’s id
‣ endpoint: This agent’s endpoint
‣ pool: List of all agents and their respective endpoints
‣ term: RAFT election term
‣ leaderId: The leader’s id. The agent at localhost:8536 has therefore won the election in the above first term.

{
  "term": 1,
  "leaderId": "AGNT-90479f91-d21f-4c4f-be4d-b3b4ff27be0c",
  "commitIndex": 507,
  "lastAcked": {},
  "configuration": {
    "pool": {
      "AGNT-90479f91-d21f-4c4f-be4d-b3b4ff27be0c": "tcp://localhost:8536",
      "AGNT-ad02b53f-5616-4bcd-851f-c803f8a0a82b": "tcp://localhost:8531",
      "AGNT-1864e184-a5c7-424e-8d54-85f5b78010ea": "tcp://localhost:8541"
    },
    "active": [
      "AGNT-90479f91-d21f-4c4f-be4d-b3b4ff27be0c",
      "AGNT-ad02b53f-5616-4bcd-851f-c803f8a0a82b",
      "AGNT-1864e184-a5c7-424e-8d54-85f5b78010ea"
    ],
    "id": "AGNT-ad02b53f-5616-4bcd-851f-c803f8a0a82b",
    "agency size": 3,
    "pool size": 3,
    "endpoint": "tcp://localhost:8531",
    "min ping": 1,
    "max ping": 5,
    "timeoutMult": 1,
    "supervision": true,
    "supervision frequency": 1,
    "compaction step size": 20000,
    "compaction keep size": 10000,
    "supervision grace period": 10,
    "version": 5,
    "startup": "origin"
  }
}


Copyright © ArangoDB GmbH, 2018 29
Behind the scenes - Agency’s leader
‣ Agency’s configuration (leader)

curl [::]:8536/_api/agency/config

‣ lastAcked: Last acknowledged contact with the particular agent, listed in seconds. Note that
all agents are listed, including the leader itself with its last acknowledged delay of 0.

‣ term: 1 indicates that this is a fresh cluster, which has never seen agency leadership changes or
restarts. This value grows over time. A high term indicates that many rounds of elections had to be
performed in the past, often an indicator of networking / management problems.

{
  "term": 1,
  "leaderId": "AGNT-90479f91-d21f-4c4f-be4d-b3b4ff27be0c",
  "commitIndex": 539,
  "lastAcked": {
    "AGNT-1864e184-a5c7-424e-8d54-85f5b78010ea": 0.098,
    "AGNT-ad02b53f-5616-4bcd-851f-c803f8a0a82b": 0.098,
    "AGNT-90479f91-d21f-4c4f-be4d-b3b4ff27be0c": 0
  },
  "configuration": {
    "pool": {
      "AGNT-ad02b53f-5616-4bcd-851f-c803f8a0a82b": "tcp://localhost:8531",
      "AGNT-90479f91-d21f-4c4f-be4d-b3b4ff27be0c": "tcp://localhost:8536",
      "AGNT-1864e184-a5c7-424e-8d54-85f5b78010ea": "tcp://localhost:8541"
    },
    "active": [
      "AGNT-ad02b53f-5616-4bcd-851f-c803f8a0a82b",
      "AGNT-90479f91-d21f-4c4f-be4d-b3b4ff27be0c",
      "AGNT-1864e184-a5c7-424e-8d54-85f5b78010ea"
    ],
    "id": "AGNT-90479f91-d21f-4c4f-be4d-b3b4ff27be0c",
    "agency size": 3,
    "pool size": 3,
    "endpoint": "tcp://localhost:8536",
    "min ping": 1,
    "max ping": 5,
    "timeoutMult": 1,
    "supervision": true,
    "supervision frequency": 1,
    "compaction step size": 20000,
    "compaction keep size": 10000,
    "supervision grace period": 10,
    "version": 5,
    "startup": "origin"
  }
}
Copyright © ArangoDB GmbH, 2018 30
Behind the scenes - Agency dump
‣ Agency dumps are very helpful in troubleshooting a cluster

curl [::]:8531/_api/agency/read -Ld '[["/"]]'
‣ The output is long, in large clusters very long, and is roughly divided into 4 sections:
‣ arango/Plan: reflects the planned state in which the cluster should be. It lists
databases, collections, db servers and coordinators as configured to operate within the
cluster.
‣ arango/Current: depicts the actual state the cluster is in. Again, it lists
databases, collections and participating servers. Deviations from the planned state are
either only temporary or mean trouble.
‣ arango/Health: keeps track of the current state of individual nodes.
‣ arango/Target: holds meta information about the cluster and is home to the
supervision's past, current and future jobs.
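
To keep the output manageable, a single section can be read instead of the whole tree. A sketch using the same read API as above, restricted to the planned and current sections:

curl [::]:8531/_api/agency/read -Ld '[["/arango/Plan"]]'
curl [::]:8531/_api/agency/read -Ld '[["/arango/Current"]]'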
Copyright © ArangoDB GmbH, 2018 31
Behind the scenes - Agency
‣ The entire API of the agency is documented here: 

https://docs.arangodb.com/latest/HTTP/Agency

‣ A very serious word of warning: it is trivial to break an ArangoDB
cluster with a single command to the agency's API, just as it is trivial to delete the
database directory and cause complete loss of data.
‣ The API is only to be handled with expert knowledge of ArangoDB clusters and
with utmost care.
‣ Needless to say: should it nevertheless be done, never manipulate the agency
without a complete backup of your cluster.

Copyright © ArangoDB GmbH, 2018 32


Concepts

Storage engine, databases, collections, shards, indexes, authentication

Copyright © ArangoDB GmbH, 2018 33


Concepts - Storage engine
‣ Each ArangoDB cluster node can be deployed to persist its data using different
storage engines
‣ MMFiles (memory-mapped files) is the overall faster mechanism for write and read
access to the data, as long as it fits in RAM
‣ RocksDB keeps only a hot set in memory, but in return offers superior
performance when the data exceeds the RAM size
‣ ArangoDB cluster admins are encouraged to choose the latter option, as clustered
instances are subject to random sharding by the supervision, which sometimes makes it
impossible to control the data distribution over the nodes.
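
The engine is chosen when an instance first starts. A sketch of the corresponding arangod startup option, assuming ArangoDB 3.2 or later:

arangod --server.storage-engine rocksdb ...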

Copyright © ArangoDB GmbH, 2018 34


Concepts - Databases
‣ Every database in the cluster is represented by a local database with the same
name on every DB server in the cluster
‣ The databases hold local collections, which correspond to global (cluster-wide)
shards.

Copyright © ArangoDB GmbH, 2018 35


Concepts - Collections
‣ Cluster collections are composed of local shards
‣ Data is distributed to the different shards by default via the _key attribute of every
document. With the randomised generation of _key, the data is distributed evenly
over all the shards of a collection. This is not the ideal distribution pattern for
every business logic.
‣ A large document store holding technical documentation for millions of parts of an aircraft
might be sharded more naturally by the model it is used in.
‣ It might, however, also be sharded by the manufacturer's id.
‣ Database admins may shard collections based on any attribute, as the sketch below shows.
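
A sketch in arangosh, using a hypothetical collection "parts" sharded by a "model" attribute as in the aircraft example above:

arangosh> db._create("parts", {numberOfShards: 6, shardKeys: ["model"]})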

Copyright © ArangoDB GmbH, 2018 36


Concepts - Shards
‣ Every shard is a local collection on a DB server
‣ Every shard may be replicated over multiple DB servers
‣ The replication ideally has only a minor effect on write latency, but no effect on
write throughput.
‣ The replication should have no negative, measurable effect on read performance.
‣ A DB server can only hold a single replica of a given shard.

Copyright © ArangoDB GmbH, 2018 37


Concepts - Indexes
‣ Indexes on fields of cluster collections correspond to indexes on the same fields
on the cluster's shards.
‣ Read performance can thus profit, scaling with the number of shards on different
machines.

Copyright © ArangoDB GmbH, 2018 38


Concepts - Authentication
‣ Authenticated access to ArangoDB clusters is transparent to
users. All access through the coordinators behaves in the same way as a single instance
would.
‣ DB servers and agents cannot be accessed by authenticating cluster users. Cluster
nodes communicate with each other in an authenticated cluster via JWT tokens.
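
For example, an authenticated request against a coordinator looks exactly like one against a single instance (user and address are placeholders):

curl -u root:<password> http://<coordinator>:8529/_api/version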

Copyright © ArangoDB GmbH, 2018 39


Cluster Maintenance

Startup, shutdown, maintenance, backup and restore, rolling upgrade

Copyright © ArangoDB GmbH, 2018 40


Cluster Maintenance - Create Database
‣ Every cluster database is created either through the UI, arangosh or the database
API on any coordinator.
‣ The coordinator commissions the creation of the database in the agency's plan section
‣ All database servers see the change and create the according local database
‣ After local creation, they report their changes in the current section of the agency
‣ The coordinators wait for completion of all db server tasks in the agency's current section
‣ The database becomes available on coordinators and can be used
‣ A cluster database is represented by a local database with the same name on
every db server.

Copyright © ArangoDB GmbH, 2018 41


Cluster Maintenance - Create Database
‣ Shell

arangosh> db._createDatabase("test_db")

‣ Rest API

curl <coordinator>/_api/database -d{"name":"test_db"}


The above actions will need a couple of seconds to finish, as in the background multiple db servers are enacting the
creation of the system collections and initially setting up their replication.


Copyright © ArangoDB GmbH, 2018 42


Cluster Maintenance - Create Database
‣ UI

Copyright © ArangoDB GmbH, 2018 43


Cluster Maintenance - Drop Database
‣ Every cluster database is dropped either through the UI, the arangosh shell or the
database API on any coordinator. Cannot be undone!
‣ The coordinator commissions the deletion of the database in the agency's plan
‣ All database servers see the change and drop the according local database
with all its collections
‣ After local drops complete, db servers report in the agency's current section
‣ The coordinators wait for all db server tasks to complete
‣ Database dropping is finished

Copyright © ArangoDB GmbH, 2018 44


Cluster Maintenance - Drop Database
‣ Shell:

arangosh> db._dropDatabase("test_db")

‣ Rest API:

curl <coordinator>/_api/database/test_db -XDELETE

Copyright © ArangoDB GmbH, 2018 45


Cluster Maintenance - Drop Database
‣ UI

Copyright © ArangoDB GmbH, 2018 46


Cluster Maintenance - Create Collection
‣ Every cluster collection is created either through the UI, arangosh or the
collection API on any coordinator.
‣ The coordinator commissions the creation of a collection in the agency's plan section
‣ All database servers see the change to the agency, and those which are to hold a shard
as leader or follower create the according local collection
‣ After local creation, they report their changes in the current section of the agency
‣ The coordinators wait for the completed db server tasks in current to show the collection
as created
‣ A collection is represented by a number of shards distributed over database
servers as local collections in the according local database.

Copyright © ArangoDB GmbH, 2018 47


Cluster Maintenance - Create Collection
‣ Shell

arangosh> db._create("foo", {numberOfShards: 3, replicationFactor: 2})

‣ Rest API

curl <coordinator>/_api/collection \
  -d'{"name":"foo", "numberOfShards":3, "replicationFactor":2}'

Copyright © ArangoDB GmbH, 2018 48


Cluster Maintenance - Create Collection
‣ UI

Copyright © ArangoDB GmbH, 2018 49


Cluster Maintenance - Create Collection
‣ curl <coordinator>/_api/collection \
  -d'{"name":"foo", "numberOfShards":3, "replicationFactor":2}'

This created the following entry in the agency's plan, distributing the 3 doubly replicated shards over
all database servers:
"5010292": { ...

"shards": { "waitForSync": false,
"s5010294": [ "id": "5010292",
"PRMR-0dc7087e-0c51-45d3-a8ef-f7f8e8f943f1", "isSmart": false,
"PRMR-5924b597-84f9-4a72-9b66-f099b4d17ea6" "name": "foo",
], "indexes": [
"s5010295": [ {
"PRMR-5924b597-84f9-4a72-9b66-f099b4d17ea6", "fields": [
"PRMR-c31fbda6-d941-4a67-9cfd-360f66033944" "_key"
], ],
"s5010293": [ "id": "0",
"PRMR-c31fbda6-d941-4a67-9cfd-360f66033944", "sparse": false,
"PRMR-0dc7087e-0c51-45d3-a8ef-f7f8e8f943f1" "type": "primary",
] "unique": true
}, }
"shardKeys": [ ],
"_key" "doCompact": true,
], "isSystem": false,
"replicationFactor": 2, "isVolatile": false,
"path": "", "journalSize": 33554432,
"statusString": "loaded", "keyOptions": {
"deleted": false, "lastValue": 0,
"indexBuckets": 8, "type": "traditional",
"type": 2, "allowUserKeys": true
"status": 3, },
... "numberOfShards": 3
},

Copyright © ArangoDB GmbH, 2018 50


Cluster Maintenance - Drop Collection
‣ Every cluster collection is dropped either through the UI, the arangosh shell or
the collection API on any coordinator. Cannot be undone!
‣ The coordinator commissions the deletion of the collection in the agency's plan section
‣ All database servers see the change and drop the according local collections,
representing the global collection's shards, with all their indexes.
‣ After local drops complete, the servers report their changes in the agency's current section
‣ The coordinators wait for all db server tasks to complete and to report in current
that the local collections no longer exist.

Copyright © ArangoDB GmbH, 2018 51


Cluster Maintenance - Drop Collection
‣ Shell

arangosh> db._drop("foo")

‣ Rest API

curl <coordinator>/_api/collection/foo -XDELETE

Copyright © ArangoDB GmbH, 2018 52


Cluster Maintenance - Drop Collection
‣ UI

Copyright © ArangoDB GmbH, 2018 53


Cluster Maintenance - Index
‣ Every collection index is created either through the UI, arangosh or the collection
API on any coordinator.
‣ The coordinator commissions the creation of an index in the agency's plan section
‣ All database servers see the change to the agency, and those which hold a shard
as leader or follower create the according local index on the local collection
‣ After local creation, the master reports the changes in the current section of the agency
‣ The coordinators wait for the completed db server tasks in current to show the index
as created
‣ A cluster-wide collection index is thus represented by local indexes in contributing
shards everywhere.

Copyright © ArangoDB GmbH, 2018 54


Cluster Maintenance - Create Index
‣ Shell

db.test.ensureIndex({type: "hash", fields: ["a"], sparse: true})

‣ Rest API

https://docs.arangodb.com/latest/HTTP/Indexes/WorkingWith.html
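
A sketch of the corresponding REST call, using the same collection "test" and attribute "a" as the shell example above:

curl <coordinator>/_api/index?collection=test \
  -d'{"type":"hash", "fields":["a"], "sparse":true}'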

Copyright © ArangoDB GmbH, 2018 55


Cluster Maintenance - Create Index
‣ UI

Copyright © ArangoDB GmbH, 2018 56


Cluster Maintenance - Drop Index
‣ Every collection index is dropped either through the UI, arangosh or the
collection API on any coordinator.
‣ The changes are made to the agency's plan section
‣ The local indexes are removed on the affected shards
‣ The master shards report the changes to the agency
‣ Coordinators show the changes

Copyright © ArangoDB GmbH, 2018 57


Backup & Restore

Copyright © ArangoDB GmbH, 2018 58


Backup
‣ The arangodump command operates transparently, in the same way as with a single-server
ArangoDB. The documentation for arangodump is found here:

https://docs.arangodb.com/<major-version>/Manual/Administration/Arangodump.html

c11 ~> arangodump --server.database test_db


Please specify a password:
Server version: 3.3.2
Connected to ArangoDB 'tcp://127.0.0.1:8529', database: 'test_db', username: 'root'
Writing dump to output directory '/mnt/c11/kaveh/dump'
# Dumping collection 'foo'...
# Dumping shard 's5010765' from DBserver 'PRMR-c31fbda6-d941-4a67-9cfd-360f66033944' ...
# Dumping shard 's5010766' from DBserver 'PRMR-0dc7087e-0c51-45d3-a8ef-f7f8e8f943f1' ...
# Dumping shard 's5010767' from DBserver 'PRMR-5924b597-84f9-4a72-9b66-f099b4d17ea6' ...
Processed 1 collection(s), wrote 28791523 byte(s) into datafiles, sent 6 batch(es)

Copyright © ArangoDB GmbH, 2018 59


Restore
‣ The arangorestore command operates transparently, in the same way as with a
single-server ArangoDB. The documentation for arangorestore is found here:

https://docs.arangodb.com/<major-version>/Manual/Administration/Arangorestore.html

c11 ~> arangorestore --server.database test_db


Please specify a password:
Server version: 3.3.2
# Connected to ArangoDB 'http+tcp://127.0.0.1:8529'
# Re-creating document collection 'foo'...
# Loading data into document collection 'foo'...
# Creating indexes for collection 'foo'...
Processed 1 collection(s), read 28791523 byte(s) from datafiles, sent 4 batch(es)
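
If the target database does not exist yet, arangorestore can create it first. A sketch, assuming the --create-database and --input-directory options and a local dump directory named "dump":

c11 ~> arangorestore --server.database test_db --create-database true --input-directory dump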

Copyright © ArangoDB GmbH, 2018 60


Upgrade

Copyright © ArangoDB GmbH, 2018 61


Rolling upgrade - Starter only
‣ Upgrading ArangoDB clusters can be performed while the cluster is running.
‣ Upgrade the binaries on all machines (e.g. apt update && apt dist-upgrade)
‣ kill -15 *
‣ ... agents one after the other, always allowing for a majority to keep serving.
‣ ... database servers one after the other, allowing for a full restart in between.
‣ ... coordinators one after the other, allowing for a full restart in between.

✴ Note that the kill commands orderly shut down every single process. Subsequently, the starter
process takes immediate action to respawn the process. This is done with the updated
binaries.
✴ Note that it is of utmost importance that backups of the data directories are performed ahead of
the upgrade process

Copyright © ArangoDB GmbH, 2018 62


Rolling upgrade - Starter only
‣ Upgrading ArangoDB clusters can be performed while the cluster is running.

‣ On all machines stop agents and make data backups


kill -15 $(ps -ef|grep -v grep|grep agent8531|awk {'print $2'})

rsync -a agent8531/data agent8531/data-$(date -I)

‣ On all machines stop db servers and make data backups


kill -15 $(ps -ef|grep -v grep|grep dbserver8530|awk {'print $2'})

rsync -a dbserver8530/data dbserver8530/data-$(date -I)

‣ On all machines stop coordinators and make data backups


kill -15 $(ps -ef|grep -v grep|grep coordinator8529|awk {'print $2'})

rsync -a coordinator8529/data coordinator8529/data-$(date -I)

Copyright © ArangoDB GmbH, 2018 63


Rolling upgrade - Starter only
‣ The upgrade is subsequently reflected in the UI as well as in _api/version

Upgrade 3.3.2 to 3.3.3


Copyright © ArangoDB GmbH, 2018 64
Resilience

Monitoring, failover, emptying a server for maintenance

Copyright © ArangoDB GmbH, 2018 65


Resilience - Supervision
‣ As the actors operating a centralised job repository with fault-tolerant and resilient
state, the agents perform a crucial role in the maintenance of a cluster. The
leader...
‣ ... monitors heartbeats from coordinators and db servers and keeps track of their health
‣ ... takes action when database servers have gone missing for longer than a grace period
‣ Shard leaders are replaced by an in-sync follower, and a new follower is added, if available.
‣ Shard followers are replaced by any available db server which does not already replicate this
shard as follower or leader.
‣ Once any of the above has finished, the failed server is removed from the shard's server list.
‣ ... handles cleaning out of db servers on user request
‣ ... handles moving of shards per user request

Copyright © ArangoDB GmbH, 2018 66


Resilience - Supervision, moving shards
‣ One may move a shard from any database server to another database server
which does not already hold the shard as leader or follower.
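
Shard moves are normally triggered from the web UI. A hedged sketch of the underlying coordinator call, assuming the /_admin/cluster/moveShard endpoint; database, collection, shard and server IDs below are placeholders:

curl <coordinator>/_admin/cluster/moveShard \
  -d'{"database":"test_db", "collection":"foo", "shard":"s5010294", "fromServer":"PRMR-...", "toServer":"PRMR-..."}'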

Copyright © ArangoDB GmbH, 2018 67


Resilience - Node down

Copyright © ArangoDB GmbH, 2018 68


Resilience - Node down

Copyright © ArangoDB GmbH, 2018 69


Resilience - Node down

Note that all shards have been moved away from the dead node DBServer0001

Copyright © ArangoDB GmbH, 2018 70


Resilience - Node back up shard rebalancing

The node has been restarted.

Clicking "Rebalance shards" redistributes the shards which had been concentrated on the healthy
nodes more evenly.
Copyright © ArangoDB GmbH, 2018 71
Resilience - Rebalancing shards

An intermediate snapshot shows how DBServer0001 is again trusted with some responsibility.

Copyright © ArangoDB GmbH, 2018 72


Troubleshooting

Log files, disaster recovery

Copyright © ArangoDB GmbH, 2018 73


Troubleshooting - Log files
‣ Every ArangoDB instance in the cluster comes with a log file
‣ The log files generally report errors and warnings, which are helpful in finding the
cause of an issue
‣ Oftentimes issues are caused by less common and less anticipated problems. It is
then very helpful to change the log level of an instance while it is still experiencing the
problem
‣ DB servers & coordinators 

curl <ip:port>/_admin/log/level -sd '{"cluster":"debug","agencycomm":"debug"}'
‣ Agents

curl <ip:port>/_admin/log/level -sd '{"agency":"trace"}'
‣ The log levels are back to normal after a restart or by issuing the same command with log level
"info", as shown below

Copyright © ArangoDB GmbH, 2018 74


Disaster recovery - Agents
‣ Full loss of data on a minority of agents (the majority must be up and serving)
‣ Acquire the agency's configuration from its current leader

curl [agency-leader]/_api/agency/config
‣ Determine which agent[s] have failed

Search through the configuration document to find the lastAcked entry that is missing or lagging behind and
the according UUID
‣ Start an agent process in a new data directory, specifying the failed agent's UUID
‣ Keep querying the leader's configuration until the lastAcked entry of the failed agent
is back to normal

Copyright © ArangoDB GmbH, 2018 75


Disaster recovery - Agents
‣ Leader's configuration shows that operation is normal: all "lastAcked"
values are smaller than "configuration/min ping"

> curl -s [::]:5001/_api/agency/config|jq
{
  "term": 1,
  "leaderId": "AGNT-455a054f-8da1-4406-bebd-8a650420a8cb",
  "commitIndex": 1,
  "lastAcked": {
    "AGNT-455a054f-8da1-4406-bebd-8a650420a8cb": 0,
    "AGNT-5a679651-99c4-4b95-9ff2-d0bf0e7a1bd4": 0.002,
    "AGNT-c5b82440-d61d-469f-af03-366d87b4fd19": 0.002
  },
  "configuration": {
    "pool": {
      "AGNT-455a054f-8da1-4406-bebd-8a650420a8cb": "tcp://[::1]:5001",
      "AGNT-5a679651-99c4-4b95-9ff2-d0bf0e7a1bd4": "tcp://[::1]:5000",
      "AGNT-c5b82440-d61d-469f-af03-366d87b4fd19": "tcp://[::1]:5002"
    },
    "active": [
      "AGNT-455a054f-8da1-4406-bebd-8a650420a8cb",
      "AGNT-5a679651-99c4-4b95-9ff2-d0bf0e7a1bd4",
      "AGNT-c5b82440-d61d-469f-af03-366d87b4fd19"
    ],
    "id": "AGNT-455a054f-8da1-4406-bebd-8a650420a8cb",
    "agency size": 3,
    "pool size": 3,
    "endpoint": "tcp://[::1]:5001",
    "min ping": 1,
    "max ping": 5,
    "timeoutMult": 1,
    "supervision": true,
    "supervision frequency": 2.5,
    "compaction step size": 20000,
    "compaction keep size": 10000,
    "supervision grace period": 10,
    "version": 4,
    "startup": "origin"
  }
}

Copyright © ArangoDB GmbH, 2018 76


Disaster recovery - Agents
‣ Kill the agency's leader process
‣ In the meantime the agent on port 5002 has taken over the leadership in term 2
‣ Purge the data directory of the old leader
‣ Find the old agent's id by determining the long dead time in lastAcked
‣ Restart a new agent process in a clean directory, specifying the old agent's id
among the old parameters:

--agency.disaster-recovery-id=AGNT-455a054f-8da1-4406-bebd-8a650420a8cb

> curl -s [::]:5002/_api/agency/config|jq
{
  "term": 2,
  "leaderId": "AGNT-c5b82440-d61d-469f-af03-366d87b4fd19",
  "commitIndex": 2,
  "lastAcked": {
    "AGNT-c5b82440-d61d-469f-af03-366d87b4fd19": 0,
    "AGNT-5a679651-99c4-4b95-9ff2-d0bf0e7a1bd4": 0.251,
    "AGNT-455a054f-8da1-4406-bebd-8a650420a8cb": 11.756
  },
  "configuration": {
    "pool": {
      "AGNT-455a054f-8da1-4406-bebd-8a650420a8cb": "tcp://[::1]:5001",
      "AGNT-c5b82440-d61d-469f-af03-366d87b4fd19": "tcp://[::1]:5002",
      "AGNT-5a679651-99c4-4b95-9ff2-d0bf0e7a1bd4": "tcp://[::1]:5000"
    },
    "active": [
      "AGNT-455a054f-8da1-4406-bebd-8a650420a8cb",
      "AGNT-c5b82440-d61d-469f-af03-366d87b4fd19",
      "AGNT-5a679651-99c4-4b95-9ff2-d0bf0e7a1bd4"
    ],
    "id": "AGNT-c5b82440-d61d-469f-af03-366d87b4fd19",
    "agency size": 3,
    "pool size": 3,
    "endpoint": "tcp://[::1]:5002",
    "min ping": 1,
    "max ping": 5,
    "timeoutMult": 2,
    "supervision": true,
    "supervision frequency": 2.5,
    "compaction step size": 20000,
    "compaction keep size": 10000,
    "supervision grace period": 10,
    "version": 5,
    "startup": "origin"
  }
}

Copyright © ArangoDB GmbH, 2018 77


Disaster recovery - Agents
‣ Add the recovery id to the agent's command line / configuration
build/bin/arangod -c none \
--agency.activate true \
--agency.endpoint tcp://[::1]:5000 \
--agency.my-address tcp://[::1]:5001 \
--agency.compaction-step-size 20000 \
--agency.compaction-keep-size 10000 \
--agency.pool-size 3 \
--agency.size 3 \
--agency.supervision true \
--agency.supervision-frequency 2.5 \
--agency.wait-for-sync true \

--agency.disaster-recovery-id AGNT-455a054f-8da1-4406-bebd-8a650420a8cb \
--database.directory agency/data5001 \
--javascript.app-path ./js/apps \
--javascript.startup-directory ./js \
--javascript.v8-contexts 1 \
--log.file agency/5001.log \
--log.force-direct false \
--log.level agency=info \
--log.use-microtime false \
--server.authentication false \
--server.endpoint tcp://[::]:5001 \
--server.statistics false

Copyright © ArangoDB GmbH, 2018 78


Disaster recovery - Agents
‣ The log output of the newly started agent in the clean directory:

2018-03-02T09:14:04Z [6073] INFO {agency} Entering gossip phase ...
2018-03-02T09:14:04Z [6073] INFO {agency} Adding AGNT-c5b82440-d61d-469f-af03-366d87b4fd19 (tcp://[::1]:5002) to agent pool
2018-03-02T09:14:04Z [6073] INFO {agency} Adding AGNT-5a679651-99c4-4b95-9ff2-d0bf0e7a1bd4 (tcp://[::1]:5000) to agent pool
2018-03-02T09:14:04Z [6073] INFO {agency} Agent pool completed. Stopping active gossipping. Starting RAFT process.
2018-03-02T09:14:04Z [6073] INFO {agency} Activating agent.
2018-03-02T09:14:04Z [6073] INFO {agency} Setting role to follower in term 0
2018-03-02T09:14:05Z [6073] INFO {agency} AGNT-455a054f-8da1-4406-bebd-8a650420a8cb: changing term or votedFor, current role: Follower term 2 votedFor:
2018-03-02T09:14:05Z [6073] INFO {agency} Set _role to FOLLOWER in term 2
2018-03-02T09:14:05Z [6073] INFO {agency} AGNT-455a054f-8da1-4406-bebd-8a650420a8cb: following 'AGNT-c5b82440-d61d-469f-af03-366d87b4fd19' in term 2

Copyright © ArangoDB GmbH, 2018 79


Disaster recovery - Coordinators
‣ Coordinators are the least sensitive instances in a cluster
‣ They can be replaced by starting a new coordinator instance to join the cluster
and removing the old one by clicking on the trash can icon in the node overview

Copyright © ArangoDB GmbH, 2018 80


Disaster recovery - DB servers
‣ When a DB server goes missing, the agency's supervision will reorganise
responsibilities for shards.
‣ If enough DB servers are left to satisfy the replication factors of all collections in
the cluster, one can remove the missing DB server from the cluster after its
responsibilities have been taken over by the other DB servers. (Trash can icon)
‣ If not enough servers are left for the replication factors to be met, one needs to
add a new DB server instance to the cluster before the missing node can be
removed (see the sketch below).
‣ When a new DB server is added to the cluster and replication factors are currently
not met, shard followerships are automatically assigned to the new DB server
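
A sketch of adding a DB-server-only node with the starter, assuming the starter's --cluster.start-coordinator and --cluster.start-dbserver options, a new machine c14 and the join endpoint used earlier:

c14 ~> arangodb --starter.data-dir=db --starter.join c11:8528 \
         --cluster.start-coordinator=false --cluster.start-dbserver=true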

Copyright © ArangoDB GmbH, 2018 81


Add-ons

We hope you enjoyed the course and it helped you get started with the ArangoDB Cluster

What would you like to learn next?


Tell us with 3 clicks:

Copyright © ArangoDB GmbH, 2018 82


Support ArangoDB

You can support ArangoDB in multiple ways:

‣ Feedback to the course
‣ Add “ArangoDB” to your skills
‣ Tweet about ArangoDB

Copyright © ArangoDB GmbH, 2018 83


Your ArangoDB team

Copyright © ArangoDB GmbH, 2018 84
