Professional Documents
Culture Documents
07 - Basic Cluster Administration
07 - Basic Cluster Administration
07 - Basic Cluster Administration
1
References
• MongoDB The Definitive Guide: Powerful and Scalable Data Storage 3rd
Edition
• https://docs.mongodb.com/
• https://www.mongodb.com/docs/manual/
2
Contents
• Mongod
• Security
• Replication
• Sharding
3
What is mongod?
• mongod is the main daemon process for MongoDB.
• It handles data requests, manages data access, and performs background
management operations
• We run a separate mongod process for each server.
4
What is mongod?
• When we launch mongod, we're essentially starting up a new database.
But we don't interact with the mongod process directly. Instead, we use a
database client to communicate with mongod (mongo, mongosh).
• Default configuration:
o IP Binding: localhost (127.0.0.1).
o Port: 27017.
o Data directory: typically in C:\data\db on Windows systems.
o Log file: typically in C:\Program
Files\MongoDB\Server\version_number\log\mongod.log on Windows systems
o Authentication is turned off, so clients are not required to authenticate before
accessing the database.
5
Mongod options
Example: mongod --bind_ip localhost --port 27017 --dbpath /data --logpath /log/mongod.log
• --bind_ip: specify which IP addresses mongod should bind to. When
mongod binds to an IP address, clients from that address are able to
connect to mongod.
• --port <port number>: specify the port on which mongod will listen for client
connections (command "netstat -a –n" to see used ‘host:ports’)
• --dbpath <path>: specify where all data files of database are stored.
• --logpath <path>: specify where and name log file is stored
• --auth: enables authentication to control which users can access the
database. When auth is specified, all database clients who want to connect
to mongod first need to authenticate.
• --help: output the various options for mongod with a description of their
functionality → mongod --help or mongod -h 6
Mongod options
• Example:
Run command line as administrator:
mongod --bind_ip localhost --port 27017 --dbpath C:/"Program Files"/MongoDB/Server/5.0/data
--logpath C:/"Program Files"/MongoDB/Server/5.0/log/mongod.log
7
Mongod – Configuration File
• Configuration file is a way to organize the options you need to run the
mongod process into an easy to parse YAML (Yet Another Markup
Language) file
mongod --dbpath data/db --logpath data/log/mongod.log --replSet 'M103' --keyFile /data/keyfile --bind_ip localhost
8
Mongod – Configuration File
YAML file
Configuration File Option
storage.dbPath
systemLog.path
systemLog.destination
net.bind_ip
net.port
security.keyFile
replication.replSetName
9
Mongod – An example
• Run a mongod instance with custom configuration:
o Create configuration file with port 27000, any dbpath and logpath;
o Start a mongod with the configuration file;
o Start a mongo and connect to that mongod;
o View databases…
10
Mongod – Basic Commands
• Cover a few of the basic commands necessary to interact with the
MongoDB cluster.
• These methods are available in the mongodb shell that wrap underlying
database commands.
o db.<method>(): DB shell helpers, interact with the database.
✓db.<collection>.<method>(): shell helpers for collection level operations.
o rs.<method>(): rs helper methods, control replica set deployment and
management.
o sh.<method>(): sh helper methods, control sharded cluster deployment and
management.
o (Read More)
11
Mongod – Basic Commands
• User management, example:
o db.createUser()
o db.dropUser()
• Collection management, example :
o db.<collection>.renameCollection( <target>, <dropTarget> ) [dropTarget:
optional]
o db.<collection>.createIndex( <keys>, <options>, <commitQuorum> )
o db.<collection>.drop( <options> )
• Database management, example:
o db.dropDatabase( <writeConcern> ) [removes current database]
o db.createCollection( <name>, <options> )
• Database status, example:
12
o db.serverStatus()
Contents
• Mongod
• Security
• Replication
• Sharding
13
Basic Security
• How do we secure the data?
Authentication Authorization
• Verifies the identity of a user • Verifies the priviliges of a user
• Answers the question : Who are you? • Answers the question: What do you have
access to?
• SCRAM: default and most basic • Each user has one or more Roles.
form of client authentication • Each Role has one or more Privileges.
(password security)
• A Privilege represent a group of actions
• X.509: certificate for authentication, and the resources that those actions
more secure and more complex apply to.
• LDAP
Only for MongoDB Enterprise
• Kerberos
14
Basic Security
• Authorization: Role-based access control
o Roles support a high level of responsibility isolation for operational task:
17
Basic Security
• Roles in MongoDB:
o Build-In roles: pre-packaged MongoDB roles.
o Custom roles: tailored roles to attend specific needs of specific users.
18
Basic Security
• Build-In roles
Role levels Roles
Database Users read, readWrite
Database Administration userAdmin, dbAdmin
clusterAdmin, clusterManager, clusterMonitor,
Cluster Administration
hostManager
Backup and Restore backup, restore
Super User root (root is also a role at the all database level)
readAnyDatabase, readWriteAnyDatabase
AllDatabase
dbAdminAnyDatabase, userAdminAnyDatabase
(read more Built-In Roles)
19
Basic Security
Example:
Run MongoDB server that enforces authentication (no user created):
mongod --config 'D:\HeQTCSDL_NoSQL\config_files\mongod.conf'
23
Contents
• Mongod
• Security
• Replication
• Sharding
24
Replication
• Replication: Maintain multiple copies of your data – Really important
• Why?
o Can never assume all servers will always be available
o To make sure, if server goes down, you can still access your data
Redundancy and Data Availability
o Replication can provide increased read capacity as clients can send read
operations to different servers
o If the database is hosted on a single server standalone node
Client
Application Data
25
Replication
• A replica set is a group of mongod instances that maintain the same data set.
• A replica set contains several data bearing nodes and optionally one arbiter node.
Of the data bearing nodes, one and only one member is deemed the primary
node, while the other nodes are deemed secondary nodes.
• A secondary can become a primary: If the current primary becomes unavailable,
the replica set holds an election to choose which of the secondaries becomes the
new primary.
• All writes are directed to the primary member whereas the secondary members
replicate from the primary asynchronously using oplog.
o The oplog (operations log) is a special capped collection that keeps a record of all
operations that modify the data stored in your databases.
o MongoDB applies database operations on the primary and then records the
operations on the primary's oplog. The secondary members then copy and apply
these operations in an asynchronous process.
(read more Replica Set)
26
Replication
• Replica set members:
o Primary: receives all write operations
o Secondary: replicates operations from the primary to maintain an identical
data set, it can be configure for a specific purpose:
✓Priority 0 members - are secondary members that maintain the primary’s data
copy but can never become a primary in case of a failover. They can participate in
voting and can accept read requests (see Priority 0 Replica Set Members).
✓Hidden members - are 0-priority members that are hidden from the client
applications (can’t serve any read requests). (see Hidden Replica Set Members).
✓Delayed members - are secondary members that replicate data with a delay
from the primary’s oplog (see Delayed Replica Set Members).
o Arbiter: participates in elections but does not hold data
27
Replication
Client
Application
Replica Set
Primary
Heartbeat
Secondary Secondary
Replica Set
Primary
Heartbeat/2s Arbiter
Secondary (vote only)
Replica Set
Primary
Replication
Heartbeat
Primary
Secondary Secondary
30
Replication
• Setting up a Replica Set
Example:
Replica Set
Primary
(27011)
Secondary Secondary
(27012) (27013)
31
Replication
• Setting up a Replica Set:
1. Use configuration file for standalone mongod;
2. Start a mongod with configuration file;
3. Start a mongo and connect to one of mongo instance;
4. Initialize replica set;
5. Create root user;
6. Exit out of this mongo and then log back in as root user;
7. Add nodes to replica set
32
Replication
Setting up a Replica Set:
1- Use configuration file for standalone mongod
net:
encrypt data exchanged net:
net:
bindIp: localhost between
bindIp: client
localhost bindIp: localhost
port: 27011 application
port: 27012and mongodb port: 27013
33
Replication
Setting up a Replica Set:
2- Start a mongod with configuration file:
mongod --config 'c:\mongoDB\configs\node1.conf’
mongod --config 'c:\mongoDB\configs\node2.conf’
mongod --config 'c:\mongoDB\configs\node3.conf’
3- Start a mongo and connect to one of mongo instance:
mongo --host localhost:27011
4- Initialize replica set:
rs.initiate()
5- Create root user:
use admin
db.createUser({ user: 'm-admin', pwd: 'm-pass', roles: [ { role : 'root', db : 'admin' } ] })
34
Replication
Setting up a Replica Set
6- Exit out of this mongo and then log back in as m-admin user
mongosh --host rep-example/localhost:27011 -u m-admin -p m-pass --authenticationDatabase admin
7- Add nodes to Replica set Replica set name
rs.add( 'localhost:27012' )
rs.add( 'localhost:27013' )
35
Replication
• Replication Configuration Document:
o The replica set configuration document is a simple BSON document that we
manage using a JSON representation
o Can be configured from the shell
o There are set of mongo shell replication helper methods that make it easier to
manage
✓rs.initiate() : Initializes a new replica set.
✓rs.add() : Adds a member to a replica set.
✓rs.addArb() : Adds an arbiter to a replica set.
✓rs.remove() : Remove a member from a replica set.
✓rs.reconfig() : Reconfigures a replica set by applying a new replica set
configuration object.
✓… (seft study)
36
Replication
• Reconfiguring a Running Replica Set:
Replica Set
Primary Secondary
(node 1) (node 4)
Arbiter
Secondary Secondary (node 5)
(node 2) (node 3)
37
Replication
Reconfiguring a Running Replica Set:
1- Create config files for the secondaries 3 and arbiter nodes
storage: storage:
dbPath: d:\mongodb\db\node4 dbPath: d:\mongodb\db\arbiter
net: net:
bindIp: localhost bindIp: localhost
port: 27014 port: 28000
security: security:
authorization: enabled authorization: enabled
keyFile: d:\mongodb\pki\m-keyfile keyFile: d:\mongodb\pki\m-keyfile
systemLog: systemLog:
destination: file destination: file
path: d:\mongodb\db\node4\mongod.log path: d:\mongodb\db\arbiter\mongod.log
logAppend: true logAppend: true
replication: replication:
replSetName: rep-example replSetName: rep-example
38
Replication
Replica Set
Reconfiguring a Running Replica Set:
Primary
2- Starting up mongod processes for our fourth node and arbiter (node 1)
3- Run Mongo shell and connect to the replica set m-example (node 3) (node 4)
39
Replication
Reconfiguring a Running Replica Set:
- Removing the arbiter from our replica set:
rs.remove( 'localhost:28000’ )
- Assigning the current configuration to a shell variable we can edit, in order to reconfigure the
replica set:
cfg = rs.conf()
- Editing our new variable cfg to change topology - specifically, by modifying cfg.members:
cfg.members[3].votes = 0
cfg.members[3].hidden = true
cfg.members[3].priority = 0
- Updating our replica set to use the new configuration cfg:
rs.reconfig( cfg )
40
Replication
• Failover and Elections:
o Primary node is the first point where the client application accesses the
database.
o If secondaries go down, the client will continue communicating with the node
acting as primary until the primary is unavailable.
Replica Set
Client
Primary Application
Secondary Secondary
41
Replication
• Failover and Elections:
o What would cause a primary to become unavailable? → a common reason is
maintenance.
Replica Set
? Client
Primary Application
Secondary Secondary
42
Replication
• Failover and Elections:
o Let's say we want to roll upgrade on a three nodes replica set.
o A rolling upgrade just means we're upgrading one server at a time, starting
with the secondaries and eventually, we'll upgrade the primary.
upgrade to
Replica Set version 5.0
Primary
(node1 - v4.2)
Secondary Secondary
(node2 - v4.2) (node3 - v4.2)
43
Replication
• Failover and Elections:
Replica Set
Primary
(node1 –- v4.2)
(node2 v5.0)
Secondary Secondary
(node2–––-v5.0)
(node1
(node2
(node1 v4.2)
v4.3)
v5.0) (node3 –- v4.2)
(node3 v5.0)
44
Contents
• Mongod
• Security
• Replication
• Sharding
45
Sharding
• What is Sharding?
o In a replica set, we have more than one server in
our database and each server has to contain
the entire dataset
o What do we do when the data grows, and the
servers can't work properly?
48
Sharding
• Primary Shard
o In the sharded cluster, we have the primary shard.
o All the non-sharded collections on that database will remain on primary shard
(not all the collections in a sharded cluster need to be distributed).
Primary
Sharded Non-Sharded
49
Sharding
• Sharded Cluster Components:
o shard: Each shard contains a subset of the sharded data. Each shard must be
deployed as a replica set.
o mongos: The mongos acts as a query router, providing an interface between
client applications and the sharded cluster.
o config servers: Config servers store metadata and configuration settings for
the cluster. As of MongoDB 3.4, config servers must be deployed as a replica
set (CSRS - Config Server Replica Set).
50
Sharding
Sharding Architecture Mongos routes client queries
to the correct shards
Router
client (mongos) config servers stores the
Config
collection of metadata
Server
(replica
set)
...
53
Sharding
• Setting up a Sharded Cluster:
server-04
server-01
server-02
server-05 server-03
server-06
server-07 54
Sharding
• Setting up a Sharded Cluster:
o Build config servers:
1. Create configuration file for config servers;
2. Starting the config servers;
3. Run mongo shell and connect to one of the config servers;
4. Initiating the CSRS (Config Server Replica Set);
5. Creating super user on CSRS;
6. Authenticating as the super user;
7. Add the second and third node to the CSRS replica set;
o Config and run mongos:
o Config shard.
o Adding shards to cluster from mongos.
55
Sharding
Setting Up a Sharded Cluster:
✓ Build config servers:
1. Create configuration file for config servers:
storage: storage: storage:
dbPath: d:\db\ShardCluster\csrs1 dbPath: d:\db\ShardCluster\csrs2 dbPath: d:\db\ShardCluster\csrs3
security:
authorization: enabled
keyFile: d:\db\pki\keyfile
systemLog:
destination: file
path: d:\db\mongos.log
logAppend: true
sharding:
configDB: configsvr-rs/localhost:26001, localhost:26002, localhost:26003
mongos.cfg 58
Sharding
Setting Up a Sharded Cluster:
✓ Config Shard
storage: storage: storage:
dbPath: d:\db\ShardCluster\node1 dbPath: d:\db\ShardCluster\node2 dbPath: d:\db\ShardCluster\node3
wiredTiger: wiredTiger: wiredTiger:
engineConfig: engineConfig: engineConfig:
cacheSizeGB: .25 cacheSizeGB: .25 cacheSizeGB: .25
net: net: net:
bindIp: localhost bindIp: localhost bindIp: localhost
port: 27011 port: 27012 port: 27013
security: security: security:
authorization: enabled authorization: enabled authorization: enabled
keyFile: d:\db\pki\keyfile keyFile: d:\db\pki\keyfile keyFile: d:\db\pki\keyfile
systemLog: systemLog: systemLog:
destination: file destination: file destination: file
path: d:\db\ShardCluster\node1\mongod.log path: d:\db\ShardCluster\node3\mongod.log
path: d:\db\ShardCluster\node2\mongod.log
logAppend: true logAppend: true logAppend: true
replication: replication: replication:
replSetName: shard1-rs replSetName: shard1-rs replSetName: shard1-rs
sharding: sharding: sharding:
clusterRole: shardsvr clusterRole: shardsvr clusterRole: shardsvr
node1.cfg node2.cfg node3.cfg 59
Read more WiredTiger Storage Engine
Sharding
Setting Up a Sharded Cluster:
✓ Config Shard (using of Replica set m-example2)
Run mongod with corresponding config files:
mongod --config node1.cfg
mongod --config node2.cfg
mongod --config node3.cfg
Run mongo shell and connect to one of servers:
mongo --port 27011
rs.initiate()
✓ Adding new shard to cluster from mongos
sh.addShard( ‘shard1-rs/localhost:27011’ ) //if port:27011 is primary node
Check sharding status:
sh.status()
60
Sharding
• Shard Keys:
o MongoDB uses the shard key to distribute the collection's documents across
shards. The shard key consists of a field or multiple fields in the documents.
o MongoDB divides the span of shard key values into non-overlapping ranges of
shard key values. Each range is associated with a chunk.
o MongoDB supports two sharding strategies for distributing data across
sharded clusters:
62
Question?
63