
MongoDB

Department: Software Engineering

Instructor: Trần Thế Trung

Email: tranthetrung@iuh.edu.vn

Transactions in MongoDB

1. Transactions in MongoDB
2. Read Preference
3. Read Concern
4. Write Concern

What is a transaction?

• A database transaction is a unit of work that groups changes to the data in a database. It
ensures that the data stays consistent even when an operation fails, and it lets concurrent
changes to the database proceed safely.
• A transaction is a set of database operations (reads and writes) performed in sequential
order; every individual operation must succeed for the transaction to succeed.
• If any operation fails, the transaction is aborted and the database is restored to its
previous state.
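The all-or-nothing behaviour described above can be illustrated with a minimal in-memory sketch. This is not MongoDB code: `runAtomically` and the `store` object are hypothetical names used only for illustration, and a plain object stands in for the database.

```javascript
// Hypothetical in-memory illustration of all-or-nothing semantics.
// `store` is a plain object standing in for a database.
function runAtomically(store, operations) {
  // Work on a deep copy so that a failure leaves `store` untouched.
  const draft = JSON.parse(JSON.stringify(store));
  try {
    for (const op of operations) {
      op(draft); // any operation may throw to signal failure
    }
  } catch (err) {
    return { committed: false, error: err.message }; // abort: the draft is discarded
  }
  // Commit: replace the contents of `store` with the draft.
  for (const key of Object.keys(store)) delete store[key];
  Object.assign(store, draft);
  return { committed: true };
}

const store = { A: { balance: 100 }, B: { balance: 50 } };

// Both operations succeed, so both changes become visible together.
const ok = runAtomically(store, [
  (s) => { s.A.balance -= 30; },
  (s) => { s.B.balance += 30; },
]);

// The second operation fails, so the first change is never applied to `store`.
const failed = runAtomically(store, [
  (s) => { s.A.balance -= 500; },
  (s) => { if (s.A.balance < 0) throw new Error("insufficient funds"); },
]);

console.log(ok.committed, failed.committed, store);
```

Copying the whole store is only practical for a toy example; it is used here because it makes the "restore to the previous state" step trivial to see.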

Transactions in MongoDB

• In MongoDB, an operation on a single document is atomic.


• For situations that require atomicity of reads and writes to multiple documents (in a
single or multiple collections), MongoDB supports multi-document transactions. With
distributed transactions, transactions can be used across multiple operations, collections,
databases, documents, and shards.

Transactions in MongoDB

• Multi-document transactions are Atomic:


• When a transaction commits, all data changes made in the transaction are saved
and visible outside the transaction.
• When a transaction writes to multiple shards, not all outside read operations need
to wait for the result of the committed transaction to be visible across the shards.
• When a transaction aborts, all data changes made in the transaction are discarded
without ever becoming visible.

Transactions and ACID in MongoDB
• An ACID transaction in MongoDB is a group of database operations that must happen together
or not at all, ensuring database safety and consistency.
• ACID transactions should be used in scenarios that involve the transfer of value from one
record to another (such as exchanging currency or stock, or even adding an item to an online
shopping cart).

ACID properties of transactions
• Atomicity: all operations will either succeed or fail together.
• Consistency: all changes made by operations are consistent with database constraints.
• For example, in an application that transfers funds from one account to another, the
consistency property ensures that the total value of funds in both the accounts is the
same at the start and end of each transaction.
• Isolation: multiple transactions can happen at the same time without affecting the outcome
of the other transaction.
• For example, in an application that transfers funds from one account to another, the
isolation property ensures that another transaction sees the transferred funds in one
account or the other, but not in both, nor in neither.
• Durability: all of the changes made by operations in a transaction will persist, no
matter what.
• For example, in an application that transfers funds from one account to another, the
durability property ensures that the changes made to each account will not be reversed.

Transactions and Operations

• Distributed transactions can be used across multiple operations, collections, databases,
documents, and shards.
• For transactions:
• We can specify read/write (CRUD) operations on existing collections.
• We can create collections and indexes in transactions.

Transactions and Read Preference
• Read preference describes how MongoDB clients route read operations to the members of a
replica set.

Transactions and Read Preference
Set the transaction-level read preference at the transaction start:
• If the transaction-level read preference is unset, the transaction uses the session-level
read preference.
• If both the transaction-level and the session-level read preferences are unset, the
transaction uses the client-level read preference. By default, the client-level read
preference is primary.
• Multi-document transactions that contain read operations must use read preference
primary. All operations in a given transaction must route to the same member.

Read Preference Modes
• primary
• primaryPreferred
• secondary
• secondaryPreferred
• nearest
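
The fallback order for the read preference (transaction level, then session level, then client level, then the default primary) can be sketched as a small lookup. The helper names `resolveReadPreference` and `assertTransactionReadPreference` are hypothetical and are not part of any MongoDB driver; the sketch only mirrors the rules stated above.

```javascript
// Hypothetical helper mirroring the fallback order described above:
// transaction level, then session level, then client level, then "primary".
function resolveReadPreference({ transaction, session, client } = {}) {
  return transaction ?? session ?? client ?? "primary";
}

// Multi-document transactions that contain read operations must use "primary".
function assertTransactionReadPreference(mode) {
  if (mode !== "primary") {
    throw new Error(`transactions with reads require "primary", got "${mode}"`);
  }
  return mode;
}

// Only the session level is set, so it wins over the client default.
const fromSession = resolveReadPreference({ session: "secondary" });
// Nothing is set anywhere, so the default applies.
const fromDefault = resolveReadPreference({});

console.log(fromSession, fromDefault);
```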

Transactions and Read Concern

• The readConcern option allows you to control the consistency and isolation properties of
the data read from replica sets and replica set shards.
• Through the effective use of write concerns and read concerns, you can adjust the level of
consistency and availability guarantees as appropriate, such as waiting for stronger
consistency guarantees, or loosening consistency requirements to provide higher
availability.

Transactions and Read Concern
Read Concern Levels
• "local": returns the most recent data available from the node, but that data can be rolled back.
• For transactions on a sharded cluster, "local" read concern cannot guarantee that the data is from the
same snapshot view across the shards. If snapshot isolation is required, use "snapshot" read
concern.
• "majority": returns data that has been acknowledged by a majority of the replica set members (i.e. data
that cannot be rolled back) if the transaction commits with write concern "majority".
• If the transaction does not use write concern "majority" for the commit, the "majority" read
concern provides no guarantees that read operations read majority-committed data.
• For transactions on a sharded cluster, "majority" read concern cannot guarantee that the data is from
the same snapshot view across the shards. If snapshot isolation is required, use "snapshot" read
concern.
• "snapshot": returns data from a snapshot of majority-committed data if the transaction commits with
write concern "majority".
• If the transaction does not use write concern "majority" for the commit, the "snapshot" read
concern provides no guarantee that read operations used a snapshot of majority-committed data.
• For transactions on sharded clusters, the "snapshot" view of the data is synchronized across shards.
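
As a minimal sketch of how these levels are chosen for a transaction, the function below passes a `readConcern` and a `writeConcern` in the options of `session.startTransaction()`. The function name `startSnapshotTransaction` is hypothetical; `session` is assumed to be a shell session obtained from `db.getMongo().startSession()`.

```javascript
// Sketch for mongosh: start a transaction that reads from a snapshot and
// commits with write concern "majority", the pairing described above.
// `session` is assumed to come from db.getMongo().startSession().
function startSnapshotTransaction(session) {
  const options = {
    readConcern: { level: "snapshot" },
    writeConcern: { w: "majority" },
  };
  session.startTransaction(options);
  return options; // returned so the caller can log or inspect what was used
}
```

In the shell this would be called as `startSnapshotTransaction(db.getMongo().startSession())`, followed by the reads and writes and a final `commitTransaction()`.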

Transactions and Read Concern

Read Concern Option
• For operations not in multi-document transactions, you can specify a readConcern level as an
option to commands and methods that support read concern:
readConcern: { level: <level> }
• To specify the read concern level in mongosh, use the method:
db.collection.find().readConcern(<level>)

Read concern support by command/method (✓ = supported, - = not supported):

Command/Method                                   "local"   "available"   "majority"   "snapshot" [3]   "linearizable"
count                                            ✓         ✓             ✓            -                ✓
distinct                                         ✓         ✓             ✓            ✓ [2]            ✓
find                                             ✓         ✓             ✓            ✓                ✓
db.collection.find() via cursor.readConcern()    ✓         ✓             ✓            ✓                ✓
geoSearch                                        ✓         ✓             ✓            ✓                ✓
getMore                                          ✓         -             -            -                ✓
aggregate / db.collection.aggregate()            ✓         ✓             ✓            ✓                ✓ [1]
Session.startTransaction()                       ✓         -             ✓            ✓                -

Transactions and Write Concern
• Write concern describes the level of acknowledgment requested from MongoDB for write
operations to a standalone mongod, to replica sets, or to sharded clusters.
• Transactions use the transaction-level write concern to commit the write operations.
• Write operations inside transactions must be issued without an explicit write concern
specification and use the default write concern.
• At commit time, the writes are then committed using the transaction-level write concern.

Transactions and Write Concern
• You can set the transaction-level write concern at the transaction start:
• If the transaction-level write concern is unset, the transaction-level write concern
defaults to the session-level write concern for the commit.
• If the transaction-level write concern and the session-level write concern are unset, the
transaction-level write concern defaults to the client-level write concern.

Transactions and Write Concern
• Write Concern Specification: a write concern can include the following fields:
{ w: <value>, j: <boolean>, wtimeout: <number> }
• The w option requests acknowledgment that the write operation has propagated to a specified
number of mongod instances or to mongod instances with specified tags.
• The j option requests acknowledgment that the write operation has been written to the
on-disk journal.
• The wtimeout option specifies a time limit to prevent write operations from blocking
indefinitely.
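
The three fields can be assembled and sanity-checked with a small helper before being passed as a transaction option. `buildWriteConcern` is a hypothetical name, and the validation rules are an illustrative assumption (wtimeout is taken to be in milliseconds); the server performs its own, authoritative validation.

```javascript
// Hypothetical helper that assembles a write concern document from the
// three fields described above and rejects obviously invalid values.
function buildWriteConcern({ w, j, wtimeout } = {}) {
  const wc = {};
  if (w !== undefined) {
    const validNumber = Number.isInteger(w) && w >= 0;
    const validString = typeof w === "string" && w.length > 0; // e.g. "majority"
    if (!validNumber && !validString) {
      throw new TypeError("w must be a non-negative integer or a non-empty string");
    }
    wc.w = w;
  }
  if (j !== undefined) {
    if (typeof j !== "boolean") throw new TypeError("j must be a boolean");
    wc.j = j;
  }
  if (wtimeout !== undefined) {
    if (!Number.isInteger(wtimeout) || wtimeout < 0) {
      throw new TypeError("wtimeout must be a non-negative integer (milliseconds)");
    }
    wc.wtimeout = wtimeout;
  }
  return wc;
}

// Wait for a majority of members, require journaling, give up after 5 seconds.
const majorityJournaled = buildWriteConcern({ w: "majority", j: true, wtimeout: 5000 });
console.log(majorityJournaled);
```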

Using a Transaction
• Here is a recap of the code that's used to complete a multi-document transaction:
const session = db.getMongo().startSession()

session.startTransaction()

const account = session.getDatabase('<add database name here>').getCollection('<add collection name here>')

// Add database operations like .updateOne() here

session.commitTransaction()

Aborting a Transaction
• Here is a recap of the code that's used to cancel a transaction before it completes:

const session = db.getMongo().startSession()

session.startTransaction()

const account = session.getDatabase('<add database name here>').getCollection('<add collection name here>')

// Add database operations like .updateOne() here

session.abortTransaction()
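
The commit and abort recaps can be combined into one pattern: commit if the work succeeds, abort if it throws. The wrapper below is a minimal sketch for mongosh; `withTransactionSketch` is a hypothetical name, and the database and collection names in the usage comment ("bank", "accounts") are placeholders rather than names from these slides.

```javascript
// Sketch for mongosh: run `work` inside a transaction, committing on success
// and aborting if the work throws. `session` is assumed to come from
// db.getMongo().startSession().
function withTransactionSketch(session, work) {
  session.startTransaction();
  let result;
  try {
    result = work(session);
  } catch (err) {
    session.abortTransaction(); // discard every change made in the transaction
    throw err;
  }
  session.commitTransaction(); // make all changes visible together
  return result;
}

// Intended use in the shell (placeholder names):
//   const session = db.getMongo().startSession();
//   withTransactionSketch(session, (s) => {
//     const accounts = s.getDatabase("bank").getCollection("accounts");
//     accounts.updateOne({ _id: "A" }, { $inc: { balance: -100 } });
//     accounts.updateOne({ _id: "B" }, { $inc: { balance: 100 } });
//   });
```

The sketch does not retry on transient errors, which a production wrapper would normally need to handle.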

Transactions in MongoDB
• MongoDB write operations are atomic at the document level (including embedded documents
within a document)
• Transactions across multiple documents can also be made atomic at the application level using
two-phase commits
• Two-phase commit in database management systems is a technique used to ensure atomicity and
consistency

Two phase commits
▪ Set up a collection called transactions
{target document, source document, value, state}
▪ Add a pendingTransactions=[ ] field to documents
▪ Create a new transaction with state=initial
▪ When the transaction starts, set state=pending
▪ Store the transaction id in pendingTransactions[ ]
▪ Apply the transaction to both documents
▪ Set state=applied
▪ Use find() to check that the documents are correct
▪ If so, set state=done
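
The sequence of states can be traced with an in-memory sketch in which plain objects stand in for the accounts and transactions collections. The function names are hypothetical, and the intermediate state is called "applied" to match the shell examples later in this deck; no database is involved.

```javascript
// In-memory sketch of the two-phase commit pattern; plain objects stand in
// for the accounts and transactions collections.
const accounts = {
  A: { balance: 1000, pendingTransactions: [] },
  B: { balance: 1000, pendingTransactions: [] },
};
const txn = { _id: 1, source: "A", destination: "B", value: 100, state: "initial" };

// Apply a balance change at most once: skip it if this transaction id is
// already recorded on the account (the same role as the $ne filter in the shell).
function applyOnce(account, txnId, delta) {
  if (account.pendingTransactions.includes(txnId)) return false;
  account.balance += delta;
  account.pendingTransactions.push(txnId);
  return true;
}

function twoPhaseCommit(accounts, txn) {
  const states = [txn.state];
  txn.state = "pending";
  states.push(txn.state);
  applyOnce(accounts[txn.source], txn._id, -txn.value);     // debit the source
  applyOnce(accounts[txn.destination], txn._id, txn.value); // credit the destination
  txn.state = "applied";
  states.push(txn.state);
  // Clean up: remove the transaction id from both accounts, then finish.
  for (const id of [txn.source, txn.destination]) {
    const acct = accounts[id];
    acct.pendingTransactions = acct.pendingTransactions.filter((t) => t !== txn._id);
  }
  txn.state = "done";
  states.push(txn.state);
  return states;
}

const visitedStates = twoPhaseCommit(accounts, txn);
console.log(visitedStates.join(" -> "), accounts);
```

Because every step is recorded in the transaction document, a process that crashes midway can inspect the state and the pendingTransactions arrays to decide whether to resume or roll back; the sketch above shows only the success path.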

Examples
• All of the examples in this document use the mongo shell to interact with the database.
• Assume that you have two collections:
• First, a collection named accounts that will store data about accounts with one account per
document,
• A collection named transactions which will store the transactions themselves.

Examples
• Add pendingTransactions field to accounts documents:

db.accounts.insert(
[
{_id: "A", balance: 1000, pendingTransactions: [ ]},
{_id: "B", balance: 1000, pendingTransactions: [ ]}
]
)

Examples
• Add document to transactions collection

db.transactions.insert(
{_id: 1, source: "A", destination: "B", value: 100, state: "initial"}
)

Examples
• Get the transaction from the collection

t = db.transactions.findOne({state: "initial"})

Examples
• Update the balances: debit the source, credit the destination, and record the transaction id in
each account's pendingTransactions array.
db.accounts.update(
{_id: t.source, pendingTransactions: {$ne: t._id}},
{$inc: {balance: -t.value}, $push: {pendingTransactions: t._id}}
)
db.accounts.update(
{_id: t.destination, pendingTransactions: {$ne: t._id}},
{$inc: {balance: t.value}, $push: {pendingTransactions: t._id}}
)
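
The balance updates sit between two state changes on the transaction document: initial to pending before the accounts are touched, and pending to applied once both updates are done. A minimal sketch of such a state change follows; `advanceState` is a hypothetical helper, and `db` is the shell's database object, passed in explicitly.

```javascript
// Sketch for the mongo shell: move a transaction document from one state to
// the next, but only if it is still in the expected state.
function advanceState(db, txnId, fromState, toState) {
  const result = db.transactions.updateOne(
    { _id: txnId, state: fromState },
    { $set: { state: toState }, $currentDate: { lastModified: true } }
  );
  // modifiedCount is 0 when the document was not in `fromState`.
  return result.modifiedCount === 1;
}

// Intended call order around the balance updates:
//   advanceState(db, t._id, "initial", "pending");  // before touching accounts
//   ...update both accounts...
//   advanceState(db, t._id, "pending", "applied");  // after both updates
```

Including the expected state in the filter means that two workers picking up the same transaction cannot both advance it.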

Examples
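• Set the transaction state to done and remove the transaction id from the pendingTransactions
array of each account document

// Matches only if the transaction was marked "applied" after both balance updates.
db.transactions.update(
{_id: t._id, state: "applied"},
{
$set: {state: "done"},
$currentDate: {lastModified: true}
}
)
db.accounts.update(
{_id: t.source, pendingTransactions: t._id},
{$pull: {pendingTransactions: t._id}}
)
db.accounts.update(
{_id: t.destination, pendingTransactions: t._id},
{$pull: {pendingTransactions: t._id}}
)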

CAP theorem
• States that a distributed data store can guarantee at most two of:
• Consistency
• Availability
• Partition Tolerance

Consistency
• In a distributed database, maintaining consistency means ensuring that every read gets the most
recent data and every write is durable
• Write inconsistency can occur if two versions of the database (each on a different machine) are
updated at the same time.
• Read inconsistency occurs if a read is made from one machine after another machine has been
updated but before the change has propagated

Eventual Consistency
• Replication consistency means that every read, no matter which replica it is made from, gives
the same answer
• Requires writes to propagate fully to every node before a read can take place: not always
necessary
• Eventual consistency allows some nodes to be a little "behind" others, but to catch up
eventually (in practice, quite quickly)

Eventual Consistency
• Examples:
• Facebook: it is not a problem if a friend in the UK can see a new photo of your cat while a
friend in America has to wait a few more seconds before it appears
• PayPal: needs to be sure that the balance it reads is correct, and that another node hasn't
spent the remaining money.

Read your Writes Consistency
• Imagine a blog database, distributed across several nodes
• If I write to one node and you read from another, you won’t see my post until it propagates to
your node – eventual consistency
• But, if I write to one node and then, due to load balancing, read from another – my post has
vanished

Sticky Sessions
• To ensure read your writes consistency, a session between the user and the node can be
maintained so that the entire interaction is consistent
• Can reduce the efficiency of load balancing
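
The vanished-post problem and the sticky-session fix can be reproduced with a toy model of two replicas behind a round-robin load balancer. Everything here is hypothetical (the `Cluster` class, its methods, and the user names); it models only routing and delayed replication, not a real database.

```javascript
// Toy model: two replicas; a write lands on one node and reaches the other
// only when replicate() is called.
class Cluster {
  constructor() {
    this.nodes = [new Map(), new Map()];
    this.sticky = new Map(); // user -> node index
    this.next = 0;           // round-robin pointer
  }
  pickNode(user, useSticky) {
    if (useSticky && this.sticky.has(user)) return this.sticky.get(user);
    const index = this.next;
    this.next = (this.next + 1) % this.nodes.length;
    if (useSticky) this.sticky.set(user, index);
    return index;
  }
  write(user, key, value, useSticky) {
    this.nodes[this.pickNode(user, useSticky)].set(key, value);
  }
  read(user, key, useSticky) {
    return this.nodes[this.pickNode(user, useSticky)].get(key);
  }
  replicate() {
    // Naive propagation: every node ends up with every key.
    const merged = new Map();
    for (const node of this.nodes) for (const [k, v] of node) merged.set(k, v);
    for (const node of this.nodes) for (const [k, v] of merged) node.set(k, v);
  }
}

// Without sticky sessions: the write goes to node 0, the next read to node 1.
const plain = new Cluster();
plain.write("me", "post", "hello", false);
const lostPost = plain.read("me", "post", false);

// With sticky sessions: my read is routed to the node that took my write.
const stickyCluster = new Cluster();
stickyCluster.write("me", "post", "hello", true);
const myRead = stickyCluster.read("me", "post", true);
const yourReadBefore = stickyCluster.read("you", "post", true); // other node, not yet replicated
stickyCluster.replicate();
const yourReadAfter = stickyCluster.read("you", "post", true);

console.log({ lostPost, myRead, yourReadBefore, yourReadAfter });
```

Pinning each user to one node is also what limits the load balancer: it can no longer send a request to whichever node is least busy.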

Availability
• One way to maintain consistency is to make sure updates are fully propagated or writes are
forced through a master node
• That means that a node might be reachable on the network, but still "unavailable" because it
either hasn't been updated or can't contact the master node
• So "available" really means able to respond

Read/Write Available
• In the case where writes need to go through a master node, but reads don’t, availability depends
on the request
• Read available
• Write unavailable

Example
• Hotel booking system
• Read from a slave (might be out of date)
• Write through the master
• If no rooms are available at write time, report that the room was lost
• If the master is not available, either report an error or write to a slave and deal with the
conflict later.
• Keeps reads (the most frequent query) fast by using slaves
• Keeps writes consistent by using the master

Partition Tolerance
• A network becomes partitioned when one or more links fail causing some machines to become
isolated from some others.
• If the master node is in one partition, then the slaves in the other partition can't reach it
• So those slaves become unavailable until the partition is repaired and they are updated

Without Partition Tolerance
• A database can be partition tolerant if it is happy to lose either consistency or availability as soon
as it is partitioned.
• It can stay consistent by making some nodes unavailable (CP)
• Or it can stay available but accept that it will become inconsistent (AP)
• While everything is working (no partitions), a database can be both consistent and available

Consistency Latency
• It takes some time (however small) to update all nodes in a network after a write.
• That latency is like a temporary partition
• So in a sense, you always have brief partitions
• So you can only really choose between consistency and availability

Really a Continuum
• In reality, the CAP qualities are not all or nothing options, but a continuum. You need to think
about:
• How much do I need consistency?
• How long are users prepared to wait for it?
• Can I get away with write consistency only?
• How can conflicts be solved later, and at what cost?

Read/Write Quorums
• Replication generally adds only two extra nodes, so there are three copies in total.
• Latency is not much of a problem because updates propagate fast
• Things can be sped up further by using a read or write quorum
• A write is acknowledged once two of the three nodes have it; a read then accesses two of the
three and picks the most recent value

Trade-off of Read/Write Quorum
• Write to 3, read from 1
• Write to 2, read from 2
• Write to 1, read from 3
• The “Write to” part means write that many and then acknowledge write as complete
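
All three combinations listed above share one property: with N = 3 copies, the number of nodes written (W) plus the number read (R) exceeds N, so every read overlaps the latest acknowledged write on at least one node. The rule W + R > N is standard quorum arithmetic rather than something stated on the slide, and `quorumsOverlap` is a hypothetical helper name.

```javascript
// With N replicas, a write acknowledged by W nodes and a read that consults
// R nodes must share at least one node whenever W + R > N.
function quorumsOverlap(n, w, r) {
  return w + r > n;
}

const N = 3;
const choices = [
  { w: 3, r: 1 }, // slow writes, fast reads
  { w: 2, r: 2 }, // balanced
  { w: 1, r: 3 }, // fast writes, slow reads
].map(({ w, r }) => ({ w, r, overlaps: quorumsOverlap(N, w, r) }));

// Counter-example: write to 1 and read from 1 can miss the latest write.
const tooWeak = quorumsOverlap(N, 1, 1);

console.log(choices, tooWeak);
```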

Durability
• Memory is much faster than disk, even SSD
• Running a DB in memory is desirable where speed is crucial
• Disk writes can happen at intervals or, for temporary stores, never
• A node crash then causes permanent loss of any data not yet written to disk
• The trade-off is worth it for things like web session data.

Concurrency in MongoDB transactions
• Transactions and Concurrency
• MongoDB allows multiple clients to read and write the same data.
• To ensure consistency, MongoDB uses locking and concurrency control to prevent clients
from modifying the same data simultaneously.
• Writes to a single document occur either in full or not at all, and clients always see consistent
data.

Concurrency in MongoDB transactions
• Locking is a mechanism used to manage concurrency in databases. MongoDB uses
multi-granularity locking, with different levels and modes of locking, to achieve this
• Locking levels: there are four different levels in MongoDB.
• Global (mongod instance): all databases in the instance are affected by the lock.
• Database: a database-level lock; only the database on which the lock is applied is affected.
• Collection: the lock is handled at the collection level.
• Document: document-level locking, where only that particular document is locked.

Concurrency in MongoDB transactions
• Locking modes
• S – Shared: The resource will be shared with the concurrent readers. This mode is used for
read operations.
• X – Exclusive: This lock mode does not allow concurrent readers to share the resource. This
mode is used for write operations.
• IS – Intent Shared: This lock mode indicates that the lock holder will read the resource at a
granular level.
• For example, if IS is applied to a database, it means that the lock holder intends to
apply a Shared (S) lock at the collection or document level.

Concurrency in MongoDB transactions
• Locking modes
• IX – Intent Exclusive: This lock mode indicates that the lock holder will modify the resource
at a granular level.
• For example, if IX is applied to a database, it means that the lock holder intends to apply an
Exclusive (X) lock at the collection or document level.
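
How the four modes interact can be written down as a compatibility matrix. The table below follows the standard rules of multi-granularity locking; `compatible` and `canGrant` are hypothetical names, and the sketch ignores queueing and fairness.

```javascript
// compatible[held][requested] is true when a lock in mode `requested` can be
// granted while another holder already has mode `held` on the same resource.
const compatible = {
  IS: { IS: true,  IX: true,  S: true,  X: false },
  IX: { IS: true,  IX: true,  S: false, X: false },
  S:  { IS: true,  IX: false, S: true,  X: false },
  X:  { IS: false, IX: false, S: false, X: false },
};

// A request is granted only if it is compatible with every lock already held.
function canGrant(heldModes, requested) {
  return heldModes.every((held) => compatible[held][requested]);
}

// Two writers to different documents can both hold IX on the same collection.
const twoWriters = canGrant(["IX"], "IX");
// A collection-wide shared lock blocks a writer that needs IX on the collection.
const writerBlocked = canGrant(["S"], "IX");

console.log(twoWriters, writerBlocked);
```

Intent locks are what allow many operations to work on different documents of the same collection concurrently while still letting a collection-wide S or X lock exclude them.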

Concurrency in MongoDB transactions
• Concurrency Control: allows multiple applications to run concurrently without causing data
inconsistency or conflicts.
• A findAndModify operation on a document is atomic: if the find condition matches a document,
the update is performed on that document.
• Concurrent queries and additional updates on that document are not affected until the current
update is complete.

Concurrency in MongoDB transactions
• Concurrency Control:
• Example: A collection with two documents:
db.myCollection.insertMany( [
{ _id: 0, a: 1, b: 1 }, { _id: 1, a: 1, b: 1 } ] )
• Two of the following findAndModify operations run concurrently:
db.myCollection.findAndModify( {
query: { a: 1 },
update: { $inc: { b: 1 }, $set: { a: 2 } }
})

Concurrency in MongoDB transactions
• Concurrency Control:
• Example (cont.)
• After the findAndModify operations are complete, it is guaranteed that a and b in both
documents are set to 2.
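
The reason for this guarantee can be traced with an in-memory model in which each operation finds and updates one document as a single indivisible step. `findAndModifySketch` is a hypothetical stand-in for the real command, and an array stands in for the collection.

```javascript
// In-memory model: each call atomically finds the first document matching
// the query and updates it before the next call runs.
const docs = [
  { _id: 0, a: 1, b: 1 },
  { _id: 1, a: 1, b: 1 },
];

function findAndModifySketch(collection) {
  const doc = collection.find((d) => d.a === 1); // query: { a: 1 }
  if (!doc) return null;
  const before = { ...doc };
  doc.b += 1; // $inc: { b: 1 }
  doc.a = 2;  // $set: { a: 2 }
  return before; // findAndModify returns the pre-update document by default
}

// The first call changes a to 2 on one document, so that document no longer
// matches { a: 1 } and the second call has to pick the other one.
const first = findAndModifySketch(docs);
const second = findAndModifySketch(docs);

console.log(first._id, second._id, docs);
```

Because the match and the update happen as one step, the two operations can never both update the same document, which is why neither document ends with b equal to 3.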
