Distributed Database [E18CN]

Term project

Overview
The project will be to build a distributed database, implementing some of the key
components / protocols presented in class. Students are free to select any database
management system to implement their databases. This project will be done in
groups of two or three students. Any change in group composition after the
groups have been formed will need my approval.
Completion of the project requires submission of different reports at various due
dates throughout the semester as well as the submission of a final report and system
demonstration.
The project
Phase  Description                                                                                                         Due date         Points
1      Group formation                                                                                                     01/9/2021        0
2      “Real-world scenario” description, design a distributed database system, set up local databases, and run applications  10/11/2021       10%
3      Concurrency, Commit Protocols. Final report                                                                         01/12/2021       20%
4      Demonstration of completed system                                                                                   Final exam date  60%

Phase 2:
Identify the “real-world scenario” that you are assigned to model in your database.
You need to describe how you want to build a decentralized database system for
your own scenario. The description must include all database constraints and
requirements. You also need to identify the applications that the database system
can support and all the information that needs to be collected.
The applications must include, but are not limited to, adding new data, modifying
data, deleting data, and querying data from multiple sites.
In this phase you have to design a complete database system for your own scenario.
Then, you need to decentralize your database and to set up at least three
communicating database servers. Each server must be capable of responding to
requests from a client (a transaction program), and processing those transactions.
The transactions may involve tables at multiple sites.
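One way a site could answer requests from clients and from other sites is sketched below. This is a minimal illustration, not a required design: the names `SiteHandler`, `start_site`, and `send_statement` are hypothetical, the line-based text protocol is an assumption, and the handler only echoes an acknowledgement where a real site would pass the statement to its local DBMS. The server runs in a background thread, so the main thread stays free for other work.

```python
import socket
import socketserver
import threading


class SiteHandler(socketserver.StreamRequestHandler):
    """Handles one connection: reads one statement per line and replies."""

    def handle(self):
        for raw in self.rfile:
            statement = raw.decode().strip()
            # Assumption: a real site would execute the statement on its
            # local DBMS here; this sketch only acknowledges it.
            self.wfile.write(f"ack {statement}\n".encode())


def start_site(host="127.0.0.1", port=0):
    """Start a site server in a background thread (port 0 = any free port)."""
    server = socketserver.ThreadingTCPServer((host, port), SiteHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_statement(address, statement):
    """Send one statement to a site and return its one-line reply."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall((statement + "\n").encode())
        return conn.makefile().readline().strip()
```

Starting three such servers on different ports and calling `send_statement` from each site's transaction program gives the three communicating sites that this phase asks for.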
The database in each site needs to be designed clearly and must consist of at least
three tables.
Transaction programs are collections of arbitrary commands that support the
applications described above.
Your database server can read statements either from the command line or over a
connection from a separate program. However, it must be able to read and process
requests from the other sites while processing a command-line transaction. You will
be asked to show that your system supports such concurrency by demonstrating a
non-serializable set of transactions: the final state should not be equivalent to that of
any serial execution, and you should demonstrate this. You will implement
concurrency control to prevent this later.
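A non-serializable outcome of the kind described above can be illustrated with the classic lost-update interleaving. The sketch below is a toy simulation, not a distributed system: the single `balance` value, the transactions `T1` and `T2`, and their increments are all made up for illustration. It compares the final state of an interleaved schedule with the final states of both serial orders.

```python
def run(schedule, initial=100):
    """Execute a list of (transaction, operation) steps on one shared value."""
    db = {"balance": initial}
    local = {}                      # value each transaction read earlier
    deltas = {"T1": 10, "T2": 20}   # hypothetical increments
    for txn, op in schedule:
        if op == "read":
            local[txn] = db["balance"]
        elif op == "write":
            # The write is based on the (possibly stale) value read earlier.
            db["balance"] = local[txn] + deltas[txn]
    return db["balance"]


serial_t1_t2 = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]
serial_t2_t1 = [("T2", "read"), ("T2", "write"), ("T1", "read"), ("T1", "write")]
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

serial_results = {run(serial_t1_t2), run(serial_t2_t1)}
interleaved_result = run(interleaved)

print("serial outcomes:", serial_results)          # both serial orders give 130
print("interleaved outcome:", interleaved_result)  # 120: T1's update is lost
print("serializable?", interleaved_result in serial_results)
```

Because the interleaved result matches neither serial order, the schedule is not serializable. A demonstration on the real system would show the same effect with transactions that read and write rows at different sites.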
In this phase, you will need to submit a report presenting all the work you have
done. The report must be written in English.
Phase 3: Concurrency, Commit Protocols
For this phase of the project, team members will take on different roles. One team
member will implement concurrency control, while the others will implement a
commit protocol. You can use either of your code bases as the starting point. You will
need to come to some agreement on how you will make use of the underlying
transaction mechanisms, as what makes it easy for one task could make it hard for
the other. You may choose to use multiple transactions running at each site, but you
may not use transaction naming to get the underlying DBMS to manage transactions
across sites.
Concurrency Control
You can implement whatever concurrency control mechanism you think best:
two-phase locking, timestamp ordering, shadow pages, or whatever you choose. You
should support at least row-level concurrency. You can augment the database itself
(e.g., put timestamps on rows) if that makes it easier, as long as such additional
information is stripped before data is presented (this can easily be done as part of
queries). You must support at least the following:
1. Demonstrate that two transactions entered in a non-serializable order will
somehow be delayed, aborted, or otherwise managed so the outcome is
equivalent to some serial ordering.
2. Demonstrate that two transactions operating on the same tables, but
different rows, can execute concurrently.
3. Demonstrate that you handle read/write conflicts as well as write/write
conflicts.
4. Prove that your scheme will not block indefinitely. That is, if a deadlock occurs,
you must detect and resolve it.
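As one possible starting point for requirements 2 and 4, the sketch below shows row-level locking with deadlock detection on a wait-for graph. It is a simplified illustration under stated assumptions: the class `LockManager` and its methods are hypothetical, only exclusive locks are modeled (a full solution would add shared locks to handle read/write conflicts), the victim is always the requesting transaction, and everything runs within a single process rather than across sites.

```python
from collections import defaultdict


class LockManager:
    """Exclusive row-level locks with wait-for-graph deadlock detection."""

    def __init__(self):
        self.owner = {}                    # row key -> transaction holding it
        self.waits_for = defaultdict(set)  # txn -> transactions it waits on

    def acquire(self, txn, row):
        """Return 'granted', 'wait', or 'abort' (txn is the deadlock victim)."""
        holder = self.owner.get(row)
        if holder is None or holder == txn:
            self.owner[row] = txn
            return "granted"
        self.waits_for[txn].add(holder)
        if self._reaches(holder, txn):
            # Waiting would close a cycle, so abort the requester instead.
            self.release_all(txn)
            return "abort"
        return "wait"

    def _reaches(self, start, target):
        """Depth-first search: is there a wait-for path from start to target?"""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.waits_for[node])
        return False

    def release_all(self, txn):
        """Drop every lock and wait edge of txn (on commit or abort)."""
        for row in [r for r, t in self.owner.items() if t == txn]:
            del self.owner[row]
        self.waits_for.pop(txn, None)
        for waiting in self.waits_for.values():
            waiting.discard(txn)
```

In this sketch, two transactions that lock different rows of the same table are both granted immediately. If `T1` then waits for a row held by `T2` and `T2` requests a row held by `T1`, the second request returns `'abort'`, `T2`'s locks are released, and `T1` can retry and proceed.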
You may find that there are things that you don't handle well. Provided you
demonstrate the above, it is acceptable if your concurrency control cannot handle
some cases, as long as you are aware of (and document) the conditions under
which it fails.
Distributed Failure/Recovery
This part of the project requires that you implement a distributed commit protocol,
and demonstrate that you get correct results in spite of failure. You must support
failure of any single site (including the one where the transaction is being entered).
Any transactions running at the time must either abort or complete (including at the
failed site when it recovers.) You may use the recovery mechanisms of the underlying
databases, provided that this is independent for the separate tables (as with
concurrency control above.) You need not crash the underlying DBMS - this can be
simulated by having it abort any uncommitted transaction when a client (your server)
crashes. You must demonstrate:
1. Correct recovery after a commit at all sites that have updates.
2. Correct recovery before a commit: transaction must be aborted at all sites.
3. At least some situations where failure is non-blocking (remaining sites can run
transactions that do not involve the failed sites).
Your system does not need to be non-blocking in all cases, but you should document
failures that can result in blocking or conditions that can result in an inconsistent
database (hopefully you will have none of the above.)
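The assignment does not prescribe a particular commit protocol. The sketch below assumes two-phase commit with an abort-by-default recovery rule, purely as an illustration. The classes `Participant` and `Coordinator`, the in-memory `log` lists (standing in for durable logs), and the `crash_before_decision` flag are all hypothetical, and messages are modeled as plain method calls within one process.

```python
class Participant:
    """One site taking part in a distributed transaction."""

    def __init__(self, name, vote_yes=True):
        self.name = name
        self.vote_yes = vote_yes
        self.log = []        # stands in for a durable log on disk
        self.state = "init"

    def prepare(self, txn_id):
        """Phase 1: vote, logging the vote before replying."""
        if self.vote_yes:
            self.log.append(("prepared", txn_id))
            self.state = "prepared"
            return True
        self.log.append(("abort", txn_id))
        self.state = "aborted"
        return False

    def finish(self, txn_id, decision):
        """Phase 2: apply the coordinator's decision."""
        self.log.append((decision, txn_id))
        self.state = "committed" if decision == "commit" else "aborted"


class Coordinator:
    def __init__(self, participants):
        self.participants = participants
        self.log = []

    def run(self, txn_id, crash_before_decision=False):
        votes = [p.prepare(txn_id) for p in self.participants]
        if crash_before_decision:
            return None  # simulated crash: no decision was logged
        decision = "commit" if all(votes) else "abort"
        self.log.append((decision, txn_id))  # log the decision before sending
        for p in self.participants:
            p.finish(txn_id, decision)
        return decision

    def recover(self, txn_id):
        """After restart: resend a logged decision, otherwise abort."""
        logged = [d for d, t in self.log if t == txn_id]
        decision = logged[0] if logged else "abort"
        if not logged:
            self.log.append((decision, txn_id))
        for p in self.participants:
            if p.state == "prepared":
                p.finish(txn_id, decision)
        return decision
```

In the simulated crash, every participant stays in the `prepared` state until `recover` runs. This is the blocking window of two-phase commit, and it is the kind of condition the documentation requirement above asks you to describe.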
Final report:
You need to complete a written report describing the details of this project. It will be
a compilation of the report you submitted in Phase 2 and the details of your
work in Phase 3. The report must include descriptions of additional features and a
“how-to-use” manual for your distributed database.
The report must be written in English.
Phase 4:
During this phase your group will demonstrate your system online.
Restrictions:
1. Every member of the group must be present at the demonstration.
2. Every member of the group must present some part of the project.
3. Presentation time will vary depending on your workload.
General requirements
Each group will assign one member to send each report by its due date to
my email, dinhhoa@gmail.com. All email titles must include [E18CN1] or [E18CN2]
and your group number, for example: [E18CN1] group 4 – report phase 2.
All reports must be prepared in English and submitted as one single file each
time (MS Word or PDF only). No other file types will be accepted.
Reports must be submitted by 23:59 on the due date.
Each report will be graded on a 10-point scale. All members of a group are
expected to receive the same grade. However, if members contribute unequally to
the project, the grades will be adjusted accordingly.
