
Table of Contents

Database, DBMS and RDBMS
ACID
Normalization and Denormalization
Improving Query Performance
SQL vs NoSQL
Simple Query Comparison – MySQL vs NoSQL
Database, DBMS and RDBMS

Q: What are a database, a DBMS, and an RDBMS?


A: Below are the definitions of each.

Database
A database is a structured collection of data that is stored and managed to allow easy access,
retrieval, and use of the data. Databases are used to store information in an organized manner,
typically involving the use of tables, schemas, and other structured formats. The data in a
database can be anything from personal information, product details, transaction records, and
more.

DBMS (Database Management System)


A Database Management System (DBMS) is software that interacts with end-users,
applications, and the database itself to capture and analyze data. A DBMS allows users to create,
read, update, and delete data in a database. It provides tools to manage data, enforce data
integrity, and ensure data security. DBMSs support data manipulation and retrieval through a
query language such as SQL (Structured Query Language).
Key functions of a DBMS (a small SQL sketch follows the list):
1. Data Definition: Defines the structure of the stored data (schemas, tables, fields).
2. Data Manipulation: Supports data operations like insertion, updating, deletion, and
retrieval.
3. Data Security and Integrity: Ensures data accuracy and consistency, and restricts
unauthorized access.
4. Data Backup and Recovery: Provides mechanisms to backup data and recover it in case
of failure.
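
As a minimal illustration (the employees table and its columns here are hypothetical), the four
basic data-manipulation operations look like this in SQL:

CREATE TABLE employees (id INT PRIMARY KEY, name VARCHAR(100), salary DECIMAL(10, 2));

-- Create: insert a new record
INSERT INTO employees (id, name, salary) VALUES (1, 'Alice', 50000.00);

-- Read: retrieve records
SELECT name, salary FROM employees WHERE id = 1;

-- Update: modify an existing record
UPDATE employees SET salary = 55000.00 WHERE id = 1;

-- Delete: remove a record
DELETE FROM employees WHERE id = 1;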

RDBMS (Relational Database Management System)


A Relational Database Management System (RDBMS) is a type of DBMS that is based on the
relational model introduced by E. F. Codd. In an RDBMS, data is organized into tables (also
known as relations) which can be linked—or related—based on data common to each. This
model uses primary keys and foreign keys to establish and manage the relationships between
tables.
Characteristics of an RDBMS:
1. Table-based Structure: Data is stored in tables with rows and columns.
2. Relational Integrity: Enforces relationships between tables through foreign keys.
3. SQL Usage: Uses Structured Query Language (SQL) for database interaction.
4. ACID Properties: Ensures transactions are processed reliably (Atomicity, Consistency,
Isolation, Durability).
5. Data Normalization: Organizes data to minimize redundancy and dependency.
Examples of RDBMSs include:
 MySQL
 PostgreSQL
 Oracle Database
 Microsoft SQL Server
 SQLite
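
A minimal sketch of the table-based structure and the primary key / foreign key relationship
described above, using hypothetical customers and orders tables in MySQL-style SQL:

CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    name        VARCHAR(100) NOT NULL
);

CREATE TABLE orders (
    order_id    INT PRIMARY KEY,
    customer_id INT NOT NULL,
    order_date  DATE,
    -- the foreign key relates each order to exactly one customer
    FOREIGN KEY (customer_id) REFERENCES customers (customer_id)
);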

Comparison Summary
1. Database:
 A collection of structured data.
2. DBMS:
 Software for managing databases.
 Provides CRUD (Create, Read, Update, Delete) operations.
 Ensures data security, integrity, and backup.
3. RDBMS:
 A type of DBMS based on the relational model.
 Organizes data in tables with relationships.
 Uses SQL for querying and managing data.
 Enforces ACID properties for reliable transaction management.

ACID

Q: What is ACID in databases?


A: ACID is an acronym that stands for Atomicity, Consistency, Isolation, and Durability.
These are a set of properties that ensure reliable processing of database transactions. ACID
properties are crucial for maintaining the integrity and reliability of a database, especially
in environments where multiple transactions are occurring concurrently. Here’s a detailed
explanation of each property:

1. Atomicity
Atomicity ensures that each transaction is treated as a single, indivisible unit of work. This
means that all the operations within a transaction must either complete successfully or fail
entirely. If any part of the transaction fails, the entire transaction is rolled back, and the database
is left in its original state.

Example: If a transaction involves transferring money from one bank account to another, the
deduction from the first account and the addition to the second account must both occur. If one
operation fails, neither should be applied.
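
A sketch of that transfer as one transaction, assuming a hypothetical accounts table: either both
UPDATE statements take effect at COMMIT, or a ROLLBACK leaves both balances untouched.

START TRANSACTION;

UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;  -- deduction
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;  -- addition

-- If both statements succeeded, make the changes permanent:
COMMIT;
-- If anything failed along the way, undo every change instead:
-- ROLLBACK;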

2. Consistency
Consistency ensures that a transaction brings the database from one valid state to another valid
state, maintaining database rules such as constraints, cascades, and triggers. A transaction must
transform the database from one consistent state to another consistent state.

Example: If a database has a rule that no account balance can be negative, a transaction should
ensure that after it completes, all account balances are still non-negative.
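
One way to express such a rule is a CHECK constraint, sketched below with a hypothetical
accounts table (MySQL enforces CHECK constraints from version 8.0.16 onward):

CREATE TABLE accounts (
    account_id INT PRIMARY KEY,
    balance    DECIMAL(10, 2) NOT NULL,
    CHECK (balance >= 0)  -- the database rejects any change that violates this rule
);

-- This statement fails, and its transaction can be rolled back,
-- if it would drive the balance below zero:
UPDATE accounts SET balance = balance - 500 WHERE account_id = 1;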

3. Isolation
Isolation ensures that the operations of one transaction are isolated from the operations of other
transactions. This means that concurrently executing transactions cannot interfere with each
other. The effects of an incomplete transaction should not be visible to other transactions.

Example: If two transactions are occurring simultaneously, one withdrawing money from an
account and the other checking the balance, the transaction checking the balance should either
see the balance before withdrawal or after, but not during the withdrawal.
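
The degree of isolation is usually configurable per transaction. A small sketch in MySQL syntax,
where the balance-checking session asks for a level that hides other sessions' uncommitted
changes (REPEATABLE READ is also MySQL's default level):

-- Session B: read the balance without seeing Session A's uncommitted withdrawal
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
SELECT balance FROM accounts WHERE account_id = 1;
COMMIT;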

4. Durability
Durability ensures that once a transaction has been committed, it will remain so, even in the
event of a system failure (e.g., power loss, crash). This means that committed data is written to
non-volatile storage.

Example: After a transaction confirming a purchase completes and commits, the purchase details
remain recorded even if the system crashes immediately after the commit.

Summary
The ACID properties guarantee that database transactions are processed reliably and ensure the
integrity of data, even in multi-user environments and systems with potential failures. Here’s a
summary of each property:

 Atomicity: All-or-nothing principle.


 Consistency: Valid state transformations.
 Isolation: Independent transaction execution.
 Durability: Permanent transaction effects.

Normalization and Denormalization

Q: What are normalization and denormalization in databases?


A: Below are the definitions of each.

Normalization
Normalization is the process of organizing the data in a database to minimize redundancy and
improve data integrity. The primary goal of normalization is to divide large tables into smaller,
related tables and define relationships between them to eliminate redundancy and ensure data
dependencies make sense.
Normalization involves applying a series of rules called normal forms (NF), each building upon
the previous one to progressively reduce data redundancy and dependency. Here are the
commonly used normal forms (a small schema sketch follows the list):
1. First Normal Form (1NF):
 Ensures that the table has a primary key.
 All columns contain atomic (indivisible) values.
 Each column contains values of a single type.
2. Second Normal Form (2NF):
 Meets all requirements of 1NF.
 All non-key attributes are fully functionally dependent on the primary key
(eliminates partial dependencies).
3. Third Normal Form (3NF):
 Meets all requirements of 2NF.
 There are no transitive dependencies (non-key attributes are not dependent on
other non-key attributes).
4. Boyce-Codd Normal Form (BCNF):
 A stricter version of 3NF.
 Every determinant is a candidate key.
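
As a small illustration with hypothetical tables: a single wide table that repeats department
details on every employee row violates 3NF (the department name depends on the department, not
on the employee), so normalization splits it into two related tables.

-- Before: employees(employee_id, name, department_id, department_name, department_location)
-- department_name and department_location depend on department_id: a transitive dependency.

-- After normalizing to 3NF, department details are stored exactly once:
CREATE TABLE departments (
    department_id       INT PRIMARY KEY,
    department_name     VARCHAR(100),
    department_location VARCHAR(100)
);

CREATE TABLE employees (
    employee_id   INT PRIMARY KEY,
    name          VARCHAR(100),
    department_id INT,
    FOREIGN KEY (department_id) REFERENCES departments (department_id)
);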

Advantages of Normalization:
 Reduces Data Redundancy: Eliminates duplicate data.
 Improves Data Integrity: Ensures accuracy and consistency of data.
 Optimizes Queries: Reduces the complexity of queries.
 Simplifies Maintenance: Easier to maintain and update data.

Denormalization
Denormalization is the process of combining normalized tables into larger tables to improve
database read performance. While normalization optimizes for data integrity and reduction of
redundancy, denormalization aims to optimize query performance by reducing the number of
joins required to retrieve data.
Denormalization involves intentionally introducing redundancy by merging tables and
sometimes duplicating data. It is typically used in read-heavy databases where query
performance is more critical than update performance.
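
Continuing the hypothetical example from the normalization section, a denormalized,
read-optimized version copies the frequently queried department name back onto the employee
rows so that reports no longer need a join:

-- Denormalized: department_name is duplicated on each employee row.
-- Reads avoid a join, but renaming a department now means updating many rows.
CREATE TABLE employees_denormalized (
    employee_id     INT PRIMARY KEY,
    name            VARCHAR(100),
    department_id   INT,
    department_name VARCHAR(100)  -- intentionally redundant copy
);

SELECT name, department_name
FROM employees_denormalized
WHERE department_name = 'Engineering';  -- no join required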

Advantages of Denormalization:
 Improved Read Performance: Reduces the number of joins and can speed up complex
queries.
 Simplified Queries: Makes querying data more straightforward and faster.
 Better Reporting Performance: Often used in data warehouses where read operations
are frequent and performance is critical.

Disadvantages of Denormalization:
 Increased Redundancy: Can lead to duplicate data and increased storage requirements.
 Data Inconsistency: Higher risk of data anomalies and integrity issues due to
redundancy.
 Complexity in Updates: Updating data can become more complex and error-prone.

Summary
Normalization and denormalization are techniques used to optimize database design
depending on the specific requirements and use cases:
 Normalization: Focuses on reducing redundancy and improving data integrity by
organizing data into related tables.
 Denormalization: Focuses on improving read performance by combining tables and
introducing controlled redundancy.
Choosing between normalization and denormalization involves balancing the need for data
integrity against the need for query performance and speed in database operations.

Improving Query Performance

Q: Mention some techniques to improve slow queries in a database.


A: Below are some common techniques for dealing with slow queries (a short indexing sketch
follows the list):

 Index: add indexes on the columns used in WHERE, JOIN, and ORDER BY clauses
 Sharding: split the data horizontally across multiple servers
 Sub-clause to join: rewrite subqueries as joins, which the optimizer can often execute more
efficiently
 Add column: add a derived or summary column (for example, a pre-computed total) so queries
avoid repeating expensive work
 Cut data: reduce the volume of data scanned, for example by archiving or pruning old rows
 De-normalize: duplicate some data to cut down the number of joins a query needs
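
A minimal sketch of the first technique, assuming a hypothetical orders table: add an index on
the filtered column and use EXPLAIN to confirm that the query now uses it.

-- Inspect how MySQL executes the slow query
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Add an index on the column used in the WHERE clause
CREATE INDEX idx_orders_customer_id ON orders (customer_id);

-- Re-run EXPLAIN: the plan should now show an index lookup instead of a full table scan
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;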

SQL vs NoSQL

Q: Explain the differences between a SQL database like MySQL and a NoSQL database like MongoDB.
A: SQL (Structured Query Language) and NoSQL (Not Only SQL) are two distinct types
of database management systems, each with its own set of features, advantages, and use
cases. Here's a detailed comparison between SQL databases like MySQL and NoSQL
databases like MongoDB:

Data Model
MySQL (SQL):
 Relational Model: Data is stored in tables with rows and columns.
 Schema: Requires a predefined schema. The structure of the data (tables, columns, data
types) must be defined before adding data.
 Relationships: Supports complex joins and relationships between tables (one-to-one,
one-to-many, many-to-many).
MongoDB (NoSQL):
 Document Model: Data is stored in JSON-like documents (BSON in MongoDB).
 Schema-less: Flexible schema design. Documents in a collection do not need to have the
same set of fields, and data types can vary for different fields.
 Embedded Documents: Supports nesting of documents within documents, which can
simplify the representation of hierarchical data.
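
For instance, with a hypothetical users-and-addresses model, the relational side must declare
its structure before any row can be inserted:

-- MySQL: the schema has to exist before the data
CREATE TABLE users (
    user_id INT PRIMARY KEY,
    name    VARCHAR(100) NOT NULL,
    email   VARCHAR(255)
);

CREATE TABLE addresses (
    address_id INT PRIMARY KEY,
    user_id    INT NOT NULL,
    city       VARCHAR(100),
    FOREIGN KEY (user_id) REFERENCES users (user_id)  -- one-to-many relationship
);

In MongoDB, the same user and address data would typically live in a single users collection,
with each address embedded as a sub-document, and individual documents can gain new fields at
any time without a schema change.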

Query Language
MySQL:
 SQL: Uses Structured Query Language (SQL) for defining and manipulating data. SQL
is standardized and widely used.
 ACID Compliance: Ensures transactions are Atomic, Consistent, Isolated, and Durable,
which is critical for applications requiring strong consistency and reliability.
MongoDB:
 Query Language: Uses a rich, expressive query language based on JSON. While not
standardized like SQL, it is powerful and flexible.
 BASE: Typically follows the BASE (Basically Available, Soft state, Eventually
consistent) model, which allows for more flexible consistency models and can provide
better performance and scalability for certain use cases.

Scalability
MySQL:
 Vertical Scaling: Traditionally scaled by increasing the resources (CPU, RAM, storage)
of a single server.
 Replication: Supports master-slave replication for read scaling and high availability.
 Sharding: Possible but more complex to implement compared to NoSQL solutions.
MongoDB:
 Horizontal Scaling: Designed for horizontal scaling through sharding, which distributes
data across multiple servers.
 Replication: Built-in support for replica sets, providing high availability and automatic
failover.

Use Cases
MySQL:
 Structured Data: Best suited for applications requiring structured data with
relationships, such as financial systems, ERP, and CRM systems.
 Transactional Applications: Ideal for applications requiring complex transactions and
strong consistency.
MongoDB:
 Unstructured Data: Suitable for applications with unstructured or semi-structured data,
such as content management systems, social networks, and IoT applications.
 Flexible Schema: Ideal for applications that require rapid development and frequent
changes to the data model.

Performance
MySQL:
 Optimized for Reads and Joins: Performs well with complex queries and joins.
 Indexing: Uses traditional indexing methods which can be very efficient for certain types
of queries.
MongoDB:
 Optimized for Write-heavy Operations: Can handle large volumes of writes and is
well-suited for real-time analytics and big data applications.
 Indexing: Supports various indexing techniques, including compound indexes,
geospatial indexes, and full-text search.
Conclusion
Choosing between MySQL and MongoDB depends on the specific requirements of the
application:
 MySQL is ideal for applications requiring structured data, complex queries, and strong
transactional support.
 MongoDB is preferred for applications needing flexible schemas, high scalability, and
the ability to handle large volumes of unstructured data.
Understanding the strengths and weaknesses of each database type will help you make an
informed decision based on your project's needs.

Simple Query Comparison – MySQL vs NoSQL

Q: How do MySQL and MongoDB compare when they execute a simple query?
A: To understand the performance differences between MySQL and MongoDB for these simple
queries, it's essential to explore how each database handles them internally.

MySQL
1. Simple Select All
When you execute a SELECT * FROM table in MySQL:
1. Query Parsing: MySQL parses the SQL query to understand what data is being
requested.
2. Optimization: The query optimizer determines the best way to execute the query. For a
simple select all, this step is straightforward.
3. Execution: MySQL reads the rows from the table. The data is usually stored in a row-
oriented format, meaning each row is read in its entirety.
MySQL performs well in select-all operations for small to medium-sized datasets. However, for
very large datasets, reading all rows can be resource-intensive.
2. Select by ID
When you execute SELECT * FROM table WHERE id = ? in MySQL:
1. Query Parsing: As with any query, MySQL first parses the query.
2. Optimization: The query optimizer looks at available indexes. Since id is typically
indexed (often as a primary key), the optimizer chooses an index scan.
3. Execution: Using the index, MySQL quickly locates the specific row or rows matching
the ID. This operation is very efficient due to the indexed search (see the sketch below).
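
A small illustration of step 2, using a hypothetical orders table whose id column is the primary
key: EXPLAIN shows whether the primary-key index is used for the lookup.

-- For a primary-key equality lookup the plan typically reports access type "const"
-- and key "PRIMARY", i.e. a single indexed row read rather than a full table scan.
EXPLAIN SELECT * FROM orders WHERE id = 42;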

MongoDB
1. Simple Select All
When you execute db.collection.find({}) in MongoDB:
1. Query Parsing: MongoDB parses the query, understanding that all documents are
requested.
2. Execution: MongoDB reads all documents in the collection. The data is stored in a
BSON (Binary JSON) format. MongoDB can quickly return all documents, but like
MySQL, this can be resource-intensive for very large collections.
MongoDB can handle large datasets efficiently, but the performance is similar to MySQL in
terms of reading all documents.

2. Select by ID
When you execute db.collection.find({ _id: ObjectId("...") }) in MongoDB:
1. Query Parsing: MongoDB parses the query.
2. Execution: MongoDB uses the _id index, which is created by default for the _id field.
The query directly accesses the specific document matching the _id.
MongoDB's ability to quickly retrieve documents by _id is highly efficient, similar to MySQL's
indexed queries.

Performance Comparison
1. Simple Select All
 MySQL: Performs well for small to medium datasets but can be resource-intensive for
large datasets due to its row-oriented storage.
 MongoDB: Handles large datasets efficiently but faces similar challenges as MySQL
when dealing with very large collections.
2. Select by ID
 MySQL: Very efficient due to indexing, particularly if the id is a primary key.
 MongoDB: Highly efficient due to the default index on _id.

Summary
 Select All: Performance is generally comparable between MySQL and MongoDB for
small to medium-sized datasets. For large datasets, both can face performance issues, but
MongoDB's document-oriented nature might provide a slight edge in flexibility and
scalability.
 Select by ID: Both MySQL and MongoDB perform exceptionally well due to indexing
mechanisms. The difference in performance is minimal, but MongoDB's design around
the _id field as an index by default might make it marginally faster for retrieval by ID in
some cases.

Ultimately, the choice between MySQL and MongoDB depends on other factors such as the
nature of your data, scalability requirements, and specific use cases rather than just query
performance for select operations.
