RDBMS Lab Manual

MADHYANCHAL PROFESSIONAL UNIVERSITY

ME DEPT. RDBMS LAB MANUAL


Experiment No. 1
Normalization is a database design technique used to reduce data redundancy and improve data
integrity. It involves organizing the attributes of a database in such a way that it minimizes duplication
and dependency, ensuring that the data is stored efficiently. Normalization typically involves dividing
large tables into smaller, more manageable ones, and defining relationships between them.
### Forms of Normalization
Normalization is performed through several stages known as normal forms, each of which addresses a
specific type of redundancy or anomaly. The most commonly used normal forms are the first normal
form (1NF), second normal form (2NF), and third normal form (3NF). Additionally, there are higher
normal forms like the Boyce-Codd normal form (BCNF) and beyond.
#### First Normal Form (1NF)
A table is in the first normal form if:
- It contains only atomic (indivisible) values.
- Each column contains values of a single type.
- Each column has a unique name, each row is unique, and there are no repeating groups or arrays.

**Example:**
Consider a table with student records that includes courses taken by students:
| StudentID | StudentName | Courses |
|-----------|-------------|----------------|
|1 | Alice | Math, English |
|2 | Bob | Science, Math |

To convert this table to 1NF, we need to ensure each cell contains only a single value:
| StudentID | StudentName | Course |
|-----------|-------------|----------|
|1 | Alice | Math |
|1 | Alice | English |
|2 | Bob | Science |
|2 | Bob | Math |
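
The 1NF table above can be sketched in SQL as follows. The table name `StudentCourseRows` and the column types are assumptions made for illustration; they are not part of the original example.

```sql
-- Hypothetical table name and column types, chosen for illustration only.
CREATE TABLE StudentCourseRows (
    StudentID   INT,
    StudentName VARCHAR(50),
    Course      VARCHAR(50),
    PRIMARY KEY (StudentID, Course)  -- one row per student/course pair
);

INSERT INTO StudentCourseRows (StudentID, StudentName, Course) VALUES
    (1, 'Alice', 'Math'),
    (1, 'Alice', 'English'),
    (2, 'Bob',   'Science'),
    (2, 'Bob',   'Math');

-- Each course is now a separate row, so it can be filtered directly
-- instead of searching inside a comma-separated string.
SELECT StudentName FROM StudentCourseRows WHERE Course = 'Math';
```

The final query returns Alice and Bob, which would have required string matching against the original `Courses` column.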

#### Second Normal Form (2NF)


A table is in the second normal form if:
- It is already in 1NF.
- All non-key attributes are fully functionally dependent on the primary key. This means that each
non-key attribute must depend on the whole primary key and not just part of it (no partial dependency).

**Example:**
Consider a table with the following columns:
| StudentID | Course | Instructor |
|-----------|------------|--------------|
|1 | Math | Dr. Smith |
|1 | English | Dr. Brown |
|2 | Science | Dr. White |
|2 | Math | Dr. Smith |

The primary key of this table is the composite `(StudentID, Course)`. `Instructor` depends only on
`Course`, which is just one part of that key, so this is a partial dependency. To convert the table to
2NF, we split it into two tables:

**Students Table:**
| StudentID | Course |
|-----------|----------|
|1 | Math |
|1 | English |
|2 | Science |
|2 | Math |

**Courses Table:**
| Course | Instructor |
|----------|------------|
| Math | Dr. Smith |
| English | Dr. Brown |
| Science | Dr. White |
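
As a minimal sketch, the decomposition above can be written in SQL and checked by joining the two tables back together. The table names `StudentCourse` and `CourseInstructor` and the column types are assumptions for illustration, chosen to avoid clashing with tables used in later experiments.

```sql
-- Hypothetical names and types; the data matches the example tables above.
CREATE TABLE CourseInstructor (
    Course     VARCHAR(50) PRIMARY KEY,
    Instructor VARCHAR(50)
);

CREATE TABLE StudentCourse (
    StudentID INT,
    Course    VARCHAR(50),
    PRIMARY KEY (StudentID, Course),
    FOREIGN KEY (Course) REFERENCES CourseInstructor(Course)
);

INSERT INTO CourseInstructor VALUES
    ('Math', 'Dr. Smith'), ('English', 'Dr. Brown'), ('Science', 'Dr. White');

INSERT INTO StudentCourse VALUES
    (1, 'Math'), (1, 'English'), (2, 'Science'), (2, 'Math');

-- Joining on Course recovers the original three-column table,
-- so no information was lost by splitting it.
SELECT sc.StudentID, sc.Course, ci.Instructor
FROM StudentCourse sc
JOIN CourseInstructor ci ON ci.Course = sc.Course
ORDER BY sc.StudentID, sc.Course;
```

Each instructor name is now stored once per course rather than once per enrollment.
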
#### Third Normal Form (3NF)
A table is in the third normal form if:
- It is already in 2NF.
- There are no transitive dependencies. This means that non-key attributes should not depend on other
non-key attributes.

**Example:**
Consider the Students and Courses tables from 2NF. Suppose the instructor’s department is also
included:

| Course | Instructor | Department |
|----------|------------|------------|
| Math | Dr. Smith | Math Dept |
| English | Dr. Brown | Eng Dept |
| Science | Dr. White | Sci Dept |

Here, `Course` is the key, `Instructor` depends on `Course`, and `Department` depends on `Instructor`,
so `Department` depends on the key only transitively. To achieve 3NF, we should split this further:

**Courses Table:**
| Course | Instructor |
|----------|------------|
| Math | Dr. Smith |
| English | Dr. Brown |
| Science | Dr. White |

**Instructors Table:**
| Instructor | Department |
|------------|------------|
| Dr. Smith | Math Dept |
| Dr. Brown | Eng Dept |
| Dr. White | Sci Dept |
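
The benefit of the 3NF split can be sketched in SQL: a department change becomes a single-row update. The table names `Instructors3NF` and `Courses3NF`, the column types, and the new department value are assumptions for illustration.

```sql
-- Hypothetical names and types; the data matches the example tables above.
CREATE TABLE Instructors3NF (
    Instructor VARCHAR(50) PRIMARY KEY,
    Department VARCHAR(50)
);

CREATE TABLE Courses3NF (
    Course     VARCHAR(50) PRIMARY KEY,
    Instructor VARCHAR(50),
    FOREIGN KEY (Instructor) REFERENCES Instructors3NF(Instructor)
);

INSERT INTO Instructors3NF VALUES
    ('Dr. Smith', 'Math Dept'), ('Dr. Brown', 'Eng Dept'), ('Dr. White', 'Sci Dept');

INSERT INTO Courses3NF VALUES
    ('Math', 'Dr. Smith'), ('English', 'Dr. Brown'), ('Science', 'Dr. White');

-- The department is stored once per instructor, so this touches exactly one row
-- no matter how many courses the instructor teaches.
-- 'Applied Math Dept' is an illustrative value, not from the original example.
UPDATE Instructors3NF
SET Department = 'Applied Math Dept'
WHERE Instructor = 'Dr. Smith';

-- The combined view is still available through a join.
SELECT c.Course, c.Instructor, i.Department
FROM Courses3NF c
JOIN Instructors3NF i ON i.Instructor = c.Instructor;
```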

### Summary
Normalization is a crucial step in designing a database that ensures data is stored efficiently and
consistently. By organizing data into progressively simpler structures, normalization eliminates
redundancy, reduces potential for anomalies, and enhances data integrity.

Experiment No. 2
Normalization of a University Database
**Objective:**
To design a normalized database for a university that stores information about students, courses,
instructors, and enrollments. The goal is to eliminate data redundancy and ensure data integrity through
the process of normalization up to the Third Normal Form (3NF).

**Initial Scenario:**
The university has a single table to store information about students, courses they are enrolled in, and
instructors teaching those courses. Below is the unnormalized table:

| StudentID | StudentName | CourseID | CourseName | InstructorID | InstructorName | InstructorDept |
|-----------|-------------|----------|------------|--------------|----------------|----------------|
|1 | Alice | CSE101 | Data Structures | 1001 | Dr. Smith | Computer Science |
|2 | Bob | CSE101 | Data Structures | 1001 | Dr. Smith | Computer Science |
|1 | Alice | MTH101 | Calculus | 1002 | Dr. Jones | Mathematics |
|3 | Carol | PHY101 | Physics | 1003 | Dr. Clark | Physics |

**Problems with the Unnormalized Table:**


- **Redundancy:** Instructor details and course names are repeated.
- **Update Anomalies:** Changing an instructor's department would require multiple updates.
- **Insertion Anomalies:** A new course or instructor cannot be recorded until at least one student
enrolls, because every row requires a StudentID.
- **Deletion Anomalies:** Deleting a student's record may remove essential instructor/course
information.

### Normalization Steps

#### Step 1: First Normal Form (1NF)


- **Objective:** Ensure each column contains only atomic values, and each row is unique.

**1NF Table:**
| StudentID | StudentName | CourseID | CourseName | InstructorID | InstructorName | InstructorDept |
|-----------|-------------|----------|-----------------|--------------|----------------|----------------|
|1 | Alice | CSE101 | Data Structures | 1001 | Dr. Smith | Computer Science |
|2 | Bob | CSE101 | Data Structures | 1001 | Dr. Smith | Computer Science |
|1 | Alice | MTH101 | Calculus | 1002 | Dr. Jones | Mathematics |
|3 | Carol | PHY101 | Physics | 1003 | Dr. Clark | Physics |

#### Step 2: Second Normal Form (2NF)


- **Objective:** Remove partial dependencies; ensure all non-key attributes are fully dependent on the
primary key.

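The primary key of the 1NF table is the composite `(StudentID, CourseID)`. `StudentName` depends only
on `StudentID` and `CourseName` only on `CourseID`, which are partial dependencies. We split these
attributes into their own tables and also move the instructor details into a table keyed by
`InstructorID`: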

**Students Table:**

| StudentID | StudentName |
|-----------|-------------|
|1 | Alice |
|2 | Bob |
|3 | Carol |

**Courses Table:**

| CourseID | CourseName |
|----------|-----------------|
| CSE101 | Data Structures |
| MTH101 | Calculus |
| PHY101 | Physics |

**Instructors Table:**

| InstructorID | InstructorName | InstructorDept |
|--------------|----------------|------------------|
| 1001 | Dr. Smith | Computer Science |
| 1002 | Dr. Jones | Mathematics |
| 1003 | Dr. Clark | Physics |

**Enrollments Table:**

| StudentID | CourseID | InstructorID |
|-----------|----------|--------------|
|1 | CSE101 | 1001 |
|2 | CSE101 | 1001 |
|1 | MTH101 | 1002 |
|3 | PHY101 | 1003 |

#### Step 3: Third Normal Form (3NF)


- **Objective:** Remove transitive dependencies; ensure non-key attributes are not dependent on other
non-key attributes.

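From our current tables, we see no transitive dependencies: in each table, every non-key attribute
depends only on that table's key. (Keeping `InstructorID` in the Enrollments table assumes that a course
can be taught by more than one instructor; if each course had exactly one instructor, `InstructorID`
would depend on `CourseID` alone and would belong in the Courses table.) Therefore, the tables are
already in 3NF.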

### Final Normalized Database Schema

**Students Table:**

| StudentID | StudentName |
|-----------|-------------|
|1 | Alice |
|2 | Bob |
|3 | Carol |

**Courses Table:**
| CourseID | CourseName |
|----------|-----------------|
| CSE101 | Data Structures |
| MTH101 | Calculus |
| PHY101 | Physics |

**Instructors Table:**

| InstructorID | InstructorName | InstructorDept |
|--------------|----------------|------------------|
| 1001 | Dr. Smith | Computer Science |
| 1002 | Dr. Jones | Mathematics |
| 1003 | Dr. Clark | Physics |

**Enrollments Table:**

| StudentID | CourseID | InstructorID |
|-----------|----------|--------------|
|1 | CSE101 | 1001 |
|2 | CSE101 | 1001 |
|1 | MTH101 | 1002 |
|3 | PHY101 | 1003 |
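
The final schema could be declared roughly as follows. This is a sketch under stated assumptions: the column types, lengths, and `NOT NULL` constraints are not given in the original and are chosen for illustration. Run it in a scratch database, since later experiments reuse the names `Students` and `Courses` with different columns.

```sql
-- Column types and constraints are assumptions, not part of the original schema.
CREATE TABLE Students (
    StudentID   INT PRIMARY KEY,
    StudentName VARCHAR(50) NOT NULL
);

CREATE TABLE Courses (
    CourseID   VARCHAR(10) PRIMARY KEY,
    CourseName VARCHAR(100) NOT NULL
);

CREATE TABLE Instructors (
    InstructorID   INT PRIMARY KEY,
    InstructorName VARCHAR(50) NOT NULL,
    InstructorDept VARCHAR(50) NOT NULL
);

CREATE TABLE Enrollments (
    StudentID    INT,
    CourseID     VARCHAR(10),
    InstructorID INT,
    PRIMARY KEY (StudentID, CourseID),
    FOREIGN KEY (StudentID)    REFERENCES Students(StudentID),
    FOREIGN KEY (CourseID)     REFERENCES Courses(CourseID),
    FOREIGN KEY (InstructorID) REFERENCES Instructors(InstructorID)
);

-- Rebuild the original wide table from the four normalized tables.
SELECT s.StudentID, s.StudentName, c.CourseID, c.CourseName,
       i.InstructorID, i.InstructorName, i.InstructorDept
FROM Enrollments e
JOIN Students    s ON s.StudentID    = e.StudentID
JOIN Courses     c ON c.CourseID     = e.CourseID
JOIN Instructors i ON i.InstructorID = e.InstructorID;
```

The foreign keys let the database reject an enrollment that refers to a student, course, or instructor that does not exist.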

### Conclusion
By applying normalization, we have successfully removed redundancy, minimized update, insertion,
and deletion anomalies, and structured the database in a way that ensures data integrity and efficient
management. The resulting tables are in 3NF, which is generally sufficient for most practical
applications.

Experiment No. 3
Lab Manual: Introduction to Query Processing and Query Optimization
#### Introduction
This lab manual provides an introduction to query processing and query optimization in relational
database management systems (RDBMS). It covers the basic concepts, steps involved, and the
importance of optimizing queries for efficient data retrieval.

### Query Processing


Query processing involves translating a high-level query (such as SQL) into an efficient sequence of
operations to retrieve the requested data from a database. The main steps in query processing are:
1. **Parsing and Translation:**
- The SQL query is parsed to check for syntactic and semantic correctness.
- The parsed query is translated into an internal representation, typically a query tree or a query graph.

2. **Optimization:**
- The internal representation of the query is optimized to improve performance.
- The optimizer explores different execution plans and selects the one it estimates to be the most
efficient, for example the plan with the lowest estimated I/O and CPU cost.

3. **Execution:**
- The optimized query plan is executed by the database engine.
- The results are fetched and returned to the user.

### Query Optimization


Query optimization is a crucial phase in query processing where the goal is to find the most efficient
execution plan for a query. The optimization process can significantly impact the performance of
database systems, especially for complex queries on large datasets.

#### Types of Query Optimization


1. **Rule-Based Optimization:**
- Uses a set of predefined rules to transform the query into a more efficient form.
- Example rules include pushing selections down the query tree, combining consecutive projections,
and reordering joins.

2. **Cost-Based Optimization:**
- Evaluates the cost of different execution plans using statistical information about the data.
- Considers factors such as I/O operations, CPU usage, and memory usage to estimate the cost.
- The plan with the lowest estimated cost is selected for execution.
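
Cost estimates are only as good as the statistics behind them. As a hedged illustration, the statement below asks PostgreSQL to recompute the statistics for one table; `Enrollments` is a placeholder table name, and other systems use different syntax for the same task.

```sql
-- PostgreSQL: recompute the statistics the planner uses for this table.
-- Enrollments is a placeholder name; substitute a table that exists in your database.
ANALYZE Enrollments;

-- MySQL uses a slightly different statement for the same purpose:
-- ANALYZE TABLE Enrollments;
```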

#### Techniques in Query Optimization


1. **Selection and Projection:**
- Push down selection and projection operations as close to the base tables as possible to reduce the
amount of data processed.

2. **Join Ordering:**
- Determine the most efficient order to join tables, often using heuristics or cost-based methods.
- The order of joins can greatly affect the performance of the query.

3. **Index Usage:**
- Utilize indexes to speed up data retrieval.
- Identify which indexes can be used to optimize specific query operations.

4. **Materialized Views:**
- Use precomputed views to speed up query processing.
- Useful for complex queries that involve aggregations and joins.

### Example
Consider the following SQL query:
```sql
SELECT S.Name, C.CourseName
FROM Students S, Enrollments E, Courses C
WHERE S.StudentID = E.StudentID
AND E.CourseID = C.CourseID
AND C.Department = 'Computer Science';
```
#### Query Processing Steps:
1. **Parsing:**
- Check for syntax errors.
- Validate that all referenced tables and columns exist.
2. **Translation:**
- Convert the SQL query into an internal query tree.
3. **Optimization:**
- **Rule-Based Optimization:**
- Apply selection pushdown to filter courses by department before joining.
- **Cost-Based Optimization:**
- Evaluate different join orders (e.g., `Students` with `Enrollments` first or `Enrollments` with
`Courses` first) and select the one with the lowest cost.
4. **Execution:**
- Execute the optimized query plan and return the results.
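
The selection pushdown in step 3 can be pictured as the hand-written rewrite below. It is only an illustration of the transformed query tree: optimizers normally apply this rule themselves, so the rewrite is rarely needed in practice. Like the example query, it assumes that `Courses` has a `Department` column.

```sql
-- Equivalent to the example query, with the department filter applied
-- to Courses before the joins instead of after them.
SELECT S.Name, C.CourseName
FROM (
    SELECT CourseID, CourseName
    FROM Courses
    WHERE Department = 'Computer Science'
) C
JOIN Enrollments E ON E.CourseID = C.CourseID
JOIN Students S ON S.StudentID = E.StudentID;
```

Filtering first means the joins process only Computer Science courses rather than every course in the table.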

### Exercises

1. **Simple Query Optimization:**


- Write and optimize a simple SQL query to retrieve data from a single table using selection and
projection.

2. **Join Optimization:**
- Write a query involving multiple tables and experiment with different join orders to observe the
impact on performance.

3. **Index Usage:**
- Create indexes on a table and compare the performance of queries before and after index creation.

4. **Cost-Based Optimization:**
- Use a database's explain plan feature to analyze and understand the cost of different query execution
plans.
### Conclusion

Understanding query processing and optimization is essential for efficient database management. By
applying the principles and techniques discussed in this lab manual, you can write optimized queries
that perform well even on large datasets. Regular practice and experimentation with different
optimization strategies will help you develop a deeper understanding of these concepts.

Experiment No. 4
Study and Usage of Query Optimization Techniques
#### Introduction
Query optimization is essential for improving the performance of SQL queries in relational databases.
This lab manual will guide you through the study and practical application of various query optimization
techniques. By the end of this manual, you will understand how to optimize queries and evaluate their
performance using different strategies.

### Objectives
1. Understand the importance of query optimization.
2. Learn and apply various query optimization techniques.
3. Use tools to analyze and compare query performance.

### Pre-requisites
- Basic understanding of SQL.
- Familiarity with database concepts and relational databases.
- Access to a relational database management system (e.g., MySQL, PostgreSQL, Oracle).

### Query Optimization Techniques

#### 1. Selection and Projection Pushdown

**Concept:**
- Push selection and projection operations as close to the base tables as possible to reduce the amount
of data processed in subsequent operations.

**Exercise:**
1. Create a table `Employees` with sample data:
```sql
CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    Name VARCHAR(50),
    Department VARCHAR(50),
    Salary DECIMAL(10, 2)
);

INSERT INTO Employees VALUES
    (1, 'Alice', 'HR', 60000),
    (2, 'Bob', 'Engineering', 80000),
    (3, 'Carol', 'HR', 65000),
    (4, 'David', 'Engineering', 75000);
```

2. Write a query to select names of employees from the HR department:


```sql
SELECT Name
FROM Employees
WHERE Department = 'HR';
```

3. Use the database's `EXPLAIN` (MySQL, PostgreSQL) or `EXPLAIN PLAN FOR` (Oracle) feature to analyze the query plan:
```sql
EXPLAIN SELECT Name FROM Employees WHERE Department = 'HR';
```
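
Plain `EXPLAIN` reports estimates only. In PostgreSQL, and in MySQL 8.0.18 or later, `EXPLAIN ANALYZE` also runs the statement and reports actual row counts and timings. The sketch below assumes one of those systems and the `Employees` table created in step 1.

```sql
-- Executes the query and shows actual timings next to the planner's estimates.
-- Because the statement really runs, be careful when using it with
-- INSERT, UPDATE, or DELETE.
EXPLAIN ANALYZE
SELECT Name FROM Employees WHERE Department = 'HR';
```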

#### 2. Index Usage

**Concept:**
- Use indexes to speed up data retrieval by allowing the database to find rows more quickly than
scanning the entire table.

**Exercise:**
1. Create an index on the `Department` column:
```sql
CREATE INDEX idx_department ON Employees(Department);
```

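2. Execute the same query as above and compare the performance. With only four rows the difference will
not be measurable, and the optimizer may still choose a full table scan; insert more rows to see a clear
effect: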


```sql
SELECT Name FROM Employees WHERE Department = 'HR';
```

3. Analyze the query plan to see how the index is used:


```sql
EXPLAIN SELECT Name FROM Employees WHERE Department = 'HR';
```
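
A possible extension of this exercise is a composite index that contains both the filter column and the selected column. Some systems can then answer the query from the index alone, without reading the table rows. The index name `idx_department_name` is an assumption for illustration, and whether the optimizer actually uses the index depends on the system and the data.

```sql
-- Hypothetical index name. The index holds Department (used in WHERE)
-- followed by Name (used in SELECT).
CREATE INDEX idx_department_name ON Employees(Department, Name);

-- Compare this plan with the one obtained using idx_department alone.
EXPLAIN SELECT Name FROM Employees WHERE Department = 'HR';
```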

#### 3. Join Optimization

**Concept:**
- Optimize the order of joins and use indexes to minimize the cost of join operations.

**Exercise:**
1. Create tables `Students`, `Enrollments`, and `Courses` with sample data:
```sql
CREATE TABLE Students (
    StudentID INT PRIMARY KEY,
    Name VARCHAR(50)
);

CREATE TABLE Courses (
    CourseID INT PRIMARY KEY,
    CourseName VARCHAR(50)
);

CREATE TABLE Enrollments (
    StudentID INT,
    CourseID INT,
    PRIMARY KEY (StudentID, CourseID),
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
    FOREIGN KEY (CourseID) REFERENCES Courses(CourseID)
);

INSERT INTO Students VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
INSERT INTO Courses VALUES (101, 'Math'), (102, 'Science');
INSERT INTO Enrollments VALUES (1, 101), (2, 102), (1, 102), (3, 101);
```

2. Write a query to join these tables and fetch the names of students and their courses:
```sql
SELECT S.Name, C.CourseName
FROM Students S
JOIN Enrollments E ON S.StudentID = E.StudentID
JOIN Courses C ON E.CourseID = C.CourseID;
```

3. Use `EXPLAIN` to analyze the query plan:


```sql
EXPLAIN SELECT S.Name, C.CourseName
FROM Students S
JOIN Enrollments E ON S.StudentID = E.StudentID
JOIN Courses C ON E.CourseID = C.CourseID;
```
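4. Create indexes on the `Enrollments` table. In most systems the composite primary key
`(StudentID, CourseID)` already supports lookups by `StudentID`, so `idx_studentid` is usually redundant;
`idx_courseid` covers the other join column: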
```sql
CREATE INDEX idx_studentid ON Enrollments(StudentID);
CREATE INDEX idx_courseid ON Enrollments(CourseID);
```

5. Re-run the query and compare the performance:


```sql
SELECT S.Name, C.CourseName
FROM Students S
JOIN Enrollments E ON S.StudentID = E.StudentID
JOIN Courses C ON E.CourseID = C.CourseID;
```

Experiment No. 5
Study and Usage of Backup and Recovery Features of a Database

#### Introduction
Database backup and recovery are critical components of database management, ensuring data integrity
and availability in case of failures, such as hardware malfunctions, software errors, or human mistakes.
This lab manual will guide you through the study and practical application of backup and recovery
features in a relational database management system (RDBMS).

### Objectives
1. Understand the importance of database backup and recovery.
2. Learn and implement various types of backup methods.
3. Practice recovery techniques to restore database states.

### Pre-requisites
- Basic understanding of SQL and database management.
- Access to a relational database management system (e.g., MySQL, PostgreSQL, Oracle).

### Backup Types

#### 1. Full Backup

**Concept:**
- A full backup includes all the data in the database.
- It is the most comprehensive type of backup but can be time-consuming and resource-intensive.

**Exercise:**
1. **Creating a Full Backup (MySQL Example):**
```bash
# Run from the operating-system shell, not from inside the mysql client
mysqldump -u username -p database_name > full_backup.sql
```

2. **Verify the Full Backup:**


- Check the `full_backup.sql` file to ensure it contains the complete database structure and data.

#### 2. Incremental Backup

**Concept:**
- An incremental backup includes only the data that has changed since the last backup (whether full or
incremental).
- It is faster and requires less storage space than a full backup.

**Exercise:**
1. **Creating an Incremental Backup (MySQL Example using binary logs):**
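```bash
# 1. Enable binary logging in the MySQL configuration file (my.cnf or my.ini),
#    then restart the server:
#      [mysqld]
#      log_bin = /var/log/mysql/mysql-bin.log

# 2. Take an initial full backup (run from the operating-system shell)
mysqldump -u username -p database_name > full_backup.sql

# 3. After making some changes, record the current binary log file and position.
#    In MySQL 8.4 and later the equivalent statement is SHOW BINARY LOG STATUS.
mysql -u username -p -e "SHOW MASTER STATUS;"
```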

2. **Create the Incremental Backup:**

- Use the binary logs to create incremental backups. The log file name below is an example; use the file
reported by `SHOW MASTER STATUS`.
```bash
mysqlbinlog /var/log/mysql/mysql-bin.000001 > incremental_backup.sql
```

3. **Verify the Incremental Backup:**


- Check the `incremental_backup.sql` file to ensure it contains the changes made since the last
backup.
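
Before exporting a binary log, it can help to confirm that logging is enabled and to see which log files exist. The statements below are MySQL-specific and are meant to be run inside the mysql client; this is a sketch of one way to inspect the logs, not a required part of the exercise.

```sql
-- Run inside the mysql client (MySQL-specific statements).
SHOW VARIABLES LIKE 'log_bin';  -- value is ON when binary logging is enabled
SHOW BINARY LOGS;               -- lists the binary log files and their sizes
FLUSH BINARY LOGS;              -- closes the current log file and starts a new one
```

Rotating the log with `FLUSH BINARY LOGS` before each incremental backup means every backup corresponds to a completed log file.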

#### 3. Differential Backup

**Concept:**
- A differential backup includes all the data that has changed since the last full backup.
- It is faster than a full backup and simpler to restore than multiple incremental backups.

**Exercise:**
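1. **Creating a Differential Backup (MySQL Example using binary logs):**
```bash
# Take an initial full backup. --flush-logs starts a new binary log file,
# so every change made after the backup lands in the new file(s).
mysqldump -u username -p --single-transaction --flush-logs database_name > full_backup.sql

# After making some changes, export everything logged since the full backup.
# mysqldump has no differential mode (running it again produces another full dump),
# so the changes are extracted from the binary logs instead.
# mysql-bin.000002 is a placeholder: list every log file written since the full backup.
mysqlbinlog /var/log/mysql/mysql-bin.000002 > differential_backup.sql
```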

2. **Verify the Differential Backup:**


- Check the `differential_backup.sql` file to ensure it contains the changes made since the last full
backup.

### Recovery Techniques

#### 1. Restoring from a Full Backup


**Exercise:**
1. **Restore the Full Backup (MySQL Example):**
```bash
mysql -u username -p database_name < full_backup.sql
```

2. **Verify the Restoration:**


- Check the database to ensure it has been restored to the state of the last full backup.
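
One way to carry out this check is to compare table lists, row counts, and checksums before the backup and after the restore. The sketch below uses MySQL statements and assumes, purely for illustration, that the restored database contains the `Employees` table from Experiment 4; substitute your own table names.

```sql
-- Run inside the mysql client after the restore.
SHOW TABLES;                                      -- all expected tables are present
SELECT COUNT(*) AS employee_rows FROM Employees;  -- Employees is an assumed table name
CHECKSUM TABLE Employees;                         -- MySQL-specific; compare with the value taken before the backup
```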

#### 2. Restoring from an Incremental Backup

**Exercise:**
1. **Restore the Full Backup First:**
```bash
mysql -u username -p database_name < full_backup.sql
```

2. **Apply Incremental Changes:**


```bash
mysql -u username -p database_name < incremental_backup.sql
```

3. **Verify the Restoration:**


- Check the database to ensure it includes the incremental changes.

#### 3. Restoring from a Differential Backup

**Exercise:**
1. **Restore the Full Backup First:**
```bash
mysql -u username -p database_name < full_backup.sql
```

2. **Apply the Differential Backup:**


```bash
mysql -u username -p database_name < differential_backup.sql
```

3. **Verify the Restoration:**


- Check the database to ensure it includes the differential changes.

### Practical Scenarios

#### Scenario 1: Scheduled Full Backups with Incremental Backups


- **Setup:**
- Schedule a full backup every Sunday.
- Schedule incremental backups on every other day of the week (Monday through Saturday).

- **Steps:**
1. Create a full backup on Sunday.
2. Create incremental backups on Monday through Saturday.

- **Recovery:**
- Restore the full backup from Sunday.
- Apply each incremental backup in chronological order, from Monday up to the most recent one.

#### Scenario 2: Full and Differential Backups


- **Setup:**
- Schedule a full backup every month.
- Schedule differential backups every week.

- **Steps:**
1. Create a full backup at the beginning of the month.
2. Create differential backups each week.

- **Recovery:**
- Restore the full backup from the beginning of the month.
- Apply the most recent differential backup.

### Conclusion

Understanding and implementing backup and recovery strategies is essential for ensuring data integrity
and availability in a database system. By practicing the exercises in this lab manual, you will gain
hands-on experience with different types of backups and recovery techniques, equipping you with the
skills to manage database backups and recoveries effectively.

6. Analyze the updated query plan (this step continues the Join Optimization exercise in Experiment 4):


```sql
EXPLAIN SELECT S.Name, C.CourseName
FROM Students S
JOIN Enrollments E ON S.StudentID = E.StudentID
JOIN Courses C ON E.CourseID = C.CourseID;
```

#### 4. Use of Materialized Views

**Concept:**
- Materialized views store the result of a query and can be refreshed periodically, improving
performance for complex queries.

**Exercise:**
1. Create a materialized view for a frequently accessed query (supported by PostgreSQL and Oracle; MySQL
has no `CREATE MATERIALIZED VIEW` statement):
```sql
CREATE MATERIALIZED VIEW StudentCourses AS
SELECT S.Name, C.CourseName
FROM Students S
JOIN Enrollments E ON S.StudentID = E.StudentID
JOIN Courses C ON E.CourseID = C.CourseID;
```

2. Query the materialized view:


```sql
SELECT * FROM StudentCourses;
```

3. Compare the performance with the original query.
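
A materialized view stores a snapshot, so it can become stale when the base tables change. The sketch below assumes PostgreSQL, where the view must be refreshed explicitly, and reuses the `StudentCourses` view and the sample data from the join exercise.

```sql
-- Assumes PostgreSQL and the StudentCourses materialized view created above.
INSERT INTO Enrollments VALUES (3, 102);   -- new enrollment; the view does not show it yet

REFRESH MATERIALIZED VIEW StudentCourses;  -- recompute the stored result

SELECT * FROM StudentCourses;              -- now includes the row for Carol and Science
```

How often to refresh is a trade-off between the cost of recomputing the view and how stale the results may be.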

### Conclusion

Query optimization techniques are essential for improving the performance of SQL queries. By
understanding and applying these techniques, you can ensure efficient data retrieval and better overall
performance of your database systems. Regularly analyze and optimize your queries, especially as your
data grows and changes over time.
