Chapter One
Database System Concepts and Architecture

Outline
• Database Management Systems
• Overview of different types of databases and data models (ER, UML)
• Practical object-oriented database design methodology and use of UML diagrams
• Database-system architectures
• Database languages and query languages
Database Management Systems

A Database Management System (DBMS) is software designed to store, retrieve, define, and manage data in a database. The DBMS manages incoming data, organizes it, and provides ways for the data to be modified or extracted by users or other programs.

A DBMS contains information about a particular enterprise:
o A collection of interrelated data
o A set of programs to access the data
o An environment that is both convenient and efficient to use

A DBMS generally manipulates the data itself, the data format, field names, record structure and file structure. It also defines rules to validate and manipulate the data. Databases can be very large, and they touch all aspects of our lives.

Some DBMS examples include MySQL, PostgreSQL, Microsoft Access, SQL Server, FileMaker, Oracle and FoxPro. Many of these are relational database management systems (RDBMS).

Application programs for a university database include:
• Adding new students, instructors, and courses
• Registering students for courses and generating class rosters
• Assigning grades to students, computing grade point averages (GPA) and generating transcripts

Database Applications
• Banking: transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions

Why Use a DBMS
• Data independence and efficient access
• Reduced application development time
• Data integrity and security
• Uniform data administration
• Concurrent access and recovery from crashes

Purpose of DBMS

A DBMS is meant to address the following problems of the file-processing approach:
1. Data redundancy and inconsistency: The same information may be duplicated in several places, and all copies may not be updated properly.
2. Difficulty in accessing data: A new program must be written to carry out each new task.
3. Data isolation: Data is held in different formats.
4. Security problems: Every user of the system should be able to access only the data they are permitted to see. E.g., payroll staff handle only employee records and cannot see customer accounts; tellers access only account data and cannot see payroll data. This is difficult to enforce with application programs.
5. Integrity problems: Data may be required to satisfy constraints, e.g. no account balance below $25.00. Again, it is difficult to enforce or change such constraints with the file-processing approach.

Advantages of DBMS
• Controlling redundancy
• Sharing of data
• Data consistency
• Integration of data
• Integrity constraints
• Data security
• Control over concurrency
• Backup and recovery procedures
• Data independence

Disadvantages of DBMS
• Cost of hardware and software
• Cost of data conversion
• Cost of staff training
• Appointing technical staff
• Database damage

Database Types

There are various types of databases used for storing different varieties of data.

1) Centralized Database
It is the type of database that stores data in one central database system. It lets users access the stored data from different locations through several applications. These applications include an authentication process so that users access data securely. An example of a centralized database is a central library that holds a central database of every library in a college or university.

Advantages of Centralized Database
• It reduces the risk in data management.
• Data consistency is maintained, as data is managed in a central repository.
• It provides better data quality.
• It is less costly.

Disadvantages of Centralized Database
• The size of the centralized database is large, which increases the response time for fetching data.
• It is not easy to update such an extensive database system.
• If the server fails, the entire database becomes unavailable and data may be lost, which could be a huge loss.

2) Distributed Database
In distributed systems, data is distributed among different database systems of an organization. These database systems are connected via communication links.
Such links help end users access the data easily. Examples of distributed databases are Apache Cassandra, HBase, Ignite, etc.

We can further divide distributed database (DDB) systems into:
• Homogeneous DDB: database systems that run on the same operating system, use the same application process and run on the same hardware devices.
• Heterogeneous DDB: database systems that run on different operating systems, under different application procedures, and on different hardware devices.

3) Relational Database
This database is based on the relational data model, which stores data in rows (tuples) and columns (attributes) that together form a table (relation). A relational database uses SQL for storing, manipulating and maintaining the data. Each table in the database has a key that uniquely identifies each row. Examples of relational databases are MySQL, Microsoft SQL Server, Oracle, etc.

4) NoSQL Database
NoSQL (Non-SQL / Not Only SQL) is a type of database used for storing a wide range of data sets. It is not a relational database: it stores data not only in tabular form but in several other ways. It emerged as the demand for building modern applications increased, and it offers a wide variety of database technologies in response to those demands.

5) Cloud Database
A type of database where data is stored in a virtual environment and runs on a cloud computing platform. It gives users various cloud computing services (SaaS, PaaS, IaaS, etc.) for accessing the database. There are numerous cloud platforms; options include:
• Amazon Web Services (AWS)
• Microsoft Azure
• Kamatera
• phoenixNAP
• ScienceSoft
• Google Cloud SQL, etc.

6) Object-oriented Databases
The type of database that uses the object-based data model approach for storing data in the database system.
The data is represented and stored as objects, which are similar to the objects used in object-oriented programming languages.

7) Network Databases
This is a database that typically follows the network data model. Data is represented as nodes connected by links. Unlike the hierarchical database, it allows each record to have multiple child and parent nodes, forming a generalized graph structure.

8) Personal Database
A personal database collects and stores data on the user's own system. It is designed for a single user.

9) Hierarchical Databases
It is the type of database that stores data as nodes in parent-child relationships, organized in a tree-like structure. Data is stored as records that are connected via links. Each child record in the tree has only one parent. On the other hand, each parent record can have multiple child records.

Overview of Data Models

Data models are used to show how data is stored, connected, accessed and updated in the database management system. Although many data models are in use today, the relational model is the most widely used. Some of the data models in DBMS are:
• Hierarchical Model
• Network Model
• Entity-Relationship Model
• Relational Model
• Object-Oriented Data Model
• Object-Relational Data Model
• Flat Data Model
• Semi-Structured Data Model
• Associative Data Model
• Context Data Model

Some of these data models are discussed in the following slides.

Entity-Relationship Model

The ER model is based on the notion of real-world entities and the relationships among them. When a real-world scenario is formulated as a database model, the ER model creates entity sets, relationship sets, general attributes and constraints. The ER model is best used for the conceptual design of a database.

The ER model is based on:
• Entities and their attributes
• Relationships among entities

Entity: An entity in an ER model is a real-world entity having properties called attributes. Every attribute is defined by its set of values, called its domain. For example, in a school database, a student is considered an entity. A student has various attributes such as name, age, class, etc.

Relationship: The logical association among entities is called a relationship. Relationships are mapped with entities in various ways. Mapping cardinalities define the number of associations between two entities.

Features of ER Model
• Graphical representation for better understanding: It is easy and simple to understand, so developers can use it to communicate with stakeholders.
• ER diagram: The ER diagram is used as a visual tool for representing the model.
• Database design: This model helps database designers build the database and is widely used in database design.

Hierarchical Model

The hierarchical model was the first DBMS model. It organizes data in a hierarchical tree structure. The hierarchy starts from the root, which holds the root data, and then expands as a tree by adding child nodes to parent nodes. This model easily represents some real-world relationships, such as food recipes or the sitemap of a website.
Example: The relationships between the shoes listed on a shopping website can be represented as a tree of this kind.

Relational Model

The relational model is the most widely used model. In this model, data is maintained in the form of a two-dimensional table. All the information is stored in rows and columns. The basic structure of the relational model is the table, so tables are also called relations in the relational model.

Example: An Employee table, in which each row describes one employee.

The main highlights of this model are:
• Data is stored in tables called relations.
• Relations can be normalized. In normalized relations, the values saved are atomic values.
• Each row in a relation is identified by a unique value (e.g. SSID).
• Each column in a relation contains values from the same domain.

Object Oriented Data-model

An object database is managed by an object-oriented database management system (OODBMS). The database combines object-oriented programming concepts with relational database principles.
• Objects are the basic building blocks. Each object is an instance of a class, where the type is either built-in or user-defined.
• Classes provide a schema or blueprint for objects, defining their behavior.
• Methods determine the behavior of a class.
• Pointers help access elements of an object database and establish relations between objects.

DB System Architecture

The design of a DBMS depends on its architecture. The basic client/server architecture is used to deal with a large number of PCs, web servers, database servers and other components that are connected by networks. The client/server architecture consists of many PCs and a workstation connected via the network. DBMS architecture depends on how users are connected to the database to get their requests done.

Database architecture can be seen as single-tier or multi-tier. Logically, however, database architecture is of two types: 2-tier architecture and 3-tier architecture.
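To make the relational model highlights listed earlier in this chapter concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `employee` table, its columns and the sample rows are hypothetical and chosen only for illustration; they are not the Employee table from the original slide. The sketch shows a relation with a key, atomic values in every column, and the DBMS rejecting a second row that reuses an existing key.

```python
import sqlite3

# In-memory database; table name, columns and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")

# emp_id is the key: it identifies each row (tuple) uniquely.
# Every column holds atomic values drawn from a single domain.
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        dept   TEXT NOT NULL,
        salary REAL NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO employee (emp_id, name, dept, salary) VALUES (?, ?, ?, ?)",
    [(1, "Abebe", "Sales", 5200.0), (2, "Sara", "Finance", 6100.0)],
)


def insert_duplicate_key(connection):
    """Try to reuse an existing key; return True if the DBMS rejects it."""
    try:
        connection.execute(
            "INSERT INTO employee VALUES (?, ?, ?, ?)",
            (1, "Dawit", "Sales", 4800.0),
        )
    except sqlite3.IntegrityError:
        return True
    return False


rows = conn.execute(
    "SELECT emp_id, name, dept FROM employee ORDER BY emp_id"
).fetchall()
rejected = insert_duplicate_key(conn)
print(rows)
print("duplicate key rejected:", rejected)
```

The key constraint is enforced by the database itself rather than by application code, which is the kind of integrity guarantee the file-processing approach makes difficult.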
1-Tier Architecture

In this architecture, the database is directly available to the user: the user works directly on the DBMS. Any changes made here are applied directly to the database itself. It does not provide a handy tool for end users. The 1-tier architecture is used for developing local applications, where programmers can communicate directly with the database for a quick response.

2-Tier Architecture

In the two-tier architecture, applications on the client end communicate directly with the database on the server side. APIs such as ODBC and JDBC are used for this interaction. The user interfaces and application programs run on the client side. The server side is responsible for providing functionality such as query processing and transaction management. To communicate with the DBMS, the client-side application establishes a connection with the server side.
Fig: 2-tier Architecture
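The two-tier interaction described above can be sketched roughly as follows. This is an illustration under stated assumptions, not a real client/server deployment: Python's sqlite3 module (a DB-API driver for an embedded database) stands in for an ODBC or JDBC connection to a database server, and the `account` table, its rows and the function names are hypothetical.

```python
import sqlite3


def open_connection():
    """Client side: establish a connection to the 'server side'.

    An in-memory SQLite database stands in for a real database server;
    the account table and its rows are made up for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance REAL)"
    )
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [(101, 250.0), (102, 75.5)]
    )
    conn.commit()
    return conn


def show_balance(conn, acc_no):
    """Client-side application logic that sends SQL straight to the database."""
    row = conn.execute(
        "SELECT balance FROM account WHERE acc_no = ?", (acc_no,)
    ).fetchone()
    return None if row is None else row[0]


conn = open_connection()
print(show_balance(conn, 101))
```

The point of the sketch is the shape of the call path: the application program holds the connection and issues queries itself, with no layer between it and the database.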
3-Tier Architecture

The 3-tier architecture contains another layer between the client and the server. In this architecture, the client cannot communicate directly with the database server. The application on the client end interacts with an application server, which in turn communicates with the database system. The end user has no idea about the existence of the database beyond the application server, and the database has no idea about any user beyond the application. The 3-tier architecture is used for large web applications.
Fig: 3-tier Architecture
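The three-tier separation described above can be sketched in a single process as three small classes. All class names, the `course` table and its contents are hypothetical, and an in-memory SQLite database again stands in for the database server; in a real system the three tiers would typically run on separate machines.

```python
import sqlite3


class DatabaseTier:
    """Data tier: owns the connection; sees only the application server."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)"
        )
        self._conn.execute(
            "INSERT INTO course VALUES (?, ?)", ("DB101", "Database Systems")
        )
        self._conn.commit()

    def query(self, sql, params=()):
        return self._conn.execute(sql, params).fetchall()


class ApplicationServer:
    """Middle tier: the only component that talks to the database."""

    def __init__(self, database):
        self._db = database

    def get_course_title(self, code):
        rows = self._db.query(
            "SELECT title FROM course WHERE code = ?", (code,)
        )
        return rows[0][0] if rows else "unknown course"


class Client:
    """Client tier: knows the application server, not the database."""

    def __init__(self, app_server):
        self._app = app_server

    def show_course(self, code):
        return f"{code}: {self._app.get_course_title(code)}"


client = Client(ApplicationServer(DatabaseTier()))
print(client.show_course("DB101"))
```

The client never sees SQL or a connection object, which mirrors the statement that the end user has no idea about the database beyond the application server.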
Database Languages and Query Languages

A DBMS has appropriate languages and interfaces to express database queries and updates. Database languages can be used to read, store and update the data in the database. A query language is a specialized programming language for searching and changing the contents of a database.

DBMS Language
1. Data Definition Language

DDL stands for Data Definition Language. It is used to define the database structure or pattern. It is used to create schemas, tables, indexes, constraints, etc. in the database. Using DDL statements, you can create the skeleton of the database. DDL is used to store metadata such as the number of tables and schemas, their names, indexes, the columns in each table, constraints, etc.

Here are some tasks that come under DDL:
• Create: It is used to create objects in the database.
• Alter: It is used to alter the structure of the database.
• Drop: It is used to delete objects from the database.
• Truncate: It is used to remove all records from a table.
• Rename: It is used to rename an object.
• Comment: It is used to add comments to the data dictionary.

These commands update the database schema, which is why they come under the Data Definition Language.

DBMS Language
2. Data Manipulation Language

DML stands for Data Manipulation Language. It is used for accessing and manipulating data in a database. It handles user requests.

Here are some tasks that come under DML:
• Select: It is used to retrieve data from a database.
• Insert: It is used to insert data into a table.
• Update: It is used to update existing data within a table.
• Delete: It is used to delete records from a table (all records, or only those that match a condition).
• Merge: It performs an UPSERT operation, i.e., insert or update.
• Call: It is used to call a PL/SQL or Java subprogram.

DBMS Language
3. Data Control Language

DCL stands for Data Control Language. It is used to control access to the data stored in the database. The execution of DCL is transactional.
It also supports rollback. (In Oracle Database, however, DCL statements cannot be rolled back.)

Here are some tasks that come under DCL:
• Grant: It is used to give a user access privileges to a database.
• Revoke: It is used to take back permissions from a user.

Privileges that can be revoked include: CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE.

DBMS Language
4. Transaction Control Language

TCL is used to manage the changes made by DML (Data Manipulation Language) statements.

Here are some tasks that come under TCL:
• Commit: It is used to save the changes of a transaction permanently in the database.
• Rollback: It is used to restore the database to its state as of the last Commit.

Reading Assignment

Read about the following concepts in object-oriented programming:
A. Polymorphism
B. Inheritance
C. Encapsulation
D. Abstraction

End of Chap 1