
Module Title: Advanced Database Systems

Week 1 (Part 1 of 2): Introduction to the Module, Revision of Key Database Concepts, Revision of Key Relational Concepts

Module Leader: Paul Massey
Room no: IT0.08
Email address: p.c.massey@tees.ac.uk

Module Tutor: Annette Marshall
Room no: G0.35
Email address: a.marshall@tees.ac.uk

Delivery: One hour lecture, one hour tutorial
Assessment: 100% ICA (in 2 elements, 50% each)

Advanced Database Systems

Page 1 of 34

AM - September 2009


Module Description
The module is aimed at developing the student's ability to design and implement database applications to meet business needs. The module will investigate the issues and technologies associated with implementing and supporting databases and the services that are needed to maintain and access a repository of data. Investigations will be undertaken in a number of areas including data warehouses, integrating legacy data, data management and approaches that support the modelling and visualisation of data for a range of user views. The module will be assessed by an ICA. For this, students will be required to design and implement a database solution for a given scenario and write up the results of research undertaken into a particular database area related to that scenario.

Module Aims
1. To undertake a detailed analysis and modelling of an enterprise's information processing requirements.
2. To design and build a database that meets an enterprise's information processing requirements.
3. To investigate the issues associated with database server administration.
4. To explore practical development issues associated with integrating legacy data and data services.
5. To explore new and emerging trends in information visualisation and representation.


Main Learning Outcomes
On successful completion of this module, the student will be able to:
1. Undertake some of the key roles and responsibilities of a database administrator (DBA).
2. Design and build a database to meet an enterprise's data processing requirements and business rules.
3. Construct a program using a data definition and data manipulation language (eg SQL).
4. Implement the processes and procedures needed to extract, transform and load legacy data from a range of data sources.
5. Research a range of visualisation techniques used to present information (such as cubes, maps, graphs, icons).

Term 1ish (AM)
To revise:
- Key database concepts
- Key relational concepts
- SQL
To enhance database design skills:
- Conceptual design (ER + EER modelling)
- Logical design (normalisation)
- Physical design
To introduce:
- Query processing
- Query optimisation
- Indexes
With particular reference to SQL Server.
To complete the first, design, half of the ICA.


Outline Schedule for Term 1ish

Wk 1        Introduction to the module. Revision of key database concepts. Revision of key relational concepts.
  Tut       SQL: SELECT, creating tables, primary and foreign key constraints. No official tutorial in the lab.
Wk 2        Database design: ER modelling using UML.
  Tut       SQL: other constraints, inserting data, joins, views.
Wk 3        Database design: ER modelling using UML (contd).
  Tut       SQL: Stored procedures.
Wk 4-6      Database design: ER modelling using UML, normalisation, physical database design.
  Tuts      ER + EER modelling.
Wk 7        Database transactions.
  Tut       ICA part 1 out. Working on ICA.

Outline Schedule for Term 1ish (contd)

Wk 8        Query processing and optimisation.
  Tut       Database transactions.
Wk 9        Indexes.
  Tut       Working on ICA.
Wk 10       Indexes (contd). Practical work on indexes.
Wk 11       Indexes (contd).
  Tut       Working on ICA.
Wk 12-13    Working on ICA.
  Tut       Working on ICA.
Wk 14       Hand in part 1 of ICA.
Wk 15-24    Paul
Wk 28       Hand in part 2 of ICA.

Recommended Books
- Database Systems, by Thomas Connolly and Carolyn Begg, published by Addison-Wesley, ISBN 0-321-21025-5 (4th Edition)
- Database System Concepts, by Avi Silberschatz, Henry F. Korth and S. Sudarshan, published by McGraw-Hill, ISBN 0-07-295886-3 (5th Edition)
- Fundamentals of Database Systems, by R. Elmasri and Shamkant B. Navathe, published by Prentice-Hall, ISBN 0-321-36957-2 (5th Edition)
- Database Solutions, by Thomas Connolly and Carolyn Begg, published by Addison-Wesley, ISBN 0-321-17350-3 (2nd Edition)

Tutorial 1 (Self Study)


Revision of SQL:
- DML statement SELECT for simple queries.
- DDL statements for creating tables and primary and foreign key constraints.

Tutorial 2
Revision of SQL:
- DDL statements for creating other constraints.
- DML statements for inserting data into tables.
- DML statement SELECT
  o For joining tables and retrieving data from them.
  o For views.


Module Title: Advanced Database Systems

Week 1: Revision of Key Database Concepts

Why Database Systems?
Data: one of the most important resources in ALL organisations.
Without information, hence data, how could we:
- Control manufacturing processes?
- Process sales of goods?
- Diagnose patients' illnesses?
- Forecast sales?
- Run the University of Teesside?


Information
To be useful, information must be:
- Accurate
- Timely
- Relevant
Therefore, you need adequate facilities for:
- Storing (and verifying) data
- Manipulating data
- Extracting data

Traditional Approach
From the earliest days computers were used to store files of information. Separate systems, ie separate files and programs, were developed for each application, eg payroll files, personnel files, accounts files, etc.

[Figure: three separate file-based systems. Modules Data File, Module System, Module Descriptions; Student Data File, Student System, Student Records; Room Data File, Resource System, Resource Summary.]


Problems
- Inconsistency
- Redundancy
- Lack of integration and control

Solution? The Database Approach
Instead of having separate files for separate applications, data are organised into a single set of underlying files from which the applications draw the data that are relevant to them.

[Figure: the Module System, Student System and Resource System share Module Data, Student Data and Room Data through a Database Management System.]

What is a Database?
"A shared collection of logically related data, and a description of this data, designed to meet the information needs of an organisation."
Database Systems by Connolly & Begg, Addison-Wesley, ISBN 0-201-70857-4

"A database system can be thought of as a computerised record-keeping system. Such a system involves the data itself (stored in the database), hardware, software and (most important!) users. Databases are integrated and (usually) shared; they are used to store persistent data."
An Introduction to Database Systems by C J Date, Addison-Wesley, ISBN 0-201-38590-2 (7th Edition)


What is a Database Management System (DBMS)?


A software system that enables users to define, create, maintain, and control access to the database.
- Provides the interface between the user and the data in the database.
- Allocates storage to data and maintains indices so that any required data can be retrieved.
- Protects data against unauthorised access.
- Safeguards data against corruption.
- Provides recovery and restart facilities after a hardware or software failure.

Advantages of the Database Approach


- No unnecessary duplication of data.
- Greater consistency of data.
- Wider availability of data.
- Greater flexibility of use of data.
- Improved data integrity.
- Improved security.
- Improved backup and recovery services.
- Can change the data structure without altering associated programs.
- A database is dynamic: it can grow and change.
- Data management can be more consistent and systematic.


The Three Level ANSI-SPARC Architecture


In 1975 the ANSI Standards Planning and Requirements Committee proposed a standard terminology and general architecture for database systems.
The objective is to separate each user's view of the database from the way it is physically represented.
3 levels or views of data within a database:
- External level
- Conceptual level
- Internal level

External Level
The user's view of the database. Also known as the applications view.

Conceptual Level
The overall view of the database. Also known as the global view.

Internal Level
The physical representation of the database on the computer. Also known as the storage view.


Schemas
The overall description of the database is called the database schema. There are 3 different types of schema in the database.

External Schema
There are multiple external schemas (or subschemas), each one corresponding to a different view of the data.

Conceptual Schema
There is one conceptual schema, which describes the data stored in the database, the relationships and the integrity constraints.

Internal Schema
There is one internal schema, which describes how the data are stored in the database and how they are accessed.

Mapping
Provides the translation between the schemas at different levels.
- The DBMS is responsible for mapping between the 3 types of schema.
- The DBMS must ensure that each external schema is derivable from the conceptual schema.
- The DBMS must use the information in the conceptual schema to map between each external schema and the internal schema.


Data Independence
A major objective for the 3-level architecture is to provide data independence, ie upper levels must be unaffected by changes to lower levels. There are 2 kinds of data independence:
- Logical data independence refers to the immunity of the external schemas to changes in the conceptual schema. It should be possible to alter tables, columns or relationships without having to alter existing external schemas or rewrite application programs (other than those that are directly affected).
- Physical data independence refers to the immunity of the conceptual schema to changes in the internal schema. It should be possible to alter file organisations, storage devices, indexes, etc, without having to alter the conceptual or external schemas.
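Both kinds of data independence can be tried out on a small scale. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than SQL Server, and the names animal, animal_names, keeper and idx_animal_family are made up for the example. A view stands in for an external schema, the base table for the conceptual schema, and an index for a change at the internal level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: a base table (hypothetical schema, for illustration only).
conn.execute(
    "CREATE TABLE animal ("
    " ano TEXT NOT NULL PRIMARY KEY, aname TEXT, afamily TEXT, weight INTEGER)"
)
conn.executemany(
    "INSERT INTO animal VALUES (?, ?, ?, ?)",
    [("CA1", "Candice", "Camel", 1800),
     ("ZE4", "Zona", "Zebra", 900),
     ("SN1", "Sam", "Snake", 5)],
)

# External level: a view exposing only the columns one group of users needs.
conn.execute("CREATE VIEW animal_names AS SELECT ano, aname FROM animal")


def read_view():
    """Run the 'application' query against the external schema."""
    return conn.execute(
        "SELECT ano, aname FROM animal_names ORDER BY ano"
    ).fetchall()


before = read_view()

# Logical data independence: the conceptual schema gains a column,
# yet the view and the query written against it are untouched.
conn.execute("ALTER TABLE animal ADD COLUMN keeper TEXT")
after_alter = read_view()

# Physical data independence: an index changes how rows are located,
# not what any schema or query looks like.
conn.execute("CREATE INDEX idx_animal_family ON animal (afamily)")
after_index = read_view()

print(before == after_alter == after_index)
```

The same query returns the same rows before and after both changes, which is the behaviour the two definitions describe. Dropping or renaming a column that the view uses would be one of the "directly affected" cases.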

The System Catalogue


- The database schema is defined using a special language called a Data Definition Language (DDL).
- The result of the compilation of the DDL statements is a set of tables stored in special files collectively called the system catalogue.
- This is a repository of meta-data (data about data), ie information describing the data in the database, typically containing the name, description, source and usage information for each data item.
- The system catalogue is also known as the data dictionary or the data directory.
- See page 40 + Section 2.4 in Connolly & Begg for further details.
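Because the catalogue is itself stored as tables, it can be queried like any other data. The sketch below assumes SQLite (via Python's sqlite3 module) purely for convenience: there the catalogue table is called sqlite_master and column details come from PRAGMA table_info. SQL Server exposes its catalogue differently, so treat the names here as SQLite-specific, and the animal table as a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A DDL statement; SQLite records its result in the catalogue.
conn.execute(
    "CREATE TABLE animal ("
    " ano TEXT NOT NULL PRIMARY KEY, aname TEXT, afamily TEXT, weight INTEGER)"
)

# Meta-data about tables: one catalogue row per table defined so far.
catalogue_rows = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# Meta-data about columns: each row is
# (cid, name, type, notnull, dflt_value, pk).
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(animal)")
]

print(catalogue_rows)
print(columns)
```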


Database Languages
- A Data Definition Language (DDL) is used to specify the data in the database.
- A Data Manipulation Language (DML) is used to access the data.
- A Data Control Language (DCL) is used to control access to the data.
Some databases have a combined DDL, DML and DCL (often called a Query Language), eg SQL.

Types of Database
5 main logical structures (in terms of how data are organised, stored and manipulated):
1. Hierarchical
2. Network
3. Relational
4. Object-oriented
5. Object-relational
See pp 45-47 of Connolly & Begg for brief descriptions of 1-3 and Chapters 25-28 for details of 4-5.
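To make the DDL / DML / DCL split under Database Languages concrete, here is a minimal sketch in which each SQL statement is labelled with the sub-language it belongs to. It assumes SQLite through Python's sqlite3 module, which has no user accounts, so the DCL statement appears only as a comment; the table and the user name some_user are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: specifies the data (structure and constraints).
conn.execute(
    "CREATE TABLE animal ("
    " ano TEXT NOT NULL PRIMARY KEY, aname TEXT, weight INTEGER)"
)

# DML: accesses the data (insert, update, retrieve, delete).
conn.execute("INSERT INTO animal VALUES ('EL3', 'Elmer', 5000)")
conn.execute("UPDATE animal SET weight = 5100 WHERE ano = 'EL3'")
rows = conn.execute("SELECT aname, weight FROM animal").fetchall()
conn.execute("DELETE FROM animal WHERE ano = 'EL3'")
remaining = conn.execute("SELECT COUNT(*) FROM animal").fetchone()[0]

# DCL: controls access to the data. SQLite does not implement it,
# so this is shown as a comment only (some_user is a made-up name):
#   GRANT SELECT ON animal TO some_user;

print(rows, remaining)
```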


Module Title: Advanced Database Systems

Week 1: Revision of Key Relational Concepts

Why Relational Databases?
Relational model first proposed in 1970 by Dr E F (Ted) Codd in the paper "A relational model of data for large shared data banks".
Purpose
- Achieve program/data independence
- Treat data in a disciplined way
  o Apply rigour of mathematics
  o Use set theory
- Improve programmer productivity
Implementation
- At first considered impractical, but in the late 1970s the prototype System R was developed by IBM, and during the 1980s commercial products began to appear, in particular Oracle from Oracle Corporation.


What is a Relational Database?


A relational database is made up of relations (tables) in which data are stored. A relation (table) is a 2-dimensional structure made up of attributes (columns) and tuples (rows).

Relation
A relation is a table that obeys the following rules:
- There are no duplicate rows in the table.
- The order of the rows is immaterial.
- The order of the columns is immaterial.
- Each attribute value is atomic, ie each cell can contain one and only one data value.

Example of a Table (Relation)

ANIMAL
ANAME     AFAMILY    WEIGHT
Candice   Camel      1800
Zona      Zebra      900
Sam       Snake      5
Elmer     Elephant   5000
Leonard   Lion       1200


Relational Database Terminology


Relation      a table with rows and columns
Tuple         a row of a relation
Attribute     a named column of a relation
Primary key   a unique identifier for each row in a relation
Domain        the set of allowable values for a column
Degree        the number of columns in a relation
Cardinality   the number of rows in a relation

ANIMAL
ANAME     AFAMILY    WEIGHT
Candice   Camel      1800
Zona      Zebra      900
Sam       Snake      5
Elmer     Elephant   5000
Leonard   Lion       1200
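As a quick check on the terms degree and cardinality, the sketch below loads the ANIMAL relation into an in-memory SQLite database (an assumption made for convenience; any relational DBMS would do) and counts its columns and rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animal (aname TEXT, afamily TEXT, weight INTEGER)")
conn.executemany(
    "INSERT INTO animal VALUES (?, ?, ?)",
    [("Candice", "Camel", 1800),
     ("Zona", "Zebra", 900),
     ("Sam", "Snake", 5),
     ("Elmer", "Elephant", 5000),
     ("Leonard", "Lion", 1200)],
)

cur = conn.execute("SELECT * FROM animal")

# Degree: the number of attributes (columns) in the relation.
degree = len(cur.description)

# Cardinality: the number of tuples (rows) in the relation.
cardinality = len(cur.fetchall())

print(degree, cardinality)  # prints "3 5"
```

Inserting or deleting a row changes the cardinality; only a change to the table definition changes the degree.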

Primary and Foreign Keys


A primary key is a unique identifier for each row in a table. It may consist of one or more columns.
No part of the primary key may have a null value. This is known as the entity integrity rule.

ANIMAL
ANO   ANAME     AFAMILY    WEIGHT
CA1   Candice   Camel      1800
ZE4   Zona      Zebra      900
SN1   Sam       Snake      5
EL3   Elmer     Elephant   5000
LI2   Leonard   Lion       1200
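The entity integrity rule can be seen in action by asking a DBMS to break it. This is a sketch only, assuming SQLite through Python's sqlite3 module; the helper try_insert is made up for the example. One SQLite-specific detail: NOT NULL is written explicitly on the key column because SQLite does not always imply it for a PRIMARY KEY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE animal ("
    " ano TEXT NOT NULL PRIMARY KEY, aname TEXT, afamily TEXT, weight INTEGER)"
)
conn.execute("INSERT INTO animal VALUES ('CA1', 'Candice', 'Camel', 1800)")


def try_insert(row):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO animal VALUES (?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# A second row with ANO 'CA1' would no longer be uniquely identified.
duplicate_result = try_insert(("CA1", "Carl", "Camel", 1700))

# A null ANO breaks the entity integrity rule.
null_result = try_insert((None, "Zona", "Zebra", 900))

# A new, non-null key value is fine.
valid_result = try_insert(("ZE4", "Zona", "Zebra", 900))

print(duplicate_result)
print(null_result)
print(valid_result)
```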


Primary and Foreign Keys (contd)
Each table contains data about one entity.

ANIMAL-FOOD
ANO   FOOD
CA1   Hay
CA1   Buns
ZE4   Brush
SN1   Mice
SN1   People
EL3   Leaves
LI2   People
LI2   Meat

You may need to combine 2 or more tables to find out a particular piece of information, eg you may want a report on the animal name, family and food. You relate the data in one table to the data in another through foreign keys.

Primary and Foreign Keys (contd)
A foreign key is a column or columns in one table which reference(s) a primary key column or columns in another table. Values in a foreign key must match an existing value in the primary key or be NULL. This is known as the referential integrity rule.

ANIMAL
ANO   ANAME     AFAMILY    WEIGHT
CA1   Candice   Camel      1800
ZE4   Zona      Zebra      900
SN1   Sam       Snake      5
EL3   Elmer     Elephant   5000
LI2   Leonard   Lion       1200

ANIMAL-FOOD
ANO   FOOD
CA1   Hay
CA1   Buns
ZE4   Brush
SN1   Mice
SN1   People
EL3   Leaves
LI2   People
LI2   Meat

ANO in the ANIMAL-FOOD table is part of the primary key and also a foreign key.
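The two tables, the referential integrity rule and the "animal name, family and food" report can be put together in one short sketch. It assumes SQLite via Python's sqlite3 module rather than SQL Server. ANIMAL-FOOD is written as animal_food because a hyphen is not allowed in an unquoted SQL identifier, and the key value 'TI9' is invented to show a rejected row. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch

conn.execute(
    "CREATE TABLE animal ("
    " ano TEXT NOT NULL PRIMARY KEY, aname TEXT, afamily TEXT, weight INTEGER)"
)
# ANO is part of the composite primary key and also a foreign key.
conn.execute(
    "CREATE TABLE animal_food ("
    " ano TEXT NOT NULL REFERENCES animal (ano),"
    " food TEXT NOT NULL,"
    " PRIMARY KEY (ano, food))"
)

conn.executemany(
    "INSERT INTO animal VALUES (?, ?, ?, ?)",
    [("CA1", "Candice", "Camel", 1800), ("ZE4", "Zona", "Zebra", 900),
     ("SN1", "Sam", "Snake", 5), ("EL3", "Elmer", "Elephant", 5000),
     ("LI2", "Leonard", "Lion", 1200)],
)
conn.executemany(
    "INSERT INTO animal_food VALUES (?, ?)",
    [("CA1", "Hay"), ("CA1", "Buns"), ("ZE4", "Brush"), ("SN1", "Mice"),
     ("SN1", "People"), ("EL3", "Leaves"), ("LI2", "People"), ("LI2", "Meat")],
)

# Referential integrity: 'TI9' matches no ANO in animal, so it is refused.
try:
    conn.execute("INSERT INTO animal_food VALUES ('TI9', 'Meat')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The report: relate the two tables through the foreign key.
report = conn.execute(
    "SELECT a.aname, a.afamily, f.food"
    " FROM animal AS a JOIN animal_food AS f ON f.ano = a.ano"
    " ORDER BY a.aname, f.food"
).fetchall()

print(orphan_rejected)
for row in report:
    print(row)
```

The report has one row per ANIMAL-FOOD row (8 in total), each extended with the name and family looked up through ANO.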
