

Chapter 6: Data-Base Modeling and Applications

Introduction

We use the term Data Base to mean the collected data sets that are organized and stored as an integral part of a firm’s computer-based information system
Data Sets are flexible data structures that include groupings of data that are logically related

The Database Approach to
Data Storage
A Database is a set of computer files that
minimizes data redundancy and is accessed by
one or more application programs for data
processing
The database approach to data storage applies
whenever a database is established to serve two
or more applications, organizational units, or
types of users
A Database Management System (DBMS) is a
computer program that enables users to create,
modify, and utilize database information
efficiently

Characteristics of the
Database Approach
Data Independence - the separation of the
data from the various application programs and
other accesses by users
Data Standardization - data elements within a
database have standard definitions, thus stored
data are compatible with every application
program that accesses the data
One-Time Data Entry and Storage -
individual data values are entered into the
database only once; consequently, redundancy
is reduced and inconsistencies between data
elements are eliminated

Characteristics of the
Database Approach (cont)
Data Integration - data sets integrate the data,
which enables all affected data sets to be updated
simultaneously
Shared Data Ownership - all data within a
database are owned in common by the users. The
portion of the database that is of interest to each user
is known as the sub-schema
Centralized Data Management - the database
management system stands guard over the database
and presents the logical view to users and
application programs

Program-Data Independence

Figure 6-1: Application Program A and Application Program B each access the Database through the Database Management System

Questions for Database
Design and Construction

What data management perspective should be adopted?
What is the proposed system’s initial objective?
What systems and users will use the data?
Which existing or future systems will the proposed system
interface with?
How much data will be stored initially? In the future?
How many data accesses (reads and updates) will occur on an
hourly, daily, and monthly basis?
How can the data be organized, both logically and physically,
to best serve the users of the system?

Iterative Phases in Database
Development: Planning & Analysis

Planning
    Cost-benefit Analysis
    Effective usage Analysis
Analysis
    Enterprise Diagram
    User Requirements
    Data requirements
        Firm’s operations and relationships
    Development of logical design
        Expected output requirements
        Inputs
        Processes
        Appropriate Conceptual Model
        Data Modeling through Entity-Relationship Diagrams
    Specification of logical view(s)
    Designation of Primary and Secondary keys
    Development of Data Dictionary

Iterative Phases in Database
Development: Detailed Design

Technical Specifications
    Report Layouts
    Data Flows
    Screen Layouts
    DBMS Selection
        Data Definition Language (DDL)
        Data Manipulation Language (DML)
        Query language [Structured Query Language (SQL) and/or Query by Example (QBE)]
        Data-base Control System (DBCS)
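
The difference between a DDL, a DML, and a query language can be sketched with Python's built-in sqlite3 module. This is only an illustration: SQLite stands in for whichever DBMS is selected, and the customer table and its column names are assumptions, not part of the chapter.

```python
import sqlite3

# In-memory database; SQLite is only a stand-in for the selected DBMS.
conn = sqlite3.connect(":memory:")

# DDL: defines the structure of the data (hypothetical table and columns).
conn.execute(
    "CREATE TABLE customer ("
    " cust_no INTEGER PRIMARY KEY,"
    " cust_name TEXT NOT NULL,"
    " credit_limit INTEGER)"
)

# DML: inserts and changes the stored data.
conn.execute("INSERT INTO customer VALUES (1000, 'Adam Smith', 1000)")
conn.execute("UPDATE customer SET credit_limit = 1500 WHERE cust_no = 1000")

# Query language: retrieves data without changing it.
rows = conn.execute("SELECT cust_name, credit_limit FROM customer").fetchall()
print(rows)
```

In SQL the three roles share one language; other DBMS packages may expose them through separate tools.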

DBMS
Many DBMS packages allow users to:
Analyze Data
Prepare ad hoc or customized Reports
Create and Display Graphs
Create Customized Applications via
Programming Languages
Import and Export Data
Perform On-line Editing
Purge or Archive Obsolete Data
Backup data
Maintain Security Measures
Interface with Communication Networks

Iterative Phases in Database
Development: Post-Design Phases

Implementation
Testing
    Unit Testing
    System Testing
    User Acceptance Test
Maintenance

DATA MODELING

So far we have not considered the data required to support the data flows and processing.

Data Modeling is a technique for organising and documenting a system’s data.

Why is data modeling
considered crucial?

Data is a resource to be shared by as many processes as possible, and thus must be organized in a way that’s flexible and adaptable to unanticipated business requirements.

Data structures and properties are reasonably permanent compared with the processes that use the data.

Why is data modeling
considered crucial (cont)?

Data models are much smaller than process and object models and can be constructed more rapidly.

The process of constructing data models helps analysts and users quickly reach consensus on business terminology and rules.

Conceptual Data Model

Model that captures overall structure of organizational data

Independent of database management systems

Conceptual Data Model
(cont)

Collect information from:
    Interviews
    Questionnaires
    JAD sessions

Develop Conceptual data model (logic)

Data Modeling (cont)

There are several formats for data modeling (e.g., ERD, object-oriented environments).

A common one is called an entity relationship diagram (ERD) because it depicts data in terms of the entities and relationships described by the data.

INTRODUCTION TO E-R
MODELING

Entity Relationship Data Model
    Detailed logical representation of entities, associations, and data elements for an organization or business area

ERD
    Graphical representation of the E-R model.

FUNDAMENTALS OF E-R
MODELING

Basic E-R Modeling notation uses three main constructs:
    entities
    associated attributes
    relationships

ENTITIES

An ENTITY is something about which we want to store data.
Synonyms include entity type and entity class.
An entity is a class of persons, places, objects, events, or concepts about which we may need to capture and store data.

ENTITIES (cont)

Each entity is distinguishable from the other entities.
Each entity represents a group of many instances of that entity.
An entity instance is a single occurrence of an entity.
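
The distinction between an entity and its instances can be sketched in Python, with a class standing in for the entity and objects standing in for entity instances. This is an illustrative analogy only; the STUDENT entity and the student_id attribute are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    """The entity (entity type / entity class): a class of persons."""
    student_id: int  # hypothetical attribute
    name: str


# Each object is one entity instance: a single occurrence of STUDENT.
instances = [Student(1, "Penny Pasta"), Student(2, "Connie Curry")]

# Instances are distinguishable from one another by their data values.
print(instances[0] != instances[1])
```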

ATTRIBUTES

If an entity is something about which we want to store data, then intuitively, we need to identify what specific pieces of data we want to store about each instance of a given entity.

These pieces of data are ATTRIBUTES.

ATTRIBUTES (cont)

An attribute is a descriptive property or characteristic of an entity.
Synonyms include element, property, and field.

ATTRIBUTES (cont)

Some attributes can be logically grouped into super-attributes called compound attributes.
A compound attribute is one that actually consists of more primitive attributes.

E.g., name = first name + family name

ATTRIBUTES (cont)

MULTIVALUED ATTRIBUTES
An attribute that may take on more than one value for each entity instance (sometimes termed a repeating group)

E.g., a student may be enrolled in more than one major
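
Compound and multivalued attributes can be sketched side by side in Python. The class and field names here are illustrative assumptions: the name is a compound attribute made of two primitive attributes, and the list of majors is a multivalued attribute.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Name:
    """Compound attribute: consists of more primitive attributes."""
    first_name: str
    family_name: str


@dataclass
class Student:
    name: Name  # compound attribute
    majors: List[str] = field(default_factory=list)  # multivalued attribute


# One entity instance with two values for the multivalued attribute.
student = Student(Name("Penny", "Pasta"), majors=["Latin", "Greek"])
print(student.name.family_name, len(student.majors))
```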

IDENTIFICATION

An entity typically has many instances.

Conceptually, there exists a need to uniquely identify each instance based on the data value of one or more attributes.

Thus, every entity must have an identifier or key.

KEYS & IDENTIFIERS

A KEY is an attribute, or a group of attributes, that assumes a unique value for each entity instance.

An entity may have more than one key.

A candidate key is a “candidate to become the primary identifier” of instances of an entity.

KEYS & IDENTIFIERS (cont)

The candidate key selected as the unique identifier is the primary key.
Any candidate key that is not selected to become the primary key is called an alternate key.
A group of attributes that uniquely identifies an instance of an entity is called a concatenated key.
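
The key terminology above can be sketched as a uniqueness check over sample instances. The data, the attribute names, and the helper `is_key` are all hypothetical. Note the limitation: a key is a rule about every possible instance, so uniqueness in a small sample only shows that an attribute group could be a key, not that it is one.

```python
# Hypothetical STUDENT instances; attribute names are illustrative.
students = [
    {"student_id": 1, "email": "pp@example.edu", "major": "Latin"},
    {"student_id": 2, "email": "cc@example.edu", "major": "Greek"},
    {"student_id": 3, "email": "tl@example.edu", "major": "Latin"},
]


def is_key(rows, attrs):
    """True if the attribute group takes a unique value in every row."""
    values = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(values)) == len(values)


# Single attributes that are unique in the sample are candidate keys.
candidate_keys = [(a,) for a in students[0] if is_key(students, (a,))]
primary_key = candidate_keys[0]      # the modeller's choice
alternate_keys = candidate_keys[1:]  # candidates that were not chosen

# Concatenated key: neither attribute is unique alone, but the pair is.
enrollments = [
    {"student_id": 1, "course": "LAT101"},
    {"student_id": 1, "course": "GRK101"},
    {"student_id": 2, "course": "LAT101"},
]
print(primary_key, alternate_keys)
print(is_key(enrollments, ("student_id", "course")))
```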

RELATIONSHIPS

Conceptually, entities and attributes do not exist in isolation.

The things they represent interact with and impact one another.

Thus we introduce the concept of a RELATIONSHIP.

RELATIONSHIPS (cont)
A relationship is an association that exists between one or more entities.

The relationship may represent an event that links the entities or merely a logical affinity that exists between the entities.
• A STUDENT IS ENROLLED IN one or more DEGREES.
• A DEGREE IS BEING STUDIED BY zero, one, or more STUDENTS.

RELATIONSHIPS (cont)

The verb phrases IS ENROLLED IN and IS BEING STUDIED BY define the relationships that exist between the two entities.

Notice that all relationships are implicitly bidirectional, meaning they can be interpreted in both directions.

DEGREE OF A RELATIONSHIP

Another measure of the complexity of a data relationship is its degree.

The degree of a relationship is the number of entities that participate in the relationship.

DEGREE OF A RELATIONSHIP
(Cont)

Relationships may also exist between different instances of the same entity.
We call this a recursive relationship or UNARY RELATIONSHIP.
The degree of the relationship = 1.

DEGREE OF A RELATIONSHIP
(Cont)

A relationship that exists between instances of two different entities is a BINARY RELATIONSHIP.

The degree of the relationship = 2.

DEGREE OF A RELATIONSHIP
(Cont)

Relationships can also exist between more than two different entities.

These are sometimes called N-ary relationships, e.g., ternary relationships.

An N-ary relationship is illustrated with a new entity construct called an associative entity.
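
As a small sketch of the definition, degree can be computed as the number of distinct entities that participate in a relationship. The relationship and entity names below are illustrative assumptions (only STUDENT and DEGREE appear in this chapter).

```python
# Each relationship lists the entities that take part in it.
# A unary (recursive) relationship names the same entity twice.
relationships = {
    "MANAGES": ("EMPLOYEE", "EMPLOYEE"),          # unary
    "IS ENROLLED IN": ("STUDENT", "DEGREE"),      # binary
    "SUPPLIES": ("VENDOR", "PART", "WAREHOUSE"),  # ternary
}


def degree(participants):
    """Number of distinct entities participating in the relationship."""
    return len(set(participants))


degrees = {name: degree(p) for name, p in relationships.items()}
print(degrees)
```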

Relationships of Different Degrees

CARDINALITIES OF RELATIONSHIPS

Cardinality shows the complexity of each relationship.

Cardinality defines the minimum and maximum number of instances of one entity that may be associated with each instance of the related entity.

CARDINALITIES (Cont)

Since all relationships are bi-directional, cardinality must be defined in both directions for every relationship.

Thus we must consider whether participation is optional or mandatory, i.e., whether the minimum is 0 or 1.
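
Minimum and maximum cardinality can be checked in both directions with a short sketch. It assumes the STUDENT/DEGREE rules used earlier in the chapter (a student is enrolled in one or more degrees; a degree is studied by zero or more students); the sample data and the helper `violations` are hypothetical.

```python
from collections import Counter

students = ["Penny Pasta", "Connie Curry", "Tony Lama"]
degrees = ["Latin", "Greek", "Tibetan"]

# Hypothetical relationship instances: (student, degree) pairs.
enrolled = [
    ("Penny Pasta", "Latin"),
    ("Penny Pasta", "Greek"),
    ("Connie Curry", "Greek"),
]


def violations(instances, counts, minimum, maximum=None):
    """Instances whose related-instance count is outside [minimum, maximum]."""
    bad = []
    for inst in instances:
        n = counts.get(inst, 0)
        if n < minimum or (maximum is not None and n > maximum):
            bad.append(inst)
    return bad


per_student = Counter(s for s, _ in enrolled)
per_degree = Counter(d for _, d in enrolled)

# STUDENT to DEGREE: mandatory (minimum 1), many (no maximum).
student_problems = violations(students, per_student, minimum=1)
# DEGREE to STUDENT: optional (minimum 0), many (no maximum).
degree_problems = violations(degrees, per_degree, minimum=0)
print(student_problems, degree_problems)
```

Here Tony Lama breaks the mandatory rule because he has no enrollment, while the unstudied Tibetan degree is allowed because that direction is optional.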

Sample ER Diagram

Associative Entities

Sometimes attributes may be associated with a many-many relationship

An associative entity (gerund) is a relationship that the data modeller chooses to model as an entity type.

Associative Entities (cont)

You must turn the relationship into an associative entity if the associative entity is involved in relationships with additional entities

Note the M:M is replaced by two mandatory 1:M relationships
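
Replacing an M:M relationship with an associative entity can be sketched as follows. ENROLLMENT is a hypothetical associative entity between STUDENT and DEGREE; the identifiers, codes, and dates are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    student_id: int
    name: str


@dataclass(frozen=True)
class Degree:
    degree_code: str
    title: str


@dataclass(frozen=True)
class Enrollment:
    """Associative entity: one instance per STUDENT-DEGREE pairing."""
    student_id: int   # the "many" end of a 1:M from STUDENT
    degree_code: str  # the "many" end of a 1:M from DEGREE
    enrolled_on: str  # attribute that belongs to the relationship itself


students = [Student(1, "Penny Pasta"), Student(2, "Connie Curry")]
degrees = [Degree("LAT", "Latin"), Degree("GRK", "Greek")]
enrollments = [
    Enrollment(1, "LAT", "2024-09-01"),
    Enrollment(1, "GRK", "2024-09-01"),
    Enrollment(2, "GRK", "2024-09-02"),
]


def degrees_of(student_id):
    """Follow both 1:M relationships to list a student's degree titles."""
    codes = {e.degree_code for e in enrollments if e.student_id == student_id}
    return sorted(d.title for d in degrees if d.degree_code in codes)


print(degrees_of(1))
```

Each Enrollment must refer to exactly one student and one degree, which is why both 1:M relationships are mandatory on the associative entity's side.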

Relational Databases

In a relational database, data are perceived by users to be structured in the form of simple flat files or tables
Each table consists of records that are comprised of a key and associated data elements
To qualify as a relational database, a system must do the following:
    Present data to users as tables only
    Support the relational algebra functions of Restrict (Select), Project, and Join without requiring any definitions of access paths to support these operations

Relational Algebra Functions in
a Relational Database - Select

Select (Restrict): This function produces a new table with only those rows from a single source table whose column values meet prescribed conditions, e.g., Customer_Name = Adam Smith; DOB = 2/29/64; Legal Residence = California, etc.

Select

Cust. No.  Cust. Name   Date of Birth  Credit Limit  Legal Res.
1000       Adam Smith   3-12-62        1000          CA
1010       Lord Keynes  2-29-64        2000          TX
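
A minimal sketch of Select over the customer table above, with each row held as a Python dictionary. The dictionary keys and the helper name `select` are assumptions made for this illustration; a real DBMS performs this operation internally.

```python
# The customer table from the slide, one dict per row.
customers = [
    {"cust_no": 1000, "cust_name": "Adam Smith", "date_of_birth": "3-12-62",
     "credit_limit": 1000, "legal_res": "CA"},
    {"cust_no": 1010, "cust_name": "Lord Keynes", "date_of_birth": "2-29-64",
     "credit_limit": 2000, "legal_res": "TX"},
]


def select(table, condition):
    """Return a new table holding only the rows that meet the condition."""
    return [row for row in table if condition(row)]


# Rows whose Legal Residence is California.
california = select(customers, lambda row: row["legal_res"] == "CA")
print([row["cust_name"] for row in california])
```

The result keeps every column of the source table; only the number of rows changes.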

Relational Algebra Functions in
a Relational Database - Project

This function produces a new table with only some columns from a single source table, e.g., Project Student table on Student_Name and Student_Major

Student_Name            Student_Major
Estudiante Garcia       French
Madeleine Notallbright  International Relations
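
A minimal sketch of Project, again with rows as dictionaries. The Student_Status values are hypothetical, added only so that the source table has a column to drop; the helper name `project` is also an assumption.

```python
# Source table; the Student_Status values are made up for illustration.
students = [
    {"Student_Name": "Estudiante Garcia", "Student_Major": "French",
     "Student_Status": "Senior"},
    {"Student_Name": "Madeleine Notallbright",
     "Student_Major": "International Relations",
     "Student_Status": "Junior"},
]


def project(table, columns):
    """Keep only the named columns, dropping any duplicate rows that result."""
    seen = set()
    result = []
    for row in table:
        reduced = tuple(row[c] for c in columns)
        if reduced not in seen:
            seen.add(reduced)
            result.append(dict(zip(columns, reduced)))
    return result


projected = project(students, ("Student_Name", "Student_Major"))
print(projected)
```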

Relational Algebra Functions in a
Relational Database - Select & Project

The combination of Select and Project produces a new table with both fewer columns and rows than the original table, e.g., Project on Student_Name and Student_Major where Student_Major = Latin

Select & Project

Student_Name  Student_Major  Student_Status
Penny Pasta   Latin          Senior
Connie Curry  Greek          Freshman
Tony Lama     Tibetan        Junior

Student_Name  Student_Major
Penny Pasta   Latin
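
The Select & Project example above can be reproduced in a few lines of Python. This is a sketch with rows held as dictionaries, not how a DBMS implements the operations.

```python
# The source table from the slide.
students = [
    {"Student_Name": "Penny Pasta", "Student_Major": "Latin",
     "Student_Status": "Senior"},
    {"Student_Name": "Connie Curry", "Student_Major": "Greek",
     "Student_Status": "Freshman"},
    {"Student_Name": "Tony Lama", "Student_Major": "Tibetan",
     "Student_Status": "Junior"},
]

# Select: keep only the rows where Student_Major = Latin.
selected = [row for row in students if row["Student_Major"] == "Latin"]

# Project: keep only the Student_Name and Student_Major columns.
columns = ("Student_Name", "Student_Major")
result = [{c: row[c] for c in columns} for row in selected]
print(result)
```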

Relational Algebra Functions
in a Relational Database - Join

The Join function produces a new table from two or more source tables that have at least one common column
The new table is wider than either of the two source tables because it contains the columns from both source tables (the common column appears once)

Join

Customer_Name  Customer_Code
John Doe       1001

+

Customer_Code  Credit_Limit
1001           10,000

=

Customer_Name  Customer_Code  Credit_Limit
John Doe       1001           10,000
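
The Join example above can be sketched in Python as a match on the common column. The helper name `join` is an assumption; it pairs every row of one table with every row of the other and keeps the pairs that agree on Customer_Code.

```python
names = [{"Customer_Name": "John Doe", "Customer_Code": 1001}]
limits = [{"Customer_Code": 1001, "Credit_Limit": 10000}]


def join(left, right, on):
    """Combine rows that share a value in the common column.

    Merging the two dicts keeps the common column only once.
    """
    return [
        {**a, **b}
        for a in left
        for b in right
        if a[on] == b[on]
    ]


joined = join(names, limits, "Customer_Code")
print(joined)
```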

Query Languages for a
Relational Database

Structured Query Language (SQL)
SELECT CLIENT_NO, CLIENT_NAME,
PROJECT_NAME
FROM PROJ.TABL
WHERE CLIENT_NO = 531
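
The query above combines a Project (the column list) with a Select (the WHERE clause). As a sketch, it can be run unchanged with Python's built-in sqlite3 module. The setup is an assumption made for illustration: an in-memory database is attached under the name PROJ so that the qualified name PROJ.TABL resolves, and the two sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Attach a second in-memory database under the schema name PROJ so that
# the qualified table name PROJ.TABL from the slide resolves.
conn.execute("ATTACH DATABASE ':memory:' AS PROJ")
conn.execute(
    "CREATE TABLE PROJ.TABL ("
    " CLIENT_NO INTEGER, CLIENT_NAME TEXT, PROJECT_NAME TEXT)"
)

# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO PROJ.TABL VALUES (?, ?, ?)",
    [(531, "Acme Co.", "Payroll"), (532, "Globex", "Inventory")],
)

# The query from the slide, unchanged.
rows = conn.execute(
    "SELECT CLIENT_NO, CLIENT_NAME, PROJECT_NAME "
    "FROM PROJ.TABL "
    "WHERE CLIENT_NO = 531"
).fetchall()
print(rows)
```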

Thanks.
