Professional Documents
Culture Documents
All Basic Principles and Concept of Databases
All Basic Principles and Concept of Databases
All Basic Principles and Concept of Databases
Principles &
Fundamentals of
Design,
DATA MODELS
OUTLINE
• What a database is, what it does, and why
database design is important
• How modern databases evolved from files and
file systems
• About flaws in file system data management
• What a DBMS is, what it does, and how it fits into
the database system
• About types of database systems and database
models
2
OUTLINE CONTINUE
4
Data-Information-Decision Cycle
Figure 16.1
5
Database Management
• A database is a shared, integrated computer
structure containing:
– Application (or end user) data
– Metadata (data about data, eg. datatype, length,
required/not required, validation, …)
• Database Management System (DBMS)
– Manages Database structure
– Controls access to data
– Provides query language
6
Advantages of DBMS
• Makes data management more efficient and
effective
• Query language allows quick answers to ad hoc
(one time) queries
• Provides easier access to better-managed data
• Promotes integrated view of organization’s
operations
• Reduces the probability of inconsistent data (same
data stored in different places with possibility of
different values)
7
DBMS Manages Interaction
Figure 1.2
8
Database Design
• Importance of Good Design
– Poor design results in unwanted data redundancy
(unnecessary duplication of data)
– Poor design generates errors leading to decisions
based on incorrect data
• Practical Approach
– Focus on principles and concepts of database
design
– Importance of logical design
9
Historical Roots of Database
• First computer applications focused on clerical
tasks (eg preparing bills)
• Requests for information (eg how many bills were
not paid this month) quickly followed
• File systems developed to address needs
– Data organized according to expected use
– Data Processing (DP) specialists computerized
manual file systems
10
File Terminology
• Data
– Raw Facts
• Field
– Group of characters with specific meaning
• Record
– Logically connected fields that describe a person,
place, or thing
• File
– Collection of related records
11
Simple File System
Figure 1.5
12
File System Disadvantages
• File System Data Management
– Requires extensive programming in third-
generation language (3GL)
– Time consuming
– Makes ad hoc queries impossible
– Leads to islands of information
13
File System Critique (con’t.)
• Data Dependence
– Change in file’s data characteristics requires
modification of data access programs
– Must tell program what to do and how
– Makes file systems cumbersome from
programming and data management views
• Structural Dependence
– Change in file structure requires modification of
related programs
14
File System Critique (con’t.)
• Field Definitions and Naming Conventions
– Flexible record definition anticipates reporting
requirements
– Selection of proper field names important
– Attention to length of field names
– Use of unique record identifiers
15
File System Critique (con’t.)
• Data Redundancy
– Different and conflicting versions of same data
– Results of uncontrolled data redundancy
• Data anomalies
– Modification
– Insertion
– Deletion
• Data inconsistency
– Lack of data integrity
16
Database Systems
• Database consists of logically related data stored
in a single repository
• Provides advantages over file system
management approach
– Eliminates inconsistency, data anomalies, data
dependency, and structural dependency problems
– Stores data structures, relationships, and access
paths in addition to application data
17
Database vs. File Systems
Figure 1.6
18
Database System Environment
Figure 1.7
19
Database System Types
22
Implementation Database Models
23
Hierarchical Database Model
• Logically represented by an upside down tree
– Each parent can have many children
– Each child has only one parent
Figure 1.8
24
Hierarchical Database Model
• Advantages
– Conceptual simplicity
– Database security and integrity
– Data independence
– Efficiency
• Disadvantages
– Complex implementation
– Difficult to manage and lack of standards
– Lacks structural independence
– Applications programming and use complexity
– Implementation limitations
25
Network Database Model
• Each record can have multiple parents
– Composed of sets
– Each set has owner record and member record
– Member may have several owners
Figure
1.10
26
Network Database Model
• Advantages
– Conceptual simplicity
– Handles more relationship types
– Data access flexibility
– Promotes database integrity
– Data independence
– Conformance to standards
• Disadvantages
– System complexity
– Lack of structural independence
27
Relational Database Model
28
Relational Database Model (con’t.)
Figure 1.11
29
Relational Database Model
• Advantages
– Structural independence
– Improved conceptual simplicity
– Easier database design, implementation,
management, and use
– Ad hoc query capability with SQL
– Powerful database management system
30
Relational Database Model
• Disadvantages
– Substantial hardware and system software
overhead
– Poor design and implementation is made easy
– May promote “islands of information” problems
31
Entity Relationship Database Model
• Complements the relational data model concepts
• Represented in an entity relationship diagram
(ERD)
• Based on entities, attributes, and relationships
Figure 1.13
32
Entity Relationship Database Model
• Advantages
– Exceptional conceptual simplicity
– Visual representation
– Effective communication tool
– Integrated with the relational database model
• Disadvantages
– Limited constraint representation
– Limited relationship representation
– No data manipulation language
– Loss of information content
33
Design Principle Introduction
34
Data Modeling and Data Models
• Data models
– Relatively simple representations of complex real-
world data structures
• Often graphical
• Model: an abstraction of a real-world object or
event
– Useful in understanding complexities of the real-
world environment
• Data modeling is iterative and progressive
35
The Importance of Data Models
36
Data Model Basic Building Blocks
37
Business Rules
• Descriptions of policies, procedures, or principles
within a specific organization
– Apply to any organization that stores and uses data to
generate information
• Description of operations to create/enforce actions
within an organization’s environment
– Must be in writing and kept up to date
– Must be easy to understand and widely disseminated
• Describe characteristics of data as viewed by the
company
38
Discovering Business Rules
• Sources of business rules:
– Company managers
– Policy makers
– Department managers
– Written documentation
• Procedures
• Standards
• Operations manuals
– Direct interviews with end users
39
Discovering Business Rules (cont’d.)
40
Translating Business Rules into Data
Model Components
• Nouns translate into entities
• Verbs translate into relationships among entities
• Relationships are bidirectional
• Two questions to identify the relationship type:
– How many instances of B are related to one
instance of A?
– How many instances of A are related to one
instance of B?
41
Naming Conventions
42
The Evolution of Data Models
43
Hierarchical and Network Models
44
Hierarchical and Network Models
(cont’d.)
• Network model
– Created to represent complex data relationships
more effectively than the hierarchical model
– Improves database performance
– Imposes a database standard
– Resembles hierarchical model
• Record may have more than one parent
45
Hierarchical and Network Models
(cont’d.)
– Collection of records in 1:M relationships
– Set composed of two record types:
• Owner
• Member
• Network model concepts still used today:
– Schema
• Conceptual organization of entire database as
viewed by the database administrator
– Subschema
• Database portion “seen” by the application
programs
46
Hierarchical and Network Models
(cont’d.)
– Data management language (DML)
• Defines the environment in which data can be
managed
– Data definition language (DDL)
• Enables the administrator to define the schema
components
47
The Relational Model
48
The Relational Model (cont’d.)
49
The Relational Model (cont’d.)
• SQL-based relational database application
involves three parts:
– End-user interface
• Allows end user to interact with the data
– Set of tables stored in the database
• Each table is independent from another
• Rows in different tables are related based on
common values in common attributes
– SQL “engine”
• Executes all queries
52
The Entity Relationship Model
53
The Entity Relationship Model (cont’d.)
54
The Object-Oriented (OO) Model
56
The Object-Oriented (OO) Model
(cont’d.)
• Attributes describe the properties of an object
• Objects that share similar characteristics are
grouped in classes
• Classes are organized in a class hierarchy
• Inheritance: object inherits methods and
attributes of parent class
• UML based on OO concepts that describe
diagrams and symbols
– Used to graphically model a system
57
Object/Relational and XML
59
Object/Relational and XML (cont’d.)
60
Emerging Data Models: Big Data and
NoSQL
• Big Data
– Find new and better ways to manage large
amounts of Web-generated data and derive
business insight from it
– Simultaneously provides high performance and
scalability at a reasonable cost
– Relational approach does not always match the
needs of organizations with Big Data challenges
61
Emerging Data Models: Big Data and
NoSQL (cont’d.)
• NoSQL databases
– Not based on the relational model, hence the name
NoSQL
– Supports distributed database architectures
– Provides high scalability, high availability, and fault
tolerance
– Supports very large amounts of sparse data
– Geared toward performance rather than transaction
consistency
62
Emerging Data Models: Big Data and
NoSQL (cont’d.)
• Key-value data model
– Two data elements: key and value
• Every key has a corresponding value or set of values
• Sparse data
– Number of attributes is very large
– Number of actual data instances is low
• Eventual consistency
– Updates will propagate through system; eventually
all data copies will be consistent
63
Data Models: A Summary
• Common characteristics:
– Conceptual simplicity with semantic completeness
– Represent the real world as closely as possible
– Real-world transformations must comply with
consistency and integrity characteristics
• Each new data model capitalized on the
shortcomings of previous models
• Some models better suited for some tasks
65
66
Degrees of Data Abstraction
67
The External Model
68
69
The External Model (cont’d.)
70
The Conceptual Model
71
The Conceptual Model (cont’d.)
73
The Internal Model
74
The Physical Model
76
Summary
78
Summary (cont’d.)
• Hierarchical model
– Set of one-to-many (1:M) relationships between a
parent and its children segments
• Network data model
– Uses sets to represent 1:M relationships between
record types
• Relational model
– Current database implementation standard
– ER model is a tool for data modeling
• Complements relational model
79
Summary
• Object-oriented (cont’d.)
data model: object is basic
modeling structure
• Relational model adopted object-oriented
extensions: extended relational data model
(ERDM)
• OO data models depicted using UML
• Data-modeling requirements are a function of
different data views and abstraction levels
– Three abstraction levels: external, conceptual, and
internal
80
Object-Oriented Database Model
• Objects or abstractions of real-world entities are
stored
– Attributes describe properties
– Collection of similar objects is a class
• Methods represent real world actions of classes
• Classes are organized in a class hierarchy
– Inheritance is ability of object to inherit attributes
and methods of classes above it
81
OO Data Model
• Advantages
– Adds semantic content
– Visual presentation includes semantic content
– Database integrity
– Both structural and data independence
• Disadvantages
– Lack of OODM
– Complex navigational data access
– Steep learning curve
– High system overhead slows transactions
82
Database Models and the Internet
• Characteristics of “Internet age” databases
– Flexible, efficient, and secure Internet access
– Easily used, developed, and supported
– Supports complex data types and relationships
– Seamless interfaces with multiple data sources
and structures
– Simplicity of conceptual database model
– Many database design, implementation, and
application development tools
– Powerful DBMS GUI make DBA job easier
83