
Chapter 2 – Data Models

Chapter 9 – Database Design


Learning Objectives
• After completing this chapter, you will be able to:
• Discuss data modeling and why data models are important
• Describe the basic data-modeling building blocks
• Define what business rules are and how they influence database design
• Understand how the major data models evolved
• List emerging alternative data models and the needs they fulfill
• Explain how data models can be classified by their level of abstraction
Data Modeling and Data Models

• Data modeling is “a process”


• Data collection & Business rules & Requirements collection
• Conceptual Data Model
• Logical Data Model
• Physical Data Model
• Implementation
• Maintenance

• A data model is the schema or blueprint of the solution (the outcome of data modeling)
• Simple graphical representation
The Importance of Data Models

• Facilitates communication
• Gives various views of the database
• Organizes data for various users
• Provides an abstraction for the creation of a good database
Business Rules
• Brief, precise, and unambiguous description of a policy, procedure, or
principle
• Create and enforce actions within that organization’s environment
• Support proper identification of entities, attributes, relationships, and constraints
Discovering Business Rules (1 of 2)
• Sources of business rules
• Company managers
• Policy makers
• Department managers
• Written documentation
• Direct interviews with end users and customers

• How about government regulations?


Discovering Business Rules (2 of 2)
• Reasons for identifying and documenting business rules
• Standardize company’s view of data
• Serve as a communication tool between users and designers
• Assist designers
• Understand the nature, role, scope of data, and business processes
• Develop appropriate relationship participation rules and constraints
• Create an accurate data model
Translating Business Rules into Data Model Components
• Business rules set the stage for the proper identification of entities, attributes,
relationships, and constraints
• Nouns translate into entities
• Verbs translate into relationships among entities
• Relationships are bidirectional!
• Questions to identify the relationship type
• How many instances of B are related to one instance of A?
• How many instances of A are related to one instance of B?

• Each applicant can submit one or more applications (one application per year for multiple
years).
• Each application is submitted by only one applicant.
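The two questions above can be sketched in code. This is a minimal illustration, not a design tool: the function name `classify_relationship` and the sample applicant/application identifiers are made up for the example, and sample data can only suggest a relationship type. The business rules remain the authority.

```python
from collections import defaultdict

def classify_relationship(pairs):
    """Classify the A-B relationship from observed (a, b) instance pairs."""
    b_per_a = defaultdict(set)  # how many B instances relate to each A
    a_per_b = defaultdict(set)  # how many A instances relate to each B
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)
    many_b = any(len(bs) > 1 for bs in b_per_a.values())
    many_a = any(len(a_set) > 1 for a_set in a_per_b.values())
    if many_a and many_b:
        return "M:N"
    if many_a or many_b:
        return "1:M"
    return "1:1"

# Hypothetical data: Ann applied in two years, Bob once.
submissions = [("Ann", "APP-2021"), ("Ann", "APP-2022"), ("Bob", "APP-2022-B")]
relationship = classify_relationship(submissions)
print(relationship)
```

Here each applicant maps to one or more applications and each application maps back to exactly one applicant, so the sketch reports a one-to-many relationship.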
Hierarchical and Network Models
• Hierarchical models: developed to manage large amounts of data for
complex manufacturing projects
• Represented by an upside-down tree which contains segments
• Segments are the equivalent of a file system’s record type
• Depicts a set of one-to-many (1:M) relationships
• The network model allows a record to have more than one parent and supports many-to-many (M:N) relationships

Schema
Hierarchical Model
• Advantages
• Promotes data sharing
• Parent/child relationship promotes conceptual simplicity and data integrity
• Database security is provided and enforced by DBMS
• Efficient with 1:M relationships
• Disadvantages
• Requires knowledge of physical data storage characteristics
• Navigational system requires knowledge of hierarchical path
• Changes in structure require changes in all application programs
• Implementation limitations
• No data definition
• Lack of standards
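A rough sketch of what navigational access means, using a nested Python dictionary as a stand-in for a hierarchical database. The segment names and the `fetch` helper are hypothetical; real hierarchical DBMSs expose their own access languages.

```python
# Hypothetical hierarchy: every segment has exactly one parent (1:M downward).
hierarchy = {
    "Final assembly": {
        "Component A": {"Part A1": {}, "Part A2": {}},
        "Component B": {"Part B1": {}},
    }
}

def fetch(tree, path):
    """Walk from the root along a known path of segment names."""
    node = tree
    for segment in path:
        node = node[segment]  # KeyError if the caller does not know the path
    return node

children = sorted(fetch(hierarchy, ["Final assembly", "Component A"]))
print(children)
```

The caller has to spell out the full path from the root, which is why a structural change in the tree forces changes in every program that navigates it.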
Network Model
• Advantages
• Conceptual simplicity
• Handles more relationship types
• Data access is flexible
• Data owner/member relationship promotes data integrity
• Conformance to standards
• Includes data definition language (DDL) and data manipulation language (DML)
• Disadvantages
• System complexity limits efficiency
• Navigational system yields complex implementation, application development, and
management
• Structural changes require changes in all application programs
The Relational Model
• Relational database management system (RDBMS)
• Performs basic functions provided by the hierarchical and network DBMS
systems
• Makes the relational data model easier to understand and implement
• Hides the complexities of the relational model from the user
• Relationship types: one-to-one (1:1), one-to-many (1:M), and many-to-many (M:N)
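One common way to hold an M:N relationship in relational tables is a bridge table that splits it into two 1:M relationships. The sketch below uses SQLite through Python's `sqlite3` module; the `student`, `course`, and `enroll` tables are hypothetical and not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Bridge table: each row links one student to one course.
CREATE TABLE enroll (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO course  VALUES (10, 'Databases'), (20, 'Networks');
INSERT INTO enroll  VALUES (1, 10), (1, 20), (2, 10);
""")

# One student, many courses.
courses_of_ann = [row[0] for row in conn.execute(
    "SELECT c.title FROM enroll e JOIN course c ON c.course_id = e.course_id "
    "WHERE e.student_id = 1 ORDER BY c.title")]

# One course, many students.
students_in_databases = [row[0] for row in conn.execute(
    "SELECT s.name FROM enroll e JOIN student s ON s.student_id = e.student_id "
    "WHERE e.course_id = 10 ORDER BY s.name")]

print(courses_of_ann, students_in_databases)
```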
Data Model Basic Building Blocks
• Entity: person, place, thing, or event about which data will be collected and stored
• Attribute: characteristic of an entity
• Relationship: association among entities
• One-to-many (1:M or 1..*)
• Many-to-many (M:N or *..*)
• One-to-one (1:1 or 1..1)
• Constraint: restriction placed on data
• Ensures data integrity
Naming Conventions
• Entity name requirements
• Be descriptive of the objects in the business environment
• Use terminology that is familiar to the users
• Attribute name
• Required to be descriptive of the data represented by the attribute
• Proper naming
• Facilitates communication between parties
• Promotes self-documentation

• Snake vs. camel vs. kebab vs. Pascal case: what are they?

• Snake case: my_first_variable=my_second_variable-my_third_variable
• Camel case: myFirstVariable=mySecondVariable-myThirdVariable
• Kebab case: my-first-variable=my-second-variable-my-third-variable
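As a small illustration, the four styles can be generated from the same list of words. The helper names below are made up for this sketch.

```python
def to_snake(words):
    return "_".join(w.lower() for w in words)

def to_kebab(words):
    return "-".join(w.lower() for w in words)

def to_pascal(words):
    return "".join(w.capitalize() for w in words)

def to_camel(words):
    # Camel case is Pascal case with a lowercase first letter.
    pascal = to_pascal(words)
    return pascal[:1].lower() + pascal[1:]

words = ["my", "first", "variable"]
styles = {
    "snake": to_snake(words),
    "camel": to_camel(words),
    "kebab": to_kebab(words),
    "pascal": to_pascal(words),
}
for style, name in styles.items():
    print(style, name)
```

Note that kebab case reuses the minus sign as a separator, so an expression such as the kebab example above is ambiguous in most programming and query languages.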
Relational Model
• Advantages
• Structural independence is promoted using independent tables
• Tabular view improves conceptual simplicity
• Ad hoc query capability is based on SQL
• Isolates the end user from physical-level details
• Improves implementation and management simplicity
• Disadvantages
• Requires substantial hardware and system software overhead
• Conceptual simplicity gives untrained people the tools to use a good system poorly
• May promote information problems
The Entity Relationship Model
• Graphical representation of entities and their relationships in a database
structure
• ER models are represented in an entity relationship diagram (ERD), which uses graphic
representations to model database components
• Entity instance or entity occurrence: a row in the relational table
• An entity can be anything
• Drawn as a rectangle
• Named with a singular noun
• Attributes: describe particular characteristics
• Connectivity: term used to label the relationship types
• Relationship names use an active or passive verb
Entity Relationship Model
• Advantages
• Visual modeling yields conceptual simplicity
• Visual representation makes it an effective communication tool
• Is integrated with the dominant relational model
• Disadvantages
• Limited constraint representation
• Limited relationship representation
• No data manipulation language
• Loss of information content occurs when attributes are removed from entities to avoid
crowded displays
The Object-Oriented Data Model (1 of 3)

• Both data and its relationships are contained in a single structure known as an object
• Object-oriented database management system (OODBMS): based on OODM
• Object: contains data and their relationships with operations that are
performed on it
• Basic building block for autonomous structures
• Abstraction of real-world entity
• Attribute: describes the properties of an object
The Object-Oriented Data Model (2 of 3)

• Class: collection of similar objects with shared structure and behavior organized in a class hierarchy
• Class hierarchy: resembles an upside-down tree in which each class
has only one parent
• Inheritance: object inherits methods and attributes of classes above it
• Unified Modeling Language (UML): describes sets of diagrams and
symbols to graphically model a system
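The terms above map directly onto constructs in an object-oriented language. A minimal sketch in Python, with hypothetical `Person` and `Instructor` classes:

```python
class Person:
    """Parent class: shared structure (name) and behavior (describe)."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} ({type(self).__name__})"


class Instructor(Person):
    """Child class: inherits name and describe(), adds its own attribute."""

    def __init__(self, name, department):
        super().__init__(name)
        self.department = department


ann = Instructor("Ann", "Computer Science")
description = ann.describe()  # method inherited from Person
print(description)
```

The object `ann` carries its data and the operations on that data together, and it picks up `describe` from the class above it in the hierarchy.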
The Object-Oriented Data Model (3 of 3)
• Advantages
• Semantic content is added
• Visual representation includes semantic content
• Inheritance promotes data integrity
• Disadvantages
• Slow development of standards caused vendors to supply their own
enhancements
• Complex navigational system
• Learning curve is steep
• High system overhead slows transactions
Emerging Data Models: Big Data and NoSQL (1 of 3)

• Goals of Big Data


• Find new and better ways to manage large amounts of web and sensor-
generated data
• Provide high performance at a reasonable cost
• Characteristics of Big Data
• Volume
• Velocity
• Variety
Emerging Data Models: Big Data and NoSQL (2 of 3)

• Challenges of Big Data


• Volume does not allow the use of conventional structures
• Expensive
• OLAP tools proved inconsistent when dealing with unstructured data
• New technologies of Big Data
• Hadoop
• Hadoop Distributed File System (HDFS)
• MapReduce
• NoSQL
Emerging Data Models: Big Data and NoSQL (3 of 3)

• NoSQL databases
• Not based on the relational model
• Support distributed database architectures
• Provide high scalability, high availability, and fault tolerance
• Support large amounts of sparse data
• Geared toward performance rather than transaction consistency
• Provide a broad umbrella for data storage and manipulation
NoSQL
• Advantages
• High scalability, availability, and fault tolerance are provided
• Uses low-cost commodity hardware
• Supports Big Data
• Key-value model improves storage efficiency
• Disadvantages
• Complex programming is required
• There is no relationship support
• There is no transaction integrity support
• In terms of data consistency, it provides an eventually consistent model
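To make the key-value idea concrete, here is a toy in-memory store built on a Python dictionary. It is only an illustration of the data shape, not a model of any real NoSQL product; the `put` and `get` helpers and the sensor keys are invented for the sketch.

```python
store = {}

def put(key, **attributes):
    """Store whatever attributes exist for this key; there is no fixed schema."""
    store[key] = attributes

def get(key, attribute, default=None):
    return store.get(key, {}).get(attribute, default)

# Sparse data: each key keeps only the attributes it actually has.
put("sensor:1", temperature=21.5)
put("sensor:2", humidity=40, battery="low")

stored_fields = sum(len(attrs) for attrs in store.values())
print(stored_fields, get("sensor:1", "humidity"))
```

A fixed relational table with temperature, humidity, and battery columns would hold six cells for these two sensors, three of them NULL. The key-value layout stores only the three values that exist, which is the storage-efficiency point made above. The price is that relationships and integrity checks move into application code.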
Data Modeling & Database Modeling
• Data modeling (a process)
• Conceptual
• Logical
• Physical

• Database modeling (an architectural decision within data modeling)
• Hierarchical
• Network
• Relational
• Object-Oriented
Degrees of Data Abstraction
The External Model
• End users’ view of the data environment
• People who use the application programs to manipulate the data and generate
information

• ER diagrams are used to represent the external views
• External schema: specific representation of an external view
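In a relational DBMS, one way an external view often ends up being implemented is an SQL view that exposes only what a given group of users needs. The slides do not prescribe this; the sketch below is an assumption-labeled example using SQLite, with a hypothetical `employee` table and `staff_directory` view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL);
INSERT INTO employee VALUES (1, 'Ann', 'Sales', 52000), (2, 'Bob', 'IT', 61000);
-- External view for a directory application: the salary column is hidden.
CREATE VIEW staff_directory AS SELECT name, dept FROM employee;
""")

cursor = conn.execute("SELECT * FROM staff_directory ORDER BY name")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(columns, rows)
```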
The Conceptual Model
• Represents a global view of the entire database by the
entire organization
• Conceptual schema: basis for the identification and high-level
description of the main data objects
• Logical design: task of creating a conceptual data model
• Conceptual model advantages
• Macro-level view of data environment
• Software and hardware independent
The Internal Model
• Represents the database as seen by the DBMS by mapping the conceptual model to the DBMS
• Internal schema: specific representation of an internal model, using the database constructs
supported by the chosen database
• Logical independence: changing the internal model without affecting the conceptual model
• Hardware independent: unaffected by the type of computer on which the software is installed
The Physical Model
• Operates at lowest level of abstraction
• Describes the way data are saved on storage media such as magnetic, solid
state, or optical media
• Requires the definition of physical storage and data access methods
• Software and hardware dependent
• Relational model aimed at logical level
• Does not require physical-level details
• Physical independence: changes in physical model do not affect
internal model
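Physical independence can be observed in a small experiment: adding an index changes how data is accessed but not what a query returns. The sketch uses SQLite, and the `reading` table and index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO reading VALUES (?, ?)",
    [("s1", 1.0), ("s2", 2.5), ("s1", 3.0)],
)

query = "SELECT value FROM reading WHERE sensor = 's1' ORDER BY value"
before = conn.execute(query).fetchall()

# Physical-level change: a new access path, same tables and columns.
conn.execute("CREATE INDEX idx_reading_sensor ON reading (sensor)")
after = conn.execute(query).fetchall()

print(before == after)
```

The query text and its result are the same before and after the index is created; only the storage and access method changed.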
Models
Levels of Data Abstraction

Model       Degree of Abstraction  Focus                                             Independent of
External    High                   End-user views                                    Hardware and software
Conceptual  Medium-High            Global view of data (database model independent)  Hardware and software
Internal    Medium-Low             Specific database model                           Hardware
Physical    Low                    Storage and access methods                        Neither hardware nor software
Systems Development Life Cycle (SDLC)

• Traces the history of an information system
• Is a process framework defining the tasks performed at each step
• Traditional SDLC is divided into five phases
• Planning: yields a general overview of the company and its objectives
• Analysis: problems defined during planning phase are examined in greater
detail
• Detailed systems design: designer completes the design of the system’s
processes
• Implementation: hardware, DBMS software, and application programs are
installed, and the database design is implemented
• Maintenance: corrective, adaptive, and perfective
• Iterative rather than sequential process
DBLC vs SDLC
The Database Life Cycle
• How does the company run its business? Mission, vision, organizational structure
• Meet with all end users
• Gather information about problems
• Find relationships between problems
• Find constraints, limits, and scope of tasks
• The ERM is a communication tool between parties
• Conceptual > Logical > Physical design
• Business view: What?
• Technical/designer view: How?
• Where are we going to install the DBMS? Cloud?
• Database creation with SysAdmins/DomainAdmins
• ETL performed? Migrated? Aggregated? Security?
• Data integrity and security > DBMS
• Check limits and boundaries
• Regulations and compliance? PCI, FERPA
• Patches, bugs, bottlenecks?
• Demands?
• Preventive, corrective, adaptive, audits, reporting


Database Design

Conceptual Design

Logical Design

Physical Design
Conceptual Design (1 of 6)

• Goal: design a database independent of database software and physical details
• Conceptual data model: describes main data entities, attributes, relationships, and constraints
• Step 1: Data analysis and requirements
• Step 2: Entity relationship modeling and normalization
• Step 3: Data model verification
• Step 4: Distributed database design
Conceptual Design (2 of 6)

• Goal: design a database independent of database software and physical details

• Easy to understand
• Simple boxes, lines, and text
• Easy to enhance
• Highly abstract, no details
• Describes main data entities, attributes, relationships, and constraints
• Translation from business requirements
• You will most likely have multiple versions
Conceptual Design (3 of 6)

Step 1: Data analysis and requirements

• Goal: collect all data and requirements from different sources and make sure all business
processes and their needs are understood correctly

Activities during this stage:

• Meet with users from all levels
• Observe the current system and its problems
• Meet with the system/DBA design team
• Create the scope of work and apply some standards
Conceptual Design (4 of 6)

Step 2: Entity relationship modeling and normalization

• Goal: create the first relationships between entities and the details of attributes

Activities during this stage:

• Create formal communication tools
• Define relationships among entities
• Decide on primary keys (PKs) and foreign keys (FKs)
Conceptual Design (5 of 6)

Step 3: Data model verification

• Goal: finalize all tests, compare the model with the proposed system and needs, and confirm
with all stakeholders. Most of the time this is the last step of the conceptual design stage.

Activities during this stage:

• Make sure all modules are in place
• Check all boundaries and limits
• Make sure all access rights and security needs are in place

• Modules create loose coupling!
• Sometimes we may not be able to remove all duplicated and overlapping data
• Enterprise modules don't let you have loose coupling
Conceptual Design (6 of 6)

Step 4: Distributed database design

• Goal: this step is not required, but from a performance standpoint you may need it if your
database requires very fast response, especially for geographically dispersed demands

Some questions to ask before deciding on a distributed database:

• Do we have an aggregation or performance problem?
• Do we have a content delivery network (CDN)?
• Do we have mandatory government regulations?
• Do we have to follow any compliance requirements?

We are still independent of any hardware and software.
DBMS Software Selection
• Factors that affect the purchasing decision
• Cost
• DBMS features and tools
• Underlying model
• Portability
• DBMS hardware requirements

• Now we have decided to use MS-SQL
• We are no longer independent of software; however, we are still independent of hardware
Logical Design (1 of 9)

• Goal: design a database that is based on a specific data model but independent of database
physical details
• We need to consider DBMS requirements and boundaries
• Step 1: Map the conceptual model to logical model components
• Step 2: Validate the logical model using normalization
• Step 3: Validate the logical model integrity constraints
• Step 4: Validate the logical model against user requirements
Logical Design (2 of 9)

Step 1: Map the conceptual model to logical model components

• Goal: create the first relationships between entities and the details of attributes

Activities during this stage:

• Requires that all objects in the conceptual model be mapped to the specific constructs
used by the selected database model
Logical Design (3 of 9)

Step 1: Map the conceptual model to logical model components

Step  Activity
1     Map strong entities
2     Map supertype/subtype relationships
3     Map weak entities
4     Map binary relationships
5     Map higher-degree relationships

• A strong entity is one that resides on the “1” side of all its relationships

[Figure: ER diagram with the relationships “fills”, “has”, and “teaches”]
Logical Design (4 of 9)

Step 1: Map the conceptual model to logical model components

Step  Activity
1     Map strong entities
2     Map supertype/subtype relationships
3     Map weak entities
4     Map binary relationships
5     Map higher-degree relationships

-- Instructor entity, identified by InstructorID
CREATE TABLE [Instructor] (
  [InstructorID] Char(5),
  [InstructorLastName] VarChar(255),
  [InstructorFirstName] VarChar(255),
  [InstructorAddress] VarChar(255),
  PRIMARY KEY ([InstructorID])
);

-- Professor entity, identified by ProfessorID
CREATE TABLE [Professor] (
  [ProfessorID] Char(5),
  [ProfessorLastName] VarChar(255),
  [ProfessorFirstName] VarChar(255),
  [ProfessorAddress] VarChar(255),
  PRIMARY KEY ([ProfessorID])
);

-- Seat entity, identified by ClassroomNumber
CREATE TABLE [Seat] (
  [ClassroomNumber] Char(10),
  [NumberOfSeat] Decimal(4,0),
  PRIMARY KEY ([ClassroomNumber])
);
Logical Design (5 of 9)

Step 1: Map the conceptual model to logical model components

• For each weak entity (activity 3, “Map weak entities”), create a table that includes all of
its simple attributes. Include a foreign key that points to the primary key of the owner
entity; the foreign key and the partial key together form the primary key of the weak entity.
• A partial key uniquely identifies a weak entity for a given owner entity.

[Figure: ER diagram with the relationships “fills”, “has”, and “teaches”]
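The weak-entity rule can be sketched with a classic textbook pair that is not part of the slides' example: a hypothetical `employee` owner and a `dependent` weak entity. The sketch uses SQLite through Python's `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE dependent (
    emp_id     INTEGER NOT NULL REFERENCES employee(emp_id),  -- FK to the owner entity
    dep_name   TEXT    NOT NULL,                              -- partial key
    birth_year INTEGER,                                       -- simple attribute
    PRIMARY KEY (emp_id, dep_name)                            -- FK + partial key
);
INSERT INTO employee  VALUES (1, 'Ann'), (2, 'Bob');
-- The same partial key value is fine under different owners.
INSERT INTO dependent VALUES (1, 'Sam', 2015), (2, 'Sam', 2018);
""")

try:
    # Same owner and same partial key: violates the composite primary key.
    conn.execute("INSERT INTO dependent VALUES (1, 'Sam', 2020)")
    duplicate = "accepted"
except sqlite3.IntegrityError:
    duplicate = "rejected"

count = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]
print(duplicate, count)
```

The partial key `dep_name` alone does not identify a row; it does so only in combination with the owner's key.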
Logical Design (6 of 9)

Step 1: Map the conceptual model to logical model components

• Our example doesn’t have any supertypes or subtypes, so we skipped that activity (2). We
have also mapped almost all entities except “Course”, which has binary relationships
(activity 4).

[Figure: ER diagram with the relationships “fills”, “has”, and “teaches”]
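A hedged sketch of how the remaining “Course” entity might be mapped. It assumes “teaches” is a 1:M relationship from Instructor to Course, which the slides do not state, and every Course column is hypothetical. SQLite stands in for the target DBMS, so the T-SQL bracket syntax is dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Instructor (
    InstructorID       CHAR(5) PRIMARY KEY,
    InstructorLastName VARCHAR(255)
);
-- Hypothetical Course table: the binary relationship "teaches" becomes a
-- foreign key on the "many" side (assumed: one instructor, many courses).
CREATE TABLE Course (
    CourseID     CHAR(8) PRIMARY KEY,
    CourseTitle  VARCHAR(255),
    InstructorID CHAR(5) REFERENCES Instructor(InstructorID)
);
INSERT INTO Instructor VALUES ('I0001', 'Rivera');
INSERT INTO Course VALUES ('DB101', 'Databases', 'I0001'),
                          ('DB201', 'Data Models', 'I0001');
""")

taught = [row[0] for row in conn.execute(
    "SELECT CourseID FROM Course WHERE InstructorID = 'I0001' ORDER BY CourseID")]
print(taught)
```

If the business rules instead allowed several instructors per course, the relationship would be M:N and would need its own table rather than a single foreign key column.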
Logical Design (7 of 9)

Step 2: Validate the logical model using normalization

• Goal: final data control step. Make sure redundant data is removed and all tables are
normalized; during mapping you may add or remove some attributes.

Activities during this stage:

• All tables should be in at least 3NF
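One thing 3NF rules out is a transitive dependency, where a non-key attribute depends on another non-key attribute. The sketch below checks a functional dependency against sample rows and then splits the table. The `holds` helper and the employee/department data are hypothetical, and sample rows can only refute a dependency, never prove it; the real dependencies come from the business rules.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

# Not in 3NF: emp_id -> dept_id and dept_id -> dept_name (transitive).
rows = [
    {"emp_id": 1, "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 2, "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 3, "dept_id": 20, "dept_name": "IT"},
]
transitive = holds(rows, ["dept_id"], ["dept_name"])

# Decomposition: move dept_name into its own table keyed by dept_id.
employee = [{"emp_id": r["emp_id"], "dept_id": r["dept_id"]} for r in rows]
department = sorted({(r["dept_id"], r["dept_name"]) for r in rows})
print(transitive, department)
```

After the split, “Sales” is stored once instead of once per employee, which is the redundancy this validation step is meant to catch.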
Logical Design (8 of 9)

Step 3: Validate the logical model integrity constraints

• Goal: make sure all defined constraints are in place and working correctly

Activities during this stage:

• Check all limits and boundaries (upper and lower)
• Check that NULL handling works as it is supposed to
• Check that all codes are valid
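Boundary and NULL checks like these can be exercised with a few test inserts. The sketch below loosely follows the Seat table from the earlier slide, but the 1 to 500 range and the NOT NULL rule are assumptions made for the example, and SQLite stands in for the chosen DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE seat (
    classroom_number TEXT PRIMARY KEY,
    -- Assumed rule: between 1 and 500 seats, and the value is mandatory.
    number_of_seats  INTEGER NOT NULL CHECK (number_of_seats BETWEEN 1 AND 500)
)""")

def accepted(classroom, seats):
    """Try one insert and report whether the constraints let it through."""
    try:
        conn.execute("INSERT INTO seat VALUES (?, ?)", (classroom, seats))
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "lower bound": accepted("A1", 1),
    "upper bound": accepted("A2", 500),
    "below range": accepted("A3", 0),
    "above range": accepted("A4", 501),
    "null": accepted("A5", None),
}
for case, ok in results.items():
    print(case, ok)
```

Testing the values exactly at each limit, and one step outside it, is what confirms the constraint is both present and defined with the right boundaries.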
Logical Design (9 of 9)

Step 4: Validate the logical model against user requirements

• Goal: before moving to the physical design stage, make sure what you prepared meets the
requirements and stakeholders’ needs

Activities during this stage:

• Validate all end-user data and transactions
• Validate security requirements

This is the last step of logical design, and we are still independent of hardware.
Physical Design

• Volume of data to be used and data usage patterns are very important
• Step 1: Define data storage organization: centralized vs. distributed, on-premises vs. cloud,
indexes, and views
• Step 2: Define integrity and security measures: users, groups, security settings, and
assignments
• Step 3: Determine performance measures: performance metrics, read/write speed, caching,
and fine-tuning
Database Design Strategies
• Top-down design starts by identifying the data sets and then defines the data elements
for each of those sets
• Involves the identification of different entity types and the definition of each
entity’s attributes
• Bottom-up design first identifies the data elements (items) and then groups them
together in data sets
• First defines attributes, and then groups them to form entities
Centralized versus Decentralized Design
• Centralized design: process by which all
database design decisions are carried out
centrally by a small group of people
• Suitable in a top-down design approach
when the problem domain is relatively small,
as in a single unit or department in an
organization
Questions ?
