
Database Concept

Data and Information


 Data - The values you store in the database are data.

 Information - Data that you process in a manner that makes it
meaningful and useful to you when you work with it or view it.
What is a Database?

Organized collection of data.


Ex. Organizer, filing cabinets, telephone directory
The Purpose of a Database is…
…to keep track of things:
 Orders

 Customers

 Jobs

 Skills

 Employees
Generating Information for Decision Making
Types of Database Systems
 Number of Users
    Single-user
    Multiuser
 Location
    Centralized
    Distributed
 Scope
    Desktop
    Workgroup
    Enterprise
 Use
    Transactional (Production)
    Analytical
Types of Databases

 operational databases - used in the day-to-day life of an
organization, institution, or business.
 analytical databases - used to store and track historical and
time-dependent data.
PC Databases
E.g.:
Access
FoxPro
Dbase
Etc.
Centralized Databases
[Figure: a single central computer holds the database]
Client Server Databases
[Figure: client computers connect over a network to a database server]
Distributed Databases
[Figure: homogeneous databases on computers at Locations A, B, and C]
Distributed Databases
[Figure: heterogeneous or federated databases; local clients reach a database server over the local network, and remote client computers connect through a communications server]
Evolution of Database
Technologies
Hierarchical Model
 The simplest database model arranges record types
as a hierarchy
 In a hierarchical database, a record type is
referred to as a node or “segment.”
 The top node of the hierarchy is referred to as the
root node
 A parent node can have more than one child node
 A child node can have only one parent node
Network Model
 The network database model allows many-to-many
relationships in addition to one-to-many
relationships
 Related record types are referred to as a network
set, or simply a “set.”
 A set contains an owner and members
 An owner is similar to a parent record in a
hierarchical database
 A member is roughly equivalent to the child records
in a hierarchical database
Relational Database Model
 It produces databases that are easier to create, manipulate,
and maintain than hierarchical or network databases
 A relational database stores data in a
collection of related tables
 Each table (also called a “relation”) is a
collection of records.
 All of the records in a table are of the
same record type
Relational Database
Object Oriented Database
 An object-oriented database stores data as
objects, which can be grouped into object classes
and are defined by attributes and methods
 A class specifies the attributes and methods that are
shared by all objects in a class
 A class attribute is equivalent to a field, and contains
the smallest unit of data.
 A method is any behavior that the object is capable
of performing
Types of Database Models

 Hierarchical database model (legacy system)
 Network database model (legacy system)
 Relational database model (focus of this lesson and discussion)
 Object-oriented database model
Vendors of Relational
Databases

 Oracle
 IBM
 Microsoft
 Informix
 Sybase
Table
 is the chief structure in a relational database. It is
composed of fields and records, the order of
which is completely unimportant.
Field
 Field - (known as an attribute in relational
database theory) is the smallest structure in a
relational database.
Record
 (known as a tuple in relational database
theory) is a structure within a table that
represents a unique instance of the
subject of the table.
View
 A virtual table that is
composed of the fields of
one or more data or
validation tables. A View
is considered "virtual"
because it doesn't store
any data on its own;
instead it draws its data
from the tables on which
it is based.
Keys
 Keys are special fields that serve specific
purposes within a table; the type of
key determines its use within the table.
 A primary key is a field that uniquely identifies
a record within a table.
 A foreign key is a field that is used to
establish a relationship between a pair of
tables.
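A minimal sketch of both kinds of key, assuming SQLite through Python's standard sqlite3 module. The department and employee tables and their sample rows are hypothetical and only illustrate the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE department (
    department_id INTEGER PRIMARY KEY,   -- primary key: uniquely identifies a record
    name          TEXT NOT NULL
);
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    department_id INTEGER NOT NULL
                  REFERENCES department(department_id)  -- foreign key: links the pair of tables
);
INSERT INTO department VALUES (10, 'Sales');
INSERT INTO employee VALUES (1, 'Ana', 10);
""")

def try_insert(sql):
    """Run one INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# A second employee with key 1 would break uniqueness.
duplicate_key_ok = try_insert("INSERT INTO employee VALUES (1, 'Ben', 10)")
# Department 99 does not exist, so this row would point at nothing.
dangling_fk_ok = try_insert("INSERT INTO employee VALUES (2, 'Cy', 99)")
```

Both inserts are expected to be refused: the first repeats a primary key value, the second references a department that is not there.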
Index
 A structure within an RDBMS that is used
to improve data processing
 Keys are logical structures used to identify
records within a table, and indexes are
physical structures used to optimize data
processing.
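The effect of an index can be sketched as follows, assuming SQLite through Python's sqlite3 module. The customer table, the index name idx_customer_name, and the generated rows are hypothetical; the exact wording of the query plan differs between SQLite versions, so the sketch only looks for the index name in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO customer (name) VALUES (?)",
    [("customer-%d" % i,) for i in range(1000)],
)

def plan(sql):
    """Return the query plan text; the readable detail is the last column of each row."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT customer_id FROM customer WHERE name = 'customer-500'"

plan_before = plan(query)  # without an index the whole table is scanned
conn.execute("CREATE INDEX idx_customer_name ON customer(name)")
plan_after = plan(query)   # with the index the row is looked up directly

rows = conn.execute(query).fetchall()
```

The query and its result are the same before and after; only the physical access path changes, which is the distinction between logical keys and physical indexes drawn above.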
Relationships
 A connection
established between
a pair of tables
Types of Relationships

 one-to-one - exists
between a pair of tables if
a single record in the first
table is related to only
one record in the second
table, and a single record
in the second table is
related to only one record
in the first table
Types of Relationships
 one-to-many - exists
between a pair of tables if
a single record in the first
table can be related to
one or more records in
the second table, but a
single record in the
second table can be
related to only one record
in the first table
Types of Relationships

 many-to-many - exists
between a pair of tables if
a single record in the first
table can be related to
one or more records in
the second table, and a
single record in the
second table can be
related to one or more
records in the first table.
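A many-to-many relationship is usually implemented with a linking table that turns it into two one-to-many relationships. The sketch below assumes SQLite through Python's sqlite3 module; the student, class, and enrollment tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE class   (class_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Linking table: each row pairs one student with one class.
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    class_id   INTEGER NOT NULL REFERENCES class(class_id),
    PRIMARY KEY (student_id, class_id)
);
INSERT INTO student VALUES (1, 'Sue'), (2, 'Jim');
INSERT INTO class   VALUES (10, 'Databases'), (20, 'Networks');
INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# One student related to several classes ...
classes_for_sue = [r[0] for r in conn.execute("""
    SELECT c.title FROM class c
    JOIN enrollment e ON e.class_id = c.class_id
    WHERE e.student_id = 1 ORDER BY c.title""")]

# ... and one class related to several students.
students_in_databases = [r[0] for r in conn.execute("""
    SELECT s.name FROM student s
    JOIN enrollment e ON e.student_id = s.student_id
    WHERE e.class_id = 10 ORDER BY s.name""")]
```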
Types of Participation

 Mandatory – if
relationship must
exist
 Optional – if
relationship is
optional
Degree of Participation
 The minimum and maximum number of
records in one table that can be related to a
single record in the other table.
Data Integrity
 Refers to the validity, consistency, and
accuracy of the data in a database.
Types of Integrity
 Table-level integrity ensures that the field that identifies each record
within the table, is unique and is never missing its value.
 Field-level integrity ensures that the structure of every field is sound,
that the values in each field are valid, consistent, and accurate, and
that fields of the same type (such as City fields) are consistently
defined throughout the database.
 Relationship-level integrity (traditionally known as referential
integrity) ensures that the relationship between a pair of tables is
sound and that there is synchronization between the two tables
whenever data is entered, updated, or deleted.
 Business rules impose restrictions or limitations on certain aspects
of a database based on the ways an organization perceives and
uses its data.
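Relationship-level integrity can be sketched as follows, assuming SQLite through Python's sqlite3 module. The customer and orders tables are hypothetical, and the ON UPDATE CASCADE and ON DELETE RESTRICT actions are one possible policy rather than the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    city        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
                REFERENCES customer(customer_id)
                ON UPDATE CASCADE      -- key changes propagate to child rows
                ON DELETE RESTRICT     -- a customer with orders cannot be deleted
);
INSERT INTO customer VALUES (1, 'Cebu');
INSERT INTO orders VALUES (100, 1);
""")

# Updating the parent key keeps the child row synchronized.
conn.execute("UPDATE customer SET customer_id = 7 WHERE customer_id = 1")
child_key = conn.execute(
    "SELECT customer_id FROM orders WHERE order_id = 100").fetchone()[0]

# Deleting a parent that still has children is rejected.
try:
    conn.execute("DELETE FROM customer WHERE customer_id = 7")
    delete_allowed = True
except sqlite3.IntegrityError:
    delete_allowed = False
```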
Schema
 The term schema or database schema
simply means the structure or design of
the database—that is, the form of the
database without any data in it. If you like,
the schema is a blueprint for the data in
the database.
employee(employeeID, name, job, departmentID)
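As a rough illustration, the employee schema above could be declared like this, assuming SQLite through Python's sqlite3 module. The column types are assumptions, since the slide lists only the column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        employeeID   INTEGER PRIMARY KEY,
        name         TEXT,
        job          TEXT,
        departmentID INTEGER
    )""")

# The schema exists even though the table holds no data yet.
columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```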
Database
Development and
Design
The Information System
 Provides for data collection, storage, and retrieval.
 Consists of
 People
 Hardware
 Software
 Database(s)
 Application programs
 Procedures
 Also facilitates the transformation of data into information
and the management of both data and information.
The Systems Development Life
Cycle (SDLC)
Five phases of SDLC:
 Planning
 Analysis
 Detailed Systems Design
 Implementation
 Maintenance
Database Development
 process of database design and implementation
 takes place within the confines of an IS

Database Design
 Complete, normalized, nonredundant, and fully integrated
conceptual, logical, and physical data models

Database Implementation
 Creating the database storage structure, loading the database,
and providing for data management
Database Development
Lifecycle
The Database Initial Study
 Analyze the company
situation.
 Define problems and
constraints.
 Define objectives.
 Define scope and
boundaries.
The Database Initial Study
 Analyze the Company Situation
 organization’s general operating environment, its mission
within that environment, the organization’s structure
 Define Problems and Constraints
 How does the existing system function?
 What input does the system require?
 What documents does the system generate?
 How is the system output used? By Whom?
 What are the operational relationships among business
units?
 What are the limits and constraints imposed on the system?
The Database Initial Study
 Define the Objective
 What is the proposed system’s initial objective?
 Will the system interface with other existing or future systems
in the company?
 Will the system share the data with other systems or users?
 Define Scope and Boundaries
 Scope -- What is the extent of the design based on
operational requirements?
 Boundaries -- What are the limits?
 Budget
 Hardware and software
Database Design
 Designing a database is a process that
involves developing and refining a
database structure based on the
requirements of your business.
 Database design includes the following
three stages:
1. Conceptual Database Design
2. Logical Database Design
3. Physical Database Design
Database Design: Why is it
Important?
 To ensure that the final product meets
user and system requirements
 To facilitate data management
 To minimize uncontrolled data
redundancies
 To minimize errors that lead to bad
decisions
Procedure Flow for Database
Design
Conceptual Design
Conceptual Design
 The first step in the database design cycle is to define
the data requirements for your business. Answering
these types of questions helps you define the conceptual
design:
 What types of information does my business currently use?
 What types of information does my business need?
 What kind of information do I want from this system?
 What are the assumptions on which my business runs?
 What are the restrictions of my business?
 What kind of reports do I need to generate?
 What will I do with this information?
Conceptual Design
 What kind of security does this system require?
 What kinds of information are likely to expand?

Identifying the goals of your business and gathering information
from the different sources who will use the database is an
essential process.
Business Rules
The designer must identify the
company’s business rules and analyze
their impacts:
 narrative descriptions of a policy,
procedure, or principle within a
specific environment
 create and enforce actions within a
business environment
Conceptual Design
 Data modeling is used to create an abstract
database structure that represents real-world
objects.
 The design must be software- and hardware-
independent.
 Minimal data rule:
All that is needed is there, and all that is there is
needed.
Developing a Conceptual Model
 Overall view of the database that integrates all the
needed information discovered during the
requirements analysis
 Elements of the Conceptual Model are represented
by diagrams, Entity-Relationship or ER Diagrams,
that show the meanings and relationships of those
elements independent of any particular database
systems or implementation details
 Can also be represented using other modeling tools
(such as UML)
Good Habits
1. Employ the user’s language and vocabulary.
2. Be rigorous.
3. Don’t rely on the opinion of a single expert.
4. Ask first about data, not processing.
5. Master the shapes of data.
6. Choose a notation that helps you realize habits
1 - 5.
ER Modeling
What is Modeling?
 An abstract representation of a system,
constructed to understand the system
prior to building or modifying it.
 A simplified representation of reality.
 Capture of a subset of system
characteristics relevant to a level of
understanding or detail.
Why Modeling?
 Easier to express complex ideas,
contributes to simplicity and conceptual
understanding.
 Easier to detect errors and omissions.
 Enhances and reinforces learning and
training.
 Easier and less costly to analyze and
manipulate model than the real system.
 Can improve the maintainability of a
system.
Key Ideas Regarding Modeling
 A model is rarely
correct on the first try.
 Always seek the
advice and criticism of
others.
 Let simplicity and
elegance guide you
through the process.
The External Models
for Tiny College
Conceptual Level
 The community view of the database
 Describes what data is stored in the
database and the relationships among the
data
 Represents:
 All entities, their attributes, and their
relationships
 The constraints on the data
 Semantic information about the data
 Security and integrity information
A Conceptual Schema for Tiny College
The Entity-Relationship (ER)
Model
 A high-level conceptual data model developed
by Chen (1976) to facilitate database design
 Commonly used to:
 Support a user’s perception of the data
 Conceal the more technical aspects associated
with database design
 Translate different views of data among managers,
users, and programmers to fit into a common
framework.
 Define data processing and constraint requirements
to help us meet the different views.
 Help implement the database.
The Concepts of the E-R Model
 Entity
 Entity type
 Entity instance

 Attributes
 Relationship types
 Attributes on relationships
Tiny College Entities
Weak versus Strong Entities
Weak Entity Type
 Cannot exist in the database unless another type of entity
also exists in the database
 Identifying owner
 Identifying relationship

Strong Entity Type
 Is not existence-dependent on some other entity type
A Weak Entity in an ERD
Attributes
 Properties that describe the entity’s
characteristics
 Represented by ovals and are connected
to the entity with a line.
 Have a domain -- the attribute’s set of
possible values.
 Attributes may share a domain.
 Primary keys are underlined.
Keys
 Candidate key
An attribute or set of attributes that uniquely
identifies individual occurrences of an entity
type.
 Primary key
The candidate key selected to be the primary
key.
 Composite key
A candidate key that consists of two or more
attributes.
The Attributes of the STUDENT
Entity
Basic E-R Model Entity Presentation
Problems with Multivalued
Attributes
The relational DBMS cannot implement
multivalued attributes.
Solutions:
 Within the original entity, create several
new attributes, one for each of the original
multivalued attribute’s components
 Create a new entity composed of the
original multivalued attribute’s components
Splitting the Multivalued Attributes into New
Attributes
A New Entity Set Composed of Multivalued
Attribute’s Components
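The two solutions listed above can be compared in a short sketch, assuming SQLite through Python's sqlite3 module. The car example, its table names, and its colour values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Solution 1: one new attribute per component of the multivalued attribute.
CREATE TABLE car_v1 (
    car_id     INTEGER PRIMARY KEY,
    top_color  TEXT,
    body_color TEXT,
    trim_color TEXT
);
INSERT INTO car_v1 VALUES (1, 'white', 'blue', 'gray');

-- Solution 2: a new entity holding one row per component.
CREATE TABLE car_v2 (car_id INTEGER PRIMARY KEY);
CREATE TABLE car_color (
    car_id  INTEGER NOT NULL REFERENCES car_v2(car_id),
    section TEXT    NOT NULL,
    color   TEXT    NOT NULL,
    PRIMARY KEY (car_id, section)
);
INSERT INTO car_v2 VALUES (1);
INSERT INTO car_color VALUES (1, 'top', 'white'), (1, 'body', 'blue'), (1, 'trim', 'gray');
-- A fourth component needs only a new row, not a schema change.
INSERT INTO car_color VALUES (1, 'interior', 'black');
""")

v1_columns = [row[1] for row in conn.execute("PRAGMA table_info(car_v1)")]
v2_colors = conn.execute(
    "SELECT COUNT(*) FROM car_color WHERE car_id = 1").fetchone()[0]
```

The first design fixes the number of components in the table structure; the second lets the number of components vary per car at the cost of a join.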
Derived Attributes
A derived attribute is not physically stored within
the database; instead, it is derived by using an
algorithm.
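A derived attribute can be computed at query time instead of being stored. The sketch below assumes SQLite through Python's sqlite3 module; the order_line table and the line_total calculation are hypothetical examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_line (
    order_id   INTEGER NOT NULL,
    item_id    INTEGER NOT NULL,
    quantity   INTEGER NOT NULL,
    unit_price REAL    NOT NULL,
    PRIMARY KEY (order_id, item_id)
);
INSERT INTO order_line VALUES (1, 100, 3, 2.50), (1, 101, 1, 10.00);
""")

# line_total is derived: it is computed from stored attributes when queried.
line_totals = conn.execute("""
    SELECT item_id, quantity * unit_price AS line_total
    FROM order_line WHERE order_id = 1 ORDER BY item_id""").fetchall()

order_total = conn.execute(
    "SELECT SUM(quantity * unit_price) FROM order_line WHERE order_id = 1"
).fetchone()[0]
```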
Relationships
an association between entities
Degree of the Relationship

 Number of participating entities in a relationship
 Degree 1 - recursive/unary relationship
 Degree 2 - binary relationship
 Degree 3 - ternary relationship
 Degree > 3 - n-ary relationship
Recursive Relationships
Binary and Ternary
Relationships
Cardinality Constraints
 In a binary relationship, the maximum or
minimum number of elements allowed on
each side of the relationship
 One-to-one(1:1)
 One-to-many (1:M)
 Many-to-many (M:N)
Cardinalities
Cardinality

Figure 4.16 Cardinality in an ERD


Participation Constraints
 Optional
 Minimum cardinality is zero
 Mandatory
 Minimum cardinality is one
 Mandatory-one
 Minimum and maximum cardinality are both
one
The Entity Relationship (E-R)
Model
Figure 4.20 CLASS is Optional to COURSE

Figure 4.21 COURSE and CLASS in a Mandatory Relationship


Associative Entities
 Also called gerunds or composite entities
 Used when all relationships involved are
“many” as a bridge between the related
entities.
 The composite entity may contain
additional attributes.
 Result has independent meaning
 Gerund has one or more non-key
attributes
The M:N Relationship Between
STUDENT and CLASS
Supertypes and Subtypes
Subtype
 A subgrouping of the entities in an entity type which has
attributes that are distinct from those in other subgroupings.

Supertype
 An entity type whose subtypes share common attributes.
 Attributes that are shared by all entities (including the
identifier) are associated with the supertype. (Fig. 4-1, 2)
Basic notation for supertype/subtype relationships
Employee supertype with three subtypes
Attribute Inheritance
 Subtype entities inherit values of all
attributes of the supertype.
 An occurrence of a subtype is also an
occurrence of the supertype.
Use of Supertype/Subtype
 There are attributes that apply to some
(but not all) of the instances of an entity
type.
 The instances of a subtype participate in a
relationship unique to that subtype.
Supertype/subtype relationships in a hospital
Generalization and
Specialization
 Generalization
 The process of defining a more general entity
type from a set of more specialized entity
types.
 Specialization
 The process of defining one or more subtypes
of the supertype, and forming
supertype/subtype relationships.
Example of generalization
(a) Three entity types: CAR, TRUCK, and MOTORCYCLE
(b) Generalization to VEHICLE supertype
Example of specialization
(a) Entity type PART
(b) Specialization to MANUFACTURED PART
and PURCHASED PART
ER-Diagram Exercises
Video Store Database:
Customers of the video store are assigned a unique customer number when
they make their first rental. In addition to the customer number, other
information such as name and address is also collected. Each video cassette
that the store owns is identified by a unique code. Thus, if the store owns
several copies of the same video, each copy has a unique identification code.
Other information about a video includes the date of purchase and the number
of times the video was rented. When a customer selects a video to rent, the
store needs to record this transaction, including the date and time the video
was rented. It is not unusual for a customer to rent several videos when they
visit the store. The store assigns a unique identifier to each movie title. For
example, the James Bond movie "Goldfinger" is assigned the identifier ADV234.
The store may have several cassettes for this movie title. Other information on
movies includes a title and the year made. Each movie title is associated with a
list of actors and one or more directors. The store has a unique internal code
they use to identify each actor. In addition, the store has a different set of
internal codes it uses to identify each director. In addition to the actor and
director identification codes, other biographic information on actors and
directors is stored. Using this information, the store can easily find out all the
movies a specific actor has appeared in.
ER-Diagram Exercises
Children’s Book Database:
Scope of the database: It will store information about children's books (infant,
toddler, young child, young adult) and the reviews that are written about them.
The user has been using an alphabetical three-ring binder system to keep track
of children's books and reviews. These reviews are gathered for two purposes:
1) to be used to help make decisions in a book award committee and 2) to
assist the user when writing about children's books for publication. The user
would like to be able to input the data into a database system for better retrieval
and easier searching capabilities. There will only be two primary users of this
database - the person who will be inputting the data and the user who will be
searching and manipulating the data to suit his/her particular needs. The user
would like to keep track of books, including title, author, illustrator, publisher,
date published, and type of illustration (e.g., watercolor, gouache, collage). Each
book can have more than one author as well as one or more illustrators. Only
one publisher can publish a book, but that publisher can publish many books
overall. Many reviews can be written about a particular book, but a certain
review will concern only one book. Awards will have been given to certain
books. Only a small percentage of the books will have received awards, fewer
will have more than one. The user would like to be able to retrieve records in a
number of ways including by title, author, and award winners. Additionally, the
reviews are extremely important. The user would like to know how many times
the book has been reviewed and by which journals.
ER-Diagram Exercises

Products, Supplier and Parts Database


A company purchases parts from suppliers and assembles these parts
into a variety of products. Information stored about products includes a
unique product number, a product description, and a price. Information
stored about parts includes a unique part number and a part
description. Each product is assembled from many different parts.
Some parts are used more than once in the same product. As a result,
there is a need to keep track of the quantity of each part found in each
product. Parts are supplied by suppliers. Information stored about
suppliers includes a unique supplier number, supplier name, address,
and phone number. The company carefully evaluates prospective
suppliers for a part. When it finds a supplier for a part, it purchases that
part exclusively from the selected supplier. Suppliers often supply
several different parts to the company.
Conceptual Design Exercises
 Student Grade Database
 Consider the following set of requirements for a university database that is used to
keep track of students' transcripts.
 The university keeps track of each student's name, student number, social security
number, current address and phone number, permanent address and phone number,
birthdate, sex, class (freshman, sophomore, ..., graduate), major department, minor
department (if any), and degree program (BA, BS,...PhD). Some user applications
need to refer to the city, state, and zip of the student's permanent address and to the
student's last name. Both social security number and student number have unique
values for each student.
 Each department is described by a name, department code, office
number, office phone number, and college. Both name and code have unique values
for each department.
 Each course has a course name, description, course number, number of semester
hours, level, and offering department. The value of the course number is unique for
each course.
 Each section has an instructor, semester, year, course, and section number. The
section number distinguishes different sections of the same course that are taught
during the same semester/year; its values are 1,2,3..., up to the number of sections
taught during each semester.
 A grade report has a student, section, letter grade, and numeric grade (0, 1,2, 3, or
4).
Developing the Conceptual
Model Using E-R Diagrams
1.Identify, analyze, and refine the business
rules.
2.Identify the main entities based on 1.
3.Define the relationships between the
entities based on 1. and 2.
4.Define the attributes, primary keys, and
foreign keys for each of the entities.
Developing the Conceptual
Model Using E-R Diagrams
5.Normalize the entities.
6.Complete the initial E-R diagram.
7.Have the main end users verify the model
in 6. against the data, information, and
processing requirements.
8.Modify the E-R diagram based on 7.
Normalization
Database Normalization
 Eliminates redundancy and inconsistent
dependencies in table designs
 Avoids scalability issues
 Can improve query speed substantially
 Decreases chances of compromising
database integrity
Anomalies
 Anomalies are problems that arise in the data due to a flaw in
the database design.
 Insertion Anomalies
 Insertion anomalies occur when we try to insert data into a flawed table.
 Deletion Anomalies
 Deletion anomalies occur when we delete data from a flawed schema.
 Update Anomalies
 Update anomalies occur when we change data in a flawed schema.
 Null Values
 A final rule for good database design is that we should avoid schema
designs that have large numbers of empty attributes.
Deletion Anomalies
Student#  StName     AdvName  AdvRoom  Class#
1022      Sue Brown  Jones    412      101-07
1022      Sue Brown  Jones    412      143-01
1022      Sue Brown  Jones    412      159-02
4123      Jim White  Smith    216      210-01
4123      Jim White  Smith    216      211-02

 If we delete student Jim White, we lose the
only records for advisor Smith
Insertion Anomalies
Student#  StName     AdvName  AdvRoom  Class#
1022      Sue Brown  Jones    412      101-07
1022      Sue Brown  Jones    412      143-01
1022      Sue Brown  Jones    412      159-02
4123      Jim White  Smith    216      210-01
4123      Jim White  Smith    216      211-02

 If we want to add a new student we have
to know what class they are in (it’s part of
the key)
Change Anomalies
Student#  StName     AdvName  AdvRoom  Class#
1022      Sue Brown  Jones    412      101-07
1022      Sue Brown  Jones    412      143-01
1022      Sue Brown  Jones    412      159-02
4123      Jim White  Smith    216      210-01
4123      Jim White  Smith    216      211-02

 If student Sue Brown switches to advisor
Wilson, we have to change three records
instead of just one
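The three anomalies can be reproduced on the sample table, assuming SQLite through Python's sqlite3 module. The table name enrollment_flat, the snake_case column names, and the new student 5001 are hypothetical; the five data rows are the ones shown in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE enrollment_flat (
        student_no TEXT NOT NULL,
        st_name    TEXT,
        adv_name   TEXT,
        adv_room   TEXT,
        class_no   TEXT NOT NULL,
        PRIMARY KEY (student_no, class_no)
    )""")
conn.executemany("INSERT INTO enrollment_flat VALUES (?, ?, ?, ?, ?)", [
    ("1022", "Sue Brown", "Jones", "412", "101-07"),
    ("1022", "Sue Brown", "Jones", "412", "143-01"),
    ("1022", "Sue Brown", "Jones", "412", "159-02"),
    ("4123", "Jim White", "Smith", "216", "210-01"),
    ("4123", "Jim White", "Smith", "216", "211-02"),
])

# Change anomaly: one real-world change touches three rows.
# (Wilson's room is not given in the slides, so it is left as NULL here.)
rows_changed = conn.execute(
    "UPDATE enrollment_flat SET adv_name = 'Wilson', adv_room = NULL "
    "WHERE student_no = '1022'").rowcount

# Deletion anomaly: removing Jim White also removes every trace of advisor Smith.
conn.execute("DELETE FROM enrollment_flat WHERE student_no = '4123'")
smith_rows = conn.execute(
    "SELECT COUNT(*) FROM enrollment_flat WHERE adv_name = 'Smith'").fetchone()[0]

# Insertion anomaly: a student without a class has no value for part of the key.
try:
    conn.execute(
        "INSERT INTO enrollment_flat VALUES ('5001', 'New Student', NULL, NULL, NULL)")
    insert_ok = True
except sqlite3.IntegrityError:
    insert_ok = False
```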
Normalization
 Normalization is a process we can use to
remove design flaws from a database.
 The normalization process consists of
breaking tables into smaller tables that
form a better design.
Steps in Normalization
First Normal Form (1NF)
 A relation is in first normal form if it does
not contain repeating groups, i.e., if
the intersection of each row and column
contains one and only one value
 No multi-valued attributes.
 Every attribute value is atomic.
Unnormalized Data
Contains repeating groups in the Author
column in the BOOKS table
1NF
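One way to bring such a table into 1NF is to give each author its own row in a separate table. The sketch below assumes SQLite through Python's sqlite3 module; the book titles, author names, and comma-separated storage format are hypothetical, since the slide's figure is not reproduced here.

```python
import sqlite3

# Hypothetical unnormalized rows: the author column holds a repeating group.
unnormalized_books = [
    ("B1", "Database Basics", "Lee, Park"),
    ("B2", "SQL by Example", "Santos"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book (book_id TEXT PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE book_author (
    book_id TEXT NOT NULL REFERENCES book(book_id),
    author  TEXT NOT NULL,
    PRIMARY KEY (book_id, author)
);
""")

for book_id, title, authors in unnormalized_books:
    conn.execute("INSERT INTO book VALUES (?, ?)", (book_id, title))
    # One row per author, so every column value is atomic.
    for author in authors.split(","):
        conn.execute("INSERT INTO book_author VALUES (?, ?)",
                     (book_id, author.strip()))

author_rows = conn.execute(
    "SELECT book_id, author FROM book_author ORDER BY book_id, author").fetchall()
```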
Second Normal Form (2NF)

 The relation is in 1NF and every non-key attribute is
fully functionally dependent on the
primary key.
 Every non-key attribute must be
defined by the entire key, not by only
part of the key.
 No partial functional dependencies.
Partial Functional Dependency
A dependency in which one or more
nonkey attributes are functionally
dependent on part, but not all, of the
primary key.
Composite Primary Key
 A composite Primary Key is when more
than one column is required to uniquely
identify a row
 Can lead to partial dependencies - a
column that is only dependent on a portion
of the primary key
Dependency Diagram
 The primary key components are bold,
underlined, and shaded in a different color.
 The arrows indicate dependencies of attributes
on other attributes
Transitive Dependency
 A functional dependency between two or more
nonkey attributes in a relation.
 If A, B, and C are attributes of a relation such
that
A → B and B → C
then C is transitively dependent on A via B,
provided that A is not functionally dependent on
B or C.
Third Normal Form
 1NF and 2NF and no transitive
dependencies (functional dependency
between non-key attributes.)
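As a sketch of removing partial and transitive dependencies, the student and advisor table from the anomaly slides can be split into three tables, assuming SQLite through Python's sqlite3 module. The table and column names are hypothetical, and the decomposition assumes that an advisor's name identifies the advisor and determines the room.

```python
import sqlite3

flat_rows = [  # (Student#, StName, AdvName, AdvRoom, Class#) from the anomaly slides
    ("1022", "Sue Brown", "Jones", "412", "101-07"),
    ("1022", "Sue Brown", "Jones", "412", "143-01"),
    ("1022", "Sue Brown", "Jones", "412", "159-02"),
    ("4123", "Jim White", "Smith", "216", "210-01"),
    ("4123", "Jim White", "Smith", "216", "211-02"),
]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- 3NF: adv_room depends on adv_name, so advisors get their own table.
CREATE TABLE advisor (adv_name TEXT PRIMARY KEY, adv_room TEXT NOT NULL);
-- 2NF: st_name and adv_name depend on student_no alone, not on the whole key.
CREATE TABLE student (
    student_no TEXT PRIMARY KEY,
    st_name    TEXT NOT NULL,
    adv_name   TEXT NOT NULL REFERENCES advisor(adv_name)
);
-- What remains of the original table: just the composite key.
CREATE TABLE enrollment (
    student_no TEXT NOT NULL REFERENCES student(student_no),
    class_no   TEXT NOT NULL,
    PRIMARY KEY (student_no, class_no)
);
""")

for student_no, st_name, adv_name, adv_room, class_no in flat_rows:
    conn.execute("INSERT OR IGNORE INTO advisor VALUES (?, ?)", (adv_name, adv_room))
    conn.execute("INSERT OR IGNORE INTO student VALUES (?, ?, ?)",
                 (student_no, st_name, adv_name))
    conn.execute("INSERT INTO enrollment VALUES (?, ?)", (student_no, class_no))

# Joining the three tables reproduces the original rows, so no information was lost.
rebuilt = conn.execute("""
    SELECT s.student_no, s.st_name, a.adv_name, a.adv_room, e.class_no
    FROM enrollment e
    JOIN student s ON s.student_no = e.student_no
    JOIN advisor a ON a.adv_name = s.adv_name
    ORDER BY s.student_no, e.class_no""").fetchall()

table_sizes = {t: conn.execute("SELECT COUNT(*) FROM " + t).fetchone()[0]
               for t in ("advisor", "student", "enrollment")}
```

After the split, each advisor's room and each student's name are stored once, so changing an advisor touches one row instead of three.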
Relation with Transitive
Dependency

(a) SALES relation with simple data


Relation with Transitive
Dependency
Removing a transitive
dependency
Exercises Normalization
 Normalize the following schema into third
normal form:
 Orders(customerID, customerName,
customerAddress, orderID, orderDate,
itemID, itemName, itemQuantity)
 Try to design a schema that is in 3NF
Normalization Exercise
The following data defines rental agreement data that might be
collected by a company that rents equipment to its customers. This
data records information about the customer and the items that the
customer has rented.
{Agreement# + Date + Time + Cust# + Name + Addr + {Item# +
Description + Rental-Code + Hrly-Rate + No-of-Hrs}}
Assume the following:
Cust#, Item#, and Rental-Code are all unique.
Rental-Code is used to define hourly rates. For example, all items
with code "A" are $3.50/hr, all items with code "B" are $6.00/hr, etc.
No-of-Hrs is the number of hours the item was rented.
An item has the same rental code no matter who rents it.
Logical Design
Logical Design
 Logical database design helps you further define
and assess your business’s information
requirements
 Logical database design involves describing
each piece of information you need to track and
the relationships among those pieces of
information.
 Once you create a logical design, you can verify
with the users of the database that the design is
complete and accurate.
Steps in Logical Design
Build Initial Entity-Relationship Model
a. Identify entities
b. Identify primary keys
c. Identify relationships between entities
d. Create linking tables
e. Identify foreign keys
>> Graphical ER picture of general schema
Steps in Logical Design
Identify Attributes
>> Listing of table and field descriptions
4. Normalize Tables
a. Remove repeating groups
b. Remove partial-key dependencies
c. Remove non-key dependencies
>> Corrected, normalized tables
Steps in Logical Design
Complete Logical Schema
a. Build code-description lookup tables
b. Consider selective denormalization
c. Consider derived fields
d. Revise conceptual schema
documentation
>> Relief
Critical Success Factors in
Database Design
 Work interactively with the users as much as
possible.
 Follow a structured methodology throughout the
data modelling process.
 Incorporate structural and integrity considerations
into the data models.
 Combine conceptualisation, normalisation, and
transaction validation techniques into the data
modelling methodology.
Physical Design
Physical Database Design
Objectives:
 To translate the logical description of
data into the technical specifications for
storing and retrieving data
 To create a design for storing data that will
provide adequate performance and
ensure database integrity, security,
and recoverability
Table Operations
 Retrieve data
    Read entire table
    Read next row/sequential
    Read arbitrary/random row
 Store data
    Insert a row
    Delete a row
    Modify a row
 Reorganize/pack database
    Remove deleted rows
    Recover unused space
Database Efficiency
Considerations
 Transaction throughput
# of transactions that can be processed in a given
time interval
 Response time
 The elapsed time for the completion of a single
transaction
 Disk storage
 Amount of disk space required to store the database
files
Understanding System
Resources
 Main memory
 CPU
 Disk I/O
 Network
Disk I/O Recommendations
Distribute storage evenly across available
drives to reduce the likelihood of performance
problems:
 OS files should be separated from DB files
 Main DB files should be separated from the index
files
 The recovery log file should be separated from the
rest of the database
Inputs to Physical Design
 Normalized relations
 Volume estimates
 Attribute definitions
 Descriptions of data usage
 Response time requirements
 Requirements for security, backup, recovery,
retention, integrity
 DBMS characteristics
Physical Design Decisions
 Specifying attribute storage format (data
types)
 Grouping attributes from the logical data
model into physical records
 Specifying the file organization
 Preparing strategies for queries to
optimize performance
Choosing Data Types
 Minimize storage space
 Represent all possible values
 Improve data integrity
 Support all data manipulations
Example Look-Up Table
Advantages of Look-Up Tables
 Reduction in the size of the child relation
 Changes can be made more easily in the
lookup table than in the child relation
 Can be used to validate user input
Controlling Data Integrity
 Default values
 Range control
 Null value control
 Referential integrity
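Default values, range control, and null value control can be declared directly on a table, as in this sketch that assumes SQLite through Python's sqlite3 module. The rental_item table, the 'available' default, and the 0 to 100 rate range are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE rental_item (
        item_id     INTEGER PRIMARY KEY,
        description TEXT NOT NULL,                         -- null value control
        status      TEXT NOT NULL DEFAULT 'available',     -- default value
        hourly_rate REAL NOT NULL
                    CHECK (hourly_rate BETWEEN 0 AND 100)  -- range control
    )""")

conn.execute(
    "INSERT INTO rental_item (item_id, description, hourly_rate) "
    "VALUES (1, 'Drill', 3.50)")
default_status = conn.execute(
    "SELECT status FROM rental_item WHERE item_id = 1").fetchone()[0]

def rejected(sql):
    """Return True when the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

out_of_range = rejected(
    "INSERT INTO rental_item (item_id, description, hourly_rate) "
    "VALUES (2, 'Saw', 500)")
missing_value = rejected(
    "INSERT INTO rental_item (item_id, hourly_rate) VALUES (3, 6.00)")
```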
Handling Missing Data
 Substitute an estimate of the missing
value.
 Trigger a report listing missing values.
 In programs, ignore missing data unless
the value is significant.
Design Security Mechanisms
 User views
CREATE VIEW staff3
AS SELECT sno, lname, address, tel_n
FROM staff
WHERE bno = 'B3';

 Access rules
GRANT ALL PRIVILEGES
ON staff
TO admin;
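The staff3 view can be tried out in SQLite through Python's sqlite3 module, as sketched below. SQLite has no GRANT statement, so only the view is shown. The salary column and the sample rows are hypothetical; the view definition follows the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (
    sno     TEXT PRIMARY KEY,
    lname   TEXT NOT NULL,
    address TEXT,
    tel_n   TEXT,
    salary  REAL,          -- hypothetical sensitive column hidden by the view
    bno     TEXT NOT NULL
);
INSERT INTO staff VALUES ('S1', 'Cruz',  '1 Main St', '555-0101', 30000, 'B3');
INSERT INTO staff VALUES ('S2', 'Reyes', '2 Oak Ave', '555-0102', 42000, 'B5');

CREATE VIEW staff3 AS
    SELECT sno, lname, address, tel_n
    FROM staff
    WHERE bno = 'B3';
""")

# Users of the view see only the listed columns and only branch B3.
view_columns = [col[0] for col in conn.execute("SELECT * FROM staff3").description]
before = conn.execute("SELECT sno FROM staff3 ORDER BY sno").fetchall()

# The view stores no rows of its own, so new base-table data shows up immediately.
conn.execute(
    "INSERT INTO staff VALUES ('S3', 'Lim', '3 Pine Rd', '555-0103', 28000, 'B3')")
after = conn.execute("SELECT sno FROM staff3 ORDER BY sno").fetchall()
```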
Creating Database
Structured Query
Language
SQL Basics: How does a query
language like SQL work?
 Query languages like SQL typically work
behind the scenes as an intermediary
between the database client software
provided to users, and the database itself
 The client software collects your input,
then converts it into an SQL query, which
can operate directly on the database to
carry out your instructions
What does a simple SQL query
look like?
 An SQL query is a sequence of words, much like a
sentence
 SELECT TrackTitle FROM Tracks WHERE
TrackTitle = 'Fly Away'
 The SQL query language provides a collection of
special command words called SQL keywords, such
as SELECT, FROM, INSERT, and WHERE, which
issue instructions to the database
 Most SQL queries can be divided into three simple
elements that specify an action, the name of a
database table, and a set of parameters
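One way to picture client software turning user input into a query is the sketch below, which uses Python's built-in sqlite3 module as the client. The find_tracks function, the TrackLength column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tracks (TrackTitle TEXT, TrackLength INTEGER)")
conn.executemany(
    "INSERT INTO Tracks VALUES (?, ?)",
    [("Fly Away", 221), ("Blue Sky", 187)],  # invented sample data
)

def find_tracks(title):
    """Turn the user's input into a SELECT query and run it."""
    # Action: SELECT. Table: Tracks. Parameter: the title to match.
    # The ? placeholder passes the value separately from the SQL text.
    query = "SELECT TrackTitle FROM Tracks WHERE TrackTitle = ?"
    return conn.execute(query, (title,)).fetchall()

matches = find_tracks("Fly Away")
print(matches)
```

Passing the title through a placeholder, rather than pasting it into the query string, keeps user input from being interpreted as SQL.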
How does SQL specify the action that I
want carried out in the database?
 An SQL query typically begins with an
action keyword, or command, which
specifies the operation that you want
carried out
 For example, the command word DELETE
removes a record from a table
Terminology
 Data Definition Language (DDL):
 Commands that define a database, including creating,
altering, and dropping tables and establishing
constraints.
 Data Manipulation Language (DML)
 Commands that maintain and query a database.
 Data Control Language (DCL)
 Commands that control a database, including
administering privileges and committing data.
Basic SQL Commands
 Creating tables with CREATE
 Adding data with INSERT
 Viewing data with SELECT
 Removing data with DELETE
 Modifying data with UPDATE
 Destroying tables with DROP
Data Definition
 Creating a Database CREATE SCHEMA
 Creating a Table CREATE TABLE
 Changing Table ALTER TABLE
 Removing a Table DROP TABLE
 Creating an Index CREATE INDEX
 Removing an Index DROP INDEX
 Creating a View CREATE VIEW
 Removing a View DROP VIEW
Creating Database
 CREATE DATABASE databasename;
 CREATE DATABASE contacts;
Creating tables with CREATE
 Generic form
CREATE TABLE tablename (
column_name data_type attributes…,
column_name data_type attributes…,
…
)
 Table and column names can’t have spaces or
be “reserved words” like TABLE, CREATE, etc.
Creating Tables
CREATE TABLE ORDER_LINE
(ORDER_ID CHAR(5) NOT NULL,
PRODUCT_ID CHAR(5) NOT NULL,
QUANTITY INT NOT NULL,
PRIMARY KEY (ORDER_ID,PRODUCT_ID),
FOREIGN KEY (ORDER_ID) REFERENCES
ORDER (ORDER_ID),
FOREIGN KEY (PRODUCT_ID) REFERENCES
PRODUCT (PRODUCT_ID));
Phone Book/Contact Record
Name                      Character
Address                   Character
Company                   Character
Phone Number              Character
URL/Web Page              Character
Age                       Integer
Height                    Real (float)
Birthday                  Date
When we added the entry   Timestamp
Phone Book/Contact Table
CREATE TABLE contacts (
Name VARCHAR(40),
Address VARCHAR(60),
Company VARCHAR(60),
Phone VARCHAR(12),
URL VARCHAR(80),
Age INT,
Height FLOAT,
Birthday DATE,
WhenEntered TIMESTAMP
);
Phone Book/Contact Table
CREATE TABLE contacts (
ContactID INT PRIMARY KEY,
Name VARCHAR(40),
Address VARCHAR(60),
Company VARCHAR(60),
Phone VARCHAR(12),
URL VARCHAR(80),
Age INT,
Height FLOAT,
Birthday DATE,
WhenEntered TIMESTAMP
);
Data Types
 Binary
 Database specific binary objects (BLOB)
 Boolean
 True/False values (BOOLEAN)
 Character
 Fixed width (CHAR) or variable size (VARCHAR)
 Numeric
 Integer (INT), Real (FLOAT), Money (MONEY)
 Temporal
 Time (TIME), Date (DATE), Timestamp (TIMESTAMP)
Creating Indexes
 An index speeds up searches and sorts on the
indexed column; unlike a unique index, it
still allows duplicate values:
CREATE INDEX index1 ON
contacts(contactid);
Creating Indexes
 CREATE INDEX BALIND ON CUSTOMER
(BALANCE);
 CREATE INDEX CUSTNAME ON
CUSTOMER(LAST,FIRST);
 CREATE INDEX CREDNAME ON
CUSTOMER(CREDIT_LIMIT DESC, LAST,
FIRST);
Unique Indexes
 The DBMS refuses to accept any update
that would cause a duplicate value in the
column:
CREATE UNIQUE INDEX SSN ON
CUSTOMER(SOC_SEC_NUMBER);
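The difference between a plain index and a unique index can be checked with a short sketch that uses Python's built-in sqlite3 module as a stand-in for the DBMS. The CUSTOMER table is cut down to three columns, and CUSTOMER_NUM and the sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUSTOMER_NUM INTEGER, BALANCE REAL, SOC_SEC_NUMBER TEXT);
    CREATE INDEX BALIND ON CUSTOMER (BALANCE);
    CREATE UNIQUE INDEX SSN ON CUSTOMER (SOC_SEC_NUMBER);
    INSERT INTO CUSTOMER VALUES (1, 100.0, '111-11-1111');
""")

# A plain index does not restrict values: a second customer with the
# same balance is accepted.
conn.execute("INSERT INTO CUSTOMER VALUES (2, 100.0, '222-22-2222')")

# The unique index makes the DBMS refuse a duplicate SOC_SEC_NUMBER.
try:
    conn.execute("INSERT INTO CUSTOMER VALUES (3, 50.0, '111-11-1111')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM CUSTOMER").fetchone()[0]
print(duplicate_rejected, row_count)
```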
Adding data with INSERT
 Generic Form
INSERT INTO tablename (column_name,…)
VALUES (value,…)
Inserting a record into ‘contacts’
INSERT INTO contacts
(contactid,name,address,company,phone,
url,age,height,birthday,whenentered)
VALUES
(1,'Joe','123 Any St.','ABC',
'800-555-1212','http://abc.com',30,1.9,
'1972-06-24',now());
Inserting a partial record
INSERT INTO contacts
(contactid,name,phone)
VALUES (2,'Jane','212-555-1212');
Modifying data with UPDATE
 Generic Form
UPDATE table SET column=expression
WHERE condition;

UPDATE contacts SET company='AOL'
WHERE company='Time Warner';
Updating Data
 All rows
UPDATE contacts
SET whenentered=now();
 Specific rows
UPDATE contacts
SET birthday='1976-06-03'
WHERE contactid = 1;
Updating Data (with Calculation)
 All rows
UPDATE STAFF
SET Salary=Salary*1.03;
 Specific rows
UPDATE STAFF
SET Salary=Salary*1.03
WHERE Position='Manager';
Updating Multiple Columns
 UPDATE STAFF
SET Address='Manhattan, New York',
company='AOL' WHERE age=30;
Removing data with DELETE
 Generic Form
DELETE FROM table WHERE condition;

DELETE FROM contacts WHERE age IS
NULL;
Destroying tables with DROP
 Generic Form
DROP TABLE tablename;

DROP TABLE contacts;
Changing a Column Value to
NULL
UPDATE contacts
SET address = NULL
WHERE contactid = 124;
Changing the Table Structure
ALTER TABLE CUSTOMER
ADD CUSTOMER_TYPE CHAR(1);

ALTER TABLE CUSTOMER
MODIFY STREET CHAR(20);
Renaming a Field
 ALTER TABLE <tablename> CHANGE
<oldfieldname> <newfieldname> <field
type>
 ALTER TABLE contacts CHANGE phone
phonenumber char(12);
Renaming a Table
 ALTER TABLE <oldtablename>
RENAME <newtablename>;
 ALTER TABLE contacts RENAME
addressbook;
Describing a Table
 To display all the columns in a table and
their corresponding data types:
DESCRIBE sales_rep;
DESCRIBE student;
Dropping a Primary Key
 ALTER TABLE tablename DROP PRIMARY
KEY;
 ALTER TABLE contacts DROP PRIMARY
KEY;
Data Manipulation
Simple Queries
1. Projections
SELECT <column(s)>
FROM <table>
2. Selections
SELECT <column(s)>
FROM <table>
WHERE <condition(s)>
Viewing data with SELECT
 Generic Form
SELECT column,… FROM table,…
WHERE condition
GROUP BY group_by_expression
HAVING condition
ORDER BY order_expression
 The most used command
 Probably also the most complicated
 If used improperly, it can cause very long waits
because of the complex combinations it must evaluate
Retrieve Table
Retrieve the entire contacts table.
SELECT *
FROM contacts;

SELECT contactid, birthday
FROM contacts;
Projections
 Retrieve all columns, all rows
SELECT *
FROM STUDENT;
 Use of DISTINCT
SELECT DISTINCT Major
FROM STUDENT;
A few simple SELECTs
 SELECT * FROM contacts;
 Display all records in the ‘contacts’ table
 SELECT contactid,name FROM contacts;
 Display only the record number and names
 SELECT DISTINCT url FROM contacts;
 Display only one entry for every value of URL.
Other SELECT examples
 SELECT * FROM contacts
WHERE name is NULL;
 SELECT * FROM contacts
WHERE zip IN ('14454','12345');
 SELECT * FROM contacts
WHERE zip NOT IN (
SELECT zip FROM address
WHERE state='NY'
);
Refining selections with WHERE
 The WHERE “subclause” allows you to
select records based on a condition.
 SELECT * FROM contacts
WHERE age<10;
 Display records from contacts where age<10
 SELECT * FROM contacts
WHERE age BETWEEN 18 AND 35;
 Display records where age is 18-35
Additional selections
 The “LIKE” condition
 Allows you to look at strings that are alike
 SELECT * FROM contacts
WHERE name LIKE 'J%';
 Display records where the name starts with ‘J’
 SELECT * FROM contacts
WHERE url NOT LIKE '%.com';
 Display where url does not end in “.com”
Row Selections (WHERE
Clause)
 Comparison
 Range
 Set membership
 Pattern match
 Null
Rules for Evaluating a
Conditional Expression
 Evaluation from left to right
 Subexpressions in brackets are evaluated
first
 NOTs are evaluated before ANDs and
ORs
 ANDs are evaluated before ORs
Simple Comparison Operators
 = equals
 < less than
 > greater than
 <= less than or equal to
 >= greater than or equal to
 <> not equal to (ISO standard)
 != not equal to (supported by some
dialects)
Comparison Search Condition
SELECT Sno, Fname, Lname, Position, Salary
FROM STAFF
WHERE Salary>10000;
String Comparison
 Which teams are based in Detroit?
SELECT TEAMNUM, TEAMNAME
FROM TEAMS
WHERE TEAMCITY='Detroit';
24 Tigers
Compound Comparison Search
Condition
SELECT Name, Age
FROM STUDENT
WHERE Major ='Math' AND Age>21;
Another Combination
of Conditions
 Which players, over 27 years old, have
player numbers of at least 1000?
SELECT PLAYNUM, PLAYNAME
FROM PLAYERS
WHERE AGE>27
AND PLAYNUM>=1000;
1131 Johnson
5410 Smith
8366 Gomez
ANDs and ORs
 Which players are over 30 years old or are less
than 22 years old and have a player number
less than 2000?
SELECT *
FROM PLAYERS
WHERE AGE>30
OR (AGE<22 AND PLAYNUM<2000);
358 Stevens 21
523 Doe 32
8366 Gomez 33
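The precedence rules for AND and OR can be checked against this query with a small sketch that uses Python's built-in sqlite3 module. The three result rows shown on the slide are loaded as data; the fourth row (Jones, with an assumed age of 25) is added so that one player matches none of the conditions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PLAYERS (PLAYNUM INTEGER, PLAYNAME TEXT, AGE INTEGER)")
conn.executemany(
    "INSERT INTO PLAYERS VALUES (?, ?, ?)",
    [(358, "Stevens", 21), (523, "Doe", 32), (8366, "Gomez", 33), (1779, "Jones", 25)],
)

def player_numbers(condition):
    """Return the PLAYNUM values that satisfy a WHERE condition, in ascending order."""
    sql = f"SELECT PLAYNUM FROM PLAYERS WHERE {condition} ORDER BY PLAYNUM"
    return [row[0] for row in conn.execute(sql)]

# AND is evaluated before OR, so the brackets in the slide's query are optional.
unbracketed = player_numbers("AGE>30 OR AGE<22 AND PLAYNUM<2000")
slide_form = player_numbers("AGE>30 OR (AGE<22 AND PLAYNUM<2000)")

# Bracketing the OR instead changes the meaning, and Gomez (8366) drops out.
or_first = player_numbers("(AGE>30 OR AGE<22) AND PLAYNUM<2000")
print(unbracketed, slide_form, or_first)
```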
Range Search Condition
(BETWEEN/NOT BETWEEN)
SELECT Name,Major
FROM STUDENT
WHERE Age BETWEEN 19 AND 30;

SELECT sid
FROM scores
WHERE points between 50 and 70;
Another Example
 Which players are between 25 and 27
years old?
SELECT PLAYNUM, PLAYNAME
FROM PLAYERS
WHERE AGE BETWEEN 25 AND 27;
1779 Jones
2007 Dobbs
4280 Cohen
5410 Smith
Set Membership Search
Condition (IN/NOT IN)
SELECT contactid, name, age
FROM contacts
WHERE age IN (15, 20, 25, 30);
Set Membership Search
Condition (IN/NOT IN)
 Which teams are in New York or Detroit?
SELECT TEAMNUM
FROM TEAMS
WHERE TEAMCITY IN ('New York', 'Detroit');
Pattern Match Search Condition
(LIKE/NOT LIKE)
% represents any sequence of zero or more
characters (wildcard)
_ (underscore) represents any single character

SELECT PLAYNUM, PLAYNAME
FROM PLAYERS
WHERE PLAYNAME LIKE 'S%';
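Both wildcards can be tried out in a minimal sketch using Python's built-in sqlite3 module. The player names and numbers are taken from result listings elsewhere in these slides; the names_like helper is hypothetical. Note that SQLite's LIKE ignores case for ASCII letters by default, which may differ from other DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PLAYERS (PLAYNUM INTEGER, PLAYNAME TEXT)")
conn.executemany(
    "INSERT INTO PLAYERS VALUES (?, ?)",
    [(358, "Stevens"), (1779, "Jones"), (5410, "Smith"), (8366, "Gomez")],
)

def names_like(pattern):
    """Return the player names matching a LIKE pattern, ordered by player number."""
    rows = conn.execute(
        "SELECT PLAYNAME FROM PLAYERS WHERE PLAYNAME LIKE ? ORDER BY PLAYNUM",
        (pattern,),
    )
    return [row[0] for row in rows]

starts_with_s = names_like("S%")     # 'S' followed by any sequence of characters
second_letter_o = names_like("_o%")  # any one character, then 'o', then anything
print(starts_with_s, second_letter_o)
```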
Distinct
 Used to eliminate duplicate answers
 Get pno values for parts for which orders
have been placed:
SELECT distinct pno
FROM odetails;
NULL Search Condition
(IS NULL/IS NOT NULL)
SELECT Rno,Date
FROM VIEWING
WHERE Pno = 'PG4' AND Comment IS NULL;
Sorting Results
SELECT contactID, name, height, address
FROM contacts
ORDER BY name, height DESC;
Calculated Fields

SELECT contactid, name, age/2


FROM contacts;
SQL Aggregate Functions
 COUNT(*)
 Counts all rows, including rows that contain
null values
 COUNT(fieldname)
 Counts only rows in which fieldname is not null
 SUM
 AVG
 MIN
 MAX
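The sketch below shows how the aggregate functions treat a null value, using Python's built-in sqlite3 module. The three sample rows are invented, and one age is deliberately left null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [("Joe", 30), ("Jane", None), ("Chris", 40)],  # Jane's age is NULL
)

# COUNT(*) counts every row; COUNT(age) and the other aggregates skip the NULL.
stats = conn.execute("""
    SELECT COUNT(*), COUNT(age), SUM(age), AVG(age), MIN(age), MAX(age)
    FROM contacts
""").fetchone()
print(stats)
```

The average is taken over the two known ages only, so it is 35 rather than 70 divided by 3.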
Use of COUNT(*)
 How many contacts are more than 30
years old?

SELECT COUNT(*) AS Count
FROM contacts
WHERE age>30;
COUNT
 Get the total number of customers.
SELECT COUNT(CNO) AS NUM_CUSTOMERS
FROM customers;
 Get the number of cities in which
customers are based.
SELECT COUNT(DISTINCT city)
FROM customers, zipcodes
WHERE customers.zip=zipcodes.zip;
Use of COUNT and SUM
 Find the total number of managers and the
sum of their salaries.

SELECT COUNT(Sno) AS Count, SUM(Salary) AS
Sum
FROM STAFF
WHERE Position='Manager';
Use of MIN, MAX, AVG
 Find the minimum, maximum, and average
staff salary.
SELECT MIN(Salary) AS Min,
MAX(Salary) AS Max, AVG(Salary)
AS Avg
FROM STAFF;
GROUPING
 Creates groups of rows that share a
common characteristic
 GROUP BY
collects rows with the same value into groups,
so that statistics can be calculated per group
 HAVING
filters the groups, much as WHERE filters rows
Use of GROUP BY
SELECT age, Count(*)
FROM contacts
GROUP BY age
ORDER BY age DESC;
Use of HAVING
SELECT age
FROM contacts
GROUP BY age
HAVING max(height) > 1;
WHERE vs HAVING
 The WHERE clause filters individual rows
going into the final results table.
 The HAVING filters groups into the final
results table; the search condition in the
HAVING clause always includes at least
one aggregate function.
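The difference between the two filters can be seen in a small sketch that uses Python's built-in sqlite3 module. The sample contacts and the age threshold of 18 are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, company TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [("Joe", "ABC", 30), ("Jane", "ABC", 17), ("Pat", "ABC", 52),
     ("Chris", "XYZ", 45), ("Lee", "XYZ", 15)],
)

# HAVING alone: every row is grouped, then groups with fewer than 2 rows are dropped.
having_only = conn.execute("""
    SELECT company, COUNT(*) FROM contacts
    GROUP BY company
    HAVING COUNT(*) >= 2
    ORDER BY company
""").fetchall()

# WHERE removes the under-18 rows before grouping, so XYZ is left with
# one row and no longer passes the HAVING condition.
where_then_having = conn.execute("""
    SELECT company, COUNT(*) FROM contacts
    WHERE age >= 18
    GROUP BY company
    HAVING COUNT(*) >= 2
    ORDER BY company
""").fetchall()
print(having_only, where_then_having)
```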
GROUP BY/HAVING
 The “GROUP BY” clause allows
you to group results together with
“aggregate functions”
 AVG(), COUNT(), MAX(), MIN(), SUM()
 COUNT DISTINCT
 HAVING allows you to search
the GROUP BY results
GROUP BY Examples
SELECT company FROM contacts
GROUP BY company;
SELECT company,count(company)
FROM contacts
GROUP BY company;
SELECT company,count(company)
FROM contacts
GROUP BY company
HAVING count(company) > 5;
ORDER BY
 The “ORDER BY” clause allows you to
sort the results returned by SELECT.

SELECT * FROM contacts
ORDER BY company;

SELECT * FROM contacts
ORDER BY company, name;
Distinct Keyword
 Returns only the unique values from a column,
eliminating duplicates from the result
 For example, the following query lists each
distinct course once, with no duplicates:
SELECT DISTINCT course
FROM tblstudent;
Multi-Table Queries:
Simple Join
SELECT STUDENT.Name
FROM STUDENT, ENROLLMENT
WHERE STUDENT.SID=ENROLLMENT.StudentNumber
AND ENROLLMENT.ClassName='BD445';

Alternatively:
SELECT STUDENT.Name
FROM STUDENT INNER JOIN ENROLLMENT ON
STUDENT.SID = ENROLLMENT.StudentNumber
WHERE ENROLLMENT.ClassName='BD445';
Using an ALIAS
SELECT s.studentid, s.lname, s.fname,
c.subjectid
FROM tblstudent as s, tblstudent_subject
as c
WHERE s.studentid=c.studentid;
Exercises
 Create queries that will show:
 All the teachers of a student
 First name | Last name | Teacher
 How many subjects each teacher teaches
 Teacher name | CountSubjects
 Students, their subjects, and their teachers
 First name | Last name | Subject name | Teacher
name
Joining together tables
 SELECT name,phone,zip FROM
people, phonenumbers, address
WHERE
people.addressid=address.addressid
AND people.id=phonenumbers.id;

People
Id   Name    Addressid
1    Joe     1
2    Jane    2
3    Chris   3

PhoneNumbers
PhoneID   Id   Phone
1         1    5532
2         1    2234
3         1    3211
4         2    3421
5         3    2341
6         3    3211

Address
AddressID   Company   Address   Zip
1           ABC       123       12345
2           XYZ       456       14454
3           PDQ       789       14423
Different types of JOINs
 “Inner Join”
 Unmatched rows in either table aren’t printed
 “Left Join”
 All records from the “left” side are printed
 “Right Join”
 All records from the “right” side are printed
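The inner and left joins can be compared side by side in a minimal sketch that uses Python's built-in sqlite3 module. The table names follow the later slides, while the two students and the subject code DB101 are invented. The right join is left out because older SQLite versions do not support it; it behaves like a left join with the two tables swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblStudent (studentid INTEGER PRIMARY KEY, fname TEXT);
    CREATE TABLE tblStudent_Subject (studentid INTEGER, subjectid TEXT);
    INSERT INTO tblStudent VALUES (1, 'Ana'), (2, 'Ben');   -- Ben has no subject
    INSERT INTO tblStudent_Subject VALUES (1, 'DB101');
""")

# Inner join: the unmatched student (Ben) is not returned.
inner_rows = conn.execute("""
    SELECT a.fname, b.subjectid
    FROM tblStudent AS a
    INNER JOIN tblStudent_Subject AS b ON a.studentid = b.studentid
    ORDER BY a.studentid
""").fetchall()

# Left join: every student is returned; a missing subject shows up as NULL (None).
left_rows = conn.execute("""
    SELECT a.fname, b.subjectid
    FROM tblStudent AS a
    LEFT JOIN tblStudent_Subject AS b ON a.studentid = b.studentid
    ORDER BY a.studentid
""").fetchall()
print(inner_rows, left_rows)
```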
General form of SELECT/JOIN
SELECT columns,…
FROM left_table
join_type JOIN right_table ON condition
WHERE condition;
Using Join (INNER)
SELECT fname, course FROM tblStudents
INNER JOIN tblStudentCourse ON
tblStudents.studentid=tblStudentCourse.studentid;
 Will display students with courses.
Original Form:
SELECT fname, course FROM tblStudents, tblStudentCourse
WHERE tblStudents.studentid=tblStudentCourse.studentid;
Using Join (LEFT)
SELECT a.fname, a.course, b.subjectid
FROM tblStudent as a
LEFT JOIN tblStudent_Subject as b ON
a.studentid=b.studentid;
 Will display all students. Null values
will be shown in the subject field for students
with no subject.
Using Join (RIGHT)
SELECT fname, course
FROM tblStudents as a
RIGHT JOIN tblStudent_Subject as b
ON a.studentid=b.studentid;
 Will display all subjects. Null values
will be shown in the student fields for subjects
with no student.
Using SQL in
MySQL RDBMS
Using mysql in the command
prompt
Go to:

C:\apachefriends\xampp\mysql\bin
Using MySQL Command Line
 Make sure you are in the /mysql/bin
directory.
 Type mysql, or mysql -u root, or mysql -u
(username) -p and enter the password when prompted
Show Databases
 To see what databases exist on this
server, type the following:
Use command
Show Tables Command
 To list the tables in the selected database
Describe tablename
 Shows the fields within a table
Create Database Command
Drop Database
Creating Tables
Inserting Data