CH 01

Advanced Database Management System (ADBMS)

01/31/24
Advanced Database System 1
CHAPTER ONE

Object-Oriented DBMSs

Topics discussed in this chapter

Concepts for Object-Oriented Databases

Weaknesses of RDBMSs

Overview of Object-Oriented Concepts

Object Identity, Object Structure, and Type Constructors

Encapsulation of Operations, Methods, and Persistence

Weaknesses of RDBMSs

 Poor Representation of “Real World” Entities.

 Difficulty in representing complex data types

 Poor Support for Integrity and Enterprise Constraints.

 Limited Operations.

 Schema changes are difficult.

What is an OODBMS ?

OODBMS: Object-Oriented Database Management System

 Data is represented in the form of objects, as in object-oriented programming (OOP).
 It stores complex data without mapping it to relational rows and columns.
 One benefit of an OODBMS is that, when it is integrated with an OOP language, there is much greater consistency between the database and the programming language.
 OODB = OOP concepts + DB concepts
Features of OODBs:

 They allow both state and behavior to be specified.
 Most OOP languages can use objects directly from the database.
 Object identity.
 Methods and messages.
 Classes, subclasses, superclasses, and inheritance.
 Overriding, overloading, polymorphism, and dynamic binding.
Object Oriented Concepts:

 Abstraction, encapsulation, and information hiding.


 Objects and classes
 Object identity.
 Methods and messages.
 Subclasses, superclasses, and inheritance.
 Overriding, overloading, polymorphism, and dynamic binding.
Objects
An object is a uniquely identifiable entity that contains both the attributes that describe the state of a real-world object and the actions associated with it.

This definition is very similar to that of an entity; however:

 An object encapsulates both state and behavior.

 An entity models only state (attributes).
Specifying Object Persistence

The typical mechanisms for making an object persistent are naming and reachability.

The naming mechanism involves giving an object a unique persistent name through which it can be retrieved by this and other programs.

The reachability mechanism works by making the object reachable from some persistent object.

An object B is said to be reachable from an object A if a sequence of references in the object graph leads from object A to object B.
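The reachability rule can be illustrated with a small graph traversal. This is only a sketch: the object names and the `refs` dictionary are hypothetical, and a real OODBMS would track references internally rather than through a Python dictionary.

```python
def reachable_from(root, refs):
    """Return the set of object names reachable from root.

    refs maps each object name to the names it references directly.
    """
    seen = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(refs.get(name, []))
    return seen


# Hypothetical object graph: "all_departments" is a named (persistent) root.
refs = {
    "all_departments": ["research"],
    "research": ["employee_1", "employee_2"],
    "employee_1": [],
    "employee_2": ["employee_1"],
    "scratch": ["employee_1"],  # not referenced by any persistent object
}

persistent = reachable_from("all_departments", refs)
print(sorted(persistent))
```

Here `scratch` is not persistent under the reachability rule: it references a persistent object, but no sequence of references leads from the named root to it.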


Object Identity (OID)
In an RDBMS, entity identity is value-based: the primary key is used to provide uniqueness.

Primary keys do not provide the type of object identity required in OO systems:
 A key is unique only within a relation, not across the entire system.
 A key is generally chosen from the attributes of a relation, making it dependent on entity state.

Objects exist independently of their (current) values.
Object Identity (OID)

An OID is the unique, system-generated means of referring to a persistent object.
OIDs cannot be based on ordinary values provided by the application (value orientation). Instead, OIDs are:
 Unique system-wide, not just within a relation
 Unchanged during the object's lifetime (immutable)
 Not reused after the object is deleted
 Generally system-managed
Object Identity (OID)

The main property required of an OID is that it be immutable; that is, the OID value of a particular object should not change. This preserves the identity of the real-world object being represented.

For example, suppose an object named Obj1 has the OID A123. When Obj1 is deleted from the database, A123 cannot be used to refer to any other object; that is, if an object is deleted, its OID must not be assigned to any other object.
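A minimal sketch of these OID rules, assuming a toy in-memory store. The `ObjectStore` class and its method names are hypothetical and do not come from any particular OODBMS.

```python
import itertools


class ObjectStore:
    """Toy object store that hands out system-generated OIDs."""

    def __init__(self):
        self._next_oid = itertools.count(1)  # increases monotonically, never reset
        self._objects = {}                   # OID -> current state

    def create(self, state):
        oid = next(self._next_oid)
        self._objects[oid] = state
        return oid

    def update(self, oid, state):
        # The state changes; the OID stays the same.
        self._objects[oid] = state

    def delete(self, oid):
        # The OID is retired; the counter never hands it out again.
        del self._objects[oid]

    def get(self, oid):
        return self._objects[oid]


store = ObjectStore()
a = store.create({"name": "Obj1"})
store.update(a, {"name": "Obj1 renamed"})  # same OID, new state
store.delete(a)
b = store.create({"name": "Obj2"})         # receives a fresh OID, not a's
print(a, b)
```

The OID is independent of the object's attribute values, which is the contrast with a primary key drawn from the state.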
Complex Objects
A complex object is something that can be viewed as a single object in the real world but actually consists of many sub-objects.
Two types of complex objects:
Unstructured complex objects:
• Their structure is hard to determine.
• They require a large amount of storage.
• BLOB (binary large object): e.g., images.
• CLOB (character large object): e.g., large strings.
Structured complex objects:
• They have a clear structure, e.g., a tuple.
Object Structure
In an OODB, the value of a complex object can be constructed from other objects.

Each object can be viewed as a triple (i, c, v), where:
i is the unique object identifier (OID)
c is the type constructor (operator), an indication of how the object value is constructed
v is the value (state) of the object
Basic constructors are atom, tuple, and set.
Others: list, array
Type constructors

In OO databases, the state (current value) of a complex object may be constructed from other objects (or other values) by using certain type constructors.

The type constructor determines how the object is constructed and tells us the basic structure of the object.
Type constructors
The basic constructors are:
 Atom: represents all basic atomic values, such as int, char, float, and string.

 Set: a collection of values of the same type with no duplicates.

 Bag: like a set, but duplicates are allowed.

 List: an ordered collection of items of the same type, e.g., [123, 456, 678].

 Array: similar to a list, but with a fixed size.

 Tuple: a collection of named elements, each of which may be of any of the above types.
Type Constructors
Tuple constructor TUPLE OF
 A tuple groups named components, e.g., Name | Set of locations | Array of emp.
 It is represented in the form <A1:i1, A2:i2, ..., An:in>.
 E.g., <Name:i1, Set of locations:i2, Array of emp:i3>
Set constructor SET OF
 Many elements of the same type build a set.
 Each element can be contained only once in the set.
Multi-set constructor BAG OF: like a set, but
 an element can occur more than once in the bag.
List constructor LIST OF: like a bag, but
 the OIDs in a list are ordered, so we can refer to the first, second, or nth object in the list; the sequence is of interest.
Object Structure
The object state v is interpreted based on the constructor c:

Constructor c   Object state v
Atom            A value from the domain of basic values
Set             A set of OIDs {i1, i2, i3, ..., in}
Tuple           <a1:i1, a2:i2, ..., an:in>
List            An ordered list of OIDs [i1, i2, i3, ..., in]
Array           An array of OIDs
Object Structure
The value v is interpreted on the basis of the constructor c.
Example:
if c = atom then v = an atomic value
o1 = (i1, atom, ‘House’)   (the value is ‘House’)
o2 = (i2, atom, ‘Blue’)
o3 = (i3, atom, ‘Sugarland’)
o4 = (i4, atom, 5)
o5 = (i5, atom, ‘Research’)
o6 = (i6, atom, ‘22-May-10’)
Object Structure
Example:
if c = tuple then v = <a1:i1, ..., an:in>

A department tuple:

o8 = (i8, tuple, <dname:i5, dnumber:i4, mgr:i9, locations:i7, employees:i10, projects:i11>)

o9 = (i9, tuple, <manager:i12, managerstartdate:i6>)
Object Structure
Example:
if c = set then v = {i1, i2, i3}
o7 = (i7, set, {i1, i2, i3})
o10 = (i10, set, {i12, i13, i14})   (a set of employees)
o11 = (i11, set, {i15, i16, i17})
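The (i, c, v) triples can be modeled directly as data. The sketch below reuses the atom and set objects from the examples, plus a shortened version of the department tuple that keeps only attributes whose OIDs are defined in the examples. The names `Obj`, `db`, and `materialize` are illustrative assumptions, not part of any OODBMS API.

```python
from collections import namedtuple

# An object is a triple (i, c, v): OID, constructor, value.
Obj = namedtuple("Obj", ["i", "c", "v"])

objects = [
    Obj("i1", "atom", "House"),
    Obj("i2", "atom", "Blue"),
    Obj("i3", "atom", "Sugarland"),
    Obj("i4", "atom", 5),
    Obj("i5", "atom", "Research"),
    Obj("i7", "set", {"i1", "i2", "i3"}),
    # Shortened, hypothetical version of o8: only attributes with known OIDs.
    Obj("i8", "tuple", {"dname": "i5", "dnumber": "i4", "locations": "i7"}),
]
db = {o.i: o for o in objects}


def materialize(oid):
    """Recursively replace OIDs by the values they refer to."""
    obj = db[oid]
    if obj.c == "atom":
        return obj.v
    if obj.c == "set":
        # Assumes the members materialize to hashable values (atoms here).
        return frozenset(materialize(i) for i in obj.v)
    if obj.c == "tuple":
        return {attr: materialize(i) for attr, i in obj.v.items()}
    raise ValueError(f"unknown constructor: {obj.c}")


department = materialize("i8")
print(department["dname"], department["dnumber"], sorted(department["locations"]))
```

The constructor c decides how v is read: as a plain value for an atom, as a collection of OIDs for a set, and as an attribute-to-OID mapping for a tuple.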
Classes
 A class is a blueprint for defining a set of similar objects, that is, a common description of similar objects.
 The objects in a class are called its instances.
 A class is itself an object, with its own class attributes and class methods.
 Objects created from the same class share the same class attributes and methods.
Class Instances Share Attributes and Methods

Class BRANCH
Attributes: branchNo, street, city, postcode
Methods: print(), getPostCode(), numberOfStaff()

Instances of BRANCH:
branchNo = B005, street = 22 Deer Rd, city = London, postcode = SW1 4EH
branchNo = B007, street = 16 Argyll St, city = Aberdeen, postcode = AB2 3SU
branchNo = B003, street = 163 Main St, city = Glasgow, postcode = G11 9QX
OO Data Modelling: Unified Modeling Language (UML)

UML is a standard language for specifying, constructing, visualizing, and documenting the artifacts of a software system.
It includes many structural diagrams (class diagrams, object diagrams, ...) and behavioral diagrams (use case diagrams, sequence diagrams, ...).
It is used to model objects and the relationships between objects.

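[Figure: UML class diagram annotated with the labels Class Name, Attribute, Method, and Association. Class MANAGER has attributes staffNo, sex, DOB, and salary and the method increaseSalary(). Class PROPERTY has attributes propertyNo, street, city, postcode, rooms, and type. The associations shown are manage (multiplicity 1..1 to 1..1) and offer / offered-by (multiplicity 1..1 to 1..*).]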
Unified Modeling Language (UML)
[Figure: UML class diagram. PERSON (name: fName, lName) is the superclass of STAFF (staffNo, position, DOB, salary), OWNER (ownerNo, address), and CLIENT (clientNo, telNo, prefType, maxRent). MANAGER and SALESTAFF are subclasses of STAFF. PROPERTY has attributes propertyNo, rooms, and rent; BRANCH has attributes branchNo and address. Association roles shown: Owns / OwnedBy, Views / ViewedBy, Offers / IsOfferedBy, Manages / ManagedBy, and Has / WorksAt.]
Chapter Two

Query Processing and Optimization

Introduction
Query Processing
 The activities involved in retrieving data from the database.

 These include translating high-level queries into low-level expressions that can be used at the physical level of the file system, optimizing the query, and actually executing the query to get the result.
Query Processing…

Aims of query processing (QP):

 Transform a query written in a high-level language (e.g., SQL) into a correct and efficient execution strategy expressed in a low-level language that implements relational algebra (RA).
 Execute the strategy to retrieve the required data.
Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation

Parsing and translation
 Scanner: The scanner identifies the language tokens, such as SQL keywords, attribute names, and relation names, in the text of the query.

 Parser: The parser checks the query syntax to determine whether it is formulated according to the syntax rules of the query language.

 Validation: The query must be validated by checking that all attribute and relation names are valid and semantically meaningful names in the schema of the particular database being queried.
Parsing and translation

 The query is converted to relational algebra by the SQL interpreter.
 The relational algebra expression is converted to an annotated tree, with joins as branches.
 Each operator has several implementation choices.
Translating SQL Queries into Relational Algebra
Query block:

The basic unit that can be translated into algebraic operators and optimized.

A query block contains a single SELECT-FROM-WHERE expression, as well as GROUP BY and HAVING clauses if these are part of the block.

Nested queries:

Nested queries within a query are identified as separate query blocks.

Aggregate operators in SQL must be included in the extended algebra.
Translation Example

Possible SQL query:

 SELECT balance FROM account WHERE balance<2500

Possible relational algebra query:

 π balance (σ balance<2500 (account))
Translating SQL Queries into Relational Algebra

Consider a query to find the names of employees who earn more than everyone in department 5:

SELECT lname, fname FROM employee WHERE salary >
    (SELECT MAX(salary) FROM employee WHERE dno=5)
Translating SQL Queries into Relational Algebra

Two query blocks:

Outer block:
SELECT lname, fname
FROM employee
WHERE salary > cons

Relational algebra: π lname, fname (σ salary>cons (employee))

where cons is the result of the inner block:
SELECT MAX(salary)
FROM employee
WHERE dno=5

Relational algebra: ℱ MAX salary (σ dno=5 (employee)), where ℱ is the aggregate function operator of the extended algebra
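The two-block evaluation can be mimicked in a few lines. This is only a sketch over hypothetical sample rows; the column names follow the query, and the inner block is evaluated first so that its result can be used as the constant cons in the outer block.

```python
# Hypothetical sample data; column names follow the query.
employee = [
    {"lname": "Smith",   "fname": "John",     "salary": 30000, "dno": 5},
    {"lname": "Wong",    "fname": "Franklin", "salary": 40000, "dno": 5},
    {"lname": "Borg",    "fname": "James",    "salary": 55000, "dno": 1},
    {"lname": "Wallace", "fname": "Jennifer", "salary": 43000, "dno": 4},
]

# Inner block: SELECT MAX(salary) FROM employee WHERE dno = 5
cons = max(e["salary"] for e in employee if e["dno"] == 5)

# Outer block: SELECT lname, fname FROM employee WHERE salary > cons
result = [(e["lname"], e["fname"]) for e in employee if e["salary"] > cons]

print(cons)
print(result)
```

Because the inner block does not refer to the outer query, it needs to be evaluated only once.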
Translating SQL Queries into Relational Algebra

Consider a query that lists the name of each employee together with the name of the employee's department:

SELECT lname, fname, dname FROM employee e, department d WHERE e.dno=d.dno

Relational algebra:

π lname, fname, dname (employee ⋈ e.dno=d.dno department)
Optimization
The query optimizer selects the execution plan with the lowest estimated cost (typically the fastest) among functionally equivalent forms.
A relational algebra expression may have many equivalent expressions, each of which gives rise to a different evaluation plan.

For example, π bala (σ bala>100 (Account)) and σ bala>100 (π bala (Account)) are equivalent queries; that is, they produce the same result.
Among all equivalent evaluation plans, choose the one with the lowest cost.
Execution plan

An internal representation of the query is then created, usually as a tree data structure called a query tree.

The DBMS must then devise an execution strategy or plan for retrieving the results of the query from the database files.

A query typically has many possible execution strategies, and the process of choosing a suitable one for processing a query is known as query optimization.
Evaluation
How does the database answer a query once it arrives?

The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.
Relational Algebra: overview
Project (unary)

 π <attr list> (R)

 <attr list> is a list of attributes (columns) from R only

 Ex: π title, year, length (Movie), a “vertical restriction”

The input relation has columns A1, A2, A3, ..., An and i rows; the result has columns A1, A2, ..., Ak and j rows, where n ≥ k.
Project

PROJECT can produce many tuples with the same value.

 Relational algebra semantics says to remove the duplicates.
 SQL does not remove them unless DISTINCT is specified; this is one difference between formal and actual query languages.
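The difference between the two semantics can be seen in a short sketch. The rows follow the Movie example used in this chapter, and the function names `project_bag` and `project_set` are hypothetical.

```python
movie = [
    {"title": "Star Wars", "year": 1977, "length": 124, "filmType": "color"},
    {"title": "Mighty Ducks", "year": 1991, "length": 104, "filmType": "color"},
    {"title": "Wayne's World", "year": 1992, "length": 95, "filmType": "color"},
]


def project_bag(rows, attrs):
    """SQL-style projection: duplicates are kept (SELECT without DISTINCT)."""
    return [tuple(r[a] for a in attrs) for r in rows]


def project_set(rows, attrs):
    """Relational-algebra projection: duplicates are removed."""
    seen = set()
    out = []
    for t in project_bag(rows, attrs):
        if t not in seen:
            seen.add(t)
            out.append(t)
    return out


print(project_bag(movie, ["filmType"]))  # three identical tuples
print(project_set(movie, ["filmType"]))  # a single tuple
```

Projecting onto title, year, and length would give the same result under both semantics here, because no two rows agree on all three attributes.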
Relational Algebra: Select
Select or Restrict
 σ <predicate> (R)

 <predicate> is a conditional expression of the type that we are familiar with from conventional programming languages:

 <attribute> <op> <attribute>
 <attribute> <op> <constant>
 each attribute is in R
 op ∈ {=, ≠, <, >, ≤, …, AND, OR}

 Ex: σ length≥100 (Movie), a “horizontal restriction”
Pictorially
Movie
title          year  length  filmType
Star Wars      1977  124     color      (in result set)
Mighty Ducks   1991  104     color      (in result set)
Wayne’s World  1992  95      color

Selection maps a relation with columns A1, A2, A3, ..., An and i rows to a relation with the same columns and j rows, where i ≥ j.
The fraction of tuples selected by a condition is referred to as the selectivity of the condition.
Cartesian Product

 R x S

 The set of all pairs that can be formed by choosing the first element of the pair to be any element of R and the second to be any element of S.
 The resulting schema may be ambiguous.

 Use R.A or S.A to disambiguate an attribute A that occurs in both schemas.
Example

R          S              R x S
A  B       B  C   D       A  R.B  S.B  C   D
1  2       2  5   6       1  2    2    5   6
3  4       4  7   8       1  2    4    7   8
           9  10  11      1  2    9    10  11
                          3  4    2    5   6
                          3  4    4    7   8
                          3  4    9    10  11
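The example can be reproduced with a short sketch. The helper `cartesian_product` is a hypothetical name; it qualifies any attribute that occurs in both schemas with the relation name, as in R.B and S.B.

```python
from itertools import product

R = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
S = [{"B": 2, "C": 5, "D": 6}, {"B": 4, "C": 7, "D": 8}, {"B": 9, "C": 10, "D": 11}]


def cartesian_product(r, s, r_name="R", s_name="S"):
    """Pair every tuple of r with every tuple of s."""
    common = set(r[0]) & set(s[0]) if r and s else set()
    out = []
    for tr, ts in product(r, s):
        row = {}
        for k, v in tr.items():
            row[f"{r_name}.{k}" if k in common else k] = v
        for k, v in ts.items():
            row[f"{s_name}.{k}" if k in common else k] = v
        out.append(row)
    return out


rxs = cartesian_product(R, S)
print(len(rxs))
for row in rxs:
    print(row)
```

The result has |R| * |S| = 2 * 3 = 6 tuples, matching the table.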
Join Operations

Natural Join (binary)

 R ⋈ S
 Match only those tuples from R and S that agree on whatever attributes are common to the schemas of R and S.
 If tuples r from R and s from S are successfully paired, the result is called a joined tuple.
 This is the same join operation used in an earlier section to recombine relations that had been projected onto two subsets of their attributes (e.g., as a result of a BCNF decomposition).
Example

R          S              R ⋈ S
A  B       B  C   D       A  B  C  D
1  2       2  5   6       1  2  5  6
3  4       4  7   8       3  4  7  8
           9  10  11
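A nested-loop sketch of the natural join on the same R and S. The function name `natural_join` is hypothetical, and a real DBMS would normally choose a more efficient join algorithm.

```python
R = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
S = [{"B": 2, "C": 5, "D": 6}, {"B": 4, "C": 7, "D": 8}, {"B": 9, "C": 10, "D": 11}]


def natural_join(r, s):
    """Pair tuples of r and s that agree on all common attributes."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    out = []
    for tr in r:
        for ts in s:
            if all(tr[a] == ts[a] for a in common):
                out.append({**tr, **ts})  # common attributes appear once
    return out


joined = natural_join(R, S)
for row in joined:
    print(row)
```

The S tuple (9, 10, 11) has no partner in R, so it does not appear in the result.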
Optimization
A relational algebra expression may have many equivalent expressions.

E.g., σ salary<75000 (π salary (instructor)) is equivalent to π salary (σ salary<75000 (instructor)).

Each relational algebra operation can be evaluated using one of several different algorithms.
Correspondingly, a relational-algebra expression can be evaluated in many ways.
E.g., we can use an index on salary to find instructors with salary < 75000, or we can perform a complete relation scan and discard instructors with salary ≥ 75000.
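The two evaluation strategies can be contrasted in a small sketch. The instructor rows are hypothetical, and a sorted Python list searched with `bisect` stands in for an ordered index on salary; a real DBMS would typically use a B+-tree.

```python
import bisect

# Hypothetical instructor relation.
instructor = [
    {"name": "Katz", "salary": 75000},
    {"name": "Mozart", "salary": 40000},
    {"name": "Einstein", "salary": 95000},
    {"name": "Gold", "salary": 87000},
    {"name": "Califieri", "salary": 62000},
]


def scan_select(rows, limit):
    """Complete relation scan: inspect every tuple, discard salary >= limit."""
    return [r for r in rows if r["salary"] < limit]


# Sorted (salary, position) pairs stand in for an ordered index on salary.
salary_index = sorted((r["salary"], pos) for pos, r in enumerate(instructor))
salary_keys = [k for k, _ in salary_index]


def index_select(rows, limit):
    """Index lookup: binary-search the first key >= limit, fetch entries before it."""
    cut = bisect.bisect_left(salary_keys, limit)
    return [rows[pos] for _, pos in salary_index[:cut]]


by_scan = sorted(r["name"] for r in scan_select(instructor, 75000))
by_index = sorted(r["name"] for r in index_select(instructor, 75000))
print(by_scan, by_index)
```

Both strategies return the same tuples; they differ only in how many tuples must be inspected, which is what the optimizer's cost estimate captures.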
Optimization….
An annotated expression specifying a detailed evaluation strategy is called an evaluation plan.

Query optimization: among all equivalent evaluation plans, choose the one with the lowest cost.

Cost is estimated using statistical information from the database catalog, e.g., the number of tuples in each relation, the size of tuples, etc.

Total cost = CPU cost + I/O cost + communication cost
Three Key Concepts in QPO
1. Building blocks
 Most DBMSs have a few building blocks:
• select (point query, range query), join, sorting, ...
 An SQL query is decomposed into building blocks.

2. Query processing strategies for building blocks
 The DBMS keeps a few processing strategies for each building block.
• e.g., a point query can be answered via an index or by scanning the data file
3. Query optimization
 For each building block of a given query, the DBMS QPO tries to choose the
• “most efficient” strategy given database parameters
• parameter examples: table size, available indices, …
• e.g., index search is chosen for a point query if the index is available
Query tree
Query tree: a tree data structure that corresponds to a
relational algebra expression. It represents the input
relations of the query as leaf nodes of the tree, and
represents the relational algebra operations as internal
nodes.
An execution of the query tree consists of executing an
internal node operation whenever its operands are available
and then replacing that internal node by the relation that
results from executing the operation.
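A minimal sketch of a query tree for π balance (σ balance<2500 (account)). The class names and the account rows are hypothetical. Each internal node executes once its operand is available; here that is done by recursion, which has the same effect as replacing each internal node by its result from the bottom up.

```python
class Relation:
    """Leaf node: an input relation."""

    def __init__(self, rows):
        self.rows = rows

    def execute(self):
        return self.rows


class Select:
    """Internal node: σ predicate (child)."""

    def __init__(self, predicate, child):
        self.predicate, self.child = predicate, child

    def execute(self):
        return [r for r in self.child.execute() if self.predicate(r)]


class Project:
    """Internal node: π attrs (child), with duplicate elimination."""

    def __init__(self, attrs, child):
        self.attrs, self.child = attrs, child

    def execute(self):
        out = []
        for r in self.child.execute():
            t = {a: r[a] for a in self.attrs}
            if t not in out:
                out.append(t)
        return out


# Hypothetical account relation.
account = Relation([
    {"account_number": "A-101", "balance": 500},
    {"account_number": "A-102", "balance": 4000},
    {"account_number": "A-103", "balance": 900},
    {"account_number": "A-104", "balance": 500},
])

tree = Project(["balance"], Select(lambda r: r["balance"] < 2500, account))
result = tree.execute()
print(result)
```

The leaf is the input relation and the two internal nodes are the algebra operations, as described in the definition of a query tree.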
Tree Representation of Relational Algebra

π balance (σ balance<2500 (account))

    π balance
        |
    σ balance<2500
        |
     account
Making An Evaluation Plan
Annotate the query tree with evaluation instructions:

    π balance
        |
    σ balance<2500   (use index 1)
        |
     account

The query can now be executed by the query execution engine.
Tree Representation of Relational Algebra

π A1, ..., An (σ P (R1 x R2 x ... x Rk))

        π A1, ..., An
              |
             σ P
              |
              x
             / \
           ...  Rk
           /
          x
         / \
        x   R3
       / \
     R1   R2
Why Learn about QPO?
Learning about QPO in a DBMS helps to:
 Identify the performance bottleneck for a query
• Is it the physical data model or the QPO?
 Help the QPO speed up the processing of a query
• by providing hints, rewriting the query, etc.
 Enhance the physical data model to speed up queries
• add indices, change file structures, …
Measures of Query Cost
Cost is generally measured as the total elapsed time for answering a query.
 Many factors contribute to time cost:

• disk accesses, CPU, or even network communication

Typically, disk access is the predominant cost and is also relatively easy to estimate. It is measured by taking into account:
 Number of seeks * average-seek-cost
 Number of blocks read * average-block-read-cost
 Number of blocks written * average-block-write-cost

• The cost to write a block is greater than the cost to read a block, because data is read back after being written to ensure that the write was successful.
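The disk-access estimate is a weighted sum and can be written down directly. This is a sketch: the function name `disk_cost` and all numbers are hypothetical, with times taken to be in milliseconds.

```python
def disk_cost(seeks, blocks_read, blocks_written,
              avg_seek_cost, avg_block_read_cost, avg_block_write_cost):
    """Estimated disk cost: seeks, block reads, and block writes, each weighted."""
    return (seeks * avg_seek_cost
            + blocks_read * avg_block_read_cost
            + blocks_written * avg_block_write_cost)


# Hypothetical plan: 10 seeks, 1000 block reads, no writes.
cost = disk_cost(seeks=10, blocks_read=1000, blocks_written=0,
                 avg_seek_cost=4.0,
                 avg_block_read_cost=0.125,
                 avg_block_write_cost=0.25)  # writing assumed costlier than reading
print(cost)
```

An optimizer would compute such an estimate for each candidate plan and keep the cheapest one.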
Algorithms for select operations
Implementing the SELECT Operations

There are many algorithms for executing a select operation, which is basically a search operation to locate the records in a disk file that satisfy a certain condition.
Let us discuss the following relational operations:
 OP1: σ SSN=“123” (Employee)
 OP2: σ Dnumber>5 (department)
 OP3: σ Dno>5 (employee)
