Module 4-DB Management Module


Managing Database

Design, create, backup and recovery

Outcome-Based Learning Module for HIT Level IV Learners

Acknowledgments
This learner module would not have been possible without the support of many organizations
and experts. The Ethiopian Federal Ministry of Health (FMOH) and Tulane International would
like to express their gratitude to the regional Health Science Colleges for their participation
in the development of the draft materials for this learner module. We also gratefully
acknowledge Harar Health Science College for all the support provided during the initial
draft development workshop held at the College. Finally, an honorable mention goes to the
FMOH and Tulane International experts for their invaluable contributions during the revision
of the HIT occupational standard, the development of the new curriculum and, lastly, the
preparation of this learning material.

Table of Contents

Introduction

Topic 1: Basics of Database
1.1 Introduction
1.2 Learning objectives
1.3 File System and Database Approach
1.4 File System vs. Database Approach
1.5 Characteristics of the Database Approach
1.6 Advantage of Using the DBMS
1.7 Basics of Database Architecture
1.8 Actors on the Database Environment
Activities/exercise

Topic 2: Design, Produce and Deploy Relational Database
2.1 Introduction
2.2 Learning Objectives
2.3 Designing a Database
2.4 Relational Model (ER-modeling)
2.5 Relationships
2.6 Mapping E-R diagrams to Tables
2.7 Normalization
Activities/exercise

Topic 3: Introduction to SQL
3.1 Introduction
3.2 Learning Objectives
3.3 Introduction to Microsoft SQL Server
3.4 Using Management Studio with Database Engine
3.5 Managing Databases Using Object Explorer
    3.3.1 Creating a Database
    3.3.2 Modifying Databases without Using SQL
3.6 SQL
3.7 Types of SQL Commands
3.8 Using SQL Commands
    3.6.1. Creation of a Database
    3.6.2. Creating Tables
    3.6.3. Data Types in SQL
    3.6.4. Adding or Dropping a New Column
    3.6.5. Inserting data into a Table
    3.6.6. Updating Records
    3.6.7. SQL SELECT Statement
3.9 Generating a Report
Activities/exercise

Topic 4: Database Backup and Recovery
4.1. Introduction
4.2. Learning Objectives
4.3. Database Backup
4.4. Methods of Database Backup
4.5. Performing Backup
4.6. Backing Up Using SQL Server Management Studio
4.7. Database Recovery
    4.7.1. Database Recovery Point
    4.7.2. Recovery Methods
    4.7.3. Recovery Techniques
    4.7.4. Disaster Recovery Plan and Procedures
    4.7.5. Performing a Database Recovery
Activities/exercise

Topic 5: Data and Database Administration
5.1 Introduction
5.2 Learning Objectives
5.3 Functions and Roles of Data/Database Administrator
5.4 Basics of Security
Activities/Exercises

Introduction
This learner module was developed in line with the national competency standard in the Health
Information Technique (HIT) Training Package HLT HIT4 for the units of competency
Design, Produce and Manage Mid-sized Database (HLT HIT4 10 0611) and Complete
Database Back-Up and Recovery (HLT HIT4 16 0611). The module primarily discusses how
to create a database and manage the data stored in it. From beginning to end it works with
an example related to health. The last part of the module covers how to take a backup of a
given database and the different backup mechanisms.

Completion of this module will help you to understand how a database is designed and
created. A given table may not be in normal form (that is, it may have anomalies that
hinder you from getting appropriate values from your database). In that case you should
follow the phases needed to convert the unnormalized table into a normalized one. The
database that is created should be kept in a proper place, or you need to have a backup of
it. If the database is damaged, you can use whichever of the recovery mechanisms is
simplest for that particular case.

At the end of the module you are expected to gain the following essential knowledge and
skills.

Essential Knowledge on:


• Identifying the difference between data and information
• The purpose of creating a database
• The advantages of using a database management system
• Database and table definitions
• Data types and relations among tables
• Identifying the different SQL queries
• Purpose of database backup and restore
• Ensuring security of a database

Essential skills on:
• Creating a database and tables
• Defining data types and entering data into tables
• Normalization of relational tables
• Creating relationships among tables and applying SQL queries to tables
• Taking database backups and applying recovery methods using SQL Server Management
Studio
• Ensuring database confidentiality and security

Learning Outcome Summary

Upon the completion of the module you should be able to:


• Analyze, design and produce a mid-sized relational database
• Determine and implement database backup and recovery methods
• Ensure entry, security and confidentiality of data

Assessment criteria

The set of competencies (skills, knowledge and attitudes) you have developed by the completion
of this module should allow you to demonstrate a certain level of performance in the world of
work. These may be assessed by the following assessment criteria.

1. Mid-sized database is designed on the basis of client requirement.

2. Data characteristics are identified on the basis of user requirement

3. User friendly form is designed

4. Queries are written to generate reports to a mid-sized database.

5. Report layout is designed on the basis of client’s requirement.

6. Orientation is given to the client about the developed mid-level database.

7. Back-up methods are identified and implemented on the basis of organizational and
database backup standards

8. Possible failure scenarios are identified and recovery plans are implemented on the
basis of organizational and database restoration standards.

9. On-line file back-ups are determined by organizational and security standards and
completed with minimal down time.

10. Full off-line back-ups are completed according to organizational and security standards
with minimal down time.

11. Application of the data entry procedure is checked, based on the institutional
guideline/manual

12. Data is checked for completeness and accuracy

13. Data security, confidentiality and integrity mechanisms are identified and
implemented on the basis of organizational and database security and confidentiality
standards

How to use the learner module (Part- II for accelerated HIT)


• This learner module is prepared for the units of competency ‘Design, Produce and
Manage Mid-sized Database’ that contain the knowledge, skills and attitudes required to
create and use a database by HIT Level IV students. It contains training materials and
activities relevant to the aforementioned units of competency.
• You are required to go through a series of learning activities in order to complete each of
the topics of the module. Each topic and sub-topic contains information and
activities. Carry out those activities on your own at the end of each learning activity.
Each topic or sub-topic may have more than one learning activity. All the activities
prepared for the generic module are included in this material.
• This module will be the source of information that enables you to acquire the
knowledge and skills independently, at your own pace, or with minimum supervision
or help from your teacher.
• The information contained in this material is highly summarized, as it assumes that most
of the topics and subtopics were already covered in your previous studies/levels (IT support
service level 1 and 2).

Resource

Topic: Resource/Learning materials

• Analyze, design and produce a mid-sized relational database: Computers, Microsoft SQL
Server 2008, operating system
• Determine and implement database backup and recovery methods: Computers, Microsoft SQL
Server 2008, operating system, external media (CD, DVD or external hard disk to take a
backup)
• Ensure entry, security and confidentiality of data: Computers, Microsoft SQL Server 2008,
operating system

References

Resource/Learning materials

• Fundamentals of Database Systems by Elmasri, Navathe

• Microsoft SQL Server 2008: A Beginner’s Guide

• Database Normalization by Mike Hillyer

• Modern Database Management by Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden

• Beginning SQL Server 2005 Administration by Dan Wood, Chris Leiter, Paul Turley

• http://db.grussell.org/section005.html

• Introduction to Database Systems by Dr. Jean-Claude Franchitti

• http://infogoal.com/sql/sql-drop-database.htm

• http://www.tutorialspoint.com/sql/sql-null-values.htm

Topic 1: Basics of Database
1.1 Introduction
As this is the first topic in the module, it covers some of the common terms and their
definitions. The topic also covers some of the advantages gained by using a database
approach rather than the earlier file system. Finally, the topic ends by discussing the basic
principles that are important in designing database architecture.

1.2 Learning objectives


At the end of this topic the student should be able to:
• Define basic database terms
• Identify the difference between file system and database approach
• Explain the advantage of using DBMS
• Illustrate the different kinds of database architectures
• Identify role players in the database environment

1.3 File System and Database Approach


Before going into a detailed discussion of databases, it is good to define some of the basic
terms related to them. Some of these terms are briefly discussed below:

Data versus Information

The terms data and information are closely related, and in fact are often used
interchangeably. However, it is useful to distinguish between data and information.

Data: stored representations of objects and events that have meaning and importance in the
user's environment.

Information: processed data that has meaning and increases the knowledge of the person
who uses it. For example, consider the following list of facts.

Alemu Bekele 32456

Seid Kemal 45627

Belay Teklu 67892

Abebech Tolosa 34256

The above facts satisfy the definition of data. They are simply names paired with numbers,
and on their own they are of little use because nothing indicates what the entries mean.

There are different ways to convert the above data into information. In this section we look
at the two basic methods. The first is to put the data into a table, so the above data can be
represented as follows:

Table 1-1: Information regarding daily patient treatment

Daily Treatment
Physician Name: Hailu Tefera
Status: Pediatrics        Date: May 2, 2012

Name              Patient ID   Sex   Age
Alemu Bekele      32456        M     3
Seid Kemal        45627        M     2
Belay Teklu       67892        M     4
Abebech Tolosa    34256        F     1

Another way to convert data into information is to summarize them, or otherwise process
them, and present them in pictorial form. For example, Figure 1-1 shows the distribution of
outbreaks in the four major regions of Ethiopia.

Figure 1-1: Outbreak Distribution

Figure 1-1 shows the outbreak distribution presented graphically. This information
could be used as a basis for intervention planning.

In practice today, a database may contain data, information, or both. For example, a
database may store an X-ray image of a patient. Data are also often preprocessed and stored
in summarized form in databases that are used for decision support.

Metadata: data that describe the properties or characteristics of end-user data and the
context of that data. Some of the properties that are typically described include data names,
definitions, length (or size), and allowable values. Metadata describing data context include
the source of the data, where the data are stored, ownership (or stewardship), and usage.

Database: an organized collection of logically related data. A database is similar to a data
file in that it is a storage place for data. Like a data file, a database does not present
information directly to a user; the user runs an application that accesses data from the
database and presents it to the user in an understandable format.

1.4 File System vs. Database Approach


A file system and a Database Management System (DBMS) are two ways to manage, store,
retrieve and manipulate data. A file system is a collection of raw data files stored on the
hard drive, whereas a DBMS is a bundle of applications dedicated to managing data stored in
databases. It is an integrated system for managing digital databases that supports storage
of database content, creation and maintenance of data, search, and other functions. Both
systems allow the user to work with data in a similar way. The file system is one of the
earliest ways of managing data. Because of the shortcomings of storing electronic data in a
file system, database systems came into use later, as they provide mechanisms to solve those
problems.

File System

A file system has a number of characteristics that differ from those of a database
management system. In the file system approach, each user defines and implements the files
needed for a specific application. For example, in a hospital system, one user maintains the
details of how many patients are in the hospital along with their histories; these details
are stored and maintained in a separate file.

Another user maintains the payments that patients made during treatment; the detailed
payment transaction report is stored and maintained in another separate file. Although both
users are interested in patient data, they keep their details in separate files and need
different programs to manipulate those files. This wastes space and creates redundancy
(replication) of data, which may lead to confusion: data cannot be shared among users, and
data inconsistency may occur. The files have no inter-relationship among the data stored in
them. Therefore, in traditional file processing every user defines their own constraints and
implements the files needed for their applications.
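
The inconsistency described above can be sketched in a few lines of Python. This is only an illustration: the two in-memory "files", the address column and the place names are hypothetical and are not part of the module's hospital example.

```python
import csv
import io

# Two hypothetical files kept by two different users. Both repeat the
# patient's name and address (redundancy).
registration_file = io.StringIO(
    "patient_id,name,address\n32456,Alemu Bekele,Harar\n"
)
payment_file = io.StringIO(
    "patient_id,name,address,amount\n32456,Alemu Bekele,Harar,50\n"
)

registrations = list(csv.DictReader(registration_file))
payments = list(csv.DictReader(payment_file))

# The registration user records a new address. The payment file belongs to
# another user and another program, so it is not updated.
registrations[0]["address"] = "Dire Dawa"

# The same patient now has two different addresses (inconsistency).
inconsistent = [
    r["patient_id"]
    for r in registrations
    for p in payments
    if r["patient_id"] == p["patient_id"] and r["address"] != p["address"]
]
print(inconsistent)
```

With a single shared database the address would be stored once, so an update made by one user would be seen by every other user.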

Database Management System

A Database Management System (DBMS) is a collection of programs that enables users to
create and maintain a database. The DBMS is hence a general-purpose software system that
facilitates the process of defining, constructing, and manipulating databases for various
applications. Defining a database involves specifying the data types, structures, and
constraints for the data to be stored in the database. Constructing the database is the process
of storing the data itself on some storage medium that is controlled by the DBMS.
Manipulating a database includes such functions as querying the database to retrieve
specific data, updating the database to reflect changes in the miniworld, and generating
reports from the data.
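
The three activities can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for a full DBMS such as Microsoft SQL Server; the table and column names are illustrative, and the rows reuse the patient list from Table 1-1.

```python
import sqlite3

# SQLite (bundled with Python) stands in for a full DBMS in this sketch.
conn = sqlite3.connect(":memory:")

# Defining: specify the structure, data types and constraints.
conn.execute("""
    CREATE TABLE patient (
        patient_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        sex        TEXT CHECK (sex IN ('M', 'F')),
        age        INTEGER CHECK (age >= 0)
    )
""")

# Constructing: store the data itself.
conn.executemany(
    "INSERT INTO patient (patient_id, name, sex, age) VALUES (?, ?, ?, ?)",
    [
        (32456, "Alemu Bekele", "M", 3),
        (45627, "Seid Kemal", "M", 2),
        (67892, "Belay Teklu", "M", 4),
        (34256, "Abebech Tolosa", "F", 1),
    ],
)
conn.commit()

# Manipulating: query, update, and generate a simple report.
young = conn.execute(
    "SELECT name FROM patient WHERE age < 3 ORDER BY name"
).fetchall()
conn.execute("UPDATE patient SET age = 4 WHERE patient_id = 32456")
conn.commit()
report = conn.execute(
    "SELECT sex, COUNT(*) FROM patient GROUP BY sex ORDER BY sex"
).fetchall()
print(young)
print(report)
```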

1.5 Characteristics of the Database Approach


In the database approach, a single repository of data is maintained that is defined once and
then is accessed by various users. The main characteristics of the database approach are the
following.

a. Self-Describing Nature of the Database System

A fundamental characteristic of the database approach is that the database system contains
not only the database itself but also a complete definition or description of the database
structure and constraints. This definition is stored in the system catalog, which contains
information such as the structure of each file, the type and storage format of each data item,
and various constraints on the data. The information stored in the catalog is called meta-data,
and it describes the structure of the primary database.
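
As a rough illustration of a catalog, the sketch below asks a database to describe itself. It assumes Python's sqlite3 module, where the catalog is the sqlite_master table and column details come from PRAGMA table_info. Other DBMSs expose the same kind of meta-data in different ways (SQL Server, for example, offers INFORMATION_SCHEMA views). The patient table is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient ("
    "patient_id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)"
)

# sqlite_master is SQLite's catalog: one row per table, index or view.
catalog = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default_value, pk)
columns = [
    (row[1], row[2], bool(row[3]))
    for row in conn.execute("PRAGMA table_info(patient)")
]

print(catalog)   # objects the database knows about
print(columns)   # (column name, declared type, NOT NULL flag)
```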

b. Insulation between Programs and Data, and Data Abstraction

In traditional file processing, the structure of data files is embedded in the access programs,
so any change to the structure of a file may require changing all programs that access this
file. By contrast, DBMS access programs do not require such changes in most cases. The
structure of data files is stored in the DBMS catalog separately from the access programs. We
call this property program-data independence.

The characteristic that allows program-data independence and program-operation
independence is called data abstraction.
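
Program-data independence can be seen in a minimal sketch, again assuming Python's sqlite3 module. The function patient_names plays the role of an access program, and the added phone column is a hypothetical change to the table structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO patient VALUES (32456, 'Alemu Bekele')")

def patient_names(connection):
    """An 'access program' that names only the columns it needs."""
    rows = connection.execute("SELECT name FROM patient ORDER BY name")
    return [row[0] for row in rows]

before = patient_names(conn)

# The structure of the table changes: a new column is added.
conn.execute("ALTER TABLE patient ADD COLUMN phone TEXT")

# The access program runs unchanged, because the structure is kept in the
# catalog and not inside the program.
after = patient_names(conn)
print(before, after)
```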

c. Support of multiple views of the data

A database typically has many users, each of whom may require a different perspective or
view of the database. A view may be a subset of the database, or it may contain virtual data
that is derived from the database files but is not explicitly stored. Some users may not need
to be aware of whether the data they refer to is stored or derived. A multiuser DBMS whose
users have a variety of applications must provide facilities for defining multiple views.
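
Both kinds of view can be sketched as follows, assuming Python's sqlite3 module. The view names, the diagnosis values and the birth years are hypothetical; the birth years were chosen so that the derived ages match Table 1-1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (
        patient_id INTEGER PRIMARY KEY,
        name       TEXT,
        birth_year INTEGER,
        diagnosis  TEXT
    );
    INSERT INTO patient VALUES (32456, 'Alemu Bekele', 2009, 'Malaria');
    INSERT INTO patient VALUES (34256, 'Abebech Tolosa', 2011, 'Pneumonia');

    -- A subset view: reception staff see names but not diagnoses.
    CREATE VIEW reception_view AS
        SELECT patient_id, name FROM patient;

    -- A view with derived (virtual) data: age is computed, not stored.
    CREATE VIEW age_view AS
        SELECT name, 2012 - birth_year AS age_in_2012 FROM patient;
""")

reception = conn.execute(
    "SELECT * FROM reception_view ORDER BY patient_id"
).fetchall()
ages = conn.execute("SELECT * FROM age_view ORDER BY name").fetchall()
print(reception)
print(ages)
```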

d. Sharing of data and multiuser transaction processing

A multiuser DBMS, as its name implies, must allow multiple users to access the database at
the same time. This is essential if data for multiple applications is to be integrated and
maintained in a single database. The DBMS must include concurrency control software to
ensure that several users trying to update the same data do so in a controlled manner so that
the result of the updates is correct. For example, in a hospital, when several matrons (a
matron is the woman in charge of nursing in a medical institution) try to assign a bed to a
patient, the DBMS should ensure that each bed can be accessed by only one matron at a time
for assignment to a patient. A fundamental role of multiuser DBMS software is to ensure that
concurrent transactions operate correctly.
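
The bed assignment example can be sketched as a conditional update, assuming Python's sqlite3 module. The bed table and the assign_bed function are hypothetical. The sketch uses a single connection, so it shows only the pattern (check and assign in one atomic statement), not real simultaneous access, which a multiuser DBMS handles with its concurrency control software.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bed (bed_no INTEGER PRIMARY KEY, patient_id INTEGER)")
conn.execute("INSERT INTO bed (bed_no, patient_id) VALUES (1, NULL)")
conn.commit()

def assign_bed(connection, bed_no, patient_id):
    """Assign the bed only if it is still free; return True on success."""
    with connection:  # commits on success, rolls back on error
        cursor = connection.execute(
            "UPDATE bed SET patient_id = ? "
            "WHERE bed_no = ? AND patient_id IS NULL",
            (patient_id, bed_no),
        )
    # rowcount is 1 if the bed was free and is now assigned, 0 otherwise.
    return cursor.rowcount == 1

first = assign_bed(conn, 1, 32456)   # first matron gets the bed
second = assign_bed(conn, 1, 45627)  # second matron is refused
owner = conn.execute("SELECT patient_id FROM bed WHERE bed_no = 1").fetchone()[0]
print(first, second, owner)
```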

1.6 Advantage of Using the DBMS
In this section we discuss some of the advantages of using a DBMS and the capabilities that
a good DBMS should possess.

a) Controlling redundancy

In conventional data systems, an organization often builds a collection of application
programs, often created by different programmers and requiring different components of the
operational data of the organization. The data in conventional data systems is often not
centralized. Some applications may require data to be combined from several systems. These
several systems could have data that is redundant as well as inconsistent (that is, different
copies of the same data may have different values). Data inconsistencies are often
encountered in everyday life. For example, we have all come across situations where, after a
new address is communicated to an organization that we deal with (e.g. a bank, a telecom
company, or a gas company), some of the communications from that organization are received
at the new address while others continue to be mailed to the old address. Combining all the
data in a database reduces both redundancy and inconsistency. It is also likely to reduce
the costs of collecting, storing and updating data.

b) Restricting unauthorized Access

In file systems, applications are developed for a specific purpose. Often different systems
of an organization access different components of the operational data. In such an
environment, enforcing security can be quite difficult. Setting up a database makes it
easier to enforce security restrictions, since the data is now centralized and it is easier
to control who has access to which parts of the database. However, setting up a database can
also make it easier for a determined person to breach security. We will discuss this in
Topic 5 of this module.

c) Better service to the Users

A DBMS is often used to provide better service to the users. In conventional systems,
availability of information is often poor, since it is normally difficult to obtain information
that the existing systems were not designed for. Once several conventional systems are
combined to form one centralized database, the availability of information and how up to
date it is are likely to improve, since the data can now be shared and the DBMS makes it easy
to respond to unforeseen information requests. Centralizing the data in a database also often
means that users can obtain new and combined information that would have been impossible
to obtain otherwise. Also, use of a DBMS should allow users who do not know programming
to interact with the data more easily. For example, clients/patients often forget their
service card when they go to the hospital or clinic. If there is an EMR (Electronic Medical
Record) system, the patient's medical record number (MRN) can be found by searching for the
patient's name in the database.

d) Enforcing Integrity Constraints

Because the data of an organization using a database approach is centralized and is
used by a number of users at a time, it is essential to enforce integrity controls.

Integrity may be compromised in many ways. For example, a patient may be shown to have
received a hospital service without ever having been registered. So enforcing patient
registration before any service is mandatory.

If a number of users are allowed to update the same data item at the same time, there is a
possibility that the result of the updates is not quite what was intended. For example, we
could have a situation where more ward assignments are made for patients than there are
available beds. Controls therefore must be introduced to prevent such errors from occurring
because of concurrent updating activities. However, since all data is stored only once,
it is often easier to maintain integrity than in conventional systems.
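
The rule that a patient must be registered before receiving any service can be expressed as a foreign key constraint. The sketch below assumes Python's sqlite3 module; the service table and its columns are hypothetical. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, which is a detail of SQLite and not of DBMSs in general.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.execute(
    "CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE service (
        service_id  INTEGER PRIMARY KEY,
        patient_id  INTEGER NOT NULL REFERENCES patient(patient_id),
        description TEXT
    )
""")

# A registered patient can receive a service.
conn.execute("INSERT INTO patient VALUES (32456, 'Alemu Bekele')")
conn.execute(
    "INSERT INTO service (patient_id, description) VALUES (32456, 'Consultation')"
)

# A service for an unregistered patient is rejected by the DBMS itself.
try:
    conn.execute(
        "INSERT INTO service (patient_id, description) VALUES (99999, 'Consultation')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

service_count = conn.execute("SELECT COUNT(*) FROM service").fetchone()[0]
print(rejected, service_count)
```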

e) Enforcing standards

Since all access to the database must be through the DBMS, standards are easier to enforce.
Standards may relate to the naming of the data, the format of the data, the structure of the
data etc.

f) Cost of developing and maintaining systems is lower

As noted earlier, it is much easier to respond to unforeseen requests when the data is
centralized in a database than when it is stored in conventional file systems. The initial
cost of setting up a database can be large. Even so, one normally expects the overall cost of
setting up a database and developing and maintaining application programs to be lower than
for a similar service using conventional systems, because programmers can be substantially
more productive using the non-procedural languages developed with modern DBMSs than using
procedural languages.

g) Flexibility of the system is improved

Changes are often necessary to the contents of data stored in any system. These changes are
more easily made in a database than in a conventional system in that these changes do not
need to have any impact on application programs.

h) Provide backup and recovery

A DBMS must provide facilities for recovering from hardware or software failures. The
backup and recovery subsystem of the DBMS is responsible for recovery. For example, if the
computer system fails in the middle of a complex update program, the recovery subsystem is
responsible for making sure that the database is restored to the state it was in before the
program started executing. Alternatively, the recovery subsystem could ensure that the
program is resumed from the point at which it was interrupted so that its full effect is
recorded in the database.
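
Restoring the state from before an interrupted update can be imitated on a small scale with a transaction rollback. The sketch assumes Python's sqlite3 module; the account table is hypothetical, and the failure is simulated with an exception, which is far simpler than a real hardware or software crash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (patient_id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (32456, 100)")
conn.commit()

try:
    # "with conn" commits if the block succeeds and rolls back if it fails.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance - 40 WHERE patient_id = 32456"
        )
        raise RuntimeError("simulated failure in the middle of the update")
except RuntimeError:
    pass

# The half-finished update was undone: the balance is what it was before.
balance = conn.execute(
    "SELECT balance FROM account WHERE patient_id = 32456"
).fetchone()[0]
print(balance)
```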

1.7 Basics of Database Architecture


The database architecture is the set of specifications, rules, and processes that dictate how
data is stored in a database and how data is accessed by components of a system. It includes
data types, relationships, and naming conventions. The database architecture describes the
organization of all database objects and how they work together. It affects integrity,
reliability, scalability, and performance. The database architecture involves anything that
defines the nature of the data, the structure of the data, or how the data flows.

The Three-Level Architecture

The goal of the three-schema architecture is to separate the user applications from the
physical database.

In any data model it is important to distinguish between the description of the database and
the database itself. The description of the database is called the database schema, which is
specified during database design. A displayed schema is called a schema diagram.

Figure 1-2: Schema diagram for a database

In this architecture, schemas can be defined at the following three levels:

1. The internal level has an internal schema, which describes the physical storage structure
of the database. The internal schema uses a physical data model and describes the
complete details of data storage and access paths for the database.

2. The conceptual level has a conceptual schema, which describes the structure of the
whole database for a community of users. The conceptual schema hides the details of
physical storage structures and concentrates on describing entities, data types,
relationships, user operations, and constraints. A high-level data model or an
implementation data model can be used at this level.

3. The external or view level includes a number of external schemas or user views. Each
external schema describes the part of the database that a particular user group is
interested in and hides the rest of the database from that user group. A high-level data
model or an implementation data model can be used at this level.

1.8 Actors on the Database Environment


For a small personal database, such as a list of addresses, one person typically defines,
constructs, and manipulates the database. However, many persons are involved in the design,
use, and maintenance of a large database with a few hundred users. In this section we
identify the people whose jobs involve the day-to-day use of a large database; we call them
“actors on the scene”.

1. Database Administrator

In any organization where many persons use the same resources, there is a need for a chief
administrator to oversee and manage these resources. In a database environment, the primary
resource is the database itself and the secondary resource is the DBMS and related software.
Administering these resources is the responsibility of the database administrator (DBA). The
database administrator is responsible for authorizing access to the database, for coordinating
and monitoring its use, and for acquiring software and hardware resources as needed. The
DBA is accountable for problems such as breaches of security or poor system response time. In
large organizations, the DBA is assisted by a staff that helps carry out these functions.

2. Database Designer

Database designers are responsible for identifying the data to be stored in the database and
for choosing appropriate structures to represent and store this data. These tasks are
mostly undertaken before the database is actually implemented and populated with data. It is
the responsibility of database designers to communicate with all prospective database users,
in order to understand their requirements, and to come up with a design that meets these
requirements. In many cases, the designers are on the staff of the DBA and may be assigned
other staff responsibilities after the database design is completed. Database designers
typically interact with each potential group of users and develop a view of the database that
meets the data and processing requirements of this group. These views are then analyzed and
integrated with the views of other user groups. The final database design must be capable of
supporting the requirements of all user groups.

3. End users

End users are the people whose jobs require access to the database for querying, updating,
and generating reports; the database primarily exists for their use. There are several
categories of end users:

o Casual end users occasionally access the database, but they may need different
information each time. They use a sophisticated database query language to specify
their requests and are typically middle or high-level managers or other occasional
browsers.

o Naive or parametric end users make up a sizeable portion of database end users. Their
main job function revolves around constantly querying and updating the database,
using standard types of queries and updates, called canned transactions, that have been
carefully programmed and tested.

o Sophisticated end users include engineers, scientists, business analysts, and others who
thoroughly familiarize themselves with the facilities of the DBMS so as to implement
their applications to meet their complex requirements.

o Stand-alone users maintain personal databases by using ready-made program packages
that provide easy-to-use menu-based or graphics-based interfaces.

A typical DBMS provides multiple facilities to access a database. Naive end users need to
learn very little about the facilities provided by the DBMS; they have to understand only the types
of standard transactions designed and implemented for their use. Casual users learn only a few
facilities that they may use repeatedly. Sophisticated users try to learn most of the DBMS
facilities in order to achieve their complex requirements. Stand-alone users typically become
very proficient in using a specific software package.

4. System analysts and application programmers (Software Engineers)

System analysts determine the requirements of end users, especially naive and parametric
end users, and develop specifications for canned transactions that meet these requirements.
Application programmers implement these specifications as programs; then they test,
debug, document, and maintain these canned transactions. Such analysts and programmers
(nowadays called software engineers) should be familiar with the full range of capabilities
provided by the DBMS to accomplish their tasks.

Activities/exercise
Give the correct answer for each of the following questions

1. Define the following terms


a. Data
b. Information
c. Database
d. Metadata

2. List and discuss the differences between the file system and the database approach.
3. What are the advantages of using a DBMS?
4. What are the different kinds of database architectures?
5. Identify the role players in the database environment.

Topic 2: Design, Produce and Deploy Relational Database

2.1 Introduction

Prior to the actual implementation of a database, a good foundation in design is
essential. It is important to know the basic steps that are needed in database design.
In practice, several database models are available in the industry. The relational
model is more widely used than any other model. Once a model is chosen, the data
has to be organized into tables. The tables in the database have to be
normalized in order to avoid anomalies. These and other issues are discussed in this topic.

2.2 Learning Objectives

At the end of this topic the student should be able to:


• Identify requirement collection and analysis procedures
• Identify functional requirements
• Define entities, attributes and domain
• Explain the types of relationships
• Construct entity-relationship diagram
• Normalize entity-relationships
• Map E-R diagrams to tables

2.3 Designing a Database

Database design is the process of analyzing a problem that needs to be solved, developing the
logical model of the data available to solve that problem, grouping the items of data into
related tables, and finally committing the design to the database software system that will be
used for the actual database.

The goals of database design are multiple:

• Satisfy the information content requirements of the specified users and applications.

• Provide a natural and easy-to-understand structuring of the information.

• Support processing requirements and any performance objectives, such as response
time, processing time, and storage space.

Steps in Database Design

The process of database design is divided into different phases and consists of a series of
steps. The main phases are

1. Requirements collection and analysis.

2. Conceptual database design.

3. Choice of a DBMS.

4. Data model mapping (also called logical database design).

5. Physical database design.

6. Database system implementation and tuning.

Figure 2-1: Database design process

Phase 1: Requirement Collection and Analysis

Before we can effectively design a database, we must know and analyze the expectations of
the users and the intended uses of the database in as much detail as possible.

This process is called requirements collection and analysis. To specify the requirements, we
first identify the other parts of the information system that will interact with the database
system. These include new and existing users and applications, whose requirements are then
collected and analyzed. Typically, the following activities are part of this phase:

i. The major application areas and user groups that will use the database or whose work
will be affected by it are identified.

ii. Existing documentation concerning the applications is studied and analyzed.

iii. The current operating environment and planned use of the information are studied.

iv. Written responses to sets of questions are sometimes collected from the
potential database users or user groups.

Requirement analysis is carried out for the final users, or customers, of the database system
by a team of system analysts or requirement experts. The initial requirements are likely to be
informal, incomplete, inconsistent, and partially incorrect. Therefore, much work needs to be
done to transform these early requirements into a specification of the application that can be
used by developers and testers.

There is evidence that customer participation in the development process increases customer
satisfaction with the delivered system. For this reason, many practitioners use meetings and
workshops involving all stakeholders.

The requirements collection and analysis phase can be quite time-consuming, but it is crucial
to the success of the information system.

Phase 2: Conceptual Database Design


The second phase of database design involves two parallel activities. The first activity,
conceptual schema design, examines the data requirements resulting from Phase 1 and produces
a conceptual database schema. The second activity, transaction and application design,
examines the database applications analyzed in Phase 1 and produces high-level specifications
for these applications.

Phase 2a: Conceptual Schema Design


The purpose of this process is to produce a conceptual schema of the database:
• expressed using concepts of the high-level data model
    o not including implementation details (has to be understood by non-technical users)
    o but detailed in terms of the “objects” of the domain the database will represent
• independent of the DBMS to be used (no relational DB-oriented notions!)
• cannot be used directly to implement the database
• design is made in terms of a semantic or conceptual data model
• the goal is to achieve understanding of database structure, semantics, interrelationships
and constraints

Phase 2b: Transaction Design.


The purpose of Phase 2b, which proceeds in parallel with Phase 2a, is to design the
characteristics of known database transactions (applications) in a DBMS-independent way.
When a database system is being designed, the designers are aware of many known applications
(or transactions) that will run on the database once it is implemented. An important part of
database design is to specify the functional characteristics of these transactions early on in the
design process. This ensures that the database schema will include all the information required
by these transactions. In addition, knowing the relative importance of the various transactions
and the expected rates of their invocation plays a crucial part during the physical database design
(Phase 5).

Transactions usually can be grouped into three categories: (1) retrieval transactions, which are
used to retrieve data for display on a screen or for printing of a report; (2) update transactions,
which are used to enter new data or to modify existing data in the database; and (3) mixed
transactions, which are used for more complex applications that do some retrieval and some
update.
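
The three categories can be illustrated with a small sketch. It uses Python's built-in sqlite3 module and a hypothetical patient table; the table, its columns (including the visits counter) and the function names are assumptions made only for illustration and are not part of the design developed in this module.

```python
import sqlite3

def make_db():
    """Create an in-memory database with a hypothetical patient table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE patient (mrn INTEGER PRIMARY KEY, name TEXT, visits INTEGER)"
    )
    conn.execute("INSERT INTO patient VALUES (5648, 'Assefa Kebede', 1)")
    conn.commit()
    return conn

def get_patient_name(conn, mrn):
    """Retrieval transaction: reads data and changes nothing."""
    row = conn.execute("SELECT name FROM patient WHERE mrn = ?", (mrn,)).fetchone()
    return row[0] if row else None

def register_patient(conn, mrn, name):
    """Update transaction: enters new data."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO patient VALUES (?, ?, 0)", (mrn, name))

def record_visit(conn, mrn):
    """Mixed transaction: retrieves the current count, then updates it."""
    with conn:
        (visits,) = conn.execute(
            "SELECT visits FROM patient WHERE mrn = ?", (mrn,)
        ).fetchone()
        conn.execute(
            "UPDATE patient SET visits = ? WHERE mrn = ?", (visits + 1, mrn)
        )
    return visits + 1
```

Knowing early on that a transaction such as record_visit exists tells the designer that the schema must hold a visit count (or something from which it can be computed).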

Phase 3: Choice of a DBMS
The choice of a DBMS is governed by a number of factors - some technical, others economic,
and still others concerned with the activities and rules of the organization. The technical factors
focus on the suitability of the DBMS for the task at hand. Issues to consider are the type of
DBMS (relational, object-relational, object, other), the storage structures and access paths that
the DBMS supports, the user and programmer interfaces available, the types of high-level query
languages, the availability of development tools, the ability to interface with other DBMSs via
standard interfaces, the architectural options related to client-server operation, and so on.
Nontechnical factors include the financial status and the support organization of the vendor.

Phase 4: Data Model Mapping (Logical Database Design)


The purpose of this phase is to transform the generic, DBMS-independent conceptual schema into
the data model of the chosen DBMS (data model mapping). The mapping can proceed in two
stages:

1. System-independent mapping: no consideration is given to any specific characteristics that
may apply to the specific DBMS package.

2. Tailoring to the DBMS: different DBMSs may implement the same data model in slightly
different ways.

The result of this phase should be DDL (data definition language) statements in the language of
the chosen DBMS that specify the conceptual and external level schemas of the database system.
But if the DDL statements include some physical design parameters, a complete DDL
specification must wait until after the physical database design phase is completed.
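
As a rough illustration of what such DDL statements can look like, the sketch below defines two tables (conceptual level) and one view (external level) for part of the HOSPITAL example. SQLite, accessed through Python's sqlite3 module, is assumed as the chosen DBMS purely for convenience; the table, column and view names and the column types are assumptions, not a prescribed design.

```python
import sqlite3

# Hypothetical DDL for part of the HOSPITAL example (names and types are assumed).
DDL = """
CREATE TABLE department (
    dept_code INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL UNIQUE
);
CREATE TABLE doctor (
    doctor_id      INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    specialization TEXT,
    dept_code      INTEGER NOT NULL REFERENCES department(dept_code)
);
-- External-level schema: a user view that shows only names and departments.
CREATE VIEW doctor_directory AS
    SELECT d.name AS doctor_name, p.dept_name
    FROM doctor AS d JOIN department AS p ON p.dept_code = d.dept_code;
"""

def create_schema():
    """Run the DDL against an empty in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn

def list_tables_and_views(conn):
    """Return the (type, name) pairs that the DDL created."""
    return conn.execute(
        "SELECT type, name FROM sqlite_master "
        "WHERE type IN ('table', 'view') ORDER BY name"
    ).fetchall()
```

Physical design parameters, such as indexes, are deliberately left out here; they would be added after Phase 5.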

Phase 5: Physical Database Design


Physical database design is the process of choosing specific file storage structures and access
paths for the database files to achieve good performance for the various database applications.
Each DBMS offers a variety of options for file organizations and access paths. Once a specific
DBMS is chosen, the physical database design process is restricted to choosing the most
appropriate structures for the database files from among the options offered by that DBMS. In
this section we give generic guidelines for physical design decisions; they hold for any type of
DBMS. The following criteria are often used to guide the choice of physical database design
options:

1. Response time. This is the elapsed time between submitting a database transaction for
execution and receiving a response.
2. Space utilization. This is the amount of storage space used by the database files and their
access path structures on disk, including indexes and other access paths.
3. Transaction throughput. This is the average number of transactions that can be processed
per minute; it is a critical parameter of transaction systems such as those used for airline
reservations or banking. Transaction throughput must be measured under peak conditions on the
system.
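
The three criteria can be measured with very little code. The sketch below is only a minimal, single-user illustration using Python's sqlite3 module and a hypothetical visit table; it does not reproduce peak conditions, which would require many concurrent users on the real system.

```python
import sqlite3
import time

def measure(n_transactions=200):
    """Time n small insert transactions and report the three criteria."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visit (id INTEGER PRIMARY KEY, mrn INTEGER)")

    response_times = []
    start = time.perf_counter()
    for i in range(n_transactions):
        t0 = time.perf_counter()
        with conn:  # one transaction per insert
            conn.execute("INSERT INTO visit (mrn) VALUES (?)", (i,))
        response_times.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    # Space used = number of database pages * size of one page.
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]

    return {
        "avg_response_s": sum(response_times) / n_transactions,
        "throughput_per_min": n_transactions / elapsed * 60,
        "space_bytes": page_count * page_size,
    }
```

Calling measure() returns a dictionary with the average response time in seconds, the throughput in transactions per minute, and the space used in bytes; the actual numbers depend on the machine.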

Phase 6: Database System Implementation and Tuning


After the logical and physical designs are completed, we can implement the database system.
This is typically the responsibility of the DBA and is carried out in conjunction with the database
designers. Language statements in the DDL are compiled and used to create the database
schemas and (empty) database files. The database can then be loaded (populated) with the data.
If data is to be converted from an earlier computerized system, conversion routines may be
needed to reformat the data for loading into the new database.

Database programs are implemented by the application programmers, by referring to the
conceptual specifications of transactions, and then writing and testing program code with
embedded DML (data manipulation language) commands. Once the transactions are ready and
the data is loaded into the database, the design and implementation phase is over and the
operational phase of the database system begins.
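
A conversion routine and embedded DML can be sketched as follows. The legacy export format (semicolon-separated, dates as dd/mm/yyyy), the birth dates, the patient table layout and the automatically assigned MRN are all made up for illustration; a real conversion depends entirely on the earlier system.

```python
import csv
import io
import sqlite3

# Hypothetical export from an earlier system: name;sex;dd/mm/yyyy
LEGACY_EXPORT = "Abera Hailu;m;05/03/1990\n Meaza Birru ;F;17/11/1985\n"

def convert_row(row):
    """Conversion routine: reformat one legacy row for the new schema."""
    name, sex, dob = row
    day, month, year = dob.split("/")
    return name.strip(), sex.upper(), f"{year}-{month}-{day}"

def load_legacy(text):
    """Create the (empty) table, then populate it with converted rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE patient ("
        "mrn INTEGER PRIMARY KEY, name TEXT, sex TEXT, birth_date TEXT)"
    )
    reader = csv.reader(io.StringIO(text), delimiter=";")
    with conn:  # load everything in one transaction
        # Embedded DML: an INSERT statement issued from program code.
        conn.executemany(
            "INSERT INTO patient (name, sex, birth_date) VALUES (?, ?, ?)",
            (convert_row(row) for row in reader),
        )
    return conn
```

Passing the values as parameters (the ? placeholders) rather than building the SQL text by hand keeps the loaded data from being interpreted as SQL.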

2.4 Relational Model (ER-modeling)

An example Database Application


In this module we use a patient management system to illustrate the concepts that
follow. We concentrate only on the development process of the database
application. For illustration purposes, only some of the major entities in the hospital system are
discussed. We list the data requirements of the database here, and we create its conceptual
schema step-by-step as we introduce the modeling concepts of the ER model. The database,
called HOSPITAL, keeps track of information about the hospital's departments, doctors, beds and
patients. The hospital provides inpatient and outpatient services for its clients.

• The hospital is organized into departments. Each department has a unique name, a unique
number, and a particular employee who manages the department. The hospital provides
IPD (Inpatient Department) and OPD (Outpatient Department) services.
• We store each doctor's name, ID number, office phone, salary, sex and specialization. A
doctor is assigned to one department, takes care of patients, and maintains and
arranges the clinical information.
• A patient being treated in the hospital is given an MRN (Medical Record Number) during
registration, and his/her name, address, date of birth and sex are stored in the database.
• For a patient assigned to a bed, the room number, the type of the room,
the price, and the dates of admission and discharge are stored in the database.

The entity-relationship model (or ER model) is a way of graphically representing the logical
relationships of entities (or objects) in order to create a database. In ER modeling, the structure
for a database is portrayed as a diagram, called an entity-relationship diagram (or ER
diagram), that resembles the graphical breakdown of a sentence into its grammatical parts.

2.4.1 Identifying Entities


The basic object that the ER model represents is called an entity, which is a "thing" in the real
world with an independent existence. An entity may be an object with a physical existence—a
particular person, car, house, or employee, doctor, patient—or it may be an object with a
conceptual existence—a company, a job, or a hospital. More concisely an entity is defined as any
object in the system that we want to model and store information about. Entities are represented
by rectangles (with either round or square corners), as shown in Figure 2-2.

Figure 2-2: Entity notations

2.4.2 Attributes and Domain

An attribute is a property that describes an entity. In the above example, the patient is the entity
and MRN, name, address, sex, etc. are attributes of the patient.
A specific entity will have a value for each of its attributes. For example, a specific patient entity
may have Name='Abera Hailu', MRN='5846', Sex='M', Age=25, Address='Bole sub city,
Addis Ababa, Ethiopia'.

Figure 2-3: Defining attribute for an entity

An attribute can have simple, composite or multi-valued values. For example:


• Simple
– Each entity has a single atomic value for the attribute. For example, MRN
(Medical Record Number) or Sex.
• Composite
– The attribute may be composed of several components. For example, Name
(FirstName, MiddleName, LastName). Composition may form a hierarchy
where some components are themselves composite.
• Multi-valued
– An entity may have multiple values for that attribute. For example, the diseases of
a patient or the previous workplaces of a doctor. Denoted as {Diseases} or
{PreviousWorkAreas}.
• Stored Versus Derived Attributes
– In some cases two (or more) attribute values are related, for example, the
Age and BirthDate attributes of a person. For a particular person entity, the
value of Age can be determined from the current (today’s) date and the value
of that person’s BirthDate. The Age attribute is hence called a derived
attribute and is said to be derivable from the BirthDate attribute, which is
called a stored attribute. Some attribute values can be derived from related
entities; for example, an attribute NumberOfEmployees of a department
entity can be derived by counting the number of employees related to
(working for) that department.

• Null Values
– In some cases a particular entity may not have an applicable value for an
attribute. For example, the Apartment Number attribute of an address applies
only to addresses that are in apartment buildings and not to other types of
residences, such as single-family homes. Similarly, a College Degrees
attribute applies only to persons with college degrees. For such situations, a
special value called null is created. An address of a single-family home would
have null for its Apartment Number attribute, and a person with no college
degree would have null for College Degrees. Null can also be used if we do
not know the value of an attribute for a particular entity, for example, if we
do not know the home phone of "Gadissa Muleta". The meaning of the former
type of null is not applicable, whereas the meaning of the latter is unknown.
The unknown category of null can be further classified into two cases. The
first case arises when it is known that the attribute value exists but is
missing, for example, if the Height attribute of a person is listed as null. The
second case arises when it is not known whether the attribute value exists, for
example, if the Home Phone attribute of a person is null.

Nulls can have multiple interpretations, such as the following:
• The attribute does not apply to this tuple.
• The attribute value for this tuple is unknown.
• The value is known but absent; that is, it has not been recorded yet.

Having the same representation for all nulls compromises the different meanings they
may have.
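
The difference between stored and derived attributes, and the effect of a null, can be sketched in a few lines of Python. The birth date used here is invented for illustration, and None is used to stand for null; the function name is an assumption.

```python
from datetime import date

def derive_age(birth_date_iso, today_iso):
    """Derive Age from the stored BirthDate (dates given as YYYY-MM-DD strings).

    If the birth date is null (None), the derived age is null as well.
    """
    if birth_date_iso is None:
        return None
    born = date.fromisoformat(birth_date_iso)
    today = date.fromisoformat(today_iso)
    # Subtract one year if this year's birthday has not been reached yet.
    had_birthday = (today.month, today.day) >= (born.month, born.day)
    return today.year - born.year - (0 if had_birthday else 1)

# Hypothetical record: BirthDate is stored, Age is not; the home phone is unknown.
patient = {"name": "Gadissa Muleta", "birth_date": "1990-03-05", "home_phone": None}
age = derive_age(patient["birth_date"], "2024-03-04")  # the day before a birthday
```

Because Age is computed when needed, it never goes out of date, whereas a stored Age would have to be updated every year. Note also that None alone does not say which interpretation of null applies; an application that needs the distinction has to record it separately.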

Domain

Each attribute has a domain, an expression of the allowable values for that attribute. It is a set of
atomic values. By atomic we mean that each value in the domain is indivisible as far as the
relational model is concerned. A common method of specifying a domain is to specify a data
type from which the data values forming the domain are drawn. It is also useful to specify a
name for the domain, to help in interpreting its values. Some examples of domains are shown
below:

Office phone: The set of 10-digit valid phone numbers in Ethiopia

Age: A value between 0 and 125

Sex: Male or Female
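
In a relational DBMS, domains like these are commonly approximated with a data type plus a CHECK constraint. The following is a minimal sketch using SQLite through Python's sqlite3 module; the person table and its column names are assumptions, the phone number in the usage note is made up, and the phone rule checks only "exactly 10 digits", not whether the number is actually valid in Ethiopia.

```python
import sqlite3

def make_person_table():
    """Create a hypothetical table whose columns enforce the three domains."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE person (
            name         TEXT NOT NULL,
            office_phone TEXT CHECK (length(office_phone) = 10
                                     AND office_phone NOT GLOB '*[^0-9]*'),
            age          INTEGER CHECK (age BETWEEN 0 AND 125),
            sex          TEXT CHECK (sex IN ('Male', 'Female'))
        )""")
    return conn

def try_insert(conn, name, phone, age, sex):
    """Return True if the row satisfies every domain, False otherwise."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO person VALUES (?, ?, ?, ?)", (name, phone, age, sex)
            )
        return True
    except sqlite3.IntegrityError:  # raised when a CHECK constraint fails
        return False
```

For example, try_insert(conn, "Abera Hailu", "0911234567", 25, "Male") is accepted, while an age of 130 or a sex value of "Unknown" is rejected by the DBMS itself rather than by application code.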

2.5 Relationships

A relationship relates two or more distinct entities with a specific meaning. For example
DOCTOR Assefa Belay treats the patient Meaza Birru. The relationship between the two entities
is “treats”.

An easy way to decide whether an object should be an entity or a relationship is to map nouns in
the requirements to entities, and to map the verbs to relationships. For example, in the statement
“DOCTOR Assefa Belay treats the patient Meaza Birru” we can identify the entities “Doctor”
and “Patient,” and the relationship “treats”.

An entity-relationship (ER) diagram is a specialized graphic that illustrates the relationships between
entities in a database. ER diagrams often use symbols to represent three different types of
information. As mentioned before, in Chen’s ER notation rectangles are commonly used to
represent entities, diamonds are normally used to represent relationships, and ovals are used to
represent attributes.

Figure 2-4: A relationship between Doctor and Patient entities using diamonds

In place of a diamond, we can also use a straight line to represent a relationship.

Figure 2-4: A relationship between Doctor and Patient entities

A number of data modeling techniques are being used today. One of the most common is
the entity relationship diagram (ERD). Several ERD notations are available to show the
cardinality between the entities. In this module we will be using Crow’s Foot Notation.

Components used in the creation of an ERD

Entity - A person, place or thing about which we want to collect and store multiple instances of
data. It has a name, which is a noun, and attributes which describe the data we are interested in
storing.

Relationship - Illustrates an association between two entities. It has a name, which is a verb. It
also has cardinality and modality.

Cardinality - Defines the number of occurrences of one entity for a single occurrence of the
related entity. E.g. a doctor can treat one or more patients per day.

Types of relationships

There are several types of database relationships.

A) One-to-one relationships - occur when there is exactly one record in Table A that
corresponds to exactly one record in Table B. One-to-one relationships are single-valued
in both directions. A patient being treated in a hospital will have only one bed, so the
relationship between the patient and the bed tables is a one-to-one (1:1) relationship.

Figure 2-5: 1 to 1 relationship between Patient and Bed

B) One-to-many or many-to-one relationships - are the most common type of relationship.
They occur when each record in Table A may have many linked records in Table B but each
record in Table B may have only one corresponding record in Table A. For example, a
given hospital department can have many doctors working under it, and usually one
doctor works for a particular department. This is a one-to-many (1:n) relationship.

Figure 2-6: A many to 1 relationship between Patient and Bed

C) Many-to-many relationships - occur when each record in Table A may have many
linked records in Table B and vice versa. You create such a relationship by defining a
third table, called a junction table, whose primary key consists of the foreign keys from
both Table A and Table B.

Figure 2-7: A many to many relationship between Patient and Bed

For example, a given doctor can treat many patients and the same patient can
be treated by different doctors, so the relationship between doctor and patient is many-to-many (m:n).
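
A junction table for the doctor-patient relationship can be sketched as follows, using SQLite through Python's sqlite3 module. The table name treats, the column names and the helper functions are assumptions made for illustration; the sample rows are taken from the hospital data used in the normalization example of this topic.

```python
import sqlite3

def build():
    """Create doctor, patient and the junction table linking them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE doctor  (doctor_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE patient (mrn INTEGER PRIMARY KEY, name TEXT);
        -- Junction table: its primary key is made of the two foreign keys.
        CREATE TABLE treats (
            doctor_id INTEGER REFERENCES doctor(doctor_id),
            mrn       INTEGER REFERENCES patient(mrn),
            PRIMARY KEY (doctor_id, mrn)
        );
        INSERT INTO doctor  VALUES (23, 'Bekele Hailu'), (67, 'Hassen Ali');
        INSERT INTO patient VALUES (5648, 'Assefa Kebede'), (5643, 'Yeshi Mola');
        INSERT INTO treats  VALUES (23, 5648), (67, 5648), (67, 5643);
    """)
    return conn

def doctors_of(conn, mrn):
    """All doctors who treat one patient."""
    rows = conn.execute(
        "SELECT d.name FROM treats t JOIN doctor d ON d.doctor_id = t.doctor_id "
        "WHERE t.mrn = ? ORDER BY d.name", (mrn,)).fetchall()
    return [name for (name,) in rows]

def patients_of(conn, doctor_id):
    """All patients treated by one doctor."""
    rows = conn.execute(
        "SELECT p.name FROM treats t JOIN patient p ON p.mrn = t.mrn "
        "WHERE t.doctor_id = ? ORDER BY p.name", (doctor_id,)).fetchall()
    return [name for (name,) in rows]
```

Each row of treats records one doctor-patient pair, so the same doctor and the same patient can each appear in many rows, while the composite primary key prevents the same pair from being recorded twice.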

Relational keys
There are many types of keys that can be defined in the relational model. These have significant
importance in maintaining data consistency and correctness in the database.

Candidate Key
A candidate key is one or more attributes that uniquely identify an entity. Every entity
in a relational database must have at least one candidate key, but it is possible that some may have
two or more. For example, the MRN and the name of the patient may each identify the patient. Therefore
MRN and name can be considered as candidate keys for a patient table.

Primary Key
A primary key is an attribute, or set of attributes, that allows each instance of an entity to be
uniquely identified. Every entity in a relational database must have a primary key. For example,
if a patient entity has attributes such as MRN, name, address, date of birth and sex, then MRN
can be used as the primary key. As mentioned above, MRN and name are candidate keys for the
patient table, but two patients may share the same name, so name cannot uniquely identify all the
patients, whereas MRN is unique for all patients. Therefore MRN is the primary key for the patient table.

Foreign Key
Entities are related to each other through foreign keys. A foreign key is an attribute of one
entity that references the primary key of another entity. For example, a Patient
entity with MRN as its primary key and a Doctor entity with Doctor ID as its
primary key can be related to each other through MRN. In that case, MRN
is a foreign key in the Doctor entity, whereas it is the primary key of the Patient
entity.
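
How a DBMS uses primary and foreign keys to keep data consistent can be seen in a short sketch. For simplicity it uses the doctor-to-department assignment from the HOSPITAL requirements rather than the MRN example above; SQLite (via Python's sqlite3 module) is assumed, and the table, column and function names are illustrative assumptions.

```python
import sqlite3

def build_hospital():
    """Create department and doctor tables linked by a foreign key."""
    conn = sqlite3.connect(":memory:")
    # SQLite checks foreign keys only when this setting is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE department (
            dept_code INTEGER PRIMARY KEY,
            dept_name TEXT NOT NULL
        );
        CREATE TABLE doctor (
            doctor_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL,
            dept_code INTEGER NOT NULL REFERENCES department(dept_code)
        );
        INSERT INTO department VALUES (256, 'Surgery');
    """)
    return conn

def add_doctor(conn, doctor_id, name, dept_code):
    """Return True if the row is accepted, False if a key rule is violated."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO doctor VALUES (?, ?, ?)", (doctor_id, name, dept_code)
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this schema, adding doctor 67 to department 256 succeeds; adding a doctor to a non-existent department fails because of the foreign key, and adding a second doctor with ID 67 fails because of the primary key.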

Degree of relationships

The degree of a relationship refers to the number of entities associated
with the relationship. In general, relationships are categorized as follows:

A binary relationship, the association between two entities, is the most common type in the real
world. An example might be “some patients are treated by doctors”. A recursive binary relationship
occurs when an entity is related to itself.

A ternary relationship involves three entities and is used when a binary relationship is
inadequate. Many modeling approaches recognize only binary relationships. Ternary or n-ary
relationships are decomposed into two or more binary relationships.

The following are the degrees of relationships:

1. Single entity: Unary
2. Two entities: Binary
3. Three entities: Ternary
4. N entities: N-ary

The E-R diagram for the example discussed so far is shown below.
One physician treats many patients, and many physicians work in a given department. The
attributes of each of the entities are shown in the diagram.

Figure 2-7: E-R diagram for patient to doctor relation

2.6 Mapping E-R diagrams to Tables

To map E-R diagrams to Tables follow the steps shown below.

• For each entity type E in the ER schema, create a relation R (Table) that includes all the
simple attributes of E.
• Choose one of the key attributes of E as the primary key for R.
• If the chosen key of E is composite, the set of simple attributes that form it will together
form the primary key of R.
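
The three steps above can be sketched as a small Python function that turns a description of an entity type into a CREATE TABLE statement. The EntityType class, the function name, the bed_assignment entity and the SQL column types are assumptions made for illustration; a composite attribute such as Name contributes only its simple components.

```python
from dataclasses import dataclass

@dataclass
class EntityType:
    name: str
    simple_attributes: dict  # attribute name -> SQL type (assumed types)
    key: tuple               # one attribute, or several for a composite key

def to_create_table(entity):
    """Map an entity type E to a relation R with its primary key."""
    columns = [
        f"{attr} {sql_type}" for attr, sql_type in entity.simple_attributes.items()
    ]
    # A composite key simply lists all of its simple attributes together.
    columns.append(f"PRIMARY KEY ({', '.join(entity.key)})")
    return f"CREATE TABLE {entity.name} ({', '.join(columns)})"

# The composite attribute Name is replaced by its simple components.
patient = EntityType(
    "patient",
    {"mrn": "INTEGER", "first_name": "TEXT", "middle_name": "TEXT",
     "last_name": "TEXT", "sex": "TEXT"},
    ("mrn",),
)

# A hypothetical entity with a composite key.
bed_assignment = EntityType(
    "bed_assignment",
    {"mrn": "INTEGER", "admission_date": "TEXT", "room_number": "INTEGER"},
    ("mrn", "admission_date"),
)
```

Relationships between entities are not handled by this sketch; they are mapped separately, for example by adding foreign keys or junction tables.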

The relational model is made up of tables. The terminology used for tables and the
corresponding terminology of the relational model can be summarized as follows:

Tables                 Relational model
A row of a table       A relational instance/tuple
A column of a table    An attribute
A table                A schema/relation
Number of rows         Cardinality
Number of columns      Degree

Diagrammatically it can be expressed as shown in the figure below.

Figure 2-7: E-R diagram for patient to doctor relation

2.7 Normalization

Normalization is the process of organizing the fields and tables of a relational database to
minimize redundancy and dependency.

Normalization usually involves dividing large tables into smaller (less redundant) tables and
defining relationships between them. The objective is to isolate data so that insertions,
deletions, and modifications (updating) of a field can be made in just one table and then
propagated through the rest of the database via the defined relationships.

Redundant data wastes disk space and creates maintenance problems. If data that exists in more
than one place must be changed, the data must be changed in exactly the same way in all
locations. A Patient address change is much easier to implement if that data is stored only in the
Patient table and nowhere else in the database.
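
The maintenance problem can be demonstrated with plain Python data structures. In this sketch the field names and the new address ("Yeka sub city") are made up for illustration; the first structure repeats the address in every treatment row, the second stores it once.

```python
def addresses_on_file(rows, mrn):
    """All distinct addresses stored for one patient."""
    return {row["address"] for row in rows if row["mrn"] == mrn}

# Redundant design: the address is repeated in every treatment row.
redundant = [
    {"mrn": 5648, "address": "Bole sub city", "treatment": "IV (intravenous) drug"},
    {"mrn": 5648, "address": "Bole sub city", "treatment": "Laparatomy"},
]
redundant[0]["address"] = "Yeka sub city"   # an update that misses the second copy
inconsistent = addresses_on_file(redundant, 5648)  # now holds two different addresses

# Normalized design: the address is stored once, in the patient data only.
patient = {5648: {"name": "Assefa Kebede", "address": "Bole sub city"}}
treatment = [
    {"mrn": 5648, "treatment": "IV (intravenous) drug"},
    {"mrn": 5648, "treatment": "Laparatomy"},
]
patient[5648]["address"] = "Yeka sub city"  # one change, nothing else to update
```

In the redundant design the database now contradicts itself about where the patient lives; in the normalized design that situation cannot arise.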

What is an "inconsistent dependency"? While it is appropriate for a user to look in the Patient
table for the address of a particular patient, it may not make sense to look there for the
specialization of the doctor who treated that patient. The doctor's specialization is related to, or
dependent on, the doctor and thus should be moved to the Doctor table. Inconsistent
dependencies can make data difficult to access because the path to find the data may be missing
or broken.

There are a few rules for database normalization. Each rule is called a "normal form." If the first
rule is observed, the database is said to be in "first normal form." If the first two rules are observed,
the database is considered to be in "second normal form," and so on. Although other levels of
normalization are possible, third normal form is considered the highest level necessary for most
applications.

In this module we continue with the hospital example introduced earlier. Consider
the following table, which holds information about patient treatment in a hospital and involves
three entities. The table is in unnormalized form and needs to be normalized; we follow the
normalization rules step by step with this example to facilitate understanding.

Table: Patient Registration System, Unnormalized Form

MRN   Patient Name    Sex  Age  Dr. ID  Dr. Name      Dept Code  Dept Name           Treatment Given
5648  Assefa Kebede   M    45   23      Bekele Hailu  234        Internal Medicine   IV (intravenous) drug
5643  Yeshi Mola      F    36   67      Hassen Ali    256        Surgery             Appendectomy
6712  Zemzem Jemal    F    23   45      Mule Belay    236        Gyane & Obstetrics  Chemotherapy
1580  Hayat Kedir     F    10   23      Shifa Seid    289        Pediatrics          Oral Antibiotic
5648  Assefa Kebede   M    45   67      Hassen Ali    256        Surgery             Laparatomy
1580  Hayat Kedir     F    10   24      Saba Tariku   236        Dentistry           Tooth Extraction

First Normal Form (1NF)

To transform a table of unnormalized data into first normal form (1NF), remove any repeating
attributes to a new table. A repeating attribute is a data field within the UNF relation that may
occur with multiple values for a single value of the key. The process is as follows:

• Identify repeating attributes.

o After removing the duplicate data the repeating attributes are easily identified.

• Remove these repeating attributes to a new table together with a copy of the key from the
UNF (unnormalized form) table.

o In the previous table the Dr. ID, Dr. Name, Dept Code, Dept Name and Treatment
Given attributes are repeating. That is, there is a potential for more than one
occurrence of these attributes for each MRN. These repeating attributes
have been moved to a new table together with a copy of the original key (i.e. MRN).

• Assign a key to the new table (and underline it). The key from the original unnormalized
table always becomes part of the key of the new table. A compound key is created. The
value for this key must be unique for each entity occurrence.

o A key of MRN and Dr. ID has been defined for this new table. This combination is
unique for each row in the table.

a)
MRN   Patient Name    Sex  Age
5648  Assefa Kebede   M    45
5643  Yeshi Mola      F    36
6712  Zemzem Jemal    F    23
1580  Hayat Kedir     F    10
5648  Assefa Kebede   M    45
1580  Hayat Kedir     F    10

b)
MRN   Dr. ID  Dr. Name      Dept Code  Dept Name           Treatment Given
5648  23      Bekele Hailu  234        Internal Medicine   IV (intravenous) drug
5643  67      Hassen Ali    256        Surgery             Appendectomy
6712  45      Mule Belay    236        Gyane & Obstetrics  Chemotherapy
1580  23      Shifa Seid    289        Pediatrics          Oral Antibiotic
5648  67      Hassen Ali    256        Surgery             Laparatomy
1580  24      Saba Tariku   236        Dentistry           Tooth Extraction
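
The split into tables a) and b) can also be expressed as a short Python sketch. The rows are the six rows of the unnormalized table; the function name and tuple layout are assumptions. Unlike table a) as printed, the sketch keeps each patient only once, following the "remove the duplicate data" step.

```python
UNF = [
    # (MRN, patient, sex, age, Dr. ID, Dr. name, dept code, dept name, treatment)
    (5648, "Assefa Kebede", "M", 45, 23, "Bekele Hailu", 234, "Internal Medicine", "IV (intravenous) drug"),
    (5643, "Yeshi Mola", "F", 36, 67, "Hassen Ali", 256, "Surgery", "Appendectomy"),
    (6712, "Zemzem Jemal", "F", 23, 45, "Mule Belay", 236, "Gyane & Obstetrics", "Chemotherapy"),
    (1580, "Hayat Kedir", "F", 10, 23, "Shifa Seid", 289, "Pediatrics", "Oral Antibiotic"),
    (5648, "Assefa Kebede", "M", 45, 67, "Hassen Ali", 256, "Surgery", "Laparatomy"),
    (1580, "Hayat Kedir", "F", 10, 24, "Saba Tariku", 236, "Dentistry", "Tooth Extraction"),
]

def to_1nf(rows):
    """Split the UNF rows into a patient table and a treatment table."""
    # Table a): the non-repeating attributes, one row per MRN.
    patients = sorted({row[:4] for row in rows})
    # Table b): the repeating attributes plus a copy of the original key (MRN).
    treatments = [(row[0],) + row[4:] for row in rows]
    return patients, treatments

patients, treatments = to_1nf(UNF)
# The compound key (MRN, Dr. ID) must be unique in table b).
compound_keys = {(t[0], t[1]) for t in treatments}
```

Here patients has four rows, treatments has six, and all six (MRN, Dr. ID) pairs are different, which is what allows the pair to serve as the key of the new table.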

Second Normal Form (2NF)

The next step is transforming the data in first normal form (1NF) into second normal form
(2NF). The rule is: remove any non-key attributes that only depend on part of the table key to a
new table. Ignore tables with (a) a simple key or (b) with no non-key attributes. These go straight
to 2NF with no conversion. The process is as follows:

Take each non-key attribute in turn and ask the question: is this attribute dependent on one
part of the key?

• If yes, remove attribute to new table with a copy of the part of the key it is dependent
upon. The key it is dependent upon becomes the key in the new table. Underline the key
in this new table.

o The first table went straight to 2NF as it has a simple key (MRN).

• If no, check against the other part of the key and repeat the above process.

o Dr. Name, Dept Code and Dept Name are dependent upon Dr. ID. Therefore,
they were moved to a new table with Dr. ID being the key.

• If still no, i.e. not dependent on either part of key, keep attribute in current table.

o However, Treatment Given is dependent upon both MRN and Dr. ID, as a
doctor may give a different treatment depending upon the patient they are
treating. Therefore it remained in the original table.

With the partial key dependencies removed, the resulting 2NF tables are:

a)
MRN    Patient Name     Sex   Age
5648   Assefa Kebede    M     45
5643   Yeshi Mola       F     36
6712   Zemzem Jemal     F     23
1580   Hayat Kedir      F     10
5648   Assefa Kebede    M     45
1580   Hayat Kedir      F     10

b)
MRN    Dr. ID   Treatment Given
5648   23       IV (intravenous) drug
5643   67       Appendectomy
6712   45       Chemotherapy
1580   23       Oral Antibiotic
5648   67       Laparotomy
1580   24       Tooth Extraction

c)
Dr. ID   Dr. Name       Dept Code   Dept Name
23       Bekele Hailu   234         Internal Medicine
67       Hassen Ali     256         Surgery
45       Mule Belay     236         Gynecology & Obstetrics
23       Shifa Seid     289         Pediatrics
67       Hassen Ali     256         Surgery
24       Saba Tariku    236         Dentistry
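
The question asked in the 2NF process ("is this attribute dependent on one part of the key?") can be tried out on sample rows. The sketch below, in Python, is only an illustration under stated assumptions: the function determines is a hypothetical helper, and the rows are a subset of the sample data. Note that sample rows can show that a dependency does not hold, but they cannot prove that it holds for all future data.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, the columns in lhs determine column rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        # setdefault stores the first value seen for this key and returns it;
        # a different value later means the dependency does not hold.
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


rows = [
    {"mrn": 5648, "dr_id": 23, "dr_name": "Bekele Hailu", "treatment": "IV (intravenous) drug"},
    {"mrn": 5643, "dr_id": 67, "dr_name": "Hassen Ali", "treatment": "Appendectomy"},
    {"mrn": 6712, "dr_id": 45, "dr_name": "Mule Belay", "treatment": "Chemotherapy"},
    {"mrn": 5648, "dr_id": 67, "dr_name": "Hassen Ali", "treatment": "Laparotomy"},
]

# Dr. Name depends on one part of the key (Dr. ID): a partial dependency.
name_on_dr_id = determines(rows, ["dr_id"], "dr_name")
# Treatment depends on neither part alone, only on the whole key.
treatment_on_mrn = determines(rows, ["mrn"], "treatment")
treatment_on_dr_id = determines(rows, ["dr_id"], "treatment")
treatment_on_key = determines(rows, ["mrn", "dr_id"], "treatment")
```

In these rows Dr. Name is fixed by Dr. ID alone, so it moves to the new table, while the treatment needs both MRN and Dr. ID and therefore stays where it is.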

Third Normal Form (3NF)

The next step is transforming the data in second normal form (2NF) into third normal form
(3NF). The rule is: remove to a new table any non-key attributes that are more dependent on
other non-key attributes than the table key. Ignore tables with zero or only one non-key attribute
(these go straight to 3NF with no conversion). The process is as follows:

If a non-key attribute is more dependent on another non-key attribute than the table key:

• Move the dependent attribute, together with a copy of the non-key attribute upon which it
is dependent, to a new table.

o The first two tables in 2NF pass directly to 3NF. In the last table,
Dept Name is more dependent on Dept Code than on Dr. ID and
therefore was moved to a new table.

• Make the non-key attribute, upon which it is dependent, the key in the new table.
Underline the key in this new table.

o Dept Code is the key in this new table and a foreign key (marked *) in the doctor table.

a)
MRN    Patient Name     Sex   Age
5648   Assefa Kebede    M     45
5643   Yeshi Mola       F     36
6712   Zemzem Jemal     F     23
1580   Hayat Kedir      F     10
5648   Assefa Kebede    M     45
1580   Hayat Kedir      F     10

b)
Dept Code   Dept Name
234         Internal Medicine
256         Surgery
236         Gynecology & Obstetrics
289         Pediatrics
256         Surgery
236         Dentistry

c)
Dr. ID   Dr. Name       Dept Code*
23       Bekele Hailu   234
67       Hassen Ali     256
45       Mule Belay     236
23       Shifa Seid     289
67       Hassen Ali     256
24       Saba Tariku    236

d)
MRN    Dr. ID   Treatment Given
5648   23       IV (intravenous) drug
5643   67       Appendectomy
6712   45       Chemotherapy
1580   23       Oral Antibiotic
5648   67       Laparotomy
1580   24       Tooth Extraction
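
To see how the four 3NF tables fit together, the sketch below builds them with keys and foreign keys and joins them back into the original view. It is an illustration only: it uses SQLite through Python's sqlite3 module as a stand-in for SQL Server, the table and column names are assumptions made for this example, and only a consistent subset of the sample rows is loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE patient (
    mrn INTEGER PRIMARY KEY, name TEXT, sex TEXT, age INTEGER);
CREATE TABLE department (
    dept_code INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE doctor (
    dr_id INTEGER PRIMARY KEY, dr_name TEXT,
    dept_code INTEGER REFERENCES department(dept_code));
CREATE TABLE treatment (
    mrn INTEGER REFERENCES patient(mrn),
    dr_id INTEGER REFERENCES doctor(dr_id),
    treatment_given TEXT,
    PRIMARY KEY (mrn, dr_id));
""")

conn.executemany("INSERT INTO patient VALUES (?, ?, ?, ?)",
                 [(5648, "Assefa Kebede", "M", 45), (5643, "Yeshi Mola", "F", 36)])
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [(234, "Internal Medicine"), (256, "Surgery")])
conn.executemany("INSERT INTO doctor VALUES (?, ?, ?)",
                 [(23, "Bekele Hailu", 234), (67, "Hassen Ali", 256)])
conn.executemany("INSERT INTO treatment VALUES (?, ?, ?)",
                 [(5648, 23, "IV (intravenous) drug"),
                  (5643, 67, "Appendectomy"),
                  (5648, 67, "Laparotomy")])

# Joining the four tables reproduces the rows of the original combined table.
rows = conn.execute("""
    SELECT p.name, d.dr_name, dep.dept_name, t.treatment_given
    FROM treatment t
    JOIN patient p ON p.mrn = t.mrn
    JOIN doctor d ON d.dr_id = t.dr_id
    JOIN department dep ON dep.dept_code = d.dept_code
    ORDER BY t.mrn, t.dr_id
""").fetchall()
```

Each fact is now stored once (a department name only in the department table, a doctor's name only in the doctor table), yet the join still recovers the full picture.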

Activities/exercise

1. List and discuss the steps in database design.
2. Describe the difference between a relation and a relational schema by giving an example.
3. Define entities, attributes and domain by giving an example.
4. List and discuss the three types of relationships.
5. What is normalization? Discuss the different steps.
6. Design a relational database corresponding to the diagram shown below. Identify the
primary keys.

7. ABC Company markets various products to thousands of regular (repeat) customers.
Each product is identified by a product ID and has product description, quantity on hand
and unit price. A unique account number identifies each customer. All relevant
information of the customer is maintained in the database (e.g., contact name, address,
phone number, etc.). Customers place orders for various products. Each customer order
may consist of one or more products. The quantity ordered for various products can vary.
However, the order delivery date will be the same.

a. Model the above scenario by way of an Entity Relationship diagram. Identify the
keys, attributes and relationship cardinalities for each entity.

b. Based on your E-R model develop the set of tables.

Topic 3: Introduction to SQL

3.1 Introduction

Many database engines are currently available on the market. Microsoft SQL Server is
one that is used for everything from small applications to complex ones, and several
versions of it are in use by database developers. This topic uses SQL Server 2008, from
creating the database to writing the queries needed to access the information stored in
it. SQL Server Management Studio plays a key role in creating and managing a database.
The topic covers all these issues one by one.

3.2 Learning Objectives


At the end of this topic the student should be able to
• Define SQL Server Management Studio
• Use SQL Server Management Studio
• Create and manage a database
• Use SQL commands
• Generate a report

3.3 Introduction to Microsoft SQL Server

SQL Server Management Studio

The administrator’s primary tool for interacting with SQL server is SQL Server Management
Studio. Both administrators and end users can use this tool to administer multiple servers,
develop databases, and replicate data, among other things. To open this tool, click the Start
menu, All Programs, Microsoft SQL Server 2008, and then SQL Server Management
Studio in the SQL Server program group. Every user with access to the particular database
server can also use SQL Server Management Studio.

SQL Server Management Studio comprises several different components that are used for the
authoring, administration, and management of the overall system. The following are the main
components used for these tasks:

o Registered Servers
o Object Explorer
o Query Editor
o Solution Explorer

Connecting to a Server
When you open SQL Server Management Studio, it displays the Connect to Server dialog box
(Figure 3-1), which allows you to specify the necessary parameters to connect to a server. For
our discussion we choose Database Engine as the server type.

Server Name: Select or type the name of the server that you want to use. (Generally, you can
connect SQL Server Management Studio to any of the installed products on a particular server.)

Figure 3-1: The Connect to Server dialog box

Authentication: Choose between the two authentication types:

Windows Authentication: Connect to SQL Server using your Windows account. This option is
much simpler and is recommended by Microsoft.

SQL Server Authentication: Database Engine uses its own authentication.
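
The same choice between the two authentication types appears when a program, rather than Management Studio, connects to the server. The sketch below only builds ODBC-style connection strings in Python; it does not open a connection. The function name connection_string, the driver name, the server name localhost, and the user hit_user are all assumptions for illustration, and the driver installed on a given machine may differ.

```python
def connection_string(server, database, user=None, password=None,
                      driver="ODBC Driver 17 for SQL Server"):
    """Build an ODBC-style connection string for either authentication type."""
    parts = [f"Driver={{{driver}}}", f"Server={server}", f"Database={database}"]
    if user is None:
        # Windows Authentication: the current Windows account is used.
        parts.append("Trusted_Connection=yes")
    else:
        # SQL Server Authentication: a login and password kept by Database Engine.
        parts += [f"UID={user}", f"PWD={password}"]
    return ";".join(parts) + ";"


windows_auth = connection_string("localhost", "selam")
sql_auth = connection_string("localhost", "selam", user="hit_user", password="secret")
```

With Windows Authentication no password appears in the string at all, which is one reason that option is the simpler of the two.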

When you click Connect, Database Engine connects to the specified server. After connecting to
the database server, the default SQL Server Management Studio window appears (Figure 3-2).

Figure 3-2: SQL Server Management Studio, default window settings

Registered Servers

Registered Servers is represented as a pane that allows you to maintain connections to already
used servers. You can use these connections to check a server’s status or to manage its objects.
Each user has a separate list of registered servers, which is stored locally. (If the Registered
Servers pane isn’t visible, select its name from the View menu.)

You can add new servers to the list of all servers, or remove one or more existing servers from
the list. You can also group existing servers into server groups. Each group should contain the
servers that belong together logically. You can also group servers by server type, such as
Database Engine, Analysis Services, Reporting Services, and Integration Services.

Object Explorer

The Object Explorer pane contains a tree view of all the database objects in a server. (If the
Object Explorer pane isn’t visible, select Object Explorer from the View menu.) The tree shows
you a hierarchy of the objects on a server. Hence, if you expand a tree, the logical structure of a
corresponding server will be shown. To connect Object Explorer to a server, right-click the
server name and choose Connect. To disconnect, click the Disconnect button in the Object
Explorer toolbar.

Object Explorer allows you to connect to multiple servers in the same pane. The server can be
any of the existing servers for Database Engine, Analysis Services, Reporting Services, or
Integration Services. This feature is user-friendly, because it allows you to manage all servers of
the same or different types from one place.

Organizing and Navigating Management Studio’s Panes

You can dock or hide each of the panes of Management Studio. By right-clicking the title bar at
the top of the corresponding pane, you can choose between the following presentation
possibilities:

Floating: When a window’s state is set to floating, it exists as a separate floating window on top
of the rest of SQL Server Management Studio windows. Such a window can be moved anywhere
around the screen.

Dockable: Enables you to move panes of SQL Server Management Studio and dock them in
different positions. To move a component, click and drag the title bar of the pane into the middle
of the document window. The pane undocks and remains floating until you drop it.

Tabbed Document: You can create a tabbed grouping using the Designer window. When this is
done, the pane’s state changes from dockable to tabbed document.

Hide: Closes the window. (Alternatively, you can click the × in the upper-right corner of the
window.) To display a closed window, select the component name from the View menu.

Auto Hide: Minimizes the pane and stores it on the left side of the screen. To reopen (maximize)
such a pane, move your mouse over the tabs on the left side of the screen and click the push pin
to pin the pane in the open position.

The difference between the Hide and Auto Hide options is that the former removes the
pane from view in SQL Server Management Studio, while the latter collapses it to a tab at the
side of the screen.

3.4 Using Management Studio with Database Engine

SQL Server Management Studio has two main purposes:

• Administration of the database servers


• Management of database objects

The following sections describe these features of SQL Server Management Studio.

Administering Database Servers

The administration tasks that you can perform by using SQL Server Management Studio are,
among others, the following:

• Register servers
• Connect to a server
• Create new server groups
• Manage multiple servers
• Start and stop SQL Server

The following subsections describe these administration tasks one by one.

Registering Servers

SQL Server Management Studio separates the activities of registering servers and exploring
databases and their objects. (Both of these activities can be done using Object Explorer.) Every
server (local or remote) must be registered before use. A server can be registered during the first
execution of SQL Server Management Studio or later.

To register a database server, right-click the folder of your database server in Object Explorer
and choose Register. (If the Object Explorer pane doesn’t appear on your screen, select View
and click Object Explorer.) The New Server Registration dialog box appears, as shown in Figure
3-3. Choose the name of the server that you want to register and the authentication mode
(Windows Authentication or SQL Server Authentication).

Figure 3-3: The New Server Registration dialog box

Connecting to a Server

SQL Server Management Studio also separates the tasks of registering a server and connecting to
a server. This means that registering a server does not automatically connect you to the server.
To connect to a server from the Object Explorer window, right-click the server name and choose
Connect.

Creating a New Server Group

To create a new server group in the Registered Servers pane, right-click Local Server Groups and
choose New Server Group. In the New Server Group properties dialog box, enter a (unique)
group name and optionally describe the new group.

Managing Multiple Servers

SQL Server Management Studio allows you to administer multiple database servers (called
instances) on one computer by using Object Explorer. Each instance of Database Engine has its
own set of database objects (system and user databases) that are not shared between different
instances.

To manage a server and its configuration, right-click the server name in Object Explorer and
choose Properties. The Server Properties dialog box (Figure 3-4) contains several different
pages, such as General, Security, and Permissions. The General page shows general properties
of the server. The Security page contains the information concerning the authentication mode of
the server and the login auditing mode. The Permissions page shows all logins and roles that can
access the server. The lower part of the page shows all permissions that can be granted to the logins and
roles.

Figure 3-4: The Server Properties dialog box

You can replace the existing server name with a new name. Right-click the server in the Object
Explorer window and choose Register. Now you can rename the server and modify the existing
server description in the Registered Server frame.

Starting and Stopping Servers

A Database Engine server can be started automatically each time the Windows operating system
starts or by using SQL Server Management Studio. To start the server using Management Studio,
right-click the selected server in the Object Explorer pane and click Start in the context menu.

The menu also contains Stop and Pause functions that you can use to stop or pause the activated
server, respectively.

3.5 Managing Databases Using Object Explorer

The following are the management tasks that you can perform by using SQL Server Management
Studio:

• Create databases without using SQL


• Modify databases without using SQL
• Manage database objects and their usage
• Generate and execute SQL statements

3.5.1 Creating a Database


The organization of a database involves many different objects. All objects of a database can be
physical or logical. The physical objects are related to the organization of the data on the
physical device (disk). Database Engine’s physical objects are files and file groups. Logical
objects represent a user’s view of a database. Databases, tables, columns, and views (virtual
tables) are examples of logical objects.

The first database object that has to be created is the database itself. Database Engine manages
both system and user databases. An authorized user can create user databases, while system
databases are generated during the installation of the database system. The system databases are:

• master
• tempdb
• model
• msdb

• resource

You can create a new database by using Object Explorer or the SQL language. As the name
suggests, Object Explorer is also used to explore the objects within a server. From the Object
Explorer pane, you can inspect all the objects within a server and manage your server and
databases. The existing tree contains, among other folders, the Databases folder.

This folder has several subfolders, including one for the system databases and one for each new
database that is created by a user.

To create a database using Object Explorer, right-click Databases and select New Database. In
the New Database dialog box (Figure 3-5), type the name of the new database in the Database
Name field and then click OK. Each database has several different properties, such as file type,
initial size, and so on. Database properties can be selected from the left pane of the New
Database dialog box. There are several different pages (property groups):

• General
• Files (appears only for an existing database)
• File groups
• Options
• Permissions (appears only for an existing database)
• Extended Properties (appears only for an existing database)
• Mirroring (appears only for an existing database)
• Transaction Log Shipping (appears only for an existing database)

Figure 3-5: The New Database dialog box

The General page of the Database Properties dialog box displays, among other things, the
database name, the owner of the database, its collation, and recovery model. The properties of
the data files that belong to a particular database comprise the name and initial size of the file,
where the database will be stored, and the type of the file (PRIMARY, for instance). A database
can be stored in multiple files. The Filegroups page of the Database Properties dialog box
displays the name(s) of the filegroups to which the database file belongs, the kind of filegroup
(default or nondefault), and the allowed operation on the filegroup (read/write or read-only).

The Options page of the Database Properties dialog box enables you to display and modify all
database-level options. There are several groups of options: Automatic, Cursor, Miscellaneous,
Recovery, and State. For instance, the following four options exist for State:

• Database Read-Only: Allows read-only access to the database. This prohibits users from
modifying any data. (The default value is False.)
• Database State: Describes the state of the database. (The default value is NORMAL.)
• Restrict Access: Restricts which users can use the database, for example to one user at a
time. (The default value is MULTI_USER.)
• Encryption Enabled: Controls the database encryption state. (The default value is False.)

If you choose the Permissions page, the system opens the corresponding dialog box and displays
all users and roles along with their permissions.

3.5.2 Modifying Databases without Using SQL


Object Explorer can also be used to modify an existing database. Using this component, you can
modify files and filegroups that belong to the database. To add new data files, right-click the
database name and choose Properties. In the Database Properties dialog box, select Files, click
Add, and type the name of the new file. (In the Add File dialog box, you can also change the
autogrowth properties and the location of each existing file.) You can also add a (secondary)
filegroup for the database by selecting Filegroups and clicking Add.
To delete a database using Object Explorer, right-click the database name and choose Delete.

Creating Tables

After you create a database, your next task is to create all tables belonging to it. As with
databases, you can create tables by using either Object Explorer or SQL. Only Object Explorer
is discussed here.

To create a table using Object Explorer, expand the Databases folder, expand the database, right-
click the Tables subfolder, and then click New Table. Enter the names of all columns with their
properties. Column names, their data types, as well as the NULL property of the column must
be entered in the two-dimensional matrix, shown in the top-right pane of Figure 3-6. All data
types supported by the system can be displayed (and one of them selected) by clicking the arrow
sign in the Data Type column (the arrow appears after the cell has been selected). Subsequently,
you can type entries in the Length, Precision, and Scale rows for the chosen data type on the
Column Properties tab. Some data types, such as CHAR, require a value for the Length row, and
some, such as DECIMAL, require a value in the Precision and Scale rows. On the other hand,
data types such as INTEGER do not need any of these entries to be specified. (The valid entries
for a specified data type are highlighted in the list of all possible column properties.)

Figure 3-6: Creating patient table using SQL Server Management Studio

The check box in the Allow Nulls column must be checked if you want a table column to permit
NULL values to be inserted into that column. Similarly, if there is a default value, it should be
entered in the Default Value or Binding row of the Column Properties tab. (A default value is a
value that will be inserted in a table column when there is no explicit value entered for it.)

To specify a column as the primary key of a table, you must right-click the column and choose
Set Primary Key. Finally, close the right pane with the information concerning the new table.
After that, the system will display the Choose Name dialog box, where you can type the table
name. To view the properties of an existing table, double-click the folder of the database to
which the table belongs, double-click Tables and then right-click the name of the table and
choose Properties.

To rename a table, right-click the name of the table in the Tables folder and
choose Rename. To remove a table, right-click the name of the table in the Tables folder of the
database to which the table belongs and select Delete.

After you have created all tables you can use another feature of SQL Server Management Studio
to display the corresponding entity-relationship (ER) diagram of the sample database. (The

process of converting the existing tables of a database into the corresponding ER diagram is
called reverse engineering.)

To see the ER diagram of a database, right-click the Database Diagrams subfolder of the
sample database and select New Database Diagram.

3.6 SQL

3.6.1. Introduction to SQL

The name SQL is derived from Structured Query Language. Originally, SQL was called
SEQUEL (for Structured English Query Language) and was designed and implemented at IBM
Research as the interface for an experimental relational database system called SYSTEM R. SQL
is now the standard language for commercial relational DBMSs.

SQL is a comprehensive database language; it has statements for data definition, query, and
update. Hence, it is both a Data Definition Language (DDL) and a Data Manipulation Language
(DML). In addition, it has facilities for defining views on the database, for specifying security
and authorization, for defining integrity constraints, and for specifying transaction controls. In
our discussion, we will follow the features of SQL Server 2008.

3.6.2. Query Editor

To launch the Query Editor pane, click the New Query button in the toolbar of SQL Server
Management Studio. If you expand it to show all the possible queries, it shows more than just a
Database Engine query. Once you open Query Editor, the status bar at the bottom of the pane
tells you whether your query is in a connected or disconnected state. If you are not connected
automatically to the server, the Connect to SQL Server dialog box appears where you can type
the name of the database server to which you want to connect and select the authentication mode.

Query Editor can be used by end users for the following tasks:

• Generating and executing SQL statements


• Storing the generated SQL statements in a file
• Generating and analyzing execution plans for generated queries

• Graphically illustrating the execution plan for a selected query
Query Editor contains an internal text editor and a selection of buttons in its toolbar. The main
window is divided into a query pane (upper) and a results pane (lower). Users enter the Transact-
SQL statements (queries) that they want to execute into the query pane, and after the system has
processed the queries, the output is displayed in the results pane.

The example shown in Figure 3-7 demonstrates a query entered into Query Editor. Clicking the
Query button in the Query Editor’s toolbar and then selecting Execute, or pressing F5, returns
the results of these statements in the results pane of Query Editor.

Figure 3-7: Query Editor with a query

The following additional information concerning the execution of the statement(s) is displayed in
the status bar at the bottom of the Query Editor window:

• The status of the current operation (for example, “Query executed successfully”)
• Database server name
• Current username and server process ID
• Current database name

• Elapsed time for the execution of the last query
• The number of retrieved rows
One of the main features of SQL Server Management Studio is that it’s easy to use, and that also
applies to the Query Editor component. Query Editor supports many features that make coding
of SQL statements easier. First, Query Editor uses syntax highlighting to improve the readability
of SQL statements. It displays all reserved words in blue, all variables in black, strings in red,
and comments in green.

3.7 Types of SQL Commands


The following sections discuss the basic categories of commands used in SQL to perform various
functions. These functions include building database objects, manipulating objects, populating
database tables with data, updating existing data in tables, deleting data, performing database
queries, controlling database access, and overall database administration.
The main categories are:

• DDL (Data Definition Language)


• DML (Data Manipulation Language)
• DQL (Data Query Language)
• DCL (Data Control Language)
• Data administration commands and
• Transactional control commands

Data Definition Language

Data Definition Language, DDL, is the part of SQL that allows a database user to create and
restructure database objects, such as the creation or the deletion of a table. Some of the most
fundamental DDL commands include the following:

CREATE
ALTER
DROP

Data Manipulation Language

Data Manipulation Language, DML, is the part of SQL used to manipulate data within objects of
a relational database.

There are three basic DML commands:

INSERT
UPDATE
DELETE

Data Query Language


Though comprised of only one command, Data Query Language (DQL) is the most concentrated
focus of SQL for modern relational database users. The base command is as follows:
SELECT

This command, accompanied by many options and clauses, is used to compose queries against a
relational database. Queries, from simple to complex, from vague to specific, can be easily
created.

A query is an inquiry to the database for information. A query is usually issued to the database
through an application interface or via a command line prompt.
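
The three categories introduced so far can be told apart by the first keyword of a statement. The short Python sketch below illustrates this idea; the function classify and the CATEGORIES table are assumptions made for this example, and the lookup covers only the commands listed above for DDL, DML and DQL.

```python
# Map the leading keyword of a statement to its command category.
CATEGORIES = {
    "CREATE": "DDL", "ALTER": "DDL", "DROP": "DDL",
    "INSERT": "DML", "UPDATE": "DML", "DELETE": "DML",
    "SELECT": "DQL",
}


def classify(statement):
    """Return the command category of a SQL statement, judged by its first keyword."""
    words = statement.split()
    keyword = words[0].upper() if words else ""
    return CATEGORIES.get(keyword, "other")


examples = {
    "create table patient (MRN char(10))": classify("create table patient (MRN char(10))"),
    "delete from patient": classify("delete from patient"),
    "select * from patient": classify("select * from patient"),
}
```

A statement that starts with any other keyword is reported as "other", which is where the remaining categories described below would fall in this simple sketch.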

Data Control Language

Data control commands in SQL allow you to control access to data within the database. These
DCL commands are normally used to create objects related to user access and also control the
distribution of privileges among users. Some data control commands are as follows:

ALTER PASSWORD
GRANT
REVOKE
CREATE SYNONYM

You will find that these commands are often grouped with other commands.

Data Administration Commands
Data administration commands allow the user to perform audits and perform analyses on
operations within the database. They can also be used to help analyze system performance. Two
general data administration commands are as follows:
START AUDIT
STOP AUDIT

Do not confuse data administration with database administration. Database
administration is the overall administration of a database, which envelops the use of all levels of
commands. Database administration is much more specific to each SQL implementation than are
the core commands of the SQL language.

Transactional Control Commands


In addition to the previously introduced categories of commands, there are commands that allow
the user to manage database transactions.

COMMIT             Saves database transactions
ROLLBACK           Undoes database transactions
SAVEPOINT          Creates points within groups of transactions in which to ROLLBACK
SET TRANSACTION    Places a name on a transaction
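
The effect of COMMIT, ROLLBACK and SAVEPOINT can be tried out in a small sketch. The example below uses SQLite through Python's sqlite3 module as a stand-in, so it is an illustration rather than SQL Server code; SQL Server's own syntax for some of these commands differs (for example, it uses SAVE TRANSACTION to set a savepoint). The table contents and the savepoint name after_first are assumptions for this example.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction statements below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE patient (mrn TEXT PRIMARY KEY, first_name TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO patient VALUES ('5648', 'Assefa')")
conn.execute("SAVEPOINT after_first")      # mark a point to roll back to
conn.execute("INSERT INTO patient VALUES ('5643', 'Yeshi')")
conn.execute("ROLLBACK TO after_first")    # undo only the second insert
conn.execute("COMMIT")                     # make the first insert permanent

conn.execute("BEGIN")
conn.execute("DELETE FROM patient")
conn.execute("ROLLBACK")                   # undo the delete entirely

names = [row[0] for row in conn.execute("SELECT first_name FROM patient")]
```

Only the row that was committed survives: the second insert was undone by rolling back to the savepoint, and the delete was undone by the final ROLLBACK.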

3.8 Using SQL Commands

3.8.1. Creation of a Database

The second method of creating a database involves using the SQL statement CREATE
DATABASE. This statement has the following general form:

CREATE DATABASE db_name

In the syntax descriptions in this topic, optional items appear in brackets [ ]. Items written
in braces { } and followed by “...” are items that can be repeated any number of times. db_name
is the name of the database. The maximum size of a database name is 128 characters. The
maximum number of databases managed by a single system is 32,767. All databases are stored in
files. These files can be explicitly specified by the system administrator or implicitly provided
by the system.

For example, the database designed earlier can be physically created with the command:

Create database selam

3.8.2. Creating Tables

The CREATE TABLE statement creates a new base table with all corresponding columns and
their data types. The basic form of the CREATE TABLE statement is

CREATE TABLE table_name

(col_name1 type1 [(size)] [NOT NULL | NULL]

[{, col_name2 type2 [(size)] [NOT NULL | NULL]} ...]

[{, CONSTRAINT constraint_name constraint_type [constraint_attributes]} ...])

table_name is the name of the created base table. The maximum number of tables per database
is limited by the number of objects in the database (there can be more than 2 billion objects in a
database, including tables, views, stored procedures, triggers, and constraints). col_name1,
col_name2,... are the names of the table columns. type1[(size)], type2[(size)],... are the data types
and sizes of the corresponding columns. constraint_name is the name of the created constraint and
constraint_type defines the type of constraint you are creating (primary key or foreign key).

In the database created by the name selam, we can create the patient table as follows.

Table Definition example

Create table patient (
    MRN char(10) not null,
    FirstName varchar(20) not null,
    MiddleName varchar(20), LastName varchar(20),
    Sex char(6), BirthDate date,
    city varchar(20), Woreda varchar(20), Kebele varchar(20),
    constraint patient_pk primary key (MRN))

Note that the constraint patient_pk defines MRN to be the primary key of the table.
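
One way to check a table definition is to create the table and then ask the system which columns and keys it recorded. The sketch below does this with SQLite through Python's sqlite3 module, used here as a stand-in for SQL Server, so it is an illustration under that assumption. PRAGMA table_info is a SQLite feature, not a SQL Server command.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE patient (
        MRN        CHAR(10)    NOT NULL,
        FirstName  VARCHAR(20) NOT NULL,
        MiddleName VARCHAR(20),
        LastName   VARCHAR(20),
        Sex        CHAR(6),
        BirthDate  DATE,
        city       VARCHAR(20),
        Woreda     VARCHAR(20),
        Kebele     VARCHAR(20),
        CONSTRAINT patient_pk PRIMARY KEY (MRN)
    )
""")

# PRAGMA table_info returns one row per column:
# (position, name, declared type, not-null flag, default value, primary-key flag)
columns = conn.execute("PRAGMA table_info(patient)").fetchall()
column_names = [c[1] for c in columns]
not_null = [c[1] for c in columns if c[3]]
primary_key = [c[1] for c in columns if c[5]]
```

The lists confirm that the table has nine columns, that MRN and FirstName reject NULL, and that MRN alone forms the primary key.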

3.8.3. Data Types in SQL

A data type specifies what kind of data a particular column can hold. If a column called
"Last_Name" is to be used to hold names, then that column should have a "varchar"
(variable-length character) data type.

Table 3-1: The most common data types

char(size)       Fixed-length character string. Size is specified in parentheses.
                 Max 8,000 bytes in SQL Server.

varchar(size)    Variable-length character string. Max size is specified in parentheses.

number(size)     Number value with a max number of digits specified in parentheses.
                 (SQL Server names this type numeric or decimal.)

date             Date value

number(size,d)   Number value with a maximum number of digits of "size" total, with a
                 maximum number of "d" digits to the right of the decimal point.
                 (SQL Server names this type numeric or decimal.)

3.8.4. Adding or Dropping a New Column

You can use the ADD clause of the ALTER TABLE statement to add a new column to an
existing table. Each of the ALTER TABLE statements in the example below adds one column.

For example:

ALTER TABLE patient
ADD telephone_no CHAR(12) NULL;

ALTER TABLE patient
ADD registered_date DATE;

The ALTER TABLE statements above separately add columns telephone_no and registered_date
to the patient table.
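
The effect of adding columns can be observed by listing a table's columns before and after the change. The sketch below does so with SQLite through Python's sqlite3 module, as a stand-in for SQL Server. It is an illustration only: the helper column_names is an assumption, the table is cut down to two columns, and SQLite is given the ADD COLUMN spelling, whereas the SQL Server statements above use ADD alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (MRN CHAR(10) NOT NULL PRIMARY KEY, "
             "FirstName VARCHAR(20))")


def column_names(connection, table):
    """List the column names of a table using SQLite's table_info pragma."""
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


before = column_names(conn, "patient")
# One new column per ALTER TABLE statement, as in the examples above.
conn.execute("ALTER TABLE patient ADD COLUMN telephone_no CHAR(12)")
conn.execute("ALTER TABLE patient ADD COLUMN registered_date DATE")
after = column_names(conn, "patient")
```

New columns are appended after the existing ones, and existing rows (if any) receive NULL in them unless a default value is given.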

You can use the DROP COLUMN clause to drop an existing column of a table.

ALTER TABLE patient
DROP COLUMN telephone_no;

The ALTER TABLE statement above removes the telephone_no column, which was added to
the patient table with the first ALTER TABLE ... ADD statement.

3.8.5. Inserting data into a Table

The insert statement is used to insert or add a row of data into the table.

To insert records into a table, enter the keywords insert into followed by the table name,
followed by an open parenthesis, followed by a list of column names separated by commas,
followed by a closing parenthesis, followed by the keyword values, followed by the list of values
enclosed in parentheses. The values that you enter will be held in the rows and they will match up
with the column names that you specify. Strings should be enclosed in single quotes, and
numbers should not.

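Insert into "tablename" (first_column,...last_column)
values (first_value,...last_value);

In the example below, the column name FirstName will match up with the value 'Alemu',
and the column name city will match up with the value 'Addis Ababa'. Because MRN is the
primary key and cannot be NULL, a value must be supplied for it as well; the MRN '7001'
used here is only an illustrative value. Woreda and Kebele are varchar columns, so their
values are enclosed in single quotes too.

Example:

Insert into patient (MRN, FirstName, MiddleName, LastName, Sex, BirthDate,
city, Woreda, Kebele) values ('7001', 'Alemu', 'Zelalem', 'Molla', 'M',
'2/3/2000', 'Addis Ababa', '06', '24');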

3.8.6. Updating Records

The update statement is used to update or change records that match specified criteria. This is
accomplished by carefully constructing a where clause.

update "tablename" set "columnname" = "newvalue"
[,"nextcolumn" = "newvalue2"...]
where "columnname" OPERATOR "value"
[and|or "column" OPERATOR "value"];

[] = optional

Examples:

update patient
set MiddleName = 'Tolla'
where FirstName = 'Alemu';
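
The optional parts of the syntax can be illustrated with a slightly longer statement. The sketch below changes two columns at once and combines two conditions with and. The new values are made up for illustration; the column names follow the patient table used in this topic.

```sql
-- Hypothetical example: change two columns of the same patient in one statement.
-- A row is changed only if BOTH conditions in the WHERE clause are true.
UPDATE patient
SET MiddleName = 'Tolla',
    city = 'Dessie'
WHERE FirstName = 'Alemu'
  AND LastName = 'Molla';
```

Running a select statement with the same where clause first is a simple way to check which rows the update will change.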

3.8.7. SQL SELECT Statement

The select statement is used to query the database and retrieve selected data that match the
criteria that you specify. Here is the format of a simple select statement:

select "column1" [,"column2",etc] from "tablename"
[where "condition"];

[] = optional

The column names that follow the select keyword determine which columns will be returned in
the results. You can select as many column names as you like, or you can use a "*" to select
all columns.

The table name that follows the keyword from specifies the table that will be queried to retrieve
the desired results.

The where clause (optional) specifies which data values or rows will be returned or displayed,
based on the criteria described after the keyword where.

Conditional selections used in the where clause:

= Equal
> Greater than
< Less than
>= Greater than or equal
<= Less than or equal
<> Not equal to
LIKE *See note below

The LIKE pattern matching operator can also be used in the conditional selection of the where
clause. LIKE is a very powerful operator that allows you to select only rows that are "like" what
you specify. The percent sign "%" can be used as a wildcard to match any sequence of zero or more
characters that might appear before or after the characters specified. For example:

select FirstName, MiddleName, LastName from patient
where FirstName LIKE 'Al%';
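
The other conditional operators are used in the same way. The queries below are a minimal sketch against the patient table of this topic; they assume that BirthDate is stored in a date column and that city and LastName are character columns.

```sql
-- Patients born on or after 1 January 1990 (assumes BirthDate is a DATE column).
SELECT FirstName, LastName, BirthDate
FROM patient
WHERE BirthDate >= '1990-01-01';

-- Patients who do not live in Addis Ababa.
SELECT FirstName, LastName, city
FROM patient
WHERE city <> 'Addis Ababa';

-- Patients whose last name contains the letters 'be' anywhere in the name.
SELECT FirstName, LastName
FROM patient
WHERE LastName LIKE '%be%';
```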

3.9 Generating a Report

A report is a set of attributes, presented in a predetermined format, based on many unrelated records. A report
usually contains the same attributes about each record. For example, a report might list the
product identifier, current year sales, and current year sales goal for all the products for which
sales are below goal. A report usually includes page numbers, titles on each page, the date the
report was printed, and other descriptive information.

Most good business applications contain a built-in reporting tool; this is simply a front-end
interface that calls or runs back-end database queries that are formatted for easy application
usage. For example, the eHMIS (Electronic Health Management Information System) software
application may contain predefined reports such as the Health Center, Clinic, or Hospital
Monthly Inpatient (IPD) disease report.

Setting conditions to generate a report

In simple terms, a database report presents information retrieved from a table or query
in a preformatted, attractive manner. As an example, assume that the database created at
the beginning of this topic holds the information shown in the table below. The
table holds information on patients who were treated in a given hospital. From the database it is
possible to list the patients who came from Addis Ababa city.

Table 3-2: patient information

FirstName  MiddleName  LastName  Sex  BirthDate  city         Woreda  Kebele
Alemu      Zelalem     Molla     M    2/3/2000   Addis Ababa  06      24
Temesgen   Kebede      Abebe     M    5/3/1991   Dessie       05      15
Ali        Jemal       Kebede    M    6/4/1995   Addis Ababa  04      17
Meserete   Ayalew      Yared     F    4/5/1980   Shashemene   08      23
Tigist     Tesfaye     Mehari    F    1/8/1978   Addis Ababa  15      27

The query therefore asks for all patients who came from Addis Ababa. The resulting output will be:

Table 3-3: information on patients who came from Addis Ababa

FirstName  MiddleName  LastName  Sex  BirthDate  city         Woreda  Kebele
Alemu      Zelalem     Molla     M    2/3/2000   Addis Ababa  06      24
Ali        Jemal       Kebede    M    6/4/1995   Addis Ababa  04      17
Tigist     Tesfaye     Mehari    F    1/8/1978   Addis Ababa  15      27
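
A query along the following lines could produce the filtered table above. It is a sketch that assumes the table is named patient and uses the column names shown in the table header; the ORDER BY clause is added only to make the row order predictable.

```sql
-- Report condition: keep only the patients whose city is Addis Ababa.
SELECT FirstName, MiddleName, LastName, Sex, BirthDate, city, Woreda, Kebele
FROM patient
WHERE city = 'Addis Ababa'
ORDER BY FirstName;
```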

In the same manner, try filtering the table so that it shows only the patients who are female.

Generating a Report

A report can be run at any time, and will always reflect the current data in the database. Reports
are generally formatted to be printed out, but they can also be viewed on the screen, exported to
another program, or sent as an e-mail message. For example, in the eHMIS application, once the
report is generated it can be exported to Word, Excel, or PDF format as required. It
is also possible to send the same report over the Internet, or to save it to your hard disk and send it
using secondary storage media.

Activities/exercise

a. Query on a single table

Use the Student table below and write the SQL statement and the returned results for each of the
following operations.
Open Microsoft SQL Server Management Studio and create the table shown below.
Assign the following data types to the fields (Major must allow nulls, because one student has no major):
Stud_id varchar(10) not null,
Name varchar(20) not null,
Major varchar(30),
Age integer

STUDENT
Stud_id  Name    Major          Age
10000    Jemal   Economy        21
15000    Paulos  Communication  19
20000    Bahiru  Economy        25
25000    Girma   Management     24
35000    Roza    Computer Sc    20
40000    Rediet  <null>         19

1. List the names of students whose age is less than 21.
2. List all details of students whose student id is from 20000 to 40000.
3. List the names of students who major in 'Economy' or 'Management'.
4. List the student ids of students who do not have a major.
5. Write the SQL statements for the following queries:

a. List names and IDs of students whose IDs begin with '10'
b. List names and IDs of students whose names begin with 'R'

6. List the student id and name of students whose major is 'Economy' and whose age is greater than 21.
7. List all the majors (Hint: do not repeat a major that appears more than once).

b. Query on Multiple Tables

Assume that the Enroll and Lecture tables have been created in the database. The structure and data
of these tables are shown below:

ENROLL
Stud_id Code Status
10000 ECO4405 Compulsory
15000 COM2000 Compulsory
20000 ECO4405 Elective
20000 SAK2500 Compulsory
40000 COM2000 Elective
40000 MTK4100 Compulsory
40000 SAK2500 Elective

LECTURE
Code Time Room
COM2000 MF9 DK5
ECO4405 MWF3 DK4
MTK4100 MWF8 DK4
SAK2500 MTTh11 DK2

Write the SQL statement and the returned results for each of the following operations:

1. For each student, list his/her name and the codes of all courses he/she is enrolled in.
2. List the lecture times for all courses enrolled in by the student whose name is Jemal.
3. List the name and major of all students who are not enrolled in any course.

Topic 4: Database Backup and Recovery
4.1. Introduction
People are advised to back up their important data from the hard drive and to store the backup
file on removable storage media such as a flash drive, memory card, or CD/DVD. This keeps the
data available in the future even if the hard drive is replaced or some hardware component
fails.

Backup and recovery is the set of concepts, procedures, and strategies involved in protecting the
database against data loss caused by media failure or user errors. In general, the purpose of a
backup and recovery strategy is to protect the database against data loss and to reconstruct lost data.

In this topic we discuss the database backup and recovery processes. Preventing data loss
is one of the most critical issues involved in managing database systems. Data can be lost as a
result of many different problems such as:

• Hardware failures
• Viruses
• Incorrect use of UPDATE and DELETE statements
• Software bugs
• Disasters, such as fire or flood
To prevent data loss, you can implement a backup and recovery strategy for your databases. This
topic explains the database backup and recovery techniques.

4.2. Learning Objectives


At the end of this topic the students should be able to:

• Understand the importance of backup and recovery
• Differentiate between backup and recovery
• Identify backup and recovery methods
• Learn the different types of recovery

4.3. Database Backup
A backup is a copy of data from your database that can be used to reconstruct that data. Backing
up is the activity of copying files or databases so that they are preserved in case of equipment
failure or other catastrophe.

Database backups are at the core of any SQL Server disaster recovery planning for any
production system. Backups may be used to provide a means of recovery to a point-in-time when
the database was last operational. Microsoft SQL Server provides several types of backups that
may be combined to formulate a customized disaster recovery plan depending on the nature of
the data and the recovery requirements. It is highly recommended that all SQL Server databases
be backed up periodically.

Whatever software application we use, it is highly recommended to take a backup of the data at
regular intervals. For example, in the case of an EMR system, it is recommended to take a backup
of the data daily. The main reason for this, as mentioned above, is to have a copy
of the actual data before any disaster causes data loss.

Some software has built-in functionality that helps you to take a backup. After taking the
backup, however, it is important to save it on separate secondary storage to be on the safe side.

4.4. Methods of Database Backup


Database backup is the process of dumping data (from a database, a transaction log, or a file)
into backup devices that the system creates and maintains. A backup device can be a disk file or a
tape. Database Engine provides both static and dynamic ways to take backups.

• Static backup means that during the backup process, the only active session supported by
the system is the one that creates the backup. In other words, user processes are not
allowed during backup.

• Dynamic backup means that a database backup can be performed without stopping the
database server, removing users, or even closing the files. (The users will not even know
that the backup process is in progress.)

Database Engine provides four different backup methods. They are:

1. Full database backup
2. Differential backup
3. Transaction log backup
4. File (or file group) backup

1. Full Database Backup


A full database backup captures the state of the database at the time the backup is started. During
the full database backup, the system copies the data as well as the schema of all tables of the
database and the corresponding file structures. If the full database backup is executed
dynamically, the database system records any activity that takes place during the backup.
Therefore, even uncommitted transactions in the transaction log are written to the backup
media.
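
As a minimal sketch, a full database backup can be taken with the BACKUP DATABASE statement. The database name Jugel is the one used in the recovery examples later in this topic; the folder and file name are assumptions made for illustration and must be replaced with a location that exists on your server.

```sql
-- Full backup of the Jugel database to a disk file.
-- C:\Backups\Jugel_full.bak is an assumed path; the folder must already exist.
BACKUP DATABASE Jugel
TO DISK = N'C:\Backups\Jugel_full.bak'
WITH NAME = N'Jugel full backup';
```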

2. Differential Backup
A differential backup creates a copy of only the parts of the database that have changed since the
last full database backup. (As in a full database backup, any activity that takes place during a
differential backup is backed up, too.) The advantage of a differential backup is speed. It
minimizes the time required to back up a database, because the amount of data to be backed up is
considerably smaller than in the case of a full database backup. (Remember that a full database
backup includes a copy of all database pages.)
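
A differential backup uses the same statement with the DIFFERENTIAL option. The sketch below assumes that a full backup of the Jugel database already exists; the file path is hypothetical.

```sql
-- Differential backup: copies only what has changed since the last full backup.
-- C:\Backups\Jugel_diff.bak is an assumed path used for illustration.
BACKUP DATABASE Jugel
TO DISK = N'C:\Backups\Jugel_diff.bak'
WITH DIFFERENTIAL,
     NAME = N'Jugel differential backup';
```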

3. Transaction Log Backup


A transaction log backup considers only the changes recorded in the log. This form of backup is
therefore not based on physical parts (pages) of the database, but rather on logical operations—
that is, changes executed using the DML statements INSERT, UPDATE, and DELETE. Again,
because the amount of data is smaller, this process can be performed significantly quicker than a
full database backup and quicker than a differential backup.

There are two main reasons to perform a transaction log backup: first, to store the data that has
changed since the last transaction log backup or database backup on a secure medium; second
(and more importantly), to properly close the transaction log up to the beginning of the active
portion of it. (The active portion of the transaction log contains all uncommitted transactions.)
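
A transaction log backup is taken with the BACKUP LOG statement. The sketch below assumes that the Jugel database uses the full (or bulk-logged) recovery model and that a full database backup has already been taken; the file path is hypothetical.

```sql
-- Transaction log backup: saves the log records written since the previous log backup.
-- C:\Backups\Jugel_log.trn is an assumed path used for illustration.
BACKUP LOG Jugel
TO DISK = N'C:\Backups\Jugel_log.trn'
WITH NAME = N'Jugel transaction log backup';
```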

Using a full database backup and a valid chain of all closed transaction logs, it is possible to
propagate a database copy on a different computer. This database copy can then be used to
replace the original database in case of a failure. (The same scenario can be established using a
full database backup and the last differential backup.)

Database Engine does not allow you to store the transaction log in the same file in which the
database is stored. One reason for this is that if the file is damaged, the use of the transaction log
to restore all changes since the last backup will not be possible.

Using a transaction log to record changes in the database is a common feature used by nearly all
existing relational DBMSs. Nevertheless, situations may arise when it becomes helpful to switch
this feature off. For example, the execution of a heavy load can last for hours. Such a program
runs much faster when the logging is switched off. On the other hand, switching off the logging
process is dangerous, as it destroys the valid chain of transaction logs. To ensure the database
recovery, it is strongly recommended that you perform a full database backup after the successful
end of the load.

One of the most common system failures occurs because the transaction log is filled up. Be
aware that such a problem may cause a complete standstill of the system. If the storage used for
the transaction log fills up to 100 percent, the system must stop all running transactions until the
transaction log storage is freed again. This problem can only be avoided by making frequent
backups of the transaction log: each time you close a portion of the actual transaction log and
store it to a different storage media, that portion of the log becomes reusable, and the system thus
regains disk space.

Some differences between log backups and differential backups are worth noting. The benefit of
differential backups is that you save time in the restore process, because to recover a database
completely, you need a full database backup and only the latest differential backup. If you use
log backups for the same scenario (situation), you have to apply a full database backup and all
existing log backups to bring the database to a consistent state. A disadvantage of differential
backups is that you cannot use them to recover data to a specific point in time, because they do
not store intermediate changes to the database.

4. File or Filegroup Backup


File (or filegroup) backup allows you to back up specific database files (or filegroups) instead of
the entire database. In this case, Database Engine backs up only files you specify. Individual files
(or filegroups) can be restored from a database backup, allowing recovery from a failure that
affects only a small subset of the database files. You can use either a database backup or a
filegroup backup to restore individual files or filegroups. This means that you can use database
and transaction log backups as your backup procedure and still be able to restore individual files
(or filegroups) from the database backup.
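
As an illustration, the statement below backs up a single filegroup. PRIMARY is the default filegroup that every SQL Server database has; the database name Jugel and the file path are assumptions, and the sketch assumes the database uses the full recovery model.

```sql
-- Filegroup backup: only the files that belong to the PRIMARY filegroup are copied.
-- C:\Backups\Jugel_primary_fg.bak is an assumed path used for illustration.
BACKUP DATABASE Jugel
FILEGROUP = N'PRIMARY'
TO DISK = N'C:\Backups\Jugel_primary_fg.bak'
WITH NAME = N'Jugel PRIMARY filegroup backup';
```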

A differential backup and a transaction log backup both minimize the time required to
back up the database. But there is one significant difference between them: the transaction log
backup contains all changes of a row that has been modified several times since the last backup,
whereas a differential backup contains only the last modification of that row.

The database file backup is recommended only when a database that should be backed up is very
large and there is not enough time to perform a full database backup.

It does not make sense to back up a transaction log unless a full database backup has been
performed at least once.

4.5. Performing Backup


You can perform backup operations using SQL Server Management Studio or Transact-SQL. All types
of backup operations can be executed using two Transact-SQL statements:

• BACKUP DATABASE
• BACKUP LOG

Before we describe these two Transact-SQL statements, we will specify the existing types of
backup devices. Database engine allows you to back up databases, transaction logs, and files to
the following backup devices:

• Disk
• Tape

Disk files are the most common media used for storing backups. Disk backup devices can be
located on a server's local hard disk or on a remote disk on a shared network resource. Database
Engine allows you to append a new backup to a file that already contains backups from the same
or different databases. When a new backup set is appended to existing media, the previous contents
of the media remain intact, and the new backup is written after the end of the last backup on the
media. (A backup set includes all stored data of the object you chose to back up.) By default,
Database Engine always appends new backups to disk files.

Tape backup devices are generally used in the same way as disk devices. However, when you
back up to a tape, the tape drive must be attached locally to the system. The advantage of tape
devices in relation to disk devices is their simple administration and operation.

4.6. Backing Up Using SQL Server Management Studio

Before you can perform a database or transaction log backup, you must specify (or create)
backup devices. SQL Server Management Studio allows you to create disk devices and tape
devices in a similar manner. In both cases, expand the server, expand Server Objects, right-click
Backup Devices, and choose New Backup Device. In the Backup Device dialog box enter the
name of either the disk device (if you clicked File) or the tape device (if you clicked Tape). In
the former case, you can click the ... button on the right side of the field to display existing
backup device locations. In the latter case, if Tape cannot be activated, then no tape devices exist
on the local computer.
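
A disk backup device can also be created with Transact-SQL by using the sp_addumpdevice system stored procedure. The sketch below is an illustration only: the logical device name JugelBackupDevice and the physical file path are hypothetical.

```sql
-- Create a logical disk backup device that points to a physical file.
-- The device name and the path are assumed values used for illustration.
EXEC sp_addumpdevice
    @devtype = 'disk',
    @logicalname = 'JugelBackupDevice',
    @physicalname = 'C:\Backups\JugelDevice.bak';

-- The logical device name can then be used as the backup destination.
BACKUP DATABASE Jugel
TO JugelBackupDevice;
```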

Figure 4-1: The backup device dialog box

After you specify backup devices, you can do a database backup. Expand the server; expand
Databases, and right-click the database. After pointing to Tasks, choose Backup. The Backup
Database dialog box appears. On the General page of the dialog box, choose the backup type in
the Backup Type drop-down list (Full, Differential, or Transaction Log), enter the backup set
name in the Name field, and optionally enter a description of this set in the Description field. In
the same dialog box, you can choose an expiration date for the backup. In the Destination frame,
select an existing device by clicking Add. The Remove button allows you to remove one or
more backup devices from the list of devices to be used. (See the following diagram)

Figure 4-2: The backup database dialog box general page

On the Options page, to append to an existing backup on the selected device, click the Append
to the existing backup set radio button. Choosing the Overwrite all existing backup sets radio
button in the same frame overwrites any existing backups on the selected backup device.

Figure 4-3: The backup database dialog box, Option page

For verification of the database backup, click Verify backup when finished in the Reliability
frame. On the Options page, you can also choose to back up to a new media set by clicking
Back up to a new media set, and erase all existing backup sets and then entering the media set
name and description.
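
The same choices can be expressed in Transact-SQL. The sketch below is one possible equivalent of the dialog box options, not the exact script that Management Studio generates; the database name Jugel, the file path, and the description text are assumptions.

```sql
-- INIT overwrites all existing backup sets in the file;
-- NOINIT (the default) would append the new backup set instead.
BACKUP DATABASE Jugel
TO DISK = N'C:\Backups\Jugel_full.bak'
WITH INIT,
     NAME = N'Jugel full backup',
     DESCRIPTION = N'Full backup taken before the monthly report run';

-- Check that the backup set just written is complete and readable.
RESTORE VERIFYONLY
FROM DISK = N'C:\Backups\Jugel_full.bak';
```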

For creation and verification of a differential database backup or transaction log backup, follow
the same steps, but choose the corresponding backup type in the Backup type field on the
General page.

After all options have been selected, click OK. The database or the transaction log is then backed
up. The name, physical location, and type of the backup devices can be shown by selecting
the server, expanding the Server Objects folder, and finally expanding the Backup Devices
folder.

Scheduling Backups with Management Studio

A well-planned timetable for the scheduling of backup operations will help you avoid system
shortages when users are working. SQL Server Management Studio supports this planning by
offering an easy-to-use graphical interface for scheduling backups. Scheduling backups using
SQL Server Management Studio is explained in detail as follows:

Determining Which Databases to Back Up


The following databases should be backed up regularly:

• The master database
• All production databases

Backing up the master Database


The master database is the most important database of the system because it contains information
about all of the databases in the system. Therefore, you should back up the master database on a
regular basis. Additionally, you should back up the master database anytime certain statements
and stored procedures are executed, because Database Engine modifies the master database
automatically.

You can perform full database backups of the master database only. (The system does not
support differential, transaction log, and file backups for the master database.)

Without a backup of the master database, you must completely rebuild all system databases,
because if the master database is damaged, all references to the existing user-defined databases
are lost.

Backing up Production Databases


You should back up each production database on a regular basis. Additionally, you should back
up a production database at the following times:

• After creating it
• After creating indices
• After clearing the transaction log
• After performing non-logged operations

Always make a full backup of a database after it has been created, in case a failure occurs between
the creation of the database and the first regular database backup. Remember that backups of the
transaction log cannot be applied without a full database backup. Backing up the database after
creation of one or more indices saves time during the restore process, because the index
structures are backed up together with the data.

Backing up the transaction log after creation of indices does not save time during the restore
process at all, because the transaction log only records the fact that an index was created (and
does not record the modified index structure).

Backing up the database after clearing the transaction log is necessary because the transaction
log no longer contains a record of database activity, which is used to recover the database. All
operations that are not recorded to the transaction log are called non-logged operations.
Therefore, all changes made by these operations cannot be restored during the recovery process.

4.7. Database Recovery

Database recovery is the process of reconstructing the database after any kind of data loss. A major responsibility
of the database administrator is to prepare for the possibility of hardware, software, network,
process, or system failure. If such a failure affects the operation of a database system, you must
usually recover the database and return to normal operation as quickly as possible. Recovery
should protect the database and associated users from unnecessary problems and avoid or reduce
the possibility of having to duplicate work manually.

Recovery processes vary depending on the type of failure that occurred, the structures affected,
and the type of recovery that you perform. If no files are lost or damaged, recovery may amount
to no more than restarting an instance. If data has been lost, recovery requires additional steps.

4.7.1. Database Recovery Point

Recovery to a database recovery point (also called point-in-time recovery) restores the database
from backups taken prior to the target time for recovery, then uses incremental backups and redo
information to roll the database forward to the target time. It is sometimes called incomplete
recovery because it does not use all of the available redo or completely recover all changes to
your database.
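
In SQL Server, recovery to a point in time is typically done by restoring a full backup and then applying transaction log backups with the STOPAT option. The sketch below assumes the full recovery model; the database name Jugel, the file paths, and the target time are hypothetical values.

```sql
-- Step 1: restore the full backup taken before the target time.
-- NORECOVERY leaves the database ready to accept further restores.
-- REPLACE overwrites the existing Jugel database, so use it with care.
RESTORE DATABASE Jugel
FROM DISK = N'C:\Backups\Jugel_full.bak'
WITH NORECOVERY, REPLACE;

-- Step 2: roll the database forward from the log backup, stopping at the target time.
-- RECOVERY brings the database online after the last restore step.
RESTORE LOG Jugel
FROM DISK = N'C:\Backups\Jugel_log.trn'
WITH STOPAT = N'2024-01-15T10:30:00',
     RECOVERY;
```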

4.7.2. Recovery Methods

The following methods show how to check which recovery model a database is using.

Method 1

Right-click the database, go to Properties, and then go to Options. The recovery model is shown
on the right side.

Method 2

Click on the Database Node in Object Explorer. In Object Explorer Details, you can see the
column Recovery Model. See figure 4-4.

Figure 4-4: Recovery model details

Method 3

This is a very easy method: a single query lists the recovery model of every database. See figure 4-5.

SELECT name AS [Database Name],
recovery_model_desc AS [Recovery Model]
FROM sys.databases
GO

Figure 4-5: Database recovery method displaying all the databases

Method 4

This method provides only one database at a time.

SELECT 'Jugel' AS [Database Name],
DATABASEPROPERTYEX('Jugel', 'RECOVERY')
AS [Recovery Model]
GO

Figure 4-6: Database recovery method displaying a single database

4.7.3. Recovery Techniques:

1. Salvation program: Run after a crash to attempt to restore the system to a valid state. No
recovery data used. Used when all other techniques fail or were not used. Good for cases
where buffers were lost in a crash and one wants to reconstruct what was lost.
2. Incremental dumping: Modified files copied to archive after job completed or at intervals.
3. Audit trail: Sequences of actions on files are recorded. Optimal for "backing out" of
transactions. (Ideal if trail is written out before changes).
4. Differential files: Separate file is maintained to keep track of changes, periodically merged
with the main file.
5. Backup/current version: Present files form the current version of the database. Files
containing previous values form a consistent backup version.
6. Multiple copies: Multiple active copies of each file are maintained during normal operation
of the database. In cases of failure, comparison between the versions can be used to find a
consistent version.
7. Careful replacement: Nothing is updated in place, with the original only being deleted after
operation is complete.

4.7.4. Disaster Recovery Plan and Procedures
Disaster recovery: every organization requires contingency plans for dealing with disasters that
may severely damage or destroy its data center. Such disasters may be natural, for example
floods, earthquakes, tornadoes, and hurricanes. Preventing a natural disaster is very difficult, but
good planning that includes mitigation measures can help reduce or avoid
losses. Disasters may also be man-made, for example wars, infrastructure failure, sabotage, and
terrorist attacks.

Planning for disaster recovery is an organization-wide responsibility. Database administration is
responsible for developing plans for recovering the organization's data and for restoring data
operations. The following are some of the major components of a recovery plan.

• Develop a detailed, written disaster recovery plan. Schedule regular tests of the plan.
• Choose and train a multidisciplinary team to carry out the plan.
• Establish a backup data center at an off-site location. This site must be located at a
sufficient distance from the primary site so that no foreseeable disaster will disrupt both
sites. If an organization has two or more data centers, each site may serve as a backup for
one of the others. If not, the organization may contract with a disaster recovery service
provider.
• Send backup copies of database to the back-up data center on a scheduled basis. Database
backups may be sent to the remote site by courier or transmitted by replication software.

The only way to ensure rapid recovery from a system failure or other disaster is to plan carefully.
You must have a set plan with detailed procedures. Whether you are implementing a standby
database or you have a single database system, you must have a plan for what to do in the event
of a catastrophic failure.

A disaster recovery plan is a document that defines the policies and procedures for dealing with
various types of disasters that can affect an organization; especially the organization’s IT
(information technology) infrastructures. A disaster is an event that has a significant impact on
an enterprise’s ability to conduct normal business. This plan includes the information and
procedure needed to resume an organization’s operation after some sort of disasters (e.g. loss of
a server).

Disaster recovery is all about risk management. The cost of ignoring disasters can be very high,
including total collapse of the enterprise. The first step is to understand the risks your enterprise
faces. This is often called a business impact analysis. You need to answer the following
questions:

• What are the most critical functions or systems for your organization?
• What would be the impact if these were severely interrupted?

You can't document every type of disaster that might befall an organization's servers and
networks. Make sure your plan includes some general policy guidelines to cover any cases not
specifically mentioned. Some types of disasters you should specifically plan for include:

• Physical break-ins: theft and/or destruction, terrorist attacks
• Remote attacks: attempts to steal, destroy, or corrupt data, theft of service, denial
of service (DoS), computer service.
• Hardware failure: servers, databases, networks, power outages
• Environmental disaster: fire, flood, hurricane, etc. (generally all these result in
power outages too)
• Accident (human error): file loss, DB record loss, data corruption
• Other disruption: disgruntled employees, organized criminal activity, strikes, legal
actions (e.g. shutdown order), etc.

There are a number of techniques that can be used to reduce or eliminate the probability of some
disasters. Of course you can’t completely eliminate the risk of disasters but you can reduce
disasters by using the following techniques:

• Keep paper copies of vital data
• Keep information (contact information, passwords, etc.) current
• Use anti-virus and malware removal software
• Use and regularly test UPS, fire and smoke sensors and alarms, anti-theft systems,
etc.

4.7.5. Performing a Database Recovery

Whenever a transaction is submitted for execution, Database Engine is responsible either for
executing the transaction completely and recording its changes permanently in the database or
for guaranteeing that the transaction has no effect at all on the database. This approach ensures
that the database is consistent in case of a failure, because failures do not damage the database
itself, but instead affect transactions that are in progress at the time of the failure. Database
Engine supports both automatic and manual recovery, which are discussed next in turn.

Automatic Recovery

Automatic recovery is a fault-tolerant feature that Database Engine executes every time it is
restarted after a failure or shutdown. The automatic recovery process checks to see if the
restoration of databases is necessary. If it is, each database is returned to its last consistent state
using the transaction log.

During automatic recovery, Database Engine examines the transaction log from the last
checkpoint to the point at which the system failed or was shut down. (A checkpoint is the most
recent point at which all data changes are written permanently to the database from memory.
Therefore, a checkpoint ensures the physical consistency of the data.) The transaction log
contains committed transactions (transactions that are successfully executed, but their changes
have not yet been written to the database) and uncommitted transactions (transactions that are not
successfully executed before a shutdown or failure occurred).

Database Engine rolls forward all committed transactions, thus making permanent changes to the
database, and undoes the part of the uncommitted transactions that occurred before the
checkpoint.

Database Engine first performs the automatic recovery of the master database, followed by the
recovery of all other system databases. Then, all user-defined databases are recovered.

Manual Recovery

A manual recovery of a database specifies the application of the full backup of your database and
subsequent application of all transaction logs in the sequence of their creation. (Alternatively,
you can use the full database backup together with the last differential backup of the database.)
After this, the database is in the same (consistent) state as it was at the point when the transaction
log was backed up for the last time.
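
The order in which backups are applied during a manual recovery can be sketched as follows. The Backup class and the restore_chain function are hypothetical names used only for illustration, and the sketch assumes the common practice of combining both approaches: the latest full backup, then the last differential backup (if one exists), then every transaction log backup taken after that point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    kind: str   # "full", "differential", or "log"
    taken: int  # hypothetical sequence number; larger means later


def restore_chain(backups):
    """Return the backups to apply, in the order they must be applied."""
    ordered = sorted(backups, key=lambda b: b.taken)
    fulls = [b for b in ordered if b.kind == "full"]
    if not fulls:
        raise ValueError("a full backup is required to start a restore")
    base = fulls[-1]
    chain = [base]
    start = base.taken
    # Only the most recent differential taken after the full backup is needed.
    diffs = [b for b in ordered
             if b.kind == "differential" and b.taken > base.taken]
    if diffs:
        chain.append(diffs[-1])
        start = diffs[-1].taken
    # Log backups are applied in the sequence of their creation.
    chain += [b for b in ordered if b.kind == "log" and b.taken > start]
    return chain


backups = [Backup("full", 1), Backup("log", 2), Backup("differential", 3),
           Backup("log", 4), Backup("log", 5)]
print([(b.kind, b.taken) for b in restore_chain(backups)])
# [('full', 1), ('differential', 3), ('log', 4), ('log', 5)]
```

The log backup numbered 2 is skipped because the differential backup already contains its changes.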

When you recover a database using a full database backup, Database Engine first re-creates all
database files and places them in the corresponding physical locations. After that, the system re-
creates all database objects.

Database Engine can process certain forms of recovery dynamically (in other words, while an
instance of the database system is running). Dynamic recovery improves the availability of the
system, because only the data being restored is unavailable. Dynamic recovery allows you to
restore either an entire database file or a file group. Microsoft calls dynamic recovery “online
restore.”

Is my backup set ready for recovery?

After executing the BACKUP statement, the selected device (tape or disk) contains all data of
the object you chose to back up. The stored data is called a backup set. Before you start a
recovery process, you should be sure that

• The backup set contains the data you want to restore
• The backup set is usable

Database Engine supports a set of Transact-SQL statements that allows you to confirm that the
backup set is usable and contains the proper data. The following four statements, among others,
belong to this set:

• RESTORE LABELONLY
• RESTORE HEADERONLY
• RESTORE FILELISTONLY
• RESTORE VERIFYONLY

RESTORE LABELONLY. This statement is used to display the header information of the
media (disk or tape) used for a backup process. The output of the RESTORE LABELONLY
statement is a single row that contains the summary of the header information (name of the
media, description of the backup process, and date of a backup process).

RESTORE LABELONLY reads just the media header, so use this statement if you want to get a
quick look at what your backup set contains.

RESTORE HEADERONLY. Whereas the RESTORE LABELONLY statement gives you
concise information about the header file of your backup device, the RESTORE HEADERONLY
statement gives you information about backups that are stored on a backup device. This
statement displays a one-line summary for each backup on a backup device. In contrast to
RESTORE LABELONLY, using RESTORE HEADERONLY can be time consuming if the
device contains several backups. The output of RESTORE HEADERONLY contains a
Compressed column, which tells you whether the backup file is compressed (value 1) or not
(value 0).

RESTORE FILELISTONLY. The RESTORE FILELISTONLY statement returns a result set
with a list of the database and log files contained in the backup set. You can display information
about only one backup set at a time. For this reason, if the specified backup device contains
several backups, you have to specify the position of the backup set to be processed.

You should use RESTORE FILELISTONLY if you do not know exactly which backup sets
exist or where the files of a particular backup set are stored. In both cases, you can check all or
part of the devices to build an overall picture of the existing backups.

RESTORE VERIFYONLY After you have found your backup, you can do the next step: verify
the backup without using it for the restore process. You can do the verification with the
RESTORE VERIFYONLY statement, which checks the existence of all backup devices (tapes or
files) and whether the existing information can be read.
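
As a rough illustration, the four checks can be generated for one backup file with a small Python helper. The helper name verification_statements and the file path are hypothetical; the Transact-SQL text it produces uses the FROM DISK and WITH FILE clauses, where WITH FILE gives the position of the backup set on the device. The statements still have to be run against the server with a tool of your choice.

```python
def verification_statements(backup_path, position=1):
    """Build the four RESTORE ...ONLY statements for one backup file.

    backup_path: path of the backup file on disk (hypothetical example below).
    position:    position of the backup set on the device, starting at 1.
    """
    if position < 1:
        raise ValueError("position must be 1 or greater")
    # Double any single quote so the path is a valid T-SQL string literal.
    device = "DISK = N'" + backup_path.replace("'", "''") + "'"
    return [
        f"RESTORE LABELONLY FROM {device};",
        f"RESTORE HEADERONLY FROM {device};",
        f"RESTORE FILELISTONLY FROM {device} WITH FILE = {position};",
        f"RESTORE VERIFYONLY FROM {device} WITH FILE = {position};",
    ]


for statement in verification_statements(r"C:\backup\clinic.bak", position=2):
    print(statement)
```

Running the checks in this order follows the text: look at the media label first, list the backups on the device, inspect the files of the chosen backup set, and only then verify that it can be read.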

Activities/exercise

1. What does backup mean?
2. Describe the use of backup and recovery.
3. Briefly identify the backup media.
4. What causes data loss?
5. Explain the difference between a transaction log backup and a differential backup.
6. What is the difference between backup and recovery?
7. List techniques to recover the database.
8. What is the use of a disaster recovery plan?

Topic 5: Data and Database Administration

5.1  Introduction  
Once a database is created, the next step is to ensure proper utilization of the data. For this, data
administration is the main task to be handled by the data administrator. The database
administrator has to know his or her roles and responsibilities before engaging in the task. One of
the major issues that needs close attention in database administration is ensuring the security of
the data. All kinds of threats that pose a risk to the data need to be identified, and proper care
should be taken. These and other related issues are discussed one by one in this topic.

5.2 Learning Objectives


At the end of this topic the student should be able to:
• Define database administration, database administrator, and database security
• State the basic functions and roles of a database administrator
• Define data integrity, availability, vulnerability, and confidentiality

Data Administration: A high-level function that is responsible for the overall management of
data resources in an organization, including maintaining corporate-wide definitions and
standards

Database Administration: A technical function that is responsible for physical database design
and for dealing with technical issues such as security enforcement, database performance, and
backup and recovery

Essentially, the main role of a database administrator has to do with overseeing the installation
and ongoing function of software on a system designed for use by a number of users. There are
several specific responsibilities that the typical database administrator will perform in just about
any corporate environment.

The basic responsibilities of just about every database administrator include the following:

• The installation of new databases. As part of the database installation, the database
administrator will set up login credentials for authorized persons, define the
privileges associated with each authorized user, and ensure that every workstation
attached to the network is set up to access the new database. This process usually
involves a period of troubleshooting, in which the database administrator will
address and resolve any problems that users experience with the new product.
• Database administrators often handle the process of creating backup records of the
information contained in the databases on the system. This involves more than
setting up an automatic backup and assuming that the backup is proceeding
according to plan. The competent database administrator will check the backup files
to make sure the information is complete, the integrity of the data is secure, and that
the saved files can easily be accessed and loaded in the event that something
happens to the main database.
• With just about all software, new releases and upgrades are made available from
time to time. The database administrator will be aware of any new versions or
upgrades to existing versions that could improve the efficiency of a currently
installed database.
• Generally, a database administrator is authorized to upload free upgrades and install
them at will. In the event that a new version is available, the administrator may
work with others in the company to determine if the cost of replacing the existing
database software is worth the investment.

Traditional Database Administration Functions can be summarized as


• Data policies, procedures, standards
• Planning
• Data conflict (ownership) resolution
• Internal marketing of DA concepts
• Managing the data repository
• Selection of hardware and software
• Installing/upgrading DBMS
• Tuning database performance
• Improving query processing performance
• Managing data security, privacy, and integrity
• Data backup and recovery

5.3 Functions and Roles of Data/Database Administrator
As we discussed earlier in this topic, the database administrator (DBA) is the central authority
for managing a database system. The DBA’s responsibilities include granting privileges to users
who need to use the system and classifying users and data in accordance with the policy of the
organization. The DBA has a DBA account in the DBMS, sometimes called a system or
superuser account, which provides powerful capabilities that are not made available to regular
database accounts and users. DBA privileged commands include commands for granting and
revoking privileges to individual accounts, users, or user groups and for performing the
following types of actions:

1. Account creation: This action creates a new account and password for a user or a
group of users to enable them to access the DBMS.
2. Privilege granting: This action permits the DBA to grant certain privileges to certain
accounts.
3. Privilege revocation: This action permits the DBA to revoke (cancel) certain privileges
that were previously given to certain accounts.
4. Security level assignment: This action consists of assigning user accounts to the
appropriate security classification level.
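
The four types of actions listed above can be sketched as a small in-memory model in Python. The class AccessControl, its method names, and the sample account, table, and level names are hypothetical; a real DBMS performs these actions through its own privileged commands.

```python
class AccessControl:
    """Toy model of the DBA's privileged actions (illustrative only)."""

    def __init__(self):
        self.privileges = {}  # account -> set of (privilege, object) pairs
        self.levels = {}      # account -> security classification level

    def create_account(self, account):
        # 1. Account creation
        if account in self.privileges:
            raise ValueError("account already exists")
        self.privileges[account] = set()

    def grant(self, account, privilege, obj):
        # 2. Privilege granting
        self.privileges[account].add((privilege, obj))

    def revoke(self, account, privilege, obj):
        # 3. Privilege revocation (no error if it was never granted)
        self.privileges[account].discard((privilege, obj))

    def assign_level(self, account, level):
        # 4. Security level assignment
        if account not in self.privileges:
            raise KeyError(account)
        self.levels[account] = level

    def is_allowed(self, account, privilege, obj):
        return (privilege, obj) in self.privileges.get(account, set())


dba = AccessControl()
dba.create_account("nurse01")
dba.grant("nurse01", "SELECT", "patient")
dba.assign_level("nurse01", "confidential")
print(dba.is_allowed("nurse01", "SELECT", "patient"))  # True
dba.revoke("nurse01", "SELECT", "patient")
print(dba.is_allowed("nurse01", "SELECT", "patient"))  # False
```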

5.4 Access Protection, User Accounts, and Database Audits


Whenever a person or a group of persons needs to access a database system, the individual or
group must first apply for a user account. The DBA will then create a new account number and
password for the user if there is a legitimate need to access the database. The user must log into
the DBMS by entering the account number and password whenever database access is needed.
The DBMS checks that the account number and password are valid; if they are, the user is
permitted to use the DBMS and to access the database. Application programs can also be
considered as users and can be required to supply passwords.

It is straightforward to keep track of database users and their accounts and passwords by creating
an encrypted table or file with the two fields Account Number and Password. This table can
easily be maintained by the DBMS. Whenever a new account is created, a new record is inserted
into the table. When an account is canceled, the corresponding record must be deleted from the
table.
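
A minimal sketch of such an account table is shown below. It makes one assumption that goes beyond the text: instead of encrypting the whole table, each password is stored as a salted hash (PBKDF2 from the Python standard library), which is a common way to avoid keeping readable passwords. The class name AccountTable, its methods, and the sample account number are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch


class AccountTable:
    """Table with the two fields Account Number and Password (hashed)."""

    def __init__(self):
        self._rows = {}  # account number -> (salt, password digest)

    @staticmethod
    def _digest(password, salt):
        return hashlib.pbkdf2_hmac(
            "sha256", password.encode("utf-8"), salt, ITERATIONS)

    def create(self, account_number, password):
        # A new record is inserted when an account is created.
        if account_number in self._rows:
            raise ValueError("account already exists")
        salt = os.urandom(16)
        self._rows[account_number] = (salt, self._digest(password, salt))

    def cancel(self, account_number):
        # The record is deleted when the account is canceled.
        del self._rows[account_number]

    def login(self, account_number, password):
        # The DBMS checks that the account number and password are valid.
        row = self._rows.get(account_number)
        if row is None:
            return False
        salt, stored = row
        return hmac.compare_digest(stored, self._digest(password, salt))


accounts = AccountTable()
accounts.create("A1001", "s3cret!")
print(accounts.login("A1001", "s3cret!"))  # True
print(accounts.login("A1001", "wrong"))    # False
accounts.cancel("A1001")
print(accounts.login("A1001", "s3cret!"))  # False
```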

The database system must also keep track of all operations on the database that are applied by a
certain user throughout each log-in session, which consists of the sequence of database
interactions that a user performs from the time of logging in to the time of logging off. When a
user logs in, the DBMS can record the user’s account number and associate it with the terminal
from which the user logged in. All operations applied from that terminal are attributed to the
user’s account until the user logs off. It is particularly important to keep track of update
operations that are applied to the database so that, if the database is tampered with, the DBA can
find out which user did the tampering.

To keep a record of all updates applied to the database and of the particular user who applied
each update, we can modify the system log. The system log includes an entry for each operation
applied to the database that may be required for recovery from a transaction failure or system
crash. We can expand the log entries so that they also include the account number of the user and
the on-line terminal ID that applied each operation recorded in the log. If any tampering with the
database is suspected, a database audit is performed, which consists of reviewing the log to
examine all accesses and operations applied to the database during a certain time period. When
an illegal or unauthorized operation is found, the DBA can determine the account number used to
perform this operation. Database audits are particularly important for sensitive databases that are
updated by many transactions and users, such as a banking database that is updated by many
bank tellers. A database log that is used mainly for security purposes is sometimes called an
audit trail.
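
The expanded log entries and the audit described above might look like the following sketch. The names LogEntry and AuditTrail, the integer timestamps, and the sample teller accounts are hypothetical and chosen only to keep the example small.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    time: int        # when the operation was applied (simplified to an integer)
    account: str     # account number of the user
    terminal: str    # terminal ID from which the user logged in
    operation: str   # for example "UPDATE" or "SELECT"
    target: str      # table or item the operation was applied to


class AuditTrail:
    def __init__(self):
        self._entries = []

    def record(self, time, account, terminal, operation, target):
        self._entries.append(
            LogEntry(time, account, terminal, operation, target))

    def audit(self, start, end, target=None):
        """Return all entries in the period, optionally for one target."""
        return [e for e in self._entries
                if start <= e.time <= end
                and (target is None or e.target == target)]


trail = AuditTrail()
trail.record(10, "teller07", "T-3", "SELECT", "accounts")
trail.record(12, "teller02", "T-1", "UPDATE", "accounts")
trail.record(30, "teller07", "T-3", "UPDATE", "loans")

# Who updated the accounts table between time 0 and time 20?
suspects = [e.account for e in trail.audit(0, 20, target="accounts")
            if e.operation == "UPDATE"]
print(suspects)  # ['teller02']
```

Because each entry carries both the account number and the terminal ID, the DBA can trace a suspicious update back to the account that was used to perform it.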

5.5 Basics of Security


5.5.1. Security Concepts
Database security, in its broad sense, deals with the protection of data against accidental or
intentional loss, destruction, or misuse. This task has become more difficult with the spread of
Internet access and client/server technologies.

Database security is a very broad area that addresses many issues, including the following:

• Legal and ethical issues regarding the right to access certain information. Some
information may be deemed to be private and cannot be accessed legally by unauthorized
persons.
• Policy issues at the governmental, institutional, or corporate level as to what kinds of
information should not be made publicly available—for example, credit ratings and
personal medical records.
• System-related issues such as the system levels at which various security functions
should be enforced—for example, whether a security function should be handled at the
physical hardware level, the operating system level, or the DBMS level.
• The need in some organizations to identify multiple security levels and to categorize the
data and users based on these classifications—for example, top secret, secret,
confidential, and unclassified. The security policy of the organization with respect to
permitting access to various classifications of data must be enforced.

5.5.2. Managing Data Security

The goal of database security is the protection of data from accidental or intentional threats to
their integrity and access. The database environment has grown more complex, with distributed
databases located on client/server architectures and personal computers as well as on mainframes.
Access to data has become more open through the Internet and corporate intranets and from mobile
computing devices. As a result, managing data security effectively has become more difficult and
time-consuming.

Because data are a critical resource, all persons in an organization must be sensitive to security
threats and take measures to protect the data within their domains. For example, computer
listings or computer disks containing sensitive data should not be left unattended on desktops.
Data administration is often responsible for developing overall policies and procedures to protect
databases. Database administration is typically responsible for administering database security
on a daily basis. What are the potential threats to data security? They are discussed in the next
section.

5.5.3. Threats to Data Security

Threats to data security may be direct threats to the database. For example, those
who gain unauthorized access to a database may then browse, change, or even steal
the data to which they have gained access. Focusing on database security alone, however, will
not ensure a secure database. All parts of the system must be secure, including the
database, the network, the operating system, the building(s) in which the database
resides physically, and the personnel who have any opportunity to access the system.

Figure 5-1: Possible locations of data security threats

Figure 5-1 diagrams many of the possible locations for data security threats.
Accomplishing this level of security requires careful review, establishment of security
procedures and policies, and implementation and enforcement of those procedures
and policies. The following threats must be addressed in a comprehensive data security plan:

• Accidental losses, including human error, software, and hardware-caused breaches.
Examples of actions that may be taken to address threats from accidental losses include
establishing operating procedures such as user authorization, uniform software
installation procedures, and hardware maintenance schedules. As in any effort that
involves human beings, some losses are inevitable, but carefully planned policies and
procedures should reduce the amount and severity of losses. Of potentially more serious
consequence are the threats that are not accidental.

• Theft and fraud. These activities are going to be perpetrated by people, quite possibly
through electronic means, and may or may not alter data. Attention here
should focus on each possible location shown in Figure 5-1. For example, physical
security must be established so that unauthorized persons are unable to gain
access to rooms where computers, servers, telecommunications facilities, or computer
files are located. Physical security should also be provided for employee
offices and any other locations where sensitive data are stored or easily accessed.
Establishment of a firewall to protect against unauthorized access to inappropriate parts
of the database through outside communication links is another example of a
security procedure that will hamper people who are intent on theft or fraud.

• Loss of privacy or confidentiality. Loss of privacy is usually taken to mean loss of
protection of data about individuals, whereas loss of confidentiality is usually
taken to mean loss of protection of critical organizational data that may have
strategic value to the organization. Failure to control privacy of information
may lead to blackmail, bribery, public embarrassment, or stealing of user passwords.
Failure to control confidentiality may lead to loss of competitiveness.
State and federal laws now exist to require some types of organizations to create and
communicate policies to ensure privacy of customer and client data. Security mechanisms
must enforce these policies, and failure to do so can mean significant financial and
reputation loss.

• Loss of data integrity. When data integrity is compromised, data will be invalid or
corrupted. Unless data integrity can be restored through established back-up and recovery
procedures, an organization may suffer serious losses or make incorrect and expensive
decisions based on the invalid data.
• Loss of availability. Sabotage of hardware, networks, or applications may cause the data
to become unavailable to users, which again may lead to severe operational difficulties.
This category of threat includes the introduction of viruses intended to corrupt data or
software or to render the system unusable. It is important to counter this threat by always
installing the most current antivirus software, as well as educating employees on the
sources of viruses.

5.5.4. Database Software Data Security Features

A comprehensive data security plan will include establishing administrative policies and
procedures, physical protections, and data management software protections. Physical
protections include securing data centers and work areas, disposing of obsolete media, and
protecting portable devices from theft. Administrative policies and procedures are
discussed later in this topic. All the elements of a data
security plan work together to achieve the desired level of security. Some industries, for example
health care, have regulations that set standards for the security plan and, hence, put requirements
on data security. The most important security features of data management software follow:

1. Views or subschemas, which restrict user views of the database
2. Domains, assertions, checks, and other integrity controls defined as database objects,
which are enforced by the DBMS during database querying and updating
3. Authorization rules, which identify users and restrict the actions they may take against a
database
4. User-defined procedures, which define additional constraints or limitations in using a
database
5. Encryption procedures, which encode data in an unrecognizable form
6. Authentication schemes, which positively identify persons attempting to gain access to a
database
7. Back-up capabilities, which facilitate recovery procedures
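
The first two features in the list can be tried out with a short sketch. It uses SQLite through Python's standard sqlite3 module purely for convenience; the patient table, the patient_public view, and the sample rows are hypothetical. SQLite has no user accounts or authorization rules, so the view here only shows how a restricted set of columns is exposed, not how access to the base table would be denied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    -- Integrity control: the DBMS rejects ages outside this range.
    age       INTEGER NOT NULL CHECK (age BETWEEN 0 AND 130),
    diagnosis TEXT
);
-- View: exposes only the non-sensitive columns of the table.
CREATE VIEW patient_public AS
    SELECT id, name FROM patient;
""")

conn.execute(
    "INSERT INTO patient (name, age, diagnosis) VALUES (?, ?, ?)",
    ("Patient A", 42, "malaria"),
)

# Columns visible through the view (the diagnosis column is hidden).
cursor = conn.execute("SELECT * FROM patient_public")
public_columns = [column[0] for column in cursor.description]
print(public_columns)  # ['id', 'name']

# An update that violates the CHECK constraint is refused by the DBMS.
try:
    conn.execute(
        "INSERT INTO patient (name, age, diagnosis) VALUES (?, ?, ?)",
        ("Patient B", -5, None),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```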

5.5.5. Security Policies and Procedures

We have described numerous features of data management software that organizations should
use to secure their databases and other computing resources. Organizations must also establish
administrative policies and procedures that serve as a context for effectively implementing these
measures. Four types of security policies and procedures are the following: personnel controls,
physical access controls, maintenance controls, and data privacy controls.

Personnel Controls. Adequate controls of personnel must be developed and followed, for the
greatest threat to business security is often internal rather than external. In addition to the
security authorization and authentication procedures just discussed, organizations should develop
procedures to ensure a selective hiring process that validates potential employees' representations
about their backgrounds and capabilities. Personnel should be monitored to ensure that they are
following established practices, taking regular vacations, working with other employees, and so
forth. Employees should be trained in those aspects of security and quality that
are relevant to their jobs and encouraged to be aware of and follow standard security and data
quality measures. Standard job controls, such as separating duties so no one employee has
responsibility for an entire business process or keeping application developers from having
access to production systems, should also be enforced.

Physical Access Controls. Limiting access to particular areas within a building is usually a part
of controlling physical access. Swipe or proximity access cards can be used to gain access to
secure areas, and each access can be recorded in a database with timestamps. Sensitive
equipment, including hardware and peripherals, such as printers (which may be used to print
classified reports) can be controlled by placement in secure areas. Other equipment may be
locked to a desk or cabinet or may have an alarm attached. Back-up data tapes should be kept in
fireproof data safes and/or kept off-site at a safe location.

Maintenance Controls. An area of control that helps to maintain data quality and availability but
that is often overlooked is maintenance control. Organizations should review external
maintenance agreements for all hardware and software they are using to ensure that appropriate
response rates are agreed to for maintaining system quality and availability. It is also important
to consider reaching agreements with the developers of all critical software so that the
organization can get access to source code should the developer go out of business or stop
supporting the programs. Controls should be in place to protect data from inappropriate access
and use by outside maintenance staff and other contract workers.

Data Privacy Controls. Information privacy legislation generally gives individuals the right to
know what data have been collected about them and to correct any errors in those data. As the
amount of data exchanged continues to grow, the need is also growing to develop adequate data
protection. Also important are adequate provisions to allow the data to be used for legitimate
legal purposes so that organizations that need the data can access them and rely on their quality.
Individuals need to be given the opportunity to state with whom data retained about them may be
shared, and then these wishes must be enforced; enforcement is more reliable if access rules
based on privacy wishes are developed by the DBA staff and handled by the DBMS.

Activities/Exercises

1. Distinguish among vulnerability, threat, and control.
2. List the three goals of database security.
3. Read about computer viruses and present what you learn to your classmates.
4. List and explain the four elements of an information security policy.
5. List the basic tasks of database administration.
6. List and discuss security policies and procedures.
