
Name: __________________________ Course & year: _____________ Date of Submission: ____________

Course Code: COMP 211 Descriptive Title: Information Management

Lesson 2. Characteristics and Value of Information

Introduction

Data are raw facts, comparable to raw material. On their own, data items are
not interrelated and do not help in decision making. Data can be defined as
groups of non-random symbols, in the form of text, images or voice, that
represent quantities, actions and objects.
Information : Information is the product of data processing.
It is interrelated data, equivalent to the finished
goods produced after processing the raw material.
Information has value in decision making. It brings clarity and creates an intelligent human response in the mind.
According to Davis and Olson : “Information is data that has been processed into a form that is meaningful to the recipient and is of
real or perceived value in the current or the prospective action or decision of the recipient.”
Information is one of the most critical resources of an organization. Managing information means managing the future. Information is knowledge that one
derives from facts placed in the right context with the purpose of reducing uncertainty.

Learning Objectives:

At the end of this lesson, the students will be able to:


1. Define the concept of Information;
2. Discuss the Characteristics of Information;

LESSON PROPER

A. Characteristics of Information

The parameters of good quality are difficult to determine for information. Quality of information refers to its fitness for use, or its
reliability. The following are its essential characteristics:

i) Timeliness : Timeliness means that information must reach the recipients within the prescribed timeframes. For effective
decision-making, information must reach the decision-maker at the right time, i.e. recipients must get information when they
need it. Delays destroy the value of information. To be effective, timely information should also be up-to-date, i.e. current.
ii) Accuracy : Information should be accurate, meaning that it is clear and free from mistakes and errors.
Accuracy also means that the information is free from bias. Wrong information given to management would result in wrong
decisions. As managers' decisions are based on the information supplied in MIS reports, all managers need accurate
information.
iii) Relevance : Information is said to be relevant if it answers, specifically for the recipient, the questions what, why, where, when, who and
how. In other words, the MIS should give managers reports that are useful and that help them to make
decisions.
iv) Adequacy : Adequacy means information must be sufficient in quantity, i.e. the MIS must provide reports containing
the information required in the decision-making process. A report should not give inadequate or, for that
matter, more than adequate information, either of which may create a difficult situation for the decision-maker. Whereas inadequacy of
information leads to crises, information overload results in chaos.
v) Completeness : The information which is given to a manager must be complete and should meet all his needs. Incomplete
information may result in wrong decisions and thus may prove costly to the organization.
vi) Explicitness : A report is said to be of good quality if it does not require further analysis by the recipients for decision
making.
vii) Impartiality : Impartial information contains no bias and has been collected without any distorted view of the situation.

Assessment

1. Differentiate data and information.

____________________________________________________________________________________________________
____________________________________________________________________________________________________
____________________________________________________________________________________________________
____________________________________________________________________________________________________
___________________________________________________________________________.

2. Which is more appropriate to use when developing software? Why?


____________________________________________________________________________________________________
____________________________________________________________________________________________________
____________________________________________________________________________________________________
____________________________________________________________________________________________________
___________________________________________________________________________.

Lesson 3. Computer-Based Information System

Introduction

The IT architecture and IT infrastructure provide the basis for all information systems in
the organization. An information system (IS) collects, processes, stores, analyzes, and
disseminates information for a specific purpose. A computer-based information system
(CBIS) is an information system that uses computer technology to perform some or all
of its intended tasks. Although not all information systems are computerized, most are.
For this reason, the term “information system” is typically used synonymously with
“computer based information system.”

Learning Objectives:

At the end of this lesson, the students will be able to:


1. Understand the concept of computer-based information systems;
2. Discuss the different application programs used in computer-based IS;

LESSON PROPER
A. Major Capabilities of Computer-Based Information Systems
• Perform high-speed, high-volume numerical computations.
• Provide fast, accurate communication and collaboration within and among organizations.
• Store huge amounts of information in an easy-to-access, yet small, space.
• Allow quick and inexpensive access to vast amounts of information, worldwide.
• Facilitate the interpretation of vast amounts of data.
• Increase the effectiveness and efficiency of people working in groups in one place or in several locations, anywhere.
• Automate both semiautomatic business processes and manual tasks.
B. Application Programs

An application program is a computer program designed to support a
specific task or process. Each functional area or department within a
business organization employs dozens of application programs. Note that
application programs are synonymous with applications.

For instance, the human resources department sometimes uses one
application for screening job applicants and another for monitoring
employee turnover. The collection of application programs in a single
department is usually referred to as a departmental information system.
For example, the collection of application programs in the human
resources area is called the human resources information system (HRIS).
One can see in Figure 34 how a variety of applications enables
Commerce Bank to successfully serve its customers.
Creating the IT architecture is a cyclical process, which is driven by the
business architecture. Business architecture describes organizational
plans, visions, objectives and problems, and the information required to
support them.
The potential users of IT must play a critical role in the creation of
business architecture, in order to have both business architecture and IT
architecture that meets the organization’s long-term needs. We can use the architecture of a house as an analogy. When preparing a
conceptual high-level drawing of a house, the architect needs to know the requirements of the dwellers and the building constraints
(time, money, materials, etc.). In preparing IT architecture, the designer needs similar information. This initial information is contained in
the business architecture.

Once the business architecture is finished, the system developer can start a five-step process of building the IT architecture, as shown
in Figure 35. Notice that translating the business objectives into IT architecture can be a very complex undertaking. Let us look now at
various basic elements of IT architecture.

C. Managing Information Resources

Information resources is a general term that includes all the hardware, software (information systems and applications), data, and
networks in an organization. In addition to the computing resources, numerous applications exist, and new ones are continuously being
developed. Applications have enormous strategic value. Firms rely on them so heavily that, in some cases, when they are not working
(even for a short time), an organization cannot function. In addition, these information systems are very expensive to acquire, operate,
and maintain. Therefore, it is essential to manage them properly.

However, it is becoming increasingly difficult to manage an organization’s information resources effectively. The reason for this difficulty
comes from the evolution of the MIS function in the organization. When businesses first began to use computers in the early 1950s, the
information systems department (ISD) owned the only computing resource in the organization, the mainframe. At that time, end users
did not interact directly with the mainframe.

Today, computers are located throughout the organization, and almost all employees use computers in their work. This system is
known as end user computing. As a result of this change, the ISD no longer owns the organization’s information resources. Instead, a
partnership has developed between the ISD and the end users. The ISD now acts as more of a consultant to end users, viewing them
as customers. In fact, the main function of the ISD is to use IT to solve end users’ business problems.

D. Which IT Resources Are Managed and by Whom

As we just saw, the responsibility for managing information resources is now divided between the ISD and the end users. This
arrangement raises several important questions:
 Which resources are managed by whom?
 What is the role of the ISD, its structure, and its place within the organization?

In this section we provide brief answers to these questions. There are many types of information systems resources. In addition, their
components may come from multiple vendors and be of different brands. The major categories of information resources are hardware,
software, databases, networks, procedures, security facilities, and physical buildings. These resources are scattered throughout the
organization, and some of them change frequently. Therefore, they can be difficult to manage.

To make things more complicated, there is no standard menu for how to divide responsibility for developing and maintaining information
resources between the ISD and end users.

Instead, that division depends on many things: the size and nature of the organization, the amount and type of IT resources, the
organization’s attitudes toward computing, the attitudes of top management toward computing, the maturity level of the technology, the
amount and nature of outsourced IT work, and even the country in which the company operates.

Generally speaking, the ISD is responsible for corporate-level and shared resources and the end users are responsible for
departmental resources.

It is important that the ISD and the end users work closely together and cooperate regardless of who is doing what. Let us begin by
looking at the role of the ISD within the organization.
E. The Role of the IS Department

The role of the director of the ISD is changing from a technical
manager to a senior executive, who is often called the chief
information officer (CIO). The role of the ISD is also changing from a
purely technical one to a more managerial and strategic one. For
example, the ISD is now responsible for managing the outsourcing of
projects and for creating business alliances with vendors and IS
departments in other organizations.
Because its role has expanded so much, the ISD now reports directly
to a senior vice president of administration (previously it reported to a
functional department such as accounting). In its new role, the ISD
must be able to work closely with external organizations such as
vendors, business partners, consultants, research institutions, and
universities.

Inside the organization, the ISD and the end-user units must be close
partners. The ISD has the responsibility for setting standards for
hardware and software purchases, as well as for information security.
The ISD also monitors user hardware and software purchases, and it
serves as a gatekeeper concerning software licensing and illegal
downloads (e.g., music files).

Assessment
1. Explain the main role of the director of the ISD.

____________________________________________________________________________________________________
____________________________________________________________________________________________________
____________________________________________________________________________________________________
____________________________________________________________________________________________________
_______________________________________________________________________________________________.

Lesson 3. Purpose of Information Management

Introduction

To gain the maximum benefits from your company's information system, you have to exploit all its capacities. Information systems
gain their importance by processing the data from company inputs to generate information that is useful for managing your
operations. To increase the information system's effectiveness, you can either add more data to make the information more accurate
or use the information in new ways.

Learning Objectives:

At the end of this lesson, the students will be able to:


1. Discuss the importance of information management

LESSON PROPER
A. Business Communication Systems

Part of management is gathering and distributing information, and information systems can make this process more efficient by
allowing managers to communicate rapidly. Email is quick and effective, but managers can use information systems even more
efficiently by storing documents in folders that they share with the employees who need the information. This type of communication
lets employees collaborate in a systematic way.

Each employee can communicate additional information by making changes that the system tracks. The manager collects the inputs
and sends the newly revised document to his target audience.

B. Business Operations Management

How you manage your company's operations depends on the information you have. Information systems can offer more complete
and more recent information, allowing you to operate your company more efficiently. You can use information systems to gain a cost
advantage over competitors or to differentiate yourself by offering better customer service. Sales data give you insights about what
customers are buying and let you stock or produce items that are selling well. With guidance from the information system, you can
streamline your operations.

C. Company Decision-Making

The company information system can help you make better decisions by delivering all the information you need and by modeling the
results of your decisions. A decision involves choosing a course of action from several alternatives and carrying out the
corresponding tasks. When you have accurate, up-to-date information, you can make the choice with confidence.

If more than one choice looks appealing, you can use the information system to run different scenarios. For each possibility, the
system can calculate key indicators such as sales, costs and profits to help you determine which alternative gives the most
beneficial result.
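
The scenario comparison described above can be sketched in a few lines of Python. This is only an illustration: the scenario names and every figure below are invented, and a real information system would take these inputs from company data.

```python
# Hypothetical scenarios; all figures are made up for illustration.
scenarios = {
    "keep current price": {"units": 1000, "price": 20.0, "unit_cost": 12.0, "fixed_cost": 5000.0},
    "discount 10%": {"units": 1300, "price": 18.0, "unit_cost": 12.0, "fixed_cost": 5000.0},
}

def key_indicators(scenario):
    """Return sales, costs and profit for one scenario."""
    sales = scenario["units"] * scenario["price"]
    costs = scenario["units"] * scenario["unit_cost"] + scenario["fixed_cost"]
    return {"sales": sales, "costs": costs, "profit": sales - costs}

# Calculate the key indicators for every alternative.
results = {name: key_indicators(s) for name, s in scenarios.items()}

# Pick the alternative with the highest profit.
best = max(results, key=lambda name: results[name]["profit"])

for name, indicators in results.items():
    print(name, indicators)
print("Most profitable alternative:", best)
```

With these assumed figures the discount raises sales but lowers profit, which is the kind of trade-off a scenario run is meant to expose.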

D. Company Record-Keeping

Your company needs records of its activities for financial and regulatory purposes as well as for finding the causes of problems and
taking corrective action. The information system stores documents and revision histories, communication records and operational
data. The trick to exploiting this recording capability is organizing the data and using the system to process and present it as useful
historical information. You can use such information to prepare cost estimates and forecasts and to analyze how your actions
affected the key company indicators.

Assessment

1. As an IT student, how important is Information Management in your chosen field/course?


____________________________________________________________________________________________________
____________________________________________________________________________________________________
____________________________________________________________________________________________________
____________________________________________________________________________________________________
___________________.

Lesson 3. Data Management

Introduction

Data consists of raw facts, such as employee numbers and sales figures. For data to be transformed into useful information, it must first be
organized in a meaningful way.

Learning Objectives:

At the end of this lesson, the students will be able to:


1. Discuss the importance of data management
2. Define general data management terms

LESSON PROPER
A. The Hierarchy of Data

The data in a book is organised into characters, words, phrases, sentences, paragraphs and
chapters. Similarly, data in a database can be organised into fields, records and files that
form a hierarchy. The data hierarchy begins with the smallest piece of data used by computers (a
bit) and progresses through the hierarchy to a database.

Bits can be organized into units called bytes. A byte is typically 8 bits. Each byte
represents a character. Character is the basic building block of data, consisting of
letters (A, B, C, …, Z, a, b, …, z), numeric digits (0, 1, 2, …, 9) or special symbols (., +, - ,
@, …).
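
As a small illustration of the bit, byte and character levels, the sketch below prints the 8-bit pattern behind a few characters. It assumes the ASCII encoding, where every character occupies exactly one byte; in other encodings, such as UTF-8, some characters need more than one byte.

```python
text = "A+9"  # a letter, a special symbol and a numeric digit

for ch in text:
    code = ord(ch)              # numeric code of the character
    bits = format(code, "08b")  # the same code written as 8 bits (one byte)
    print(ch, code, bits)

encoded = text.encode("ascii")  # under ASCII, one byte per character
print(len(encoded), "bytes")
```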

Characters are put together to form a field. A field is typically a name (employee name),
number (salary) or combination of characters (national ID number) that describes an
aspect of a business object (e.g. an employee, a location, a vehicle) or activity (e.g. a
sale).

A collection of related data fields is a record. An employee record is a collection of fields
about an employee (e.g. Name, Designation, Department).

A collection of related records is a file (e.g. all employee records – typically known as
the employee file).

A database is a collection of interrelated files (e.g. Employee file, Department file, Payroll file). The files are linked through a key (e.g. the Employee, Department and
Payroll files can be linked by the Employee ID key).
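
The hierarchy of fields, records, files and database can be mirrored with ordinary Python structures. This is a minimal sketch for illustration only: the field names and all values other than those mentioned in the text are hypothetical, and a real database would not be stored this way.

```python
# A record is a collection of related fields (here: field name -> value).
# A file is a collection of related records (here: a list of records).
employee_file = [
    {"employee_id": "E001", "name": "De Silva", "designation": "Clerk", "department_id": "D10"},
    {"employee_id": "E002", "name": "Perera", "designation": "Manager", "department_id": "D20"},
]
department_file = [
    {"department_id": "D10", "department_name": "Personnel"},
    {"department_id": "D20", "department_name": "Finance"},
]
payroll_file = [
    {"employee_id": "E001", "salary": 50000},
    {"employee_id": "E002", "salary": 90000},
]

# A database is a collection of interrelated files.
database = {"employee": employee_file, "department": department_file, "payroll": payroll_file}

def salary_of(db, employee_id):
    """Follow the Employee ID key from the employee file into the payroll file."""
    for record in db["payroll"]:
        if record["employee_id"] == employee_id:
            return record["salary"]
    return None  # no payroll record carries this key

for employee in database["employee"]:
    print(employee["name"], salary_of(database, employee["employee_id"]))
```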

B. Data Entities, Attributes, and Keys

An entity is a generalised class of people (e.g. Employee, Student, Customer), places (e.g. City, Outlet, Warehouse) or things (e.g. Part,
Item, Inventory) for which data is collected, stored and maintained. A record is an instance of an entity.

An attribute is a characteristic of an entity. Employee name and Designation are attributes of an employee. The main purpose of an attribute is to
capture the relevant characteristics of entities such as employees or customers. The specific value of an attribute, called a data item,
can be found in the fields of the record describing an entity (e.g. De Silva is a data item of the name attribute of an employee entity).

A key is a field or set of fields in a record that is used to identify the record. A primary key is a key that uniquely identifies the record
(e.g. national ID number or employee number may be used to identify an employee uniquely).
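
The difference between a field that can serve as a primary key and one that cannot is easy to check in code. The sketch below is illustrative only: the records are invented and `build_index` is a hypothetical helper, not part of any DBMS.

```python
employees = [
    {"employee_number": 101, "name": "De Silva"},
    {"employee_number": 102, "name": "Perera"},
    {"employee_number": 103, "name": "De Silva"},  # a second employee with the same name
]

def build_index(records, key_field):
    """Map each key value to its record; reject the field if a value repeats."""
    index = {}
    for record in records:
        key = record[key_field]
        if key in index:
            raise ValueError(f"{key_field} is not unique: {key!r} appears twice")
        index[key] = record
    return index

# The employee number identifies each record uniquely, so it works as a primary key.
by_number = build_index(employees, "employee_number")
print(by_number[102]["name"])

# The name does not, because two employees share it.
try:
    build_index(employees, "name")
    name_is_unique = True
except ValueError as error:
    name_is_unique = False
    print(error)
```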

C. Traditional Approach to Data Management

From the beginning of the use of computers to perform business functions, companies have used the traditional approach to process
their functions. In the traditional approach, separate files are used for each application. Today this has given way to the database
approach, which uses a unified and integrated database for most of the transactions of the company.

Traditional Approach
The manual method of managing data is to record them on paper
(e.g. filling in an employee application form) and to put them in files
(e.g. the employee file), which are stored in the filing cabinets of the
personnel division.

One of the most basic ways to manage data electronically is via
computer files, where a file is a collection of related records
associated with a particular application.

The traditional approach to data management uses separate data
files for each application programme (e.g. an employee file for the
personnel application, a payroll file for the payroll application). For a
particular application one or more files were created. As shown in
figure 3.3 each application programme had a file.

Each division created and managed files required for their applications.
Thus data which were common for several applications appeared in many files (e.g. employee name, address). This became one of the
flaws of the traditional approach to data management (e.g. employee name and address appeared in employee file, payroll file,
employee performance management file etc.). Duplication of data in separate files is known as data redundancy.

This caused problems when data had to be updated, because separate file maintenance programs had to be developed and coordinated
to ensure that each file was properly updated. As this is difficult
to achieve in practice, a lot of inconsistencies could occur among data stored in separate files.
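
The inconsistency that redundancy invites can be shown with two small "files" holding the same employee details. This is a sketch with invented data; `find_inconsistencies` is a hypothetical helper written for the example.

```python
# The same name and address are stored twice, once per application.
employee_file = {"E001": {"name": "De Silva", "address": "12 Lake Road"}}
payroll_file = {"E001": {"name": "De Silva", "address": "12 Lake Road", "salary": 50000}}

# The personnel application records a change of address in its own file only.
employee_file["E001"]["address"] = "7 Hill Street"

def find_inconsistencies(file_a, file_b, shared_fields):
    """List (key, field) pairs whose duplicated values no longer agree."""
    problems = []
    for key in file_a.keys() & file_b.keys():
        for field in shared_fields:
            if file_a[key][field] != file_b[key][field]:
                problems.append((key, field))
    return problems

print(find_inconsistencies(employee_file, payroll_file, ["name", "address"]))
```

The payroll file still holds the old address, so the two files now disagree about the same fact.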

Problems of the traditional approach are:

• Data Redundancy: Independent data files included a lot of duplicated data; the same data (for example, a customer’s name and
address) was recorded and stored in several files. This data redundancy caused problems when data had to be updated,
since separate file maintenance programs had to be developed and coordinated to ensure that each file was properly
updated. Of course, this proved difficult in practice, so a lot of inconsistencies occurred among data stored in separate files.

• Lack of Data Integration: Having data in independent files made it difficult to provide end users with information for ad hoc
requests that required accessing data stored in several different files. Special computer programs had to be written to
retrieve data from each independent file. This was so difficult, time consuming and costly for some organizations that it was
impossible to provide end users or management with such information. If necessary, end users had to manually extract the
required information from the various reports produced by each separate application and prepare customized reports for
management.

• Data Dependence: In file processing systems, major components of a system – the organization of files, their physical
locations on storage hardware, and the application software used to access those files – depended on one another in
significant ways. For example, application programs typically contained references to the specific format of the data stored
in the files they used. Thus, changes in the format and structure of data and records in a file required that changes be made
to all of the programs that used that file. This program maintenance effort was a major burden of file processing systems. It
proved difficult to do properly, and it resulted in a lot of inconsistency in the data files.
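
Data dependence, the last problem in the list above, can be illustrated with a program that hard-codes a file's record layout. The fixed-width layout, the field widths and the function names below are assumptions made up for this sketch.

```python
def parse_v1(line):
    """Application code that assumes: name in columns 0-9, salary in columns 10-15."""
    return {"name": line[0:10].strip(), "salary": int(line[10:16])}

def parse_v2(line):
    """The same program after maintenance: name in columns 0-14, salary in 15-20."""
    return {"name": line[0:15].strip(), "salary": int(line[15:21])}

# A record in the original format.
old_line = "De Silva".ljust(10) + "050000"

# The file format changes: the name field is widened to 15 characters.
new_line = "De Silva".ljust(15) + "050000"

print(parse_v1(old_line))  # correct: the program matches the format
print(parse_v1(new_line))  # wrong salary: the program still assumes the old format
print(parse_v2(new_line))  # correct again, but only after the program was changed
```

Every program that reads the file would need the same change, which is the maintenance burden the text describes.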

D. Database Approach to Data Management

To overcome the problems of the traditional approach to data
management the database approach is used. In a database approach a
pool of related data is shared by multiple application programs.

To use the database approach to data management, additional software
called a Database Management System (DBMS) is required. The DBMS
acts as a software interface between users and databases. This helps
users to easily access the data in a database. Therefore, database
management involves the use of database management software to
control how databases are created, integrated and maintained to provide
information needed by end users.
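
A minimal sketch of the idea of a shared pool of data, using SQLite through Python's standard `sqlite3` module as the DBMS. The table, the column names, the data and the two "application" functions are assumptions invented for this example.

```python
import sqlite3

# An in-memory database, used here only for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "employee_id TEXT PRIMARY KEY, name TEXT, address TEXT, salary REAL)"
)
conn.execute(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    ("E001", "De Silva", "12 Lake Road", 50000),
)

def personnel_change_address(connection, employee_id, address):
    """Personnel application: updates the single shared copy of the address."""
    connection.execute(
        "UPDATE employee SET address = ? WHERE employee_id = ?",
        (address, employee_id),
    )

def payroll_pay_slip(connection, employee_id):
    """Payroll application: reads the same shared data through the DBMS."""
    return connection.execute(
        "SELECT name, address, salary FROM employee WHERE employee_id = ?",
        (employee_id,),
    ).fetchone()

personnel_change_address(conn, "E001", "7 Hill Street")
slip = payroll_pay_slip(conn, "E001")
print(slip)
```

Because the address is stored once, the payroll application sees the change made by the personnel application without any file-to-file coordination.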

E. Advantages/disadvantages of Database Approach

The facilities offered by DBMS vary. However, a good DBMS should
provide the following advantages.

Advantages of the Database Approach


 Data and program independence - both the database and the user program can be altered independently of each other.
 Ability to share data and non-redundancy of data - enables applications to share an integrated database containing all the data
needed by the applications and this eliminates data redundancies.
 Integrity – helps to maintain the integrity of data. Inconsistencies between two entries representing the same `fact' give an
example of lack of integrity (caused by redundancy in the database).
 Centralized control - With central control of the database, the Database Administrator (DBA) can ensure that standards are
followed in the representation of data.
 Security - Having control over the database the DBA can ensure that access to the database is through proper channels and
can define the access rights of any user to any data items or defined subset of the database. The security system must
prevent corruption of the existing data either accidentally or maliciously.

*DBA - a person responsible for installing, configuring, upgrading, administering, monitoring and maintaining databases
in an organization.

Disadvantages of the Database Approach


 Costly
o Specialized DBMS software
o Specialized DBMS administrators and operators
 Increased vulnerability
o Single point of failure
o Targets for attacks

F. Types of DBMS

A data model is a tool that database designers use to show the logical relationships among data. When data modeling is done at the level
of the entire organization, it is known as enterprise data modeling.

Different types of DBMS exist, based on the type of data model used. They are the hierarchical, network, relational and object-oriented models.
DBMS types are also identified by the number of users. They are single-user (e.g. MS Access) and multi-user (e.g. Oracle) DBMS.

Let’s consider the hierarchical, network, relational and Object-Oriented database models.

Hierarchical Databases

The hierarchical model is one of the oldest methods of organizing and storing data, and it is now used by few
organizations. Related fields or records are grouped together so that there are
higher-level records and lower-level records, just like the parents in a family tree
sit above the subordinated children. Furthermore, each child can also be a
parent with children underneath it.

The parent record at the top of the pyramid is called the root record. A child
record always has only one parent record to which it is linked, just like in a
normal family tree. In contrast, a parent record may have more than one child
record linked to it. Hierarchical databases work by moving from the top down. A
record search is conducted by starting at the top and working down through the
tree from parent to child until the appropriate child record is found.

The advantage of hierarchical databases is that they can be accessed and
updated rapidly because the tree-like structure and the relationships between
records are defined in advance. The disadvantage of this type of database
structure is that each child in the tree may have only one parent, and
relationships or linkages between children are not permitted, even if they make
sense from a logical standpoint. Hierarchical databases are so rigid in their
design that adding a new field or record requires that the entire database be redefined.
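
The top-down search described above can be sketched as a walk from the root record through each parent to its children. The tree contents below are hypothetical, and real hierarchical DBMS products use their own storage and access methods; this only illustrates the parent-to-child traversal.

```python
# Each record has exactly one parent (implied by nesting) and any number of children.
root = {
    "name": "Company",
    "children": [
        {"name": "Sales", "children": [{"name": "Order 1", "children": []}]},
        {"name": "Personnel", "children": [{"name": "De Silva", "children": []}]},
    ],
}

def find_path(record, target, path=()):
    """Search from the top down; return the chain of records leading to the target."""
    path = path + (record["name"],)
    if record["name"] == target:
        return path
    for child in record["children"]:
        found = find_path(child, target, path)
        if found is not None:
            return found
    return None  # the target is not in this branch

print(find_path(root, "De Silva"))
```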

Network Databases

Network databases are similar to hierarchical databases in that they also have a
hierarchical structure. There are a few key differences, however. Instead of
looking like an upside-down tree, a network database looks more like a
cobweb or interconnected network of records. In network databases,
children are called members and parents are called owners. The most
important difference is that each child or member can have more than one
parent (or owner).

Like hierarchical databases, network databases are principally used on
mainframe computers. Since more connections can be made between
different types of data, network databases are considered more flexible.
However, two limitations must be considered when using this kind of
database. Similar to hierarchical databases, network databases must be
defined in advance. There is also a limit to the number of connections that
can be made between records.
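
The key difference from the hierarchical model, a member with more than one owner, can be sketched with two lookup tables. The record names and the `connect` helper are hypothetical and serve only to illustrate the owner and member relationship.

```python
from collections import defaultdict

owners_of = defaultdict(set)   # member -> the owners it is linked to
members_of = defaultdict(set)  # owner -> the members linked to it

def connect(owner, member):
    """Link a member record to an owner record; a member may have several owners."""
    owners_of[member].add(owner)
    members_of[owner].add(member)

connect("Personnel Department", "De Silva")
connect("Project A", "De Silva")  # a second owner, not possible in a strict hierarchy
connect("Project A", "Perera")

print(sorted(owners_of["De Silva"]))
print(sorted(members_of["Project A"]))
```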

The Relational Database Model

The relational model describes data using a standard tabular format. In a database structured according to the relational model, all data
elements are placed in two-dimensional tables, called relations, which are the logical equivalent of files. The tables in relational
databases organize data in rows and columns, simplifying data access and manipulation. It is easier for managers to understand the
relational model than other database models. In the relational model, each row of a table represents a data entity,
with the columns of the table representing attributes. Each attribute can take on only certain values. The allowable values for these
attributes are called the domain. The domain for a particular attribute indicates what values can be placed in each of the columns of the
relational table. The relational database model is widely used.
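
The idea of a domain can be made concrete by checking a row's values against the allowable values for each attribute. The attributes, the domains and the `violations` helper below are assumptions chosen for illustration.

```python
# Hypothetical domains: the allowable values for each attribute.
domains = {
    "designation": {"Clerk", "Manager", "Engineer"},
    "salary": range(0, 1_000_001),  # whole amounts from 0 to 1,000,000
}

def violations(row):
    """Return the attributes whose value lies outside its domain."""
    return [attribute for attribute, domain in domains.items() if row[attribute] not in domain]

print(violations({"designation": "Clerk", "salary": 50000}))
print(violations({"designation": "Pilot", "salary": -5}))
```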

Manipulating Data

Once data has been placed into a relational database, users can make inquiries and analyze data. To manipulate relational databases, a
set of relational operators has been defined. Basic data manipulations using relational operators include selecting, projecting and joining.

Selecting involves eliminating rows according to certain criteria. Projecting involves eliminating columns in a table. Joining involves
combining two or more tables.
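
The three operations can be sketched on small tables held as lists of rows. This is a simplified illustration with invented data: the function names are hypothetical, and a strict relational projection would also remove duplicate rows, which this sketch does not do.

```python
employees = [
    {"employee_id": "E001", "name": "De Silva", "department_id": "D10"},
    {"employee_id": "E002", "name": "Perera", "department_id": "D20"},
]
departments = [
    {"department_id": "D10", "department": "Personnel"},
    {"department_id": "D20", "department": "Payroll"},
]

def select(table, predicate):
    """Selecting: keep only the rows that meet the criteria."""
    return [row for row in table if predicate(row)]

def project(table, columns):
    """Projecting: keep only the named columns."""
    return [{column: row[column] for column in columns} for row in table]

def join(left, right, attribute):
    """Joining: combine rows from two tables that share a common attribute value."""
    return [{**l, **r} for l in left for r in right if l[attribute] == r[attribute]]

print(select(employees, lambda row: row["department_id"] == "D10"))
print(project(employees, ["name"]))
print(join(employees, departments, "department_id"))
```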

As long as the tables share at least one common data attribute,
the tables in a relational database can be linked to provide useful
information and reports. One of the primary advantages of a
relational database is that it allows tables to be linked. It is easier
to control, more flexible and more intuitive than other approaches
because it organizes data in tables. The ability to link relational
tables also allows users to relate data in new ways without
having to redefine complex relationships.

Figure 3.11: Join Operation


Object Oriented Database Model

Hierarchical and network databases are all designed to handle structured data; that is, data that fits nicely into fields, rows, and
columns. They are useful for handling small snippets of information such as names, addresses, zip codes, product numbers, and any
kind of statistic or number you can think of. On the other hand, an object-oriented database can be used to store data from a variety of
media sources, such as photographs and text, and produce work, as output, in a multimedia format.

Object-oriented databases use small, reusable chunks of software called objects. The objects themselves are stored in the object-oriented
database. Each object consists of two elements: 1) a piece of data (e.g., sound, video, text, or graphics), and 2) the
instructions, or software programs called methods, for what to do with the data. Part two of this definition requires a little more
explanation. The instructions contained within the object are used to do something with the data in the object. For example, test scores
would be within the object as would the instructions for calculating average test score.
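
The test score example can be written as a small Python class, where the object holds both the data and the method that works on it. The class name and the scores are hypothetical, and this shows only the object idea, not how an object-oriented database stores objects.

```python
class ScoreSheet:
    """An object that bundles data (scores) with a method that uses the data."""

    def __init__(self, scores):
        self.scores = list(scores)  # the data held inside the object

    def average(self):
        """The instructions for calculating the average test score."""
        return sum(self.scores) / len(self.scores)

sheet = ScoreSheet([80, 90, 100])
print(sheet.average())
```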

Object-oriented databases have two disadvantages. First, they are more costly to develop. Second, most organizations are reluctant to
abandon or convert from those databases that they have already invested money in developing and implementing. However, the
benefits of object-oriented databases are compelling. The ability to mix and match reusable objects provides incredible multimedia
capability. Healthcare organizations, for example, can store, track, and recall CAT scans, X-rays, electrocardiograms and many other
forms of crucial data.

Assessment:

Direction: Fill in the blanks.

1. ___________________________ is a characteristic of an entity.


2. A collection of related data fields is a __________________.
3. ___________________________ is a tool that the database designers use to show the logical relationships among data.
4. The _____________________ acts as a software interface between users and databases.
5. It is one of the oldest methods of organizing and storing data, and used by few organizations. ____________________.

Lesson 3. Database Management Systems and Applications

Introduction

A Database Management System (DBMS) is a collection of programs that manages the structure of a database and controls access to the
data stored in it. It is software that facilitates the process of defining, storing, manipulating and sharing a database among
various users and applications.

As discussed below, database applications enable organisations to generate information useful for decision making. Furthermore, with the
help of databases, organisations try to improve their efficiency and achieve competitive advantage. For example, databases support organisations
in carrying out data mining and business intelligence, which help to identify customer preferences effectively.

Learning Objectives:

At the end of this lesson, the students will be able to:


1. Discuss the importance of Database Management Systems and their applications;
2. Discuss Data Mining Application;
3. Discuss important factors when Selecting a Database Management System;

LESSON PROPER

A. Popular Database Management Systems

Typical users of a DBMS are database administrators, database designers, end users, systems analysts and application
programmers. A DBMS performs several important functions, which are discussed below.

As with other software products, there are a number of commercial database systems (e.g. SQL Server, DB2, Oracle, Informix) and
open-source database systems (e.g. PostgreSQL and MySQL).
MySQL: This is often described as the most popular open-source database management system.

B. Linking the Company Database to the Internet

Today, customers, suppliers and company employees must be able to access corporate databases through the internet, intranets and
extranets to meet various business needs.
Example:
1. When a customer buys a book through the internet, he or she accesses a database to find the book information, author,
price, etc.
2. With the help of databases, suppliers can check the raw materials and the current production schedule to determine
when and how much of their products must be delivered to support just-in-time inventory management.
3. Employees of a company working from abroad may want to access the internal databases through the internet or the
intranet to make important decisions.

Developing a seamless integration of traditional databases with the internet is often called a semantic web. The semantic web is about
taking the relational database and webbing it. It allows accessing and manipulating a number of traditional databases at the same time
through the internet.

Instead of the internet, organizations are gaining access to databases through networks to get good prices and reliable services.
However, linking company databases to external networks such as the internet can be potentially dangerous due to issues related to
security. For example, a competitor or any other hacker may gain access to these databases.

C. Data Mining Applications

Data warehouses and Data Mining

The raw data necessary to make sound business decisions is stored in a variety of locations and formats.

With data warehouses and data mining, this data can be used to support decision making. A data warehouse stores data that have been
extracted from the various operational, external and other databases of an organisation. It is a central source of data that have been
cleaned, transformed and catalogued so that they can be used by managers and other business professionals for data mining, online
analytical processing and other forms of business analysis, market research and decision support. A data warehouse can also be
viewed as a database for historical data from different functions within a company.

The structure of a data warehouse is easier for end users to navigate, understand and query against than that of operational relational
databases, which are primarily designed to handle large numbers of transactions. Data warehouses enable queries that cut across
different segments of a company’s operation.

Example: Production data could be compared against inventory data even if they were originally stored in different tables with different
structures. Data warehousing is an efficient way to manage and report on data collected from a variety of sources, which are
non-uniform and scattered throughout a company.
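
The production-versus-inventory example can be sketched in Python as follows. This is a minimal illustration only. The product names, field names and the `compare` function are hypothetical, and a real data warehouse would perform this consolidation with its own loading tools.

```python
# Hypothetical source data held in two different structures.
production = [  # rows from a production system
    {"product": "chair", "units_made": 120},
    {"product": "table", "units_made": 40},
]
inventory = {"chair": 30, "table": 55}  # product -> units in stock


def compare(production_rows, stock):
    """Combine both sources into one uniform report."""
    report = []
    for row in production_rows:
        name = row["product"]
        report.append({
            "product": name,
            "units_made": row["units_made"],
            "units_in_stock": stock.get(name, 0),  # 0 if no stock record
        })
    return report


for line in compare(production, inventory):
    print(line)
```

Once both sources share one structure, a single query can answer questions that span production and inventory.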

Data Mining

Data mining is a major use of data warehouse databases. It is an information-analysis tool that involves the automated discovery of
hidden patterns and trends in historical business activity. Data mining’s objective is to extract patterns, trends and rules from data
warehouses to evaluate (i.e. predict or score) proposed business strategies, which in turn will improve competitiveness, improve profits
and transform business processes. With the help of data mining it is possible to improve customer retention, campaign management
and customer segmentation analysis.
D. Business Intelligence

Databases can be used for the purpose of business intelligence, which is closely linked to data mining. Business Intelligence is the process of
gathering enough of the right information in a timely manner and usable form and analyzing it so that it can have a positive impact on
business strategy, tactics or operations. Business Intelligence turns data into valuable information and distributes it throughout an
enterprise. This information is used by the companies to improve strategic discussions about which markets to enter, how to select and
manage key customer relationships, how to improve sales promotions etc. Business Intelligence tools (applications) can be found in
different categories such as Business planning, Customer Relationship Management (CRM), Management Information Systems (MIS)
etc.

Online Analytical Processing (OLAP)

Online analytical processing allows users to explore data from a number of different perspectives. OLAP involves analysing complex
relationships among thousands or even millions of data items stored in data marts, data warehouses, and other multidimensional
databases to discover patterns, trends and exceptional conditions.

E. Important factors when Selecting a Database Management System

The database administrator often selects the best database management system for an organization. The process begins by analyzing
database needs and characteristics. The information needs of the organization affect the type of data that is collected and the type of
database management system that is used.

The important features that have to be considered when selecting a Database Management System are as follows.
 Database size
Database size depends on the number of records or files in the database. The size determines the overall storage requirement
for the database.
To maintain good performance and to reduce costs companies are trimming the size of their databases.
 Number of concurrent users
Number of simultaneous users that can access the contents of the database is also an important factor. A database that is
used by a large workgroup must be able to support a number of concurrent users. If it cannot, then the efficiency of the user
requests will be lowered. To provide flexibility to the database, highly scalable DBMS is preferred by the companies. Scalability
describes how well a database performs as the size of the database or the number of concurrent users increase.
 Performance
How fast the database is able to update records can be the most important performance criterion for some organizations.
Example: Credit and airline companies must have database systems that can immediately update customer records and check
credit or make a plane reservation in seconds not minutes. However payroll applications can be processed once a week or
less frequently and do not require immediate processing. When an application demands immediacy, it also demands rapid
recovery facilities in the event that the computer system shuts down temporarily. Other performance considerations include the
number of concurrent users that can be supported and the amount of memory that is required to execute the database
management program.
 Integration
A key aspect of any database is its ability to be integrated with other applications and databases. A key determinant here is
what operating systems it can run under – such as Linux, UNIX or Windows. Some companies use several databases for
different applications at different locations.
 Features
The features of the database management system can also make a big difference. Most database programs come with
security procedures, privacy protection and a variety of tools.
 The vendor
The size, reputation and financial stability of the vendor are also important aspects. Some organisations rely on vendor
support to handle operational aspects of the system.
 Cost
The cost of a database system varies from a few thousand to millions of rupees based on the number of users and functionalities.
In addition to the initial cost of the database package, annual or monthly maintenance or operating costs should be
considered.

Assessment:
Direction: Write TT if the statement is true and FF if it is false.

_____1. Today customers, suppliers and company employees must not be able to access corporate database through the
internet, intranet and extranet to meet various business needs.
_____2. Data Warehouse is a major use of data warehouse databases.
_____3. Databases can be used for the purpose of business intelligence closely linked to data mining.
_____4. The features of the database management system can also make a big difference.
_____5. The size, reputation and financial stability of the vendor is not an important aspect.

Lesson 1. Overview

Introduction
The term "Data Warehouse" was first coined by Bill Inmon in 1990. According to Inmon, a data warehouse is a subject-oriented,
integrated, time-variant, and non-volatile collection of data. This data helps analysts to make informed decisions in an organization.
An operational database undergoes frequent changes on a daily basis on account of the transactions that take place. Suppose a
business executive wants to analyze previous feedback on any data such as a product, a supplier, or any consumer data, then the
executive will have no data available to analyze because the previous data has been updated due to transactions.
A data warehouse provides us with generalized and consolidated data in a multidimensional view. Along with a generalized and consolidated
view of data, a data warehouse also provides us with Online Analytical Processing (OLAP) tools. These tools help us in interactive and
effective analysis of data in a multidimensional space. This analysis results in data generalization and data mining.
Data mining functions such as association, clustering, classification and prediction can be integrated with OLAP operations to enhance the
interactive mining of knowledge at multiple levels of abstraction. That is why the data warehouse has now become an important platform for
data analysis and online analytical processing.

Learning Objectives:

At the end of this lesson, the students will be able to:


1. Discuss the overview of Data Warehouse.

LESSON PROPER

A. Understanding a Data Warehouse


 A data warehouse is a database, which is kept separate from the organization's operational database.
 There is no frequent updating done in a data warehouse.
 It possesses consolidated historical data, which helps the organization to analyze its business.
 A data warehouse helps executives to organize, understand, and use their data to take strategic decisions.
 Data warehouse systems help in the integration of diversity of application systems.
 A data warehouse system helps in consolidated historical data analysis.

B. Why a Data Warehouse is Separated from Operational Databases
A data warehouse is kept separate from operational databases for the following reasons −

 An operational database is constructed for well-known tasks and workloads such as searching particular records, indexing,
etc. In contrast, data warehouse queries are often complex and they present a general form of data.
 Operational databases support concurrent processing of multiple transactions. Concurrency control and recovery
mechanisms are required for operational databases to ensure robustness and consistency of the database.
 An operational database query allows both read and modify operations, while an OLAP query needs only read-only access to
stored data.
 An operational database maintains current data. On the other hand, a data warehouse maintains historical data.

C. Data Warehouse Features


The key features of a data warehouse are discussed below −

 Subject Oriented − A data warehouse is subject oriented because it provides information around a subject rather than the
organization's ongoing operations. These subjects can be product, customers, suppliers, sales, revenue, etc. A data
warehouse does not focus on the ongoing operations, rather it focuses on modelling and analysis of data for decision
making.
 Integrated − A data warehouse is constructed by integrating data from heterogeneous sources such as relational databases,
flat files, etc. This integration enhances the effective analysis of data.
 Time Variant − The data collected in a data warehouse is identified with a particular time period. The data in a data
warehouse provides information from the historical point of view.
 Non-volatile − Non-volatile means the previous data is not erased when new data is added to it. A data warehouse is kept
separate from the operational database, and therefore frequent changes in the operational database are not reflected in the data
warehouse.
Note − A data warehouse does not require transaction processing, recovery, and concurrency controls, because it is stored
physically separate from the operational database.
D. Data Warehouse Applications
As discussed before, a data warehouse helps business executives to organize, analyze, and use their data for decision making. A
data warehouse serves as an integral part of a plan-execute-assess "closed-loop" feedback system for enterprise management. Data
warehouses are widely used in the following fields −

 Financial services
 Banking services
 Consumer goods
 Retail sectors
 Controlled manufacturing

E. Types of Data Warehouse


Information processing, analytical processing, and data mining are the three types of data warehouse applications that are discussed
below −

 Information Processing − A data warehouse allows us to process the data stored in it. The data can be processed by means of
querying, basic statistical analysis, and reporting using crosstabs, tables, charts, or graphs.
 Analytical Processing − A data warehouse supports analytical processing of the information stored in it. The data can be
analyzed by means of basic OLAP operations, including slice-and-dice, drill down, drill up, and pivoting.
 Data Mining − Data mining supports knowledge discovery by finding hidden patterns and associations, constructing
analytical models, and performing classification and prediction. These mining results can be presented using visualization tools.
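
The basic OLAP operations named above can be sketched on a tiny set of sales facts. This is a simplified illustration in plain Python, not the interface of any OLAP product. The sales figures and the function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, region, product, amount)
sales = [
    (2023, "North", "chair", 100),
    (2023, "South", "chair", 150),
    (2023, "North", "table", 200),
    (2024, "North", "chair", 120),
    (2024, "South", "table", 180),
]


def slice_by_year(facts, year):
    """Slice: fix one dimension (year) to a single value."""
    return [f for f in facts if f[0] == year]


def drill_up_to_region(facts):
    """Drill up: summarize the amounts at the region level."""
    totals = defaultdict(int)
    for _year, region, _product, amount in facts:
        totals[region] += amount
    return dict(totals)


def drill_down_to_product(facts):
    """Drill down: break each region total into region and product."""
    totals = defaultdict(int)
    for _year, region, product, amount in facts:
        totals[(region, product)] += amount
    return dict(totals)


print(drill_up_to_region(sales))                       # totals per region
print(drill_up_to_region(slice_by_year(sales, 2023)))  # 2023 only
print(drill_down_to_product(sales))                    # finer detail
```

Drilling up and drilling down are the same summation at different levels of detail, which is why OLAP tools can move between them interactively.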

Assessment

1. Give an example of how a data warehouse is being used in the retail sector.
____________________________________________________________________________________________________
____________________________________________________________________________________________________
____________________________________________________________________________________________________
____________________________________________________________________________________________________
______________________________.

Lesson 2. Concepts

Introduction

Data warehousing is the process of constructing and using a data warehouse. A data warehouse is constructed by integrating data from
multiple heterogeneous sources that support analytical reporting, structured and/or ad hoc queries, and decision making. Data
warehousing involves data cleaning, data integration, and data consolidations.

Learning Objectives:

At the end of this lesson, the students will be able to:


1. Discuss the overview of Data Warehouse;

LESSON PROPER

A. Using Data Warehouse Information


There are decision support technologies that help utilize the data available in a data warehouse. These technologies help executives
to use the warehouse quickly and effectively. They can gather data, analyze it, and take decisions based on the information present in
the warehouse. The information gathered in a warehouse can be used in any of the following domains −

 Tuning Production Strategies − The product strategies can be well tuned by repositioning the products and managing the
product portfolios by comparing the sales quarterly or yearly.

 Customer Analysis − Customer analysis is done by analyzing the customer's buying preferences, buying time, budget
cycles, etc.

 Operations Analysis − Data warehousing also helps in customer relationship management, and making environmental
corrections. The information also allows us to analyze business operations.
B. Integrating Heterogeneous Databases
To integrate heterogeneous databases, we have two approaches −

 Query-driven Approach
 Update-driven Approach

C. Query-Driven Approach
This is the traditional approach to integrate heterogeneous databases. This approach was used to build wrappers and integrators on
top of multiple heterogeneous databases. These integrators are also known as mediators.

Process of Query-Driven Approach


 When a query is issued at the client side, a metadata dictionary translates the query into an appropriate form for the individual
heterogeneous sites involved.
 Now these queries are mapped and sent to the local query processor.
 The results from heterogeneous sites are integrated into a global answer set.

Disadvantages
 Query-driven approach needs complex integration and filtering processes.
 This approach is very inefficient.
 It is very expensive for frequent queries.
 This approach is also very expensive for queries that require aggregations.

D. Update-Driven Approach
This is an alternative to the traditional approach. Today's data warehouse systems follow the update-driven approach rather than the
traditional approach discussed earlier. In the update-driven approach, the information from multiple heterogeneous sources is integrated
in advance and stored in a warehouse. This information is available for direct querying and analysis.

Advantages
This approach has the following advantages −
 This approach provides high performance.
 The data is copied, processed, integrated, annotated, summarized and restructured in semantic data store in advance.
 Query processing does not require an interface to process data at local sources.
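
The difference between the two approaches can be sketched in Python. This is a toy illustration under stated assumptions: the two sources, their field names and both functions are hypothetical, and real mediators and warehouses are far more elaborate.

```python
# Two hypothetical heterogeneous sources with different field names.
source_a = [{"cust": "Ana", "total": 50}, {"cust": "Ben", "total": 20}]
source_b = [{"customer_name": "Ana", "amount": 30}]


def query_driven_total(customer):
    """Query-driven: translate and run the query at every source each time."""
    from_a = sum(r["total"] for r in source_a if r["cust"] == customer)
    from_b = sum(r["amount"] for r in source_b
                 if r["customer_name"] == customer)
    return from_a + from_b  # integrate the results into a global answer


def build_warehouse():
    """Update-driven: integrate and summarize the sources in advance."""
    warehouse = {}
    for r in source_a:
        warehouse[r["cust"]] = warehouse.get(r["cust"], 0) + r["total"]
    for r in source_b:
        name = r["customer_name"]
        warehouse[name] = warehouse.get(name, 0) + r["amount"]
    return warehouse


warehouse = build_warehouse()      # done once, before any query arrives
print(query_driven_total("Ana"))   # 80, computed by visiting both sources
print(warehouse["Ana"])            # 80, answered by a direct lookup
```

Both paths give the same answer. The update-driven path pays the integration cost once, which is why it suits frequent queries and aggregations.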

E. Functions of Data Warehouse Tools and Utilities


The following are the functions of data warehouse tools and utilities −

 Data Extraction − Involves gathering data from multiple heterogeneous sources.


 Data Cleaning − Involves finding and correcting the errors in data.
 Data Transformation − Involves converting the data from legacy format to warehouse format.
 Data Loading − Involves sorting, summarizing, consolidating, checking integrity, and building indices and partitions.
 Refreshing − Involves updating from data sources to warehouse.
Note − Data cleaning and data transformation are important steps in improving the quality of data and data mining results.
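
The extraction, cleaning, transformation and loading functions listed above can be sketched as a small Python pipeline. The sample records, the cleaning rule (dropping rows whose sales value is not a number) and the function names are all assumptions made for illustration.

```python
def extract():
    """Data extraction: gather records from two hypothetical sources."""
    source_1 = [{"name": " ana ", "sales": "100"}]
    source_2 = [{"name": "BEN", "sales": "n/a"},
                {"name": "Cara", "sales": "250"}]
    return source_1 + source_2


def clean(rows):
    """Data cleaning: trim names and drop rows with invalid sales values."""
    cleaned = []
    for row in rows:
        if row["sales"].isdigit():
            cleaned.append({"name": row["name"].strip(),
                            "sales": row["sales"]})
    return cleaned


def transform(rows):
    """Data transformation: convert legacy strings to the warehouse format."""
    return [{"name": r["name"].title(), "sales": int(r["sales"])}
            for r in rows]


def load(warehouse, rows):
    """Data loading: sort the rows, store them, and build an index by name."""
    for row in sorted(rows, key=lambda r: r["name"]):
        warehouse["rows"].append(row)
        warehouse["index"][row["name"]] = row
    return warehouse


warehouse = load({"rows": [], "index": {}}, transform(clean(extract())))
print(warehouse["rows"])
```

The record with the invalid sales value never reaches the warehouse, which is how cleaning improves the quality of later analysis.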
Assessment
Direction: Fill in the blanks.

1. ___________________________ is the process of constructing and using a data warehouse.


2. ___________________________ This is the traditional approach to integrate heterogeneous databases.
3. ___________________________ This is an alternative to the traditional approach.
4. ___________________________ Involves finding and correcting the errors in data
5. ___________________________ Involves updating from data sources to warehouse.

Lesson 3. Delivery Process

Introduction
A data warehouse is never static; it evolves as the business expands. As the business evolves, its requirements keep changing and
therefore a data warehouse must be designed to ride with these changes. Hence a data warehouse system needs to be flexible.
Ideally there should be a delivery process to deliver a data warehouse. However data warehouse projects normally suffer from various
issues that make it difficult to complete tasks and deliverables in the strict and ordered fashion demanded by the waterfall method.
Most of the time, the requirements are not understood completely. The architectures, designs, and build components can be
completed only after gathering and studying all the requirements.
Learning Objectives:

At the end of this lesson, the students will be able to:


1. Discuss the overview of Delivery Process;
2. Discuss the Business case for investment;

LESSON PROPER

A. Delivery Method
The delivery method is a variant of the joint application development approach adopted for the delivery of a data warehouse. We have
staged the data warehouse delivery process to minimize risks. The approach that we will discuss here does not reduce the overall
delivery time-scales but ensures the business benefits are delivered incrementally through the development process.
Note − The delivery process is broken into phases to reduce the project and delivery risk.
The following diagram explains the stages in the delivery process −

B. IT Strategy
Data warehouses are strategic investments that require a business process to generate benefits. An IT strategy is required to procure and
retain funding for the project.
C. Business Case
The objective of the business case is to estimate the business benefits that should be derived from using a data warehouse. These benefits
may not be quantifiable but the projected benefits need to be clearly stated. If a data warehouse does not have a clear business case,
then the business tends to suffer from credibility problems at some stage during the delivery process. Therefore in data warehouse
projects, we need to understand the business case for investment.
D. Education and Prototyping
Organizations experiment with the concept of data analysis and educate themselves on the value of having a data warehouse before
settling for a solution. This is addressed by prototyping. It helps in understanding the feasibility and benefits of a data warehouse. The
prototyping activity on a small scale can promote educational process as long as −

 The prototype addresses a defined technical objective.


 The prototype can be thrown away after the feasibility concept has been shown.
 The activity addresses a small subset of eventual data content of the data warehouse.
 The activity timescale is non-critical.
The following points are to be kept in mind to produce an early release and deliver business benefits.

 Identify the architecture that is capable of evolving.


 Focus on business requirements and technical blueprint phases.
 Limit the scope of the first build phase to the minimum that delivers business benefits.
 Understand the short-term and medium-term requirements of the data warehouse.

E. Business Requirements
To provide quality deliverables, we should make sure the overall requirements are understood. If we understand the business
requirements for both short-term and medium-term, then we can design a solution to fulfil short-term requirements. The short-term
solution can then be grown to a full solution.
The following aspects are determined in this stage −

 The business rule to be applied on data.


 The logical model for information within the data warehouse.
 The query profiles for the immediate requirement.
 The source systems that provide this data.

F. Technical Blueprint
This phase needs to deliver an overall architecture satisfying the long-term requirements. This phase also delivers the components that
must be implemented in the short term to derive any business benefit. The blueprint needs to identify the following.

 The overall system architecture.


 The data retention policy.
 The backup and recovery strategy.
 The server and data mart architecture.
 The capacity plan for hardware and infrastructure.
 The components of database design.

G. Building the Version


In this stage, the first production deliverable is produced. This production deliverable is the smallest component of a data warehouse.
This smallest component adds business benefit.
H. History Load
This is the phase where the remainder of the required history is loaded into the data warehouse. In this phase, we do not add new
entities, but additional physical tables would probably be created to store increased data volumes.
Let us take an example. Suppose the build version phase has delivered a retail sales analysis data warehouse with 2 months’ worth of
history. This information will allow the user to analyze only the recent trends and address the short-term issues. The user in this case
cannot identify annual and seasonal trends. To help with this, the last 2 years’ sales history could be loaded from the archive. Now the
40GB data is extended to 400GB.
Note − The backup and recovery procedures may become complex, therefore it is recommended to perform this activity within a
separate phase.
I. Ad hoc Query
In this phase, we configure an ad hoc query tool that is used to operate a data warehouse. These tools can generate the database
query.
Note − It is recommended not to use these access tools when the database is being substantially modified.
J. Automation
In this phase, operational management processes are fully automated. These would include −

 Transforming the data into a form suitable for analysis.


 Monitoring query profiles and determining appropriate aggregations to maintain system performance.
 Extracting and loading data from different source systems.
 Generating aggregations from predefined definitions within the data warehouse.
 Backing up, restoring, and archiving the data.

K. Extending Scope
In this phase, the data warehouse is extended to address a new set of business requirements. The scope can be extended in two
ways −

 By loading additional data into the data warehouse.


 By introducing new data marts using the existing information.
Note − This phase should be performed separately, since it involves substantial efforts and complexity.
L. Requirements Evolution
From the perspective of delivery process, the requirements are always changeable. They are not static. The delivery process must
support this and allow these changes to be reflected within the system.
This issue is addressed by designing the data warehouse around the use of data within business processes, as opposed to the data
requirements of existing queries.
The architecture is designed to change and grow to match the business needs. The process operates as a pseudo-application
development process, where the new requirements are continually fed into the development activities and partial deliverables are
produced. These partial deliverables are fed back to the users and then reworked, ensuring that the overall system is continually
updated to meet the business needs.
Assessment
Direction: Fill in the blanks.
1. ____________________________ This is a variant of the joint application development approach adopted for the delivery
of a data warehouse.
2. __________________________ In this phase, we configure an ad hoc query tool that is used to operate a data
warehouse. These tools can generate the database query.
3. __________________________ In this phase, operational management processes are fully automated
4. ___________________________ This phase should be performed separately, since it involves substantial efforts and
complexity.
5. ___________________________ This is the phase where the remainder of the required history is loaded into the data
warehouse.

Lesson 4. System Process

Introduction

We have a fixed number of operations to be applied on operational databases, and we have well-defined techniques such as using
normalized data and keeping tables small. These techniques are suitable for delivering a solution. But in the case of decision-support
systems, we do not know which queries and operations will need to be executed in the future. Therefore the techniques applied on operational
databases are not suitable for data warehouses.

Learning Objectives:

At the end of this lesson, the students will be able to:


1. Discuss how to build data warehousing solutions on top of open-system technologies like Unix and relational
databases.

LESSON PROPER

A. Process Flow in Data Warehouse

There are four major processes that contribute to a data warehouse −


 Extract and load the data.
 Cleaning and transforming the data.
 Backup and archive the data.
 Managing queries and directing them to the appropriate data sources.

B. Extract and Load Process

Data extraction takes data from the source systems. Data load takes the extracted data and loads it into the data warehouse.

Note − Before loading the data into the data warehouse, the information extracted from the external sources must be reconstructed.

Controlling the Process


Controlling the process involves determining when to start data extraction and when to run the consistency check on the data. The controlling process
ensures that the tools, the logic modules, and the programs are executed in correct sequence and at correct time.

When to Initiate Extract
Data needs to be in a consistent state when it is extracted, i.e., the data warehouse should represent a single, consistent version of
the information to the user.
For example, in a customer profiling data warehouse in the telecommunication sector, it is illogical to merge the list of customers at 8 pm
on Wednesday from a customer database with the customer subscription events up to 8 pm on Tuesday. This would mean that we are
finding customers for whom there are no associated subscriptions.
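
One simple way to guard against this situation is to compare the cut-off times of all extracts before merging them. The sketch below assumes each source reports the moment up to which it was extracted. The source names, the dates and the `is_consistent` function are hypothetical.

```python
from datetime import datetime


def is_consistent(extract_cutoffs):
    """Return True when every source was extracted up to the same moment."""
    return len(set(extract_cutoffs.values())) == 1


cutoffs = {
    "customers": datetime(2024, 5, 8, 20, 0),      # Wednesday, 8 pm
    "subscriptions": datetime(2024, 5, 7, 20, 0),  # Tuesday, 8 pm
}

if not is_consistent(cutoffs):
    print("Extracts are out of step; do not merge them yet.")
```

A real controlling process would likely allow a tolerance or wait for the late source. The strict equality used here is only the simplest possible rule.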

Loading the Data


After extracting the data, it is loaded into a temporary data store where it is cleaned up and made consistent.

Note − Consistency checks are executed only when all the data sources have been loaded into the temporary data store.

C. Clean and Transform Process


Once the data is extracted and loaded into the temporary data store, it is time to perform Cleaning and Transforming. Here is the list of
steps involved in Cleaning and Transforming −

 Clean and transform the loaded data into a structure


 Partition the data
 Aggregation

Clean and Transform the Loaded Data into a Structure


Cleaning and transforming the loaded data helps speed up the queries. It can be done by making the data consistent −

 within itself.
 with other data within the same data source.
 with the data in other source systems.
 with the existing data present in the warehouse.
Transforming involves converting the source data into a structure. Structuring the data increases the query performance and
decreases the operational cost. The data contained in a data warehouse must be transformed to support performance requirements
and control the ongoing operational costs.

Partition the Data


Partitioning optimizes hardware performance and simplifies the management of the data warehouse. Here we partition each fact table into
multiple separate partitions.
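
As a minimal sketch, a fact table can be partitioned by month so that each partition can be managed on its own. The partitioning key (month), the sample rows and the function name are assumptions. The text above does not prescribe how the partitions should be chosen.

```python
from collections import defaultdict

# Hypothetical fact rows: (sale_date as "YYYY-MM-DD", amount)
fact_sales = [
    ("2024-01-15", 100),
    ("2024-01-20", 50),
    ("2024-02-03", 75),
]


def partition_by_month(rows):
    """Split a fact table into one partition per month ("YYYY-MM")."""
    partitions = defaultdict(list)
    for sale_date, amount in rows:
        partitions[sale_date[:7]].append((sale_date, amount))
    return dict(partitions)


partitions = partition_by_month(fact_sales)
print(sorted(partitions))  # ['2024-01', '2024-02']
```

A query about February now needs to read only the `2024-02` partition, and an old month can be backed up or archived as a single unit.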

Aggregation
Aggregation is required to speed up common queries. Aggregation relies on the fact that most common queries will analyze a subset
or an aggregation of the detailed data.
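
The idea can be sketched by computing monthly totals per product once, so that common queries read the summary instead of scanning the detailed rows. The sample data and the function name are hypothetical.

```python
# Hypothetical detailed rows: (product, month, amount)
detail = [
    ("chair", "2024-01", 100),
    ("chair", "2024-01", 50),
    ("table", "2024-01", 200),
    ("chair", "2024-02", 70),
]


def build_aggregate(rows):
    """Pre-compute total sales for each (product, month) pair."""
    totals = {}
    for product, month, amount in rows:
        key = (product, month)
        totals[key] = totals.get(key, 0) + amount
    return totals


monthly_sales = build_aggregate(detail)
print(monthly_sales[("chair", "2024-01")])  # 150
```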
Backup and Archive the Data
In order to recover the data in the event of data loss, software failure, or hardware failure, it is necessary to keep regular backups.
Archiving involves removing the old data from the system in a format that allows it to be quickly restored whenever required.
For example, in a retail sales analysis data warehouse, it may be required to keep data for 3 years with the latest 6 months of data being
kept online. In such a scenario, there is often a requirement to be able to do month-on-month comparisons for this year and last year.
In this case, we require some data to be restored from the archive.
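
The retail example can be worked through with a short Python sketch. The 6-month and 3-year figures come from the example above; the function names, the "expired" label for data older than the retention period, and the sample dates are assumptions.

```python
from datetime import date

ONLINE_MONTHS = 6       # latest 6 months are kept online (from the example)
RETENTION_MONTHS = 36   # data is kept for 3 years in total (from the example)


def months_between(earlier, later):
    """Number of whole calendar months from 'earlier' to 'later'."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def storage_tier(data_month, today):
    """Classify a month of data as online, archived, or past retention."""
    age = months_between(data_month, today)
    if age < ONLINE_MONTHS:
        return "online"
    if age < RETENTION_MONTHS:
        return "archive"
    return "expired"  # assumed label; the text does not name this case


today = date(2024, 6, 1)  # hypothetical current month

# Comparing March this year with March last year:
this_year = storage_tier(date(2024, 3, 1), today)
last_year = storage_tier(date(2023, 3, 1), today)
```

Here the March 2024 data is still online, but the March 2023 data is in the archive, so it must be restored before the comparison can run.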
D. Query Management Process
This process performs the following functions −

 manages the queries.
 helps speed up the execution time of queries.
 directs the queries to their most effective data sources.
 ensures that all the system sources are used in the most effective way.
 monitors actual query profiles.
The information generated in this process is used by the warehouse management process to determine which aggregations to
generate. This process does not generally operate during the regular load of information into the data warehouse.
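
One possible way to turn monitored query profiles into aggregation candidates is sketched below. The query log format and the MIN_HITS threshold are assumptions for illustration, not a description of any specific tool.

```python
from collections import Counter

# Hypothetical query log: each entry lists the columns a query grouped by.
query_log = [
    ("month",),
    ("month", "store"),
    ("month",),
    ("product",),
    ("month", "store"),
    ("month",),
]

MIN_HITS = 2  # assumed threshold: aggregate only groupings asked for at least twice

# Count how often each grouping appears in the monitored queries.
profile_counts = Counter(query_log)

# Groupings that are common enough to justify a pre-built aggregation,
# most frequent first.
aggregation_candidates = [
    dims for dims, hits in profile_counts.most_common() if hits >= MIN_HITS
]
```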
Assessment
Essay.

1. Why is there a need to structure data?


____________________________________________________________________________________________________
____________________________________________________________________________________________________
_____________________________________.

Lesson 4. Architecture

Learning Objectives:

At the end of this lesson, the students will be able to:


1. Discuss the business analysis framework for the data warehouse design and architecture of a data warehouse.

LESSON PROPER

A. Business Analysis Framework


The business analyst gets information from the data warehouse to measure performance and make critical adjustments in
order to win over other business holders in the market. Having a data warehouse offers the following advantages −

 Since a data warehouse can gather information quickly and efficiently, it can enhance business productivity.
 A data warehouse provides us a consistent view of customers and items; hence, it helps us manage customer relationships.
 A data warehouse also helps in bringing down costs by tracking trends and patterns over a long period in a consistent and
reliable manner.
To design an effective and efficient data warehouse, we need to understand and analyze the business needs and construct
a business analysis framework. Each person has different views regarding the design of a data warehouse. These views are as
follows −

 The top-down view − This view allows the selection of relevant information needed for a data warehouse.
 The data source view − This view presents the information being captured, stored, and managed by the operational system.
 The data warehouse view − This view includes the fact tables and dimension tables. It represents the information stored
inside the data warehouse.
 The business query view − It is the view of the data from the viewpoint of the end-user.
B. Three-Tier Data Warehouse Architecture
Generally, a data warehouse adopts a three-tier architecture. Following
are the three tiers of the data warehouse architecture.

 Bottom Tier − The bottom tier of the architecture is the data
warehouse database server. It is the relational database
system. We use the back end tools and utilities to feed data
into the bottom tier. These back end tools and utilities perform
the Extract, Clean, Load, and Refresh functions.
 Middle Tier − In the middle tier, we have the OLAP Server that
can be implemented in either of the following ways.
o By Relational OLAP (ROLAP), which is an extended
relational database management system. The
ROLAP maps the operations on multidimensional
data to standard relational operations.
o By Multidimensional OLAP (MOLAP) model, which
directly implements the multidimensional data and
operations.
 Top Tier − This tier is the front-end client layer. This layer
holds the query tools, reporting tools, analysis tools, and
data mining tools.
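
To see how ROLAP can map operations on multidimensional data to standard relational operations, consider the sketch below. It uses Python's built-in sqlite3 module as a stand-in relational database; the sales table and its sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, city TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("North", "Baguio", "pen", 10.0),
        ("North", "Laoag", "pen", 5.0),
        ("South", "Davao", "pen", 7.0),
        ("South", "Davao", "ink", 3.0),
    ],
)

# Roll-up: summarize away the city and product dimensions, keeping only region.
# In relational terms this is a GROUP BY with an aggregate function.
rollup = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Slice: fix one value of a dimension (product = 'pen').
# In relational terms this is a WHERE clause.
pen_slice = conn.execute(
    "SELECT region, city, amount FROM sales "
    "WHERE product = 'pen' ORDER BY region, city"
).fetchall()
```

A MOLAP server would instead store the data in a multidimensional structure and perform these operations on it directly.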
The following diagram depicts the three-tier architecture of a
data warehouse −

C. Data Warehouse Models
From the perspective of data warehouse architecture, we have the
following data warehouse models −

 Virtual Warehouse
 Data mart
 Enterprise Warehouse

Virtual Warehouse
The view over an operational data warehouse is known as a virtual warehouse. It is easy to build a virtual warehouse. Building a
virtual warehouse requires excess capacity on operational database servers.
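
A virtual warehouse can be pictured as a database view. The sketch below uses Python's built-in sqlite3 module as a stand-in for the operational database; the orders table and the customer_totals view are assumed names for illustration.

```python
import sqlite3

# Stand-in operational database with one transaction table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Ana", 100.0), (2, "Ben", 40.0), (3, "Ana", 60.0)],
)

# The "virtual warehouse" is only a view: no data is copied or stored separately.
conn.execute(
    "CREATE VIEW customer_totals AS "
    "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
)

# Every query on the view is computed from the operational table at query time.
totals = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
```

Because the summary is recomputed on the operational server each time the view is queried, the server needs spare capacity beyond its normal transaction workload.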

Data Mart
A data mart contains a subset of organization-wide data. This subset of data is valuable to specific groups of an organization.
In other words, data marts contain data specific to a particular group. For example, the marketing data mart may
contain data related to items, customers, and sales. Data marts are confined to subjects.
Points to remember about data marts −

 Windows-based or Unix/Linux-based servers are used to implement data marts. They are implemented on low-cost servers.
 The implementation cycle of a data mart is measured in short periods of time, i.e., in weeks rather than months or years.
 The life cycle of a data mart may be complex in the long run if its planning and design are not organization-wide.
 Data marts are small in size.
 Data marts are customized by department.
 The source of a data mart is a departmentally structured data warehouse.
 Data marts are flexible.

Enterprise Warehouse
 An enterprise warehouse collects all the information and the subjects spanning an entire organization.
 It provides us enterprise-wide data integration.
 The data is integrated from operational systems and external information providers.
 This information can vary from a few gigabytes to hundreds of gigabytes, terabytes or beyond.

D. Load Manager

This component performs the operations required for the extract and load process.
The size and complexity of the load manager vary between specific solutions, from
one data warehouse to another.

Load Manager Architecture


The load manager performs the following functions −
 Extracts the data from the source system.
 Fast loads the extracted data into a temporary data store.
 Performs simple transformations into a structure similar to the one in the data
warehouse.

Extract Data from Source


The data is extracted from the operational databases or the external information providers. Gateways are the application programs that
are used to extract data. A gateway is supported by the underlying DBMS and allows a client program to generate SQL to be executed at a server.
Open Database Connectivity (ODBC) and Java Database Connectivity (JDBC) are examples of gateways.
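
The pattern of a client program that generates SQL and has it executed at the database can be sketched in Python. This example uses the built-in sqlite3 module only as a stand-in; a real gateway would connect through an ODBC or JDBC driver. The subscriptions table and the extract_events function are assumed names.

```python
import sqlite3

# Stand-in "operational database" with a few subscription events.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE subscriptions (customer_id INTEGER, event TEXT, event_time TEXT)"
)
source.executemany(
    "INSERT INTO subscriptions VALUES (?, ?, ?)",
    [
        (1, "subscribe", "2024-01-01 09:00"),
        (2, "subscribe", "2024-01-02 20:30"),
        (1, "cancel", "2024-01-03 10:15"),
    ],
)


def extract_events(connection, cutoff):
    """Send SQL to the source and return all events up to the cutoff time."""
    query = (
        "SELECT customer_id, event, event_time FROM subscriptions "
        "WHERE event_time <= ? ORDER BY event_time"
    )
    # The cutoff is passed as a parameter instead of being pasted into the SQL text.
    return connection.execute(query, (cutoff,)).fetchall()


extracted = extract_events(source, "2024-01-02 23:59")
```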

Fast Load
 In order to minimize the total load window, the data needs to be loaded into the warehouse in the fastest possible time.
 Transformations affect the speed of data processing.
 It is more effective to load the data into a relational database prior to applying transformations and checks.
 Gateway technology is not suitable here, since gateways tend not to perform well when large data volumes are involved.
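
The "load first, transform later" idea can be sketched as follows, again using Python's built-in sqlite3 module as a stand-in relational database. The table names, the sample rows, and the simple "starts with a digit" check are assumptions for illustration.

```python
import sqlite3

# Hypothetical raw extract: every value arrives as text, and one amount is bad.
raw_rows = [
    ("1001", "2024-01-15", "120.50"),
    ("1002", "2024-01-16", "abc"),
    ("1003", "2024-01-16", "80"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging_sales (sale_id TEXT, sale_date TEXT, amount TEXT)")
conn.execute("CREATE TABLE sales (sale_id INTEGER, sale_date TEXT, amount REAL)")

# Step 1: fast load. Rows go into the temporary store unchanged, in one batch.
with conn:
    conn.executemany("INSERT INTO staging_sales VALUES (?, ?, ?)", raw_rows)

# Step 2: transformations and checks run inside the database on the whole set.
# The GLOB pattern is a crude check that the amount starts with a digit.
with conn:
    conn.execute(
        """
        INSERT INTO sales
        SELECT CAST(sale_id AS INTEGER), sale_date, CAST(amount AS REAL)
        FROM staging_sales
        WHERE amount GLOB '[0-9]*'
        """
    )

loaded = conn.execute("SELECT sale_id, amount FROM sales ORDER BY sale_id").fetchall()
```

Keeping step 1 free of per-row work is what keeps the load window short; the slower checking happens afterwards inside the database.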

Simple Transformations
While loading, it may be required to perform simple transformations. After this has been completed, we are in a position to do the complex
checks. Suppose we are loading the EPOS sales transactions; we then need to do the following:

 Strip out all the columns that are not required within the warehouse.
 Convert all the values to required data types.
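
The two steps above can be sketched in Python. The EPOS field names and the REQUIRED_COLUMNS mapping are assumptions made up for this example.

```python
from datetime import date

# Assumed list of columns the warehouse needs, with the converter for each one.
REQUIRED_COLUMNS = {
    "sale_id": int,
    "sale_date": date.fromisoformat,
    "amount": float,
}


def simple_transform(record):
    """Keep only the required columns and convert each value to its data type."""
    return {
        name: convert(record[name]) for name, convert in REQUIRED_COLUMNS.items()
    }


# Hypothetical raw EPOS record: all values are text, and two columns are not needed.
epos_record = {
    "sale_id": "1001",
    "sale_date": "2024-01-15",
    "amount": "120.50",
    "till_operator": "J. Cruz",
    "receipt_footer": "Thank you!",
}

clean_record = simple_transform(epos_record)
```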

E. Warehouse Manager

A warehouse manager is responsible for the warehouse management process. It consists of third-party system software, C programs,
and shell scripts.
The size and complexity of warehouse managers vary between specific solutions.

Warehouse Manager Architecture


A warehouse manager includes the following −
 The controlling process
 Stored procedures or C with SQL
 Backup/Recovery tool
 SQL Scripts

Operations Performed by Warehouse Manager

 A warehouse manager analyzes the data to perform consistency and
referential integrity checks.
 Creates indexes, business views, partition views against the base data.
 Generates new aggregations and updates existing aggregations.
 Generates normalizations.
 Transforms and merges the source data into the published data warehouse.
 Backs up the data in the data warehouse.
 Archives the data that has reached the end of its captured life.

Note − A warehouse manager also analyzes query profiles to determine which indexes and aggregations are appropriate.
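
A referential integrity check of the kind listed above can be sketched with Python's built-in sqlite3 module. The customer_dim and sales_fact tables and their rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_dim (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE sales_fact (sale_id INTEGER, customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO customer_dim VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [(10, 1, 50.0), (11, 3, 20.0)],  # sale 11 refers to a customer that does not exist
)

# Find fact rows whose customer_id has no matching row in the dimension table.
orphans = conn.execute(
    """
    SELECT f.sale_id, f.customer_id
    FROM sales_fact AS f
    LEFT JOIN customer_dim AS d ON d.customer_id = f.customer_id
    WHERE d.customer_id IS NULL
    """
).fetchall()
```

Any rows returned here would need to be corrected or rejected before the data is merged into the published data warehouse.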

F. Query Manager

 Query manager is responsible for directing the queries to the
suitable tables.
 By directing the queries to appropriate tables, the speed of
querying and response generation can be increased.
 Query manager is responsible for scheduling the execution of
the queries posed by the user.
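
One simple way to direct a query to a suitable table is sketched below. The table catalogue, the column names, and the route_query function are hypothetical; actual query managers use more information than column names alone.

```python
# Hypothetical catalogue of tables, listed from smallest (most summarized)
# to largest (full detail), with the columns each one contains.
TABLES = [
    ("sales_by_month", {"month", "amount"}),
    ("sales_by_month_store", {"month", "store", "amount"}),
    ("sales_detail", {"month", "store", "product", "sale_id", "amount"}),
]


def route_query(needed_columns):
    """Return the smallest table that contains every column the query needs."""
    needed = set(needed_columns)
    for table_name, columns in TABLES:
        if needed <= columns:
            return table_name
    raise ValueError(f"No table can answer a query on {sorted(needed)}")


monthly_target = route_query({"month", "amount"})
product_target = route_query({"product", "amount"})
```

The monthly query is answered from the small summary table, while the product query falls back to the detailed table because no summary contains the product column.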

Query Manager Architecture


The following screenshot shows the architecture of a query manager. It
includes the following:

 Query redirection via C tool or RDBMS
 Stored procedures
 Query management tool
 Query scheduling via C tool or RDBMS
 Query scheduling via third-party software
G. Detailed Information
Detailed information is not kept online; rather, it is aggregated to the next level
of detail and then archived to tape. The detailed information part of the data
warehouse keeps the detailed information in the starflake schema. Detailed
information is loaded into the data warehouse to supplement the aggregated
data.
The following diagram shows a pictorial impression of where detailed
information is stored and how it is used.

Assessment

1. What does a data warehouse offer? What advantages does it provide?
____________________________________________________________________________________________________
____________________________________________________________________________________________________
____________________________________________________________________________________________________
____________________________________________________________________________________________________
__________________________________________________________________________________.

VALUES INTEGRATION:

1. How will you show perseverance in understanding IM as a new subject in your course?
2. In creating group projects for Information Management, how will you show cooperation with the team?


Prepared by:

JELINDA T. MACASUSI
Full-time Lecturer
