Knowledge Management UNIT-2 Notes

Subject:- Knowledge Management

Code:-BCA-604

Unit-2 Notes
Expert System
An expert system is a computer program that is designed to solve complex problems and to
provide decision-making ability like a human expert. It performs this by extracting
knowledge from its knowledge base using the reasoning and inference rules according to the
user queries.

● Expert systems are a branch of AI. The first ones were developed in the 1960s and 1970s
and were among the first successful applications of artificial intelligence.
● An expert system tackles complex problems the way an expert would, by drawing on the
knowledge stored in its knowledge base. It supports decision making for complex problems
using both facts and heuristics, like a human expert.
● It is called an expert system because it contains the expert knowledge of a specific
domain and can solve complex problems within that domain. Such systems are designed for
one particular domain, such as medicine or science.

Popular examples of the Expert System:


● DENDRAL: It was an artificial intelligence project that was made as a chemical
analysis expert system. It was used in organic chemistry to detect unknown organic
molecules with the help of their mass spectra and knowledge base of chemistry.
● MYCIN: It was one of the earliest backward chaining expert systems that was
designed to find the bacteria causing infections like bacteraemia and meningitis. It
was also used for the recommendation of antibiotics and the diagnosis of blood
clotting diseases.
● PXDES: It is an expert system used to determine the type and stage of lung cancer.
It takes an image of the upper body in which the affected area appears as a shadow,
and uses this shadow to identify the type and degree of the disease.
● CaDet: The CaDet expert system is a diagnostic support system that can detect
cancer at early stages.

Characteristics of Expert System


● High Performance: The expert system solves complex problems of its specific domain
with high efficiency and accuracy.
● Understandable: It responds in a way the user can easily understand. It can take
input in human language and provide the output in the same way.
● Reliable: It is highly reliable at generating efficient and accurate output.
● Highly responsive: ES returns the result for a complex query within a very short
period of time.

Components of Expert System


An expert system mainly consists of three components:

● User Interface
● Inference Engine
● Knowledge Base
1. User Interface

With the help of a user interface, the expert system interacts with the user, takes queries as an
input in a readable format, and passes it to the inference engine. After getting the response
from the inference engine, it displays the output to the user. In other words, it is an interface
that helps a non-expert user to communicate with the expert system to find a solution.

2. Inference Engine (Rules Engine)


● The inference engine is known as the brain of the expert system as it is the main
processing unit of the system. It applies inference rules to the knowledge base to
derive a conclusion or deduce new information. It helps in deriving an error-free
solution of queries asked by the user.
● With the help of an inference engine, the system extracts the knowledge from the
knowledge base.
● There are two types of inference engine:

1) Deterministic inference engine: The conclusions drawn by this type of inference
engine are assumed to be true. It is based on facts and rules.
2) Probabilistic inference engine: The conclusions drawn by this type of inference
engine carry uncertainty and are based on probability.

The inference engine uses the following modes to derive solutions:

● Forward Chaining: It starts from the known facts and applies the inference rules,
adding each rule's conclusion to the known facts until the goal is reached or no new
facts can be derived.
● Backward Chaining: It is a backward reasoning method that starts from the goal and
works backward through the rules to check whether the known facts support it.
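
The two chaining modes can be illustrated with a minimal sketch. The rule format (a set of premises plus one conclusion), the animal rules, and the function names are assumptions made for illustration; they are not taken from any particular expert system shell.

```python
# Hypothetical rule format: (set of premises, conclusion).
RULES = [
    ({"has_fur", "gives_milk"}, "mammal"),
    ({"mammal", "eats_meat"}, "carnivore"),
    ({"carnivore", "has_mane"}, "lion"),
]


def forward_chain(facts, rules):
    """Start from known facts and fire rules until nothing new is added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules, seen=frozenset()):
    """Start from the goal and work backward to the facts that support it."""
    if goal in facts:
        return True
    if goal in seen:  # guard against circular rules
        return False
    seen = seen | {goal}
    for premises, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(p, facts, rules, seen) for p in premises
        ):
            return True
    return False


facts = {"has_fur", "gives_milk", "eats_meat", "has_mane"}
print(sorted(forward_chain(facts, RULES)))
print(backward_chain("lion", facts, RULES))
```

Forward chaining derives everything that follows from the facts, while backward chaining only explores the rules needed for the one goal it was asked about.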

3. Knowledge Base
● The knowledge base is a store of knowledge acquired from different experts of the
particular domain. It can be regarded as a large repository of knowledge. The more
knowledge the knowledge base holds, the more precise the expert system will be.
● It is similar to a database that contains information and rules of a particular domain or
subject.
● One can also view the knowledge base as a collection of objects and their attributes.
For example, a lion is an object, and its attributes are that it is a mammal, it is not a
domestic animal, etc.
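
The "objects and their attributes" view can be sketched as a small dictionary. The attribute names and the second entry ("cow") are assumptions added only so that the lookup has something to compare against.

```python
# Each object maps to its attributes; "cow" is a hypothetical extra entry.
knowledge_base = {
    "lion": {"is_mammal": True, "is_domestic": False},
    "cow": {"is_mammal": True, "is_domestic": True},
}


def objects_with(kb, attribute, value):
    """Return the names of all objects whose attribute has the given value."""
    return sorted(
        name for name, attrs in kb.items() if attrs.get(attribute) == value
    )


print(objects_with(knowledge_base, "is_domestic", False))
```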

Data Warehouse
A Data Warehouse (DW) is a relational database that is designed for query and analysis rather
than transaction processing. It includes historical data derived from transaction data from
one or more sources.

A Data Warehouse provides integrated, enterprise-wide, historical data and focuses on
providing support for decision-makers for data modeling and analysis.

A Data Warehouse can be viewed as a data system with the following attributes:

● It is a database designed for investigative tasks, using data from various applications.
● It supports a relatively small number of clients with relatively long interactions.
● It includes current and historical data to provide a historical perspective of
information.
● Its usage is read-intensive.
● It contains a few large tables.

Components or Building Blocks of Data Warehouse


Architecture is the proper arrangement of the elements. We build a data warehouse with
software and hardware components. To suit the requirements of our organization, we arrange
these building blocks in a particular way. We may also want to strengthen one part with extra
tools and services. All of this depends on our circumstances.
The figure shows the essential elements of a typical warehouse. We see the Source Data
component shown on the left. The Data Staging element serves as the next building block. In
the middle, we see the Data Storage component that handles the data warehouse's data. This
element not only stores and manages the data; it also keeps track of the data using the metadata
repository. The Information Delivery component on the right consists of all the different ways
of making the information from the data warehouse available to the users.

Data Warehouse architecture


A data warehouse is a heterogeneous collection of different data sources organized under a
unified schema. There are two approaches for constructing a data warehouse: the top-down
approach and the bottom-up approach. Both are explained below.

1. Top-down approach:

The essential components are discussed below:

1. External Sources –

An external source is any source from which data is collected, irrespective of the type of data.
The data can be structured, semi-structured, or unstructured.

2. Stage Area –

Since the data extracted from the external sources does not follow a particular format,
it must be validated before it is loaded into the data warehouse. For this purpose, an
ETL tool is recommended.
● E (Extract): Data is extracted from the external data source.
● T (Transform): Data is transformed into the standard format.
● L (Load): Data is loaded into the data warehouse after being transformed into
the standard format.
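
The three ETL steps can be sketched in a few lines. This is an illustration under stated assumptions, not a real ETL tool: the sample records, the `sales` table, and the function names are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical raw records with inconsistent formats.
source_rows = [
    {"customer": " Alice ", "amount": "120.50", "date": "2024-01-05"},
    {"customer": "BOB", "amount": "80", "date": "2024-01-06"},
    {"customer": "", "amount": "15", "date": "2024-01-07"},  # fails validation
]


def extract(rows):
    """E: pull the raw records from the external source."""
    return list(rows)


def transform(rows):
    """T: validate each record and convert it to the standard format."""
    cleaned = []
    for row in rows:
        name = row["customer"].strip().title()
        if not name:  # drop records without a customer name
            continue
        cleaned.append((name, float(row["amount"]), row["date"]))
    return cleaned


def load(conn, rows):
    """L: write the standardized records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(source_rows)))
loaded = warehouse.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
print(loaded)
```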

3. Data-warehouse –

After cleansing, the data is stored in the data warehouse, which acts as the central
repository. The warehouse holds the data together with its metadata, and the data marts
are then populated from it. Note that in this top-down approach the data warehouse stores
the data in its purest, most detailed form.

4. Data Marts –

A data mart is also part of the storage component. It stores the information of a particular
function of an organization that is handled by a single authority. An organization can have
as many data marts as it has functions. We can also say that a data mart contains a subset
of the data stored in the data warehouse.
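
The idea that a data mart holds a subset of the warehouse for one function can be sketched as a simple filter. The rows, the `department` field, and the function name are assumptions chosen for illustration.

```python
# Hypothetical warehouse rows covering several business functions.
warehouse_rows = [
    {"department": "sales", "region": "north", "amount": 100},
    {"department": "sales", "region": "south", "amount": 250},
    {"department": "hr", "region": "north", "amount": 40},
]


def build_data_mart(rows, department):
    """Select only the warehouse rows that belong to one business function."""
    return [row for row in rows if row["department"] == department]


sales_mart = build_data_mart(warehouse_rows, "sales")
print(len(sales_mart), "of", len(warehouse_rows), "rows are in the sales mart")
```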
5. Data Mining –

Data mining is the practice of analyzing the large volumes of data present in the data
warehouse. It is used to find hidden patterns in a database or data warehouse with the
help of data mining algorithms.
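
As one small example of finding a hidden pattern, the sketch below counts which pairs of items are bought together often. It is a simplified co-occurrence count, not a full association-rule algorithm; the baskets, the `min_support` threshold, and the function name are assumptions.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets taken from the warehouse.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]


def frequent_pairs(baskets, min_support):
    """Count item pairs and keep those seen in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


print(frequent_pairs(transactions, min_support=3))
```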

2. Bottom-up approach:

1. First, the data is extracted from external sources (the same as in the top-down
approach).
2. Then, the data goes through the staging area (as explained above) and is loaded
into data marts instead of the data warehouse. The data marts are created first and
provide reporting capability. Each data mart addresses a single business area.
3. These data marts are then integrated into the data warehouse.
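
The ordering of the bottom-up approach (marts first, warehouse last) can be sketched as follows. The staged rows, the `area` field, and both function names are hypothetical.

```python
# Hypothetical rows that have already passed through the staging area.
staged_rows = [
    {"area": "sales", "amount": 100},
    {"area": "hr", "amount": 40},
    {"area": "sales", "amount": 250},
]


def build_marts(rows):
    """Step 2: load staged rows into one data mart per business area."""
    marts = {}
    for row in rows:
        marts.setdefault(row["area"], []).append(row)
    return marts


def integrate(marts):
    """Step 3: combine the existing data marts into the data warehouse."""
    warehouse = []
    for area in sorted(marts):
        warehouse.extend(marts[area])
    return warehouse


marts = build_marts(staged_rows)
warehouse = integrate(marts)
print(sorted(marts), len(warehouse))
```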

Difference between Database and Data Warehouse

| Database | Data Warehouse |
|---|---|
| 1. It is used for Online Transaction Processing (OLTP) but can also be used for other objectives such as data warehousing. It records data from clients for later reference. | 1. It is used for Online Analytical Processing (OLAP). It reads historical customer information to support business decisions. |
| 2. The tables and joins are complicated since they are normalized for RDBMS. This is done to reduce redundant data and to save storage space. | 2. The tables and joins are simple since they are denormalized. This is done to minimize the response time for analytical queries. |
| 3. Data is dynamic. | 3. Data is largely static. |
| 4. Entity-relationship modeling techniques are used for RDBMS database design. | 4. Data modeling techniques are used for Data Warehouse design. |
| 5. Optimized for write operations. | 5. Optimized for read operations. |
| 6. Performance is low for analytical queries. | 6. Performance is high for analytical queries. |
| 7. A database is where data is stored and managed so that it can be accessed quickly and efficiently. | 7. A Data Warehouse is where application data is handled for analysis and reporting objectives. |
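
Row 2 of the comparison can be made concrete with a small sketch: the same total is computed once from normalized tables (which needs a join) and once from a denormalized table (which does not). The table names, columns, and sample rows are assumptions, and an in-memory SQLite database stands in for both systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized design: customers and orders live in separate tables.
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 70.0), (3, 2, 30.0);

-- Denormalized design: the customer name is repeated on every row.
CREATE TABLE sales_flat (customer_name TEXT, amount REAL);
INSERT INTO sales_flat
    SELECT c.name, o.amount
    FROM orders o JOIN customers c ON c.id = o.customer_id;
""")

# The normalized query needs a join to reach the customer name.
normalized = conn.execute(
    "SELECT c.name, SUM(o.amount) FROM orders o "
    "JOIN customers c ON c.id = o.customer_id "
    "GROUP BY c.name ORDER BY c.name"
).fetchall()

# The denormalized query reads a single table.
denormalized = conn.execute(
    "SELECT customer_name, SUM(amount) FROM sales_flat "
    "GROUP BY customer_name ORDER BY customer_name"
).fetchall()

print(normalized)
print(denormalized)
```

Both queries return the same totals. The denormalized table trades repeated customer names for a simpler query.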

Difference between OLTP and OLAP

OLTP System
An OLTP system handles operational data. Operational data is the data involved in the
day-to-day operation of a particular system, for example ATM transactions and bank transactions.

OLAP System
An OLAP system handles historical or archival data. Historical data is data that has been
archived over a long period. For example, if we collect the last 10 years of information about
flight reservations, the data can reveal meaningful information such as trends in
reservations. This may provide useful insights like the peak time of travel and what kind of
people travel in the various classes (Economy/Business).

| Feature | OLTP | OLAP |
|---|---|---|
| Characteristic | It is a system used to manage operational data. | It is a system used to manage informational data. |
| Users | Clerks, clients, and information technology professionals. | Knowledge workers, including managers, executives, and analysts. |
| System orientation | An OLTP system is customer-oriented. Transaction and query processing are done by clerks, clients, and information technology professionals. | An OLAP system is market-oriented. It is used for data analysis by knowledge workers, including managers, executives, and analysts. |
| Data contents | An OLTP system manages current data that is typically too detailed to be easily used for decision making. | An OLAP system manages a large amount of historical data, provides facilities for summarization and aggregation, and stores and manages data at different levels of granularity. These features make the data easier to use in informed decision making. |
| Database size | 100 MB to GB range | 100 GB to TB range |
| Database design | OLTP systems usually use an entity-relationship (ER) data model and an application-oriented database design. | OLAP systems typically use either a star or snowflake model and a subject-oriented database design. |
| View | An OLTP system focuses primarily on the current data within an enterprise or department, without referring to historical information or data in different organizations. | An OLAP system often spans multiple versions of a database schema, due to the evolutionary process of an organization. OLAP systems also deal with data that originates from various organizations, integrating information from many data stores. |
| Volume of data | Not very large. | Because of its large volume, OLAP data is stored on multiple storage media. |
| Access patterns | The access patterns of an OLTP system consist mainly of short, atomic transactions. Such a system requires concurrency control and recovery techniques. | Accesses to OLAP systems are mostly read-only operations because these data warehouses store historical data. |
| Access mode | Read/write | Mostly read |
| Inserts and updates | Short and fast inserts and updates initiated by end users. | Periodic long-running batch jobs refresh the data. |
| Number of records accessed | Tens | Millions |
| Normalization | Fully normalized | Partially normalized |
| Processing speed | Very fast | It depends on the amount of data involved. Batch data refreshes and complex queries may take many hours. Query speed can be improved by creating indexes. |
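
The difference in access patterns can be sketched with two small functions: an OLTP-style short atomic write and an OLAP-style read-only aggregation. The tables, sample figures, and function names are assumptions, and a single in-memory SQLite database is used only to keep the example self-contained; in practice the two workloads usually run on separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
conn.execute(
    "CREATE TABLE bookings (year INTEGER, travel_class TEXT, seats INTEGER)"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 500.0), (2, 100.0)])
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?, ?)",
    [
        (2022, "Economy", 120),
        (2022, "Business", 30),
        (2023, "Economy", 150),
        (2023, "Business", 45),
    ],
)
conn.commit()


def transfer(conn, src, dst, amount):
    """OLTP-style: a short atomic transaction touching a couple of rows."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, src),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, dst),
        )


def seats_by_class(conn):
    """OLAP-style: a read-only aggregation over the historical records."""
    return conn.execute(
        "SELECT travel_class, SUM(seats) FROM bookings "
        "GROUP BY travel_class ORDER BY travel_class"
    ).fetchall()


transfer(conn, 1, 2, 50.0)
balances = conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall()
print(balances)
print(seats_by_class(conn))
```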
