Tech Max

Unit V
Knowledge Management and Artificial Intelligence and Expert Systems
Syllabus Topic : Introduction to Knowledge Management
5.1 Introduction to Knowledge Management
- Knowledge management is an activity practised by enterprises all over the world. In the process of knowledge management, these enterprises comprehensively gather information using many methods and tools.
- The gathered information is then organized, stored, shared, and analysed using defined techniques. The analysis of such information is based on resources, documents, people and their skills.

- Properly analysed information is then stored as the 'knowledge' of the enterprise. This knowledge is later used for activities such as organizational decision making and training new staff members.
- Processes have been automated. Therefore, information storing, retrieval and sharing have become convenient. Nowadays, most enterprises have their own knowledge management framework in place.
- The framework defines the knowledge gathering, the data storing tools and techniques, and the analysing mechanism.
Business Intelligence (MU-B.Sc.-IT-Sem-VI) Knowledge Mgmt. & AI & Expert Systems
5.1.1 The Knowledge Management Process
Q. 5.1.2 Explain knowledge management process. (Ref. Sec. 5.1.1) (6 Marks)
Q. 5.1.3 Write short note on approaches to knowledge management. (Ref. Sec. 5.1.1) (5 Marks)
- The process of knowledge management is universal for any enterprise. Sometimes, the resources used, such as tools and techniques, can be unique to the organizational environment.
- The Knowledge Management process has six basic steps assisted by different tools and techniques. When these steps are followed sequentially, the data transforms into knowledge.
Fig. 5.1.1 (figure labels, bottom to top: Data, Collecting, Organizing, Summarizing, Analyzing, Synthesizing, Decision Making)
Step 1 : Collecting
- This is the most important step of the knowledge management process. If you collect incorrect or irrelevant data, the resulting knowledge may not be accurate. Therefore, the decisions made based on such knowledge could be inaccurate as well.
- There are many methods and tools used for data collection. First of all, data collection should be a defined procedure in the knowledge management process. These procedures should be properly documented and followed by the people involved in the data collection process.
- The data collection procedure defines certain data collection points. Some points may be the summary of certain routine reports. As an example, the monthly sales report and daily attendance reports may be two good resources for data collection.
- With data collection points, the data extraction techniques and tools are also defined. As an example, the sales report may be a paper-based report where a data entry operator needs to feed the data manually to a database, whereas the daily attendance report may be an online report where it is directly stored in the database.
- In addition to data collecting points and extraction mechanism, data storage is also defined in this step. Most organizations now use a software database application for this purpose.
Step 2 : Organizing
- The data collected needs to be organized. This organization usually happens based on certain rules. These rules are defined by the organization.
- As an example, all sales-related data can be filed together and all staff-related data could be stored in the same database table. This type of organization helps to maintain data accurately within a database.
- If there is much data in the database, techniques such as 'normalization' can be used for organizing and reducing the duplication.
- This way, data is logically arranged and related to one another for easy retrieval. When data passes step 2, it becomes information.
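As a minimal sketch of the normalization idea mentioned in this step, the following Python snippet splits a flat list of sales records, in which customer details repeat on every row, into two related tables. The record layout and the names `flat_sales`, `customers` and `sales` are hypothetical and serve only as an illustration.

```python
# Hypothetical flat sales records: customer details repeat on every row.
flat_sales = [
    {"sale_id": 1, "customer": "Asha", "city": "Mumbai", "amount": 1200},
    {"sale_id": 2, "customer": "Asha", "city": "Mumbai", "amount": 800},
    {"sale_id": 3, "customer": "Ravi", "city": "Pune", "amount": 500},
]


def normalize(records):
    """Split flat records into a customers table and a sales table."""
    customers = {}  # customer name -> {"customer_id", "city"}
    sales = []      # each sale refers to its customer by id only
    for row in records:
        name = row["customer"]
        if name not in customers:
            customers[name] = {"customer_id": len(customers) + 1, "city": row["city"]}
        sales.append({
            "sale_id": row["sale_id"],
            "customer_id": customers[name]["customer_id"],
            "amount": row["amount"],
        })
    return customers, sales


customers, sales = normalize(flat_sales)
print(customers)
print(sales)
```

After the split, each customer's city is stored once, and every sale points to its customer through `customer_id`.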
Step 3 : Summarizing
- In this step, the information is summarized in order to take the essence of it. The lengthy
information is presented in tabular or graphical format and stored appropriately.

- For summarizing, there are many tools that can be used such as software packages, charts
(Pareto, cause-and-effect), and different techniques.
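One way to picture a Pareto-style summary is a table of categories sorted by frequency with a running cumulative percentage. The sketch below assumes a hypothetical list of complaint categories; the function name `pareto_table` is illustrative and not taken from any particular tool.

```python
from collections import Counter

# Hypothetical complaint categories gathered during data collection.
complaints = ["late delivery", "damaged item", "late delivery", "wrong item",
              "late delivery", "damaged item", "late delivery", "billing error"]


def pareto_table(items):
    """Return (category, count, cumulative_percent) rows, largest count first."""
    counts = Counter(items).most_common()
    total = sum(count for _, count in counts)
    rows, running = [], 0
    for category, count in counts:
        running += count
        rows.append((category, count, round(100 * running / total, 1)))
    return rows


for category, count, cumulative in pareto_table(complaints):
    print(f"{category:15} {count:3} {cumulative:6}%")
```

Reading the cumulative column from the top shows how few categories account for most of the records.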
Step 4: Analyzing
- At this stage, the information is analyzed in order to find the relationships, redundancies and patterns.
- An expert or an expert team should be assigned for this purpose as the experience of the
person/team plays a vital role. Usually, there are reports created after analysis of
information.
Step 5 : Synthesizing
- At this point, information becomes knowledge. The results of analysis (usually the reports) are combined together to derive various concepts and artefacts.
- A pattern or behavior of one entity can be applied to explain another, and collectively, the organization will have a set of knowledge elements that can be used across the organization.
- This knowledge is then stored in the organizational knowledge base for further use.
- Usually, the knowledge base is a software implementation that can be accessed from anywhere through the Internet.
- You can also buy such knowledge base software or download an open-source implementation of the same for free.
Step 6: Decision Making
- At this stage, the knowledge is used for decision making. As an example, when estimating a specific type of project or task, the knowledge related to previous estimates can be used.
- This accelerates the estimation process and improves its accuracy. This is how organizational knowledge management adds value and saves money in the long run.
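The estimation example can be sketched as follows. The snippet assumes a hypothetical knowledge base of past projects that records both the estimated and the actual effort, and it scales a new raw estimate by the average overrun of similar projects. The data, field names and the function `adjusted_estimate` are illustrative assumptions, not a prescribed method.

```python
from statistics import mean

# Hypothetical knowledge-base entries: past estimates and actual effort in days.
past_projects = [
    {"type": "website", "estimated": 20, "actual": 26},
    {"type": "website", "estimated": 30, "actual": 33},
    {"type": "mobile app", "estimated": 40, "actual": 60},
]


def adjusted_estimate(project_type, raw_estimate, history):
    """Scale a raw estimate by the average actual/estimated ratio of similar projects."""
    ratios = [p["actual"] / p["estimated"] for p in history if p["type"] == project_type]
    if not ratios:
        return raw_estimate  # no prior knowledge: keep the raw estimate
    return raw_estimate * mean(ratios)


print(adjusted_estimate("website", 10, past_projects))
```

A project type with no history simply falls back to the raw estimate, which mirrors the situation before any knowledge has been collected.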
Syllabus Topic : Roles of People in Knowledge Management
5.2 Roles of People in Knowledge Management

- People are ultimately the holders of knowledge. The goal is to encourage them not only to search for it, improve it and apply it to improving internal processes, but also to see the benefits of sharing it with the organization. In this context it is important:
1. To give people autonomy in their jobs and let them find new ways to fulfil them.
2. To provide proper systems for storing and sharing knowledge.
3. To empower them and continually train them.
4. To keep them motivated.
5. To give them adequate remuneration, to ensure their commitment.
- The manager should always be aware of the fact that decisions made by people can affect the entire organization.
- That is why their motivation is crucial. It is what will make employees share with colleagues, and replicate, the knowledge they accumulate in their activities in the company.
- The worst that can happen is to lose that talent to the competition, along with everything they have learned.
Syllabus Topic : Organizational Learning
5.3 Learning Organisation
Q. 5.3.1 Write short note on learning organization. (Ref. Sec. 5.3) (5 Marks)
- The learning organisation is an organisation characterised by a deep commitment to learning and education with the intention of continuous improvement.
- This concept reviews several theories relating to the learning organisation, including some criticism.
- Also, it examines some evidence on how learning organisations operate. Learning organisations facilitate collective learning in order to continually improve the capacity to respond to changing demands in the environment.
- This permeates all organisational activities, structures, processes, climate and values, leading to an enhanced ability to react quickly to opportunities and threats.
Syllabus Topic : Organizational Transformation
5.4 Organizational Transformation
Q. 5.4.1 Write short note on organizational transformation. (Ref. Sec. 5.4)
- Organizational transformation takes place when there is a change in the way the business is done or in the event of a re-engineering or restructuring activity.
- Along with the structural changes, the attitude of the employees, their perspectives and the culture of the organization undergo a significant change.
- It's about re-modelling an organization in its entirety.

Fig. 5.4.1
There are three key stages for managing organisational transformation along with the
critical success factors for managing change at each stage.
Stage 1 : Break with the past
- Bring in outsiders. The Board should introduce entrepreneurial outsiders with targeted expertise onto the top management team.
- Break with your administrative heritage. Important mechanisms here can be the removal of blockers, rotation of managers, promotion of young managers untainted by the organisational heritage, the utilisation of project teams, the achievement of early successes and designing a suitable bonus/incentive system.
- Use aspects of the administrative heritage that help the change process. Not everything that worked in the past needs to be thrown away.
- This will vary from company to company. Some may be able to leverage a traditional command-and-control management style to achieve more rapid implementation of change. However, in environments where a more democratic leadership style is the norm, it may be more appropriate to leverage other factors, for example, customer relationships, a strong R&D department, or the latent enthusiasm of organisational members for participating in new initiatives. Crisis is also an important lever for organisational change.
Stage 2 : Manage the present
- Vary your leadership style as appropriate. The top-down approach of Stage 1 may be still required to break with the past in some parts of the organisation, while other parts may by this stage already have the ability to learn and therefore may be given authority and empowerment to act.
- Exploit best practice from your own or other organisations. This will require knowledge acquisition, knowledge internalisation and knowledge dissemination.
- Reconfigure, divest and integrate resources. This involves everything from streamlining business systems to removing non-aligned employees to consolidating new acquisitions operationally and culturally.
Stage 3: Invest in the future
- Empower the organisation. The top management team should delegate to employees as well as motivating and enabling them to act.
- Enable the organisation to engage in exploration of new ideas and business practices. You can achieve this by encouraging innovation, trial and experimentation and by developing a culture which encourages informed risk-taking and facilitates learning from mistakes. Exploration enables the organisation to develop new capabilities fitted to its specific context, rather than just importing systems and routines from other contexts.
- Create new paths. This means creating a deliberate change in direction using new capabilities, whether that be in terms of new products, services, processes or business models.
The combination of exploration and path creation will lead you to the "disruptive innovation" that will help you secure sustainable competitive advantage.
By going through these stages, organizations can establish new developmental pathways, enhance their strategic flexibility, and react successfully to changes in the environment.
Syllabus Topic : Knowledge Management Activities
5.5 Knowledge Management Activities
Q. 5.5.1 Explain knowledge management activities. (Ref. Sec. 5.5)
- A winning knowledge management program increases staff productivity, product and service quality, and deliverable consistency by capitalizing on intellectual and knowledge-based assets.
- Many organizations leap into a knowledge management solution (e.g. document management, data mining, blogging, and community forums) without first considering the purpose or objectives they wish to fulfill or how the organization will adopt and follow best practices for managing its knowledge assets long term.
- A successful knowledge management program will consider more than just technology. An organization should also consider:
- People : They represent how you increase the ability of individuals within the organization to influence others with their knowledge.
- Processes : They involve how you establish best practices and governance for the efficient and accurate identification, management, and dissemination of knowledge.
- Technology : It addresses how you choose, configure, and utilize tools and automation to enable knowledge management.
- Structure : It directs how you transform organizational structures to facilitate and encourage cross-discipline awareness and expertise.
- Culture : It embodies how you establish and cultivate a knowledge-sharing, knowledge-driven culture.
5.5.1 The Power of Knowledge Management
- Implementing a complete knowledge management program takes time and money. However, the results can be impressive and risks can be minimized by taking a phased approach that gives beneficial returns at each step.
- Organizations that have made this kind of investment in knowledge management realize tangible results quickly.
- They add to their top and bottom lines through faster cycle times, enhanced efficiency, better decision making and greater use of tested solutions across the enterprise.
Syllabus Topic : Approaches to Knowledge Management
5.6 Approaches to Knowledge Management
Approaches to Knowledge Management are explained in Section 5.1.1.
Syllabus Topic : Information Technology (IT) in Knowledge Management
5.7 Information Technology (IT) in Knowledge Management
Q. 5.7.1 Explain IT in knowledge management. (Ref. Sec. 5.7)

KM was initially driven primarily by IT, information technology, and the desire to put
that new technology, the Internet, to work and see what it was capable of.
That first stage has been described using a horse breeding metaphor as “by the internet out
of intellectual capital,” the sire and the dam.
The concept of intellectual capital, the notion that not just physical resources, capital, and
manpower, but also intellectual capital (knowledge) fueled growth and development,
provided the justification, the framework, and the seed. The availability of the internet
provided the tool.
As described above, the management consulting community jumped at the new
capabilities provided by the Internet, using it first for themselves, realizing that if they
shared knowledge across their organization more effectively they could avoid reinventing
the wheel, underbid their competitors, and make more profit.
The central point is that the first stage of KM was about how to deploy that new technology to accomplish more effective use of information and knowledge.
The first stage might be described as the "If only Texas Instruments knew what Texas Instruments knew" stage, to revisit a much quoted KM mantra. The hallmark phrase of Stage 1 was first "best practices," later replaced by the more politic "lessons learned."
Syllabus Topic : Knowledge Management Systems Implementation
5.8 Knowledge Management Systems Implementation
Q. 5.8.1 Write steps involved in knowledge management system implementation. (Ref. Sec. 5.8) (5 Marks)
* Steps to Implementation
Implementing a knowledge management program is no easy feat. You will encounter many challenges along the way, including many of the following:
- Inability to recognize or articulate knowledge; turning tacit knowledge into explicit knowledge.
- Geographical distance and/or language barriers in an international company.
- Limitations of information and communication technologies.
- Loosely defined areas of expertise.
- Internal conflicts (e.g. professional territoriality).
- Lack of incentives or performance management goals.
- Poor training or mentoring programs.
- Cultural barriers (e.g. "this is how we've always done it" mentality).
The following eight-step approach will enable you to identify these challenges so you can
plan for them, thus minimizing the risks and maximizing the rewards. This approach was
developed based on logical, tried-and-true activities for implementing any new organizational
program. The early steps involve strategy, planning, and requirements gathering while the later
steps focus on execution and continual improvement.
Step 1 : Establish Knowledge Management Program Objectives
- Before selecting a tool, defining a process, and developing workflows, you should envision and articulate the end state.
- In order to establish the appropriate program objectives, identify and document the business problems that need resolution and the business drivers that will provide momentum and justification for the endeavor.
Provide both short-term and long-term objectives that address the business problems and
support the business drivers. Short-term objectives should seek to provide validation that
the program is on the right path while long-term objectives will help to create and
communicate the big picture.
Step 2: Prepare for Change
- Knowledge management is more than just an application of technology. It involves
cultural changes in the way employees perceive and share knowledge they develop or
possess.
- One common cultural hurdle to increasing the sharing of knowledge is that companies
primarily reward individual performance.
- This practice promotes a "knowledge is power" behavior that contradicts the desired
knowledge-sharing, knowledge-driven culture end state you are after.
- Successfully implementing a new knowledge management program may require changes
within the organization's norms and shared values; changes that some people might resist
or even attempt to quash.
- To minimize the negative impact of such changes, it's wise to follow an established
approach for managing cultural change.
Step 3 : Define High-Level Process
- To facilitate the effective management of your organization's knowledge assets, you
should begin by laying out a high-level knowledge management process.
- The process can be progressively developed with detailed procedures and work
instructions throughout steps four, five, and six. However, it should be finalized and
approved prior to step seven (implementation).
- Organizations that overlook or loosely define the knowledge management process will not realize the full potential of their knowledge management objectives.
- How knowledge is identified, captured, categorized, and disseminated will be ad hoc at best. There are a number of knowledge management best practices, all of which comprise similar activities.
- In general, these activities include knowledge strategy, creation, identification, classification, capture, validation, transfer, maintenance, archival, measurement, and reporting.
Step 4 : Determine and Prioritize Technology Needs
- Depending on the program objectives established in step one and the process controls and criteria defined in step three, you can begin to determine and prioritize your knowledge management technology needs.
- With such a variety of knowledge management solutions, it is imperative to understand the cost and benefit of each type of technology and the primary technology providers in the marketplace.
- Don't be too quick to purchase a new technology without first determining if your existing
technologies can meet your needs.
- You can also wait to make costly technology decisions after the knowledge management program is well underway if there is broad support and a need for enhanced computing and automation.
Step 5: Assess Current State
- Now that you've established your program objectives to solve your business problems, prepared for change to address cultural issues, defined a high-level process to enable the effective management of your knowledge assets, and determined and prioritized your technology needs that will enhance and automate knowledge management related activities, you are in a position to assess the current state of knowledge management within your organization.
- The knowledge management assessment should cover all five core knowledge management components: people, processes, technology, structure, and culture.
- A typical assessment should provide an overview of the assessment, the gaps between current and desired states, and the recommendations for attenuating identified gaps. The recommendations will become the foundation for the roadmap in step six.
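A current-state assessment of this kind can be pictured as a comparison of current and desired scores for the five components. The sketch below is only an illustration: the 1 to 5 maturity scale, the scores and the function name `assess_gaps` are assumptions, not part of any standard assessment method.

```python
# Hypothetical maturity scores (1 = low, 5 = high) for the five components.
current = {"people": 2, "processes": 3, "technology": 4, "structure": 2, "culture": 1}
desired = {"people": 4, "processes": 4, "technology": 4, "structure": 3, "culture": 4}


def assess_gaps(current_state, desired_state):
    """Return (component, gap) pairs with a positive gap, largest gap first."""
    gaps = {name: desired_state[name] - current_state[name] for name in desired_state}
    ranked = sorted(gaps.items(), key=lambda item: item[1], reverse=True)
    return [(name, gap) for name, gap in ranked if gap > 0]


for component, gap in assess_gaps(current, desired):
    print(f"{component}: gap of {gap}")
```

The ranked list gives a simple starting point for deciding which gaps the roadmap projects should address first.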
Step 6 : Build a Knowledge Management Implementation Roadmap
- With the current-state assessment in hand, it is time to build the implementation roadmap for your knowledge management program.
- But before going too far, you should re-confirm senior leadership's support and commitment, as well as the funding to implement and maintain the knowledge management program.
- Without these prerequisites, your efforts will be futile. Having solid evidence of your organization's shortcomings, via the assessment, should drive the urgency rate up.
- Having a strategy on how to overcome the shortcomings will be critical in gaining leadership's support and getting the funding you will need.
- This strategy can be presented as a roadmap of related projects, each addressing specific gaps identified by the assessment.
- The roadmap can span months and years and illustrate key milestones and dependencies. A good roadmap will yield some short-term wins in the first step of projects, which will bolster support for subsequent steps.
- As time progresses, continue to review and evolve the roadmap based upon the changing economic conditions and business drivers.
- You will undoubtedly gain additional insight through the lessons learned from earlier projects that can be applied to future projects as well.
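Because a roadmap is a set of related projects with dependencies, one simple way to sequence it is a topological sort. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (available from Python 3.9); the project names and their dependencies are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical roadmap: each project maps to the projects it depends on.
roadmap = {
    "pilot knowledge base": set(),
    "tagging standard": set(),
    "company-wide rollout": {"pilot knowledge base", "tagging standard"},
    "metrics dashboard": {"company-wide rollout"},
}


def schedule(projects):
    """Return the projects in an order that respects every dependency."""
    return list(TopologicalSorter(projects).static_order())


order = schedule(roadmap)
print(order)
```

Projects without dependencies come first in the resulting order, which makes them natural candidates for the short-term wins described in this step.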
Step 7: Implementation
- Implementing a knowledge management program and maturing the overall effectiveness of your organization will require significant personnel resources and funding.
- Be prepared for the long haul, but at the same time, ensure that incremental advances are made and publicized.
- As long as there are recognized value and benefits, especially in light of ongoing successes, there should be little resistance to continued knowledge management investments.
- With that said, it's time for the rubber to meet the road. You know what the objectives are. You have properly mitigated all cultural issues.
- You've got the processes and technologies that will enable and launch your knowledge management program. You know what the gaps are and have a roadmap to tell you how to address them.
- As you advance through each step of the roadmap, make sure you are realizing your short-term wins. Without them, your program may lose momentum and the support of key stakeholders.
Step 8 : Measure and Improve the Knowledge Management Program
- How will you know your knowledge management investments are working? You will need a way of measuring your actual effectiveness and comparing that to anticipated results.
- If possible, establish some baseline measurements in order to capture the before shot of the organization's performance prior to implementing the knowledge management program.
- Then, after implementation, trend and compare the new results to the old results to see how performance has improved.
- Don't be disillusioned if the delta is not as large as you would have anticipated. It will take time for the organization to become proficient with the new processes and improvements. Over time, the results should follow suit.
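The before-and-after comparison can be sketched as a percentage change per metric against its baseline. The metric names and values below are hypothetical, chosen only to show the calculation.

```python
# Hypothetical metrics captured before and after the program went live.
baseline = {"avg_resolution_hours": 12.0, "repeat_questions_per_week": 40}
after = {"avg_resolution_hours": 9.0, "repeat_questions_per_week": 30}


def percent_change(before, current):
    """Return the percentage change of each metric relative to its baseline."""
    return {
        name: round(100 * (current[name] - before[name]) / before[name], 1)
        for name in before
    }


deltas = percent_change(baseline, after)
print(deltas)
```

A negative value here means the metric went down, which is an improvement for both of the example metrics.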
- When deciding upon the appropriate metrics to measure your organization's progress, establish a balanced scorecard that provides metrics in the areas of performance, quality, compliance, and value.
- The key point behind establishing a knowledge management balanced scorecard is that it provides valuable insight into what's working and what's not.
- You can then take the necessary actions to mitigate compliance, performance, quality, and value gaps, thus improving overall efficacy of the knowledge management program.
Syllabus Topic : Concepts and Definitions of Artificial Intelligence
5.9 Introduction to Artificial Intelligence
Q. 5.9.1 What is Artificial Intelligence? (Ref. Sec. 5.9) (6 Marks)
- Since the invention of computers or machines, their capability to perform various tasks has grown exponentially.
- Humans have developed the power of computer systems in terms of their diverse working domains, their increasing speed, and their reducing size over time.
- A branch of Computer Science named Artificial Intelligence pursues creating computers or machines as intelligent as human beings.
- According to the father of Artificial Intelligence, John McCarthy, it is "The science and engineering of making intelligent machines, especially intelligent computer programs".
- Artificial Intelligence is a way of making a computer, a computer-controlled robot, or a piece of software think intelligently, in a manner similar to how intelligent humans think.
- AI is accomplished by studying how the human brain thinks, and how humans learn, decide, and work while trying to solve a problem, and then using the outcomes of this study as a basis for developing intelligent software and systems.
Syllabus Topic : Artificial Intelligence Versus Natural Intelligence
5.10 Differences Between Artificial Intelligence and Human Intelligence
Q. 5.10.1 Differentiate between human intelligence and artificial intelligence.
- Intelligence can be defined as a general mental ability for reasoning, problem-solving, and learning. Because of its general nature, intelligence integrates cognitive functions such as perception, attention, memory, language, or planning.
- On the basis of this definition, intelligence can be reliably measured by standardized tests, with the obtained scores predicting several broad social outcomes such as educational achievement, job performance, health, and longevity. So let's study the differences between Artificial Intelligence and Human Intelligence in detail.

* Artificial Intelligence
Artificial Intelligence is the study and design of intelligent agents. These intelligent agents have the ability to analyze their environments and produce actions which maximize success.
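The idea of an agent that analyzes its environment and picks the action that maximizes success can be sketched in a few lines. The thermostat setting, the target temperature and the function names below are hypothetical; this is a toy illustration of the definition, not a description of how any real AI system is built.

```python
def choose_action(percept, actions, utility):
    """Pick the action whose estimated utility is highest for the current percept."""
    return max(actions, key=lambda action: utility(percept, action))


# Hypothetical thermostat environment: the percept is the room temperature in °C.
def thermostat_utility(temperature, action):
    target = 22
    effect = {"heat": +2, "cool": -2, "idle": 0}[action]
    return -abs((temperature + effect) - target)  # closer to the target is better


print(choose_action(18, ["heat", "cool", "idle"], thermostat_utility))
```

Here "success" is expressed as a utility function, and the agent simply evaluates every available action against it.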
AI research uses tools and insights from many fields, including computer science, psychology, philosophy, neuroscience, cognitive science, linguistics, operations research, economics, control theory, probability, optimization and logic.
AI research also overlaps with tasks such as robotics, control systems, scheduling, data mining, logistics, speech recognition, facial recognition and many others.
* Human Intelligence
- Human Intelligence is defined as the quality of the mind that is made up of the capabilities to learn from past experience, adapt to new situations, handle abstract ideas and change one's own environment using the gained knowledge.
Human Intelligence can provide several kinds of information. It can provide observations
during travel or other events from travellers, refugees, escaped friendly POWs, etc.
It can provide data on things about which the subject has specific knowledge, which can
be another human subject, or, in the case of defectors and spies, sensitive information to
which they had access. Finally, it can provide information on interpersonal relationships
and networks of interest.
* Key Differences between Artificial Intelligence and Human Intelligence
The points below describe the key differences between Artificial Intelligence and Human Intelligence.
Fig. 5.10.1 : Key Differences between Artificial Intelligence and Human Intelligence (1. Nature of existence, 2. Memory usage, 3. Mode of creation, 4. Learning process, 5. Dominance)
> 1. Nature of Existence
Human intelligence revolves around adapting to the environment using a combination of
several cognitive processes. The field of Artificial intelligence focuses on designing
machines that can mimic human behaviour.
> 2. Memory usage
Humans use content memory and thinking, whereas robots use built-in instructions designed by scientists.
> 3. Mode of creation
Human intelligence is regarded as greater because it is a creation of God, whereas artificial intelligence, as the name suggests, is artificial, limited and temporary, and is created by humans. Also, human intelligence is the real creator of artificial intelligence, yet humans cannot create a human being with superiority.
> 4. Learning process
- Human intelligence is based on the variants people encounter in life and the responses they get, which may result in millions of functions overall in their lives.
- However, artificial intelligence is defined or developed for specific tasks only, and applying it to other tasks may not be easily possible.

> 5. Dominance
Artificial intelligence can beat human intelligence in some specific areas. In chess, for example, a supercomputer has beaten the human player due to being able to store all the moves played by all humans so far and being able to think ahead 10 moves, as compared to human players who can think 10 steps ahead but cannot store and retrieve that number of moves in chess.
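The idea of "thinking ahead" a fixed number of moves can be illustrated on a much smaller game than chess. The sketch below is a depth-limited minimax search (in its negamax form) for a toy game in which two players alternately take 1 to 3 stones from a pile and whoever takes the last stone wins. The game and the function name `best_move` are assumptions chosen for brevity; this does not describe how any particular chess program works.

```python
def best_move(stones, depth):
    """Return (score, move) for the player to move, searching `depth` plies ahead.

    Score is +1 for a forced win found within the horizon, -1 for a forced loss,
    and 0 when the search horizon is reached without a decision.
    """
    if stones == 0:
        return -1, None  # the previous player took the last stone and won
    if depth == 0:
        return 0, None   # horizon reached: outcome unknown
    best_score, best = -2, None
    for take in (1, 2, 3):
        if take > stones:
            break
        opponent_score, _ = best_move(stones - take, depth - 1)
        score = -opponent_score  # what is good for the opponent is bad for us
        if score > best_score:
            best_score, best = score, take
    return best_score, best


print(best_move(5, 10))
```

With a deep enough search the program finds that taking one stone from a pile of five leaves the opponent in a losing position, while a search depth of one cannot see far enough to decide.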
Sr. No. | Comparison Factor | Human Intelligence | Artificial Intelligence
1. | Energy efficiency | 25 watts human brain | 2 watts for modern machine learning machine.
2. | Universal | Humans usually learn how to manage hundreds of different skills during life. | While consuming kilowatts of energy, this machine is usually designed for a few tasks.
3. | Multitasking | Human workers work on multiple responsibilities. | The time needed to teach the system each and every response is considerably high.
4. | Decision Making | Humans have the ability to learn decision making from experienced scenarios. | Even the most advanced robots can hardly compete in mobility with a 6-year-old child. And these are the results we have after 60 years of research and development.
5. | State | Brains are analogue | Computers are digital
Syllabus Topic : Basic Concepts of Expert Systems
5.11 Basic Concepts of Expert Systems
Q. 5.11.2 What are expert systems? (Ref. Sec. 5.11) (5 Marks)
- Expert Systems (ES) are one of the prominent research domains of AI. They were introduced by researchers at the Computer Science Department of Stanford University.
- Expert systems are computer applications developed to solve complex problems in a particular domain, at the level of extraordinary human intelligence and expertise.
* Characteristics of Expert Systems

- High performance.
- Understandable.
- Reliable.
- Highly responsive.
* Capabilities of Expert Systems
The expert systems are capable of :
- Advising.
- Instructing and assisting humans in decision making.
- Demonstrating.
- Deriving a solution.
- Diagnosing.
- Explaining.
- Interpreting input.
- Predicting results.
- Justifying the conclusion.
- Suggesting alternative options to a problem.
* Incapabilities of Expert Systems
They are incapable of :
- Substituting human decision makers.
- Possessing human capabilities.
- Producing accurate output for inadequate knowledge base.
- Refining their own knowledge.
Syllabus Topic : Structure of Expert Systems
5.12 Components of Expert Systems

Q.5.12.1 Explain components of expert system. (Ref. Sec. 5.12)
Q.5.12.2 Explain structure of expert systems. (Ref. Sec. 5.12)
The components of ES include :
- Knowledge Base.
- Inference Engine.
- User Interface.
Let us see them one by one briefly :
Fig. 5.12.1 (figure labels: Human Expert; Knowledge Engineer, who may not be an expert)
Syllabus Topic : Knowledge Engineering
5.12.1 Knowledge Base
Q.5.12.3 What is Knowledge? (Ref. Sec. 5.12.1) (5 Marks)
It contains domain-specific and high-quality knowledge.
Knowledge is required to exhibit intelligence. The success of any ES majorly depends upon the collection of highly accurate and precise knowledge.
Data is a collection of facts. Information is organized as data and facts about the task domain. Data, information, and past experience combined together are termed as knowledge.
5.12.1.1 Components of Knowledge Base
Q.5.12.4 Explain forward chaining and backward chaining. (Ref. Sec. 5.12.1.1) (5 Marks)
The knowledge base of an ES is a store of both, factual and heuristic knowledge.
- Factual Knowledge : It is the information widely accepted by the Knowledge Engineers
and scholars in the task domain.
- Heuristic Knowledge : It is about practice, accurate judgement, one’s ability of
evaluation, and guessing.
* Knowledge representation
It is the method used to organize and formalize the knowledge in the knowledge base. It is
in the form of IF-THEN-ELSE rules.
* Knowledge acquisition
- The success of any expert system majorly depends on the quality, completeness, and accuracy of the information stored in the knowledge base.
- The knowledge base is formed by readings from various experts, scholars, and the Knowledge Engineers. The knowledge engineer is a person with the qualities of empathy, quick learning, and case analyzing skills.
- The knowledge engineer acquires information from the subject expert by recording, interviewing, and observing the expert at work, etc. The engineer then categorizes and organizes the information in a meaningful way, in the form of IF-THEN-ELSE rules, to be used by the inference engine. The knowledge engineer also monitors the development of the ES.
5.12.2 Inference Engine
Use of efficient procedures and rules by the Inference Engine is essential in deducing a correct, flawless solution.
In case of knowledge-based ES, the Inference Engine acquires and manipulates the
knowledge from the knowledge base to arrive at a particular solution.
In case of rule based ES, it :
o Applies rules repeatedly to the facts, which are obtained from earlier rule application.
o Adds new knowledge into the knowledge base if required.
o Resolves rules conflict when multiple rules are applicable to a particular case.
To recommend a solution, the Inference Engine uses the following strategies :
1. Forward Chaining
2. Backward Chaining
1. Forward Chaining
It is a strategy of an expert system to answer the question, “What can happen next?”
Here, the Inference Engine follows the chain of conditions and derivations and finally deduces the outcome. It considers all the facts and rules, and sorts them before concluding to a solution.
This strategy is followed for working on conclusion, result, or effect. For example, prediction of share market status as an effect of changes in interest rates.

Fig. 5.12.2 (figure labels: Fact 1, Fact 2, Fact 3 and Fact 4, combined through AND / OR, lead to Decision 1 and Decision 2)
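The forward chaining strategy described above can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation of any particular expert system shell: the rule format (a list of conditions and one conclusion), the function name `forward_chain` and the fact names about interest rates are all assumptions made for the example.

```python
# Minimal forward-chaining sketch. Rule format and fact names are hypothetical.
# Each rule is (list_of_conditions, conclusion): IF all conditions THEN conclusion.
RULES = [
    (["interest_rates_fall"], "borrowing_cheaper"),
    (["borrowing_cheaper", "earnings_stable"], "share_prices_rise"),
]

def forward_chain(facts, rules):
    """Start from the known facts and keep firing rules until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known.
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)
                changed = True
    return known

derived = forward_chain({"interest_rates_fall", "earnings_stable"}, RULES)
print(sorted(derived))
```

The loop moves from facts towards conclusions, which is why this strategy suits "what can happen next?" questions.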
2. Backward Chaining
With this strategy, an expert system finds out the answer to the question, “Why did this happen?”
On the basis of what has already happened, the Inference Engine tries to find out which conditions could have happened in the past for this result. This strategy is followed for finding out cause or reason. For example, diagnosis of blood cancer in humans.
(Figure: backward chaining; figure labels: Fact 1, Fact 2, Fact 3, Fact 4)
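Backward chaining can be sketched in the same style. The example below is a minimal illustration under assumed names: the rule format, the function `backward_chain` and the vehicle-fault facts are hypothetical and are not taken from any real diagnostic system. It starts from a goal and works backwards to check whether the known facts support it.

```python
# Minimal backward-chaining sketch. Rule format and fact names are hypothetical.
# Each rule is (list_of_conditions, conclusion): IF all conditions THEN conclusion.
RULES = [
    (["lights_dim", "engine_wont_crank"], "battery_weak"),
    (["battery_weak"], "replace_battery"),
]

def backward_chain(goal, facts, rules, seen=frozenset()):
    """Return True if the goal is a known fact or can be derived from the facts."""
    if goal in facts:
        return True
    if goal in seen:
        # The goal is already being explored higher up; stop to avoid a loop.
        return False
    seen = seen | {goal}
    for conditions, conclusion in rules:
        # Try every rule that concludes the goal and prove each of its conditions.
        if conclusion == goal and all(
            backward_chain(c, facts, rules, seen) for c in conditions
        ):
            return True
    return False

observed = {"lights_dim", "engine_wont_crank"}
print(backward_chain("replace_battery", observed, RULES))
```

Here the search moves from the result towards its possible causes, which matches the "why did this happen?" question.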
5.12.3 User Interface
The user interface provides interaction between the user of the ES and the ES itself. It generally uses Natural Language Processing so that it can be used by a user who is well-versed in the task domain.
The user of the ES need not necessarily be an expert in Artificial Intelligence.
It explains how the ES has arrived at a particular recommendation. The explanation may appear in the following forms :
o Natural language displayed on screen.
o Verbal narrations in natural language.
o Listing of rule numbers displayed on the screen.
The user interface makes it easy to trace the credibility of the deductions.
* Requirements of Efficient ES user interface
- It should help users to accomplish their goals in the shortest possible way.
- It should be designed to work for the user’s existing or desired work practices.
- Its technology should be adaptable to the user’s requirements; not the other way round.
- It should make efficient use of user input.
* Expert systems limitations
No technology can offer an easy and complete solution. Large systems are costly and require significant development time and computer resources. ESs have their limitations, which include :
- Limitations of the technology.
- Difficult knowledge acquisition.
- ES are difficult to maintain.
- High development costs.
Syllabus Topic : Applications of Expert Systems
5.13 Applications of Expert System
Q.5.13.1 Explain applications of expert system in detail. (Ref. Sec. 5.13) (5 Marks)
Table 5.13.1 shows where ES can be applied.

Table 5.13.1
Application | Description
Design Domain | Camera lens design, automobile design.
Medical Domain | Diagnosis systems to deduce cause of disease from observed data, conducting medical operations on humans.
Monitoring Systems | Comparing data continuously with observed system or with prescribed behavior, such as leakage monitoring in long petroleum pipeline.
Process Control Systems | Controlling a physical process based on monitoring.
Knowledge Domain | Finding out faults in vehicles, computers.
Finance/Commerce | Detection of possible fraud, suspicious transactions, stock market trading, airline scheduling, cargo scheduling.
5.13.1 Expert System Technology
Q.5.13.2 Write application of expert system. (Ref. Sec. 5.13.1) (5 Marks)
There are several levels of ES technologies available. Expert system technologies include :
Fig. 5.13.1 : Levels of ES Technologies (1. Expert System Development Environment, 2. Tools, 3. Shells)
> 1. Expert System Development Environment
The ES development environment includes hardware and tools. They are :
o Workstations, minicomputers, mainframes.
o High-level symbolic programming languages such as LISt Programming (LISP) and PROgrammation en LOGique (PROLOG).
o Large databases.
> 2. Tools
- They reduce the effort and cost involved in developing an expert system to a large extent.
o Powerful editors and debugging tools with multi-windows.
o They provide rapid prototyping.
o They have inbuilt definitions of model, knowledge representation, and inference design.
> 3. Shells
A shell is nothing but an expert system without a knowledge base. A shell provides the developers with knowledge acquisition, inference engine, user interface, and explanation facility. For example, a few shells are given below :
o Java Expert System Shell (JESS), which provides a fully developed Java API for creating an expert system.
o Vidwan, a shell developed at the National Centre for Software Technology, Mumbai in 1993. It enables knowledge encoding in the form of IF-THEN rules.
Syllabus Topic : Development of Expert Systems
5.14 Development of Expert Systems: General Steps
Q.5.14.1 Enlist and explain steps of development of expert system. (Ref. Sec. 5.14) (5 Marks)
The process of ES development is iterative. Steps in developing the ES include :
Fig. 5.14.1 : Steps in developing the Expert Systems (Step 1 : Identify Problem Domain, Step 2 : Design the System, Step 3 : Develop the Prototype, Step 4 : Test and Refine the Prototype, Step 5 : Develop and Complete the ES, Step 6 : Maintain the System)
> 1. Identify Problem Domain
- The problem must be suitable for an expert system to solve it.
- Find the experts in task domain for the ES project.
- Establish cost-effectiveness of the system.
> 2. Design the System
- Identify the ES Technology.
- Know and establish the degree of integration with the other systems and databases.
- Realize how the concepts can represent the domain knowledge best.
> 3. Develop the Prototype
From Knowledge Base : The knowledge engineer works to :
- Acquire domain knowledge from the expert.
- Represent it in the form of IF-THEN-ELSE rules.
> 4. Test and Refine the Prototype
- The knowledge engineer uses sample cases to test the prototype for any deficiencies in performance.
- End users test the prototypes of the ES.
> 5. Develop and Complete the ES
- Test and ensure the interaction of the ES with all elements of its environment, including end users, databases, and other information systems.
- Document the ES project well.
- Train the user to use ES.
> 6. Maintain the System
- Keep the knowledge base up-to-date by regular review and update.
- Cater for new interfaces with other information systems, as those systems evolve.
* Benefits of Expert Systems
- Availability : They are easily available due to mass production of software.
- Less Production Cost : Production cost is reasonable. This makes them affordable.
- Speed : They offer great speed. They reduce the amount of work an individual puts in.
- Less Error Rate : Error rate is low as compared to human errors.
- Reducing Risk : They can work in environments dangerous to humans.
- Steady response : They work steadily without getting emotional, tense or fatigued.
5.15 Exam Pack (Review Questions)
* Syllabus Topic : Introduction to Knowledge Management
Q.1 Explain Knowledge management. (Refer Section 5.1) (5 Marks)
Q.2 Explain knowledge management process. (Refer Section 5.1.1) (5 Marks)
Q.3 Write short note on approaches knowledge management. (Refer Section 5.1.1) (5 Marks)
* Syllabus Topic : Roles of People in Knowledge Management
Q.4 What are the role of knowledge management ? (Refer Section 5.2) (5 Marks)
* Syllabus Topic : Organizational Learning
Q.5 Write short note on learning organization. (Refer Section 5.3) (5 Marks)
* Syllabus Topic : Organizational Transformation
Q.6 Write short note on Organizational transformation. (Refer Section 5.4) (5 Marks)
* Syllabus Topic : Knowledge Management Activities
Q.7 Explain knowledge management activities in brief. (Refer Section 5.5) (5 Marks)
* Syllabus Topic : Information Technology (IT) in Knowledge Management
Q.8 Explain IT in knowledge management. (Refer Section 5.7) (5 Marks)
* Syllabus Topic : Knowledge Management Systems Implementation
Q.9 Write steps involved in knowledge management system implementation. (Refer Section 5.8) (5 Marks)
* Syllabus Topic : Concepts and Definitions of Artificial Intelligence
Q.10 What is Artificial Intelligence? (Refer Section 5.9) (5 Marks)
* Syllabus Topic : Artificial Intelligence Versus Natural Intelligence
Q.11 Differentiate between human intelligence and artificial intelligence. (Refer Section 5.10) (5 Marks)
* Syllabus Topic : Basic Concepts of Expert Systems
Q.12 Explain basic concepts of expert systems. (Refer Section 5.11) (5 Marks)
Q.13 What are expert systems? (Refer Section 5.11) (5 Marks)
* Syllabus Topic : Structure of Expert Systems
Q.14 Explain components of expert system. (Refer Section 5.12) (5 Marks)
Q.15 Explain structure of expert systems. (Refer Section 5.12) (5 Marks)
* Syllabus Topic : Knowledge Engineering
Q.16 What is Knowledge? (Refer Section 5.12.1) (5 Marks)
Q.17 Explain forward chaining and backward chaining. (Refer Section 5.12.1.1) (5 Marks)
* Syllabus Topic : Applications of Expert Systems
Q.18 Explain applications of expert system in detail. (Refer Section 5.13) (5 Marks)
Q.19 Write application of expert system. (Refer Section 5.13.1) (5 Marks)
* Syllabus Topic : Development of Expert Systems
Q.20 Enlist and explain steps of development of expert system. (Refer Section 5.14) (5 Marks)
Chapter Ends...
Unit IV
Business Intelligence
Applications
Syllabus Topic : Marketing Models : Relational Marketing
4.1 Relational Marketing
Q.4.1.1 Explain relational marketing and the various factors associated with it. (Ref. Sec. 4.1) (6 Marks)
Let’s understand relational marketing with an example. Most of us have noticed that whenever a mobile company is about to launch a new device into the market, the company conducts a survey to gather different opinions from its customers, which helps it to enhance the functionality provided by that device.
This is not only about mobile phones. When you visit a restaurant, waiters bring feedback forms along with the bill, in which the customers rate the restaurant on different aspects so that it can improve.
Almost all companies study the behaviour of and the feedback given by their customers and try to incorporate the features required by the customers into their products at a reasonable and effective cost price, so that customers are attracted towards the product and the sales of the company increase.
Most e-commerce companies store huge databases that hold collective information about their customers and their previous purchases. This helps the company to offer its customers options that they are more likely to like, which again results in growth in the sales of the company.
The strategy followed in relational marketing is to start, strengthen, intensify and maintain the relationship between the company and its stakeholders, who are represented mainly by the customers. Analysis is done, and planning is done accordingly, executed and evaluated to achieve the objectives.
Relational marketing evolved and became popular in the late 1990s as a way to increase customer satisfaction so that a competitive advantage is achieved.
Initially this approach was adopted by companies providing financial and telecommunication services. It was later implemented by almost all companies, which became more concerned about what the customer actually needs and implemented the same in their respective products so as to survive in the competitive market.
4.1.1. Motivations and Objectives
The reasons for the spread of relational marketing are complex but interconnected. They are listed below :
- With the evolution of companies in their respective fields, the number of customers has also increased.
- The innovation-production-obsolescence cycle has been progressively compressed since the 1980s, which helped to boost customized business intelligence options for customers.
- Increased transparency and flow of data, and the addition of e-commerce sites, led to global comparisons between different features, prices and also reviews from customers who have used a particular product.
- Due to the increased number of competitors in the market, it is very uncertain whether a customer will renew the existing service or opt for a new one, because switching services has become much easier and more convenient.
- Most companies maintain different levels/versions of the products and services they provide, so that the customer has the flexibility of choosing the services according to his or her requirements and can also switch between the services as and when required.
- Data is gathered on the transactions, products and services used by the customers, so that the company has a huge range of data to analyze what the customers expect next. Advanced automation techniques are used to analyze this data so that accurate observations are achieved.
Strategies of relational marketing revolve around the following choices :
Fig. 4.1.1(a) : Decision-making options for a relational marketing strategy
The choices mentioned above are the ones through which the strategies for relational marketing can be constructed and implemented.
Product services are the services that the company can provide for the maintenance of the product after purchase.
Various distribution channels can be constructed to make the product available to customers. Nowadays companies are not solely dependent on the traditional approach, where the product is distributed to various shops from which the customers purchase it. Instead, products are also distributed through e-commerce sites and sales with attractive offers, which gives customers a wide range of options for purchasing the product.
Fig. 4.1.1(b) : Components of a relational marketing strategy
Segments and prices of the product are also maintained to compete in the market. Different creative promotions are done to attract the customers and make them aware of the specifications of the product.
The above are the different components used in a relational marketing strategy, in which the organization, its technologies, its business strategies and its data mining processes, implemented to construct and promote the product, together help in achieving an efficient and strong relationship between the customers and the company.
Fig. 4.1.1(c) represents the different people involved in a relational marketing strategy, where all the nodes are interconnected.
Fig. 4.1.1 (c) : Network of relationships involved in a relational marketing strategy
4.1.2 An Environment for Relational Marketing Analysis
Fig. 4.1.2(a) : Components of an environment for relational marketing analysis (figure labels: operational data, external data, information systems, decision making process)
Fig. 4.1.2(a) shows the main elements that are used to create an environment for relational marketing analysis.
- Information infrastructures consist of the company’s data warehouse, which is built by collecting data from different internal and external data sources, and of the marketing data mart, which feeds business intelligence and data mining analyses for understanding the potential of the company and identifying the actual customers that the company has.
- With different machine learning and pattern recognition models it is easy to obtain various segments of the customer base, which can later be used to define and design policies for marketing actions.
- Classification models can also be generated for different objectives of the company. For example, a classification model can be built to check what a customer frequently buys from the offers provided by the company, and to propose a similar kind of offer only to those customers who are more likely to accept it.
- Managing a marketing campaign is a difficult task that needs strong planning for every type of customer: what actions will be taken, which communication channels the customer can use to communicate with the company, and how the available human and financial resources will be used.
- This decision making process can be managed and formally expressed with the help of optimization models. The end phase of the marketing activity cycle is the execution of the planned campaign, with appropriate gathering of results.
- The data collected through these results is then put into the marketing data mart for future data mining analysis.
- Whenever a campaign is executed, it is important to set procedures which will help to control the campaign and also analyze the data obtained in the form of results.
- To test how effective the campaign has been, it is important to keep aside a selected set of people, with the same features as the people who would be using the product, on whom no action is taken.
Fig. 4.1.2(c) : Cycle of relational marketing analysis (figure labels: data warehouse holding customers, products, services, payments)
4.1.3 Lifetime Value
- The following are the main stages of the customer lifetime, which show the cumulative value of a customer over time.
- It also shows the different actions that a company can take for a customer. In the starting phase an individual is a prospect, also known as a potential customer, who has not yet started purchasing the product or using the services provided by the company.
- For these customers, acquisition actions are carried out in both direct and indirect fashion.
- In direct acquisition the customer is given information about the product or service via calls, emails, conversations with the agents of the company and so on.
- In indirect acquisition, advertising and information about the product is displayed on the dashboard of the company’s website, highlighting the new products or services.
- These actions incur a cost which is assigned to the customers, and a loss is then calculated, as not all the customers who are approached will agree to buy the product or service.
- The event of becoming a customer can take different forms in different situations: the service may require a subscription, or the customer may only be able to buy the product when he/she opens an account with the company, and so on.
- Before the prospect becomes a customer, he/she will get constant reminders from the company in the form of messages, calls, and emails aimed at winning his/her custom.
- This generates a progressively growing cost, and if the prospect is not convinced to buy the product, this ultimately puts the company at a loss, which is a negative outcome.
Fig. 4.1.3(a) : Lifetime of a customer (value plotted against time; figure labels: acquisition, cross/up-selling, retention, lost proposal, churner)
This phase, which is considered to build the relationship between the customer and the company and is also known as the maturity phase, may also lead to retention, cross-selling and up-selling to sustain the revenue invested in the customer.
The last phase is the interruption of the relationship, where the customer calls off the service of the company and moves on to a competitor due to some inconvenience or to various other problems such as a change in office or residence address.
Fig. 4.1.3(b) : Main relational marketing tasks
4.1.4 The Effect of Latency in Predictive Models
- Fig. 4.1.4 illustrates the logic for developing a classification model for relational marketing analysis, taking the temporal dimension into consideration. Let us assume that t is the current time period, at which the inductive learning model for the classification problem needs to be derived.
- For example, at the beginning of January a mobile provider wants to develop a classification model to find the probability of its customers leaving. The data mart will contain data from past periods, updated up to t − 1. In our case it will have data up to December.
- Suppose the provider wants the probability h months in advance, say for the next 2 months, that is February and March. In that case the probability will be generated from the data available up to December.
- Note that the data for period t will not be used for prediction, because the data for period t is not yet available at the start of period t.
- To develop the classification model, the values of the target variable are taken from the last known period, t − 1; in our case these identify the customers who left in December.
- For testing the model, the data from t − 2 should not be used, because that is the training period of the model.
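The timing in the January example can be worked through with a short Python sketch. The function and field names are hypothetical, and one step is an assumption that the text does not state explicitly: the training attributes are taken to lag the training target by the same gap that separates the last known period from the furthest predicted period.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def month_name(period):
    # Periods are counted from 1 = January of the first year.
    return MONTHS[(period - 1) % 12]

def latency_windows(t, h):
    """t is the current period, h the number of future periods to predict (h >= 1)."""
    last_known = t - 1                          # data for period t is not yet available
    predicted = list(range(t + 1, t + h + 1))   # e.g. February and March
    gap = predicted[-1] - last_known            # distance between data and prediction
    training_target = last_known                # target values come from period t - 1
    # Assumption: training attributes lag the training target by the same gap.
    training_features_up_to = training_target - gap
    return {
        "last_known": last_known,
        "predicted": predicted,
        "training_target": training_target,
        "training_features_up_to": training_features_up_to,
    }

# Period 13 is January of the second year; predict 2 months ahead.
windows = latency_windows(t=13, h=2)
for key, value in windows.items():
    print(key, value)
```

With t set to January and h = 2, the last known period is December and the predicted periods are February and March, as in the example above.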
Fig. 4.1.4 : Development and application flow chart for a predictive model (figure label: past data from marketing data mart up to period t-1)
4.1.5 Acquisition
Even though retention is an important aspect of relational marketing strategies, acquisition is also an important factor for some companies.
It is a process which requires the identification of new prospects, that is, potential customers who may be partially or completely unaware of the products or services provided by the company, or who did not require these products or services in the past and are now in need of them. They might also be customers of competitors who are hunting for better services, or customers who have switched from your company to a competitor.
Once the company has identified the prospects, it is important to assign acquisition campaigns with high profitability for both the prospects and the company, using marketing strategies at various levels along with the marketing resources available to the company.
- Traditional marketing strategies are those where the advertising and campaigns are based on earlier polls taken from the public in order to enhance the quality of the products and services provided. The poll results are fed into the data mart to derive classification rules which characterize the acquisition profiles.

4.1.6 Retention
Because most products and services have reached the maturity stage, their saturation in the market has led to competition amongst companies.
A negative side effect is that the expansion of a company's customer base works more as a switch mechanism, that is, customers are acquired at the cost of another company. This is common in service industries such as savings management, telecommunication and so on.
Due to this, many companies invest more resources in analyzing and characterizing the attributes due to which customers switch from their company to another.
Another reason could be the attractive offers made by a competing company to grab the attention of the prospects and thus bring the market strategies of the company down.
There can also be various reasons why a customer does not find the charges worth paying for the services provided by the company, and thus hunts for an alternative and switches to it.
There are various other aspects that lead to retention for the products and services provided by the company, and thus the company has to be keen about the same.
4.1.7. Cross-selling and Up-selling
Data mining models can also contribute to relational marketing analysis by identifying the market segments with the highest possibility of purchasing additional services or products from the company.
For example, assume a mobile shop runs an offer where a customer who buys a smart phone can pay an extra Rs. 100 to get an annual subscription to Netflix along with the smart phone. There is no compulsion, and not every customer purchasing a smart phone will be interested in the subscription, so the mobile provider obtains a classification of customers who are interested in the offer and customers who are not.
If the number of interested customers is large, the shop owner will have to get more services from Netflix. The demographic information about the customers can be fed into the data mart and used as explanatory attributes to develop a classification model, which will help to design various offers in forthcoming periods and predict how customers will react to them.
Cross-selling means trying to sell a product or service to a customer who is already active and is in a relationship with the company.
Through a classification model the company can understand which customers are interested in cross-selling and approach only those customers.
For example, we often get calls from our banks asking us to upgrade our debit card to a credit one. These calls are made only to the customers holding a debit card and not to those holding a credit card. So this defines a margin for acquisition: call only the customers holding a debit card.
This can also be stated as up-selling, where the customer is informed about and asked to opt for products or services which are one level higher than the existing one and which have more features and availability.
4.1.8 Market Basket Analysis
The main objective of market basket analysis is to get an exact view of what products the customers are purchasing, so that the company gets the knowledge required to organize and plan its marketing strategies.
It is usually used to analyze what kind of product sells more on e-commerce sites or in retail industries.
It can also be applied to check purchases made with credit cards, landline services or complementary ones, or to check whether policies are taken by the same households.
The data used here can also be referred to as purchase transactions, which can be associated with a time dimension to track the purchases.
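A small Python sketch can show how purchase transactions are turned into market basket measures. It computes the support and confidence of item pairs, two measures commonly used in association rule analysis. The transactions, the item names and the function name `pair_rules` are made up for illustration only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase transactions; each list is one basket.
TRANSACTIONS = [
    ["phone", "cover", "charger"],
    ["phone", "cover"],
    ["charger"],
    ["phone", "charger"],
]

def pair_rules(transactions):
    """Return {(a, b): (support, confidence)} for the rule 'a is bought => b is bought'."""
    total = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))            # ignore repeated items in a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = {}
    for (a, b), together in pair_counts.items():
        support = together / total             # share of baskets containing both items
        rules[(a, b)] = (support, together / item_counts[a])
        rules[(b, a)] = (support, together / item_counts[b])
    return rules

rules = pair_rules(TRANSACTIONS)
print(rules[("cover", "phone")])
```

In this made-up data every basket that contains a cover also contains a phone, so the rule "cover => phone" has a confidence of 1.0, while the reverse rule is weaker.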
4.1.9 Web Mining
It is a well-known fact that the web is the most common and easiest way of communicating with the largest crowd.
Most companies use social media platforms to promote their products to people. E-commerce sites are considered to be important sales channels.
Web mining is used to analyze data from the activities carried out on those sites by visitors. Web mining methods are mostly used for three purposes : content mining, structure mining and usage mining.
Fig. 4.1.9 : Taxonomy of web mining analyses (figure labels: web mining; content mining with text mining, HTML mining, XML mining and image mining; usage mining with user profile)
> 1. Content mining
It involves analysis of the content on a web page to extract the required information. Search engines like Google also perform content mining to provide links to the data required by the user.
It can also be traced back to data mining problems for the analysis of text on web pages in HTML and XML format, images and multimedia content.
> 2. Structure mining
This type of mining is used to understand the structure of the web using the links between different pages. Graphs can be created where the nodes correspond to web pages and the arcs correspond to the links from one page to another.
Results and algorithms from graph theory are used to characterize the web structure and to identify areas of high intensity.
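The graph view of structure mining can be sketched as follows. The page names, the link table and the function name `in_degree` are hypothetical. Counting incoming links is only one simple graph measure, chosen here to show how nodes and arcs are represented; the text does not name a specific algorithm.

```python
# Hypothetical link structure: each page maps to the pages it links to.
LINKS = {
    "home": ["products", "offers"],
    "products": ["offers", "home"],
    "offers": ["products"],
    "contact": ["products"],
}

def in_degree(links):
    """Count, for every page (node), how many other pages link to it (incoming arcs)."""
    counts = {page: 0 for page in links}
    for source, targets in links.items():
        for target in set(targets):
            if target != source:               # ignore links from a page to itself
                counts[target] = counts.get(target, 0) + 1
    return counts

degrees = in_degree(LINKS)
most_linked = max(degrees, key=degrees.get)
print(degrees, most_linked)
```

Pages with many incoming arcs, such as `products` in this made-up site, are candidates for the densely connected areas of the structure.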
> 3. Usage mining
This is the most relevant type of analysis from the standpoint of relational marketing. It explores the paths followed by navigators and their behaviour during a visit to a company's website.
Methods for the extraction of association rules are used to obtain correlations between the different pages visited during a session.
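As a rough sketch of how association rules can relate pages visited in the same session, the code below computes support and confidence for every ordered pair of pages. The session data and the `page_rules` function are assumptions made for illustration; real usage mining would start from web server logs.

```python
from itertools import permutations

def page_rules(sessions, min_support=0.0):
    """Return {(a, b): (support, confidence)} for rules 'visited a -> visited b'."""
    n = len(sessions)
    page_sets = [set(s) for s in sessions]
    pages = set().union(*page_sets)
    rules = {}
    for a, b in permutations(sorted(pages), 2):
        both = sum(1 for s in page_sets if a in s and b in s)
        has_a = sum(1 for s in page_sets if a in s)
        support = both / n
        if both and support >= min_support:
            # confidence = share of sessions containing a that also contain b
            rules[(a, b)] = (support, both / has_a)
    return rules

# Hypothetical sessions: the pages visited by four different visitors.
sessions = [
    ["home", "products", "cart"],
    ["home", "products"],
    ["home", "blog"],
    ["products", "cart"],
]
rules = page_rules(sessions, min_support=0.5)
for rule, (support, confidence) in sorted(rules.items()):
    print(rule, support, round(confidence, 2))
```

In this toy data every session that reaches the cart also visits the products page, so the rule cart -> products has confidence 1.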
Syllabus Topic : Marketing Models : Sales Force Management
4.2 Sales Force Management
Q. 4.2.1 Explain sales force management and various factors associated with it. (Ref. Sec. 4.2) (5 Marks)
- Nowadays almost all companies have a sales department in their organization and rely on the employees of that department to sell the products or services offered by the company.
- Every employee is given a target. Depending on whether the targets are achieved, these employees play an important role in the profit gained by the company.
- There are various marketing strategies that the sales department implements to sell the products or services.
- The sales force is a term for all the people and roles, along with the different tasks and responsibilities, that are associated with sales as a process.
- The basic terms associated with sales forces, based on the activities carried out, are stated below:
o Residential : These sales activities take place at one or more places managed by the company supplying the products and services, where customers go to purchase. This includes sales at retail shops and wholesale dealers.
o Mobile : In this type of sales the agents of the company go to the customer's house or office to give information about the product or service and to collect orders. Sales in this category occur within B2B (Business to Business) relationships and can also be encountered in B2C (Business to Consumer) contexts.
o Telephone : These sales happen through telephone conversations in which the company's agents call customers, promote the product and collect orders.
- A mobile sales force raises various problems, which can be subdivided into a few main categories listed below:
o designing the sales network,
o planning the agents' activities,
o contact management,
o sales opportunity management,
o customer management,
o activity management,
o order management,
o area and territory management,
o support for the configuration of products and services,
o knowledge management with regard to products and services.
- Designing the sales network and planning the agents' activities involve decision-making tasks that can take advantage of optimization models.
- The remaining activities can be managed with the help of automation tools known as Sales Force Automation (SFA), which almost all companies implement nowadays.
4.2.1 Decision Processes in Sales Force Management
- Designing and managing a sales force raises various decision-making problems, as shown in Fig. 4.2.1. When these problems are successfully overcome, the company maximizes profit, increases the efficiency of sales actions, uses resources efficiently and offers professional rewards to the sales agents.
- The decision process in Fig. 4.2.1 shows that the strategic objectives of the company should be taken into consideration along with the other components of marketing, so that the role assigned to the sales force fits within the broader framework of relational marketing.

Fig. 4.2.1 : Decision processes in sales force management
- The two-way arrows mean that all the components interact with each other, taking marketing into consideration.
- The decision-making processes related to sales force management can be grouped into three categories: design, planning and assessment.
4.2.1.1 Design
- Design is dealt with in the start-up phase of a commercial activity or during a subsequent restructuring phase,
- for example when planning the acquisition of prospects or of a group of companies.
- This phase builds on the creation of market segments. Salesforce design includes three types of decisions.

Types of decisions
1. Organizational structure
2. Sizing
3. Sales territories
Fig. 4.2.2 : Types of Decisions
> 1. Organizational structure
- The structure can take different forms, each corresponding to a hierarchical clustering of agents by group of products, brand or geographical area. In some cases markets are also used to form clusters.
- To choose the organizational structure it is necessary to analyze the complexity of the customers, products and sales activities, and to decide whether and to what extent the agents should be specialized.
> 2. Sizing
- Sizing determines the number of agents that should work within the selected sales structure. It relies on factors such as the number of customers and prospects, the desired level of sales area coverage, the time required for every call and the travelling time of every agent.
> 3. Sales territories
- Designing sales territories means grouping the geographical areas of a region into clusters and assigning each cluster to a particular agent or group of agents.
- Factors to be considered while designing the territories and assigning them to the agents include the sales potential of every area, the time required to travel from one area to another and the time available to each agent.
Fig. 4.2.3 : Sales force design process
(Elements shown: segmentation, products-services, sales activity.)

4.2.1.2 Planning
- The decision-making tasks associated with planning concern the assignment of the sales resources, structured and sized during the design phase, to market entities.
- Resources can be measured as the work time of the agents and the budget, whereas market entities comprise products, market segments, distribution channels and customers.
- Allocation takes into account the time spent on every customer to promote the product or service, the time and cost required to travel, and how effective the action was in convincing the customer.
- Further aspects can also be considered, such as explaining the technical and functional features of the product or service and collecting suggestions coming from the customers.
4.2.1.3 Assessment
- Assessment is important to monitor the effectiveness and efficiency of the agents in the sales network, so that proper remuneration and incentives can be designed for every individual.
- To measure the efficiency of the agents fairly, it is very important to announce in advance the criteria on which they will be judged.
- The agents can then contribute fully to the sales of the products and services, increasing both the profit of the company and their own earnings, and improving their performance.
4.2.2 Models for Sales Force Management
- The following are some classes of optimization models for designing and planning the salesforce. Before starting, here are the notations used in the following sections.
- Assume that a particular region is divided into M geographical sales areas, also known as sales coverage units, so let M = {1, 2, ..., M}. The areas must be grouped into disjoint clusters known as territories, such that each area belongs to exactly one territory and is connected to all the other areas of the same territory.
- The connection property means that from each area it is possible to reach any other area of the same territory.
- The time span is divided into T intervals of equal length, usually weeks or months, indicated by t ∈ T = {1, 2, ..., T}.
- Each territory has a sales agent associated with it, who is located in one area of the territory, considered to be the agent's area of residence.
- The time and cost of travelling from one area to another depend on the area of residence of the agent. Let N be the number of territories, so N = {1, 2, ..., N}.
- Each territory contains customers and prospects to be visited by the agent to promote the products. They are indexed by h ∈ H = {1, 2, ..., H}; in some models they are grouped into segments, and the members of a segment are treated alike.
- Finally, assume every agent sells K products and services during a call, indexed by k ∈ K = {1, 2, ..., K}.
4.2.3 Response Functions
- Response functions play an important role in formulating the models used to design and plan a sales network.
- In general, a response function describes the elasticity of sales with respect to the intensity of the sales actions, and gives a formal way to describe the complex relationship between sales actions and market reactions.
- The sales to which a response function refers are expressed in product units or in monetary units, such as revenues or margins. Here they are expressed as sales revenues.
- The intensity of a sales action can be related to different variables: the number of calls made to the customer in a given period of time, the number of times the product was mentioned in a given period of time, and the time spent with the customer in person during a given period of time.
Fig. 4.2.4(a) : A concave response function (horizontal axis: sales action effort)
Fig. 4.2.4(b) : A sigmoidal response function (horizontal axis: sales action effort)
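The two shapes named in the figures can be illustrated numerically. The functional forms below (a saturating exponential and a logistic curve) and all parameter values are assumptions chosen for illustration; the text does not prescribe a particular formula.

```python
import math

def concave_response(x, r_min=10.0, r_max=100.0, k=0.5):
    """Sales grow with effort x at a steadily decreasing rate (saturating exponential)."""
    return r_min + (r_max - r_min) * (1.0 - math.exp(-k * x))

def sigmoidal_response(x, r_min=10.0, r_max=100.0, k=1.0, x_mid=5.0):
    """Sales grow slowly at first, fastest near x_mid, then saturate (logistic curve)."""
    return r_min + (r_max - r_min) / (1.0 + math.exp(-k * (x - x_mid)))

for effort in range(0, 11, 2):
    print(effort,
          round(concave_response(effort), 1),
          round(sigmoidal_response(effort), 1))
```

With a concave response every extra unit of effort adds less than the previous one, whereas with a sigmoidal response low effort has little effect until a threshold region is reached.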
4.2.4 Sales Territory Design
- Sales territory design allocates the sales coverage units to the agents so as to minimize a weighted sum of two terms: the total distance between the areas of the same territory, and the imbalance between the sales opportunities given to the agents.
- The region is divided into J areas, which are grouped into I territories whose number is decided in advance. Every territory has an agent, who is associated with one sales coverage unit considered to be the residence of that agent.
- It is assumed that travel times within each area are negligible compared with the travel time between a pair of distinct areas.
- Every area j is identified by the coordinates $(e_j, f_j)$ of one of its points, for example the point whose coordinates are the average of the coordinates of all points belonging to that area. For every territory i, let $(e_i, f_i)$ denote the coordinates of the area where the agent associated with the territory resides. This area is called the centroid of territory i.
- The parameters of the model are as follows. $d_{ij}$ is the distance between centroid i and area j, given by

$$d_{ij} = \sqrt{(e_j - e_i)^2 + (f_j - f_i)^2}$$

- $a_j$ is the opportunity for sales in area j, and $\beta$ is a relative weight factor between total distance and sales imbalance. The binary decision variables $Y_{ij}$ are defined as

$$Y_{ij} = \begin{cases} 1 & \text{if area } j \text{ is assigned to territory } i \\ 0 & \text{otherwise} \end{cases}$$

- Define I additional continuous variables that express the deviations from the average sales opportunity value for each territory:
- $S_i$ = deviation from the average opportunity value $\frac{1}{I} \sum_{j \in J} a_j$ for territory i.
- Hence, the corresponding optimization problem can be formulated as

$$\begin{aligned}
\min \quad & \sum_{i \in I} \sum_{j \in J} a_j d_{ij} Y_{ij} + \beta \sum_{i \in I} S_i \\
\text{s.to} \quad & \sum_{j \in J} a_j Y_{ij} - \frac{1}{I} \sum_{j \in J} a_j \le S_i, && i \in I, \\
& \sum_{j \in J} a_j Y_{ij} - \frac{1}{I} \sum_{j \in J} a_j \ge -S_i, && i \in I, \\
& \sum_{i \in I} Y_{ij} = 1, && j \in J, \\
& S_i \ge 0, \quad Y_{ij} \in \{0, 1\}, && i \in I, \ j \in J.
\end{aligned}$$
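For a very small instance the territory design model can be explored by plain enumeration. The sketch below is not how such models are solved in practice (an integer programming solver would be used); the coordinates, opportunities and the `design_territories` helper are hypothetical, and the objective is assumed to be the opportunity-weighted distance plus $\beta$ times the total imbalance.

```python
import math
from itertools import product

def design_territories(coords, centroid_areas, a, beta):
    """Brute-force the assignment of areas to territories for a tiny instance."""
    n_areas = len(coords)
    n_terr = len(centroid_areas)
    # d[i][j]: Euclidean distance between the centroid of territory i and area j.
    d = [[math.hypot(coords[j][0] - coords[c][0], coords[j][1] - coords[c][1])
          for j in range(n_areas)] for c in centroid_areas]
    avg = sum(a) / n_terr  # average sales opportunity per territory
    best, best_cost = None, math.inf
    for assign in product(range(n_terr), repeat=n_areas):  # assign[j] = territory of area j
        distance = sum(a[j] * d[i][j] for j, i in enumerate(assign))
        # At an optimum S_i equals the absolute deviation from the average.
        imbalance = sum(
            abs(sum(a[j] for j in range(n_areas) if assign[j] == i) - avg)
            for i in range(n_terr))
        cost = distance + beta * imbalance
        if cost < best_cost:
            best, best_cost = assign, cost
    return best, best_cost

# Hypothetical region: four areas on a line, agents residing in areas 0 and 3.
coords = [(0, 0), (1, 0), (9, 0), (10, 0)]
best, best_cost = design_territories(coords, centroid_areas=[0, 3],
                                     a=[1, 1, 1, 1], beta=1.0)
print(best, best_cost)
```

Enumeration grows as I to the power J, so it only serves to check the formulation on toy data.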
Syllabus Topic : Logistic and Production Models : Supply Chain Optimization
4.3 Supply Chain Optimization
Q. 4.3.4 Describe supply chain optimization. (Ref. Sec. 4.3) (5 Marks)
- A supply chain can be described as a network of linked and interdependent institutional units which co-ordinate with each other to manage the flow of materials and information.
- The aim of integrated planning and operation across the units of the supply chain is to follow a systematic approach to making decisions and taking actions, so as to keep the subsystems related to the logistic operation of the company up to standard.
- Most manufacturing companies are adopting this kind of integrated approach so that the upstream and downstream parts of the supply chain work together and problems in the co-operation between the subsystems can be tracked.
- A further advantage of an integrated logistic supply chain is lower costs, including the costs of processing, transportation and distribution. Inventory and equipment costs are also reduced.
- It is equally important to upgrade the logistic supply chain by adding models and automated tools that help in planning and analyzing capacity in critical situations where the complexity of the chain is high.
- This matters most in dynamic situations where competition is intense, since competitors also put all their efforts into making their supply chains more effective.
- Competitor companies may produce a wide range of products, so they require a multi-centric logistic supply chain that distributes the products effectively according to the demands of the customers.
- Such multi-centric supply chains need to be widely spread and mostly automated, which makes the work simpler. They also require a large financial investment to automate the chains and make them more effective.
- The effectiveness and the features of the logistic supply chain are directly related to the profile that the company maintains in communicating with its customers.
Fig. 4.3.1 : An example of global supply chain
(Elements shown: offshore suppliers, kit suppliers, Asia/Pacific market; purchase, production and transportation costs.)
Syllabus Topic : Logistic and Production Models : Optimization
Models for Logistics Planning

(Ref. Sec. 4.4)
- The following are some of the optimization models associated with the features of logistic supply chains and logistic production systems.
- While learning about these models one should understand that real-world logistic production systems involve more than one of these elements, so they are more complex and combine the features of the different models.
- Before starting the detailed study of the models, the notation they share should be known.
- The logistic system handles I products, denoted by the index i ∈ I = {1, 2, ..., I}. The planning horizon is divided into T time intervals, denoted by t ∈ T = {1, 2, ..., T}, which are usually of equal length and last weeks or months.
- The manufacturing company has a set of critical resources that are shared among the products during the manufacturing process and are available in limited quantity.
- These resources may include manpower, tools, assembly lines, specific fixtures and so on.
- The critical resources are R in number and are denoted by r ∈ R = {1, 2, ..., R}.
- When only a single critical resource is relevant to the manufacturing process, the index r is omitted for simplicity.
4.4.1 Tactical Planning
- In its simplest form, the objective of tactical planning is to determine the production volume for every product over the T periods of a medium-term planning horizon, so as to satisfy the given demand, respect the capacity limits of the resources used in the manufacturing process and minimize the total cost, which is the sum of manufacturing and inventory costs.
- The decision variables are:
o $P_{it}$ = units of product i to be manufactured in period t,
o $I_{it}$ = units of product i in inventory at the end of period t.
- The parameters are:
o $d_{it}$ = demand for product i in period t,
o $c_{it}$ = unit manufacturing cost for product i in period t,
o $h_{it}$ = unit inventory cost for product i in period t,
o $e_i$ = capacity absorbed to manufacture one unit of product i,
o $b_t$ = capacity available in period t.
- So the problem is formulated as follows:

$$\begin{aligned}
\min \quad & \sum_{t \in T} \sum_{i \in I} \left( c_{it} P_{it} + h_{it} I_{it} \right) \\
\text{s.to} \quad & P_{it} + I_{i,t-1} - I_{it} = d_{it}, && i \in I, \ t \in T, \\
& \sum_{i \in I} e_i P_{it} \le b_t, && t \in T, \\
& P_{it}, I_{it} \ge 0, && i \in I, \ t \in T.
\end{aligned}$$
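To make the roles of the balance and capacity constraints concrete, the sketch below evaluates a candidate production plan rather than optimizing one. The function name `evaluate_plan`, the initial inventory argument `I0` and the sample data are assumptions for illustration; finding the best plan would require a linear programming solver.

```python
def evaluate_plan(P, d, c, h, e, b, I0=None):
    """Return (total cost, inventory levels) of plan P; raise ValueError if infeasible."""
    n_prod, n_per = len(P), len(b)
    prev = list(I0) if I0 is not None else [0.0] * n_prod
    inventory = [[0.0] * n_per for _ in range(n_prod)]
    cost = 0.0
    for t in range(n_per):
        # Capacity constraint: sum_i e_i * P_it <= b_t
        if sum(e[i] * P[i][t] for i in range(n_prod)) > b[t]:
            raise ValueError(f"capacity exceeded in period {t}")
        for i in range(n_prod):
            if P[i][t] < 0:
                raise ValueError(f"negative production for product {i} in period {t}")
            # Balance constraint: I_it = I_i,t-1 + P_it - d_it
            level = prev[i] + P[i][t] - d[i][t]
            if level < 0:
                raise ValueError(f"demand for product {i} not met in period {t}")
            inventory[i][t] = level
            cost += c[i][t] * P[i][t] + h[i][t] * level
            prev[i] = level
    return cost, inventory

# Hypothetical data: one product, two periods, all demand produced in period 0.
plan_cost, plan_inventory = evaluate_plan(
    P=[[10, 0]], d=[[4, 6]], c=[[2, 2]], h=[[1, 1]], e=[1], b=[10, 10])
print(plan_cost, plan_inventory)
```

Producing everything early satisfies demand but pays the inventory cost on the six units carried into the second period.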
4.4.2 Extra Capacity
- A first extension of the basic model allows recourse to extra capacity, in the form of overtime, part-time or third-party capacity.
- The decision variables of the first model are also used here, with the addition of the variable listed below:
o $O_t$ = extra capacity used in period t,
- and of the parameter:
o $q_t$ = unit cost of extra capacity in period t.
- So the formulation now becomes

$$\begin{aligned}
\min \quad & \sum_{t \in T} \sum_{i \in I} \left( c_{it} P_{it} + h_{it} I_{it} \right) + \sum_{t \in T} q_t O_t \\
\text{s.to} \quad & P_{it} + I_{i,t-1} - I_{it} = d_{it}, && i \in I, \ t \in T, \\
& \sum_{i \in I} e_i P_{it} \le b_t + O_t, && t \in T, \\
& P_{it}, I_{it}, O_t \ge 0, && i \in I, \ t \in T.
\end{aligned}$$
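For a fixed production plan, the smallest extra capacity that satisfies the relaxed capacity constraint follows directly from the workload. The helper below is a hypothetical illustration with made-up data; it assumes the unit cost of extra capacity is non-negative, so using more than the minimum is never worthwhile.

```python
def extra_capacity_needed(P, e, b, q):
    """Return (O, cost): least extra capacity O_t per period and its total cost."""
    O = []
    for t in range(len(b)):
        load = sum(e[i] * P[i][t] for i in range(len(P)))  # capacity absorbed in period t
        O.append(max(0.0, load - b[t]))
    return O, sum(q[t] * O[t] for t in range(len(b)))

# Hypothetical plan: two products, two periods, regular capacity of 10 per period.
overtime, overtime_cost = extra_capacity_needed(
    P=[[6, 2], [3, 1]], e=[1, 2], b=[10, 10], q=[5, 5])
print(overtime, overtime_cost)
```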
4.4.3 Multiple Resources
- If several critical resources are involved in the manufacturing process, the formulation includes a few more parameters; the decision variables required are the ones already introduced.
- The additional parameters are listed below:
o $b_{rt}$ = quantity of resource r available in period t,
o $e_{ir}$ = quantity of resource r absorbed to manufacture one unit of product i.
- So the formulation is given as:

$$\begin{aligned}
\min \quad & \sum_{t \in T} \sum_{i \in I} \left( c_{it} P_{it} + h_{it} I_{it} \right) \\
\text{s.to} \quad & P_{it} + I_{i,t-1} - I_{it} = d_{it}, && i \in I, \ t \in T, \\
& \sum_{i \in I} e_{ir} P_{it} \le b_{rt}, && r \in R, \ t \in T, \\
& P_{it}, I_{it} \ge 0, && i \in I, \ t \in T.
\end{aligned}$$
4.4.4 Backlogging
- Backlogging is an additional feature to be considered in logistic systems. The term backlog refers to the possibility that a portion of the demand due in a certain period is not satisfied in time and is fulfilled in a later period, at a penalty cost. The demand left unfulfilled at the end of a period is said to be backlogged.
- Backlogging usually occurs in B2B contexts. In industries that produce mass consumer goods, a variant known as lost sales is more likely: the unmet demand cannot be fulfilled later and is therefore lost.
- To model backlogging it is necessary to add new decision variables: $B_{it}$ = units of demand for product i delayed in period t,
- and new parameters: $g_{it}$ = unit cost of delaying the demand for product i in period t.
- So the formulation becomes:

$$\begin{aligned}
\min \quad & \sum_{t \in T} \sum_{i \in I} \left( c_{it} P_{it} + h_{it} I_{it} + g_{it} B_{it} \right) \\
\text{s.to} \quad & P_{it} + I_{i,t-1} - I_{it} + B_{it} - B_{i,t-1} = d_{it}, && i \in I, \ t \in T, \\
& \sum_{i \in I} e_i P_{it} \le b_t, && t \in T, \\
& P_{it}, I_{it}, B_{it} \ge 0, && i \in I, \ t \in T.
\end{aligned}$$
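The balance constraint with backlog can be traced period by period for a single product. The sketch assumes that inventory and backlog are never both positive in the same period (which holds at an optimum when holding and delay costs are positive); the function name and data are hypothetical.

```python
def inventory_and_backlog(P, d, start=0.0):
    """Split the running net stock of one product into inventory I_t and backlog B_t."""
    net = start  # net stock = inventory minus backlog
    I, B = [], []
    for p_t, d_t in zip(P, d):
        net += p_t - d_t          # P_t + (I_{t-1} - B_{t-1}) - d_t = I_t - B_t
        I.append(max(net, 0.0))   # positive net stock is held as inventory
        B.append(max(-net, 0.0))  # negative net stock is demand still to be served
    return I, B

# Hypothetical plan: production lags behind demand for two periods, then catches up.
inv, back = inventory_and_backlog(P=[5, 5, 10], d=[8, 6, 4])
print(inv, back)
```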

4.4.5 Minimum Lots and Fixed Costs
- Further features are often needed in manufacturing systems. For technical and economic reasons, production may be subject to minimum lot conditions: for one or more products, the production volume must either be equal to 0 or be no less than a given threshold value, called the minimum lot.
- To include these conditions in the model, the binary decision variables listed below need to be introduced:

$$Y_{it} = \begin{cases} 1 & \text{if } P_{it} > 0 \\ 0 & \text{otherwise} \end{cases}$$

- Also the parameters:
o $l_i$ = minimum lot for product i,
o $\gamma$ = a constant value larger than any producible volume for product i.
- So the formulation becomes:

$$\begin{aligned}
\min \quad & \sum_{t \in T} \sum_{i \in I} \left( c_{it} P_{it} + h_{it} I_{it} \right) \\
\text{s.to} \quad & P_{it} + I_{i,t-1} - I_{it} = d_{it}, && i \in I, \ t \in T, \\
& \sum_{i \in I} e_i P_{it} \le b_t, && t \in T, \\
& P_{it} \ge l_i Y_{it}, && i \in I, \ t \in T, \\
& P_{it} \le \gamma Y_{it}, && i \in I, \ t \in T, \\
& P_{it}, I_{it} \ge 0, \quad Y_{it} \in \{0, 1\}, && i \in I, \ t \in T.
\end{aligned}$$
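The pair of constraints linking production to the binary variables can be checked directly on a candidate plan. The helper name and data below are illustrative assumptions.

```python
def check_minimum_lots(P, min_lot, gamma):
    """Derive Y_it from P_it and test l_i * Y_it <= P_it <= gamma * Y_it everywhere."""
    Y = [[1 if p > 0 else 0 for p in row] for row in P]
    ok = all(min_lot[i] * Y[i][t] <= P[i][t] <= gamma * Y[i][t]
             for i in range(len(P)) for t in range(len(P[i])))
    return Y, ok

# Hypothetical product with a minimum lot of 40 units over three periods.
Y_good, good = check_minimum_lots([[0, 50, 120]], min_lot=[40], gamma=1000)
Y_bad, bad = check_minimum_lots([[0, 20, 120]], min_lot=[40], gamma=1000)
print(Y_good, good, bad)
```

A period with 20 units violates the condition, because production must be either zero or at least the minimum lot.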
4.4.6 Bill of Materials
- One more feature that can be added to the planning model is the bill of materials, which describes products with a complex structure,
- in which the end product is built from various components.
- The parameters that define the structure of the bill of materials are:
o $a_{ij}$ = units of product i directly required by one unit of product j, where the term product refers both to end products and to the components at the different levels of the bill of materials.
- So the formulation becomes:

$$\begin{aligned}
\min \quad & \sum_{t \in T} \sum_{i \in I} \left( c_{it} P_{it} + h_{it} I_{it} \right) \\
\text{s.to} \quad & P_{it} + I_{i,t-1} - I_{it} = d_{it} + \sum_{j \in I} a_{ij} P_{jt}, && i \in I, \ t \in T, \\
& \sum_{i \in I} e_i P_{it} \le b_t, && t \in T, \\
& P_{it}, I_{it} \ge 0, && i \in I, \ t \in T.
\end{aligned}$$
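The effect of the bill of materials is that the requirement for a component combines its own external demand with the demand induced by the products that use it. A small sketch, with a hypothetical two-item structure in which one unit of item 0 needs two units of item 1:

```python
def total_requirements(d, P, a):
    """Right-hand side d_it + sum_j a_ij * P_jt for every item i and period t."""
    n, T = len(d), len(d[0])
    return [[d[i][t] + sum(a[i][j] * P[j][t] for j in range(n)) for t in range(T)]
            for i in range(n)]

# a[i][j] = units of item i directly required by one unit of item j (assumed data).
a = [[0, 0],
     [2, 0]]
d = [[10, 5],   # external demand for the end product
     [0, 4]]    # external demand for the component (e.g. spare parts)
P = [[10, 5],
     [20, 14]]
requirements = total_requirements(d, P, a)
print(requirements)
```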
- In this model the logistic system has to supply N peripheral depots from M manufacturing plants. Every manufacturing plant m ∈ M = {1, 2, ..., M} is characterized by a maximum availability of product, given by $s_m$, while every depot n ∈ N = {1, 2, ..., N} has a demand of $d_n$ units.
- A transportation cost $c_{mn}$ is also included: it is the cost of sending one unit of product from plant m to depot n, for every origin-destination pair (m, n) in the logistic network.
- The aim of the company is to find an optimal logistic plan that satisfies the demands of the depots at minimum cost without exceeding the availability of the production plants.
- The decision variables of this model represent the quantity to be transported for every plant-depot pair: $x_{mn}$ = units of product to be transported from m to n.
- So the formulation becomes:

$$\begin{aligned}
\min \quad & \sum_{m \in M} \sum_{n \in N} c_{mn} x_{mn} \\
\text{s.to} \quad & \sum_{n \in N} x_{mn} \le s_m, && m \in M, \\
& \sum_{m \in M} x_{mn} \ge d_n, && n \in N, \\
& x_{mn} \ge 0, && m \in M, \ n \in N.
\end{aligned}$$
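On a toy instance the plant-to-depot model can be solved by enumerating integer shipment plans. This is a sketch only: it assumes integer availabilities and demands and positive unit costs (so no depot is ever oversupplied in an optimal plan), and the cost matrix is invented. Realistic instances call for a linear programming solver.

```python
from itertools import product

def cheapest_shipping(cost, supply, demand):
    """Enumerate integer shipment plans for a tiny instance and keep the cheapest."""
    M, N = len(supply), len(demand)
    # With positive costs, x_mn never needs to exceed min(s_m, d_n).
    ranges = [range(min(supply[m], demand[n]) + 1) for m in range(M) for n in range(N)]
    best, best_cost = None, None
    for flat in product(*ranges):
        x = [flat[m * N:(m + 1) * N] for m in range(M)]
        if any(sum(x[m]) > supply[m] for m in range(M)):
            continue  # plant availability exceeded
        if any(sum(x[m][n] for m in range(M)) < demand[n] for n in range(N)):
            continue  # depot demand not met
        total = sum(cost[m][n] * x[m][n] for m in range(M) for n in range(N))
        if best_cost is None or total < best_cost:
            best, best_cost = x, total
    return best, best_cost

# Hypothetical network: two plants, two depots.
shipments, shipping_cost = cheapest_shipping(
    cost=[[4, 6], [5, 3]], supply=[3, 4], demand=[3, 3])
print(shipments, shipping_cost)
```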
Syllabus Topic : Logistic and Production Models : Revenue Management Systems
4.5 Revenue Management Systems
- Revenue management is a managerial policy whose main objective is to maximize the profits of the company by maintaining the balance between demand and supply.
- It applies to both marketing and logistics, and has gained interest in service industries such as transport, tourism and hotels.
- Eventually it was also adopted by manufacturing and distribution companies. This growth was to be expected, since the basic idea concerns revenue and every company wants to maximize its profit.
- Revenue management, however, needs to be planned according to the strategies, decision-making patterns and models of the company, so it becomes complex once data are fed to it.
4.5.1 Decision Processes in Revenue Management
- Revenue management relies on mathematical models that predict the behaviour of customers at every level, so that the availability of the product and its price can be optimized to obtain the maximum profit from sales.
- The aim of revenue management is not only to maximize profit but also to manage the various offers on products and services in line with demand, using both marketing strategies, such as promotions, and logistics.
- On the logistics side, it focuses on fulfilling the requirements with minimum transport cost.
- Since it is a managerial policy, many companies have taken it up and are working on it. The policy has been applied with growing success; the fields that actively implement it include automotive rental companies, entertainment companies, hotel chains, airlines and so on.
- The common features of these fields are low marginal sales costs and the possibility of applying dynamic pricing policies and of varying the sales channels used.
Syllabus Topic : Data Envelopment Analysis : Efficiency Measures
4.6 Efficiency Measures

- In data envelopment analysis the units being compared are known as decision making units (DMUs), since their decisions are self-governed.
- To measure efficiency, let N = {1, 2, ..., n} be the set of n units being compared.
- If the units produce a single output using a single input only, the efficiency of the j-th decision making unit DMU_j, j ∈ N, is given as:

$$\theta_j = \frac{y_j}{x_j}$$

- Here $y_j$ is the output value generated by DMU_j and $x_j$ is the quantity of input used.
- If the output is generated using several input factors, the efficiency of DMU_j is defined as the ratio between a weighted sum of the outputs and a weighted sum of the inputs.
- Let H = {1, 2, ..., s} be the set of production factors (inputs) and K = {1, 2, ..., m} the set of outputs. Let $x_{ij}$, i ∈ H, be the quantity of input i used by DMU_j and $y_{rj}$, r ∈ K, the quantity of output r obtained. The efficiency of DMU_j is given as:

$$\theta_j = \frac{u_1 y_{1j} + u_2 y_{2j} + \cdots + u_m y_{mj}}{v_1 x_{1j} + v_2 x_{2j} + \cdots + v_s x_{sj}} = \frac{\sum_{r \in K} u_r y_{rj}}{\sum_{i \in H} v_i x_{ij}}$$

- where the weights $u_1, u_2, \ldots, u_m$ are associated with the outputs and $v_1, v_2, \ldots, v_s$ are assigned to the inputs.
- In this second case the efficiency value may vary considerably with the weights, and it becomes difficult to fix a single structure of weights that can be shared by all the units.
- A preset system of weights could draw objections from the units, since it may give an advantage to a few DMUs instead of being fair to all.
- To avoid this, data envelopment analysis calculates the efficiency of every unit using the system of weights that is best for that DMU, that is, the one that maximizes its efficiency.
- By means of additional analysis, data envelopment analysis then aims to establish whether the units are truly efficient or not.
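The dependence of the efficiency ratio on the chosen weights can be seen with two invented units and two alternative weight systems. All numbers and the `efficiency` helper are hypothetical; the sketch only evaluates the ratio and does not search for optimal weights.

```python
def efficiency(outputs, inputs, u, v):
    """Weighted sum of outputs divided by weighted sum of inputs."""
    weighted_out = sum(ur * y for ur, y in zip(u, outputs))
    weighted_in = sum(vi * x for vi, x in zip(v, inputs))
    return weighted_out / weighted_in

# Hypothetical units: (outputs, inputs), two outputs and two inputs each.
units = {"A": ([10, 2], [4, 4]), "B": ([4, 8], [4, 4])}
weight_systems = {"favours output 1": ([2, 1], [1, 1]),
                  "favours output 2": ([1, 2], [1, 1])}
for label, (u, v) in weight_systems.items():
    scores = {name: efficiency(out, inp, u, v) for name, (out, inp) in units.items()}
    print(label, scores)
```

Unit A ranks first under one weight system and unit B under the other, which is why no single preset system is likely to be accepted by every unit.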
Syllabus Topic : Data Envelopment Analysis : Efficient Frontier
4.7 Efficient Frontier
Q. 4.7.1 Explain in brief efficient frontier.
- The efficient frontier is also known as the production function. It shows the relation between the inputs used and the outputs produced using those inputs. It shows the maximum amount of output that can be generated by a given combination of inputs,
- and likewise the minimum quantity of inputs required to obtain a given output level.
- Hence the efficient frontier corresponds to technically efficient operating methods. It can be obtained empirically from a set of observations, each showing the output level achieved with a given combination of input production factors.
- In data envelopment analysis the observations correspond to the units being evaluated. Parametric statistical methods, for instance those that calculate a regression curve, formulate prior hypotheses on the shape of the production function.
- Data envelopment analysis, by contrast, makes no assumptions on the functional form of the efficient frontier and is non-parametric in nature.
- The only condition is that the units being compared are not placed above the production function; where they lie relative to it depends on their efficiency value.
Syllabus Topic : Data Envelopment Analysis : The CCR Model
4.8 The CCR Model
Q. 4.8.1 Explain in brief CCR model. (Ref. Sec. 4.8) (6 Marks)
- In data envelopment analysis, choosing the optimal weights for a generic DMU_j involves solving a mathematical optimization model whose decision variables are the weights $u_r$, r ∈ K, and $v_i$, i ∈ H, associated with every output and input.
- There are various formulations for the efficiency score; the best known is the Charnes-Cooper-Rhodes (CCR) model, which for DMU_j is given by:

$$\begin{aligned}
\max \quad & \vartheta = \frac{\sum_{r \in K} u_r y_{rj}}{\sum_{i \in H} v_i x_{ij}} \\
\text{s.to} \quad & \frac{\sum_{r \in K} u_r y_{rk}}{\sum_{i \in H} v_i x_{ik}} \le 1, && k \in N, \\
& u_r, v_i \ge 0, && r \in K, \ i \in H.
\end{aligned}$$

- The aim is to maximize the efficiency measure of DMU_j, while the constraints require that no unit k ∈ N, evaluated with the same weights, scores more than 1.
- The fractional objective can be made linear by requiring the weighted sum of the inputs of DMU_j to equal 1, which gives:

$$\begin{aligned}
\max \quad & \vartheta = \sum_{r \in K} u_r y_{rj} \\
\text{s.to} \quad & \sum_{i \in H} v_i x_{ij} = 1, \\
& \sum_{r \in K} u_r y_{rk} - \sum_{i \in H} v_i x_{ik} \le 0, && k \in N, \\
& u_r, v_i \ge 0, && r \in K, \ i \in H.
\end{aligned}$$

- Let $\vartheta^*$ be the optimum value of the objective function, corresponding to the optimal solution $(v^*, u^*)$. DMU_j is said to be efficient if $\vartheta^* = 1$ and if there exists at least one optimal solution $(v^*, u^*)$ such that $v^* > 0$ and $u^* > 0$.
- By solving a similar optimization model for each of the n units being compared, one obtains n systems of weights.
- The flexibility enjoyed by the units in choosing the weights represents an undisputed advantage, in that if a unit turns out to be inefficient based on the most favourable system of weights, its inefficiency cannot be traced back to an inappropriate evaluation process.
- However, given a unit that scores $\vartheta^* = 1$, it is important to determine whether its efficiency value should be attributed to an actual high-level performance or simply to an optimal selection of the weights structure.
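In the special case of one input and one output the CCR model has a closed-form solution: the score of a unit is its output-to-input ratio divided by the best ratio among all units. The sketch below covers only this special case, with invented data; the general model with several inputs and outputs requires a linear programming solver.

```python
def ccr_single_io(x, y):
    """CCR scores when every unit uses one input x_k to produce one output y_k."""
    best_ratio = max(yk / xk for xk, yk in zip(x, y))
    # Normalising by the best ratio keeps every score at or below 1.
    return [(yj / xj) / best_ratio for xj, yj in zip(x, y)]

# Hypothetical units: inputs used and outputs produced.
scores = ccr_single_io(x=[2, 4, 5], y=[4, 4, 10])
print(scores)
```

Units scoring 1 lie on the efficient frontier; the second unit would need to halve its input, or double its output, to reach it.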
4.8.1 Dual of the CCR Model
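- The CCR model has an associated input-oriented dual problem, which for DMU_j is formulated as follows:

$$\begin{aligned}
\min \quad & \vartheta \\
\text{s.to} \quad & \sum_{k \in N} \lambda_k x_{ik} - \vartheta x_{ij} \le 0, && i \in H, \\
& \sum_{k \in N} \lambda_k y_{rk} - y_{rj} \ge 0, && r \in K, \\
& \lambda_k \ge 0, && k \in N.
\end{aligned}$$

- Here the index k ranges over all the units, while j denotes the unit being evaluated.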
4.8.2 Definition of Target Objectives
- In real-world applications it is often desirable to set improvement objectives for inefficient units, in terms of both the inputs utilized and the outputs generated.
- Data envelopment analysis gives important suggestions here, since it identifies the input and output levels at which an inefficient unit would become efficient.
- The efficiency score of a unit expresses the maximum proportion of the inputs actually utilized that the unit should use to be efficient, given its current output levels.
- Conversely, the inverse of the efficiency score indicates the factor by which the current output levels must be multiplied for the unit to be efficient, holding the level of utilized inputs constant.
- Based on the efficiency values, data envelopment analysis gives every unit a measure of the savings in inputs, or of the increase in outputs, required for the unit to become efficient.
- Target values can be determined with an input-oriented or an output-oriented strategy. In the first case the improvement objectives primarily concern the resources used, and the target values for inputs and outputs are given below:

$$x_{ij}^{\text{target}} = \vartheta^* x_{ij} - s_i^*, \quad i \in H,$$
$$y_{rj}^{\text{target}} = y_{rj} + s_r^*, \quad r \in K.$$

- In the second case, the target values for inputs and outputs are given by:

$$x_{ij}^{\text{target}} = x_{ij} - \frac{s_i^*}{\vartheta^*}, \quad i \in H,$$
$$y_{rj}^{\text{target}} = \frac{y_{rj} + s_r^*}{\vartheta^*}, \quad r \in K.$$

- Here $s_i^*$ and $s_r^*$ denote the optimal slack values of the input and output constraints.
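Once the efficiency score and the optimal slacks of a unit are known, its targets follow by direct substitution. The sketch assumes the input-oriented targets are the scaled inputs minus the input slacks, and that the output-oriented targets are obtained by dividing the same projection by the score; the numbers are invented.

```python
def dea_targets(x, y, score, s_in, s_out, orientation="input"):
    """Target inputs and outputs for one DMU from its score and optimal slacks."""
    if orientation == "input":
        x_target = [score * xi - si for xi, si in zip(x, s_in)]
        y_target = [yr + sr for yr, sr in zip(y, s_out)]
    else:  # output-oriented: the same projection divided by the score
        x_target = [xi - si / score for xi, si in zip(x, s_in)]
        y_target = [(yr + sr) / score for yr, sr in zip(y, s_out)]
    return x_target, y_target

# Hypothetical inefficient unit: two inputs, one output, score 0.8.
x_in, y_in = dea_targets([10, 20], [5], 0.8, s_in=[0, 2], s_out=[1])
x_out, y_out = dea_targets([10, 20], [5], 0.8, s_in=[0, 2], s_out=[1],
                           orientation="output")
print(x_in, y_in)
print(x_out, y_out)
```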
4.8.3 Peer Groups
- Data envelopment analysis associates every inefficient unit with a set of excellent units, called its peer group, made up of the units that are efficient when evaluated with the weights of the inefficient unit.
- The peer group consists of DMUs whose operating methods are similar to those of the inefficient unit being examined. It is therefore a realistic term of comparison, which the unit should aim to imitate in order to improve its operating practices and its performance.
- The units included in the peer group of a given unit DMU_j are the DMUs for which the first and second members of the constraints of the CCR model are equal when the optimal weights of DMU_j are used:

$$B_j = \left\{ k \in N : \sum_{r \in K} u_r^* y_{rk} = \sum_{i \in H} v_i^* x_{ik} \right\}$$
Syllabus Topic : Data Envelopment Analysis : Identification of
Good Operating Practices
4.9 Identification of Good Operating Practices
Q. 4.9.1 Explain basic factors associated with identification of good operating practices. (Ref. Sec. 4.9) (5 Marks)
- Identifying good operating practices is important, as it helps to improve the performance of the units being compared.
- The units that are efficient according to data envelopment analysis serve as terms of comparison and as examples for the other units.
- Even among the efficient units, however, some are better examples than others for improving existing practices. It is important to search for the most efficient units so that existing operating practices can be improved.
- So, to identify good operating practices, the units that are actually efficient need to be identified.
- To distinguish these units we can use different methods, such as cross-efficiency analysis, evaluation of virtual inputs and virtual outputs, and weight restrictions.
4.9.1 Cross-Efficiency Analysis
- Cross-efficiency analysis is carried out with the help of an efficiency matrix, which gives information about the nature of the weights systems adopted by the units for their own efficiency evaluation.
- The square efficiency matrix has as many rows and columns as there are units being compared. The element θ_ij of the matrix denotes the efficiency of DMU_j calculated with the optimal weights structure of DMU_i, and θ_jj is the efficiency of DMU_j evaluated using its own optimal weights.
- If DMU_j is efficient, i.e. θ_jj = 1, but its behaviour is specialized along one particular dimension compared with the other units, the other efficiency values in the column related to DMU_j will be less than 1.
- Two quantities of interest can be derived from the efficiency matrix. The first is the average efficiency of a unit under the optimal weights systems of the different units, obtained as the average of the values in the j-th column. The second is the average efficiency obtained by applying the unit's own optimal system of weights to the other units.
- The latter is obtained by averaging the values in the row associated with the unit being examined.
- The difference between the efficiency score θ_jj of DMU_j and the efficiency obtained as the average of the j-th column shows how much the unit relies on a system of weights that conforms with the one used by the other units.
- If the difference between the two terms is significant, DMU_j may have chosen a weights structure that is not shared by the other DMUs, in order to privilege the dimensions of analysis on which it appears particularly efficient.
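The quantities described above can be sketched in a few lines of Python. This is only an illustration: the 3 x 3 matrix `theta` and the helper names are hypothetical and are not taken from the text. It assumes `theta[i][j]` holds the efficiency of DMU_j under the optimal weights of DMU_i.

```python
# Hypothetical cross-efficiency matrix (illustrative numbers only).
# theta[i][j] = efficiency of DMU_j evaluated with the optimal weights of DMU_i.
theta = [
    [1.00, 0.80, 0.60],
    [0.70, 1.00, 0.70],
    [0.40, 0.85, 0.75],
]

def column_average(matrix, j):
    """Average efficiency of DMU_j under every unit's optimal weights."""
    return sum(row[j] for row in matrix) / len(matrix)

def row_average(matrix, i):
    """Average efficiency of all units under the optimal weights of DMU_i."""
    return sum(matrix[i]) / len(matrix[i])

def reliance_gap(matrix, j):
    """Own efficiency score theta_jj minus the average of the j-th column."""
    return matrix[j][j] - column_average(matrix, j)

for j in range(len(theta)):
    print(j, round(column_average(theta, j), 3),
          round(row_average(theta, j), 3),
          round(reliance_gap(theta, j), 3))
```

A large gap for a unit suggests that its score depends on a weights structure the other units do not share.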
4.9.2 Virtual Inputs and Virtual Outputs:
- Virtual inputs and virtual outputs give information about the relative importance that each unit attributes to every input and output in order to maximize its efficiency score.
- They therefore allow the specific competencies of every unit to be identified, while its weaknesses are highlighted at the same time. The virtual inputs of a DMU are defined as the product of the inputs used by the unit and the corresponding optimal weights.
- Similarly, virtual outputs are the product of the outputs of the unit and the associated optimal weights. The inputs and outputs for which the unit shows high virtual scores indicate the activities in which the unit is particularly efficient.
- There can be a scenario where two efficient units show high virtual values for different combinations of inputs and outputs, which means that they follow different good operating practices. In this case each unit can learn from the operating methods of the other to improve its efficiency on a specific dimension.
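As a small illustration of the definition above, the following Python sketch multiplies each input and output of one unit by its weight. The input names, quantities and weights are hypothetical values chosen only for the example.

```python
# Hypothetical data for one DMU (all names and numbers are illustrative).
inputs = {"staff": 20.0, "floor_area": 200.0}
input_weights = {"staff": 0.03, "floor_area": 0.002}
outputs = {"sales": 500.0, "customers": 80.0}
output_weights = {"sales": 0.001, "customers": 0.005}

def virtual_values(quantities, weights):
    """Multiply each quantity by its weight to obtain the virtual value."""
    return {name: quantities[name] * weights[name] for name in quantities}

virtual_inputs = virtual_values(inputs, input_weights)
virtual_outputs = virtual_values(outputs, output_weights)

# The output with the highest virtual value points to the activity in which
# this unit appears most efficient.
strongest_output = max(virtual_outputs, key=virtual_outputs.get)
print(virtual_inputs, virtual_outputs, strongest_output)
```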
4.9.3 Weight Restrictions
- To separate the units that are really efficient from those whose efficiency score depends mainly on the selected weights system, restrictions can be imposed on the values of the weights associated with inputs and outputs.
- These restrictions are usually expressed as a maximum threshold for the weight of a specific output or a minimum threshold for the weight of a specific input.
- Even when such restrictions are imposed on the weights, the units still have some flexibility in choosing the multiplicative factors of inputs and outputs.
- For this reason it is helpful to also evaluate virtual inputs and virtual outputs, in order to identify the units with the most efficient operating practices in the usage of a specific input resource or the generation of a specific output.
Syllabus Topic : Marketing Models : Relational Marketing
Q.1 Explain Relational marketing and various factors associated with it. (Refer Section 4.1) (5 Marks)
Q.2 Explain the concept of acquisition. (Refer Section 4.1.5) (5 Marks)
Syllabus Topic : Marketing Models : Sales Force Management
Q.3 Explain sales force management and various factors associated with it. (Refer Section 4.2) (5 Marks)
Syllabus Topic : Logistic and Production Models : Supply Chain Optimization
Q.4 Describe Supply chain optimization. (Refer Section 4.3) (5 Marks)
Syllabus Topic : Logistic and Production Models : Optimization Models for Logistics Planning
Q.5 (Refer Section 4.4) (5 Marks)
Syllabus Topic : Logistic and Production Models : Revenue Management Systems
Q.6 List Revenue management systems. Explain any one in detail. (Refer Section 4.5) (5 Marks)
Syllabus Topic : Data Envelopment Analysis : Efficiency Measures
Q.7 List and explain efficiency measures associated with Data Envelopment analysis.
Syllabus Topic : Data Envelopment Analysis : Efficient Frontier
Q.8 Explain in brief efficient frontier. (Refer Section 4.7) (5 Marks)
Syllabus Topic : Data Envelopment Analysis : The CCR Model
Q.9 Explain in brief CCR model. (Refer Section 4.8) (5 Marks)
Syllabus Topic : Data Envelopment Analysis : Identification of Good Operating Practices
Q.10 Explain basic factors associated with Identification of good operating practices.
Chapter Ends...
Unit II
Classification and Clustering
Syllabus Topic : Classification Problems
3.1 Classification Problems

Q.3.1.1 What is classification? What are the components of a classification problem? (Ref. Sec. 3.1) (5 Marks)
- Classification is a supervised learning method. It is used to predict the value of a categorical target attribute.
- Classification applications include image and pattern recognition, medical diagnosis, loan approval, fault detection and other industry applications. Estimation and prediction can be viewed as types of classification.
- Consider a dataset N. It has x observations, y explanatory attributes and a categorical target attribute.
- The explanatory attributes are termed predictive variables. The target attribute is called the class or label. Observations are called examples or instances.
- The purpose of a classification model is to recognise recurring relationships among the explanatory variables that describe the examples belonging to the same class.
- These relationships are translated into classification rules, which are used to predict the class of new observations.
- A classification problem has three components: a generator of observations, a supervisor of the target class and a classification algorithm.

Fig. 3.1.1 : Components of a classification problem (1. Generator, 2. Supervisor, 3. Algorithm)
1. Generator
The role of the generator is to extract random vectors m of examples according to an unknown probability distribution P_m(m).
2. Supervisor
For each vector m of examples, the supervisor returns the value of the target class according to a conditional distribution that is also unknown.
3. Algorithm
A classification algorithm, also called a classifier, chooses a function that minimizes a loss function.
3.1.1 Phases of Classification Model
Q.3.1.2 What are the three phases of classification model ? (Ref. Sec. 3.1.1) (5 Marks)
The three main phases of classification model are as follows :
Fig. 3.1.2 : Phases of a classification model (1. Training phase, 2. Test phase, 3. Prediction phase)

1. Training phase
The classification algorithm is applied to a subset of N called the training set, in order to derive classification rules that allow the corresponding target class to be assigned to each observation.
2. Test phase
The rules generated during the training phase are used to classify the observations of N that are not included in the training set and for which the target class value is already known. The training set and the test set should be different.
3. Prediction phase
A prediction is obtained by applying the rules generated during the training phase to the explanatory variables that describe the new instance.
3.1.2 Taxonomy of Classification Model
Q.3.1.3 What are the main components of classification model ? (Ref. Sec. 3.1.2) (5 Marks)

There are four main components of classification model.
Fig. 3.1.3 : Components of classification model (1. Heuristic models, 2. Separation models, 3. Regression models, 4. Probabilistic models)
1. Heuristic models
- Heuristic models include nearest neighbour methods, which are based on the concept of distance between observations, and classification trees.
- Classification trees use divide-and-conquer schemes to derive groups of observations that are as homogeneous as possible with respect to the target class.

2. Separation models
The classification models that belong to the separation category differ from each other in the type of separation regions, the loss function and so on.
The most popular separation techniques include discriminant analysis, perceptron methods, neural networks and support vector machines. Some variants of classification trees can also be placed in this category.
3. Regression models
Regression is normally used for the prediction of continuous target variables. When used for classification, a regression model makes an explicit assumption about the functional form of the conditional probabilities, which correspond to the assignment of the target class by the supervisor.
4. Probabilistic models
- In probabilistic models, a hypothesis is formulated regarding the conditional probabilities of the observations given the target class, known as class-conditional probabilities.
- Subsequently, using Bayes' theorem, the posterior probabilities of the target class assigned by the supervisor are derived.
Syllabus Topic : Evaluation of Classification Models
3.2 Evaluation of Classification Model

Q.3.2.1 How do you evaluate a classification method? (Ref. Sec. 3.2) (5 Marks)
Fig. 3.2.1 : Evaluation of classification model (1. Accuracy, 2. Speed, 3. Scalability, 4. Interpretability)
1. Accuracy
Accuracy is the ability of a model to predict the target class of future observations. Based on accuracy values, it is possible to compare different models in order to select the best classifier.
2. Speed
- Classification methods differ in their computation times. A method with long computation times can still be applied to a small training set obtained from a large number of observations by random sampling.
- A classification method is robust if the classification rules generated, and the corresponding accuracy, do not vary significantly as the choice of the training set changes. It is also expected to handle missing data and outliers.
3. Scalability
It is the ability of a classifier to learn from large datasets.
4. Interpretability
The objective of a classification analysis is to interpret as well as to predict. The rules generated should be simple, so that knowledge workers and experts in the application domain can understand them easily.
3.2.1 Holdout Method
Q.3.2.2 Explain the Holdout method. (Ref. Sec. 3.2.1) (4 Marks)

The holdout method reserves a certain amount of the dataset for testing and uses the remainder for training. Usually one third is used for testing and the rest for training.
The holdout method gives an estimate of the true error rate (and hence the accuracy) of a classifier.
We only have a (small) data sample of the whole data (population), and sampling is used to divide this sample into a test set and a training set.
That is why the true error rate is difficult to calculate exactly, and the estimate depends on how the data happen to be split.
3.2.2 Repeated Random Sampling
The holdout estimate can be made more reliable by repeating the process with different subsamples. In each iteration, a certain proportion of the data is randomly selected for training (possibly with stratification).
The error rates (or some other performance measure) on the different iterations are averaged to produce an overall error rate.
The disadvantage of the repeated holdout method is that it is still not optimal, because the different test sets may overlap.
Formula for repeated random sampling
The m observations of dataset D are divided into two disjoint sets T and V. T is used for training and V for testing. Repeated random sampling involves replicating the holdout method r times.
For each repetition k, a sample T_k of t observations is extracted and the corresponding accuracy is calculated on V_k = D - T_k. The overall accuracy is the average over the r repetitions :
$acc_A = \frac{1}{r} \sum_{k=1}^{r} acc_A(V_k)$
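The repeated holdout estimate can be sketched as follows. This is a minimal illustration, not a definitive implementation: the toy dataset, the helper names and the trivial majority-class "classifier" are all assumptions made only to keep the example self-contained.

```python
import random

def holdout_split(data, test_fraction, rng):
    """Shuffle a copy of the data and return (training set T, test set V)."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def majority_class(training):
    """Stand-in classifier: always predict the most frequent training label."""
    labels = [label for _, label in training]
    return max(sorted(set(labels)), key=labels.count)

def accuracy(predicted_label, test):
    """Fraction of test observations whose label equals the prediction."""
    return sum(1 for _, label in test if label == predicted_label) / len(test)

def repeated_holdout(data, r, test_fraction=1 / 3, seed=0):
    """Average the holdout accuracy over r random splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(r):
        training, test = holdout_split(data, test_fraction, rng)
        scores.append(accuracy(majority_class(training), test))
    return sum(scores) / r

# Hypothetical dataset: 30 (id, label) pairs, two thirds labelled "yes".
data = [(i, "yes" if i % 3 else "no") for i in range(30)]
print(repeated_holdout(data, r=10))
```

Because the test sets are drawn independently in each repetition, they may overlap, which is the drawback noted above.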
3.2.3. Cross-Validation
Q.3.2.4 Explain the cross validation. (Ref. Sec. 3.2.3) (4 Marks)
- Cross validation avoids overlapping test sets. It ensures that each observation of dataset D appears the same number of times in the training sets and exactly once in a test set.
- The dataset D is partitioned into r disjoint subsets L_1, L_2, ..., L_r, and r iterations are required. At the k-th iteration, L_k is selected as the test set and the union of all other subsets in the partition as the training set :
$V_k = L_k, \quad T_k = \bigcup_{j \neq k} L_j$
- The standard method for evaluation is ten-fold cross validation. Extensive experiments suggest that 10 folds is a good choice for obtaining an accurate estimate.
- Repeated stratified cross validation is even better : ten-fold cross validation is repeated 10 times and the results are averaged, which reduces the variance.
- Leave-one-out is a particular form of cross validation in which the number of subsets equals the number m of observations. In this case each of the m test sets includes only one observation, and each example in turn is used to measure accuracy.
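The partition into r disjoint subsets and the resulting training and test sets can be sketched as below. The function names and the ten-element toy dataset are illustrative assumptions. The observations are assigned to folds in order, without the shuffling or stratification that would normally be added.

```python
def k_fold_partition(observations, r):
    """Split the observations into r disjoint subsets of near-equal size."""
    return [observations[k::r] for k in range(r)]

def cross_validation_splits(observations, r):
    """Yield (training set T_k, test set V_k) for k = 1..r."""
    folds = k_fold_partition(observations, r)
    for k in range(r):
        test = folds[k]
        training = [x for j, fold in enumerate(folds) if j != k for x in fold]
        yield training, test

# Toy dataset of 10 observations split into 5 folds.
splits = list(cross_validation_splits(list(range(10)), 5))
for training, test in splits:
    print("train:", training, "test:", test)
```

Setting r equal to the number of observations gives leave-one-out cross validation.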
3.2.4 Confusion Matrices
Q.3.2.5 Explain the confusion matrices. (Ref. Sec. 3.2.4)
- A binary classifier produces output with two class values or labels, such as Yes/No and 1/0, for given input data. The class of interest is usually denoted as "positive" and the other as "negative".
- A test dataset is used for performance evaluation. It should hold the correct labels (observed labels) for all data instances. After classification, these observed labels are compared with the predicted labels.
- If the performance of a binary classifier is perfect, the predicted labels are exactly the same as the observed labels, but this is not common in practical situations.
- A binary classifier predicts every data instance of a test dataset as either positive or negative. This classification (or prediction) produces four outcomes : true positive (TRP), true negative (TAN), false positive (FAP) and false negative (FAN).
- Error rate (ERR) and accuracy (ACC) are the two most common and intuitive measures derived from the confusion matrix.
Error rate
Error rate is calculated as the number of all incorrect predictions (FAN + FAP) divided by the total number of instances in the dataset (P + N). The best error rate is 0.0, whereas the worst is 1.0.
$ERR = \frac{FAP + FAN}{TRP + TAN + FAN + FAP} = \frac{FAP + FAN}{P + N}$
Accuracy
Accuracy is calculated as the number of all correct predictions divided by the total number of instances in the dataset. The best accuracy is 1.0, whereas the worst is 0.0. It can also be calculated as 1 - ERR.
$ACC = \frac{TRP + TAN}{TRP + TAN + FAN + FAP} = \frac{TRP + TAN}{P + N}$
True positive rate
True positive rate, also called sensitivity or recall (REC), is calculated as the number of correct positive predictions divided by the total number of positives. The best true positive rate is 1.0 and the worst is 0.0.
$TPR = \frac{TRP}{TRP + FAN}$
True negative rate or specificity
It is the number of correct negative predictions divided by the total number of negatives.
$SP = \frac{TAN}{TAN + FAP}$
Precision
It is calculated as the number of correct positive predictions divided by the total number of positive predictions. The best precision is 1.0, whereas the worst is 0.0.
$PREC = \frac{TRP}{TRP + FAP}$
False positive rate
It is calculated as the number of incorrect positive predictions divided by the total number of negatives. It is equal to 1 - specificity.
$FPR = \frac{FAP}{TAN + FAP} = 1 - SP$
F score
F score is the harmonic mean of precision and recall.
$F_\beta = \frac{(1 + \beta^2)(PREC \cdot REC)}{\beta^2 \cdot PREC + REC}$
β is commonly 0.5, 1 or 2.
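The measures above can be computed directly from the four outcome counts. The sketch below follows the document's abbreviations (TRP, TAN, FAP, FAN). The function name and the example counts are hypothetical.

```python
def confusion_metrics(trp, tan, fap, fan, beta=1.0):
    """Compute the basic measures from the four confusion matrix counts."""
    p = trp + fan            # all observed positives
    n = tan + fap            # all observed negatives
    total = p + n
    prec = trp / (trp + fap)
    rec = trp / p            # true positive rate (sensitivity, recall)
    return {
        "ERR": (fap + fan) / total,
        "ACC": (trp + tan) / total,
        "TPR": rec,
        "SP": tan / n,
        "PREC": prec,
        "FPR": fap / n,
        "F": (1 + beta ** 2) * prec * rec / (beta ** 2 * prec + rec),
    }

# Hypothetical test set: 50 positives and 50 negatives.
m = confusion_metrics(trp=40, tan=45, fap=5, fan=10)
for name, value in m.items():
    print(name, round(value, 3))
```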
3.2.5 ROC Curve Charts
Q.3.2.6 Explain the ROC curve chart. (Ref. Sec. 3.2.5) (5 Marks)
- The Receiver Operating Characteristic (ROC) plot is based on two basic evaluation measures, specificity and sensitivity. Specificity is a performance measure of the whole negative part of a dataset.
- Sensitivity is a performance measure of the whole positive part. ROC curve charts allow the user to evaluate the accuracy of a classifier visually.
- ROC charts are used to compare different classification models. They visually express the information content of a sequence of confusion matrices.
- They allow the ideal trade-off between the number of correctly classified positive observations and the number of incorrectly classified negative observations to be assessed. In this respect, they are an alternative to the assignment of misclassification costs.
Fig. 3.2.2 : A dataset has two labels (P and N), and a classifier separates the dataset into four outcomes : TRP, TAN, FAP, FAN. The ROC plot is based on two basic measures, specificity and sensitivity, that are calculated from the four outcomes. (x-axis : 1 - Specificity, the false positive rate; y-axis : Sensitivity, the true positive rate)
ROC curves that pass close to the top left corner (0.0, 1.0) show good performance levels. ROC curves that pass close to the bottom right corner (1.0, 0.0) indicate poor performance levels.
Fig. 3.2.3 : A ROC curve of a random classifier. The diagonal represents a classifier with the random performance level. It separates the space into two areas, for good and poor performance levels. (x-axis : 1 - Specificity; y-axis : Sensitivity)
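One common way to trace a ROC curve is to rank the observations by the classifier's score and lower the decision threshold one observation at a time. The sketch below illustrates this under stated assumptions: the scores and labels are hypothetical, label 1 means positive, and all scores are distinct (ties are not handled).

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points of the ROC curve."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, label in ranked:
        if label == 1:
            tp += 1   # one more positive correctly classified
        else:
            fp += 1   # one more negative incorrectly classified
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical classifier scores with their observed labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
for x, y in roc_points(scores, labels):
    print(round(x, 2), round(y, 2))
```

Each point is (1 - specificity, sensitivity) for one threshold. The closer the curve passes to (0.0, 1.0), the better the classifier.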
3.2.6 Cumulative Gain and Lift Charts
Gain or lift is a measure of the effectiveness of a classification model. It is calculated as the ratio between the results obtained with and without the model.
Cumulative gain and lift charts are visual aids for evaluating the performance of a classification model. Both charts consist of a lift curve and a baseline.
For example, an educational institute wants to run a mail marketing drive for a new course. It costs the institute Rs. 1 for each item mailed. The institute has information on 1,00,000 students, of whom 20,000 showed a positive response.
Suppose we use a response model to assign a score to each student.

Prediction of response model.
Cost | Total Number of Students Contacted | Positive Response
10000 | 10000 | 6000
20000 | 20000 | 10000
30000 | 30000 | 13000
40000 | 40000 | 15800
50000 | 50000 | 17000
60000 | 60000 | 18000
70000 | 70000 | 18800
80000 | 80000 | 19400
90000 | 90000 | 19800
1,00,000 | 1,00,000 | 20,000
Cumulative gain chart
- The y axis shows the percentage of positive responses and the x axis shows the percentage of students contacted.
- Baseline : the overall response rate. If the institute contacts n% of the students without a model, it reaches n% of the positive responders.
- Lift curve : using the predictions of the response model, calculate the percentage of positive responses for each percentage of students contacted, e.g. for the first 10% of students, [6000/20000] * 100 = 30 %.
Fig. 3.2.4 : Cumulative gains chart (lift curve and baseline; x-axis : % customers contacted; y-axis : % positive responses)
Lift chart
It shows the actual lift.
By contacting 10% of the students without a model we should get 10% of the responders, and with the model we get 30% of the responders, so the y value of the lift curve is 30/10 = 3. Similarly, contacting 20% of the students gives 50% of the responders, so the lift is 50/20 = 2.5.
- The cumulative gain and lift charts give an idea of which customers to contact.
Fig. 3.2.5 : Lift chart (lift curve and baseline; x-axis : % customers contacted)
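The points of both charts can be computed from the response table. The sketch below uses the table's figures, assuming that the 90,000 row has 19,800 positive responses and that 20,000 is the total number of responders. The function name is hypothetical.

```python
# Students contacted (in score order) and cumulative positive responses.
contacted = [10000, 20000, 30000, 40000, 50000,
             60000, 70000, 80000, 90000, 100000]
responses = [6000, 10000, 13000, 15800, 17000,
             18000, 18800, 19400, 19800, 20000]   # 19800 is an assumed reading

def gain_and_lift(contacted, responses, total_students, total_responders):
    """Return (% contacted, % responders reached, lift) for every table row."""
    rows = []
    for c, r in zip(contacted, responses):
        pct_contacted = 100 * c / total_students
        pct_responders = 100 * r / total_responders   # cumulative gain
        rows.append((pct_contacted, pct_responders,
                     pct_responders / pct_contacted))  # lift over the baseline
    return rows

rows = gain_and_lift(contacted, responses, 100000, 20000)
for pct_contacted, gain, lift in rows:
    print(pct_contacted, gain, round(lift, 2))
```

The second column gives the lift curve of the cumulative gain chart and the third column gives the lift chart. The baseline is a lift of 1 throughout.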
Syllabus Topic : Bayesian Methods
3.3 Bayesian Methods

Q.3.3.1 Write short note on Bayesian methods. (Ref. Sec. 3.3) (4 Marks)
- Bayes' theorem, developed by Reverend Thomas Bayes, is one of the earliest results in probabilistic inference. Bayesian methods are classification techniques based on Bayes' theorem.
- The naive Bayes classifier assumes that the predictors are independent. In simple terms, it assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature.
- For example, a fruit may be considered to be an apple if it is red, round, and about 3 inches in diameter. Even if these features depend on each other or upon the existence of the other features, each of them is treated as contributing independently to the probability that this fruit is an apple, and that is why the method is known as 'Naive'.
$P(class \mid data) = \frac{P(data \mid class) \, P(class)}{P(data)}$
3.3.1 Bayes’ Theorem Implementation
Let us apply Bayes' theorem to a simple example. Suppose we want to find the probability that an individual has high blood pressure, given that he or she was tested for it and got a positive result.
In the medical field, such probabilities play a very important role because they are often used in life and death situations.
We assume the following :
- P(Bp) is the probability of a person having blood pressure.
- Assume 1% of the general population has blood pressure, so P(Bp) = 0.01.
- P(Pos) is the probability of getting a positive test result.
- P(Neg) is the probability of getting a negative test result.
- P(Pos|Bp) is the probability of getting a positive result on a test done for detecting blood pressure, given that you have blood pressure. This has a value of 0.9. In other words, the test is correct 90% of the time for people who have blood pressure. This is also called the Sensitivity or True Positive Rate.
- P(Neg|~Bp) is the probability of getting a negative result on a test done for detecting blood pressure, given that you do not have blood pressure. This also has a value of 0.9, so the test is correct 90% of the time for people who do not have blood pressure. This is also called the Specificity or True Negative Rate.
- The Bayes formula is as follows :
$P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}$
- P(A) is the prior probability of A occurring independently. In our example this is P(Bp). This value is given to us.
- P(B) is the prior probability of B occurring independently. In our example this is P(Pos).
- P(A|B) is the posterior probability that A occurs given B. In our example this is P(Bp|Pos).
- That is, the probability of an individual having blood pressure, given that the individual got a positive test result. This is the value that we are looking to calculate.
- P(B|A) is the likelihood of B occurring, given A. In our example this is P(Pos|Bp). This value is given to us.
- Putting our values into the formula for Bayes' theorem we get :
P(Bp|Pos) = (P(Bp) * P(Pos|Bp)) / P(Pos)
- The probability of getting a positive test result, P(Pos), can be calculated using the Sensitivity and Specificity.
Using sensitivity and specificity, P(Pos) is calculated as follows :
P(Pos) = [P(Bp) * Sensitivity] + [P(~Bp) * (1 - Specificity)]
P(Bp) = probability of having blood pressure = 0.01
P(~Bp) = probability of not having blood pressure = 0.99
Sensitivity = P(Pos|Bp) = probability of a positive result given blood pressure = 0.9
Specificity = P(Neg|~Bp) = probability of a negative result given no blood pressure = 0.9
P(Pos) = (0.01 * 0.9) + (0.99 * 0.1) = 0.108
P(Bp|Pos) = (0.01 * 0.9) / 0.108 = 0.083 (approximately)
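The calculation can be checked with a short Python sketch. The function name is hypothetical. The inputs are the values assumed in the example (prior 0.01, sensitivity 0.9, specificity 0.9).

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """Return (P(Pos), P(condition | Pos)) using Bayes' theorem."""
    # Total probability of a positive test: true positives + false positives.
    p_pos = prior * sensitivity + (1 - prior) * (1 - specificity)
    return p_pos, prior * sensitivity / p_pos

p_pos, p_bp_given_pos = posterior_given_positive(0.01, 0.9, 0.9)
print(round(p_pos, 3), round(p_bp_given_pos, 3))
```

Even with a test that is correct 90% of the time, the posterior probability stays low because the condition is rare in the general population.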
3.3.2 Naive Bayes Classifier (Simplification)
Q.3.3.2 Explain naive Bayes classifier with example. (Ref. Sec. 3.3.2)
- The naive Bayes algorithm reduces the complexity of applying Bayes' theorem by assuming conditional independence of the attributes in the training dataset.
- This assumption is what makes the algorithm naive.
- Given n different attribute values, the likelihood can now be written as,
$P(X_1, \ldots, X_n \mid Y) = \prod_{i=1}^{n} P(X_i \mid Y)$
- The naive Bayes algorithm assumes that a particular feature in a class is independent of, or not related to, the presence of any other feature.
- For example, a fruit may be considered to be an apple if it is red, round, and about 3 inches in diameter. In this case all the properties or features are treated as contributing independently to the probability that this fruit is an apple, and that is why the method is known as 'Naive'.
- In the blood pressure example above, we considered only one feature, the test result. Suppose we add another feature, 'exercise'.
- Let's say this feature has a binary value of 0 or 1, where the former signifies that the individual exercises less than or equal to 2 days a week and the latter signifies that the individual exercises greater than or equal to 3 days a week.
- If we had to use both of these features, namely the test result and the value of the 'exercise' feature, applying Bayes' theorem directly would require the joint likelihood of both features for each class, which becomes hard to estimate as more features are added. Naive Bayes is an extension of Bayes' theorem that assumes that all the features are independent of each other, so that the joint likelihood becomes a product of individual likelihoods.
Advantages
- It is easy and fast to predict the class of a test dataset. It also performs well in multi-class prediction.
- When the assumption of independence holds, a naive Bayes classifier performs better than other models such as logistic regression, and it needs less training data.
- It performs well with categorical input variables compared to numerical variables. For a numerical variable, a normal distribution is assumed (bell curve, which is a strong assumption).
Disadvantages
- If a categorical variable in the test dataset has a category which was not observed in the training dataset, the model assigns it a 0 (zero) probability and is unable to make a prediction. This is often known as "Zero Frequency". One of the simplest smoothing techniques to solve this is Laplace estimation.
- The main limitation of naive Bayes is the assumption of independent predictors. In real life, it is almost impossible to get a set of predictors which are completely independent.
Applications of Naive Bayes Algorithms
- Naive Bayes is very fast, so it is used for making predictions in real time.
- It is used for multi-class prediction. It predicts the probability of multiple classes of the target variable.
- Naive Bayes classifiers are mostly used in text classification, where they give good results on multi-class problems and the independence assumption works well, and they often have a higher success rate than other algorithms. As a result, they are widely used in spam filtering (identifying spam e-mail) and sentiment analysis (in social media analysis, to identify positive and negative customer sentiments).
- A naive Bayes classifier and collaborative filtering together build a recommendation system. It uses machine learning and data mining techniques to predict whether a user would like a given resource or not.
Example of Naive Bayes Classifier
Sr.No | Age | Income | Student | Credit card performance | Class - Buys computer
1 | <30 | High | No | Fair | No
2 | <30 | High | No | Excellent | No
3 | 30 To 59 | High | No | Fair | Yes
4 | >60 | Medium | No | Fair | Yes
5 | >60 | Low | Yes | Fair | Yes
6 | >60 | Low | Yes | Excellent | No
7 | 30 To 59 | Low | Yes | Excellent | Yes
8 | <30 | Medium | No | Fair | No
9 | <30 | Low | Yes | Fair | Yes
10 | >60 | Medium | Yes | Fair | Yes
11 | <30 | Medium | Yes | Excellent | Yes
12 | 30 To 59 | Medium | No | Excellent | Yes
13 | 30 To 59 | High | Yes | Fair | Yes
14 | >60 | Medium | No | Excellent | No
X = (Age = '<30', Income = medium, student = yes, credit_rating = fair)

P(c1) = P(buys_computer = yes) = 9/14 = 0.643
P(c2) = P(buys_computer = no) = 5/14 = 0.357
P(age < 30 | buys_computer = yes) = (number of rows where age < 30 and buys_computer = yes) / (number of rows where buys_computer = yes)
P(age < 30 | buys_computer = yes) = 2/9 = 0.222
P(age < 30 | buys_computer = no) = 3/5 = 0.600
P(income = medium | buys_computer = yes) = 4/9 = 0.444
P(income = medium | buys_computer = no) = 2/5 = 0.400
P(student = yes | buys_computer = yes) = 6/9 = 0.667
P(student = yes | buys_computer = no) = 1/5 = 0.200
P(credit = fair | buys_computer = yes) = 6/9 = 0.667
P(credit = fair | buys_computer = no) = 2/5 = 0.400
To find P(X | buys_computer = yes) :
P(X | buys_computer = yes) = P(age < 30 | yes) * P(income = medium | yes) * P(student = yes | yes) * P(credit = fair | yes)
= 0.222 * 0.444 * 0.667 * 0.667 = 0.044
Similarly, P(X | buys_computer = no) = 0.600 * 0.400 * 0.200 * 0.400 = 0.019
P(X | yes) * P(yes) = 0.044 * 0.643 = 0.028
P(X | no) * P(no) = 0.019 * 0.357 = 0.007
Since 0.028 > 0.007, the naive Bayes classifier predicts buys_computer = yes for X.
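The worked example can be reproduced by counting rows in the table. The sketch below is a minimal illustration: the tuple layout and the function name are assumptions, and no smoothing (such as Laplace estimation) is applied.

```python
from collections import Counter

# (age, income, student, credit, buys_computer) for the 14 rows of the table.
rows = [
    ("<30", "high", "no", "fair", "no"),
    ("<30", "high", "no", "excellent", "no"),
    ("30-59", "high", "no", "fair", "yes"),
    (">60", "medium", "no", "fair", "yes"),
    (">60", "low", "yes", "fair", "yes"),
    (">60", "low", "yes", "excellent", "no"),
    ("30-59", "low", "yes", "excellent", "yes"),
    ("<30", "medium", "no", "fair", "no"),
    ("<30", "low", "yes", "fair", "yes"),
    (">60", "medium", "yes", "fair", "yes"),
    ("<30", "medium", "yes", "excellent", "yes"),
    ("30-59", "medium", "no", "excellent", "yes"),
    ("30-59", "high", "yes", "fair", "yes"),
    (">60", "medium", "no", "excellent", "no"),
]

def naive_bayes_scores(rows, query):
    """Return P(class) * product of P(feature value | class) for every class."""
    class_counts = Counter(row[-1] for row in rows)
    scores = {}
    for cls, count in class_counts.items():
        score = count / len(rows)                      # prior P(class)
        for index, value in enumerate(query):
            matches = sum(1 for row in rows
                          if row[-1] == cls and row[index] == value)
            score *= matches / count                   # P(value | class)
        scores[cls] = score
    return scores

query = ("<30", "medium", "yes", "fair")
scores = naive_bayes_scores(rows, query)
prediction = max(scores, key=scores.get)
print(scores, prediction)
```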
3.3.3 Bayesian Networks
Q.3.3.3 What is Bayesian networks ? (Ref. Sec. 3.3.3) (4 Marks)
- Bayesian networks are a type of Probabilistic Graphical Model. They are used to build models from data and/or expert opinion.
- They can be used for a wide range of tasks including time series prediction, decision making under uncertainty, diagnostics, automated insight, anomaly detection and reasoning.
- A Bayesian network consists of two main components. The first is a directed acyclic graph in which the nodes correspond to the predictive variables and the arcs indicate relationships of stochastic dependence.
- The variable X_j associated with node a_j in the network is dependent on the variables associated with the predecessor nodes of a_j.
- The second component consists of a table of conditional probabilities for each variable. The table associated with the variable X_j indicates the conditional distribution P(X_j | C_j), where C_j represents the set of explanatory variables associated with the predecessor nodes of node a_j in the network. It is estimated from the relative frequencies in the dataset.
Syllabus Topic : Logistic Regression
3.4 Logistic Regression
Q.3.4.1 Write short note on logistic regression. (Ref. Sec. 3.4)
Logistic regression is used to :
o Estimate the probability that an event occurs for a randomly selected observation versus the probability that the event does not occur.
o Predict the effect of variables on a binary response variable.
o Classify observations by estimating the probability that an observation is in a particular category.
o Model the probability of an event occurring depending on the values of the independent variables, which can be categorical or numerical.

Logistic regression is generally used where the dependent variable is binary. That means the dependent variable can take only two possible values such as "Yes or No", "Default or No Default", "Living or Dead", "Responder or Non Responder" etc.
Independent factors or variables can be categorical or numerical variables.
Example of logistic regression
Example 1
- If a credit card company is going to build a model to decide whether to issue a credit card to a customer or not, it will model whether the customer is going to "Default" or "Not Default" on this credit card. This is called "Default Propensity Modeling".
- The probability of any event lies between 0 and 1 (or 0% to 100%). When we plot the probability of the dependent variable against the independent factors, it shows an 'S' shaped curve.
Example 2
- Suppose we have to predict the probability that a given candidate gets admission to a college of his or her choice from the score the candidate receives in the admission test. The dependent variable is binary : "Admission" or "No Admission".
- Since the relationship between the score and the probability of selection is not linear but 'S' shaped, we cannot use a linear model to predict the probability of selection from the score. We need to apply a logit transformation to the dependent variable to make its relationship with the predictor linear. A logistic regression model is then used to predict the probability of getting "Admission".
Fig. 3.4.1 : Graph for selection of college (x-axis : Score in entrance test; y-axis : Probability of selection)
(ey Business Intelligence (MU-B.Sc.-IT-Sem-VI) 3-18 Classification and sestering
The curve in the above graph is called the sigmoid function. It is S-shaped and gives a value p
with 0 < p < 1.
The logistic function is defined as :
$\text{Transformed} = \dfrac{1}{1 + e^{-x}}$
where e is the numerical constant Euler’s number and x is the input we plug into the
function. The logit expression can be written as,
$\log\left(\dfrac{p(x)}{1 - p(x)}\right)$
This expression is called the logit or log odds function. The odds signify the
ratio of the probability of success to the probability of failure.
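The relationship between the logistic function and the log odds can be checked numerically. The following is a minimal sketch in Python (the function names `sigmoid` and `logit` are illustrative and not from the text): it shows that the logistic function always returns a value between 0 and 1, and that taking the log odds of that value recovers the original input.

```python
import math

def sigmoid(x):
    """Logistic function: maps any real x to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Log odds of a probability p (0 < p < 1); the inverse of sigmoid."""
    return math.log(p / (1.0 - p))

# Print a few points of the S-shaped curve and the recovered log odds.
for x in (-4, -2, 0, 2, 4):
    p = sigmoid(x)
    print(f"x = {x:+d}  p = {p:.3f}  logit(p) = {logit(p):+.3f}")
```

Because the log odds are linear in x while p itself is not, a linear model can be fitted on the logit scale and converted back to a probability with the logistic function.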
Syllabus Topic : Neural Networks
3.5 Neural Networks
Q. 3.5.1 Write short note on neural network. (Ref. Sec. 3.5)
A neural network comprises units (neurons) which are arranged in layers. It converts an
input vector into some output.
Each unit takes an input, applies a nonlinear function to it and then passes the output on
to the next layer. Generally the networks are defined to be feed-forward: a unit feeds its
output to all the units on the next layer, but there is no feedback to the previous layer.
Weightings are applied to the signals which pass from one unit to another, and it is these
weightings which are tuned in the training phase to adapt a neural network to the
particular problem at hand.
3.5.1 The Rosenblatt Perceptron
- Perceptrons were popularised by Frank Rosenblatt in the 1960s. They appeared to have a very
powerful learning algorithm.
- A perceptron is a neural network unit (an artificial neuron) which does certain
computations to detect features or business intelligence in the input data.
- It consists of a single neuron with adjustable synaptic weights and bias. It can be used to
classify linearly separable patterns. A simple perceptron can be used to classify into two
classes.
- The perceptron is a supervised learning algorithm for binary classifiers. This algorithm
enables neurons to learn and processes elements in the training set one at a time.
[Figure: perceptron showing the weights, the net input function and the activation function]
Fig. 3.5.1
- There are two types of Perceptrons: Single layer and Multilayer.
- Single layer Perceptrons can learn only linearly separable patterns.
- Multilayer Perceptrons, or feed forward neural networks with two or more layers, have
greater processing power.
Perceptron Function
- A perceptron is a function that maps its input “x”, which is multiplied by the learned
weight coefficients, to an output value “f(x)” :
$f(x) = \begin{cases} 1 & \text{if } w \cdot x + b > 0 \\ 0 & \text{otherwise} \end{cases}$
Where,
“w” = vector of real-valued weights.
“b” = bias (an element that adjusts the boundary away from the origin without any
dependence on the input value).
“x” = vector of input x values.
$w \cdot x = \sum_{i=1}^{m} w_i x_i$
Where, “m” = number of inputs to the Perceptron.
- The output can be represented as “1” or “0”. It can also be represented as “1” or “-1”
depending on which activation function is used.
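The perceptron function can be written in a few lines. This is a minimal sketch in Python, assuming the 1/0 output convention. The weights and bias below are hypothetical values chosen by hand so that the unit behaves like a logical AND gate; they are not taken from the text.

```python
def perceptron_output(weights, bias, inputs):
    """Return 1 if w.x + b > 0, otherwise 0 (step activation)."""
    net_input = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if net_input > 0 else 0

# Hypothetical weights and bias that make the unit act as a logical AND gate.
w, b = [1.0, 1.0], -1.5
results = [perceptron_output(w, b, x) for x in ([0, 0], [0, 1], [1, 0], [1, 1])]
print(results)
```

Only the input (1, 1) gives a net input above zero (1 + 1 - 1.5 = 0.5), so only that input fires the neuron.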
Inputs of a Perceptron
- A Perceptron accepts inputs, moderates them with certain weight values, then applies the
transformation function to output the final result.
- A Boolean output is based on inputs such as salaried, married, age, past credit profile, etc.
It has only two values: Yes and No, or True and False. The summation function “Σ”
multiplies all inputs “x” by weights “w” and then adds them up as follows :
$w_0 + w_1 x_1 + w_2 x_2 + \dots + w_n x_n$
For example: if $\sum w_i x_i > 0$, then final output “o” = 1 (issue bank loan).
Else, final output “o” = -1 (deny bank loan).
- In the Perceptron Learning Rule, the predicted output is compared with the known output.
If it does not match, the error is propagated backward to allow weight adjustment to
happen.
- Perceptron has the following characteristics :
o Perceptron is an algorithm for supervised learning of a single layer binary linear
classifier.
o Optimal weight coefficients are automatically learned.
o Weights are multiplied with the input features and a decision is made whether the neuron is
fired or not.
o The activation function applies a step rule to check if the output of the weighting
function is greater than zero.
o A linear decision boundary is drawn, enabling the distinction between the two linearly
separable classes +1 and -1.
- If the sum of the input signals exceeds a certain threshold, it outputs a signal; otherwise,
there is no output.
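The learning rule can be sketched as follows. This is a minimal Python illustration, not the text's own procedure: the learning rate of 0.1, the 20 passes over the data and the logical OR training set are all assumptions chosen for the example. Whenever the predicted output differs from the known output, each weight is moved by learning rate × error × input.

```python
def predict(weights, bias, x):
    """Step activation on the weighted sum of the inputs."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_perceptron(samples, labels, learning_rate=0.1, epochs=20):
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)   # 0 when the prediction is correct
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias

samples = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [0, 1, 1, 1]                      # logical OR, a linearly separable pattern
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print(predictions)
```

Because OR is linearly separable, the weights stop changing after a few passes and every training example is classified correctly.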
3.5.2 Multi-Level Feed-Forward Networks
A Multilayer Perceptron (MLP) includes at least one hidden layer (in addition to one input layer
and one output layer).
A multi-level feed-forward neural network is a more complex structure than the perceptron,
since it includes input nodes, hidden nodes and output nodes. As an example, consider a neural network with
two input nodes i1 and i2, two hidden neurons h1 and h2, and two output neurons o1 and o2.
Here’s the basic structure :
Fig. 3.5.2
The goal of back propagation is to optimize the weights so that the neural network can
learn how to correctly map arbitrary inputs to outputs.
Input nodes : Input nodes receive as input the values of the explanatory attributes for each
observation. Usually, the number of input nodes equals the number of explanatory
variables.
Hidden nodes : Hidden nodes receive the information from input nodes and transform
the input values inside the network. Each node is connected by outgoing arcs to output
nodes or to other hidden nodes.
Output nodes : Output nodes receive connections from hidden nodes or from input nodes
and return an output value that corresponds to the prediction of the response variable.
Each node of the network has weights which are associated with its input
arcs. Each node is also associated with a distortion or bias coefficient and an activation
function.
The back propagation algorithm is used to train a multi-level feed-forward network.
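The forward pass through a network with two input nodes, two hidden neurons and two output neurons can be sketched as below. This is only an illustration in Python: the sigmoid activation is an assumption, and every weight, bias and input value is a made-up number, not data from the text. Back propagation itself (the weight update) is not shown.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """One layer: each node applies the activation to its weighted inputs plus bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

# All numbers below are made-up illustration values.
inputs = [0.05, 0.10]                       # i1, i2
hidden_w = [[0.15, 0.20], [0.25, 0.30]]     # weights on the arcs into h1 and h2
hidden_b = [0.35, 0.35]
output_w = [[0.40, 0.45], [0.50, 0.55]]     # weights on the arcs into o1 and o2
output_b = [0.60, 0.60]

hidden = layer_forward(inputs, hidden_w, hidden_b)
outputs = layer_forward(hidden, output_w, output_b)
print(hidden, outputs)
```

Training would compare `outputs` with the known target values and push the resulting error back through the same arcs to adjust the weights.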

Syllabus Topic : Support Vector Machines
3.6 Support Vector Machine
Q. 3.6.1 Write short note on support vector machine. (Ref. Sec. 3.6)
The simplest way to describe an SVM is as a binary classifier. It attempts to find a hyperplane
that separates two classes of data by the largest margin. Quazi Marufur Rahman gives a
very good example of what a margin is, and Janice Gates points out the kernel trick. The kernel
trick is the most important part of SVM; it distinguishes SVM from other classifiers.
3.6.1. Structural Risk Minimization
- Structural Risk Minimization (SRM) (Vapnik and Chervonenkis, 1974) is an inductive
principle for model selection used for learning from finite training data sets.
- It describes a general model of capacity control and provides a trade-off between
hypothesis space complexity (the VC dimension of approximating functions) and the
quality of fitting the training data.
- Suppose we have two dimensional data with different features x1, x2.
Fig. 3.6.1
- The above data can be divided into two classes, class 1 and class 2. The data is
linearly separable.
[Figure: linearly separable data with Class +1 on one side of the line f(x) = f(x1, x2) = 0 and the region f(x) < 0 on the other]
Fig. 3.6.2
- A straight line classifies the data into two classes. Its equation is f(x) = f(x1, x2) = 0.
- The classifier is called a linear classifier. The data used to build it is called the training set.
[Figure: unseen pattern (test data) placed among the two classes]
Fig. 3.6.3
- Suppose we have an unseen data set. If the value of f(x) for an unseen pattern is less than 0, it is classified
as class -1.
- Now we are in a position to define two quantities: training error and test error.
- Since the classifier learns the distribution of the data during the training phase, a low
training error is required. The test error should also be low, because it reflects how unseen
data patterns are handled.
- We always have to look at the test error along with the training error.
- Improving the training error does not always improve the test error.
- An increase in machine capacity may result in poor test performance.
- It is difficult to estimate the true test error of a classifier.
- To ensure a low test error of the classifier, the VC dimension is used :
$\text{Test error} \le \text{Training error} + \sqrt{\dfrac{h\left(\log\left(\frac{2m}{h}\right) + 1\right) - \log\left(\frac{\eta}{4}\right)}{m}}$
- It gives an upper bound on the test error with probability 1 - η.
Where, m = number of training samples.
h = VC dimension, which is related to the capacity of the machine.
VC (Vapnik-Chervonenkis) bound => test error ≤ training error + complexity
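The complexity term of the bound, test error ≤ training error + complexity, can be evaluated numerically. The sketch below is in Python and assumes the standard form of the Vapnik bound, sqrt((h(log(2m/h) + 1) - log(η/4)) / m), with natural logarithms. The names `h`, `m` and `eta`, and the sample values, are illustrative assumptions.

```python
import math

def vc_complexity(h, m, eta=0.05):
    """Complexity term sqrt((h * (ln(2m/h) + 1) - ln(eta/4)) / m).

    h: VC dimension, m: number of training samples,
    eta: the bound holds with probability 1 - eta.
    """
    return math.sqrt((h * (math.log(2 * m / h) + 1) - math.log(eta / 4)) / m)

m = 1000   # assumed number of training samples
for h in (5, 20, 80, 320):
    print(h, round(vc_complexity(h, m), 3))
```

For a fixed sample size the complexity term grows with the VC dimension, which is why reducing the training error by using a higher-capacity machine does not necessarily reduce the bound on the test error.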
- The graph of VC dimension with fixed sample size.
[Figure: upper bound on test error, training error and complexity plotted against VC dimension]
Fig. 3.6.4
As we increase the VC dimension, the training error is reduced. Complexity increases
with the VC dimension.
The upper bound first decreases and later increases. For an efficient classifier, the
value of the test error should be minimum. To achieve this, the sum of the penalty (complexity)
error and the training error should be minimum.
Points in general position
In an n-dimensional feature space, a set of m points (m > n) is in general position if and only if no subset
of (n + 1) points lies on an (n - 1)-dimensional hyperplane.
For example, with n = 2 and m = 4 we have m > n and n + 1 = 3.
Fig. 3.6.5
- If we add one more point, three points lie on a straight line.
Fig. 3.6.6
So we can say that the above 3 points are not in general position.
Shattering
A hypothesis (H) shatters m points in n-dimensional space if all possible labellings of the m
points in n-dimensional space are correctly classified by H.

Fig. 3.6.7
There are $2^3 = 8$ possible arrangements, as each of the three points can take one of two values, 0 or 1.
So the VC dimension is the cardinality of the largest set of points that the hypothesis can shatter.
The VC dimension of a linear classifier is (n + 1) {points should be in general position}.
For a non-linear classifier the VC dimension is difficult to compute. The VC dimension is directly
related to the machine/hypothesis capacity, and it gives a probabilistic upper
bound on the test error.
3.6.2 Maximal Margin Hyperplane for Linear Separation
- The following is an example of a hyper plane that separates training instances with no
errors.
Fig. 3.6.8
- There are multiple hyper planes which can be chosen for separating the two
classes of data points.
- For the maximum margin hyper plane only examples on the margin matter (only these
affect the distances). These are called support vectors.
Definition
Define the hyper planes H such that :
$w \cdot x_i + b \ge +1$, when $y_i = +1$.
$w \cdot x_i + b \le -1$, when $y_i = -1$.
H1 and H2 are the planes :
$H_1 : w \cdot x_i + b = +1$
$H_2 : w \cdot x_i + b = -1$
The points on the planes H1 and H2 are the tips of the support vectors.
The plane H0 is the median in between, where $w \cdot x_i + b = 0$.
$d^+$ = the shortest distance to the closest positive point.
$d^-$ = the shortest distance to the closest negative point.
The margin (gutter) of a separating hyper plane is $d^+ + d^-$.
Fig. 3.6.10
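When the two margin planes are written as w·x + b = +1 and w·x + b = -1, the margin d+ + d- equals 2/||w||. The short Python sketch below checks this on a hypothetical plane (the values of `w`, `b` and the two points are assumptions made for illustration only).

```python
import math

def norm(w):
    return math.sqrt(sum(wi * wi for wi in w))

def margin_width(w):
    """Distance between the planes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / norm(w)

def signed_distance(w, b, x):
    """Signed distance of point x from the plane w.x + b = 0."""
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm(w)

w, b = [1.0, 1.0], -3.0                             # hypothetical plane x1 + x2 - 3 = 0
d_plus = signed_distance(w, b, [2.0, 2.0])          # point on H1 (w.x + b = +1)
d_minus = -signed_distance(w, b, [1.0, 1.0])        # point on H2 (w.x + b = -1)
print(margin_width(w), d_plus + d_minus)
```

Maximising the margin is therefore equivalent to minimising the length of the weight vector w, subject to every training point lying on the correct side of its margin plane.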
3.6.3. Nonlinear Separation
- Nonlinear Classification : Classes may not be separable by a linear boundary.
- Kernels : Make linear models work in nonlinear settings by mapping data to higher
dimensions where it exhibits linear patterns.
- The simplest way to separate two groups of data is with a straight line, a flat plane or an
N-dimensional hyper plane. However, there are situations where a non-linear region can
separate the groups more efficiently.
- SVM handles this by using a (non-linear) kernel function to map the data into a different
space where a (linear) hyper plane can be used to do the separation.
- It means a non-linear function is learned by a linear learning machine in a high dimensional
feature space, while the capacity of the system is controlled by a parameter that does not
depend on the dimensionality of the space.
- This is called the kernel trick, which means the kernel function transforms the data into a higher
dimensional feature space to make it possible to perform the linear separation.
- The kernel function maps the data into a new space and takes the inner product of the new vectors. The
image of the inner product of the data is the inner product of the images of the data. Two
kernel functions are shown below :
Polynomial kernel
$k(x_i, x_j) = (x_i \cdot x_j + 1)^d$
Gaussian kernel
$k(x_i, x_j) = \exp\left(-\dfrac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right)$
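Both kernels can be computed directly from a pair of input vectors, without constructing the higher dimensional space explicitly. The following Python sketch assumes the usual forms (x·y + 1)^d for the polynomial kernel and exp(-||x - y||² / 2σ²) for the Gaussian kernel. The parameter names `degree` and `sigma` and the sample vectors are illustrative.

```python
import math

def polynomial_kernel(x, y, degree=2):
    """(x . y + 1) ** degree"""
    return (sum(a * b for a, b in zip(x, y)) + 1) ** degree

def gaussian_kernel(x, y, sigma=1.0):
    """exp(-||x - y||^2 / (2 * sigma^2))"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2 * sigma ** 2))

x, y = [1.0, 2.0], [2.0, 0.0]
print(polynomial_kernel(x, y))   # (1*2 + 2*0 + 1) ** 2
print(gaussian_kernel(x, y))     # exp(-5 / 2)
print(gaussian_kernel(x, x))     # identical points give the maximum value
```

The Gaussian kernel behaves like a similarity score: it is 1 for identical points and falls towards 0 as the points move apart.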
3.7 Clustering
Cluster analysis or clustering is the task of grouping a set of objects in such a way that
objects in the same group (called a cluster) are more similar (in some sense) to each other than
to those in other groups (clusters).
Syllabus Topic : Clustering Methods
3.7.1 Clustering Methods
Q. 3.7.1 What are the characteristics of clustering method? (Ref. Sec. 3.7.1) (4 Marks)
Clustering methods must satisfy a few general necessities, as indicated below.
Clustering Methods Necessities
1. Flexibility
2. Robustness
3. Efficiency
Fig. 3.7.1 : Clustering Methods Necessities
1. Flexibility
- Some clustering methods can be used on numerical attributes only. In such
cases the Euclidean metric is used most of the time to determine the distances between
observations.
- A flexible clustering algorithm should also be able to analyse datasets containing categorical
attributes.
2. Robustness
The robustness of an algorithm is the stability of the clusters generated with respect to
small changes in the values of the attributes of each observation.
3. Efficiency
In some applications there is a large number of observations. In such cases clustering
algorithms must generate clusters efficiently in order to guarantee reasonable computing times
for large problems.
3.7.2 Taxonomy of Clustering Methods
Q. 3.7.2 What is taxonomy of clustering method? (Ref. Sec. 3.7.2) (4 Marks)
The different types of clustering, based on the logic used, are partition methods, hierarchical
methods, density based methods and grid methods.
Types of Clustering
1. Partition methods
2. Hierarchical methods
3. Density based methods
4. Grid methods
Fig. 3.7.2 : Types of Clustering
1. Partition methods
Partition methods divide the given dataset into a predetermined number K of
non-empty subsets. They tend to generate clusters of spherical or at most convex shape.
2. Hierarchical methods
Hierarchical methods organise the subsets into a tree structure. Clusters are characterized by
different homogeneity thresholds. A predetermined number of clusters is not required.
3. Density-based methods
Hierarchical and partition methods are founded on the distance between observations.
Density-based methods determine clusters from the number of observations locally falling
in a neighbourhood of each observation.
For each member which belongs to a specific cluster, a neighbourhood with a specified
diameter should contain a number of observations which should not be less than a
minimum threshold value.
Density-based methods can identify clusters of non-convex shape and help to isolate
any possible outliers.
4. Grid methods
Grid methods obtain a grid structure consisting of cells. The grid structure is used to
reduce computing times, despite a lower accuracy in the clusters generated.
3.7.3 Affinity Measures
In hierarchical clustering, pairs of clusters are repeatedly linked so that every
data object is included in the hierarchy. To determine the similarity between the clusters,
distance functions such as the Manhattan and Euclidean distance functions are used.
3.7.3.1 Distance Functions
- Given two p-dimensional data objects $i = (x_{i1}, x_{i2}, \dots, x_{ip})$ and $j = (x_{j1}, x_{j2}, \dots, x_{jp})$, the
following common distance functions can be defined :
1. Euclidean Distance Function
$d(i, j) = \sqrt{|x_{i1} - x_{j1}|^2 + |x_{i2} - x_{j2}|^2 + \dots + |x_{ip} - x_{jp}|^2}$
2. Manhattan Distance Function
$d(i, j) = |x_{i1} - x_{j1}| + |x_{i2} - x_{j2}| + \dots + |x_{ip} - x_{jp}|$
- Distances are never negative. In the Euclidean distance function, attributes with
larger scales of measurement may dominate attributes measured on a smaller scale. To
prevent this problem, the attribute values are often normalized to lie between 0 and 1.
- A third option, the Minkowski distance, generalizes both the Euclidean and Manhattan metrics. It is
defined as,
$\text{dist}(i, j) = \left( \sum_{k=1}^{p} |x_{ik} - x_{jk}|^q \right)^{1/q}$
where q = 1 gives the Manhattan distance and q = 2 gives the Euclidean distance.
- Example :
To calculate the distance between two points p (x1, y1) and q (x2, y2) in the xy-plane.
Fig. 3.7.3
The Manhattan distance between two points is the sum of the (absolute) differences of their
coordinates. E.g. it counts 1 unit for a straight move, and it counts a cost of 2 if one takes
a diagonal move.
Manhattan Distance
[Figure: 3 x 3 grid of Manhattan distances from the centre cell, 1 for each straight neighbour and 2 for each diagonal neighbour]
$|x_1 - x_2| + |y_1 - y_2|$
Fig. 3.7.4
In chess, the distance between squares on the chessboard for rooks is measured in
Manhattan distance.
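The three distance functions can be written as one function, because the Manhattan and Euclidean distances are the Minkowski distance with exponent 1 and 2 respectively. The following is a minimal Python sketch; the function names and the two sample points are illustrative and not from the text.

```python
def minkowski(a, b, q):
    """Minkowski distance: (sum of |a_k - b_k| ** q) ** (1 / q)."""
    return sum(abs(x - y) ** q for x, y in zip(a, b)) ** (1.0 / q)

def manhattan(a, b):
    return minkowski(a, b, 1)

def euclidean(a, b):
    return minkowski(a, b, 2)

p, q_point = (1, 2), (4, 6)
print(manhattan(p, q_point))   # |1 - 4| + |2 - 6|
print(euclidean(p, q_point))   # sqrt(3**2 + 4**2)
```

For these two points the Manhattan distance is 7 while the Euclidean (straight line) distance is 5, which shows that the choice of distance function changes which observations count as close.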
3.7.4 Attribute
An attribute is a data field which represents a characteristic or feature of a data object.
The nouns attribute, dimension, feature, and variable are often used interchangeably in the
literature.
In data warehousing, attributes are referred to as dimensions. In machine learning literature an
attribute is referred to as a feature, while statisticians use the term variable.
Data mining and database professionals commonly use the term attribute. Attributes
describing a customer object can include, for example, customer ID, name, and address.
A univariate distribution involves only one attribute. The distribution of data having two
attributes is known as bivariate.
The type of an attribute is determined by the set of possible values the attribute can have.
Attributes can be nominal, binary, ordinal, or numeric. In the following subsections, we
introduce each type.
Types of Attribute
1. Binary
2. Nominal
3. Ordinal
4. Mixed Composition Attribute
Fig. 3.7.5
1. Binary Attributes
Q. 3.7.3 Write short note on Binary attribute. (Ref. Sec. 3.7.5(1)) (5 Marks)
- A binary attribute is a nominal attribute with only two categories or states: 0 or 1.
- 0 means the attribute is absent and 1 means it is present. Binary attributes are referred to as
Boolean if the two states correspond to true and false.
- E.g. for the attribute Smoker describing a patient object, 1 indicates that the patient smokes, while 0
indicates that the patient does not.
- A similarity measure for two objects, i and j, will typically return the value 0 if the objects
are unalike. The higher the similarity value, the greater the similarity between objects.
(Typically, a value of 1 indicates complete similarity, that is, that the objects are
identical.)
- A dissimilarity measure works the opposite way. It returns a value of 0 if the objects are
the same (and therefore, far from being dissimilar). The higher the dissimilarity value, the
more dissimilar the two objects are.
- A nominal attribute can take on two or more states. For example, flower color is a
nominal attribute that may have, say, five states: red, yellow, green, pink, and blue.
- Let the number of states of a nominal attribute be M. The states can be denoted by letters,
symbols, or a set of integers, such as 1, 2, ..., M. The dissimilarity between two objects i
and j can be computed based on the ratio of mismatches :
$d(i, j) = \dfrac{p - m}{p}$
- Where m is the number of matches (i.e., the number of attributes for which i and j are in
the same state), and p is the total number of attributes describing the objects. Weights can
be assigned to increase the effect of m or to assign greater weight to the matches in
attributes having a larger number of states.
- There is another approach which involves computing a dissimilarity matrix from the
given binary data.
Table 3.7.1 : A contingency table for binary attributes
             Object j
Object i |   1   |   0   |  sum
    1    |   q   |   r   | q + r
    0    |   s   |   t   | s + t
   sum   | q + s | r + t |   p
Where q is the number of attributes that equal 1 for both objects i and j. r is the number of
attributes that equal 1 for object i but that are 0 for object j. s is the number of attributes
that equal 0 for object i but equal 1 for object j. And t is the number of attributes that
equal 0 for both objects i and j.
The total number of attributes is p, where p = q + r + s + t.
Recall that for symmetric binary attributes, each state is equally valuable.
If objects i and j are described by symmetric binary attributes, then the dissimilarity
between i and j is,
$d(i, j) = \dfrac{r + s}{q + r + s + t}$
The above equation states the degree of dissimilarity between pairs (i, j) of observations; the
coefficient of similarity is its complement, $1 - d(i, j)$.
Assume that all n attributes are binary and asymmetric. In such a case, for a pair of
asymmetric attributes it is more interesting to match positives, that is, records possessing the property
relative to each attribute.
For asymmetric binary variables, the Jaccard coefficient is therefore used :
$d(i, j) = \dfrac{r + s}{q + r + s}$
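The contingency counts q, r, s and t and the two dissimilarities can be computed as below. This is a minimal Python sketch: the two objects are hypothetical binary records invented for the example, and the function names are illustrative.

```python
def contingency(i, j):
    """Return (q, r, s, t) for two equal-length lists of 0/1 values."""
    q = sum(1 for a, b in zip(i, j) if a == 1 and b == 1)
    r = sum(1 for a, b in zip(i, j) if a == 1 and b == 0)
    s = sum(1 for a, b in zip(i, j) if a == 0 and b == 1)
    t = sum(1 for a, b in zip(i, j) if a == 0 and b == 0)
    return q, r, s, t

def symmetric_dissimilarity(i, j):
    q, r, s, t = contingency(i, j)
    return (r + s) / (q + r + s + t)

def jaccard_dissimilarity(i, j):
    # Negative matches (t) are ignored; undefined when both objects are all zeros.
    q, r, s, _ = contingency(i, j)
    return (r + s) / (q + r + s)

obj_i = [1, 0, 1, 0, 0, 0]   # hypothetical binary attributes of object i
obj_j = [1, 0, 0, 1, 0, 0]   # hypothetical binary attributes of object j
print(contingency(obj_i, obj_j))
print(symmetric_dissimilarity(obj_i, obj_j), jaccard_dissimilarity(obj_i, obj_j))
```

Here q = 1, r = 1, s = 1 and t = 3, so the symmetric measure is 2/6 while the Jaccard measure is 2/3: leaving out the three shared absences makes the objects look much less alike.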
> 2. Nominal Attribute
Q. 3.7.4 Write short note on Nominal attribute. (Ref. ...) (4 Marks)
Nominal means “relating to names.” The values of a nominal attribute are symbols or names of
things. Each value denotes some kind of category, code, or state. Nominal attributes are
also referred to as categorical. In computer science, the values are also known as
enumerations.
Example : Suppose that Hair color and Marital status are two attributes
describing person objects. In our application, possible values for Hair color are black,
brown, blond, red, auburn, grey, and white.
A nominal attribute is a symmetric attribute where the number of possible values is greater than 2. We use
the similarity coefficient in extended form, dist (i, j) = (n - f)/n
Where n is the total number of attributes and f is the number of attributes in which observations i and j take the same value.
~> 3. Ordinal Attribute
Q. 3.7.5 Write short note on Ordinal attribute. (Ref. Sec. 3.7.5(3)) (4 Marks)
The values of an ordinal attribute have a meaningful order or ranking
among them. The magnitude between consecutive values is not known.
Suppose that Drink size corresponds to the size of drinks available at a restaurant. This
ordinal attribute has three possible values: small, medium, and large. However, we
cannot tell from the values how much bigger, say, a large is than a medium.
- An ordinal variable can be discrete or continuous.
- Order is important, and the variable can be treated like an interval-scaled one.
- Replace the value of the ordinal variable f by its rank $r_{if} \in \{1, \dots, M_f\}$.
- Map the range of each variable onto [0, 1] :
$z_{if} = \dfrac{r_{if} - 1}{M_f - 1}$
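4. Mixed Composition Attribute
A dataset may contain all attribute types: nominal, ordinal, symmetric binary, asymmetric
binary etc. To define an overall affinity measure of the similarity between
observations i and j, one can use a weighted formula as follows,
$d(i, j) = \dfrac{\sum_{f=1}^{p} \delta_{ij}^{(f)} d_{ij}^{(f)}}{\sum_{f=1}^{p} \delta_{ij}^{(f)}}$
where $\delta_{ij}^{(f)}$ is the weight of attribute f for the pair (i, j) and $d_{ij}^{(f)}$ is the contribution of attribute f to the dissimilarity.
If f is numeric, $d_{ij}^{(f)}$ is the normalized distance.
If f is binary or nominal, $d_{ij}^{(f)} = 0$ if $x_{if} = x_{jf}$, otherwise $d_{ij}^{(f)} = 1$.
If f is ordinal, compute the rank $r_{if}$ and treat $z_{if}$ as numeric, where
$z_{if} = \dfrac{r_{if} - 1}{M_f - 1}$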
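A combined dissimilarity over attributes of different types can be sketched as below. This is a simplified Python illustration under stated assumptions: numeric values are taken to be already scaled to [0, 1], ordinal values are first converted with z = (r - 1)/(M - 1), every attribute that is present in both objects gets weight 1, and an attribute with a missing value (`None`) gets weight 0. The function names and the sample records are hypothetical.

```python
def ordinal_to_unit(rank, num_states):
    """Map a rank r in {1, ..., M} onto [0, 1] using z = (r - 1) / (M - 1)."""
    return (rank - 1) / (num_states - 1)

def mixed_dissimilarity(a, b, kinds):
    """kinds[f] is 'nominal' (also used for binary) or 'numeric' (scaled to [0, 1])."""
    total, weight_sum = 0.0, 0
    for x, y, kind in zip(a, b, kinds):
        if x is None or y is None:          # assumed rule: missing value, weight 0
            continue
        if kind == "nominal":
            d = 0.0 if x == y else 1.0
        else:
            d = abs(x - y)
        total += d
        weight_sum += 1
    return total / weight_sum

# Hypothetical records: (colour, drink size as rank out of 3, scaled income)
kinds = ("nominal", "numeric", "numeric")
a = ("red", ordinal_to_unit(1, 3), 0.2)     # small drink
b = ("blue", ordinal_to_unit(3, 3), 0.5)    # large drink
print(mixed_dissimilarity(a, b, kinds))
```

The result is the average of the per-attribute dissimilarities, here (1 + 1 + 0.3) / 3, so each attribute contributes a value between 0 and 1 regardless of its type.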
Syllabus Topic : Partition Methods
3.8 Partition Methods
Partition methods are heuristic in nature. They are based on greedy methods, where at each
step they make the choice that locally appears the most advantageous.
There is no guarantee that a globally optimal clustering will be obtained, but a good subdivision
is obtained for the majority of datasets. The K-means method and the K-medoids method are two of
the best-known partition algorithms.
3.8.1 K-means algorithm
Q. 3.8.1 Explain K-means method. (Ref. Sec. 3.8.1) (4 Marks)
K-means clustering is an algorithm used to classify or group objects, based on
their features or attributes, into k groups.
K is a positive integer. The grouping is done by minimizing the sum of squares of distances
between the data and the corresponding cluster centroid.
The example below assumes two clusters, and each individual's scores include two variables.
In non-hierarchical clustering, such as the k-means algorithm, the relationship between
clusters is undetermined. Distance functions, such as the Manhattan and Euclidean distance
functions, are used to determine similarity.
Distance Functions :
Given two p-dimensional data objects $i = (x_{i1}, x_{i2}, \dots, x_{ip})$ and $j = (x_{j1}, x_{j2}, \dots, x_{jp})$, the
following common distance functions can be defined :
Euclidean Distance Function :
$d(i, j) = \sqrt{|x_{i1} - x_{j1}|^2 + |x_{i2} - x_{j2}|^2 + \dots + |x_{ip} - x_{jp}|^2}$
Manhattan Distance Function :
$d(i, j) = |x_{i1} - x_{j1}| + |x_{i2} - x_{j2}| + \dots + |x_{ip} - x_{jp}|$
Steps of k-means Algorithm :
1. Choose k clusters arbitrarily.
2. Initialize cluster centres with those k clusters.
3. Loop :
a) Partition by assigning or reassigning all data objects to their closest cluster centre.
b) Compute new cluster centres as the mean value of the objects in each cluster.
c) Repeat until there is no change in the cluster centre calculation.
Example of implementation of the k-means algorithm using k = 2 (partitions)
Individual | Variable 1 | Variable 2
    1      |    1.0     |    1.0
    2      |    1.5     |    2.0
    3      |    3.0     |    4.0
    4      |    5.0     |    7.0
    5      |    3.5     |    5.0
    6      |    4.5     |    5.0
    7      |    3.5     |    4.5
Step 1 :
We randomly choose two centroids for k = 2.
In this case the two centroids are c1 and c2, where c1 = (1.0, 1.0) and c2 = (5.0, 7.0).
        | Individual | Mean vector
Group 1 |     1      | (1.0, 1.0)
Group 2 |     4      | (5.0, 7.0)
For example, for individual 2 = (1.5, 2.0) :
$d(m_1, 2) = \sqrt{|1.0 - 1.5|^2 + |1.0 - 2.0|^2} = 1.12$
$d(m_2, 2) = \sqrt{|5.0 - 1.5|^2 + |7.0 - 2.0|^2} = 6.10$

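Step 2 :
Each distance is computed as $d(m, (x, y)) = \sqrt{(m_x - x)^2 + (m_y - y)^2}$.
Individual    | Distance to centroid 1 | Distance to centroid 2
1             |         0              |        7.21
2 (1.5, 2.0)  |         1.12           |        6.10
3             |         3.61           |        3.61
4             |         7.21           |        0
5             |         4.72           |        2.50
6             |         5.31           |        2.06
7             |         4.30           |        2.92
We obtain two clusters containing {1, 2, 3} and {4, 5, 6, 7} (individual 3 is equidistant from
both centroids and is placed in cluster 1). The new mean vectors are :
$m_1 = \left(\frac{1}{3}(1.0 + 1.5 + 3.0), \frac{1}{3}(1.0 + 2.0 + 4.0)\right) = (1.83, 2.33)$ = cluster 1
$m_2 = \left(\frac{1}{4}(5.0 + 3.5 + 4.5 + 3.5), \frac{1}{4}(7.0 + 5.0 + 5.0 + 4.5)\right) = (4.12, 5.38)$ = cluster 2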
We are still not sure that each individual has been assigned to the right cluster. So, we
compare each individual’s distance to its own cluster mean and to that of the opposite cluster.
And we find :
Individual | Distance to mean (centroid) of Cluster 1 | Distance to mean (centroid) of Cluster 2
    1      |                  1.5                     |                  5.4
    2      |                  0.4                     |                  4.3
    3      |                  2.1                     |                  1.8
    4      |                  5.7                     |                  1.8
    5      |                  3.2                     |                  0.7
    6      |                  3.8                     |                  0.6
    7      |                  2.8                     |                  1.1
Individual 3 is closer to the mean of the opposite cluster (Cluster 2) than to its own
(Cluster 1). In other words, each individual's distance to its own cluster mean should be
smaller than the distance to the other cluster's mean (which is not the case with individual 3).
Thus, individual 3 is relocated to Cluster 2, resulting in the new partition :
          | Individual    | Mean vector (centroid)
Cluster 1 | 1, 2          | (1.25, 1.5)
Cluster 2 | 3, 4, 5, 6, 7 | (3.9, 5.1)
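The worked example can be reproduced with a short program. The sketch below is a plain Python version of the assign-and-recompute loop (it assumes Python 3.8 or later for `math.dist`, and that no cluster ever becomes empty). It uses the seven individuals and the two starting centroids (1.0, 1.0) and (5.0, 7.0) from the example, and breaks ties in favour of the first cluster.

```python
import math

def mean(members, points):
    """Mean vector of the individuals (1-based numbers) in one cluster."""
    coords = [points[i - 1] for i in members]
    return tuple(sum(axis) / len(coords) for axis in zip(*coords))

def kmeans(points, centroids):
    while True:
        clusters = [[] for _ in centroids]
        for number, point in enumerate(points, start=1):
            distances = [math.dist(point, c) for c in centroids]
            clusters[distances.index(min(distances))].append(number)   # closest centre
        new_centroids = [mean(members, points) for members in clusters]
        if new_centroids == centroids:        # no change in cluster centres: stop
            return clusters, centroids
        centroids = new_centroids

points = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0),
          (3.5, 5.0), (4.5, 5.0), (3.5, 4.5)]
clusters, centroids = kmeans(points, [(1.0, 1.0), (5.0, 7.0)])
print(clusters)
print(centroids)
```

The loop first groups individuals 1, 2 and 3 together, then moves individual 3 to the second cluster once the means have been recomputed, and stops at the partition {1, 2} and {3, 4, 5, 6, 7}.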
3.8.2 K-medoids Algorithm
Q. 3.8.2 Explain K-medoids algorithm. (Ref. Sec. 3.8.2) (5 Marks)
- K-means tries to minimize the total squared error, while k-medoids minimizes the sum of
dissimilarities between points labelled to be in a cluster and a point designated as the
center of that cluster. In contrast to the k-means algorithm, k-medoids chooses data points
as centers.
- Instead of taking the mean value of the objects in a cluster as the reference point, medoids can be
used. A medoid is the most centrally located object in a cluster.
- K-medoids is also called the Partitioning Around Medoids (PAM) algorithm.
- All the items from the input data set are examined one by one to see whether they are medoids or
not.
1. Initialize : arbitrarily select k out of the n data points as the medoids.
2. Associate each data point to the nearest medoid.
3. For each medoid m and each data point h associated to m, swap m and h and compute
the total cost of the configuration (that is, the average dissimilarity of h to all the data points
associated to m). Select the medoid h with the lowest cost of the configuration.
- Repeat alternating steps 2 and 3 until there is no change in the assignments.
- In simpler terms, for each pair of a medoid m and a non-medoid object h, measure
whether h is better than m as a medoid.
- Use the squared-error criterion :
$E = \sum_{i=1}^{k} \sum_{p \in C_i} d(p, m_i)^2$
where $C_i$ is the cluster represented by medoid $m_i$.
- Compute $E_h - E_m$.
- Choose the minimum swapping cost.
Four Swapping Cases
- When a medoid m is to be swapped with a non-medoid object h, check each of the other
non-medoid objects j.
j is in the cluster of m => reassign j.
Case 1 : j is closer to some k than to h; after swapping m and h, j relocates to the cluster
represented by k.
$C_{jmh} = d(j, k) - d(j, m) \ge 0$
Case 2 : j is closer to h than to k; after swapping m and h, j is in the cluster represented by h.
$C_{jmh} = d(j, h) - d(j, m)$
j is in the cluster of some k, not m => compare k with h.
Case 3 : j is closer to some k than to h; after swapping m and h, j remains in the cluster
represented by k.
$C_{jmh} = d(j, k) - d(j, k) = 0$
Case 4 : j is closer to h than to k; after swapping m and h, j is in the cluster represented by h.
$C_{jmh} = d(j, h) - d(j, k) < 0$
The K-medoids algorithm requires a large number of iterations and is not suited to
deriving clusters for large datasets.
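The swap search can be sketched as follows. This is a simplified Python illustration of the PAM idea rather than an exact transcription of the steps above: the cost of a configuration is taken to be the sum of distances from every point to its nearest medoid (an assumption, using plain rather than squared distances), every medoid and non-medoid pair is tried, and the swap with the lowest total cost is accepted only if it reduces the cost. The data points and starting medoids are borrowed from the k-means example for illustration. `math.dist` requires Python 3.8 or later.

```python
import math

def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)

def pam(points, medoids):
    medoids = list(medoids)
    cost = total_cost(points, medoids)
    while True:
        # Every configuration obtained by swapping one medoid m with one non-medoid h.
        swaps = [
            [h if x == m else x for x in medoids]
            for m in medoids for h in points if h not in medoids
        ]
        best = min(swaps, key=lambda candidate: total_cost(points, candidate))
        best_cost = total_cost(points, best)
        if best_cost >= cost:          # no swap lowers the cost: stop
            return medoids, cost
        medoids, cost = best, best_cost

points = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0),
          (3.5, 5.0), (4.5, 5.0), (3.5, 4.5)]
start = [(1.0, 1.0), (5.0, 7.0)]
medoids, cost = pam(points, start)
print(medoids, round(cost, 2))
```

Every pass evaluates k × (n - k) candidate swaps and each evaluation scans all n points, which illustrates why the method becomes expensive on large datasets.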
Syllabus Topic : Hierarchical Methods
3.9 Hierarchical Methods
Q. 3.9.1 Explain single linkage, complete linkage, average linkage and ward distance.
(Ref. Sec. 3.9) (5 Marks)

- Hierarchical clustering generates a hierarchy of clusters. There is no need to specify k. It is more
deterministic.
- The graphical representation of the resulting hierarchy is a tree-structured graph called a
dendrogram.
- In order to calculate the distance between two clusters, the hierarchical algorithms resort
to one of five alternative measures: minimum distance, maximum distance, mean
distance, distance between centroids, and Ward distance.
Types of Hierarchical Clustering
1. Single Linkage
2. Complete Linkage
3. Average Linkage
Fig. 3.9.1 : Types of hierarchical clustering
1. Single linkage
In single linkage hierarchical clustering, the distance between two clusters is defined as the
shortest distance between two points, one in each cluster.
For example, the distance between clusters “r” and “s” is equal to the length of the arrow
between their two closest points.
$L(r, s) = \min(D(x_{ri}, x_{sj}))$
Fig. 3.9.2
2. Complete linkage
In complete linkage hierarchical clustering, the distance between two clusters is defined as the
longest distance between two points, one in each cluster.
For example, the distance between clusters “r” and “s” is equal to the length of the arrow
between their two furthest points.
$L(r, s) = \max(D(x_{ri}, x_{sj}))$
Fig. 3.9.3
3. Average Linkage
In average linkage hierarchical clustering, the distance between two clusters is defined as the
average distance between each point in one cluster and every point in the other cluster.
For example, the distance between clusters “r” and “s” is equal to the average
length of the arrows connecting the points of one cluster to the other.
Ward distance
The Ward distance is based on the analysis of the variance of the Euclidean distances
between the observations.
Methods based on the Ward distance tend to generate a large number of clusters, each
containing a few observations.
Centroid Method
In the centroid method, the distance between the two mean vectors of the clusters is considered as
the distance between the two clusters. At each stage of the process we combine the two
clusters that have the smallest centroid distance.
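The single, complete, average and centroid distances between two clusters can be compared on a small example. The following Python sketch uses two hypothetical clusters `r` and `s` of two points each (the function names are illustrative; `math.dist` requires Python 3.8 or later). The Ward distance is not shown.

```python
import math
from itertools import product

def single_linkage(r, s):
    """Shortest distance between a point of r and a point of s."""
    return min(math.dist(a, b) for a, b in product(r, s))

def complete_linkage(r, s):
    """Longest distance between a point of r and a point of s."""
    return max(math.dist(a, b) for a, b in product(r, s))

def average_linkage(r, s):
    """Average distance over all pairs (one point from each cluster)."""
    return sum(math.dist(a, b) for a, b in product(r, s)) / (len(r) * len(s))

def mean_vector(cluster):
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))

def centroid_distance(r, s):
    """Distance between the mean vectors of the two clusters."""
    return math.dist(mean_vector(r), mean_vector(s))

r = [(0.0, 0.0), (1.0, 0.0)]      # hypothetical cluster r
s = [(3.0, 0.0), (5.0, 0.0)]      # hypothetical cluster s
print(single_linkage(r, s), complete_linkage(r, s),
      average_linkage(r, s), centroid_distance(r, s))
```

For these clusters the single linkage distance is 2, the complete linkage distance is 5, and the average and centroid distances are both 3.5, so the choice of measure can change which pair of clusters is merged next.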
- Hierarchical methods can be subdivided into two main groups: agglomerative and divisive
methods.
3.9.1 Agglomerative and Divisive Hierarchical Methods
3.9.1.1 Agglomerative Method
- The agglomerative method is bottom-up clustering. Suppose there is a set of N observations.
- In the agglomerative or bottom-up clustering method we first assign each observation to its
own cluster, so that the distances (similarities) between the clusters equal the distances
(similarities) between the items they contain. Then,
Step 1 : Calculate the similarity (e.g., distance) between each pair of clusters.
Step 2 : Find the nearest (most similar) pair of clusters and merge them into a single
cluster, so that now you have one less cluster.
Step 3 : Compute distances (similarities) between the new cluster and each of the old
clusters.
Step 4 : Repeat steps 2 and 3 until all items are clustered into a single cluster of size N.
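The agglomerative steps above can be sketched in a few lines. This is a naive illustration, assuming single linkage and one-dimensional made-up observations; it recomputes every distance on each pass and is not meant as an efficient implementation.

```python
from itertools import combinations
from math import dist


def single_link(a, b):
    """Single linkage: distance between the two closest points of a and b."""
    return min(dist(p, q) for p in a for q in b)


def agglomerate(points, linkage=single_link):
    """Merge clusters bottom-up and return the list of merges performed."""
    clusters = [[p] for p in points]  # each observation starts in its own cluster
    merges = []
    while len(clusters) > 1:
        # Find the nearest (most similar) pair of clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        distance = linkage(clusters[i], clusters[j])
        merges.append((clusters[i], clusters[j], distance))
        merged = clusters[i] + clusters[j]
        # i < j, so delete j first to keep index i valid.
        del clusters[j]
        del clusters[i]
        clusters.append(merged)  # one cluster fewer than before
    return merges


# Hypothetical observations used only for illustration.
points = [(0.0,), (1.0,), (5.0,), (6.0,), (20.0,)]
history = agglomerate(points)
for left, right, distance in history:
    print(left, "+", right, "at distance", distance)
```

With N observations the loop performs N - 1 merges, and the sequence of merge distances is what a dendrogram would display on its vertical axis.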
3.9.1.2 Divisive Hierarchical Methods

- In the divisive or top-down clustering method we allocate all of the observations to a single
cluster. We then partition the cluster into the two least similar clusters.
- Finally, we proceed recursively on each cluster until there is one cluster for each
observation. There is evidence that divisive algorithms produce more accurate hierarchies
than agglomerative algorithms in some circumstances, but they are conceptually more complex.
- In divisive hierarchical clustering, a top-down approach is used. It starts with all objects in
one cluster. Clusters are subdivided into smaller and smaller clusters until each object
forms a cluster on its own or a certain termination condition is satisfied.
- A cluster is split according to some principle, e.g., the maximum Euclidean distance
between the closest neighbouring objects in the cluster. Start with a single cluster at the top
of the tree and continue splitting it into smaller and smaller clusters till the bottom is
reached, where there are n clusters with one member each.
- A dendrogram is a tree data structure which illustrates hierarchical clustering techniques.
- Each level shows the clusters for that level. Each leaf is an individual cluster and the root
is one cluster. A cluster at level i is the union of its children clusters at level i + 1.

Fig. 3.9.5
Syllabus Topic : Evaluation of Clustering Models
3.10 Evaluation of Clustering Models
Q. 3.10.1 How does one evaluate a clustering model? (Ref. Sec. 3.10) (5 Marks)
- To measure the performance of a clustering method, one needs to verify that the clusters
generated correspond to an actual regular pattern in the data. It is appropriate to apply
other clustering algorithms and to compare the results obtained by the different methods.
- In this way it is also possible to evaluate whether the number of identified clusters is
robust with respect to the different techniques applied.
Cluster cohesion : Measures how closely related the objects in a cluster are.

Cluster separation : Measures how distinct or well separated a cluster is from the other
clusters.
Let X = {X_1, X_2, ..., X_K} be the set of K clusters generated, and let C_i and C_k denote
objects.
Cohesion of a cluster X_h is defined as,
$coh(X_h) = \sum_{C_i \in X_h,\; C_k \in X_h} dist(C_i, C_k)$
Separation between a pair of clusters X_h and X_f is defined as,
$sep(X_h, X_f) = \sum_{C_i \in X_h,\; C_k \in X_f} dist(C_i, C_k)$
- Silhouette refers to a method of interpretation and validation of consistency within clusters
of data. The silhouette value is a measure of how similar an object is to its own cluster
(cohesion) compared to other clusters (separation).
- The coefficient value ranges from -1 to +1. A high value indicates that the object is
well matched with its own cluster and poorly matched with neighbouring clusters.
- Silhouette can be calculated with any distance metric, such as the Euclidean or Manhattan
distance.
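The silhouette value can be computed directly from its usual definition s = (b - a) / max(a, b), where a is the mean distance from the object to the other members of its own cluster and b is the smallest mean distance from the object to the members of any other cluster. The text above does not state this formula, so the sketch below assumes the standard one, uses Euclidean distance, and works on made-up clusters.

```python
from math import dist


def silhouette(clusters, k, i):
    """Silhouette value of object i of cluster k, given a list of clusters."""
    own = clusters[k]
    point = own[i]
    rest = own[:i] + own[i + 1:]
    if not rest:
        return 0.0  # common convention for a cluster with a single object
    # a: cohesion, the mean distance to the other objects of the same cluster
    a = sum(dist(point, p) for p in rest) / len(rest)
    # b: separation, the mean distance to the nearest other cluster
    b = min(
        sum(dist(point, p) for p in other) / len(other)
        for idx, other in enumerate(clusters)
        if idx != k
    )
    denom = max(a, b)
    return 0.0 if denom == 0 else (b - a) / denom


# Hypothetical data: a well-separated clustering and a poor one.
good = [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 0.0), (5.0, 1.0)]]
poor = [[(0.0, 0.0), (5.0, 0.5)], [(5.0, 0.0), (5.0, 1.0)]]

well_matched = silhouette(good, 0, 0)
mis_assigned = silhouette(poor, 0, 1)
print(round(well_matched, 3), round(mis_assigned, 3))
```

The first object sits close to its own cluster and far from the other one, so its value is close to +1. The second object has been placed in the wrong cluster, so its value is negative.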
Syllabus Topic : Classification Problems
Q.1 What is classification? What are the components of classification problem?
(Refer Section 3.1) (5 Marks)
Q.2 What are the three phases of classification model? (Refer Section 3.1.1) (5 Marks)
Q.3 What are the main components of classification model?
(Refer Section 3.1.2) (5 Marks)
Syllabus Topic : Evaluation of Classification Models
Q.4 How do you evaluate a classification method? (Refer Section 3.2) (5 Marks)
Q.5 Explain the Holdout method. (Refer Section 3.2.1) (4 Marks)
Q.6 Explain the Repeated random sampling. (Refer Section 3.2.2) (4 Marks)
Q.7 Explain the cross validation. (Refer Section 3.2.3) (4 Marks)
Q.8 Explain the confusion matrices. (Refer Section 3.2.4) (5 Marks)
Q.9 Explain the ROC curve chart. (Refer Section 3.2.5) (5 Marks)
Q.10 Explain the Cumulative gain and lift chart. (Refer Section 3.2.6) (5 Marks)
Syllabus Topic : Bayesian Methods
Q.11 Write short note on Bayesian methods. (Refer Section 3.3) (4 Marks)
Q.12 Explain naive Bayes classifier with example. (Refer Section 3.3.2) (5 Marks)
Q.13 What are Bayesian networks? (Refer Section 3.3.3) (4 Marks)
Syllabus Topic : Logistic Regression
Q.14 Write short note on logistic regression. (Refer Section 3.4) (5 Marks)
Syllabus Topic : Neural Networks
Q.15 Write short note on neural network. (Refer Section 3.5) (5 Marks)
Syllabus Topic : Support Vector Machines
Q.16 Write short note on support vector machine. (Refer Section 3.6) (5 Marks)
Syllabus Topic : Clustering Methods
Q.17 What are the characteristics of clustering method? (Refer Section 3.7.1) (4 Marks)
Q.18 What is taxonomy of clustering method? (Refer Section 3.7.3) (4 Marks)
Q.19 Write short note on Binary attribute. (Refer Section 3.7.5(1.)) (5 Marks)
Q.20 Write short note on Nominal attribute. (Refer Section 3.7.5(2.)) (4 Marks)
Q.21 Write short note on Ordinal attribute. (Refer Section 3.7.5(3.)) (4 Marks)
Syllabus Topic : Partition Methods
Q.22 Explain K-means method. (Refer Section 3.8.1) (4 Marks)
Q.23 Explain K-medoids algorithm. (Refer Section 3.8.2) (5 Marks)
Syllabus Topic : Hierarchical Methods
Q.24 Explain single linkage, complete linkage, average linkage and ward distance.

Syllabus Topic : Evaluation of Clustering Models
Q.25 How does one evaluate a clustering model? (Refer Section 3.10) (5 Marks)
Chapter Ends....
Unit II
Mathematical Models for Decision
Making, Data Mining and
Data Preparation
2.1 Modeling
Modeling is the building of models for the representation of the modules, also called the
entities, of a system.
The needs of modeling are as follows :
- To decompose the system into its basic entities.
- To identify the essential entities and linkages.
- To recompose a selected version of the system with its essential/relevant entities and
linkages (i.e. the model).

2.2 Models
A Model is a simplified representation of the essential entities of some specific reality and
their characteristics.
Models are used for the following :
- Exploration
- Explanation
- Extrapolation
2.2.1 Mathematical Models
Q. 2.2.1 What are the different types of model? (Ref. Sec. 2.2.1)
Mathematical Models can be classified as follows :
1. Iconic (Scale) Model
2. Analog Model
3. Symbolic Model
Fig. 2.2.1 : Types of mathematical models
1. Iconic (Scale) Model
An iconic model is a physical copy of a system, usually on a different scale than the
original. It may appear in three dimensions, like a scale model of an airplane, car or bridge.
Photographs are another type of iconic model, but in only two dimensions. An Iconic
Model is a look-alike representation of some specific entity, for example a house.
Iconic Models can be represented in :
Two Dimensions : e.g. photos, drawings, etc.
Three Dimensions : e.g. scale model.
A scale model can be a
- reduction (scaled down, e.g. the model of a building).
- reproduction (same scale, e.g. copy model, prototype or working model).
- enlargement (scaled up, e.g. the model of an atom).
2. Analog Model
An analog model does not look like the real system but behaves like it. These are usually
two dimensional charts or diagrams, e.g. organization charts showing structure,
authority, and responsibility relationships.
Analog models are more abstract than iconic ones. An Analogue Model is the
representation of entities of a system by analogue entities (e.g. through diagrams).
An Analogue Model can be built through :
(a) Two Dimensional Visualization
(b) Three Dimensional Visualization
(a) Two Dimensional Visualization
Charts, Graphs, Diagrams
(e.g. the colour coding of a geographical chart for representing different altitudes)
(b) Three Dimensional Visualization
Analogue Devices
(e.g. the flow of water in pipes to represent the flow of electricity in wires or the flow of
resources in an economic system)
3. Symbolic Model
- The complexity of relationships in some systems cannot be represented physically, or the
physical representation may be cumbersome and take time to construct. Therefore a more
abstract model is used with the aid of symbols.
- Most management science analysis is executed with the aid of mathematical models
which utilize mathematical symbols.
- These are general rather than specific and can describe diverse situations.
Symbols can be :
o Mathematical.
o Logical.
o Ad hoc.
A Symbolic Model is used whenever the reality is :
- too complex or too abstract to be portrayed through an iconic or analogue model
- the factors of the system (variables) can be represented by symbols that can be
manipulated in a meaningful and fruitful way.
Syllabus Topic : Structure of Mathematical Model
2.3 The Structure of Mathematical Models
Q. 2.3.1 Write short note on structure of mathematical model. (Ref. Sec. 2.3) (5 Marks)
Mathematical models are typically in the form of equations or other mathematical
statements.
For example, the relationship between cost, revenue and profit can be expressed as :
P = R - C ... (2.3.1)
where P is profit, R is revenue and C is cost.
2.3.1 Classification of Mathematical Models

Classification of
Mathematical Models
1. Linear vs. nonlinear
2. Deterministic vs. probabilistic (stochastic)
3. Static vs. dynamic
4. Discrete vs. continuous
5. Deductive, inductive, or floating

Fig. 2.3.1 : Classification of mathematical models
1. Linear vs. nonlinear
Mathematical models are usually composed of variables, which are abstractions of
quantities of interest in the described systems, and operators that act on these variables,
which can be algebraic operators, functions, differential operators, etc.
If all the operators in a mathematical model exhibit linearity, the resulting mathematical
model is defined as linear.
A model is considered to be nonlinear otherwise. The question of linearity and
nonlinearity is dependent on context, and linear models may have nonlinear expressions
in them.
For example, in a statistical linear model, it is assumed that a relationship is linear in the
parameters, but it may be nonlinear in the predictor variables.
Similarly, a differential equation is said to be linear if it can be written with linear
differential operators, but it can still have nonlinear expressions in it.
In a mathematical programming model, if the objective functions and constraints are
represented entirely by linear equations, then the model is regarded as a linear model.
If one or more of the objective functions or constraints are represented with a nonlinear
equation, then the model is known as a nonlinear model.
Nonlinearity, even in fairly simple systems, is often associated with phenomena such as
chaos and irreversibility. Although there are exceptions, nonlinear systems and models
tend to be more difficult to study than linear ones.
A common approach to nonlinear problems is linearization, but this can be problematic if
one is trying to study aspects such as irreversibility, which are strongly tied to
nonlinearity.
2. Deterministic vs. probabilistic (stochastic)
A deterministic model is one in which every set of variable states is uniquely determined
by parameters in the model and by sets of previous states of these variables.
Therefore, deterministic models perform the same way for a given set of initial
conditions.
Conversely, in a stochastic model, randomness is present, and variable states are not
described by unique values, but rather by probability distributions.
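The difference can be seen with a simple growth model written as a difference equation. The sketch below is only an illustration: the growth rate, the noise level and the function names are assumptions made up for this example.

```python
import random


def deterministic_growth(x0, rate, steps):
    """x[t+1] = x[t] * (1 + rate): the path is fixed by x0 and rate alone."""
    x = x0
    path = [x]
    for _ in range(steps):
        x = x * (1 + rate)
        path.append(x)
    return path


def stochastic_growth(x0, rate, noise, steps, rng):
    """Same model, but the rate is disturbed by a random shock at every step."""
    x = x0
    path = [x]
    for _ in range(steps):
        x = x * (1 + rate + rng.gauss(0.0, noise))
        path.append(x)
    return path


# The deterministic model performs the same way for the same initial conditions.
run_a = deterministic_growth(100.0, 0.05, 5)
run_b = deterministic_growth(100.0, 0.05, 5)

# The stochastic model gives a different path for each stream of random numbers.
path_1 = stochastic_growth(100.0, 0.05, 0.02, 5, random.Random(1))
path_2 = stochastic_growth(100.0, 0.05, 0.02, 5, random.Random(2))

print(run_a == run_b, path_1 == path_2)
```

In the stochastic case a single run is only one sample. The state after five steps is better described by the distribution of many such runs than by one number.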
3. Static vs. dynamic
A static model does not account for the element of time, while a dynamic model does.
Dynamic models typically are represented with difference equations or differential
equations.
4. Discrete vs. continuous
A discrete model represents time as advancing in distinct steps and usually uses
time-advance methods, while a continuous model treats time as a continuous variable.
Continuous models typically are represented with f(t) and the changes are reflected over
continuous time intervals.
5. Deductive, inductive, or floating
A deductive model is a logical structure based on a theory. An inductive model arises
from empirical findings and generalization from them. The floating model rests on neither
theory nor observation, but is merely the invocation of expected structure.
Application of mathematics in social sciences outside of economics has been criticized for
unfounded models. Application of catastrophe theory in science has been characterized as
a floating model.
Seven Steps of Mathematical Modeling
1. Formulate the Problem.
2. Observe the System.
3. Formulate a Mathematical Model of the Problem.
4. Verify the Model and Use the Model for Prediction.
5. Select a Suitable Alternative.
6. Present the Results and Conclusion of the Study to the Organization.
7. Implement and Evaluate Recommendations.
Characteristics of mathematical models
To be used successfully in a typical Management Science (MS) project, a mathematical
model must meet the following criteria:
(i) The model should be as simple and understandable as possible.
(ii) The model should be reasonable.
(iii) The model should be easy to maintain and control.
(iv) The model should be adaptive. The parameters and structure of the model should be easy
to change as new insights and information evolve.
(v) The model should be complete on important issues, i.e., all important variables and
factors should have been taken into consideration.
Advantages of mathematical models
Use of models avoids constructing costly plants and warehouses in locations that do not
best meet the present and future needs of the customers.
A model indicates gaps that are not immediately apparent, and after testing, the character
of the failure might give a clue to the model's deficiencies.
Models have the advantage of time, since results can be obtained within a relatively short
time.
Because of the constant squeeze on profits, the cost and time saving that MS models
allow make them decision-making tools of great value to the manager.
Disadvantages of mathematical models
A model that oversimplifies may inaccurately reflect the real world situation.
If the person who builds a model does not know what he is doing, output from the model
will be incorrect.
Models can sometimes prove too expensive to originate when their cost is compared to
the expected return from their use.
Syllabus Topic : Classes of Models
2.4 Classes of Models
Q. 2.4.1 Explain classes of model. (Ref. Sec. 2.4) (6 Marks)
There are various models which are used for making decisions. The various mathematical
models are as follows :
1. Risk analysis model
2. Project management model
3. Predictive model
4. Optimisation model
5. Waiting line model
6. Pattern recognition model
Fig. 2.4.1 : Classes of Models
1. Risk analysis model
Risk analysis is the process of assessing the likelihood of an adverse event occurring
within the corporate, government, or environmental sector.
Risk analysis is the study of the underlying uncertainty of a given course of action and
refers to the uncertainty of forecasted cash flow streams, variance of portfolio/stock
returns, the probability of a project's success or failure, and possible future economic
states.
Risk analysts often work in tandem with forecasting professionals to minimize future
negative unforeseen effects.
2. Project management model
Every project is unique, which means we cannot have one standard structure to
execute our projects and achieve success in our endeavor.
However, to have a good plan we need some kind of framework or structure to follow,
depending on the nature of the project.
Project management models or methodologies provide the framework to execute projects.
A framework is something that tells you how often you will meet and discuss the
progress, how you will document results, how you will communicate and so on.
3. Predictive model
Predictive modeling is a process that uses data mining and probability to forecast
outcomes. Each model is made up of a number of predictors, which are variables that are
likely to influence future results.
Once data has been collected for relevant predictors, a statistical model is formulated. The
model may employ a simple linear equation, or it may be a complex neural network,
mapped out by sophisticated software.
As additional data becomes available, the statistical analysis model is validated or revised.
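The simplest case mentioned above, a model that employs a linear equation with one predictor, can be sketched as follows. The data are made up for illustration and the fit uses ordinary least squares; real predictive models usually involve many predictors and a separate validation step.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: advertising spend (predictor) and sales (outcome).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)
forecast = slope * 5.0 + intercept  # predicted sales for a spend of 5.0
print(slope, intercept, forecast)
```

When additional observations arrive, the same function can be run again on the extended data, which corresponds to validating or revising the model.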
4. Optimisation model
The Optimization Model class provides a common API for defining and accessing
variables and constraints, as well as other properties of each model.
We will now discuss each of these components in more detail.
Types of Optimization Models
Optimization problems can be classified in terms of the nature of the objective function
and the nature of the constraints. Special forms of the objective function and the
constraints give rise to specialized algorithms that are more efficient.
From this point of view, there are four types of optimization problems, of increasing
complexity.
An Unconstrained optimization problem is an optimization problem where the objective
function can be of any kind (linear or nonlinear) and there are no constraints. These types
of problems are handled by the classes discussed in the earlier sections.
A linear program is an optimization problem with an objective function that is linear in
the variables, and all constraints are also linear. Linear programs are implemented by the
Linear Program class.
A quadratic program is an optimization problem with an objective function that is
quadratic in the variables (i.e. it may contain squares and cross products of the decision
variables), and all constraints are linear. A quadratic program with no squares or cross
products in the objective function is a linear program. Quadratic programs are
implemented by the Quadratic Program class.
A nonlinear program is an optimization problem with an objective function that is an
arbitrary nonlinear function of the decision variables, and the constraints can be linear or
nonlinear. Nonlinear programs are implemented by the Nonlinear Program class.
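A linear program with two decision variables is small enough to solve by checking the corner points of the feasible region, because a linear objective over a bounded feasible region attains its optimum at a corner. The sketch below does exactly that. The problem data are made up, the function names are illustrative, and this brute-force approach is shown only to make the definition concrete; it is not how the classes named above or practical solvers work.

```python
from itertools import combinations

# Each constraint is (a, b, c), meaning a*x + b*y <= c.
constraints = [
    (1.0, 0.0, 4.0),    # x <= 4
    (0.0, 2.0, 12.0),   # 2y <= 12
    (3.0, 2.0, 18.0),   # 3x + 2y <= 18
    (-1.0, 0.0, 0.0),   # x >= 0
    (0.0, -1.0, 0.0),   # y >= 0
]
objective = (3.0, 5.0)  # maximise 3x + 5y


def feasible(x, y, tol=1e-9):
    """True if the point satisfies every constraint."""
    return all(a * x + b * y <= c + tol for a, b, c in constraints)


def solve():
    """Return (value, x, y) for the best feasible corner point."""
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel boundary lines do not intersect
        # Intersection of the two boundary lines (Cramer's rule).
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if feasible(x, y):
            value = objective[0] * x + objective[1] * y
            if best is None or value > best[0]:
                best = (value, x, y)
    return best


best_value, best_x, best_y = solve()
print(best_value, best_x, best_y)
```

Replacing the objective with an expression containing squares or cross products of x and y would turn this into a quadratic program, for which checking corners alone is no longer sufficient.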
5. Waiting Line model

There are basically two costs that must be balanced in a waiting line system - the cost of
service and the cost of waiting. Another possible cost component, the cost of a scheduling
system, is not considered here.
Theoretically, a scheduling system is a management strategy designed to avoid waiting
lines (in practice, as a visit to the doctor's office shows, some waiting still occurs) and is not
covered in this module.
Scheduling systems are useful when the customer is known to the system and the short
and long run costs of waiting are relatively high. We will study scheduling system
applications in linear programming later on in the course.
Operational characteristics of waiting lines include:
1. The probability that no customers (or units) are in the system.
2. The average number of customers in the line.
3. The average number of customers in the system (customers in line plus those being
served).
4. The average time a customer spends in the waiting line.
5. The average time a customer spends in the system (waiting time plus time in the
service facility).
6. The probability that an arriving customer has to wait for service.
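The six characteristics listed above have closed-form expressions for the simplest queueing model. The sketch below assumes a single server with Poisson arrivals and exponential service times (the M/M/1 model), which the text does not specify; the rates used are made up for illustration.

```python
def mm1_characteristics(arrival_rate, service_rate):
    """Operating characteristics of an M/M/1 queue (assumed model)."""
    if arrival_rate >= service_rate:
        raise ValueError("the queue is stable only when arrival_rate < service_rate")
    rho = arrival_rate / service_rate      # utilisation of the server
    in_line = rho ** 2 / (1 - rho)         # average number waiting in line
    in_system = rho / (1 - rho)            # average number in line plus in service
    return {
        "p_no_customers": 1 - rho,                       # characteristic 1
        "avg_in_line": in_line,                          # characteristic 2
        "avg_in_system": in_system,                      # characteristic 3
        "avg_time_in_line": in_line / arrival_rate,      # characteristic 4
        "avg_time_in_system": in_system / arrival_rate,  # characteristic 5
        "p_wait": rho,                                   # characteristic 6
    }


# Hypothetical rates: 2 arrivals per minute, 3 services per minute.
stats = mm1_characteristics(arrival_rate=2.0, service_rate=3.0)
for name, value in stats.items():
    print(name, round(value, 4))
```

Raising the service rate lowers the waiting figures but raises the cost of service, which is the balance between the two costs described earlier in this section.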
6. Pattern recognition model
- Pattern recognition deals with identifying a pattern and confirming it again. In general, a
pattern can be a fingerprint image, a handwritten cursive word, a human face, a speech
signal, a bar code, or a web page on the Internet.
- The individual patterns are often grouped into various categories based on their
properties. When patterns with the same properties are grouped together, the resultant group
is also a pattern, which is often called a pattern class.
- Pattern recognition is the science of observing and distinguishing the patterns of interest, and
making correct decisions about the patterns or pattern classes. Thus, a biometric system
applies pattern recognition to identify and classify individuals, by comparing them with
the stored templates.
Syllabus Topic : Definition of Data Mining

2.5 Data Mining
Q. 2.5.1 Define Data Mining. (Ref. Sec. 2.5) (2 Marks)
- Data mining is a process used by companies to turn raw data into useful information. By
using software to look for patterns in large batches of data, businesses can learn more
about their customers to develop more effective marketing strategies, increase sales and
decrease costs.
- Data mining depends on effective data collection, warehousing and computer processing.
Data mining is also known as data discovery and knowledge discovery.
Syllabus Topic : Representation of Input Data
2.6 Data Mining Parameters
Q. 2.6.1 Write short note on Data Mining parameters. (Ref. Sec. 2.6) (5 Marks)
In data mining, association rules are created by analysing data for frequent if/then
patterns, then using the support and confidence criteria to locate the most important
relationships within the data.
Support is how frequently the items appear in the database, while confidence is the
number of times the if/then statements are found to be accurate.
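Support and confidence can be computed from a list of transactions. The sketch below uses a small made-up basket data set and expresses both measures as fractions, which is the usual convention; the function names are illustrative.

```python
# Hypothetical transactions, each a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent):
    """Fraction of transactions containing the 'if' part that also contain the 'then' part."""
    return support(antecedent | consequent) / support(antecedent)


rule_support = support({"bread", "milk"})
rule_confidence = confidence({"bread"}, {"milk"})
print("if bread then milk:", rule_support, rule_confidence)
```

A rule is usually kept only when both values exceed thresholds chosen by the analyst, so that it is neither rare nor unreliable.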
Other data mining parameters include Sequence or Path Analysis, Classification,
Clustering and Forecasting. Sequence or Path Analysis parameters look for patterns where
one event leads to another later event.
A Sequence is an ordered list of sets of items, and it is a common type of data structure
found in many databases. A Classification parameter looks for new patterns, and might
result in a change in the way the data is organized. Classification algorithms predict
variables based on other factors within the database.
Clustering parameters find and visually document groups of facts that were previously
unknown. Clustering groups a set of objects and aggregates them based on how similar
they are to each other.
There are different ways a user can implement the cluster, which differentiate between
each clustering model. Forecasting parameters within data mining can discover patterns in
data that can lead to reasonable predictions about the future, also known as predictive
analysis.
2.6.1 Data Mining Tools and Techniques
Data mining techniques are used in many research areas, including mathematics,
cybernetics, genetics and marketing. While data mining techniques are a means to drive
efficiencies and predict customer behavior, if used correctly, a business can set itself apart
from its competition through the use of predictive analysis.
Web mining, a type of data mining used in customer relationship management, integrates
information gathered by traditional data mining methods and techniques over the web.
Other data mining techniques include network approaches based on multitask learning for
classifying patterns, ensuring parallel and scalable execution of data mining algorithms,
the mining of large databases, the handling of relational and complex data types, and
machine learning. Machine learning is a type of data mining tool that designs specific
algorithms from which to learn and predict.
Syllabus Topic : Data Mining Process
2.7 Data Mining Architecture
Q. 2.7.1 Draw and explain architecture of data mining. (Ref. Sec. 2.7) (5 Marks)
The major components of any data mining system are data source, data warehouse server,
data mining engine, pattern evaluation module, graphical user interface and knowledge
base.
Fig. 2.7.1 : Data Mining System
(a) Data sources
- Databases, data warehouses, the World Wide Web (WWW), text files and other documents are
the actual sources of data. You need large volumes of historical data for data mining to be
successful.
- Organizations usually store data in databases or data warehouses. Data warehouses may
contain one or more databases, text files, spreadsheets or other kinds of information
repositories. Sometimes, data may reside even in plain text files or spreadsheets. The World
Wide Web or the Internet is another big source of data.
Different processes
- The data needs to be cleaned, integrated and selected before passing it to the database or
data warehouse server. As the data is from different sources and in different formats, it
cannot be used directly for the data mining process because the data might not be
complete and reliable.
- So, first the data needs to be cleaned and integrated. Again, more data than required will be
collected from different data sources and only the data of interest needs to be selected and
passed to the server.
- These processes are not as simple as we think. A number of techniques may be performed
on the data as part of cleaning, integration and selection.
(b) Database or data warehouse server
The database or data warehouse server contains the actual data that is ready to be
processed. Hence, the server is responsible for retrieving the relevant data based on the data
mining request of the user.
(c) Data mining engine
The data mining engine is the core component of any data mining system. It consists of a
number of modules for performing data mining tasks including association, classification,
characterization, clustering, prediction, time-series analysis etc.

(d) Pattern evaluation module
The pattern evaluation module is mainly responsible for measuring the interestingness of
a pattern by using a threshold value. It interacts with the data mining engine to focus the
search towards interesting patterns.
(e) Graphical user interface
- The graphical user interface module communicates between the user and the data mining
system. This module helps the user use the system easily and efficiently without knowing
the real complexity behind the process.
- When the user specifies a query or a task, this module interacts with the data mining
system and displays the result in an easily understandable manner.
(f) Knowledge base
The knowledge base is helpful in the whole data mining process. It might be useful for
guiding the search or evaluating the interestingness of the result patterns.
The knowledge base might even contain user beliefs and data from user experiences that
can be useful in the process of data mining. The data mining engine might get inputs from
the knowledge base to make the result more accurate and reliable.
The pattern evaluation module interacts with the knowledge base on a regular basis to get
inputs and also to update it.
2.7.1 Four Types of Data Mining Architecture
(a) No-coupling Data Mining
(b) Loose Coupling Data Mining
(c) Semi-Tight Coupling Data Mining
(d) Tight Coupling Data Mining
Fig. 2.7.2 : Types of Data Mining Architecture
(a) No-coupling data mining
- In this architecture, the data mining system does not use any functionality of a database. A
no-coupling data mining system retrieves data from a particular data source.
- The no-coupling data mining architecture does not take any advantage of a database,
which is already very efficient in organizing, storing, accessing and retrieving data.
- The no-coupling architecture is considered a poor architecture for a data mining system,
but it is used for simple data mining processes.
(b) Loose coupling data mining
- In this architecture, the data mining system uses a database for data retrieval. In the loose
coupling data mining architecture, the data mining system retrieves data from a database
and stores the results in those systems.
- This architecture is mainly for memory-based data mining systems that do not require
high scalability and high performance.
(c) Semi-tight coupling data mining
- In semi-tight coupling, the data mining system uses several features of data warehouse
systems to perform some data mining tasks, including sorting, indexing and
aggregation.
- In this architecture, some intermediate results can be stored in a database for better
performance.
(d) Tight coupling data mining
- In tight coupling, a data warehouse is treated as an information retrieval component. All
the features of the database or data warehouse are used to perform data mining tasks.
- This architecture provides system scalability, high performance and integrated
information.
- There are three tiers in the tight-coupling data mining architecture :
(i) Data layer
(ii) Data mining application layer
(iii) Front-end layer
Fig. 2.7.3 : Three Tiers in the tight-coupling data mining architecture
(i) Data layer
We can define the data layer as a database or data warehouse system. This layer is an
interface for all data sources.
Data mining results are stored in the data layer. Thus, we can present them to the end-user
in the form of reports or another kind of visualization.
(ii) Data mining application layer
It retrieves data from a database. Some transformation routine has to be performed here
to transform the data into the desired format.
Then we have to process the data using various data mining algorithms.
(iii) Front-end layer
It provides an intuitive and friendly user interface for the end-user to interact with the
data mining system.
The data mining result is presented in visualization form to the user in the front-end layer.
2.7.2 Types of Data Mining Processes
Different data mining processes can be classified into two types: data preparation or data
preprocessing and data mining. In fact, the first four processes, that are data cleaning, data
integration, data selection and data transformation, are considered as data preparation
processes.
The last three processes including data mining, pattern evaluation and knowledge
representation are integrated into one process called data mining.

[Figure: Data Preparation (cleaning, integration, selection, transformation) followed by
Data Mining (data mining, pattern evaluation, knowledge representation)]
Fig. 2.7.4
(a) Data cleaning
Data cleaning is the process where the data gets cleaned. Data in the real world is
normally incomplete, noisy and inconsistent.
The data available in data sources might be lacking attribute values, data of interest etc.
For example, you want the demographic data of customers and what if the available data
does not include attributes for the gender or age of the customers? Then the data is of
course incomplete. Sometimes the data might contain errors or outliers.
An example is an age attribute with value 200. It is obvious that the age value is wrong in
this case. The data could also be inconsistent.
For example, the same name might be stored differently in different data tables or documents. Here, the data is inconsistent. If the data is not clean, the data mining results would be neither reliable nor accurate.
Data cleaning involves a number of techniques, including filling in the missing values manually, combined computer and human inspection, etc. The output of the data cleaning process is adequately cleaned data.

(b) Data integration
Data integration is the process where data from different data sources are integrated into one. Data lies in different formats in different locations.
Data could be stored in databases, text files, spreadsheets, documents, data cubes, the Internet and so on. Data integration is a really complex and tricky task because data from different sources normally does not match.
Suppose a table A contains an entity named customer_id whereas another table B contains an entity named number. It is really difficult to ensure whether both these entities refer to the same value or not.
Metadata can be used effectively to reduce errors in the data integration process. Another issue faced is data redundancy.
The same data might be available in different tables in the same database or even in different data sources. Data integration tries to reduce redundancy to the maximum possible level without affecting the reliability of data.
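The customer_id / number situation described for tables A and B can be sketched in Python as follows. This is only an illustration: the table contents, the `column_metadata` mapping and the helper names are assumptions made for the example, not part of any particular tool.

```python
# Hypothetical rows from two sources; names and values are invented for illustration.
table_a = [
    {"customer_id": 101, "name": "Asha"},
    {"customer_id": 102, "name": "Ravi"},
]
table_b = [
    {"number": 102, "city": "Pune"},
    {"number": 103, "city": "Mumbai"},
    {"number": 102, "city": "Pune"},  # redundant copy of an earlier row
]

# Assumed metadata: maps a source-specific column name to one agreed name.
column_metadata = {"number": "customer_id"}


def rename_columns(row, metadata):
    """Return a copy of row with source column names replaced by agreed names."""
    return {metadata.get(key, key): value for key, value in row.items()}


def integrate(rows_a, rows_b, metadata, key="customer_id"):
    """Merge two row lists on a shared key, keeping one record per key value."""
    merged = {}
    for row in rows_a + rows_b:
        row = rename_columns(row, metadata)
        # Rows with the same key are combined, which removes redundant copies.
        merged.setdefault(row[key], {}).update(row)
    return [merged[k] for k in sorted(merged)]


integrated = integrate(table_a, table_b, column_metadata)
for record in integrated:
    print(record)
```

Without the metadata entry, the code would have no way to know that `number` and `customer_id` refer to the same thing, which is the difficulty the text points out.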
(c) Data selection

The data mining process requires large volumes of historical data for analysis. So, the data repository with integrated data usually contains much more data than is actually required.
From the available data, the data of interest needs to be selected and stored. Data selection is the process where the data relevant to the analysis is retrieved from the database.
(d) Data transformation
Data transformation is the process of transforming and consolidating the data into different forms that are suitable for mining. Data transformation normally involves normalization, aggregation, generalization, etc.
For example, a data set available as "5, 37, 100, 89, 78" can be transformed as "0.05, 0.37, 1.00, 0.89, 0.78". Here the data becomes more suitable for data mining. After these preparation steps, the available data is ready for data mining.
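The numeric example above can be reproduced with a short Python sketch. The text does not name the scaling rule; dividing every value by the largest absolute value (here 100) is assumed because it yields exactly the numbers shown. The function name `scale_by_max` is hypothetical.

```python
def scale_by_max(values):
    """Divide every value by the largest absolute value in the list."""
    largest = max(abs(v) for v in values)
    if largest == 0:
        # All values are zero, so there is nothing to scale.
        return [0.0 for _ in values]
    return [v / largest for v in values]


raw = [5, 37, 100, 89, 78]
scaled = scale_by_max(raw)
print(scaled)
```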
(e) Data mining
Data mining is the core process where a number of complex and intelligent methods are
applied to extract patterns from data.
The data mining process includes a number of tasks such as association, classification, prediction, clustering, time series analysis and so on.
(f) Pattern evaluation
The pattern evaluation identifies the truly interesting patterns representing knowledge
based on different types of interestingness measures.
A pattern is considered to be interesting if it is potentially useful, is easily understandable by humans, validates some hypothesis that someone wants to confirm, or is valid on new data with some degree of certainty.
(g) Knowledge representation
The information mined from the data needs to be presented to the user in an appealing
way.
Different knowledge representation and visualization techniques are applied to provide the output of data mining to the users.
Benefits of data mining
Data mining techniques help companies to get knowledge-based information.
Data mining helps organizations to make profitable adjustments in operation and production.
Data mining is a cost-effective and efficient solution compared to other statistical data applications.
Data mining helps with the decision-making process.
It facilitates automated prediction of trends and behaviors as well as automated discovery of hidden patterns.
It can be implemented in new systems as well as existing platforms.
It is a speedy process which makes it easy for users to analyze huge amounts of data in less time.
Disadvantages of data mining
1. There is a chance that companies may sell useful information about their customers to other companies for money. For example, American Express has sold credit card purchases of their customers to other companies.
2. Many data mining analytics software packages are difficult to operate and require advanced training to work on.
3. Different data mining tools work in different manners due to the different algorithms employed in their design. Therefore, the selection of the correct data mining tool is a very difficult task.
4. Data mining techniques are not always accurate, so they can cause serious consequences in certain conditions.
Syllabus Topic : Analysis Methodologies

2.8 Analysis Methodologies
Q. 2.8.1 Write various applications of data mining. (Ref. Sec. 2.8) (5 Marks)
Data Mining Applications
Data mining is highly useful in the following domains :
1. Market Analysis and Management

2. Corporate Analysis and Risk Management
3. Fraud Detection
Fig. 2.8.1: Domain Types
Apart from these, data mining can also be used in the areas of production control, customer retention, science exploration, sports, astrology, and Internet Web Surf-Aid.
2.8.1 Market Analysis and Management
Listed below are the various fields of market where data mining is used :
- Customer Profiling : Data mining helps determine what kind of people buy what kind of products.
- Identifying Customer Requirements : Data mining helps in identifying the best products for different customers. It uses prediction to find the factors that may attract new customers.
- Cross Market Analysis : Data mining performs association/correlation analysis between product sales.
- Target Marketing : Data mining helps to find clusters of model customers who share the same characteristics such as interests, spending habits, income, etc.
- Determining Customer Purchasing Pattern : Data mining helps in determining customer purchasing patterns.
- Providing Summary Information : Data mining provides various multidimensional summary reports.
2.8.2 Corporate Analysis and Risk Management
Q. 2.8.2 Write short note on Corporate Analysis and Risk Management. (Ref. Sec. 2.8.2) (5 Marks)
Data mining is used in the following fields of the Corporate Sector :
Finance Planning and Asset Evaluation : It involves cash flow analysis and prediction,
contingent claim analysis to evaluate assets.
Resource Planning : It involves summarizing and comparing the resources and spending.
Competition : It involves monitoring competitors and market directions.
2.8.3 Fraud Detection
Q. 2.8.3 Write short note on fraud detection. (Ref. Sec. 2.8.3) (5 Marks)
Data mining is also used in the fields of credit card services and telecommunication to detect frauds.
For fraudulent telephone calls, it helps to find the destination of the call, the duration of the call, the time of the day or week, etc. It also analyzes the patterns that deviate from expected norms.
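One simple way to look for values that "deviate from expected norms" is to flag observations that lie far from the average. The sketch below is an assumption made for illustration, not the method of any specific fraud detection product: the call durations are invented, and the threshold of 2 standard deviations is an arbitrary choice.

```python
from statistics import mean, stdev


def flag_unusual(durations, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    avg = mean(durations)
    spread = stdev(durations)
    if spread == 0:
        # All values are identical, so nothing deviates.
        return []
    return [d for d in durations if abs(d - avg) / spread > threshold]


# Hypothetical call durations in minutes for one subscriber.
call_minutes = [3, 4, 5, 4, 3, 5, 4, 6, 4, 95]
suspicious = flag_unusual(call_minutes)
print(suspicious)
```

A flagged call is only a candidate for review; a real system would combine several attributes such as destination and time of day, as listed above.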

[Figure: CRISP-DM process diagram; surviving labels: Business understanding, Data understanding]
1. Business understanding
In the business understanding phase :
First, it is required to understand the business objectives clearly and find out what the business needs are.
Next, we have to assess the current situation by finding the resources, assumptions, constraints and other important factors which should be considered.
Then, from the business objectives and current situation, we need to create data mining goals to achieve the business objectives within the current situation.
Finally, a good data mining plan has to be established to achieve both the business and data mining goals. The plan should be as detailed as possible.
2. Data understanding
First, the data understanding phase starts with initial data collection from the available data sources, to help us get familiar with the data.
Some important activities, including data load and data integration, must be performed in order to make the data collection successful.
Next, the "gross" or "surface" properties of the acquired data need to be examined carefully and reported.
Then, the data needs to be explored by tackling the data mining questions, which can be addressed using querying, reporting, and visualization.
Finally, the data quality must be examined by answering some important questions such as "Is the acquired data complete?" and "Are there any missing values in the acquired data?"
3. Data preparation
Data preparation typically consumes about 90% of the time of the project. The outcome of the data preparation phase is the final data set.
Once the available data sources are identified, they need to be selected, cleaned, constructed and formatted into the desired form. The data exploration task may be carried out at a greater depth during this phase to notice patterns based on business understanding.
4. Modeling
First, modeling techniques have to be selected for use on the prepared dataset.
Next, a test scenario must be generated to validate the quality and validity of the model.
Then, one or more models are created by running the modeling tool on the prepared dataset.
Finally, the models need to be assessed carefully, involving stakeholders, to make sure that the created models meet the business initiatives.
5. Evaluation
In the evaluation phase, the model results must be evaluated in the context of the business objectives set in the first phase. In this phase, new business requirements may be raised due to the new patterns that have been discovered in the model results or from other factors.
Gaining business understanding is an iterative process in data mining. The go or no-go decision must be made in this step to move to the deployment phase.
6. Deployment
The knowledge or information gained through the data mining process needs to be presented in such a way that stakeholders can use it when they want it.
Based on the business requirements, the deployment phase could be as simple as creating a report or as complex as a repeatable data mining process across the organization.
In the deployment phase, the plans for deployment, maintenance, and monitoring have to be created for implementation and also future support.
From the project point of view, the final report needs to summarize the project experiences and review the project to see what needs to be improved and what lessons were learned.
CRISP-DM offers a uniform framework for experience documentation and guidelines. In addition, CRISP-DM can be applied in various industries with different types of data.
Syllabus Topic : Data Preparation
Q. 2.9.1 Draw diagram and explain data preparation. (Ref. Sec. 2.9) (5 Marks)
What is Data Preparation ?
- Data preparation (or data pre-processing) in this context means manipulation of data into a form suitable for further analysis and processing. It is a process that involves many different tasks and which cannot be fully automated.
- Most of the data preparation activities are routine, tedious, and time consuming. It has been estimated that data preparation accounts for 60%-80% of the time spent on a data mining project.
- Data preparation is essential for successful data mining. Poor quality data typically result in incorrect and unreliable data mining results.
- Data preparation improves the quality of data and consequently helps improve the quality of data mining results. The well-known saying "garbage-in garbage-out" is very relevant to this domain.
[Figure: Data Preparation]
Fig. 2.9.1
Syllabus Topic : Data Validation
2.10 Data Validation
Q. 2.10.1 Write note on Data validation. (Ref. Sec. 2.10) (5 Marks)
- Data validation is about checking the information to ensure that it meets the data needs of the system. This reduces the chance of errors. One of the many examples of data validation is a range check.

- Data validation has nothing to do with what the user wants to input. Validation is about checking the input data to ensure it conforms to the data requirements of the system, to avoid data errors.
- An example of this is a range check, which rejects an input number that is greater or smaller than the specified range.
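A range check of the kind described above can be sketched as follows. The field ("age"), the limits 0 to 120 and the function names are assumptions chosen for illustration; a real system would take its limits from its own data requirements.

```python
def in_range(value, low, high):
    """Range check: True when low <= value <= high."""
    return low <= value <= high


def validate_age(raw_text, low=0, high=120):
    """Return (ok, message) for a hypothetical 'age' input field."""
    try:
        age = int(raw_text)
    except ValueError:
        # The input is not a whole number at all.
        return False, "not a whole number"
    if not in_range(age, low, high):
        return False, f"outside the range {low}-{high}"
    return True, "ok"


for text in ["35", "200", "abc"]:
    print(text, validate_age(text))
```

With these assumed limits, an age of 200 is rejected before it ever reaches the stored data.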
Syllabus Topic : Data Transformation
2.11 Data Transformation
Q. 2.11.1 Explain data transformation with suitable diagram. (Ref. Sec. 2.11) (5 Marks)
In the data transformation process, data are transformed from one format to another format that is more appropriate for data mining.
Some data transformation strategies

1. Smoothing
2. Aggregation
3. Generalization
4. Normalization
5. Attribute Construction
Fig. 2.11.1 : Data Transformation Strategies
1. Smoothing
Smoothing is a process of removing noise from the data.
2. Aggregation
Aggregation is a process where summary or aggregation operations are applied to the data.
3. Generalization
In generalization, low-level data are replaced with high-level data by climbing concept hierarchies.
4. Normalization
Normalization scales attribute data so that it falls within a small specified range, such as 0.0 to 1.0.
5. Attribute Construction
In attribute construction, new attributes are constructed from the given set of attributes.
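Four of the strategies listed above can be illustrated with a short Python sketch. The specific techniques chosen here (smoothing by bin means, monthly totals, min-max scaling, a derived `area` attribute) and all data values are assumptions for the example; the text itself does not prescribe them. Generalization is left out because it needs a concept hierarchy.

```python
from collections import defaultdict


def smooth_by_bin_means(values, bin_size):
    """Smoothing: sort values, split into bins, replace each value by its bin mean."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        chunk = ordered[start:start + bin_size]
        smoothed.extend([sum(chunk) / len(chunk)] * len(chunk))
    return smoothed


def aggregate_by_month(daily_sales):
    """Aggregation: total (date 'YYYY-MM-DD', amount) pairs per month."""
    totals = defaultdict(float)
    for date, amount in daily_sales:
        totals[date[:7]] += amount
    return dict(totals)


def min_max(values, new_min=0.0, new_max=1.0):
    """Normalization: rescale values linearly into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [new_min for _ in values]
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]


def add_area(rows):
    """Attribute construction: derive 'area' from existing 'width' and 'height'."""
    return [{**row, "area": row["width"] * row["height"]} for row in rows]


prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
print(smooth_by_bin_means(prices, 3))
print(aggregate_by_month([("2024-01-05", 100.0), ("2024-01-20", 50.0), ("2024-02-01", 70.0)]))
print(min_max([10, 20, 30]))
print(add_area([{"width": 2, "height": 3}]))
```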
Syllabus Topic : Data Reduction
2.12 Data Reduction
Q. 2.12.1 Write short note on data Reduction. (Ref. Sec. 2.12) (5 Marks)
A database or data warehouse may store terabytes of data, so it may take very long to perform data analysis and mining on such huge amounts of data.
Data reduction techniques can be applied to obtain a reduced representation of the data set that is much smaller in volume but still contains the critical information.
Data reduction strategies
1. Data Cube Aggregation
2. Dimensionality Reduction
3. Data Compression
4. Numerosity Reduction
5. Discretisation and concept hierarchy generation
Fig. 2.12.1 : Types of data reduction strategies
1. Data cube aggregation
Aggregation operations are applied to the data in the construction of a data cube.
2. Dimensionality reduction
In dimensionality reduction, redundant attributes are detected and removed, which reduces the data set size.
3. Data compression
Encoding mechanisms are used to reduce the data set size.
4. Numerosity reduction
In numerosity reduction, the data are replaced or estimated by alternative, smaller data representations.
5. Discretisation and concept hierarchy generation
Here, raw data values for attributes are replaced by ranges or higher conceptual levels.
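Discretisation and concept hierarchy generation can be sketched for an age attribute as follows. The interval boundaries and the concept labels are assumptions invented for this example.

```python
def discretise_age(age):
    """Replace a raw age by a range label (interval boundaries are assumed)."""
    if age < 18:
        return "0-17"
    if age < 40:
        return "18-39"
    if age < 65:
        return "40-64"
    return "65+"


# Hypothetical concept hierarchy: range label -> higher conceptual level.
AGE_CONCEPT = {
    "0-17": "youth",
    "18-39": "young adult",
    "40-64": "middle-aged",
    "65+": "senior",
}

ages = [12, 25, 37, 41, 70]
ranges = [discretise_age(a) for a in ages]
concepts = [AGE_CONCEPT[r] for r in ranges]
print(ranges)
print(concepts)
```

Five distinct raw values collapse into four ranges here; on a large data set with many distinct ages the reduction is far greater.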
Q.1 What are the different types of model? (Refer Section 2.2.1) (5 Marks)
Syllabus Topic : Structure of Mathematical Model
Q.2 Write short note on structure of mathematical model. (Refer Section 2.3) (5 Marks)
Syllabus Topic : Classes of Models
Q.3 Explain classes of model. (Refer Section 2.4) (5 Marks)
Syllabus Topic : Definition of Data Mining
Q.4 Define Data Mining. (Refer Section 2.5) (2 Marks)
Syllabus Topic : Representation of Input Data
Q.5 Write short note on Data Mining parameters. (Refer Section 2.6) (5 Marks)
Syllabus Topic : Data Mining Process
Q.6 Draw and explain architecture of data mining. (Refer Section 2.7) (5 Marks)
Syllabus Topic : Analysis Methodologies
Q.7 Write various applications of data mining. (Refer Section 2.8) (5 Marks)
Q.8 Write short note on Corporate Analysis and Risk Management. (Refer Section 2.8.2) (5 Marks)
Q.9 Write short note on fraud detection. (Refer Section 2.8.3) (5 Marks)
Syllabus Topic : Data Preparation
Q.10 Draw and explain data preparation. (Refer Section 2.9) (5 Marks)
Syllabus Topic : Data Validation
Q.11 Write note on Data validation. (Refer Section 2.10) (5 Marks)
Syllabus Topic : Data Transformation
Q.12 Explain data transformation with suitable diagram. (Refer Section 2.11) (5 Marks)
Syllabus Topic : Data Reduction
Q.13 Write short note on data Reduction. (Refer Section 2.12) (5 Marks)
Chapter Ends...
CHAPTER
Business Intelligence and
Decision Support Systems
Syllabus Topic : Business Intelligence
1.1 Introduction to Business Intelligence
Q. 1.1.1 What do you mean by business intelligence ? Write its [illegible]. (Ref. Sec. 1.1) (5 Marks)
The term Business Intelligence (BI) refers to technologies, applications and practices for the collection, integration, analysis, and presentation of business information. The main reason behind Business Intelligence is to provide better business decision making.
These systems are data-driven Decision Support Systems (DSS). The term Business Intelligence is sometimes used interchangeably with briefing books, report and query tools, and executive information systems. It can also be described as a set of mathematical models and analysis methodologies that are very useful for complex decision-making processes.
Large amounts of data can be easily accessed by individuals and organizations because of numerous Internet connections and low-cost data storage technologies.
The underlying transactions are commercial, financial and administrative, making the data heterogeneous in origin, content and representation. Emails, texts and hypertexts, and the results of clinical tests are a few examples.
Their accessibility opens various scenarios and opportunities, and raises a rather important question: is it possible to convert such data into information and knowledge that can then be used by decision makers to assist and improve the operation of enterprises and of public administration?
Syllabus Topic : Effective and Timely Decisions
1.2 Effective and Timely Decisions
Q. 1.2.1. Write short note on Effective and Timely decisions. (Ref. Sec. 1.2) (5 Marks)
In public or private organizations, decisions are made continuously. Such decisions can prove to be critical, have long-term or short-term effects, and involve people and roles at various levels.
The performance and competitive strength of an organization depend on the ability of skilled workers to make decisions, both as individuals and as a community.
— Most people reach their decisions mainly using simple and easy approaches, which use
specific elements such as experience, knowledge of the application domain and the
available information.
- Decision-making processes within today’s organizations are often too complex and
dynamic to be effectively dealt with through an intuitive approach, and instead require a
much stricter attitude based on analytical tactics and mathematical models.
- Example 1 shows a complex decision-making process in rapidly changing conditions.
Example 1 : Retention in the cellular industry
- The marketing manager of a cellular company realizes that many customers are moving to other service providers that offer better options at lower cost. This is critical for the company, as it reduces the number of customers and therefore affects the business.
- So the company manager can decide to conduct a customer retention campaign. With the help of this campaign, the company can select the best target group, which will maximize customer retention and help business growth.
- The main purpose of business intelligence systems is to provide skilled workers with tools and methodologies that allow them to make effective and timely decisions.
* Effective decisions
- The application of stricter analytical methods allows decision makers to rely on information and knowledge which are more dependable.
- As a result, they are able to make better decisions and formulate action plans that allow their objectives to be reached in a more effective way.
- Turning to formal analytical methods forces decision makers to describe both the criteria for assessing alternative choices and the mechanisms regulating the problem under investigation.
- Furthermore, the ensuing observation and thought lead to a better awareness and knowledge of the underlying logic of the decision-making process.
* Timely decisions
Enterprises operate in economic environments characterized by growing levels of competition and high dynamism. As a consequence, the ability to rapidly react to the actions of competitors and to new market conditions is a critical factor in the success or even the survival of a company.
Fig. 1.2.1 shows the benefits that an organization can draw from the adoption of a business intelligence system. When facing a problem, decision makers can ask themselves a set of questions and develop an analysis based on them. It then becomes easier to choose the best solution by comparing several options.
If decision makers use a business intelligence system, the overall quality of the decision-making process can be improved.

[Figure; surviving labels: Questions, Alternative actions, Many alternatives considered, More accurate conclusions, Business intelligence, Effective and timely decisions]
Fig. 1.2.1 : Benefits of a business intelligence system
Therefore we can say that it is effective and advantageous to use a business intelligence system for making decisions.
As we saw, a large amount of data can be stored in the systems of public and private organizations.
This data can come from internal transactions of an administrative, logistical and commercial nature, and some from external sources.
But even if we collect and store it systematically, we cannot use it directly for decision-making purposes. For that we need extraction tools and methods which convert the data into information that can be used for decision making.
Syllabus Topic : Data, Information and Knowledge
1.3. Data, Information and Knowledge
Q. 1.3.1 What do you mean by data, knowledge and information? (Ref. Sec. 1.3) (5 Marks)
The difference between data, information and knowledge can be better understood through the explanation below :
* Knowledge
Knowledge means what we know. We build a map of the world in our brain from what we know.
It is like a physical map which helps us to know where things are, but it contains more than that. It also holds our beliefs and expectations, such as "If we do this, we will probably get that." Crucially, the human brain links all these things together into a giant network of ideas, memories, predictions, beliefs, etc. It is on this "map" that we base our decisions, not on the real world itself.
Our brains constantly update this map from the signals coming through our eyes, ears, nose, mouth and skin. We can't currently store knowledge in anything other than a brain, because a brain connects it all together.
Everything is inter-connected in our brain. Computers are not artificial brains. Computers don't understand what they are processing and can't make decisions by themselves; they do what we tell them.
Knowledge is built from two sources: information and data.
* Data
Data is a set of representations of plain facts. Data are the facts of the world.
For example, take yourself. You may be 6 ft tall, have black hair and brown eyes. All of this is "data".
The confusion between data and information often arises because information is made out of data. Data can be defined differently in different sectors.
We can perceive this data with our senses, and then the brain can process it. Human beings have used data for as long as we have existed to form knowledge of the world.
* Information
Information is used to expand our knowledge beyond the range of our senses. We can capture data in information, and then move it about so that other people can access it through different mediums.
For example, if we take a picture of a person, the photo is information, while how the person looks is the data.
We can send the picture around through various mediums without moving the person who is in the picture. If we lose the photo, it won't change how the person looks. In this case we lose information, not the data.
Syllabus Topic : The Role of Mathematical Models
1.4 The Role of Mathematical Models
Q. 1.4.1 Write short note on the role of mathematical models. (Ref. Sec. 1.4) (5 Marks)
Mathematical models and algorithms help decision makers to extract information and
knowledge from the data through the means of a business intelligence system.
Data can be graphically represented by histograms, whereas more elaborate analysis
requires development of advanced learning models.
Generally, a business intelligence system is used to promote a scientific and rational approach to managing organizations.
Example : a spreadsheet used to estimate the effects of fluctuations in interest rates helps decision makers to generate a mental representation of the financial flows process.
Classical scientific fields, such as physics, have always resorted to mathematical models for the abstract representation of real systems.
Other areas, such as operations research, have instead made full use of the application of scientific methods and mathematical models to the study of artificial systems, for example public and private organizations.
- The main characteristics of a business intelligence analysis can be summarized schematically as follows :
o First, the objectives of the analysis are identified, along with the performance indicators that are used for evaluating alternative options.
o Then, mathematical models are developed by exploiting the relationships among the system control variables, the parameters and the evaluation metrics.
o Finally, the effects on performance of variations in the control variables and changes in the parameters are determined.
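The three steps above, together with the interest-rate spreadsheet example, can be sketched as a tiny what-if analysis. Everything in it is an assumption made for illustration: the performance indicator (yearly interest paid), the simple-interest model, the loan amount and the candidate rates.

```python
def annual_interest_cost(principal, rate):
    """Performance indicator: interest paid in one year (simple-interest model)."""
    return principal * rate


def what_if(principal, rates):
    """Evaluate the indicator for each candidate value of the interest rate."""
    return {rate: annual_interest_cost(principal, rate) for rate in rates}


# Hypothetical loan of 100,000 evaluated under three interest-rate scenarios.
scenarios = what_if(principal=100_000, rates=[0.05, 0.06, 0.07])
for rate, cost in scenarios.items():
    print(f"rate {rate:.0%}: yearly interest {cost:,.0f}")
```

The model is deliberately trivial; the point is the workflow of fixing an indicator, expressing it as a function of the quantities that vary, and then comparing scenarios.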
Syllabus Topic : Business Intelligence Architectures
1.5 Business Intelligence Architectures
Q. 1.5.1 Draw and explain architecture of Business Intelligence. (Ref. Sec. 1.5) (5 Marks)
Fig. 1.5.1 shows the architecture of a business intelligence system, which consists of three major components : data sources, data warehouses and data marts, and business intelligence methodologies.
[Figure; surviving labels: Operational systems, Multidimensional cubes, Exploratory data analysis, Time series analysis, Data mining, Optimization]
Fig. 1.5.1: A typical business intelligence architecture

[Figure: pyramid; surviving levels, from top to bottom:
Optimization : choosing the best alternative
Data mining : models for learning from data
Data exploration : statistical analysis and visualization
Data warehouse / Data mart : multidimensional cube analysis
Data sources : operational data, documents and external data]
Fig. 1.5.2 : The main components of a business intelligence system
1. Data sources

- It is very important to collect and integrate the data stored in the various primary and secondary sources, which are heterogeneous in origin and type.
- Most of the data comes from operational systems, but the sources may also include unstructured documents, such as emails, and data received from various external sources.
2. Data warehouses and data marts
- ETL stands for Extract, Transform, Load. In an ETL process, data is extracted from the operational systems, transformed, and loaded into a data warehouse.
- The data from the various sources is stored in databases designed to support business intelligence analysis. These databases are called data warehouses and data marts.
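The ETL process described above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not a production design: the in-memory database stands in for a real data warehouse, and the source rows, table name `sales` and column names are invented.

```python
import sqlite3

# Hypothetical records exported from an operational system (values invented).
operational_rows = [
    {"order_id": "1", "amount": " 120.50 ", "region": "west"},
    {"order_id": "2", "amount": "80", "region": "WEST"},
    {"order_id": "3", "amount": "n/a", "region": "east"},  # unusable amount
]


def extract():
    """Extract: read raw rows from the source (here, an in-memory list)."""
    return list(operational_rows)


def transform(rows):
    """Transform: convert types, standardise text, skip rows that cannot be parsed."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"].strip())
        except ValueError:
            continue
        clean.append((int(row["order_id"]), amount, row["region"].strip().upper()))
    return clean


def load(rows, connection):
    """Load: write the cleaned rows into a warehouse table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales (order_id INTEGER, amount REAL, region TEXT)"
    )
    connection.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    connection.commit()


warehouse = sqlite3.connect(":memory:")  # stand-in for a real data warehouse
load(transform(extract()), warehouse)
totals = warehouse.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"
).fetchall()
print(totals)
```

Once the data is loaded in a consistent form, an analysis query such as the regional total becomes a single statement.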
* Business intelligence methodologies
- Methodologies provide a best-practice framework for delivering successful business intelligence and data warehouse projects.
- The data is extracted to provide input to mathematical models and analysis methodologies that support decision makers, such as :
1. Time series analysis;
2. Inductive learning models for data mining;
3. Optimization models.
- Fig. 1.5.2 shows the pyramid of components of a business intelligence system. The components of the first two levels have already been discussed with Fig. 1.5.1.
The upper tiers are described below :
3. Data exploration
The third level is data exploration. Data exploration is an informative search which is used by data consumers to form a true analysis from the information collected. It is about describing the data by means of statistical and visualization techniques.
We explore data in order to bring important aspects of that data into focus for further analysis. Often, data is gathered in large bulks in a non-rigid, uncontrolled manner.
For true analysis, this unorganized bulk of data needs to be narrowed down. This is where data exploration is used to examine the data and extract the information needed for further analysis.
Data often converges in a central store called a data warehouse. This data can come from various sources in various formats.
Relevant data is needed for tasks such as statistical reporting, trend spotting and pattern spotting. Data exploration is the process of gathering such relevant data.
4. Data mining
The fourth level is data mining. The data mining technique has to be chosen based on the type of business and the type of problem the business faces.

A generalized approach has to be used to improve the accuracy and cost effectiveness of using data mining techniques.
5. Optimization
One level higher are the optimization models, which allow us to select the best solution among all the alternatives.
6. Decisions
The topmost level of the pyramid is the decision, where the best alternative is selected in the decision-making process.
When a business intelligence methodology is successfully adopted, it helps in making decisions.
1.5.1 Cycle of a Business Intelligence Analysis
Q. 1.5.2 Draw and explain Cycle of Business Intelligence Analysis. (Ref. Sec. 1.5.1) (5 Marks)
- Fig. 1.5.4 shows the cycle of a business intelligence analysis. It is an ideal path which characterizes the evolution of a business intelligence analysis.
[Figure; surviving labels: Customers, Suppliers, Business intelligence]
Fig. 1.5.3 : Departments of an enterprise concerned with business intelligence systems
[Figure; surviving labels: Evaluation, Decision]
Fig. 1.5.4 : Cycle of a business intelligence analysis
1. Analysis
- In this phase we identify the problem and understand which factors are critical for making the decision. Analysis is very important in order to proceed to the next step.
- This phase helps us to take the most suitable decision.

2. Insight
- This phase helps us to understand the problem properly.
- For example, if the first phase shows that many customers want to discontinue an insurance policy after its validity expires, the second phase identifies the characteristics shared by those customers.
- In this phase, the information obtained through the analysis phase is examined further. Insight Assessment
specializes in full service solutions for measuring learning outcomes.
We provide world class test instruments supported by high quality customer service to
higher education institutions worldwide.
At each phase of the assessment process, we offer the instrumentation, data gathering
capacity and report options to guide you to your goal of demonstrating institutional
effectiveness.
3. Decision
- This is the third phase, where decision makers take the decision. The availability of BI supports the analysis and insight phases, so that decisions can be taken faster.
- This is an important phase because it determines the overall time for execution.
4. Evaluation
- This is the final phase of the cycle, which involves performance measurement and evaluation.
1.5.2 Development of a Business Intelligence System
Q. 1.5.3 Draw and explain phases of Business Intelligence. (Ref. Sec. 1.5.2) ( Marks)
1. Analysis
- This step is about analyzing the performance of the software at various stages and making notes on additional requirements. Analysis must be completed before proceeding to the next step, and the needs of the organization should be identified properly.
- This phase consists of interviews with knowledge workers who perform various roles in the organization. We also need to assess the costs and benefits of developing the business intelligence system.
2. Design
- Once the analysis is complete, the design step takes over, which is basically building the architecture of the project.
- This step helps remove possible flaws by setting a standard and attempting to stick to it. It is very important to assess the existing information.
Fig. 1.5.5 : Phases in the development of a business intelligence system
3. Planning
- The main purpose of the planning phase is to know the requirements and understand the opportunities. Here we need to find out the cost, time and benefits of the system.
- What is the scope of the system? What will be the problems, and what are the solutions for them?
- Without a sound plan that weighs the strengths and weaknesses of the project, development of software is meaningless. Good planning gets a project off to a smooth start and affects its progress positively.
4. Implementation and control
- The actual task of developing the software starts here, with data recording going on in the background. Once the software is developed, the implementation stage follows, in which the product goes through a pilot study to see whether it is functioning properly.
- A metadata archive should be created, and ETL procedures are used. Finally, the system can be released for testing and use.
Fig. 1.5.6 : Portfolio of available methodologies in a business intelligence system (legible labels: Multidimensional cubes, Relational marketing, Clickstream analysis, Campaigns optimization, Sales force planning, Revenue management, Supply chain optimization, Optimization, Time series analysis, Risk analysis, Data envelopment analysis, Balanced scorecard)
Syllabus Topic : Ethics and Business Intelligence
1.6 Ethics and Business Intelligence

Q. 1.6.1 What are the ethics of Business Intelligence? (Ref. Sec. 1.6) (6 Marks)
- Ethics in Business Intelligence (BI) refers to the ethical principles of conduct that govern an individual in the workplace or a company in general. It is also known as professional ethics and is not to be confused with other forms of philosophical ethics, including religious or popular conviction.
- According to Griffin (1986), professional ethics means that profit is no longer the only important goal of a business. Companies are also increasingly concerned with, and motivated by, doing what is right.
- Companies must acknowledge that they have a duty to the common good: to protect their local community, improve employee relations and promote the release of information to the public. Back in 1986 Griffin was directing his argument towards ethics in accounting, but it holds true today in Business Intelligence as well.
- Government regulations are not changing fast enough to cover all the changes in technology that bombard users on a day-to-day basis. It is up to corporations to create a code of ethics and to remain receptive to the needs of the public being served.
- Every day, BI management professionals may be at risk of acting unethically in their decisions regarding consumer, business and/or employee data. Ethics is a touchy subject; there is always going to be controversy over how companies choose to handle business decisions.
- There is no single right answer when it comes to ethical decisions. While sometimes a decision may involve illegal practices, at other times it is just a decision that needs to be made in a company to promote a better way of life for all.
- An example of an ethical decision would be a manager of a BI system that chooses to use
cheaper data in his/her data mining activities to save money. The data he/she chooses to
implement involves personal credit score reports.
— The cheaper data sets have a 20% possibility of being incorrect. The manager did not see
it as being an unethical decision when it was made, just a way to continue to generate
close-to-accurate reports and save money.
- The decision may affect 20% of the company's customers, as more people are turned down for credit because of inaccurate reports. It is not a crime to have implemented the inaccurate data sets, but it may seem an unethical practice to others.
- While it is important for managers to be able to make their own decisions, this example
decision being made should have involved more managers since it affected the whole
business.
- The manager's choice could bankrupt the company as users start to take their business to more accurate competitors. As the example points out, sometimes there is no really clear answer as to whether an issue involves an ethical or a legal choice, and each situation can be different.
- Trying to make decisions based on individuals’ beliefs when dealing with a company can
amount to intellectual stalls and trying to come to a decision can be expensive and time
consuming.
- Today's society has come to the point where there are more solutions to problems than ever before. What once was impossible can now be accomplished through the use of BI and other technology similar to BI. It is not going to stop; technology is going to keep advancing. What seems improbable now may be common in the near future.
- Because of business globalization, there is also a larger separation between companies
and customers, companies and competitors than there was when everything was done
locally in the past.
- Larger separation between companies and the consumer has resulted in unethical and
sometimes illegal business decisions like data theft.
- Because of all the technology used in big businesses, and the resulting exposure of unethical practices by some larger corporations such as Enron, large companies are increasingly anxious to be free of unethical practices.
- Additionally, the general trust level of users has eroded to the point where trust really has to be earned. Users are very aware of cases of identity information being lost to theft, as well as other cases reported in the media.
- Users have adopted a "show me" or "prove it to me" attitude: companies must show that they are safe and that users' information is safe, or users will not do business with them.
Syllabus Topic : Decision Support Systems
1.7 Introduction to Decision Support Systems
Q. 1.7.1 Write short note on Decision Support System. (Ref. Sec. 1.7) (6 Marks)
- A Decision Support System (DSS) is a computer program application which analyzes business data and presents it so that users can make business decisions more easily.
- A DSS allows users to compile information which can be used to solve problems and make better decisions.
- The advantages of a decision support system include more informed decision-making, timely problem-solving and improved efficiency in dealing with problems with rapidly changing variables.
Syllabus Topic : Definition of System
1.8 Definition of System
Q. 1.8.1 Explain system with neat diagram. (Ref. Sec. 1.8) (5 Marks)
The term system is widely used in everyday language: for example, we refer to the solar system, the nervous system or the judiciary system.
All these systems share a common characteristic, which can be used for an abstract definition of the notion of system: each of them is made up of a collection of components which are in some way connected to each other so as to produce a single collective result and serve a common purpose.
Every system is characterized by boundaries that separate its internal components from the external environment. A system is called open if its boundaries can be crossed in both directions by flows of materials and information.
When such flows are lacking, the system is known as closed. In general, any system receives specific input flows, carries out an internal transformation process and generates observable output flows.
This definition of the system can be used to describe a broad class of real-world phenomena.
Fig. 1.8.1 shows a structure for describing the concept of a system. The system receives a group of input flows and returns a group of output flows through a transformation process which is regulated by internal and external conditions.
Measurable performance indicators are used to assess the effectiveness and efficiency of the system. They can be classified into different categories.
Fig. 1.8.1 shows the main types of metrics used to evaluate systems embedded within enterprises and the public administration.
A system may use a feedback mechanism. Feedback occurs when a system component generates an output flow that is fed back into the system itself as an input flow, possibly after a further transformation.
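The input, transformation, output and feedback flows described above can be illustrated with a minimal Python sketch. The function names (`run_closed_cycle`, `campaign`, `reinvest`) and all the numbers are assumptions made only for illustration; they do not come from the text.

```python
def run_closed_cycle(transform, feedback, initial_input, cycles):
    """Run a system for several cycles, feeding each output back as input."""
    outputs = []
    current_input = initial_input
    for _ in range(cycles):
        output = transform(current_input)      # internal transformation process
        outputs.append(output)                 # observable output flow
        current_input = feedback(current_input, output)  # output fed back as input
    return outputs


# Hypothetical marketing example: budget is the input, sales are the output.
def campaign(budget):
    return budget * 3          # assumed: each unit of budget yields 3 units of sales


def reinvest(budget, sales):
    return budget + sales // 10  # assumed: a tenth of the sales is added to the next budget


sales_history = run_closed_cycle(campaign, reinvest, initial_input=100, cycles=3)
print(sales_history)
```

Removing the `feedback` step (always passing the same input) would turn the sketch into a system whose outputs never influence later inputs.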
Fig. 1.8.1 : Abstract representation of a system (legible labels: External conditions; Input: materials, services, information; System: transformation process; Internal conditions; Output: products, services, information; System performances: overall cost, risk, profitability, dependability)
- Systems which modify their output flows depending upon feedback are known as closed cycle systems. For example, the closed cycle system shown in Fig. 1.8.2 describes the development of a sequence of marketing campaigns.
Fig. 1.8.2 : A closed cycle marketing system with feedback effects
- The sales results of each campaign are collected and used as feedback input to design subsequent marketing promotions, so that decisions can be made and the system improved.
- Evaluating a system is very important for the decision-making process. For this purpose we can use two main evaluation metrics, as follows.
Effectiveness
- Effectiveness means whether or not we are achieving the desired outcome. In other words, being effective means doing the right thing.
Efficiency
- Efficiency means whether or not what we are producing or performing is done in the best possible way.
- Effectiveness metrics show whether the right action is being taken, whereas efficiency metrics are used to check whether the action is being carried out in the best possible way.
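The difference between the two metrics can be made concrete with a small Python sketch. The formulas below (achieved outcome divided by target, and output divided by resources spent) are one common way to quantify the two ideas and are an assumption here, since the text defines them only in words; the campaign figures are hypothetical.

```python
def effectiveness(achieved, target):
    """Share of the desired outcome actually achieved (1.0 means target met)."""
    if target <= 0:
        raise ValueError("target must be positive")
    return achieved / target


def efficiency(output_units, input_cost):
    """Output obtained per unit of resource spent."""
    if input_cost <= 0:
        raise ValueError("input_cost must be positive")
    return output_units / input_cost


# Hypothetical campaign: 450 sales against a target of 500, at a cost of 90.
campaign_effectiveness = effectiveness(achieved=450, target=500)
campaign_efficiency = efficiency(output_units=450, input_cost=90)
print(campaign_effectiveness, campaign_efficiency)
```

A campaign can score well on one metric and badly on the other: reaching the target at a very high cost is effective but not efficient.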
Syllabus Topic : Representation of the Decision-Making Process
1.9 Representation of the Decision-Making Process
- To build effective DSSs, we first need to describe in general terms how a decision-making process is structured.
- We wish to understand the steps that lead individuals to make decisions and the extent of the influence exerted on them by the subjective attitudes of the decision makers and the specific context within which decisions are taken.
1.9.1 Rationality and Problem Solving

Q. 1.9.1 Explain process of problem solving. (Ref. Sec. 1.9.1) (5 Marks)
- A decision is made by selecting the best among several alternatives. Decisions are very important in both personal and professional life.
- They play a vital role in achieving a desired goal. Here we focus on decisions made by enterprises and organizations, which can be public or private.
- Such decisions are used in developing strategic plans. The decision-making process is part of problem solving, in which individuals try to bridge the gap between the system's current operating conditions and the better conditions to be achieved in the future.
- In other words, the transition of a system towards the desired state implies overcoming certain obstacles and is not easy to attain. This forces decision makers to devise a set of alternative options for reaching the required goal, and then to make a decision based on a comparison between the merits and demerits of each alternative.
- The decision selected should therefore be put into practice and then checked to see whether it has enabled the planned objectives to be achieved. When this fails, the problem is reconsidered, according to a recursive logic.
Fig. 1.9.1 shows the process of problem solving. The alternatives represent the possible actions aimed at solving the given problem and helping to achieve the planned objective.
Sometimes the number of available alternatives is small. When deciding whether to grant a loan to an applicant, there are only two alternatives available: approve or reject.
In other cases there can be many alternatives, from which we need to select the best.
Fig. 1.9.1 : Process of problem-solving (legible label: Environment)
Criteria are used to measure the effectiveness of the various options and correspond to the different kinds of system performance shown in Fig. 1.8.1. A rational approach to decision making selects the best alternative among all the alternatives.
Apart from economic criteria, which tend to prevail in the decision-making process within companies, it is possible to identify other factors influencing a rational choice.
Factors influencing a rational choice
1. Economic
2. Technical
3. Legal
4. Ethical
5. Procedural
6. Political
Fig. 1.9.2 : Factors influencing a rational choice
1. Economic
Economic factors are the most important and influential in making decisions, which typically aim at reducing expenses and increasing profits.
For example, an annual logistic plan may be preferred over alternative plans because it reduces cost and increases profit.
2. Technical
Alternatives which are not technically feasible should be rejected.
For instance, a production plan which exceeds the maximum capacity of a plant cannot be regarded as a feasible option.
3. Legal
The decision maker should verify that an alternative is compatible with the legislation in force within the application domain.
4. Ethical
The decision maker should follow the ethical principles and social rules related to the system.
5. Procedural
A decision can be considered ideal from an economic, legal and social standpoint, but it can be unworkable due to cultural limitations of the organization in terms of prevailing procedures and common practice.
6. Political
The decision maker should assess the political consequences of a specific decision among individuals, departments and organizations.
The process of evaluating the alternatives can be divided into two main phases, exclusion and evaluation, as shown in Fig. 1.9.3.
In the first phase, exclusion, the rules and restrictions that apply to the alternatives are checked. In this process some alternatives are rejected from consideration; the others are feasible options which go on to evaluation. In the second phase, the feasible alternatives are compared based on their performance.
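The two phases can be sketched in Python as a filter followed by a ranking. This is only an illustration under stated assumptions: the `Alternative` fields, the capacity and legality constraints, and the use of net profit as the single evaluation criterion are hypothetical choices, not taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alternative:
    name: str
    cost: int
    profit: int
    required_capacity: int
    legal: bool


def exclude(alternatives, max_capacity):
    """Exclusion phase: keep only alternatives that satisfy every constraint."""
    return [a for a in alternatives
            if a.legal and a.required_capacity <= max_capacity]


def evaluate(feasible):
    """Evaluation phase: pick the feasible option with the highest net profit."""
    if not feasible:
        return None
    return max(feasible, key=lambda a: a.profit - a.cost)


plans = [
    Alternative("plan_a", cost=40, profit=100, required_capacity=80, legal=True),
    Alternative("plan_b", cost=30, profit=120, required_capacity=150, legal=True),   # exceeds plant capacity
    Alternative("plan_c", cost=20, profit=90, required_capacity=60, legal=True),
    Alternative("plan_d", cost=10, profit=200, required_capacity=50, legal=False),   # fails the legal check
]

feasible = exclude(plans, max_capacity=100)
best = evaluate(feasible)
print([a.name for a in feasible], best.name)
```

Note that `plan_d` has the highest net profit but never reaches the evaluation phase, which is the point of running exclusion first.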
Fig. 1.9.3 : Structure of the decision-making process (legible labels: Alternative options; Exclusion, with constraints: operational, technical, procedural, legal, social, political; Feasible options; Evaluation: profitability, overall cost)
1.9.2 The Decision-Making Process
- A compelling representation of the decision-making process was proposed in the early 1960s and remains today a major methodological reference. The model consists of three stages: intelligence, design and choice.
- Fig. 1.9.4 shows an enhanced version of the original scheme, with two additional stages: implementation and control.
Fig. 1.9.4 : Phases of the decision-making process
1. Intelligence Phase
The first phase of the decision-making process is the intelligence phase. In this phase, decision makers examine reality and then identify problems or opportunities correctly.
This phase is very important in the decision-making process because it is where we try to identify the problems.
For example, the Lean Startup methodology emphasizes the importance of defining the right problem before building anything, whether a product or a business.
Additionally, one of the pillars of Digital Transformation is that organizations should become data-driven.
That means proper usage and implementation of Business Intelligence (BI) systems. BI implementations are considered successful only if there are clear business needs and real benefits are seen from them.
Business Intelligence is not just about data. It should be connected with organizational goals and objectives.
The intelligence phase can last a long time. However, since the decision-making process starts with this phase, it should take as long as is needed to do it properly.
2. Design Phase
The main aim of this phase is to define and construct a model which represent a system. It
is done by properly defining relationships between all collected variables.
Once we validate the model, we define the criteria of choice and search for several
possible solutions for defined problem (opportunity). In this phase we need to predict the
future outcomes for each alternative.
3. Choice Phase
In this phase we actually make the decision by selecting the best alternative. The end product of this phase is a decision.
The decision is made by evaluating and selecting among the alternatives described in the previous step. If we are sure that the decision can actually be carried out, we can move on to the next phase, the implementation phase.
4. Implementation Phase
The decision reached through the previous steps (intelligence, design and choice) is now implemented.
Implementation will not always be successful. Successful implementation provides a solution to the defined problem, but failure returns us to an earlier phase.
The model described is Simon's model which, even today, serves as the basis of most models of the decision-making process. The process describes the series of events that precede final decisions.
It is important to note that, at any point, the decision maker may choose to return to the previous step for additional validation. This model is a concept, a framework of how organizations and managers make decisions.
5. Control Phase
Once all the phases are complete, it is very important to check whether everything is working properly.
This is the final stage of the rational decision-making process, wherein the outcomes of the decision are measured and compared with the predetermined, desired goals.
If there is a discrepancy between the two, the decision maker may restart the process of decision-making by setting new goals.
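The flow between the five phases, including the returns to earlier phases, can be sketched as a small state machine in Python. The `Phase` enum and `next_phase` function are hypothetical names, and stepping back exactly one phase on failure is a simplifying assumption; the text only says that failure leads back to an earlier phase.

```python
from enum import Enum


class Phase(Enum):
    INTELLIGENCE = 1
    DESIGN = 2
    CHOICE = 3
    IMPLEMENTATION = 4
    CONTROL = 5


def next_phase(current, succeeded):
    """Return the phase to move to; None means the process is complete."""
    if not succeeded:
        if current is Phase.CONTROL:
            # Outcomes differ from goals: restart the process with new goals.
            return Phase.INTELLIGENCE
        # Assumed rule: go back one phase (INTELLIGENCE is simply repeated).
        return Phase(max(current.value - 1, 1))
    if current is Phase.CONTROL:
        return None
    return Phase(current.value + 1)


print(next_phase(Phase.CHOICE, succeeded=True))
print(next_phase(Phase.IMPLEMENTATION, succeeded=False))
print(next_phase(Phase.CONTROL, succeeded=False))
```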
1.9.3 Types of Decisions

Q. 1.9.3 What are the types of decision? (Ref. Sec. 1.9.3) (6 Marks)
Decision support systems are groups of manual or computer-based tools which help in decision-making.
Decision Support Systems (DSS) are commonly understood to be computerized management information systems designed to help business owners, executives and managers resolve complicated business problems and questions.
Good decision support systems help us perform a wide variety of functions, including cash flow analysis, concept ranking, multistage forecasting, product performance improvement and resource allocation analysis.
Previously regarded as primarily a tool for big companies, DSS has in recent years come to be recognized as a potentially valuable tool for small businesses as well.
The various types of decisions are described as follows :

Fig. 1.9.5 : A taxonomy of decisions (grid of Structured, Semi-structured and Unstructured against Strategic, Tactical and Operational)
1. Structured Decisions
Many analysts categorize decisions according to the degree of structure involved in the decision-making activity. Business analysts describe a structured decision as one in which all three components of a decision (the data, the process and the evaluation) are determined.
Since structured decisions are made regularly in business environments, it makes sense to place a comparatively rigid framework around the decision and the people making it.
Structured decision support systems may simply use a checklist or form to ensure that all necessary data are collected and that no data are missing from the decision-making process.
If the aim is also to support the procedural or process component of the decision, then it is quite possible to develop a program as part of the checklist or form. It is also important to develop computer programs which collect and combine all the data.
When there is a need to make a decision more structured, the support system for that decision is designed to ensure consistency.
Many firms that hire individuals without a great deal of experience provide them with detailed guidelines on their decision-making activities and support them by giving them little flexibility.
One interesting consequence of making a decision more structured is that the liability for inappropriate decisions is shifted from individual decision makers to the larger company or organization.
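A checklist-based structured decision can be sketched in a few lines of Python. The example reuses the loan approve/reject decision mentioned earlier; the field names in `REQUIRED_FIELDS` and the income-multiple rule are hypothetical and serve only to show fixed data, a fixed process and a fixed evaluation.

```python
# Hypothetical checklist of data that must be present before deciding.
REQUIRED_FIELDS = ("applicant_id", "annual_income", "requested_amount")


def missing_fields(application):
    """Data component: list the checklist items that were not collected."""
    return [f for f in REQUIRED_FIELDS if application.get(f) is None]


def decide_loan(application, max_income_multiple=3):
    """Process and evaluation components: apply one fixed rule to complete data."""
    missing = missing_fields(application)
    if missing:
        return "incomplete: " + ", ".join(missing)
    limit = application["annual_income"] * max_income_multiple  # assumed rule
    return "approve" if application["requested_amount"] <= limit else "reject"


print(decide_loan({"applicant_id": 1, "annual_income": 50000, "requested_amount": 120000}))
print(decide_loan({"applicant_id": 2, "requested_amount": 120000}))
```

Because every applicant is run through the same rule, two staff members given the same data reach the same decision, which is the consistency the text refers to.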
2. Unstructured Decisions
Unstructured decisions have the same components as structured decisions: data, process and evaluation. Unstructured decisions are made when the elements of the business environment, i.e. customer expectations, competitor response, cost of securing raw materials, etc., are not completely understood.
Unstructured decision systems typically focus on the individual or the team that will make the decision. These decision makers are usually entrusted with unstructured decisions because of their experience or expertise; it is their individual ability that is of value.
One approach to support systems in this area is to construct a program that simulates the process used by a particular individual. The main aim in supporting unstructured decisions is to understand the role that an individual's experience or expertise plays in the decision and to allow for individual approaches.
3. Semi-Structured Decisions
Decisions of this type are characterized as having some agreement on the data, process and evaluation to be used.
Unstructured and semi-structured decisions can be particularly problematic for small businesses, which often have limited technological or workforce resources. Because of the unstructured or semi-structured nature of these decisions, the limited resources and staff expertise available to a small business executive may not be enough to analyze important decisions appropriately.
4. Strategic decisions
Strategic decisions concern major courses of action that affect the whole or a major part of the business enterprise. They help to achieve the common goals of the enterprise and have long-term implications for it.
They may involve major departures from the practices and procedures followed earlier. Strategic decisions are usually unstructured, so a manager has to apply business judgement, evaluation and intuition to the definition of the problem.
These decisions are based on partial knowledge of environmental factors, which can be uncertain or dynamic. Decisions of this type are taken at the higher levels of management.
5. Tactical decisions
Decisions of this type relate to the implementation of strategic decisions.
They are directed towards developing divisional plans, structuring workflows, establishing distribution channels and acquiring resources such as men, materials and money. These decisions are taken at the middle level of management.
6. Operational decisions
- These decisions relate to the day-to-day operations of the enterprise. They have a short-term horizon, as they are taken repetitively. They do not require business judgement and are based on facts and events.
- Operational decisions are taken at the lower levels of management. As information is needed to help managers take rational, well-informed decisions, information systems need to focus on the process of managerial decision making.
Characteristic | Operational <-> Tactical <-> Strategic
Accuracy | High <-> Low
Level of detail | Detailed <-> Aggregate
Time horizon | Present <-> Future
Frequency of use | High <-> Low
Source | Internal <-> (not legible)
Scope of information | Quantitative <-> (not legible)
Nature of information | (not legible)
Age of information | Present <-> (not legible)
Fig. 1.9.6 : Characteristics of the information in terms of the scope of decisions
- The characteristics of the information useful in a decision-making process change depending upon the scope of the decisions to be supported, and consequently the orientation of a DSS will vary accordingly.
- Fig. 1.9.6 shows the variations in the characteristics of the information as the scope of the decisions changes. The scheme may be used as an assessment tool while designing a DSS.
1.9.4 Approaches to the Decision-Making Process
Q. 1.9.4 What are the approaches of decision making process? (Ref. Sec. 1.9.4) (5 Marks)
Approaches to the Decision-Making Process
1. The Behavioral Approach
2. The Practical Approach
3. The Personal Approach
Fig. 1.9.7 : Approaches to the Decision-Making Process
1. The Behavioral Approach
This approach assumes that decision makers operate with bounded rationality instead of the perfect rationality assumed by the rational approach.
Bounded rationality is the idea that decision makers cannot deal with information about all the aspects and alternatives pertaining to a problem and therefore choose to tackle some meaningful subset of it.
Thus the process is not exhaustive, and the solutions reached are not completely rational or entirely ideal.
Decision makers operating with bounded rationality restrict the inputs to the decision-making process, focus their attention on the two or three most favorable alternatives, process these in great detail and base their decisions on judgment and personal biases as well as logic.
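The contrast between perfect and bounded rationality can be illustrated with a short Python sketch. It models bounded rationality as examining only the first few alternatives and accepting the first one that is good enough; this particular rule, the function names and the scores are assumptions for illustration and are not prescribed by the text.

```python
def optimize(alternatives, score):
    """Perfect rationality: examine every alternative and take the best."""
    return max(alternatives, key=score) if alternatives else None


def satisfice(alternatives, score, aspiration, max_considered=3):
    """Bounded rationality: examine only a few alternatives and accept the
    first whose score reaches the aspiration level; if none does, settle
    for the best of those examined."""
    considered = alternatives[:max_considered]
    if not considered:
        return None
    for alternative in considered:
        if score(alternative) >= aspiration:
            return alternative
    return max(considered, key=score)


# Hypothetical supplier offers with an assumed quality score for each.
offers = {"a": 60, "b": 75, "c": 70, "d": 95}
names = list(offers)

print(optimize(names, offers.get))
print(satisfice(names, offers.get, aspiration=70))
```

The bounded decision maker never looks at offer `d`, so the result is acceptable rather than optimal, at a lower cost in time and information.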
2. The Practical Approach
- This approach combines the steps of the rational approach with the worthwhile features and conditions of the behavioural approach to make decision making in institutions more realistic.
- This approach states that the decision maker should try to go beyond rules of thumb and satisficing limitations and generate as many alternatives as possible within the given time, money and other practicalities of the situation.
- Here, the rational approach provides an analytical framework for making decisions while the behavioural approach provides a moderating influence.
3. The Personal Approach
- The preceding three approaches explicitly explain the processes involved in decision-making.
- However, they do not throw light on how people take decisions when they are nervous, anxious, worried or agitated, whether in organizations or in personal matters.
Syllabus Topic : Evolution of Information Systems
1.10 Evolution of Information Systems
Q. 1.10.1 Write short note on evolution of Information Systems. (Ref. Sec. 1.10) (5 Marks)
An information system is a combination of processes, hardware, trained personnel,
software, infrastructure and standards that are designed to create, modify, store, manage
and distribute information to suggest new business strategies and new products.
It leads to efficient work practices and effective communication to make better decisions
in an organization. There has been a significant evolution of Information System function
over the past few decades.
The evolution of Information System function can be summarized as follows :

Period | System | Function | Who it helps
1950-1960 | Electronic Data Processing, Transaction Processing System | Collects, stores, modifies and retrieves day-to-day transactions of an organization | Helps workers
1960-1970 | Management Information Systems | Pre-specified reports and displays to support business decision-making | Helps middle managers
1970-1980 | Decision Support Systems | Interactive ad-hoc support for the decision-making process | Helps senior managers
1980-1990 | Executive Information Systems | Provide both internal and external information relevant to the strategic goals of the organization | Helps executives
1990-2000 | Knowledge Management Systems | Supports the creation, organization and dissemination of business knowledge | Help available enterprise wide
2000-Present | E-Business | Greater connectivity, higher level of integration across applications | Helps global e-business
Syllabus Topic : Definition of Decision Support System
1.11 Definition of Decision Support System
Q. 1.11.1 Draw structure of DSS and explain. (Ref. Sec. 1.11) (5 Marks)
- A Decision Support System (DSS) is a computer-based application which collects, organizes and analyzes business data to facilitate quality business decision-making for management, operations and planning.
- A well-designed DSS aids decision makers in compiling a variety of data from many sources: raw data, documents, personal knowledge from employees, management and executives, and business models. DSS analysis helps companies to identify and solve problems and make decisions.
Fig. 1.11.1 : Structure of a decision support system

- The Decision Support System consists of the following four components:
1. The database and the management of the database.
2. The model base and the management of the model base.
3. The hardware.
4. The user system interface.
_ 1.11.1 Different Components of the Decision Support System
Components of Decision
Support System
1. Dialogue management
2. Model management
3. Database management
Fig. 1.11.2 : Components of decision support system

> 1. Dialogue management
Consists of the three sub systems; known as the user interface, the dialogue control, the
request translator.
1
ep Business Intelligence (MU-B.Sc.-IT-Sem-VI) 1-29 Business Intelligence & Decision Support Sys. i
- The user interface subsystem controls the physical user interface.
- It manages the appearance of the screen, accepts input from the user and
displays the results.
- The user interface subsystem is also responsible for checking user commands for
correct syntax.
- The dialogue control subsystem is responsible for maintaining the processing
context with the user.
- The request translator translates user commands into actions for the model
management or data management components, and translates their responses into a
form that can be easily understood by the user.
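As a rough illustration of how the three subsystems might cooperate, the following Python sketch passes one user command through a syntax check, a context update and a translation step. All names used here (`UserInterface`, `DialogueControl`, `RequestTranslator`, `handle` and the small command vocabulary in `ROUTES`) are hypothetical and chosen only for illustration; they are not taken from any particular DSS product.

```python
from dataclasses import dataclass, field

# Hypothetical command vocabulary: verb -> component that should handle it.
ROUTES = {"run": "model", "list": "model", "query": "data"}


class UserInterface:
    """Accepts input from the user, checks its syntax and displays results."""

    def parse(self, text: str) -> tuple[str, str]:
        # A valid command is assumed to be "<verb> <argument>".
        parts = text.strip().split(maxsplit=1)
        if len(parts) != 2 or parts[0].lower() not in ROUTES:
            raise ValueError(f"syntax error in command: {text!r}")
        return parts[0].lower(), parts[1]

    def display(self, result: str) -> str:
        return f"RESULT: {result}"


@dataclass
class DialogueControl:
    """Maintains the processing context of the session with the user."""
    history: list = field(default_factory=list)

    def remember(self, verb: str, argument: str) -> None:
        self.history.append((verb, argument))


class RequestTranslator:
    """Turns a checked user command into an action for another component."""

    def translate(self, verb: str, argument: str) -> dict:
        return {"target": ROUTES[verb], "action": verb, "argument": argument}


def handle(text, ui, control, translator):
    """Pass one command through syntax check, context update and translation."""
    verb, argument = ui.parse(text)
    control.remember(verb, argument)
    return translator.translate(verb, argument)


ui, control, translator = UserInterface(), DialogueControl(), RequestTranslator()
action = handle("run forecast", ui, control, translator)
print(action["target"])  # model
```

Keeping the syntax check in the user interface means that the other components only ever receive commands that are already well formed.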
> 2. Model management
- The command processor receives commands from the dialogue management component and
delivers them to either the model base management system or the model execution system.
> 3. Database management
- Helps in the storage of the database.
- Also helps in the manipulation of the database.
- Works under the guidance of either the model management component or the dialogue
management component.
- Helps in the maintenance of the interface with the data sources that are generally external
to the Decision Support System.
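The model management and database management components described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the class names (`ModelBase`, `Database`, `CommandProcessor`), the two actions `"list"` and `"run"`, and the sample table `monthly_sales` are all hypothetical and serve only to show how a command processor might route requests.

```python
class ModelBase:
    """Model base management: keeps the available models by name."""

    def __init__(self):
        self._models = {}

    def register(self, name, func):
        self._models[name] = func

    def names(self):
        return sorted(self._models)

    def get(self, name):
        return self._models[name]


class Database:
    """Database management: stores the data and hands it out on request."""

    def __init__(self):
        self._tables = {}

    def store(self, table, rows):
        self._tables[table] = list(rows)

    def fetch(self, table):
        return list(self._tables[table])


class CommandProcessor:
    """Routes a command to model base management or to model execution."""

    def __init__(self, model_base, database):
        self.model_base = model_base
        self.database = database

    def process(self, action, argument=None, table=None):
        if action == "list":   # request for model base management
            return self.model_base.names()
        if action == "run":    # request for model execution
            model = self.model_base.get(argument)
            # The database works under the guidance of model management here.
            return model(self.database.fetch(table))
        raise ValueError(f"unknown action: {action!r}")


db = Database()
db.store("monthly_sales", [120, 135, 150])
models = ModelBase()
models.register("average", lambda rows: sum(rows) / len(rows))
processor = CommandProcessor(models, db)
print(processor.process("run", "average", table="monthly_sales"))  # 135.0
```

In this sketch the model never reads the stored tables directly; it receives a copy of the rows from the database component, which mirrors the separation between the model base and the database in the text.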
Syllabus Topic : Development of a Decision Support System,
Development of a Model
1.12 Development of a Decision Support System
Q. 1.12.1 What are the phases of DSS ? (Ref. Sec. 1.12) (5 Marks)
Q. 1.12.2 Explain development of model. (Ref. Sec. 1.12) (5 Marks)
- Unlike other software applications, such as information systems and office automation
tools, DSSs are usually not available as standard programs.
— Multidimensional analysis environments have facilitated and standardized the access to
passive business intelligence functions. However, in order to develop most DSSs a
specific project is still required.
— Fig. 1.12.1 shows the major steps involved in the development of a DSS.
Fig. 1.12.1 : Steps in the development of a DSS
> 1. Requirement
In this phase, information is gathered and a report covering all the requirements is prepared.
> 2. Planning
- The main purpose of the planning phase is to know the requirements and understand the
opportunities. In this phase we need to find out the cost, time and benefits of the system,
and to decide the scope of the system.
- What will be the problems and what are the solutions for them? Without a sound plan that
weighs the strengths and weaknesses of the project, development of software is meaningless.
Good planning gets a project off to a sound start and affects its progress positively.
> 3. Analysis
- This step is about analyzing the performance of the software at various stages and making
notes on additional requirements.
- Analysis is very important to proceed further to the next step.
> 4. Design
- Once the analysis is complete, the step of designing takes over, which is basically
building the architecture of the project.
- This step helps remove possible flaws by setting a standard and attempting to stick to it.
> 5. Implementation
- The actual task of developing the software starts here, with data recording going on in the
background.
- Once the software is developed, the stage of implementation comes in, where the product
goes through a pilot study to see if it is functioning properly.
> 6. Testing
- The testing stage assesses the software for errors and documents bugs if there are any.
> 7. Maintenance
- Once the software passes through all the stages without any issues, it enters a
maintenance process in which it is maintained and upgraded from time to time to
adapt to changes.
> 8. Delivery
— Successful project delivery requires the implementation of management systems that will
control changes in the key factors of scope, schedule, budget, resources, and risk to
optimize quality and, therefore, the investment.
- This section offers guidance for the entire team to successfully and effectively optimize
the quality of a high-performance building project.
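The eight phases listed above follow one another in a fixed order. As a small sketch, they can be written as an ordered enumeration in Python; the names `Phase` and `next_phase` are hypothetical and are used only to make the sequence explicit.

```python
from enum import IntEnum


class Phase(IntEnum):
    """DSS development phases in the order given in this section."""
    REQUIREMENT = 1
    PLANNING = 2
    ANALYSIS = 3
    DESIGN = 4
    IMPLEMENTATION = 5
    TESTING = 6
    MAINTENANCE = 7
    DELIVERY = 8


def next_phase(current: Phase):
    """Return the phase that follows `current`, or None after the last one."""
    if current is Phase.DELIVERY:
        return None
    return Phase(current + 1)


print(next_phase(Phase.ANALYSIS).name)  # DESIGN
```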
* Syllabus Topic : Business Intelligence
Q.1 What do you mean by business intelligence ? Write its advantages.
* Syllabus Topic : Effective and Timely Decisions
Q.2 Write short note on Effective and Timely decisions. (Refer Section 1.2) (5 Marks)
* Syllabus Topic : Data, Information and Knowledge
Q.3 What do you mean by data, knowledge and information ? (Refer Section 1.3) (5 Marks)
* Syllabus Topic : The Role of Mathematical Models
Q.4 Write short note on the role of mathematical models. (Refer Section 1.4) (5 Marks)
* Syllabus Topic : Business Intelligence Architectures
Q.5 Draw and explain architecture of Business Intelligence. (Refer Section 1.5) (5 Marks)
Q.6 Draw and explain Cycle of Business Intelligence Analysis. (Refer Section 1.5.1) (5 Marks)
Q.7 Draw and explain phases of Business Intelligence. (Refer Section 1.5.2) (5 Marks)
* Syllabus Topic : Ethics and Business Intelligence
Q.8 What are the ethics of Business Intelligence ? (Refer Section 1.6) (5 Marks)
* Syllabus Topic : Decision Support Systems
Q.9 Write short note on Decision Support System. (Refer Section 1.7) (5 Marks)
* Syllabus Topic : Definition of System
Q.10 Explain system with neat diagram. (Refer Section 1.8) (5 Marks)
* Syllabus Topic : Representation of the Decision-Making Process
Q.11 Explain process of problem solving. (Refer Section 1.9.1) (5 Marks)
Q.12 Explain phases of decision making process. (Refer Section 1.9.2) (5 Marks)
Q.13 What are the types of decision ? (Refer Section 1.9.3) (5 Marks)
Q.14 What are the approaches of decision making process ? (Refer Section 1.9.4) (5 Marks)
* Syllabus Topic : Evolution of Information Systems
Q.15 Write short note on evolution of Information Systems. (Refer Section 1.10) (5 Marks)
* Syllabus Topic : Definition of Decision Support System
Q.16 Draw structure of DSS and explain. (Refer Section 1.11) (5 Marks)
* Syllabus Topic : Development of a Decision Support System
Q.17 What are the phases of DSS ? (Refer Section 1.12) (5 Marks)
Q.18 Explain development of model. (Refer Section 1.12) (5 Marks)
Chapter Ends...
