
Management of Information Systems

Prof. Dr. Christof Weinhardt – Ewa Lux

Institute of Information Systems and Marketing (IISM), Karlsruhe Service Research Institute (KSRI)

KIT – University of the State of Baden-Württemberg and National Research Center of the Helmholtz Association, www.kit.edu

[Figure: the information phases Acquisition, Storing, Transformation, Evaluation and Commercialization, with the labels utilization, economy, technology, law and society]

2 Management of Information Systems – Prof. Christof Weinhardt, Ewa Lux Institute of Information Systems and Marketing (IISM)
Outline of the lecture

Acquisition: Information, Measuring/Observation, Experiments, Forecasting, Simulation, Survey, Interviews
Storing: Databases, SQL, Pivoting, Semantics and Ontologies
Transformation: Basics, Filtering, Regression, Cluster Analysis
Evaluation: Utility Analysis, AHP, Decision Rules, Information Value, PageRank
Marketing: Internet Economics, Digital Goods, Network Effects, Standardization Networks, Pricing, Bundling

Motivation & Problem

• How can terms and data be related to each other and ordered in a useful manner?
• How can relevant data be extracted from aggregated data?
• ...and the other way around: how is data aggregated, filtered and transformed?
• With which methods can we recognize trends and visualize semantic relations between the terms?

Motivation & Problem
Information production

"Print, film, magnetic, and optical storage media produced about 5 exabytes of new information in 2002. Ninety-two percent of the new information was stored on magnetic media, mostly in hard disks."

"[...] five exabytes of information is equivalent in size to the information contained in 37,000 new libraries the size of the Library of Congress book collections."

Exabyte (EB) = 1,000,000,000,000,000,000 bytes = 10^18 bytes

(Study of UC Berkeley)

http://www.sims.berkeley.edu/research/projects/how-much-info-2003/, http://aboutgreen-t.blogspot.de/2010/03/information-overflow.html

Overview: Transformation

Basics
  Classification
  Aggregation
  OLAP

Filtering

Regression

Cluster Analysis

Methods of information classification

Methods to map information: classification and thesaurus, applied manually or automatically

[Figure: excerpt from a decimal classification, showing a broader term (OB) and its narrower terms (UB)]
004 Data processing - Computer science
  004.015 Mathematical principles
  004.019 Human-computer interaction
  004.028 Auxiliary techniques and procedures
  004.1 General works on specific types of computers
    004.11 Digital supercomputers
    004.12 Digital mainframe computers
      004.125 Specific digital mainframe computers
    004.14 Digital minicomputers
      004.145 Specific digital minicomputers

Methods of information classification

"Data, information and knowledge need to be in order"

Classification assigns a heterogeneous set of objects and relations to homogeneous classes of object types; the classes are represented by notations.
Objects are usually terms.
The notation system determines the expressive power of the classification.
Classes are mapped as
  Hierarchy
    Abstraction relation (generic relation), e.g. automobile: passenger car, truck
    Part-whole relation (partitive relation), e.g. automobile: chassis, wheels
  Association relation
Classes mostly fulfill the following criteria: universality, continuity, timeliness
Specific granularity of the class construction
[Stock, 2000]
[Manecke, 2004]

Methods of information classification

[Figure: term hierarchy with the top term A, its subordinate terms B and C, below them D, E, F and at the bottom G, H, I. The hierarchy relations form a monohierarchy or a polyhierarchy; association relations link terms across the hierarchy. Example: generic term "ship" with the species terms "cruise ship" and "cargo ship".]

Terms
Top term: highest term of the relation (root), e.g. A
Bottom term: lowest term of the relation, e.g. G, H, I
Term ladder: terms linked pairwise by a hierarchy relation, e.g. A, B, D
Term row: terms that share the same generic term, e.g. D, E, F
[Stock, 2000]
[Manecke, 2004]

Methods of information classification

Classification systems

DIN 32705: "Tools to structure objects or knowledge about objects."
Use of logical tools to map knowledge units in an adequate manner
Clarification of relations by structuring the knowledge
Usually, complex term combinations are mapped to one class
Complex combinations are fixed ex ante (precombined classification system)
Examples of classification systems:
  Dewey Decimal Classification (universal classification)
  NACE (special classification: general classification of economic activities)
  International Patent Classification (IPC)

Methods of information classification

Dewey Decimal Classification

Universal classification used as a tool for catalogues or as a system for library organization
First published in 1876 in the US
Covers 22,000 classes in the main tables and 8,000 classes in the additional tables (relations)
Contains not only the class terms but also their synonyms

The 10 classes of the top hierarchy level:

000 General
100 Philosophy
200 Religion, Theology
300 Social sciences, Law
400 Languages
500 Mathematics, Science
600 Technology, applied science
700 Art, Game, Sport
800 Literature
900 Geography, Biographies, History

Methods of information classification

Thesaurus (DIN 1463-1):
"an ordered collection of terms and their designations that serves to index, store and retrieve within a documentation domain"
Most important tool for building and querying bibliographic reference databases
Used for:
  Representation of domain knowledge about a specific topic by a term system
  Structuring a data warehouse by topics
  Use of easy, intuitive terms instead of the complex notations of classification systems
Describes a vocabulary that is characterized by:
  terminological control
  relations between the terms

Methods of information classification

Thesaurus – Terminological control I

Terms and their designations are unambiguously related to each other
Synonyms are combined into one class; usually the most common synonym becomes the "descriptor" of the class, all other synonyms are "non-descriptors"
Example: "money" and "currency" are synonyms, and "money" is much more common → "money" is the descriptor, "currency" is a non-descriptor
Only descriptors are used to describe content
Non-descriptors have to be mapped to descriptors by software or manually
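
The mapping of non-descriptors to descriptors by software could look roughly like the following sketch. The dictionary `NON_DESCRIPTORS` and the function names are hypothetical; only the money/currency pair comes from the slide.

```python
# Hypothetical mini-thesaurus: each non-descriptor points to its descriptor.
NON_DESCRIPTORS = {
    "currency": "money",  # example pair from the slide
}


def to_descriptor(term: str) -> str:
    """Return the descriptor for a term; unknown terms map to themselves."""
    key = term.strip().lower()
    return NON_DESCRIPTORS.get(key, key)


def index_terms(terms):
    """Normalize terms to descriptors and drop duplicates, keeping the order."""
    result = []
    for term in terms:
        descriptor = to_descriptor(term)
        if descriptor not in result:
            result.append(descriptor)
    return result


print(index_terms(["Currency", "money", "bank"]))  # ['money', 'bank']
```

Both "Currency" and "money" end up under the same descriptor, so a query for either term finds the same documents.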

Methods of information classification

Thesaurus – Terminological control II

Special marking of homonyms and polysemes
  Insertion of homonym and polyseme qualifiers
  Example homonym: except (excluding) / accept (to receive)
  Example polyseme: head (part of the body above the neck) / head (person in charge of a company)

Relations between the terms:
  Hierarchy or association relations are possible
  Abbreviations are used for the relations

Methods of information classification
Comparison between thesauri and classification systems:

Classifications                              | Thesauri
Systematic, mostly monohierarchical order    | Alphabetical order
Dependent on natural languages               | Intuitive access (synonyms, etc.)
Precoordinated and rigid (not extendable)    | Flexible, can be used postcoordinated
Less expressive                              | More expressive

The documentation languages are increasingly combined
The most useful language is chosen depending on the information retrieval task
In addition: knowledge-based systems such as terminologies and ontologies

[Manecke, 2004]

Information aggregation

Data aggregation with...

• Descriptive statistics
- measure of location (mean)
- measure of dispersion (variance)
- measure of skewness
- measure of kurtosis
• Inductive statistics
- Regression

• SQL query language
- GROUP BY operator
- Aggregate functions: COUNT, SUM, MIN, MAX and AVG
- Problem: common data analysis tools such as histograms can only be produced indirectly

• OQL (Object Query Language)

• XQuery (query language for XML databases specified by the W3C)
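
As a minimal sketch of aggregation with GROUP BY and the functions COUNT, SUM, MIN, MAX and AVG, the following example uses Python's built-in `sqlite3` module. The table `sales`, its columns and all numbers are made up for illustration.

```python
import sqlite3

# Hypothetical sales records: (region, item, amount). All values are made up.
ROWS = [
    ("BW", "PC", 120.0),
    ("BW", "Phone", 80.0),
    ("Bayern", "PC", 200.0),
    ("Bayern", "Phone", 50.0),
    ("Bayern", "Games", 20.0),
]


def aggregate_by_region(rows):
    """Aggregate the amounts per region with GROUP BY and aggregate functions."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (region TEXT, item TEXT, amount REAL)")
    con.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    result = con.execute(
        "SELECT region, COUNT(*), SUM(amount), MIN(amount), MAX(amount), AVG(amount) "
        "FROM sales GROUP BY region"
    ).fetchall()
    con.close()
    return result


for region, count, total, low, high, mean in aggregate_by_region(ROWS):
    print(region, count, total, low, high, mean)
```

Each output row condenses all records of one region into five key figures. A histogram, by contrast, would need an extra binning step before grouping, which illustrates the problem named above.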

Information aggregation: Indexing
Indexing expresses the content of a document in a documentation language (vocabulary, syntax).
Documents are represented and aggregated by metadata in order to make them accessible for retrieval.
Intellectual indexing:
  Manual process for the correct and consistent representation of document content
  Use of a fixed indexing language
  Used especially for prepared texts in non-electronic form
Automatic indexing:
  Prepares documents so that they can be retrieved via index terms in the best possible way [Nohr, 2004]

Intellectual indexing                                      | Automatic indexing
• Can handle the diversity of language and semantics       | • The number of documents is too large to be indexed by hand alone
• Indexing is seen as a process that cannot be formalized  | • Automatic processes achieve at least the same retrieval performance

Keywords / word clouds: wordle.net

Source: http://rt.com/politics/official-word/163912-putin-interview-french-media/

Information aggregation: indicator systems
Key figures can comprise:
  Absolute numbers (sum, mean, descriptive statistics, e.g. transportation time, project costs)
  Ratios
    Ratios with dimensions, e.g. salary per person
    Dimensionless ratios:
      Structure ratios, e.g. equity / total capital
      Index numbers (e.g. user index, stock market index)

Indicators are bound to a purpose
  Indicators for the planning, management and control of the company as a whole
  Indicators for the planning, management and control of individual subsections

Indicators are neutral
  They gain significance only in combination with the company goal and in comparisons between companies

Examples for indicator systems

[Figure: two point patterns, labeled with the values 0.2455 and 4.1493]

Variance of the Euclidean distances to the nearest neighbor as an indicator for the uniform distribution of the items
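
The indicator can be computed along the following lines. This is a sketch under assumptions: the point sets are invented, and the population variance is used because the slide does not say which variance definition produced its values.

```python
import math
import statistics


def nearest_neighbor_variance(points):
    """Population variance of each point's Euclidean distance to its nearest neighbor."""
    distances = []
    for i, p in enumerate(points):
        nearest = min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        distances.append(nearest)
    return statistics.pvariance(distances)


# Invented example data: a regular grid and an irregular, clustered pattern.
grid = [(x, y) for x in range(3) for y in range(3)]
clustered = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 14), (20, 0)]

print(nearest_neighbor_variance(grid))       # 0.0, every neighbor is 1 unit away
print(nearest_neighbor_variance(clustered))  # clearly larger than 0
```

A small value indicates evenly spaced items; a large value indicates clusters and gaps.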

Information aggregation: indicator systems

Tasks of indicator systems:

Recognize the most important things (aggregates) as fast as possible,
see relationships,
carry out internal and external comparisons,
be goal-oriented.

Internal comparison

Comparison over time: a company's figures from different points in time are compared
Target-actual comparison: a company's figures are compared with the target figures
Standard-target comparison: target figures from different sections are compared

[Siegwart, 2002]

Information aggregation: Fuzzy Methods
Qualitative evaluation of quantities, for example by means of fuzzy classification
A certain amount of expert knowledge is necessary to transform data into linguistic terms
Sharp aggregation of fuzzy data, e.g. evaluation on the basis of secondary data
Fuzzy aggregation of sharp data, e.g. evaluation of complex criteria based on several measurement results
Fuzzy aggregation of fuzzy data, e.g. evaluation of heterogeneous groups with fuzzy processes

[Figure: fuzzy sets for the linguistic terms cold, warm and hot]
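
The transformation of a sharp measurement into the linguistic terms cold, warm and hot might be sketched as follows. The trapezoidal membership functions and all temperature breakpoints (in °C) are assumptions chosen for illustration; the slide does not specify them.

```python
INF = float("inf")


def trapezoid(x, a, b, c, d):
    """Membership that rises from a to b, is 1 between b and c, and falls from c to d."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


# Assumed breakpoints; in practice they would come from expert knowledge.
TERMS = {
    "cold": (-INF, -INF, 10.0, 20.0),
    "warm": (10.0, 20.0, 25.0, 30.0),
    "hot": (25.0, 30.0, INF, INF),
}


def fuzzify(temperature):
    """Return the degree of membership of a temperature in each linguistic term."""
    return {name: trapezoid(temperature, *params) for name, params in TERMS.items()}


print(fuzzify(15.0))  # {'cold': 0.5, 'warm': 0.5, 'hot': 0.0}
```

A temperature of 15 °C is thus partly cold and partly warm, instead of being forced into exactly one class.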

Online Analytical Processing

Data evaluation and analysis: Online Analytical Processing (OLAP)

Software tools for the more complex analysis of multidimensional data

OLAP systems offer:
  Convenient and interactive access to company data
  Views from several dimensions
  Visualization, reduction and analysis
  Access to heterogeneous, distributed data sources
  Multiple report functions
  Arbitrary dimension and aggregation levels

Online Analytical Processing

Multidimensional data structure

Business information is generally multidimensional

Dimensions:
  Sales
  Customer groups
  Regions
  Costs
  ...

Typical questions:
  Which salesperson has the highest sales in the region Baden-Württemberg?
  Which product was demanded the most by customer group A in winter 2008?

The dimensions define a "hypercube"

Online Analytical Processing

Hypercube

Dimensions are orthogonal

The number of dimensions is arbitrary

Dimensions are defined by a set of attributes (hierarchically structured)

Analysis by operations
  Scaling ("Roll-Up", "Drill-Down")
  Cutting ("Slice/Dice")
  Rotating ("Pivot")

Online Analytical Processing

Operations in OLAP: Roll-Up, Drill-Down, Slice/Dice, Pivoting

[Figure: data cube with the dimensions Location (Stuttgart, Karlsruhe, München, Augsburg), Time (Q.1 to Q.4) and Item (Games, PC, Phone, Security)]

Roll-Up: Location is aggregated from cities to states (BW, Bayern); Time (Q.1 to Q.4) and Item (Games, PC, Phone, Security) stay unchanged
Drill-Down: Time is refined from quarters to months (January to December) for the locations Stuttgart, Karlsruhe, München and Augsburg
Slice/Dice: a sub-cube is cut out, with Location (BW, Bayern), Time (Q.1, Q.2) and Item (PC, Other)
Pivoting: the cube is rotated, so that Location (Augsburg, München, Karlsruhe, Stuttgart) and Item swap axes while Time (Q.1 to Q.4) stays in place
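
The four OLAP operations can be imitated on a tiny cube held in a Python dictionary. This is only a sketch: the sales numbers are invented, the function names are hypothetical, and a real OLAP system would work on a data warehouse rather than in memory.

```python
from collections import defaultdict

# Hypothetical fact table: (city, quarter, item) -> sales. Numbers are made up.
FACTS = {
    ("Stuttgart", "Q1", "PC"): 10,
    ("Karlsruhe", "Q1", "PC"): 5,
    ("München", "Q1", "PC"): 8,
    ("Augsburg", "Q2", "PC"): 3,
    ("Stuttgart", "Q2", "Games"): 4,
    ("München", "Q2", "Phone"): 7,
}
# Hierarchy of the location dimension: city -> state.
STATE = {"Stuttgart": "BW", "Karlsruhe": "BW", "München": "Bayern", "Augsburg": "Bayern"}


def roll_up(facts, mapping):
    """Aggregate the location dimension to a coarser level (city -> state)."""
    out = defaultdict(int)
    for (city, quarter, item), value in facts.items():
        out[(mapping[city], quarter, item)] += value
    return dict(out)


def slice_cube(facts, quarter):
    """Fix one value of the time dimension and keep the remaining 2D table."""
    return {(loc, item): v for (loc, q, item), v in facts.items() if q == quarter}


def dice(facts, locations, quarters, items):
    """Keep the sub-cube spanned by the selected members of each dimension."""
    return {k: v for k, v in facts.items()
            if k[0] in locations and k[1] in quarters and k[2] in items}


def pivot(table):
    """Swap the two axes of a 2D table."""
    return {(b, a): v for (a, b), v in table.items()}


by_state = roll_up(FACTS, STATE)
print(by_state[("BW", "Q1", "PC")])        # 15
print(slice_cube(by_state, "Q1"))          # {('BW', 'PC'): 15, ('Bayern', 'PC'): 8}
print(pivot(slice_cube(by_state, "Q1")))   # {('PC', 'BW'): 15, ('PC', 'Bayern'): 8}
```

Drill-down is the inverse of roll-up: it returns from `by_state` to the finer-grained `FACTS`, which is only possible because the detailed data is still stored.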

Time series analysis: Filtering

Extraction of the relevant data from a time series

Trends (periodic or long term) are known
  Data should be adjusted for these (seasonal) trends (e.g. orders in the construction sector) to see "true" changes
Trends (periodic or long term) are unknown
  Data are obscured by strong fluctuations; trends should be recognized by filtering/smoothing the data

Elimination of disturbing side effects by
  Linear filters
  Non-linear filters

Time series analysis: Filtering

Parametric model: regression, ...
Parameter-free model: filter, smoothing, noise removal, ...

Time series analysis: Filtering

Linear filter (in contrast to: non-linear filter)

Output of the filter depends linearly on the original signal
Dependent on the timing of the filtering
Filter with one specification for the whole time series
Example: moving average filter

Time series analysis: an example (h=1)

x(t):  5     8     10    9      14     17     11     16
f(t):  -     7.67  9.00  11.00  13.33  14.00  14.67  -

Simple filter: moving average

$$f(t) = \frac{1}{2h+1} \sum_{i=t-h}^{t+h} x(i)$$

Problem: handling of the fringe
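
The example values can be reproduced with a few lines of Python. This is a minimal sketch; the function name is hypothetical, and fringe positions without a full window are simply left empty (`None`).

```python
def moving_average(x, h=1):
    """Return f(t) for every t with a full window; fringe positions are None."""
    n = len(x)
    out = [None] * n
    for t in range(h, n - h):
        window = x[t - h : t + h + 1]  # the 2h + 1 values around position t
        out[t] = sum(window) / (2 * h + 1)
    return out


signal = [5, 8, 10, 9, 14, 17, 11, 16]
print([None if v is None else round(v, 2) for v in moving_average(signal)])
# [None, 7.67, 9.0, 11.0, 13.33, 14.0, 14.67, None]
```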

Time series analysis: the fringe (h=1)

Signal: x1 = 2, x2 = 5, x3 = 3, x4 = 9

Possibilities to handle the fringe:

Calculate an average: x0 = (2+5+3)/3, x5 = (5+3+9)/3
Set a fixed value: x0 = 0, x5 = 0
Approximation (e.g. polynomial): x0 = -19, x5 = 36
Reduce the filter area: f(1) = (2+5)/2, f(2) = (2+5+3)/3, f(3) = (5+3+9)/3, f(4) = (3+9)/2
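
The fringe strategies can be compared directly on the example signal. The sketch below covers h=1 only for the padding variants; the function names are hypothetical, and the extrapolated values -19 and 36 are taken from the slide rather than computed.

```python
def smooth_with_padding(x, left, right):
    """Moving average (h=1) after extending the signal by one value on each side."""
    padded = [left] + list(x) + [right]
    return [sum(padded[t - 1 : t + 2]) / 3 for t in range(1, len(padded) - 1)]


def smooth_shrinking(x, h=1):
    """Moving average that shortens the window where it would leave the signal."""
    out = []
    for t in range(len(x)):
        window = x[max(0, t - h) : t + h + 1]
        out.append(sum(window) / len(window))
    return out


signal = [2, 5, 3, 9]
mean_left = sum(signal[:3]) / 3    # x0 for the option "calculate an average"
mean_right = sum(signal[-3:]) / 3  # x5 for the same option

print(smooth_with_padding(signal, mean_left, mean_right))  # averaged fringe values
print(smooth_with_padding(signal, 0, 0))                   # fixed value 0
print(smooth_with_padding(signal, -19, 36))                # extrapolated values from the slide
print(smooth_shrinking(signal))                            # reduced filter area
```

The inner values f(2) and f(3) are identical in all variants; only the first and last filter values depend on the chosen strategy, and the polynomial extrapolation changes them the most.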

Time series analysis: kernel smoothing

General, parametric, simple filter
Weighting of the signal with the weight function w(i) (= kernel)

Discrete:
$$f(t) = \sum_{i=-h}^{h} x(t-i)\, w(i), \qquad \sum_{i=-h}^{h} w(i) = 1$$

Continuous:
$$f(t) = \int_{-h}^{h} x(t-i)\, w(i)\, di, \qquad \int_{-h}^{h} w(i)\, di = 1$$

f(t): filter value at point t
x(t): (initial) signal at point t
w(i): weight of the signal value depending on the distance i
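
A discrete kernel smoother can be sketched as follows. The kernel shapes are evaluated at the scaled distance i/(h+1) and then normalized so that the weights sum to 1; this scaling, the choice of three standard deviations at the window edge for the Gaussian kernel, and all function names are assumptions for illustration.

```python
import math


def uniform(u):
    return 1.0                      # equal weights inside the window


def triangle(u):
    return 1.0 - abs(u)             # weight falls linearly with the distance


def epanechnikov(u):
    return 1.0 - u * u              # parabolic weights


def gauss(u):
    return math.exp(-0.5 * (3.0 * u) ** 2)  # assumed: window edge at 3 sigma


def kernel_weights(shape, h):
    """Discrete weights w(-h), ..., w(h), normalized so that they sum to 1."""
    raw = [shape(i / (h + 1)) for i in range(-h, h + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def kernel_smooth(x, weights):
    """f(t) = sum over i of x(t - i) * w(i); fringe positions are returned as None."""
    h = len(weights) // 2
    out = [None] * len(x)
    for t in range(h, len(x) - h):
        out[t] = sum(x[t - i] * weights[i + h] for i in range(-h, h + 1))
    return out


signal = [5, 8, 10, 9, 14, 17, 11, 16]
print(kernel_weights(triangle, 1))                      # [0.25, 0.5, 0.25]
print(kernel_smooth(signal, kernel_weights(uniform, 1)))
```

With the uniform kernel the result equals the moving average; the other kernels give nearby values more influence than distant ones.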

Time series analysis: kernel smoothing

[Figure: kernel shapes: equal distribution (uniform), triangle filter, Epanechnikov filter, comb filter, Gaussian filter]

Time series analysis: challenges

[Figure: incoming signal and the filtered signal for h=10 and h=50]

Choice of the filter and its parameters (h, …)
Treatment of the fringe
Treatment of discontinuities
Treatment of different (disturbing) frequencies

Time series analysis: Filtering

Non-linear filter (in contrast to: linear filter)

Usually no filter coefficients or closed-form transfer functions
E.g. min(), max(), variance()

Non-linear filter

Non-linear filters can usually not be characterized by filter coefficients or closed-form transfer functions.

Example: median filter, h = 1

Time series x: 1 3 2 4 5 20 4 3 6 7
Filtered f:    _ 2 3 4 5 5  4 4 6 _

[Figure: comparison of the moving average filter (h=1) and the median filter (h=1)]

Other examples: min(), max(), variance(), boundaries
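
The median example can be checked with the following sketch, which also applies a moving average to the same series for comparison. The function names are hypothetical, and fringe positions are left empty (`None`).

```python
import statistics


def median_filter(x, h=1):
    """Replace each value by the median of its window; fringe positions are None."""
    out = [None] * len(x)
    for t in range(h, len(x) - h):
        out[t] = statistics.median(x[t - h : t + h + 1])
    return out


def moving_average(x, h=1):
    """Linear counterpart: mean of the same window."""
    out = [None] * len(x)
    for t in range(h, len(x) - h):
        out[t] = sum(x[t - h : t + h + 1]) / (2 * h + 1)
    return out


series = [1, 3, 2, 4, 5, 20, 4, 3, 6, 7]
print(median_filter(series))   # [None, 2, 3, 4, 5, 5, 4, 4, 6, None]
print(moving_average(series))
```

The median filter removes the outlier 20 completely, whereas the moving average spreads it over the three neighboring filter values.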

Two-dimensional filter: example

Input:
4 3 6 12
3 2 1 8
6 5 3 1
4 6 0 0

Output (kernel w(x, y), here a 3x3 moving average):
33/9 41/9
30/9 26/9

Here: moving average filter → all other kernels are also possible in 2D (e.g. Gaussian blur)
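
The 2D example can be verified with a small sketch that averages every full 3x3 neighborhood. The function name is hypothetical, and exact fractions are used so that the results can be compared with the slide (Python reduces 33/9 to 11/3 and 30/9 to 10/3).

```python
from fractions import Fraction


def box_filter_2d(grid, h=1):
    """Average over each full (2h+1) x (2h+1) neighborhood; no fringe handling."""
    rows, cols = len(grid), len(grid[0])
    size = (2 * h + 1) ** 2
    out = []
    for r in range(h, rows - h):
        out_row = []
        for c in range(h, cols - h):
            total = sum(grid[r + dr][c + dc]
                        for dr in range(-h, h + 1)
                        for dc in range(-h, h + 1))
            out_row.append(Fraction(total, size))
        out.append(out_row)
    return out


grid = [
    [4, 3, 6, 12],
    [3, 2, 1, 8],
    [6, 5, 3, 1],
    [4, 6, 0, 0],
]
for row in box_filter_2d(grid):
    print([str(value) for value in row])
# ['11/3', '41/9']
# ['10/3', '26/9']
```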

Filtering & image editing

[Figure: Gaussian blur (Photoshop) with h = 1px, h = 2px and h = 4px]

[Figure: average blur (Excel)]

Two dimensional filter: practical application

Radius h
