SCHEME OF VALUATION FOR M.Tech

Code No: I0505 R16
I M. Tech I Semester Supplementary Examinations, FEB – 2020

DW&DM
Time: 3 hours Max. Marks: 60
1. a) The classic definition of a Data Warehouse is an architecture used to [2M]

maintain critical historical data that has been extracted from operational
data storage and transformed into formats accessible to the
organization’s analytical community. The creation, implementation
and maintenance of a data warehouse requires the active participation
of a large cast of characters, each with his or her own set of skills, but all
functioning as a series of teams within a large team.
An identification of the major roles and responsibilities for managing a
data warehouse environment would normally include these functions.
They are presented in the general order in which they would participate
in the warehouse development and implementation.
Three-Tier Data Warehouse Architecture
Generally, a data warehouse adopts a three-tier architecture. The
three tiers of the data warehouse architecture are as follows.
• Bottom Tier − The bottom tier of the architecture is the data
warehouse database server. It is a relational database system.
We use back-end tools and utilities to feed data into the
bottom tier. These back-end tools and utilities perform the
Extract, Clean, Load, and Refresh functions.
• Middle Tier − In the middle tier, we have the OLAP server, which
can be implemented in either of the following ways.
o By Relational OLAP (ROLAP), which is an extended
relational database management system. ROLAP
maps the operations on multidimensional data to
standard relational operations.
o By the Multidimensional OLAP (MOLAP) model, which
directly implements the multidimensional data and
operations.
• Top Tier − This tier is the front-end client layer. It holds
the query tools, reporting tools, analysis tools, and data
mining tools.
The following diagram depicts the three-tier architecture of data
warehouse − [4M]
b) Data Integration is a data preprocessing technique that involves [2M]
combining data from multiple heterogeneous data sources into a
coherent data store and provides a unified view of the data. These
sources may include multiple data cubes, databases or flat files.
The data integration approach is formally defined as a triple <G, S, M>
where,
G stands for the global schema,
S stands for the heterogeneous set of source schemas,
M stands for the mapping between queries over the source schemas and
the global schema.
There are two major approaches to data integration: the “tight
coupling approach” and the “loose coupling approach”.
Tight Coupling: [2M]
• Here, a data warehouse is treated as an information retrieval
component.
• In this coupling, data is combined from different sources into a
single physical location through the process of ETL (Extraction,
Transformation and Loading).
Loose Coupling:
• Here, an interface is provided that takes the query from the user,
transforms it into a form the source databases can understand, and
then sends the query directly to the source databases to obtain
the result.
• The data remains only in the actual source databases.
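As a rough illustration of the tight-coupling approach, the sketch below extracts rows from two sources whose schemas differ, transforms them through a mapping into one global schema, and loads them into a single store. The source layouts, the field names and the `MAPPINGS` table are assumptions made for this example only.

```python
# Two hypothetical heterogeneous sources (S) with different local schemas.
crm_rows = [{"customer_id": 1, "full_name": "Asha Rao"}]
billing_rows = [{"cust_number": 2, "name": "Ravi Kumar"}]

# Hypothetical mapping (M) from each source schema to the global schema (G).
MAPPINGS = {
    "crm": {"customer_id": "cust_id", "full_name": "cust_name"},
    "billing": {"cust_number": "cust_id", "name": "cust_name"},
}


def transform(row, source):
    """Rename the source attributes to their global-schema names."""
    mapping = MAPPINGS[source]
    return {mapping[key]: value for key, value in row.items()}


def etl(sources):
    """Extract every row, transform it, and load it into one list."""
    warehouse = []
    for source, rows in sources.items():
        for row in rows:
            warehouse.append(transform(row, source))
    return warehouse


warehouse = etl({"crm": crm_rows, "billing": billing_rows})
```

After loading, every row in `warehouse` uses the same attribute names, so it can be queried without going back to the sources. A loosely coupled system would instead rewrite each query for the individual sources at query time.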
Issues in Data Integration: [2M]
There are a number of issues to consider during data integration: schema
integration, redundancy, and detection and resolution of data value
conflicts. These are explained briefly below.
1. Schema Integration:
• Integrate metadata from different sources.
• Matching real-world entities from multiple sources is referred
to as the entity identification problem.
For example, how can the data analyst or the computer be sure that
customer id in one database and customer number in another refer
to the same attribute?
2. Redundancy:
• An attribute may be redundant if it can be derived
from another attribute or set of attributes.
• Inconsistencies in attribute naming can also cause redundancies in the
resulting data set.
• Some redundancies can be detected by correlation analysis.
3. Detection and resolution of data value conflicts:
• This is the third important issue in data integration.
• For the same real-world entity, attribute values from different
sources may differ.
• An attribute in one system may be recorded at a lower level of
abstraction than the “same” attribute in another.
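The correlation analysis mentioned under redundancy can be sketched with the Pearson correlation coefficient: a value close to +1 or -1 suggests that one attribute can be derived from the other. The attribute names and values below are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical attributes taken from two different source systems.
annual_salary = [24000.0, 36000.0, 48000.0, 60000.0]
monthly_salary = [2000.0, 3000.0, 4000.0, 5000.0]

r = pearson(annual_salary, monthly_salary)
```

Here the annual figure is twelve times the monthly figure, so the coefficient comes out at 1 and one of the two attributes is a candidate for removal.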
2. a) SNOWFLAKE SCHEMA is a logical arrangement of tables in a [4M]
multidimensional database such that the ER diagram resembles a
snowflake shape. A Snowflake Schema is an extension of a Star Schema,
and it adds additional dimensions. The dimension tables
are normalized, which splits data into additional tables.
Characteristics of Snowflake Schema:
• The main benefit of the snowflake schema is that it uses less disk
space.
• It is easier to implement when a dimension is added to the schema.
• Because of the multiple tables, query performance is reduced.
• The primary challenge of the
snowflake schema is that it requires more
maintenance effort because of the additional lookup tables.
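A minimal sketch of a snowflake layout, using Python's built-in `sqlite3` module, shows why queries need more joins than in a star schema. The table names, column names and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized sub-dimension split out of the product dimension.
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        product_name TEXT,
        category_id INTEGER REFERENCES category(category_id)
    );
    CREATE TABLE sales_fact (
        product_id INTEGER REFERENCES product(product_id),
        amount REAL
    );
    INSERT INTO category VALUES (1, 'Dairy'), (2, 'Bakery');
    INSERT INTO product VALUES (10, 'Milk', 1), (11, 'Butter', 1), (12, 'Bread', 2);
    INSERT INTO sales_fact VALUES (10, 40.0), (11, 60.0), (12, 25.0);
""")

# Reaching the category name takes two joins: fact -> product -> category.
rows = conn.execute("""
    SELECT c.category_name, SUM(f.amount)
    FROM sales_fact AS f
    JOIN product AS p ON p.product_id = f.product_id
    JOIN category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
conn.close()
```

In a star schema the category name would be stored directly in the product dimension, which saves the second join at the cost of repeating the name in every product row.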
b) In data warehouses, data cleaning is a major part of the so-called ETL [4M]
process. Data cleaning, also called data cleansing or scrubbing, deals
with detecting and removing errors and inconsistencies from data in
order to improve the quality of data.
Data warehouse is an information delivery system where we can

integrate and transform data into information used largely for strategic
decision making. The historic data in the enterprise from various
operational systems is collected and is clubbed with other relevant data
from outside sources to make integrated data as content of data
warehouse.
c) Enterprise warehouse:- collects all of the information about subjects [2M]
spanning the entire organization
An Enterprise Data Warehouse (EDW) is a form of corporate

repository that stores and manages all the historical business data of an
enterprise. The information usually comes from different systems like
ERPs, CRMs, physical recordings, and other flat files. To prepare data
for further analysis, it must be placed in a single storage facility. This
way, different business units can query it and analyze information from
multiple angles.
Virtual warehouse:- A set of views over operational databases [2M]

Only some of the possible summary views may be materialized
A virtual data warehouse is a type of EDW used as an alternative to a
classic warehouse. Essentially, these are multiple databases connected
virtually, so they can be queried as a single system.
3. a) Types of Business Analysis Tools [6M]

There are three major categories of different types of tools on the
basis of the above-mentioned functions. In the next section, we
will explain those categories –
1. Requirement-related tools i.e. to describe, manage, and track
requirements
2. Modelling tools
3. Collaboration tools
b) IBM Cognos Business Intelligence is a web based reporting and [6M]

analytic tool. It is used to perform data aggregation and create user
friendly detailed reports. Reports can contain Graphs, Multiple Pages,
Different Tabs and Interactive Prompts. These reports can be viewed on
web browsers, or on hand held devices like tablets and smartphones.
Cognos also provides an option to export a report in XML or PDF
format, or to view the report in XML format. You can also
schedule a report to run in the background at a specific time, which
saves time when viewing a daily report because you do not need to run the
report every time.
IBM Cognos provides a wide range of features and can be considered
enterprise software that provides a flexible reporting environment
for large and medium enterprises. It meets the needs
of Power Users, Analysts, Business Managers and Company
Executives. Power users and analysts want to create ad hoc reports and
multiple views of the same data. Business executives want to
see summarized data in dashboard styles, cross tabs and visualizations.
Cognos supports both options for all sets of users.
Key Features of IBM Cognos
Cognos BI reporting allows you to bring data from multiple
databases into a single set of reports. IBM Cognos provides a wide range
of features compared with other BI tools in the market. You can create
and schedule reports, and complex reports can be designed easily in
the Cognos BI Reporting Tool.
The Cognos BI Reporting Tool allows you to create reports for different sets
of users, such as Power Users, Analysts, and Business Executives. IBM Cognos
can handle a large volume of data and is suitable for medium and large
enterprises to fulfil BI needs.
3-Tier Architecture of Cognos
Cognos BI uses a 3-tier architecture. At the top,
there is a Web Client or a Web Server. The second tier consists of a Web
Application Server, while the bottom tier consists of a Data layer.
These tiers are separated by firewalls, and communication between
them happens using the SOAP and HTTP protocols.
Tier-1 Web Clients

The web client allows BI users to access TM1 data and interact with
data in any of the supported browsers. Tier 1 is responsible for managing
the gateway, which is used to encrypt and decrypt passwords,
extract the information needed to submit a request to the BI server,
authenticate the server, and pass the request to the Cognos BI dispatcher
for processing.
Tier-2 Web Application Server
This tier hosts the Cognos BI server and its associated services.
Application server contains Application Tier Components, Content
Manager and Bootstrap service.
Cognos TM1 Web Application Server runs on the Java-based Apache
Tomcat server. Using this tier, Microsoft Excel worksheets can be
converted to TM1 Web sheets, and web sheets can be exported back
to Excel and PDF formats.
Tier-3 Data
This tier contains content and data sources. It contains the TM1 Admin
server and at least one TM1 server.
The TM1 Admin server can be installed on any computer on your LAN, and
it must reside on the same network as the TM1 server. The version of the TM1
server should be equal to or more recent than the version of Cognos TM1
Web.
4. a) The term Knowledge Discovery in Databases, or KDD for short, refers [2M]
to the broad process of finding knowledge in data, and emphasizes the
"high-level" application of particular data mining methods. The
unifying goal of the KDD process is to extract knowledge from data in
the context of large databases. [4M]
b) Descriptive data summarization helps us study the general [2M]
characteristics of the data and identify the presence of noise or outliers,
which is useful for successful data cleaning and data integration.
 MEAN [4M]
 MODE
 MEDIAN
 CENTRAL TENDENCY
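The three measures of central tendency listed above can be computed with Python's standard `statistics` module. The values are a made-up sample chosen so that the measures differ from one another.

```python
import statistics

# Hypothetical sample, e.g. salaries in thousands.
values = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]

mean = statistics.mean(values)        # arithmetic average
median = statistics.median(values)    # middle value of the sorted data
modes = statistics.multimode(values)  # most frequent value(s)
```

The single large value 110 pulls the mean (58) above the median (54), which is one way such a summary hints at an outlier. The sample has two modes, 52 and 70.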
5. a) Data mining systems can be categorized according to various criteria, as [6M]

follows:
1. Classification according to the application adapted:
This involves domain-specific applications. For example, data
mining systems can be tailored for
telecommunications, finance, stock markets, e-mails and so on.
2. Classification according to the type of techniques utilized:
This is based on the degree of user interaction or the
technique of data analysis involved. For example, machine
learning, visualization, pattern recognition, neural networks,
database-oriented or data-warehouse-oriented techniques.
3. Classification according to the types of knowledge mined:
This is based on functionalities such as characterization,
association, discrimination and correlation, prediction, etc.
4. Classification according to the types of databases mined:
Database systems can be classified by ‘type of data’, ‘use of
data’, or ‘application of data’.
b) I) DBSCAN (Density-Based Spatial Clustering of Applications with [2M]
Noise)
In machine learning and data analytics, clustering methods are useful
tools that help us visualize and understand data better. Relationships
between features, trends and populations in a data set can be
graphically represented via clustering methods like DBSCAN, which can
also be applied to detect outliers in nonparametric distributions in
many dimensions.
DBSCAN is a density-based clustering algorithm. It focuses on finding
neighbours by density (MinPts) within an ‘n-dimensional sphere’ of radius
ɛ. A cluster can be defined as the maximal set of ‘density-connected
points’ in the feature space.
DBSCAN then defines different classes of points:
• Core point: A is a core point if its neighbourhood (defined by ɛ)
contains at least MinPts points.
• Border point: C is a border point if it lies in a cluster and its
neighbourhood contains fewer than MinPts points, but it is
still ‘density-reachable’ from other points in the cluster.
• Outlier: N is an outlier (noise point) if it lies in no cluster and is neither
‘density-reachable’ from nor ‘density-connected’ to any other point.
Such a point is not assigned to any cluster.
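The three classes of points can be sketched as follows. This covers only the labelling step, not the full clustering procedure. It assumes Euclidean distance, counts a point as part of its own neighbourhood, and uses made-up sample points and parameter values.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i], itself included."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def classify_points(points, eps, min_pts):
    """Label each point as 'core', 'border', or 'noise'."""
    neighbours = [region_query(points, i, eps) for i in range(len(points))]
    core = {i for i, nb in enumerate(neighbours) if len(nb) >= min_pts}
    labels = []
    for i, nb in enumerate(neighbours):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            # Not dense itself, but within eps of a core point.
            labels.append("border")
        else:
            labels.append("noise")
    return labels


# Hypothetical 2-D points: a small square, one point hanging off it, one far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0), (8.0, 8.0)]
labels = classify_points(points, eps=1.0, min_pts=3)
```

With these values the four corners of the square are core points, (2.0, 1.0) is a border point because it is within ɛ of the core point (1.0, 1.0), and (8.0, 8.0) is noise.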
Density-based
The distance-based outlier detection method consults the neighbourhood of
an object, which is defined by a given radius. An object is
considered an outlier if its neighbourhood does not contain enough other
points.
The radius is a distance threshold that defines a reasonable
neighbourhood of the object. For each object o, we count the
other objects that lie within this neighbourhood.
Formally, let $r$ ($r > 0$) be a distance threshold and $\pi$ ($0 < \pi < 1$) be a
fraction threshold. An object $o$ is a DB($r$, $\pi$)-outlier if
$$\frac{\|\{o' \mid \mathrm{dist}(o, o') \le r\}\|}{\|D\|} \le \pi,$$
where dist is the distance measure and D is the data set.
The straightforward approach takes O(n²) time.
Algorithms for mining distance-based outliers:
 Index-based algorithm
 Nested-loop algorithm
 Cell-based algorithm
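The nested-loop idea can be sketched in a few lines. For every object, the inner loop counts the other objects within distance r. The object is reported as an outlier when that count, as a fraction of the data set, does not exceed the fraction threshold. Euclidean distance, the sample data and the threshold values are assumptions for illustration.

```python
import math


def distance_based_outliers(data, r, pi):
    """Return the objects with too few neighbours within r (O(n^2) nested loops)."""
    n = len(data)
    outliers = []
    for i, o in enumerate(data):
        # Count the other objects that lie within distance r of o.
        close = sum(
            1 for j, other in enumerate(data)
            if j != i and math.dist(o, other) <= r
        )
        if close / n <= pi:
            outliers.append(o)
    return outliers


# Hypothetical data: a tight group of four points and one distant point.
data = [(1.0, 1.0), (1.5, 1.0), (1.0, 1.5), (1.5, 1.5), (9.0, 9.0)]
outliers = distance_based_outliers(data, r=1.0, pi=0.1)
```

Only (9.0, 9.0) is reported, because it has no other object within the radius. Index-based and cell-based algorithms aim to avoid the full pairwise comparison that this version performs.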
[2M]
II) Distance-based outlier detection
• General idea
– Judge a point based on the distance(s) to its neighbours
– Several variants have been proposed
• Basic assumption
– Normal data objects have a dense neighbourhood
– Outliers are far apart from their neighbours, i.e., have a
less dense neighbourhood
III) Deviation-based outlier detection [2M]
• General idea
– Given a set of data points (local group or global set)
– Outliers are points that do not fit the general
characteristics of that set, i.e., the variance of the set is minimized when
the outliers are removed
• Basic assumption
– Outliers are the outermost points of the data set
• Model [Arning et al. 1996]
– Given a smoothing factor SF(I) that computes, for each I ⊆ DB, how
much the variance of DB is decreased when I is removed from DB
– With equal decrease in variance, a smaller exception set is better
– The outliers are the elements of the exception set E ⊆ DB for which
the following holds: SF(E) ≥ SF(I) for all I ⊆ DB
• Discussion
– Similar idea to classical statistical approaches (k = 1
distributions) but independent of the chosen kind of distribution
– Naïve solution is in O(2^n) for n data objects
– Heuristics like random sampling or best-first search are applied
– Applicable to any data type (depends on the definition of SF)
– Originally designed as a global method
– Outputs a labeling
6. a) Association rules generated from mining data at multiple levels of [6M]
abstraction are called multiple-level or multilevel association
rules. Multilevel association rules can be mined efficiently using
concept hierarchies under a support-confidence framework.
Multilevel Mining Association Rules:
• Items often form a hierarchy.
• Items at lower levels are expected to have lower support.
• A common form of background knowledge is that an attribute
may be generalized or specialized according to a hierarchy of
concepts.
• Rules that contain associations involving a hierarchy of concepts are
called Multilevel Association Rules.
Fig: Hierarchy of concept

Support and confidence of Multilevel association rules:
 Generalizing / specializing values of attributes affects support
and confidence.
 Support of rules increases from specialized to general.
 Support of rules decreases from general to specialized.
 Confidence is not affected for general or specialized.
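The effect of generalization on support can be checked with a small sketch. Each item is replaced by its parent in a concept hierarchy, and the support of an itemset is computed at both levels. The hierarchy and the transactions are invented for this example.

```python
# Hypothetical concept hierarchy: specific item -> general category.
HIERARCHY = {
    "skim milk": "milk",
    "2% milk": "milk",
    "white bread": "bread",
    "wheat bread": "bread",
}

# Hypothetical transaction database at the lowest level of the hierarchy.
transactions = [
    {"skim milk", "white bread"},
    {"2% milk", "wheat bread"},
    {"skim milk", "wheat bread"},
    {"2% milk"},
]


def generalize(transaction):
    """Replace every item by its parent in the concept hierarchy."""
    return {HIERARCHY.get(item, item) for item in transaction}


def support(itemset, db):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in db if itemset <= t) / len(db)


low_level = support({"skim milk", "wheat bread"}, transactions)
high_level = support({"milk", "bread"}, [generalize(t) for t in transactions])
```

The specialized itemset {skim milk, wheat bread} has support 0.25, while the generalized itemset {milk, bread} has support 0.75. This is why lower levels are often mined with a reduced minimum support threshold.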
Multidimensional Mining (MD) Association Rules:
• Single-dimensional rule: It contains a single distinct predicate,
i.e. buys: buys(X, “milk”) ⇒ buys(X, “bread”)
• Multidimensional rule: It contains more than one predicate.
• Inter-dimension association rule: It has no repeated predicate, e.g.
age(X, “19-25”) ∧ occupation(X, “student”) ⇒ buys(X, “coke”)
• Hybrid-dimension association rule: It contains multiple
occurrences of the same predicate, i.e. buys: age(X, “19-25”) ∧ buys(X,
“popcorn”) ⇒ buys(X, “coke”)
• Categorical attributes: These have a finite number of possible
values, with no ordering among the values. Examples: brand, color.
• Quantitative attributes: These are numeric and have an implicit ordering
among values. Examples: age, income.
b) CONSTRAINT BASED ASSOCIATION RULES: [6M]

 A data mining process may uncover thousands of rules from a
given set of data, most of which end up being unrelated or
uninteresting to the users.
 Often, users have a good sense of which “direction” of mining
may lead to interesting patterns and the “form” of the patterns or
rules they would like to find.
 Thus, a good heuristic is to have the users specify such intuition
or expectations as constraints to confine the search space.
 This strategy is known as constraint-based mining.
 Constraint based mining provides
o User Flexibility: provides constraints on what to be
mined.
o System Optimization: explores constraints to help
efficient mining.
• The constraints can include the following:
o Knowledge type constraints: These specify the type of
knowledge to be mined, such as association or correlation.
o Data constraints: These specify the set of task-relevant data.
o Dimension/level constraints: These specify the desired
dimensions (or attributes) of the data, or levels of the
concept hierarchies, to be used in mining.
o Interestingness constraints: These specify thresholds on
statistical measures of rule interestingness, such as support,
confidence, and correlation.
o Rule constraints: These specify the form of rules to be
mined. Such constraints may be expressed as rule
templates, as the maximum or minimum number of
predicates that can occur in the rule antecedent or
consequent, or as relationships among attributes, attribute
values, and/or aggregates.
The above constraints can be
specified using a high-level declarative data mining query
language and user interface.
Constraint-based association rules: To make the mining
process more efficient, rule-based constraint mining:
- allows users to describe the rules that they would like to uncover;
- provides a sophisticated mining query optimizer that can be used to
exploit the constraints specified by the user;
- encourages interactive exploratory mining and analysis.
Constrained frequent pattern mining: Query optimization approach
 Given a frequent pattern mining query with a set of constraints
C, the algorithm should be:
o Sound: it only finds frequent sets that satisfy the given
constraints C.
o Complete: all frequent sets satisfying the given
constraints are found .
 A naïve solution:
o Find all frequent sets and then test them for constraint
satisfaction.
 More efficient approaches:
o Analyze the properties of constraints comprehensively.
o Push them as deeply as possible inside the frequent
pattern computation.
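The difference between the naïve solution and pushing a constraint into the computation can be sketched with a level-wise search. The constraint used here, a cap on the total price of an itemset, is anti-monotone: once an itemset violates it, every superset does too, so the candidate can be pruned early. The transactions, prices and thresholds are hypothetical.

```python
from itertools import combinations

# Hypothetical transactions and item prices.
transactions = [
    {"pen", "ink", "paper"},
    {"pen", "ink"},
    {"pen", "paper"},
    {"ink", "laptop"},
]
PRICE = {"pen": 2, "ink": 5, "paper": 4, "laptop": 900}
MIN_COUNT = 2    # minimum support count
MAX_TOTAL = 10   # constraint C: total price of the itemset <= MAX_TOTAL


def count(itemset):
    """Number of transactions containing the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def satisfies(itemset):
    """Check constraint C for one itemset."""
    return sum(PRICE[i] for i in itemset) <= MAX_TOTAL


def naive():
    """Find all frequent itemsets first, then test the constraint."""
    items = sorted(PRICE)
    frequent = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
        if count(frozenset(c)) >= MIN_COUNT
    ]
    return {s for s in frequent if satisfies(s)}


def pushed():
    """Level-wise search that drops a candidate as soon as it violates C."""
    level = {frozenset([i]) for i in PRICE}
    result = set()
    while level:
        level = {s for s in level if satisfies(s) and count(s) >= MIN_COUNT}
        result |= level
        # Join surviving k-itemsets into (k+1)-item candidates.
        level = {a | b for a in level for b in level if len(a | b) == len(a) + 1}
    return result


naive_result = naive()
pushed_result = pushed()
```

Both functions return the same five itemsets, so the pushed version is sound and complete for this constraint, but it never counts the support of candidates such as {pen, ink, paper} whose price already exceeds the cap.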
7. a) A decision tree is a structure that includes a root node, branches, and [2M]
leaf nodes. Each internal node denotes a test on an attribute, each
branch denotes the outcome of a test, and each leaf node holds a class
label. The topmost node in the tree is the root node.
The following decision tree is for the concept buy_computer that
indicates whether a customer at a company is likely to buy a computer
or not. Each internal node represents a test on an attribute. Each leaf
node represents a class.
The benefits of having a decision tree are as follows −

 It does not require any domain knowledge.
 It is easy to comprehend.
 The learning and classification steps of a decision tree are simple
and fast.
Generating a decision tree from the training tuples of data
partition D
Algorithm: Generate_decision_tree
Input:
  Data partition, D, which is a set of training tuples
  and their associated class labels.
  attribute_list, the set of candidate attributes. [3M]
  Attribute_selection_method, a procedure to determine the
  splitting criterion that best partitions the data
  tuples into individual classes. This criterion includes a
  splitting_attribute and either a split point or a
  splitting subset.
Output:
  A decision tree
Method:
  create a node N;
  if tuples in D are all of the same class, C, then
    return N as a leaf node labeled with class C;
  if attribute_list is empty then
    return N as a leaf node labeled
    with the majority class in D; // majority voting
  apply Attribute_selection_method(D, attribute_list)
  to find the best splitting_criterion;
  label node N with splitting_criterion;
  if splitting_attribute is discrete-valued and
  multiway splits allowed then // not restricted to binary trees
    attribute_list = attribute_list − splitting_attribute; // remove splitting_attribute
  for each outcome j of splitting_criterion
  // partition the tuples and grow subtrees for each partition
    let Dj be the set of data tuples in D satisfying
    outcome j; // a partition
    if Dj is empty then
      attach a leaf labeled with the majority
      class in D to node N;
    else
      attach the node returned by
      Generate_decision_tree(Dj, attribute_list) to node N;
  end for
  return N;
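The algorithm above can be turned into a short runnable sketch. This version assumes discrete-valued attributes with multiway splits and uses information gain as the attribute selection method. The algorithm leaves that choice open, so this is one option among several. It branches only on attribute values that actually occur in the partition, so the empty-partition case does not arise. The training tuples are hypothetical.

```python
import math
from collections import Counter


def entropy(rows, target):
    """Entropy of the class distribution in rows."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def info_gain(rows, attr, target):
    """Reduction in entropy obtained by splitting rows on attr."""
    total = len(rows)
    remainder = 0.0
    for value in {row[attr] for row in rows}:
        subset = [row for row in rows if row[attr] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(rows, target) - remainder


def majority_class(rows, target):
    return Counter(row[target] for row in rows).most_common(1)[0][0]


def generate_decision_tree(rows, attribute_list, target):
    classes = {row[target] for row in rows}
    if len(classes) == 1:              # all tuples belong to the same class
        return classes.pop()
    if not attribute_list:             # no attributes left: majority voting
        return majority_class(rows, target)
    best = max(attribute_list, key=lambda a: info_gain(rows, a, target))
    remaining = [a for a in attribute_list if a != best]
    node = {"attribute": best, "branches": {}}
    for value in sorted({row[best] for row in rows}):
        partition = [row for row in rows if row[best] == value]
        node["branches"][value] = generate_decision_tree(partition, remaining, target)
    return node


def classify(tree, row):
    """Follow the branches that match row until a leaf (class label) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"][row[tree["attribute"]]]
    return tree


# Hypothetical training tuples for the buys_computer concept.
rows = [
    {"age": "youth", "student": "no", "buys_computer": "no"},
    {"age": "youth", "student": "yes", "buys_computer": "yes"},
    {"age": "middle_aged", "student": "no", "buys_computer": "yes"},
    {"age": "middle_aged", "student": "yes", "buys_computer": "yes"},
    {"age": "senior", "student": "no", "buys_computer": "no"},
    {"age": "senior", "student": "yes", "buys_computer": "yes"},
]
tree = generate_decision_tree(rows, ["age", "student"], "buys_computer")
```

On this data the root test is on `student`, since it has the higher information gain, and only the `student = no` branch needs a further test on `age`.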
[1M]
Tree Pruning
Tree pruning is performed in order to remove anomalies in the training
data due to noise or outliers. The pruned trees are smaller and less
complex.
Tree Pruning Approaches
There are two approaches to prune a tree −
 Pre-pruning − The tree is pruned by halting its construction
early.
 Post-pruning - This approach removes a sub-tree from a fully
grown tree.
b) A classification problem can be seen as the prediction of classes. [1M]
Predicted values are usually continuous, whereas classifications are
discrete. Predictions are often (but not always) about the future,
whereas classifications are about the present.
Classification −
• A bank loan officer wants to analyze the data in order to know
which customers (loan applicants) are risky and which are safe.
• A marketing manager at a company needs to analyze whether a customer [2M]
with a given profile will buy a new computer.
In both of the above examples, a model or classifier is constructed to
predict the categorical labels. These labels are risky or safe for loan
application data and yes or no for marketing data.
Prediction −
Suppose the marketing manager needs to predict how much a given
customer will spend during a sale at his company. In this example we
need to predict a numeric value. Therefore the data analysis
task is an example of numeric prediction. In this case, a model or a
predictor will be constructed that predicts a continuous-valued
function or ordered value.
Prediction Methods [3M]

Multiple Linear Regression
This method is performed on a dataset to predict the response variable
based on two or more predictor variables, or to study the relationship between
a response and the predictor variables, for example, student test scores
compared to demographic information such as income, education of
parents, etc.

k-Nearest Neighbors
Like the classification method of the same name, this
prediction method uses a Euclidean distance measure to find, for each
observation, the k training observations that are most similar
(its “neighbors”). These neighbors are used to predict the
value of the response for each member of the validation set.
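A minimal sketch of k-nearest-neighbour prediction is given below. It assumes that the predicted response is the plain average of the responses of the k closest training observations, which is one common choice. The training data and the query point are made up.

```python
import math


def knn_predict(train, query, k):
    """Average the responses of the k training rows nearest to query.

    train is a list of (features, response) pairs.
    """
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    nearest = by_distance[:k]
    return sum(response for _, response in nearest) / k


# Hypothetical training set: three nearby observations and one distant one.
train = [
    ((1.0, 1.0), 10.0),
    ((2.0, 1.0), 12.0),
    ((1.0, 2.0), 14.0),
    ((9.0, 9.0), 100.0),
]
prediction = knn_predict(train, query=(1.5, 1.5), k=3)
```

The distant observation is not among the three nearest neighbours, so the prediction is the average of 10, 12 and 14.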

Regression Tree
A Regression tree may be considered a variant of a decision tree,
designed to approximate real-valued functions instead of being used
for classification methods. As with all regression techniques, XLMiner
assumes the existence of a single output (response) variable and one or
more input (predictor) variables. The output variable is numerical. The
general regression tree building methodology allows input variables to
be a mixture of continuous and categorical variables. A decision tree is
generated when each decision node in the tree contains a test on some
input variable's value. The terminal nodes of the tree contain the
predicted output variable values.

Neural Network
Artificial neural networks are based on the operation and structure of
the human brain. These networks process one record at a time and
“learn” by comparing their prediction of the record (which at the
beginning is largely arbitrary) with the known actual value of the
response variable. Errors from the initial prediction of the first records
are fed back into the network and used to modify the network's
algorithm the second time around. This continues for many, many
iterations.
8 a) Types Of Data Used In Cluster Analysis Are: [6M]

 Interval-Scaled variables
 Binary variables
 Nominal, Ordinal, and Ratio variables
 Variables of mixed types
b) A Hierarchical Clustering Algorithm for Categorical Attributes. [2M]

Clustering, an important technique of data mining, groups
similar objects together and identifies the cluster to which each
object of the domain being studied belongs.
Clustering is the process of grouping similar objects. Similarity plays an
important role in grouping objects. Hierarchical clustering methods are
of two types: agglomerative and divisive. In agglomerative hierarchical
clustering, many small clusters with high coherence are formed
initially. These small clusters are recursively merged based on their
similarity, and the process stops with a single cluster. In the
divisive approach, the entire space of input data objects initially forms
one large cluster, which is decomposed at lower levels into smaller
clusters of increasing coherence.
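The agglomerative approach can be sketched as repeated merging of the two closest clusters. The version below makes several simplifying assumptions: one-dimensional numeric objects, single-linkage distance (the smallest distance between members of two clusters), and stopping at a requested number of clusters rather than at a single cluster.

```python
def single_link(a, b):
    """Smallest pairwise distance between two clusters of 1-D points."""
    return min(abs(x - y) for x in a for y in b)


def agglomerative(points, target_clusters):
    """Merge the two closest clusters until target_clusters remain."""
    clusters = [[p] for p in points]   # start with one cluster per object
    while len(clusters) > target_clusters:
        pairs = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        i, j = min(
            pairs,
            key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical objects on a line.
clusters = agglomerative([1.0, 2.0, 9.0, 10.0, 25.0], target_clusters=2)
```

The merge order is {1, 2}, then {9, 10}, then the two pairs together, which leaves 25 in a cluster of its own. Setting `target_clusters=1` would continue until a single cluster remains, as in the description above.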
Hierarchical clustering involves creating clusters that have a
predetermined ordering from top to bottom. For example, all files and [4M]
folders on the hard disk are organized in a hierarchy.
There are two types of hierarchical clustering,
 Divisive
 Agglomerative
Explanation
******
Code No:M2503 R19

I M. Tech I Semester Regular Examinations, FEB – 2020
E-Commerce
Time: 3 hours Max. Marks: 60
1 a) Electronic Commerce Framework: From the business activity already [3M]

taking place, it is clear that e-commerce applications will be built on the
existing technology infrastructure: a myriad of computers,
communications networks, and communication software forming the
nascent Information Superhighway.
The figure shows a variety of possible e-commerce applications, including
both inter-organizational and consumer-oriented examples. None of
these uses would be possible without each of the building blocks in the
infrastructure:
· Common business services, for facilitating the buying and selling [4M]
process
· Messaging and information distribution, as a means of sending and
retrieving information
· Multimedia content and network publishing, for creating a product
and a means to communicate about it
· The Information Superhighway, the very foundation, for providing
the highway system along which all e-commerce must travel
The two pillars supporting all e-commerce applications and infrastructure are
just as indispensable:
· Public policy, to govern such issues as universal access, privacy, and
information pricing
· Technical standards, to dictate the nature of information publishing,
user interfaces, and transport in the interest of compatibility across the
entire network.
b) Supply chain management in e-commerce focusses on procurement of [2M]

raw material, manufacturing, and distribution of the right product at
the right time. It includes managing supply and demand, warehousing,
inventory tracking, order entry, order management, distribution and
delivery to the customer.
Supply Chain Management Functions
On a broader level, supply chain management consists of these four [6M]
major functions:
Integration
This forms the crux of the supply chain and is meant to coordinate
communications to produce effective and timely results. It can include
innovation of new software or advanced technological processes to
improve communications.
Operations
This involves management of day to day operations in the eCommerce
business. For example, it may deal with keeping an eye on the
inventory or coming up with marketing approaches.
Purchasing
This deals with the purchasing decisions and management, such as
purchasing raw materials, source materials and so on.
Distribution
This deals with the management of logistics across wholesalers,
retailers, and customers. This may mean keeping an eye on the
shipment, and other details.
In addition to these, there are also some subsidiary functions that an
effective supply chain management process fulfills, such as:
 Aligning distribution flows
 Integrating the functions from manufacture to delivery
 Designing complex and advanced systems
 Managing and coordinating resources
2. a) Quick response is a method that allows manufacturers or retailers to [7M]

communicate inventory needs for their shelves or assembly lines in
near-real time.
To improve efficiency throughout their supply chains, some companies

have turned to quick response processes.
These companies have traditionally communicated with
business partners about inventory replenishment via electronic data
interchange (EDI) systems, faxes or phone calls. But with the advent of
the Internet, a growing number of organizations have been turning to
Web-based systems.
High-profile companies that either are in the process of
implementing or have installed such systems to boost the efficiency of
their supply chains include Switzerland's Nestle SA, San Francisco-
based Levi Strauss & Co. and Toronto-based Canadian Tire Corp.
Advocates claim that quick response systems can help retailers and
manufacturers trim the fat out of supply chains, speed up the time it
takes to replace depleted inventory, help avoid stock-outs and boost the
number of inventory turns (when a retailer or manufacturer turns over
its entire inventory).
Companies that offer quick response technologies include SAP AG,
Logility Inc. in Atlanta, GlobeRanger Corp. in Richardson, Texas, and i2
Technologies Inc. in Dallas.
Get Real
Quick response takes a variety of forms. "Real time is a different thing
to different people," says George Brody, president and CEO of
GlobeRanger. "So quick response means different things to different
people." For instance, he says, for some folks, asking for data in real
time might mean once an hour; for others, it might mean once a day.
The process requires a company to share what could be seen as
proprietary data about its sales and manufacturing operations with its
partners. Some executives might be hesitant about this for fear of
exposing sensitive information about their companies' products or
business strategies to their rivals, say observers.
Nevertheless, when implemented correctly, quick response can be an
integral component in helping organizations create lean and efficient
supply chains.
"Our supply-chain efforts are based on real-time customer information,
quickness and speed," says a spokesman for Dell Computer Corp. in
Round Rock, Texas.
On the Cutting Edge
Dell is widely viewed as being on the cutting edge of supply-chain
management. For instance, if Dell.com receives an order from a
customer for a DVD device, that request is passed along to Dell's
suppliers. If the suppliers can't meet the demand for some reason, they
will signal back to Dell, which will then notify the customer that the
item is out of stock or unavailable.
Procter & Gamble Co. in Cincinnati recently detailed its own plans to
establish a quick response system through its entire supply chain. The
system is expected to connect a variety of participants, from raw
materials suppliers through the manufacturer to distributors out to
retailers and on to consumers, says a Procter & Gamble spokesman.
The firm, which makes consumer products ranging from Pampers to
Pepto-Bismol, says the system will help it manufacture the right
products as needed and get them to the appropriate warehouses and
distribution centers, thus reducing bloat in its supply chain.
This should result in happier customers, according to Procter &
Gamble. "A key benefit for consumers is that the products are fresher,"
says the company's spokesman.
Hypothetically, this means that when a consumer buys a roll of paper
towels, information about that purchase goes all the way through the
supply chain to the lumber company that cuts down the trees to make
the product.
Procter & Gamble plans to install its quick response systems over the
next few years; the company has pilots scheduled throughout this year.
Quick response systems have been around for about seven years. In the
grocery industry, where the use of these technologies has been fairly
prevalent, point-of-sale data is collected by a bar-coding EDI-based
device at the cash register. The aggregated information is sent each
night to vendors who then replenish the depleted shelves, says Karen
Peterson, an analyst at Gartner Group Inc., in Stamford, Conn.
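The nightly grocery replenishment loop described above can be sketched in Python. This is only an illustration, not any vendor's system: the function names, the (sku, units) scan format, and the one-for-one replenishment rule are assumptions made for the example.

```python
from collections import Counter


def aggregate_daily_sales(scans):
    """Sum units sold per SKU from one day's point-of-sale scans."""
    totals = Counter()
    for sku, units in scans:
        totals[sku] += units
    return dict(totals)


def replenishment_orders(daily_totals):
    """Build one order line per SKU, replacing exactly what was sold."""
    return [
        {"sku": sku, "units": units}
        for sku, units in sorted(daily_totals.items())
    ]


# Hypothetical scans collected at the cash register during one day.
scans = [("towels", 2), ("soap", 1), ("towels", 3)]
totals = aggregate_daily_sales(scans)
orders = replenishment_orders(totals)  # what would be sent to vendors at night
```

A real system would also account for lead times, pack sizes, and safety stock, which this sketch leaves out.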
Flow Manufacturing
Other companies use quick response during so-called flow
manufacturing. Whenever one unit of inventory is sold by a retailer, a
signal is sent through the supply chain to replace it immediately,
according to Kevin O'Marah, an analyst at AMR Research Inc. in
Boston.
That's slightly different from collaborative planning, forecasting and
replenishment (CPFR) processes, which are closely related to traditional
quick response methods but are Web-based and slightly more
sophisticated.
CPFR processes, say industry experts, let retailers and manufacturers
examine statistics regarding the flow of products to customers and
attempt to forecast how much inventory is going to be required to meet
future needs, according to O'Marah.
In fact, some major companies, including Canadian Tire, say that
regular quick response is already past its prime. Canadian Tire is
currently implementing software from Stockholm-based Industri-
Matematik International Corp. that should allow it to locate inventory
anywhere in its supply chain at any time.
The Next Big Thing
"Quick response is kind of outmoded," says Art Karrer, project manager
at Pharmavite Corp., a maker of vitamins and health-related products
in Mission Hills, Calif. The firm used EDI-based quick response
techniques between its supply centers and distribution centers for one
of its biggest partners, Troy, Mich.-based Kmart Corp., until seven years
ago.
Pharmavite has since shifted to CPFR to realize greater efficiencies in
forecasting and replenishment and to gain insight into retail sales
cycles. The company is now using applications from Logility to
establish CPFR links with Kmart.
b) Internal impacts [8M]

 Human resources.
 Capital resources.
 Operational efficiency.
 Organizational structure.
 Infrastructure.
 Innovation.
Explanation
External impacts
 Economic situation
 Laws
 Technological factors
 Customer demands
 Competition
Explanation
3. a) Electronic funds transfer (EFT) is the electronic transfer of money from [2M]
one bank account to another, either within a single financial
institution or across multiple institutions, via computer-based systems, [5M]
without the direct intervention of bank staff.
EFT can be segmented into three broad categories:
 Banking and financial payments. Large-scale or wholesale
payments (e.g., bank-to-bank transfer) ...
 Retailing payments. Credit Cards (e.g., VISA or MasterCard) ...
 On-line electronic commerce payments. Token-based payment
systems.
b) Third-party payment processors, or aggregators, are companies that [7M]
interface between the merchant (you) and a merchant services provider,
so you can accept payments without setting up a merchant account.
Third-party payment processors have one big merchant bank account
for all of the businesses they work with, so they generally let you start
processing customer transactions the same day you sign up without
requiring you to go through the extensive business analysis and
underwriting process you need to open a merchant account.
Many budding entrepreneurs, especially those who are just starting out,
wonder whether a third-party payment processor is the right fit for
them. After all, they hear that sign up is easy and they won’t have to
pay any fees. However, it’s important to dig a little deeper to
understand who third-party payment processors truly work for and
when they are necessary.
There are a variety of reasons a merchant might choose to go with a
third-party payment processor. Some companies might not be able to
afford the monthly fees associated with dedicated accounts. Similarly,
SMBs processing very low volume can often not afford the setup costs
of such an account. This makes a third-party payment processor a good
solution for your business when you are just starting out and do not
anticipate processing a high volume of credit card transactions.
It is important to remember, however, that while you do not pay
startup fees or monthly fees with a third-party payment processor, they
still have to make money somewhere. They make up for their lack of
fees in their per transaction percentage fee. This fee is significantly
higher than it would be with a dedicated merchant account. This means
that if you are processing high volume, a third-party payment
processor will be more expensive for you.
 Easy account set-up
 No monthly or annual fees
 No long-term commitments and an easier cancellation process
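The fee trade-off described above (no monthly fee but a higher per-transaction percentage) can be made concrete with a small calculation. All rates and fees below are hypothetical numbers chosen for illustration, not quotes from any processor.

```python
def third_party_cost(volume, rate):
    """Monthly cost with an aggregator: a percentage fee only."""
    return volume * rate


def merchant_account_cost(volume, rate, monthly_fee):
    """Monthly cost with a dedicated account: fixed fee plus a lower percentage."""
    return monthly_fee + volume * rate


def break_even_volume(tp_rate, ma_rate, monthly_fee):
    """Monthly volume above which the dedicated account becomes cheaper."""
    if tp_rate <= ma_rate:
        raise ValueError("third-party rate must exceed the merchant-account rate")
    return monthly_fee / (tp_rate - ma_rate)


# Assumed figures: 4% aggregator rate, 2% merchant rate, 30 per month in fees.
threshold = break_even_volume(0.04, 0.02, 30.0)
low_volume = (third_party_cost(1000, 0.04), merchant_account_cost(1000, 0.02, 30.0))
high_volume = (third_party_cost(5000, 0.04), merchant_account_cost(5000, 0.02, 30.0))
```

Under these assumed figures, the aggregator is cheaper at a monthly volume of 1000 and the dedicated account is cheaper at 5000.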
4. Digital cash aims to mimic the functionality of paper cash by [14M]
providing properties such as anonymity and transferability of
payment. Digital cash is intended to be implemented as data that can be
copied, stored, or given as payment (for example, attached to an email
message, or via a USB stick, Bluetooth, etc.).
Properties of Electronic Cash
 Digital cash must have a monetary value; it must be backed by
cash (currency), bank-authorized credit, or a bank-certified
cashier’s check. When digital cash created by one bank is
accepted by others, reconciliation must occur without any
problems. Without proper bank certification, digital cash carries
the risk that when deposited, it might be returned for insufficient
funds.
 Digital cash must be interoperable or exchangeable as payment
for other digital cash, paper cash, goods or services, lines of
credit, deposits in banking accounts, bank notes or obligations,
electronic benefits transfers, and the like.
 Digital cash must be storable and retrievable. Remote storage
and retrieval (such as via a telephone or personal
communications device) would allow users to exchange digital
cash (withdraw from and deposit into banking accounts) from
home or office or while travelling.
 Digital cash should not be easy to copy or tamper with while it is
being exchanged. It should be possible to prevent or detect
duplication and double-spending of digital cash.
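The last property, detecting duplication and double-spending, can be illustrated with a toy issuer that remembers which tokens have already been deposited. This is a deliberately simplified sketch: the ToyBank class and its methods are hypothetical, and it ignores the cryptographic signatures and anonymity mechanisms a real digital cash scheme would need.

```python
import secrets


class ToyBank:
    """Hypothetical issuer that tracks spent token serial numbers."""

    def __init__(self):
        self.issued = {}    # serial -> monetary value backing the token
        self.spent = set()  # serials that have already been deposited

    def issue(self, value):
        """Create a token with a random, hard-to-guess serial number."""
        serial = secrets.token_hex(16)
        self.issued[serial] = value
        return serial

    def deposit(self, serial):
        """Accept a token once; reject unknown or already-spent serials."""
        if serial not in self.issued:
            raise ValueError("unknown token")
        if serial in self.spent:
            raise ValueError("double spend detected")
        self.spent.add(serial)
        return self.issued[serial]


bank = ToyBank()
token = bank.issue(10)
first_deposit = bank.deposit(token)  # succeeds
try:
    bank.deposit(token)              # a copied token is rejected
    copy_rejected = False
except ValueError:
    copy_rejected = True
```

Because the token is just data, copying it is trivial; the check happens when the bank sees the same serial a second time.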
b) The risks to online payments include theft of payment data and [8M]
personal data, and fraudulent rejection of payments by customers.
i. Credit risk
ii. Fraud risk
iii. Compliance risk
iv. Liquidity risk
v. Systemic risk
vi. Operational and transaction risk
vii. Strategic risk
viii. Reputation risk
5. a) Standard Layer: This layer of the EDI architecture defines the structure of [8M]
the business form and some of the content related to the
application layer.
Transport Layer: The EDI transport layer corresponds to the non-electronic
activity of sending a business form from one company to another.
EDI stands for Electronic Data Interchange. EDI is an electronic way of
transferring business documents within an organization, between
its various departments, or externally with suppliers, customers, or
subsidiaries. In EDI, paper documents are replaced with electronic
documents such as Word documents, spreadsheets, etc.
EDI Documents
The following are a few important documents used in EDI −
 Invoices
 Purchase orders
 Shipping Requests
 Acknowledgement
 Business Correspondence letters
 Financial information letters
Steps in an EDI System
Following are the steps in an EDI System.
 A program generates a file that contains the processed
document.
 The document is converted into an agreed standard format.
 The file containing the document is sent electronically on the
network.
 The trading partner receives the file.
 An acknowledgement document is generated and sent to the
originating organization.
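The steps above can be sketched as a small round trip in Python. JSON is used here only as a stand-in for the "agreed standard format"; real EDI exchanges typically use standards such as ANSI X12 or UN/EDIFACT, and the function names and document fields below are assumptions made for the example.

```python
import json


def generate_document(order_id, items):
    """Step 1: a program produces the processed business document."""
    return {"type": "purchase_order", "order_id": order_id, "items": items}


def to_agreed_format(document):
    """Step 2: convert the document into the format both partners agreed on."""
    return json.dumps(document, sort_keys=True)


def receive(payload):
    """Steps 4 and 5: the partner parses the file and builds an acknowledgement."""
    document = json.loads(payload)
    ack = {"type": "acknowledgement", "order_id": document["order_id"]}
    return document, to_agreed_format(ack)


# Step 3 (network transfer) is represented by simply passing the string along.
payload = to_agreed_format(generate_document("PO-1", [{"sku": "A", "qty": 2}]))
received, ack_payload = receive(payload)
```

The acknowledgement travels back in the same agreed format, so the originating organization can match it to the order it sent.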
Advantages of an EDI System
Following are the advantages of having an EDI system.
 Reduction in data entry errors − Errors are much less likely
when a computer is used for data entry.
 Shorter processing life cycle − Orders can be processed as soon
as they are entered into the system. It reduces the processing
time of the transfer documents.
 Electronic form of data − It is quite easy to transfer or share the
data, as it is present in electronic format.
 Reduction in paperwork − As a lot of paper documents are
replaced with electronic documents, there is a huge reduction in
paperwork.
 Cost Effective − As time is saved and orders are processed very
effectively, EDI proves to be highly cost effective.
 Standard Means of communication − EDI enforces standards on
the content of data and its format which leads to clearer
communication.
b) The difference between horizontal and vertical organizations is [7M]
that vertical organizations have a top-down management structure,
while horizontal organizations have a flat structure that provides
greater employee autonomy.
One of the main differences between vertical and horizontal business
organizations is that in a vertical system, upper-level management
issues orders and employees follow those orders without input or
objection. In contrast, employees in a horizontal organization are
encouraged to make suggestions and offer ideas that can improve
workplace processes, and are given the authority to implement changes
without having to obtain authorization.
Another difference is that the multiple layers of management can
hamper communication in a vertical organization. For example, if a
CEO issues an order that employees simply can’t execute, it can take
weeks before employees tell their managers why they can’t achieve the
order they’ve been given, and then another week before managers
communicate this information back up to the top of the chain. In a
horizontal organization, communication flows freely between team
members, because there is no rigid hierarchy, and this can boost
efficiency and productivity. Employees in a horizontal organization are
also more collaborative because they can communicate freely with each
other. In a vertical organization, collaboration only occurs when
managers schedule meetings with employees.
6. a) Common Financial EDI Transaction Types [8M]

 Invoicing and Crediting. One of the most common financial
EDI transaction types is invoicing.
 Loans. Loans are another common financial EDI transaction
category, as consumers and businesses alike require loans.
 Healthcare Payments.
 Federal and State Government Codes.
EXPLANATION
b) With a push-based supply chain, products [7M]
are pushed through the channel, from the production side up
to the retailer.
In a pull-based supply chain, procurement, production and
distribution are driven by demand rather than by forecast. However,
a pull strategy does not always require make-to-order production.
7. a) A digital library is an integrated set of services for capturing, [8M]
cataloguing, storing, searching, protecting, and retrieving information,
which provides coherent organization and convenient access to typically
large amounts of digital information.
Digital libraries are systems that combine the machinery of digital
computing, storage and communication with the content and software needed
to reproduce, emulate and extend the services of collecting, cataloguing,
finding and disseminating information offered by traditional libraries.
A digital library can have a multi-tier architecture. Different digital
libraries follow different architectures and models. The basic
principles involved in their design and architecture are:
1. Low cost, including all hardware and software components
2. Technically simple to install and manage
3. Robust
4. Scalable
5. Open and inter-operable
6. Modular
7. User Friendly
8. Multi-user (including both searching and maintenance);
9. Multimedia digital object enabled
10. Platform independent
b) Data Warehouse Concept [7M]
The basic concept of a Data Warehouse is to facilitate a single version of
truth for a company for decision making and forecasting. A Data
Warehouse is an information system that contains historical and
cumulative data from single or multiple sources. The Data Warehouse
concept simplifies the reporting and analysis processes of the organization.
Its key characteristics are:
 Subject-Oriented
 Integrated
 Time-variant
 Non-volatile
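The four properties listed above can be illustrated with a toy in-memory store. This is a hypothetical sketch for intuition only: the ToyWarehouse class, its field names, and the sample sources are assumptions, not part of any real warehouse product.

```python
from datetime import date


class ToyWarehouse:
    """Hypothetical store illustrating the four warehouse properties."""

    def __init__(self):
        # Non-volatile: rows are only appended, never updated or deleted.
        self._facts = []

    def load(self, source, subject, as_of, value):
        # Integrated: rows from any source share one common schema.
        # Time-variant: every row carries the date it describes.
        self._facts.append(
            {"source": source, "subject": subject, "as_of": as_of, "value": value}
        )

    def history(self, subject):
        # Subject-oriented: queries are organized around a business subject.
        rows = [f for f in self._facts if f["subject"] == subject]
        return sorted((f["as_of"], f["value"]) for f in rows)


wh = ToyWarehouse()
wh.load("store_pos", "sales", date(2020, 2, 2), 120)
wh.load("web_shop", "sales", date(2020, 2, 1), 80)
sales_history = wh.history("sales")
```

Here two different sources feed the same "sales" subject, and the history is returned in date order regardless of load order.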
8 a) There are four major types of electronic files that are used for online or [7M]
offline lessons.
The electronic documents are:
 Word documents (.doc or .docx)
 Portable Document Format files (PDF)
 Spreadsheets (.xls or .xlsx)
 PowerPoint presentations (.ppt or .pptx)
b)  Rapid Growth. The digital landscape just keeps growing. [8M]

 Ad Blockers. Ad blockers are costing advertisers billions of
dollars.
 Reduced Exposure. Social media was a gift to businesses.
 Increasing Costs.
 Elusive Audiences.
9. a) directory business organization and its importance [8M]

b)  Centralization of control: access, resources and integrity of the [7M]
data are controlled by the dedicated server so that a program or
unauthorized client cannot damage the system.
 Scalability: You can increase the capacity
of clients and servers separately.
10 a) The main difference between the white pages and yellow pages is that [7M]
the yellow pages contain business phone numbers only, while the white
pages have only residential phone number listings. The yellow
pages are organized by type of service in a geographical area and
then alphabetically.
White Pages
The white pages in a phone book are for personal land line phone
numbers and street addresses in a specific region. The white pages are
organized alphabetically by name, with the surname (or last name) first,
then first name followed by middle name or initial, if applicable.
Everyone with a land line telephone service is registered with the
phone book printer under the name of the phone service account unless
they opt out of the phone book by calling the phone book company and
asking to be on the red list. This red list will stop a person's name from
appearing in the phone book and online on the phone book website.
Yellow Pages
The yellow pages generally follow the white pages in the phone book,
in the back half. The yellow pages are all business listings, with the
name, number and address of local businesses. They differ from the
white pages in that yellow pages are paid listings, meaning that
businesses must pay for the listing in the book and can also pay extra
money for larger more attention-grabbing ads. The second major
difference is that the businesses are first listed by category and then in
alphabetical order by name. For example, Tony's Pizza would be listed
under the "Pizza" category and then between the two other pizza
restaurants that come immediately before and after it alphabetically.
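The two orderings described above can be expressed as sort keys. The sample names and categories are invented for illustration (apart from Tony's Pizza, which the text uses as its example).

```python
def white_pages_order(entries):
    """Sort residential entries by surname, then first name, then middle name."""
    return sorted(
        entries,
        key=lambda e: (e[0].casefold(), e[1].casefold(), e[2].casefold()),
    )


def yellow_pages_order(listings):
    """Sort business listings by category first, then by business name."""
    return sorted(listings, key=lambda l: (l[0].casefold(), l[1].casefold()))


# (surname, first name, middle name or initial)
residents = [("Smith", "John", "A."), ("Adams", "Zoe", ""), ("Smith", "Anna", "")]

# (category, business name)
businesses = [
    ("Pizza", "Tony's Pizza"),
    ("Plumbing", "Ace Plumbing"),
    ("Pizza", "Uno Pizza"),
    ("Pizza", "Sal's Pizza"),
]

white = white_pages_order(residents)
yellow = yellow_pages_order(businesses)
```

In the sorted yellow pages, Tony's Pizza lands between the two pizza restaurants that precede and follow it alphabetically, and all pizza listings come before the plumbing category.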
b) The term mainframe computer is used to distinguish very large [8M]
computers used by institutions to serve multiple users from personal
computers used by individuals. Mainframe computers are capable of
handling and processing very large amounts of data very quickly -
much more data than a typical individual needs to work with on his or
her own computer.
Mainframe systems can be used by a large number of users. This means
that, in a large organization, individual employees can sit at their desk
using a personal computer, but they can send requests to the
mainframe computer for processing large amounts of data. A typical
mainframe system can support hundreds of users at the same time. As
for the actual hardware components inside a mainframe computer,
they are similar in type to what personal computers use: motherboard,
central processing unit and memory. The individual components are
just a lot more powerful and a lot more expensive.
Parallel Processing Approaches:
Single Instruction, Single Data (SISD) computers have one processor
that handles one algorithm using one source of data at a time. The
computer tackles and processes each task in order, and so sometimes
people use the word "sequential" to describe SISD computers. They
aren't capable of performing parallel processing on their own.
Multiple Instruction, Single Data (MISD) computers have multiple
processors. Each processor uses a different algorithm but uses the same
shared input data. MISD computers can analyze the same set of data
using several different operations at the same time. The number of
operations depends upon the number of processors. There aren't many
actual examples of MISD computers, partly because the problems an
MISD computer can calculate are uncommon and specialized.
Single Instruction, Multiple Data (SIMD) computers have several
processors that follow the same set of instructions, but each processor
inputs different data into those instructions. SIMD computers run
different data through the same algorithm. This can be useful for
analyzing large chunks of data based on the same criteria. Many
complex computational problems don't fit this model.
Multiple Instruction, Multiple Data (MIMD) computers have multiple
processors, each capable of accepting its own instruction stream
independently from the others. Each processor also pulls data from a
separate data stream. An MIMD computer can execute several different
processes at once. MIMD computers are more flexible than SIMD or
MISD computers, but it's more difficult to create the complex
algorithms that make these computers work. Single Program, Multiple
Data (SPMD) systems are a subset of MIMDs. An SPMD computer is
structured like an MIMD, but it runs the same set of instructions across
all processors.
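The difference between these categories lies in how instruction streams are paired with data streams, which can be modelled in a few lines of Python. The sketch below runs sequentially and only models that pairing; it does not perform real parallel execution, and the function names are chosen for this example.

```python
def simd(instruction, data_streams):
    """Single instruction applied to multiple data streams."""
    return [instruction(data) for data in data_streams]


def misd(instructions, data):
    """Multiple instructions applied to the same single data stream."""
    return [instruction(data) for instruction in instructions]


def mimd(instructions, data_streams):
    """Each processor gets its own instruction and its own data stream."""
    return [
        instruction(data)
        for instruction, data in zip(instructions, data_streams)
    ]


def spmd(program, data_streams):
    """MIMD structure, but every processor runs the same program."""
    return mimd([program] * len(data_streams), data_streams)


simd_result = simd(sum, [[1, 2], [3, 4]])
misd_result = misd([min, max, sum], [1, 2, 3])
mimd_result = mimd([min, max], [[1, 2], [3, 4]])
spmd_result = spmd(sum, [[1, 2], [3, 4]])
```

An SISD machine corresponds to calling a single function on a single input, one task after another.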
*****
