DM Ya


Sales forecasting:

Sales forecasting in the retail industry is a complex problem in today’s global market. At the management level, however, sales forecasts are essential to decision making in every functional area of a retail company, including marketing, sales, production/purchasing, finance, and accounting. They provide the basis for regional and national distribution and stock replenishment plans. In this research study, we explore soft computing and data mining techniques to develop a prototype solution, based on an artificial neural network, for sales forecasting and inventory management.

Knowledge Discovery

Today, companies use information technology to store the transaction data of their entire business operations. This rapidly growing body of data, collected from different sources and fields, can be accessed and analyzed to extract valuable knowledge that supports the decision making process of the business. This process is called knowledge discovery: a pattern search methodology that uses different algorithms and methods to find patterns in the database. The proposed prototype knowledge discovery process consists of a number of interactive and iterative steps, with many decisions and queries introduced by the user. These steps are described below:

1) Goals of the Application Domain: The knowledge discovery process begins with understanding the goals of the application domain and the goals of the data mining process. The main goal is knowledge discovery, i.e., finding previously unknown patterns that are useful and effective in the decision making process, drawn from extracted patterns that describe current and past trends and behaviors.


2) Data Collection and Integration: In this step, the target data sets are collected from heterogeneous databases and data warehouses and integrated into a research database. The data relevant to the analysis is then selected and retrieved from this data source. The data is further characterized by summarizing the general characteristics of the target class of data.

3) Data Cleaning and Clustering: Using various algorithms, the database is cleaned by removing errors, ensuring consistency, and eliminating redundant data, and the selected data set is transformed into a format appropriate for the prototype data mining procedure. Clustering is then performed on the data set using the Associate algorithm.

4) Data Mining Process: Data mining is performed on the selected data sets to extract

interesting patterns by using the appropriate data mining algorithms and methods.

An association rule is an expression of the form X ⇒ Y, where X and Y are sets of items with no elements in common. Given a data source of transactions D, where each transaction T ∈ D is a set of items, the association rule X ⇒ Y denotes that whenever a transaction T contains X, there is a probability that it contains Y too. The rule X ⇒ Y holds in the transaction set D with confidence c if c% of the transactions in D that contain X also contain Y. The rule has support s in D if s% of the transactions in D contain both X and Y. Association rule based data mining finds all association rules whose support and confidence are greater than or equal to a user-specified minimum support (minsup) and minimum confidence (minconf). The data mining process for extracting valuable association rules consists of 1) discovering all itemsets that satisfy minsup (known as frequent-itemset generation) and 2) generating all association rules that satisfy minconf from the itemsets produced by the first step. To perform these steps, association rule mining algorithms employ either a breadth-first search (BFS) or a depth-first search (DFS) approach. In our prototype we use the BFS approach, which determines the support values of all (k − 1)-itemsets before calculating the support values of the k-itemsets, where k is a positive integer.
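The two phases described above (frequent-itemset generation by level-wise BFS, then rule generation) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the prototype's actual implementation: the function names, the toy transactions, and the thresholds are all hypothetical, and support and confidence are expressed as fractions rather than percentages.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def frequent_itemsets(transactions, minsup):
    """Level-wise (BFS) search: every (k-1)-itemset is counted before any k-itemset."""
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items]  # candidate 1-itemsets
    frequent = {}
    k = 1
    while level:
        # Count support for every candidate of size k and keep the frequent ones.
        counted = {c: support(c, transactions) for c in level}
        survivors = {c: s for c, s in counted.items() if s >= minsup}
        frequent.update(survivors)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset.
        k += 1
        candidates = {a | b for a in survivors for b in survivors if len(a | b) == k}
        level = [c for c in candidates
                 if all(frozenset(sub) in survivors for sub in combinations(c, k - 1))]
    return frequent


def association_rules(frequent, minconf):
    """Rules X => Y with confidence = support(X and Y together) / support(X)."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for x in combinations(sorted(itemset), r):
                x = frozenset(x)
                conf = sup / frequent[x]  # every subset of a frequent itemset is frequent
                if conf >= minconf:
                    rules.append((x, itemset - x, sup, conf))
    return rules


# Hypothetical toy data: each transaction is a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
freq = frequent_itemsets(transactions, minsup=0.5)
for x, y, sup, conf in association_rules(freq, minconf=0.9):
    print(sorted(x), "=>", sorted(y), "support", sup, "confidence", conf)
```

On this toy data the only rule that clears a 0.9 confidence threshold is butter ⇒ bread, because both transactions containing butter also contain bread.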

In the DFS approach, the itemsets are represented in a tree structure. The algorithm starts from, say, node a in the tree and counts its support to determine whether it is frequent. If it is, the algorithm expands to the next level of nodes until an infrequent node is reached. It then backtracks to another branch and continues the search from there.
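For comparison, the depth-first traversal can be sketched as a recursive walk over the itemset tree. This is only an illustrative sketch: the function name and the toy data are hypothetical, and support is recounted by scanning all transactions at each node, which a practical implementation would likely avoid.

```python
def dfs_frequent_itemsets(transactions, minsup):
    """Depth-first search over the itemset tree, backtracking at infrequent nodes."""
    items = sorted({item for t in transactions for item in t})
    n = len(transactions)
    frequent = {}

    def expand(prefix, start):
        # Children of `prefix` extend it with one item that sorts after its last item.
        for i in range(start, len(items)):
            node = prefix | {items[i]}
            sup = sum(1 for t in transactions if node <= t) / n
            if sup >= minsup:
                frequent[node] = sup
                expand(node, i + 1)  # frequent: go one level deeper
            # infrequent: do not expand; the loop moves on to the next sibling

    expand(frozenset(), 0)
    return frequent


# Hypothetical toy data: each transaction is a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
for itemset, sup in dfs_frequent_itemsets(transactions, minsup=0.5).items():
    print(sorted(itemset), sup)
```

Both traversal orders find the same frequent itemsets; they differ only in the order in which candidates are visited and counted.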

5) Classification: Classification, or supervised learning, is the process of finding a set of models (functions) that describe data classes, where the models are derived from a set of training data.

6) Visualization: The discovered knowledge is visually represented using different visualization

techniques.

7) Pattern Evaluation: All discovered patterns representing meaningful knowledge are

identified using predefined measures.
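To make the classification step (step 5) more concrete, the sketch below derives a very simple model from labeled training data and uses it to assign a class to a new record. It uses a nearest-centroid classifier purely as a stand-in, since the text does not name a specific classifier; the feature choice (weekly units sold, unit price), the labels, and the function names are all hypothetical.

```python
from collections import defaultdict
from math import dist


def train_centroids(samples):
    """Derive one centroid per class from (features, label) training pairs."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(model, features):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical training data: (weekly units sold, unit price) -> product class.
training = [
    ((120, 2.5), "fast-moving"),
    ((100, 3.0), "fast-moving"),
    ((10, 9.0), "slow-moving"),
    ((20, 8.0), "slow-moving"),
]
model = train_centroids(training)
print(classify(model, (90, 3.5)))
print(classify(model, (5, 10.0)))
```

The model here is just one average point per class, which keeps the example short; any classifier trained on labeled examples would fit the same description.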

[Figure: schema diagram. Tables and their fields:]

Event: Event ID, Event Description
Pattern: Pattern ID, Pattern Description
Product: Product ID, Product Description
Branch: Branch ID, Branch Description
Fact Table: Event ID, Product ID, Pattern Description
Event Hierarchy: Event ID, Event Level, Event Condition
Event Pattern: Event ID, Pattern ID
Time granularity: Year, Quarter, Month, Week, Day
