
Frequent Pattern Mining using Evolutionary Techniques

Presented by Maryam Zardad

Contents
Introduction
Swarm Intelligence
Ant Colony Algorithm
Bee Algorithm
Genetic Algorithm

Frequent Pattern Mining


Frequent pattern mining is an important area of data mining. Frequent patterns are patterns (such as itemsets or subsequences) that appear frequently in a data set.

For example, a set of items such as milk and bread that appear frequently together in a transaction data set is a frequent itemset. A subsequence, such as buying first a PC, then a digital camera, and then a memory card, is a (frequent) sequential pattern if it occurs frequently in a shopping history database.
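As a minimal sketch of what "frequent" means here, the snippet below counts the support of an itemset (the fraction of transactions containing it) and enumerates small itemsets by brute force. The toy transactions, the function names and the 0.6 threshold are illustrative assumptions, not data from these slides.

```python
from itertools import combinations

# Hypothetical toy transaction database (illustrative only).
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "eggs"},
    {"milk", "bread", "eggs"},
    {"milk", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force enumeration of all itemsets with up to `max_size` items."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, max_size + 1):
        for candidate in combinations(items, k):
            s = support(candidate, transactions)
            if s >= min_support:
                result[candidate] = s
    return result

# {milk, bread} occurs in 3 of the 5 transactions, so its support is 0.6.
print(support({"milk", "bread"}, transactions))
print(frequent_itemsets(transactions, min_support=0.6))
```

Brute-force enumeration grows exponentially with the number of items, which is one motivation for the heuristic search techniques covered in the rest of the deck.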

Association Rule Mining


Association rule mining (ARM) is one of the core data mining techniques. The major aim of ARM is to extract rules on how a subset of items influences the presence of another subset.
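A rule such as {milk} -> {bread} is commonly scored by its support and confidence. The sketch below computes both on a hypothetical toy data set; the function name `rule_metrics` and the data are assumptions made for illustration.

```python
# Hypothetical toy transaction database (illustrative only).
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "eggs"},
    {"milk", "bread", "eggs"},
    {"milk", "eggs"},
]

def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    a = set(antecedent)
    c = set(consequent)
    count_a = sum(1 for t in transactions if a <= t)
    count_both = sum(1 for t in transactions if (a | c) <= t)
    support = count_both / len(transactions)
    # Confidence: among transactions containing the antecedent,
    # the fraction that also contain the consequent.
    confidence = count_both / count_a if count_a else 0.0
    return support, confidence

# milk appears in 4 transactions, milk and bread together in 3.
print(rule_metrics({"milk"}, {"bread"}, transactions))
```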

Swarm Intelligence
Swarm intelligence is a branch of nature-inspired algorithms that focuses on insect behavior. Interaction between insects contributes to the collective intelligence of social insect colonies. Ant colony (AC) algorithms are among the most popular algorithms in the swarm intelligence domain.

Ant Colony
Ant Colony Optimization (ACO) algorithms were introduced around 1990. These algorithms were inspired by the behavior of ant colonies. Ants are social insects, concerned mainly with the survival of the colony rather than of the individual.

When searching for food, ants initially explore the area surrounding their nest in a random manner. While moving, ants leave a chemical pheromone trail on the ground.

Ants are guided by the smell of pheromone and tend to choose the paths marked by the strongest pheromone concentration. When an ant finds a food source, it evaluates the quantity and quality of the food.

During the return trip, the quantity of pheromone that an ant leaves on the ground may depend on the quantity and quality of the food. The pheromone trails will guide other ants to the food source.
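The trail-following behaviour described above can be sketched as a small simulation with two alternative paths to a food source. This is a simplified illustration, not a full ACO implementation: the path names, the evaporation rate, and the rule that an ant deposits pheromone inversely proportional to path length are all assumptions.

```python
import random

def choose_path(pheromone, rng):
    """Pick a path with probability proportional to its pheromone level."""
    paths = list(pheromone)
    weights = [pheromone[p] for p in paths]
    return rng.choices(paths, weights=weights, k=1)[0]

def simulate(lengths, n_ants=20, n_iterations=100, evaporation=0.1, seed=0):
    """Return the final pheromone level on each path."""
    rng = random.Random(seed)
    pheromone = {path: 1.0 for path in lengths}  # equal trails at the start
    for _ in range(n_iterations):
        deposits = {path: 0.0 for path in lengths}
        for _ in range(n_ants):
            path = choose_path(pheromone, rng)
            # Assumed quality measure: shorter paths earn more pheromone.
            deposits[path] += 1.0 / lengths[path]
        for path in pheromone:
            # Old pheromone partly evaporates, then new deposits are added.
            pheromone[path] = (1.0 - evaporation) * pheromone[path] + deposits[path]
    return pheromone

final = simulate({"short": 1.0, "long": 2.0})
print(final)
```

Because better paths are reinforced more strongly and therefore chosen more often, the pheromone tends to concentrate on the shorter path over time.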

Types of pheromone
There are generally two types of pheromone:
1. Food pheromone
2. Nest pheromone
While an ant is looking for food it drops nest pheromone, and once it has found food it drops food pheromone.

The main steps of the ACO algorithm are given below:
1. Wander randomly, in the general direction of any nearby pheromone.
2. If the ant is holding food, drop food pheromone while looking for and following a nest pheromone trail that leads in the general direction of the nest. If the ant is not holding food, drop nest pheromone while looking for and following a food pheromone trail.
3. If the ant finds itself at food and is not holding any, pick the food up.
4. If the ant finds itself at the nest and is carrying food, drop the food.

Ant Colony Optimization and Data mining


Ant colony based clustering algorithms were first introduced by Deneubourg et al., who mimicked different types of naturally occurring emergent phenomena, such as ants gathering items to form heaps (clustering of corpses into cemeteries). Ramos et al. proposed the ACLUSTER algorithm to follow real ant-like behaviors as much as possible. Abraham and Ramos proposed an ant clustering algorithm to discover Web usage patterns.

The Bees Algorithm (BA)


Bees in nature

A colony of honey bees can extend itself over long distances (more than 10 km) and in multiple directions.

Waggle dance of bees


By performing this dance, successful foragers share with their hive mates the direction and distance of flower patches and the amount of nectar they hold. The dance is thus a mechanism by which foragers recruit other bees in the colony to productive locations to collect various resources.

While a bee performs the waggle dance, the direction of the bee indicates the direction of the food source in relation to the Sun, the intensity of the waggles indicates how far away it is, and the duration of the dance indicates the amount of nectar at that food source.

Foragers
Unemployed foragers: A bee that has no knowledge about the food sources in the search field starts its search as an unemployed forager. There are two possibilities for an unemployed forager:
1. Scout bee: If the bee starts searching spontaneously without any knowledge, it is a scout bee.
2. Recruit: If the unemployed forager attends a waggle dance performed by another bee, it starts searching using the knowledge gained from that dance.

Employed foragers: When a recruit bee finds the food source, it becomes an employed forager that memorizes the location of the food source. After the employed forager loads a portion of nectar from the food source, it returns to the hive and unloads the nectar in the food area of the hive.

There are three possible options for the foraging bee, depending on the residual amount of nectar:
1. If the nectar has decreased to a low level or is exhausted, the foraging bee abandons the food source and becomes an unemployed bee.
2. If a sufficient amount of nectar remains in the food source, the bee can continue to forage without sharing the food source information with its nest mates.
3. Or it can go to the dance area and perform a waggle dance to inform its nest mates about the food source.
The probabilities of these options depend strongly on the quality of the food source.

Bee Algorithm
1. Initialise population with random solutions.
2. Evaluate fitness of the population.
3. While (stopping criterion not met) // Forming new population.
4.    Select sites for neighbourhood search.
5.    Recruit bees for selected sites and evaluate fitnesses.
6.    Select the fittest bee from each patch.
7.    Assign remaining bees to search randomly and evaluate their fitnesses.
8. End While.
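The eight steps above can be sketched for a one-dimensional minimisation problem as follows. This is a simplified illustration under stated assumptions: lower fitness values are treated as "fitter", every selected site receives the same number of recruits, and all parameter names and defaults (`n_scouts`, `n_sites`, `n_recruits`, `radius`) are hypothetical choices rather than values from the slides.

```python
import random

def bees_algorithm(fitness, bounds, n_scouts=20, n_sites=5, n_recruits=10,
                   radius=0.5, n_iterations=50, seed=0):
    """Minimise `fitness` over the interval `bounds`; return (best, history)."""
    rng = random.Random(seed)
    lo, hi = bounds

    def random_solution():
        return rng.uniform(lo, hi)

    def neighbour(x):
        # A recruit searches within `radius` of the site, clipped to bounds.
        return min(hi, max(lo, x + rng.uniform(-radius, radius)))

    # Steps 1-2: random initial population, ranked by fitness.
    population = sorted((random_solution() for _ in range(n_scouts)), key=fitness)
    history = [fitness(population[0])]

    for _ in range(n_iterations):  # Step 3: fixed iteration budget
        sites = population[:n_sites]  # Step 4: best sites are selected
        new_population = []
        for site in sites:
            # Step 5: recruit bees around the site and evaluate them.
            recruits = [neighbour(site) for _ in range(n_recruits)]
            # Step 6: keep the fittest bee of the patch (the site itself competes).
            new_population.append(min(recruits + [site], key=fitness))
        # Step 7: remaining bees scout randomly.
        new_population += [random_solution() for _ in range(n_scouts - n_sites)]
        population = sorted(new_population, key=fitness)
        history.append(fitness(population[0]))
    return population[0], history

best, history = bees_algorithm(lambda x: (x - 2.0) ** 2, bounds=(-10.0, 10.0))
print(best, history[-1])
```

Because each selected site competes with its own recruits, the best fitness found never gets worse from one iteration to the next.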

Genetic Algorithm
GAs are well suited to problems about which little is known. A standard GA applies genetic operators such as selection, crossover and mutation to an initially random population in order to compute a whole generation of new strings. The process terminates when an acceptable or optimum solution is found.

The functions of genetic operators are as follows:


1) Selection: Selection deals with the probabilistic survival of the fittest, in that fitter chromosomes are more likely to be chosen to survive. Fitness is a comparable measure of how well a chromosome solves the problem at hand.

2) Crossover: This operation is performed by selecting a random point along the length of the chromosomes and swapping all the genes after that point.

3) Mutation: Alters the new solutions so as to aid the search for better solutions. The mutation rate is the chance that a bit within a chromosome will be flipped (0 becomes 1, 1 becomes 0).
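The three operators can be sketched on bit-string chromosomes as below. The function names are illustrative, and fitness-proportionate (roulette-wheel) selection is only one possible reading of "probabilistic survival of the fittest"; it is an assumption here.

```python
import random

def roulette_selection(population, fitnesses, n, rng):
    """Pick `n` chromosomes with probability proportional to fitness."""
    return rng.choices(population, weights=fitnesses, k=n)

def one_point_crossover(parent_a, parent_b, point):
    """Swap all genes after `point` between the two parents."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]

rng = random.Random(0)
a, b = one_point_crossover([1, 1, 1, 1], [0, 0, 0, 0], point=2)
print(a, b)                     # children [1, 1, 0, 0] and [0, 0, 1, 1]
print(mutate(a, rate=0.25, rng=rng))
```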

The genetic algorithm based method for finding frequent itemsets repeatedly transforms the population by executing the following steps:
(1) Fitness Evaluation: The fitness (i.e., an objective function) is calculated for each individual.
(2) Selection: Individuals are chosen from the current population as parents to be involved in recombination.
(3) Recombination: New individuals (called offspring) are produced from the parents by applying genetic operators such as crossover and mutation.
(4) Replacement: Some individuals in the current population (usually the parents) are replaced with some of the offspring.
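A minimal sketch of this four-step loop is given below. The slides do not specify the encoding or the objective function, so several choices are assumptions: each chromosome is a bit vector with one bit per item, fitness is the support of the encoded itemset, selection is a binary tournament, and the whole population is replaced by offspring in each generation. The toy transactions are hypothetical.

```python
import random

# Hypothetical toy transaction database (illustrative only).
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "eggs"},
    {"milk", "bread", "eggs"},
    {"milk", "eggs"},
]

def support(itemset, transactions):
    """Support of a non-empty itemset; the empty itemset scores 0."""
    if not itemset:
        return 0.0
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def decode(chromosome, items):
    """Bit i set to 1 means item i belongs to the itemset."""
    return frozenset(item for bit, item in zip(chromosome, items) if bit)

def tournament(scored, rng):
    """Binary tournament: the fitter of two random individuals wins."""
    a, b = rng.sample(scored, 2)
    return max(a, b, key=lambda pair: pair[0])[1]

def ga_frequent_itemsets(transactions, min_support, pop_size=20,
                         generations=30, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    items = sorted(set().union(*transactions))
    n = len(items)  # assumes at least two distinct items
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    found = {}
    for _ in range(generations):
        # (1) Fitness evaluation.
        scored = [(support(decode(c, items), transactions), c) for c in population]
        for fit, c in scored:
            if fit >= min_support:
                found[decode(c, items)] = fit
        offspring = []
        while len(offspring) < pop_size:
            # (2) Selection of two parents.
            p1, p2 = tournament(scored, rng), tournament(scored, rng)
            # (3) Recombination: one-point crossover, then bit-flip mutation.
            point = rng.randint(1, n - 1)
            child = p1[:point] + p2[point:]
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            offspring.append(child)
        # (4) Replacement: offspring replace the whole current population.
        population = offspring
    return found

for itemset, s in ga_frequent_itemsets(transactions, min_support=0.6).items():
    print(sorted(itemset), s)
```

Unlike exhaustive enumeration, this search is heuristic: every itemset it reports meets the support threshold, but it is not guaranteed to find all frequent itemsets.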

Thanks
