Updated CL-1 Manual
EL-1
Experiment No: 01
Title: Using Divide and Conquer strategies,
design a function for Binary Search using C.
Roll No:________
Batch:_____
Class:_____
Particulars                 Marks
Attendance (05)
Journal (05)
Performance (05)
Understanding (05)
Total (20)
Signature of Staff Member:
Experiment No 01
Dept. of Computer Engg. (ZESs DCOER,Pune)
Computer Laboratory - I
TITLE: Using Divide and Conquer strategies, design a function for Binary Search using C++/Java.
AIM:-
THEORY:
Divide and Conquer Strategy:
A divide and conquer algorithm works by recursively breaking down a problem into two or more
sub-problems of the same (or related) type (divide), until these become simple enough to be
solved directly (conquer). The solutions to the sub-problems are then combined to give a
solution to the original problem.
Binary Search:
The binary search algorithm begins by comparing the target value to the value of the middle
element of the sorted array. If the target value is equal to the middle element's value, the position
is returned. If the target value is smaller, the search continues on the lower half of the array, or if
the target value is larger, the search continues on the upper half of the array. This process
continues until the element is found and its position is returned, or there are no more elements
left to search for in the array and a "not found" indicator is returned.
Algorithm:
#define KEY_NOT_FOUND -1   /* sentinel returned when the key is absent */

int binary_search(int A[], int key, int low, int high)
{
    if (high < low)
        return KEY_NOT_FOUND;
    else
    {
        int mid = low + (high - low) / 2;  /* avoids overflow of (low + high) */
        if (A[mid] > key)
            return binary_search(A, key, low, mid - 1);
        else if (A[mid] < key)
            return binary_search(A, key, mid + 1, high);
        else
            return mid;
    }
}
Time Complexity:
The binary search is a logarithmic algorithm and executes in O(log N) time.
Example:
For example, consider the following sequence of integers sorted in ascending order, and say we
are looking for the number 55:

0  13  19  22  41  55  68  72  81  98

We are interested in the location of the target value in the sequence, so we will represent the
search space as indices into the sequence. Initially, the search space contains indices 1 through
10. Since the search space is really an interval, it suffices to store just two numbers, the low and
high indices. As described above, we now choose the middle value, which is the value at index 5
(the midpoint between 1 and 10): this value is 41, and it is smaller than the target value. From this
we conclude not only that the element at index 5 is not the target value, but also that no element
at indices 1 through 4 can be the target value, because all elements at these indices are smaller
than 41, which in turn is smaller than the target value. This brings the search space down to
indices 6 through 10:

55  68  72  81  98

Proceeding in a similar fashion, we compare against the middle element 72, chop off the second
half of the search space, and are left with:

55  68

Depending on how we choose the middle of an even number of elements, we will either find 55
in the next step or chop off 68 to get a search space of only one element. Either way, we
conclude that the index where the target value is located is 6.
CONCLUSION
Thus we have studied and implemented the Binary Search algorithm.
FAQ
1. What is the divide and conquer strategy? How is it used in binary search?
2. What is the complexity of binary search?
3. A binary search can be performed on both sorted and unsorted lists. Justify.
5. Consider the following list.
   int[] intList = {16, 30, 24, 7, 25, 62, 45, 5, 65, 50};
   If intList above were sorted, what would be the middle element?
6. Consider the following list.
   intList = {4, 18, 29, 35, 44, 59, 65, 98};
   If intList were to be searched for the number 44 using a binary search, how many
   key comparisons would have to be made?
Experiment No: 02
Title: Using Divide and Conquer Strategies
design a function for Concurrent Quick Sort
using C++.
Roll No:________
Batch:_____
Class:_____
Particulars                 Marks
Attendance (05)
Journal (05)
Performance (05)
Understanding (05)
Total (20)
Signature of Staff Member:
Experiment No 02
TITLE: Using Divide and Conquer strategies, design a class for Concurrent Quick Sort using C++.
AIM:-
THEORY:
Quick Sort:
QuickSort is a Divide and Conquer algorithm. It picks an element as pivot and partitions the
given array around the picked pivot. There are many different versions of quickSort that pick
pivot in different ways.
1) Always pick first element as pivot.
2) Always pick last element as pivot
3) Pick a random element as pivot.
4) Pick median as pivot.
The key process in quickSort is partition(). Its goal is: given an array and an element x of the
array as the pivot, put x at its correct position in the sorted array, put all elements smaller than
x before x, and put all elements greater than x after x. All this should be done in linear time.
Partition Algorithm:
There can be many ways to do partition. The logic is simple: we start from the leftmost element
and keep track of the index of the last element smaller than (or equal to) the pivot as i. While
traversing, if we find a smaller element, we increment i and swap the current element with the
element at index i; otherwise we ignore the current element.
partition(array, lower, upper)
{
    pivot = array[lower]
    while (true)
    {
        scan from right to left using an index RIGHT;
            stop on locating an element that should be left of the pivot
        scan from left to right using an index LEFT;
            stop on locating an element that should be right of the pivot
        if (RIGHT and LEFT cross)
        {
            pos = location where LEFT/RIGHT cross
            swap pivot and array[pos]
            (now all values left of the pivot are <= pivot,
             and all values right of the pivot are >= pivot)
            return pos
        }
        swap array[RIGHT] and array[LEFT]
    }
}
Example:
Time Complexity:
Quick Sort runs in O(n log n) time on average; the worst case is O(n^2), which occurs when the
pivot choices are consistently unbalanced. In a concurrent, one-process-per-element formulation, at
termination each process holds an element of the array, and the sorted order can be recovered by
traversing the processes.
CONCLUSION
Thus we have studied and implemented the concurrent Quick Sort algorithm.
FAQ
1.
2.
3.
4.
5.
Experiment No: 10
Title: Implement the Apriori approach for data
mining to organize the data items on a shelf
using a table of items purchased in a mall
Roll No:________
Batch:_____
Class:_____
Particulars                 Marks
Attendance (05)
Journal (05)
Performance (05)
Understanding (05)
Total (20)
Signature of Staff Member:
Experiment No 10
AIM:-
To understand the importance of the Apriori algorithm for finding frequent item sets.
To learn mining of association rules.
THEORY:
Apriori Algorithm:
General process: association rule generation is usually split up into two separate steps:
1. First, a minimum support threshold is applied to find all frequent itemsets in a database.
2. Second, these frequent itemsets and a minimum confidence constraint are used to form rules.
While the second step is straightforward, the first step needs more attention. Finding all
frequent itemsets in a database is difficult, since it involves searching all possible itemsets (item
combinations). The set of possible itemsets is the power set over I and has size 2^n - 1
(excluding the empty set, which is not a valid itemset). Although the size of the power set grows
exponentially in the number of items n in I, efficient search is possible using the downward-closure
property of support (also called anti-monotonicity), which guarantees that for a frequent itemset
all its subsets are also frequent, and thus that for an infrequent itemset all its supersets must
also be infrequent. Exploiting this property, efficient algorithms (e.g., Apriori) can find all
frequent itemsets.
Organizing the data items on a shelf means finding the items that are purchased together more
frequently than others. Apriori is the classic and probably the most basic algorithm to do it.
Now we follow a simple golden rule: we say an item/itemset is frequently bought if it is bought
in at least 60% of the transactions (i.e. Minimum Support = 3). So here it should be bought at
least 3 times.
For simplicity, let M = Mango, O = Onion, J = Jar, K = Key-chain, E = Egg, C = Chocolate,
Co = Corn, A = Apple, Kn = Knife, and so on. So the table becomes:
Original table:

Transaction ID    Items Bought
T1                {M, O, J, K, E, C}
T2                {N, O, J, K, E, C}
T3                {M, A, K, E}
T4                {M, T, Co, K, C}
T5                {Co, O, O, K, Kn, E}
Step 1: Count the number of transactions in which each item occurs. Note that O = Onion is
bought 4 times in total, but it occurs in just 3 transactions.

Item (Candidate Set C1)    Number of transactions (Support)
M                          3
O                          3
J                          2
K                          5
E                          4
C                          3
N                          1
A                          1
T                          1
Co                         2
Kn                         1
Step 2: Now, we said an item is frequently bought if it is bought at least 3 times. So in this
step we remove all the items that are bought fewer than 3 times from the above table, and we are
left with:

Item (Frequent Item Sets L1)    Number of transactions (Support)
M                               3
O                               3
K                               5
E                               4
C                               3

These are the single items that are bought frequently. Now let's say we want to find pairs of
items that are bought frequently. We continue from the above table (the table in Step 2).
Step 3: We start making pairs from the first item, like MO, MK, ME, MC, and then we start with
the second item, like OK, OE, OC. We do not form OM because we already formed MO when we were
making pairs with M, and buying a Mango and an Onion together is the same as buying an Onion and
a Mango together. After making all the pairs we get:
Item pairs: MO, MK, ME, MC, OK, OE, OC, KE, KC, EC
Step 4: Now we count how many times each pair is bought together. For example, M and O are
bought together only in {M, O, J, K, E, C}, while M and K are bought together 3 times, in
{M, O, J, K, E, C}, {M, A, K, E} and {M, T, Co, K, C}. After doing that for all the pairs we get:
Item Pairs (Candidate Set C2)    Number of transactions (Support)
MO                               1
MK                               3
ME                               2
MC                               2
OK                               3
OE                               3
OC                               2
KE                               4
KC                               3
EC                               2
Step 5: The golden rule to the rescue. Remove all the item pairs with a number of transactions
less than three, and we are left with:

Item Pairs (Frequent Item Sets L2)    Number of transactions (Support)
MK                                    3
OK                                    3
OE                                    3
KE                                    4
KC                                    3
Step 6: To make sets of three items we need one more rule (it is termed self-join).
It simply means: from the item pairs in the above table, we find two pairs with the same first
letter, so we get
OK and OE, this gives OKE
KE and KC, this gives KEC
Then we find how many times O, K, E are bought together in the original table, and the same for
K, E, C, and we get the following table:
Item Set (Candidate Set C3)    Number of transactions (Support)
OKE                            3
KEC                            2
While we are on this: suppose you have sets of 3 items, say ABC, ABD, ACD, ACE, BCD, and you
want to generate item sets of 4 items; you look for two sets having the same first two letters,
so ABC and ABD give ABCD, and ACD and ACE give ACDE.
And so on. In general, you have to look for sets differing only in the last letter/item.
Step 7: So we again apply the golden rule: the item set must be bought together at least 3 times,
which leaves us with just OKE, since K, E, C are bought together just two times.
Thus the set of three items that are bought together most frequently is:
Frequent Item Set L3 = {O, K, E}.
CONCLUSION
Thus we have successfully implemented the Apriori approach for data mining to organize the data
items on a shelf, using a table of items purchased in a mall.
FAQ
1. Define Frequent sets, confidence, support and association rule.
2. Explain whether association rule mining is a supervised or unsupervised type of learning.
3. What is Association rule?
TID    Items
100    A, C, D
200    B, C, E
300    A, B, C, E
400    B, E
Experiment No: 12
Title: Implementation of the K-NN approach;
take a suitable example.
Roll No:________
Batch:_____
Class:_____
Particulars                 Marks
Attendance (05)
Journal (05)
Performance (05)
Understanding (05)
Total (20)
Signature of Staff Member:
Experiment no 12
TITLE: Implementation of the K-NN approach with a suitable example.
AIM:-
THEORY:
K-NN Approach:
K nearest neighbors is a simple algorithm that stores all available cases and classifies new cases
based on a similarity measure (e.g., distance functions). KNN has been used in statistical
estimation and pattern recognition. Nearest-neighbor classifiers are based on learning by
analogy, that is, by comparing a given test tuple with training tuples that are similar to it. The
training tuples are described by n attributes. Each tuple represents a point in an n-dimensional
space. In this way, all of the training tuples are stored in an n-dimensional pattern space. When
given an unknown tuple, a k-nearest-neighbor classifier searches the pattern space for the k
training
tuples that are closest to the unknown tuple. These k training tuples are the k nearest neighbors
of the unknown tuple.
Closeness is defined in terms of a distance metric, such as Euclidean distance.
The Euclidean distance between two points or tuples, say, X1 = (x11, x12, ..., x1n) and
X2 = (x21, x22, ..., x2n), is

    dist(X1, X2) = sqrt( (x11 - x21)^2 + (x12 - x22)^2 + ... + (x1n - x2n)^2 )
Algorithm:
1. Determine the parameter K = number of nearest neighbors beforehand. This value is all
up to you.
2. Calculate the distance between the query-instance and all the training samples. You can
use any distance algorithm.
3. Sort the distances for all the training samples and determine the nearest neighbor based
on the K-th minimum distance.
4. Since this is supervised learning, get all the Categories of your training data for the sorted
value which fall under K.
5. Use the majority of nearest neighbors as the prediction value.
Example:
An implementation of KNN:
* Uses Euclidean distance.
* Classifies whether an entry is male or female based on height and weight.

1. Training data (with K = 3):

Height    Weight    Class
175       80        Male
193.5     110       Male
163       110       Female
160       60        Female
2. Calculate the distance between the query instance (Height = 170, Weight = 60) and all the
training samples, here using the squared Euclidean distance:

Height    Weight    Distance
175       80        (175-170)^2 + (80-60)^2    = 425
193.5     110       (193.5-170)^2 + (110-60)^2 = 3052.25
163       110       (163-170)^2 + (110-60)^2   = 2549
160       60        (160-170)^2 + (60-60)^2    = 100
3. Sort the distances and determine the nearest neighbors based on the K-th minimum distance
(K = 3):

Height    Weight    Distance    Rank    Included in 3 nearest neighbors?
175       80        425         2       Yes
193.5     110       3052.25     4       No
163       110       2549        3       Yes
160       60        100         1       Yes
4. Gather the category of each of the 3 nearest neighbors:

Height    Weight    Distance    Included in 3 nearest neighbors?    Category
175       80        425         Yes                                 Male
193.5     110       3052.25     No                                  Male
163       110       2549        Yes                                 Female
160       60        100         Yes                                 Female
5. Use the simple majority of the categories of the nearest neighbors as the prediction for the
query instance; two of the three nearest neighbors are Female.
6. Output: for our query Height = 170 and Weight = 60, Class = Female.
CONCLUSION
Thus we have successfully implemented the KNN approach for classifying data.
FAQ
1. Define the concept of classification.
2. What is a Decision Tree?
3. What is an Attribute Selection Measure?
4. Describe tree pruning methods.
5. Explain the data mining functionalities.
6. Classification is supervised learning. Justify.
7. Explain different classification techniques.