National Institute of Technology Karnataka, Surathkal: A Project Report On
A Project Report on
Prof. Ananthanarayana V. S.
Submitted by:
This algorithm starts from the attribute affinity matrix and generates
initial groups based on the affinity values between attributes. Then, it
attempts to merge the initial groups to produce final groups that will represent
the fragments.
Contents
1. Introduction
2. Related Work
5. Implementation Details
6. Conclusion
7. References
1. Introduction
2. Related Work
Step 1. Iterate starting from the first attribute (the first row of the affinity
matrix), trying to generate a group by joining it to the other attribute(s) with
which it has the highest affinity value, Max(aff(i, j)), forming the first initial
group. The resulting group has a power factor P(g) equal to the affinity value
aff(i, j). There are three possible scenarios. First, the two attributes are
independent (they do not belong to any initial group); in this case we group them
directly if the selected highest affinity value satisfies
aff(i, j) ≥ P(Ai) * ALF/100, where ALF is the Attribute Link Factor. Second, one of
the attributes i or j belongs to a group k; in this case we join the independent
attribute to group k if aff(i, j) ≥ P(gk). Third, attribute Ai is in group k and
attribute Aj is in group l; in this case we join the two groups if P(gk) = P(gl).
At the end of this step we have all possible initial groups.
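The three scenarios of Step 1 can be sketched in Java as follows. This is only a minimal illustration under stated assumptions, not the report's actual code: the class and method names are hypothetical, P(Ai) is passed in by the caller as the array attrPower because its definition is not given in this section, ties in the highest affinity are broken by taking the first attribute, and the group power is left unchanged in the second and third scenarios because the text does not say how it should change.

```java
import java.util.Arrays;

// Illustrative sketch of Step 1; all identifiers are hypothetical.
final class StepOneSketch {

    /**
     * Returns group[a] = id of the initial group holding attribute a, or -1 if
     * a stays independent. groupPower[g] receives P(g) for each group id g.
     * attrPower[i] stands in for P(Ai) and is supplied by the caller (assumption).
     */
    static int[] initialGroups(int[][] aff, int[] attrPower, int alf, int[] groupPower) {
        int n = aff.length;
        int[] group = new int[n];
        Arrays.fill(group, -1);
        int nextId = 0;
        for (int i = 0; i < n; i++) {
            int j = bestPartner(aff, i);
            if (j < 0 || aff[i][j] <= 0) continue; // no usable affinity in this row
            int a = aff[i][j];
            int gi = group[i];
            int gj = group[j];
            if (gi < 0 && gj < 0) {
                // Scenario 1: both independent, test aff(i,j) >= P(Ai) * ALF / 100.
                if ((long) a * 100 >= (long) attrPower[i] * alf) {
                    group[i] = nextId;
                    group[j] = nextId;
                    groupPower[nextId] = a;
                    nextId++;
                }
            } else if (gi < 0 || gj < 0) {
                // Scenario 2: exactly one is grouped, test aff(i,j) >= P(gk).
                int k = Math.max(gi, gj);
                if (a >= groupPower[k]) {
                    group[i] = k;
                    group[j] = k;
                }
            } else if (gi != gj && groupPower[gi] == groupPower[gj]) {
                // Scenario 3: both grouped, merge when the two powers are equal.
                for (int x = 0; x < n; x++) {
                    if (group[x] == gj) group[x] = gi;
                }
            }
        }
        return group;
    }

    /** Index j != i with the largest aff(i, j); the first index wins ties. */
    static int bestPartner(int[][] aff, int i) {
        int best = -1;
        for (int j = 0; j < aff.length; j++) {
            if (j != i && (best < 0 || aff[i][j] > aff[i][best])) best = j;
        }
        return best;
    }

    public static void main(String[] args) {
        int[][] aff = {
            {45, 0, 45, 0},
            {0, 80, 5, 75},
            {45, 5, 53, 3},
            {0, 75, 3, 78}
        };
        int[] attrPower = {45, 80, 53, 78}; // made-up P(Ai) values for the demo
        int[] groupPower = new int[aff.length];
        int[] group = initialGroups(aff, attrPower, 50, groupPower);
        System.out.println(Arrays.toString(group));
    }
}
```

With the made-up matrix and ALF = 50 used in main, the first and third attributes form one initial group with power 45, and the second and fourth form another with power 75.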
Step 2. Iterate starting from the first initial group produced in Step 1,
searching for the “best extension”. There are two possible scenarios. First, the
“best extension” connects attribute Ai in group k with an attribute Aj that was not
joined to any initial group in Step 1; in this case the independent attribute Aj is
joined to group k if the condition aff(i, j) ≥ P(gk) * GLF/100 is true, and the
extended group's power becomes equal to aff(i, j). Second, the “best extension”
connects attribute Ai in group k with attribute Aj in group l; in this case both
conditions aff(i, j) ≥ P(gk) * GLF/100 and P(gl) ≥ P(gk) * GLF/100 must be true,
and the new group's power is equal to the power of group l.
We repeat this step until no further “best extension” can be found, which gives
the final groupings of the algorithm.
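The two acceptance tests of Step 2 can be written as small Java predicates. This is a hedged sketch: the class and method names are hypothetical, and the search for the “best extension” itself is not shown, so the affinity of the chosen extension is simply passed in as a parameter.

```java
// Illustrative sketch of the Step 2 conditions; all identifiers are hypothetical.
final class ExtensionSketch {

    /**
     * Scenario 1: Aj is independent. It may join group k when
     * aff(i, j) >= P(gk) * GLF / 100; the extended group's power becomes aff(i, j).
     */
    static boolean canAbsorbAttribute(int aff, int powerK, int glf) {
        return (long) aff * 100 >= (long) powerK * glf;
    }

    /**
     * Scenario 2: Aj already belongs to group l. Groups k and l may be joined when
     * aff(i, j) >= P(gk) * GLF / 100 and P(gl) >= P(gk) * GLF / 100;
     * the new group's power becomes P(gl).
     */
    static boolean canMergeGroups(int aff, int powerK, int powerL, int glf) {
        return (long) aff * 100 >= (long) powerK * glf
            && (long) powerL * 100 >= (long) powerK * glf;
    }

    public static void main(String[] args) {
        // Made-up values: P(gk) = 75, P(gl) = 45, aff(i, j) = 45, GLF = 50.
        System.out.println(canAbsorbAttribute(45, 75, 50));
        System.out.println(canMergeGroups(45, 75, 45, 50));
    }
}
```

Multiplying both sides by 100 keeps the comparison in integer arithmetic and avoids the rounding that integer division by 100 would introduce.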
5. Implementation Details
The enhanced grouping algorithm is implemented in core Java. The aim of the
algorithm is to divide the attributes of a relation into groups. This can be shown
as follows.
Input:
Here the user specifies the number of attributes in the relation and the usage
matrix, and then enters the access frequency of each query. Once the user presses
the Enter button, the program first computes the attribute affinity matrix, which
is then used to divide the attributes into groups.
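The affinity computation described above can be sketched in Java as follows. The sketch assumes the common definition in which aff(i, j) is the sum of the access frequencies of all queries that use both attributes i and j, with a single frequency per query. The class name, method name and sample data are illustrative and are not taken from the report's program.

```java
import java.util.Arrays;

// Illustrative sketch; identifiers and sample data are hypothetical.
final class AffinitySketch {

    /**
     * usage[q][a] is 1 if query q accesses attribute a, otherwise 0.
     * freq[q] is the access frequency of query q.
     * Assumption: aff(i, j) = sum of freq[q] over all queries q using both i and j.
     */
    static int[][] affinity(int[][] usage, int[] freq) {
        int n = usage.length == 0 ? 0 : usage[0].length;
        int[][] aff = new int[n][n];
        for (int q = 0; q < usage.length; q++) {
            for (int i = 0; i < n; i++) {
                if (usage[q][i] == 0) continue; // query q does not touch attribute i
                for (int j = 0; j < n; j++) {
                    if (usage[q][j] != 0) aff[i][j] += freq[q];
                }
            }
        }
        return aff;
    }

    public static void main(String[] args) {
        int[][] usage = {
            {1, 0, 1, 0},
            {0, 1, 1, 0},
            {0, 1, 0, 1},
            {0, 0, 1, 1}
        };
        int[] freq = {45, 5, 75, 3};
        for (int[] row : affinity(usage, freq)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The resulting matrix is symmetric, and under this definition each diagonal entry aff(i, i) is the total frequency of the queries that use attribute i.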
Output:
The Attribute Affinity Matrix is calculated from the usage matrix and the access
frequency of each query. The algorithm then tries to find all possible groups,
starting from the first row and first column and searching for the attributes that
satisfy the Attribute Link Factor condition, and groups those attributes as shown.
It also calculates the group power of each possible group in the relation. Finally,
it performs the group extension to obtain the final groups with their group powers,
and the output is produced in the form of partitioned attributes. The output of the
enhanced algorithm is as follows:
6. Conclusion
This algorithm is more flexible than our previous Grouping Algorithm and more
efficient for the vertical partitioning problem, because the added factors provide
more control over the final groups produced, based on the problem specifications.
A major advantage of this algorithm is that it is simple to understand and easy to
implement (only two steps).
Our final results for the 10- and 20-attribute examples were identical to those
obtained by Navathe et al. and by Navathe & Ra's graphical algorithm, but with
better performance and more flexibility.
This algorithm is more efficient for the vertical partitioning problem because it
eliminates the deficiencies of binary partitioning and the complexity of the
graphical algorithm. Finally, we note that the values of the enhancement factors
are chosen based on several qualitative and quantitative issues, such as the
network bandwidth, the number of sites, the number of attributes in a relation, the
frequency and type (retrieval or update) of the queries/transactions, the nature of
the distributed database management system used (heterogeneous or homogeneous), and
lastly the person who designs and tests the results.
7. References