In this assignment you are required to implement the Bond Energy Algorithm (BEA) for vertical fragmentation. Your code should contain two separate procedures, AA Generator and CA Generator. AA Generator takes as input all attributes of a relation, a set of queries, and the access frequencies of those queries at different sites, and outputs an attribute affinity matrix AA. CA Generator takes an affinity matrix AA as input and outputs a clustered affinity matrix CA. For a description of the BEA algorithm and the definitions of AA and CA, please see the lecture slides or the textbook.
In this assignment, the attribute affinity is measured by the extended Otsuka-Ochiai coefficient (https://en.wikipedia.org/wiki/Yanosuke_Otsuka) instead of the traditional method described in the textbook. The following equations show the details of the computation, where q is the number of queries, m is the number of sites, and Aik is the number of times attribute Ai is accessed by query qk, summed over all sites. The result of the division must be rounded up to the nearest integer. (Using DOUBLE instead of FLOAT during the calculation may help you get the correct result.)
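The equations themselves are missing from this copy of the handout. The form below is a reconstruction, not the official statement: it follows the definitions in the paragraph above and reproduces every entry of the example AA matrix given later in this document. The symbols use(q_k, A_i) (1 if query q_k references attribute A_i, otherwise 0) and acc_l(q_k) (access frequency of q_k at site S_l) are assumed notation in the style of the usual textbook presentation, and k ranges over all queries.

```latex
A_{ik} = \mathrm{use}(q_k, A_i) \cdot \sum_{l=1}^{m} \mathrm{acc}_l(q_k)

AA_{ij} = \left\lceil
  \frac{\sum_{k} A_{ik} \, A_{jk}}
       {\sqrt{\sum_{k} A_{ik} \cdot \sum_{k} A_{jk}}}
\right\rceil
```

As a check against the example: PNAME (A2) is used by q2 (5 accesses in total) and q3 (75 accesses in total), so AA22 = ceil((5^2 + 75^2) / sqrt(80 * 80)) = ceil(70.625) = 71.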
Example
For AA Generator:
Input
• The relation, called PROJ, has the following features Ai:
Label   Name
A1      PNO
A2      PNAME
A3      BUDGET
A4      LOC
• Queries (qi):
q1: SELECT BUDGET FROM PROJ WHERE PNO=Value
q2: SELECT PNAME, BUDGET FROM PROJ
q3: SELECT PNAME FROM PROJ WHERE LOC=Value
q4: SELECT SUM(BUDGET) FROM PROJ WHERE LOC=Value
• Access frequency matrix ACC, where Si denotes the i-th site:
      S1   S2   S3
q1    15   20   10
q2     5    0    0
q3    25   25   25
q4     5    0    0
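A minimal sketch of how an AA Generator could process this input, written in Python for illustration. Several things here are assumptions rather than part of the handout: the function name aa_generator, the naive token match used to decide which attributes a query references (a real solution may need proper SQL parsing), and the affinity formula itself, which is taken to be the sum over queries of Aik * Ajk divided by the square root of (total accesses of Ai) * (total accesses of Aj), rounded up. With the example data it yields the AA matrix listed under Output.

```python
import math
import re

# Example input from this handout.
ATTRIBUTES = ["PNO", "PNAME", "BUDGET", "LOC"]
QUERIES = [
    "SELECT BUDGET FROM PROJ WHERE PNO=Value",
    "SELECT PNAME, BUDGET FROM PROJ",
    "SELECT PNAME FROM PROJ WHERE LOC=Value",
    "SELECT SUM(BUDGET) FROM PROJ WHERE LOC=Value",
]
ACC = [[15, 20, 10], [5, 0, 0], [25, 25, 25], [5, 0, 0]]  # ACC[k][l]: query k at site l


def aa_generator(attributes, queries, acc):
    """Build the attribute affinity matrix (assumed formula, see lead-in)."""
    n = len(attributes)
    # use[k][i] is 1 when query k mentions attribute i (naive token match).
    use = []
    for sql in queries:
        tokens = set(re.findall(r"[A-Za-z_][A-Za-z_0-9]*", sql.upper()))
        use.append([1 if name.upper() in tokens else 0 for name in attributes])
    # a[i][k]: accesses of attribute i by query k, summed over all sites.
    a = [[use[k][i] * sum(acc[k]) for k in range(len(queries))] for i in range(n)]
    totals = [sum(row) for row in a]
    aa = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            numerator = sum(x * y for x, y in zip(a[i], a[j]))
            denominator = math.sqrt(float(totals[i]) * float(totals[j]))
            # Guard against attributes that no query accesses.
            aa[i][j] = math.ceil(numerator / denominator) if denominator > 0 else 0
    return aa


if __name__ == "__main__":
    for label, row in zip(["A1", "A2", "A3", "A4"], aa_generator(ATTRIBUTES, QUERIES, ACC)):
        print(label, row)
```

Python floats are double precision, which is in line with the DOUBLE hint in the assignment text. Rounding up is done with math.ceil, so a quotient that is mathematically an integer but lands slightly above it through floating-point error would be rounded one too high; that does not occur with the example data.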
Output
• The attribute affinity matrix AA:
      A1   A2   A3   A4
A1    45    0   41    0
A2     0   71    1   71
A3    41    1   38    1
A4     0   71    1   71
For CA Generator:
Input
• The attribute affinity matrix AA:
      A1   A2   A3   A4
A1    45    0   41    0
A2     0   71    1   71
A3    41    1   38    1
A4     0   71    1   71
Output
• The clustered affinity matrix CA:
      A1   A3   A4   A2
A1    45   41    0    0
A3    41   38    1    1
A4     0    1   71   71
A2     0    1   71   71
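The clustering step can be sketched as follows, again in Python and only as an illustration. It follows the usual textbook description of BEA: the first two attributes are placed as given, and each remaining attribute is inserted at the position with the largest contribution cont(Ai, Ak, Aj) = 2*bond(Ai, Ak) + 2*bond(Ak, Aj) - 2*bond(Ai, Aj), where a missing neighbour at either edge contributes a bond of 0. The names bond, cont and ca_generator are hypothetical, and breaking ties in favour of the leftmost position is an assumption; it is what reproduces the order A1, A3, A4, A2 in the example, where the last two positions for A4 tie.

```python
# Example AA matrix from this handout (order A1, A2, A3, A4).
AA = [
    [45, 0, 41, 0],
    [0, 71, 1, 71],
    [41, 1, 38, 1],
    [0, 71, 1, 71],
]


def bond(aa, x, y):
    """Bond between columns x and y; index -1 stands for an empty edge column."""
    if x < 0 or y < 0:
        return 0
    return sum(row[x] * row[y] for row in aa)


def cont(aa, left, mid, right):
    """Net contribution of placing column mid between left and right."""
    return 2 * bond(aa, left, mid) + 2 * bond(aa, mid, right) - 2 * bond(aa, left, right)


def ca_generator(aa):
    """Return (column order, clustered affinity matrix) for affinity matrix aa."""
    n = len(aa)
    order = list(range(min(n, 2)))  # the first two attributes are placed as given
    for k in range(2, n):
        best_pos, best_gain = 0, None
        for pos in range(len(order) + 1):
            left = order[pos - 1] if pos > 0 else -1
            right = order[pos] if pos < len(order) else -1
            gain = cont(aa, left, k, right)
            # Strict comparison keeps the leftmost position on ties (assumption).
            if best_gain is None or gain > best_gain:
                best_pos, best_gain = pos, gain
        order.insert(best_pos, k)
    # Rows are reordered the same way as columns.
    ca = [[aa[i][j] for j in order] for i in order]
    return order, ca


if __name__ == "__main__":
    order, ca = ca_generator(AA)
    print(["A%d" % (i + 1) for i in order])
    for row in ca:
        print(row)
```

For the example, A3 is inserted between A1 and A2 (contribution 7084 against 6806 and 360 for the two edge positions), and A4 then scores 20166 both between A3 and A2 and at the right edge, so the tie-breaking rule decides the final order.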