Database Management Systems

computer science


The University - Computer and Information


CIS Database Management Systems Fall 201


Total points: 30

Implement the hash-join algorithm for a DBMS. Use C++ or Java programming language to write

the program. The program should include comments to make it readable. Turn in a brief report/

description about the program design and implementation (e.g., high-level program diagram and

data/file structures) and program usage; (2) the program source code; (3) proof of compilation (e.g.,

the screen snapshot of a successful compilation); and (4) sample execution outputs. Assemble the

above required contents in a Word or PDF file for submission.

The program specificiation is given as follows. Let R1(a1, a2, a3) and R2(b1, b2, b3, b4) be two realtions with all i nteger attributes. Tuples i n these two relations are sequentially stored in two data

files, respectively (use .txt files).

 Use the hash-join algorithm to implement a join (equijoin) of R1 and R2. Assume that the

a hash function is f(k) = k mod N, where N is the number of buckets allowed in your hash


 Your program should allow a user to choose the joining attributes from the two relations, i.e.,


R1 ./ R2

for any chosen pair of ai and bj, where ai

are the i-th attribute in R1 (1 ≤ i ≤ 3) and bj is the j-

th attribute in R2 (1 ≤ j ≤ 4). For example, a user may want to perform R1 ./R1.a2=R2.b3 R2.

 Your program should display the join result and output the selectivity of the join.

 You may request a user to interactively input the necessary parameters, such as the data file

names for R1 and R2, the number of tuples in each relation, and the joining attributes (e.g.,

1 for the 1st attribute, 3 for the 3rd attribute).

 Use your program to perform several joins for different relation instances of R1 and R2.

Related Questions in computer science category