If you are using simple chaining to handle collisions, and there are 4 buckets in your hash table

computer science

Description

Question 1 

a) If you are using simple chaining to handle collisions, and there are 4 buckets in your hash table, what happens when inserting the following key values [17,9,37,29,69,1649,2405,89,265,165,789].

How does this show problems with the chaining approach?

 

b) If you switch to using linear probing collision resolution and a larger number of buckets i.e. 16 buckets, where each contains one value, what will the resulting structure look like for the same key values from part a?

Will this improve search performance for those values when compared to the chaining approach?

 

Tip for Question 1:

You should provide a hash function.  The hash functions in these cases can be simplified to using k% Number_of_buckets, where % is the modulus/remainder operator, to split the input key values among the buckets.  (This is the same approach as is used in the example in the slides this week for both chaining and linear probing).    

Sketch out the resulting data structure with the key values inserted into the appropriate buckets.

 

 

Question 2

 

a) Linear hashing is one approach to hashing values to a dynamically changing file. Briefly outline this approach and illustrate the approach using the following record key values [5,56,13,41,18,20,11,23,43,57,94].

Note: You may assume that each block can contain two records, that the initial file contains two blocks (M = 2), and that splits will occur whenever an insert causes an insert into an overflow bucket. 

 

b) Change your approach to instead split when, after inserting, the load factor > 0.75. Will this result in the same hash table? Illustrate your answer. (load factor = number of all keys currently in table / number of (regular non-overflow) buckets currently in table*max items per bucket)

 

 

Question 3

 

a) What is the "Birthday paradox/Birthday problem" and how is it relevant to hash tables?

b) What is a "Random Oracle" and why is it not a hashing algorithm? 


Related Questions in computer science category