联系方式

您当前位置:首页 >> Java编程Java编程

日期:2024-10-08 08:25

Data Structures and Algorithms Hash Tables

CRICOS Provide Code: 00301J Page 1 of 3

Note:

• hashArray stores the key, value and state (used, free, or previously-used)

of every hashEntry.

We must store both the key and value since we need to check hashArray

to tell if there is a collision and we should keep probing until we find the

right key.

• put(), hasKey() and get() must take the passed-in key and call hash() to

convert the key into an integer. This integer is then used as the index for

hashArray.

• Java Students: If you use a private inner class for DSAHashEntry, then

put(DSAHashEntry will need to be private, otherwise it will be public.

• There are many hash functions in existence, but all hash functions must

be repeatable (i.e., the same key will always give the same index). A good

hash function is fast and will distribute keys evenly inside hashArray.

Hash Tables

Updated: 21st

July, 2023

Aims

• To implement a hash table.

• To make the above hash table automatically resize.

• To save the hash table and reload it from a file.

Before the Practical

• Read this practical sheet fully before starting.

Activities

g

1. Hash Table Implementation

Following the lecture slides as a guide, Create DSAHashTable class and a companion

class called DSAHashEntry to implement a hash table with a simple hash function. Use

linear probing first since it’s easier to think about, then convert to double-hashing.

Assume the keys are strings and the values are Objects.

Data Structures and Algorithms Hash Tables

CRICOS Provide Code: 00301J Page 2 of 3  

Note:

• Of course, the latter depends on the distribution of the keys as well, so it’s

not easy to say what a good hash function will be without knowing the

keys.

For the purpose of this practical, just use one of the hash functions from

the lecture notes.

• Use linear probing or double-hashing to handle collisions when inserting.

• hasKey(), get() and remove() will need to use the same approach since

they also need to find the right item.

It’s probably a good idea to try make a private find() method that does

the probing for these three functions and returns the index to use. Use the

DSAHashEntry state to tell you when to stop probing.

• Be aware that remove() with probing methods adds the problem that it

can break probing unless additional measures are taken.

– In particular, say we added Key1, then Key2 which collides with

Key1, so we linearly probe and add Key2 to the next entry.

If we remove Key1, later attempts to get Key2 will fail because Key2

maps to where Key1 used to be.

Since it is now null, probing will abort and imply that Key2 doesn’t

exist.

– The solution is to use the state filed in DSAHashEntry that tracks

whether the entry has been used before or not.

2. Resizing a Hash Table

Modify your DSAHashTable to allow it to resize. There are various ways to determine

when to and how to resize a hash table.

The simplest way to determine when is to set an upper and lower threshold value

for the load factor. When the number of elements is outside of this, the put() or

remove() methods should call resize(size) automatically.

• Remember, this will be computationally expensive (what is it it in Big-O?), so it is

important not to set the threshold too low. Also, collisions occur more frequently

at higher load factors, thus it is equally important to not set the threshold too

high. Do some research to find "good" values.

A simple way to resize is to create a new array, then iterate over hashArray (ignoring

unused and previously used slots) and re-hashing (using put().

• To select a suitable size for the new array, you can either use a "look up" table of

suitable primes or re-calculate a new prime after doubling/halving the previous

size.

Test your resize functionality with a small hash table size, just so you know it will

work when you increase the size of the table. Data Structures and Algorithms Hash Tables

CRICOS Provide Code: 00301J Page 3 of 3

3. File I/O

To truly test your hash table implementation, you will need a large dataset. Read in

the RandomNames7000.csv as input to insert values into your hash table. There are some

duplicates in the file, so your program should be able to handle them.

It is also useful to be able to save the hash table. The save order is not important,

so just iterate through the keys and values in the order they are stored in the hash

table and write it to a .csv.

Submission Deliverable

• Your code are due 2 weeks from your current tutorial session.

– You will demonstrate your work to your tutors during that session

– If you have completed the practical earlier, you can demonstrate your work

during the next session

• You must submit your code and any test data that you have been using electronically

via Blackboard under the Assessments section before your demonstration.

– Java students, please do not submit the *.class files

Marking Guide

Your submission will be marked as follows:

• [6] Your DSAHashTable and DSAHashEntry are implemented correctly.

• [4] Your hash function is well thought out and properly implemented.

This means that it meets at least the first three criteria of a good hash function and

you can argue that it at least partially meets the last.

• [5] Your hash table resizes as you put and remove hash entries.

• [5] You can read in and save .csv files.

End of Worksheet


版权所有:留学生编程辅导网 2020 All Rights Reserved 联系方式:QQ:821613408 微信:horysk8 电子信箱:[email protected]
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:horysk8