Lecture Notes

CMPT-225 Assignment4

Hash Tables.

Due on Tuesday April 15th, at 23:59.

Problem Statement

You are required to investigate difference strategies for implementing the hash tables. You are given a list of keywords and the specification of a hashing method in an input file and is asked to insert or delete the keywords to a hash table with the given specification. The hashing methods include: open address linear probing, open address quadratic probing and separate chaining. After the insertion and deletion operations is done you need to output the average number of comparisons and the work load.

Input file.

Input file name is hash-input.txt. The first line of the input file specifies the hashing method specifications, table size and the number of operations. The first string indicates the hashing method. LP stands for linear probing, QP stands for quadratic probing and SC stands for separate chaining. The second item is the table size and third item is the number of operations that follows. For instance:

LP 100 7

means that the hashing method is linear probing, the size of the table is 100 and there will be 7 operations for this hash table. The operations are either Insert or Delete. A positive number indicate an insertion of an item with the key equal to that number into the hash table and a negative number indicates the deletion of the item with a key equal to that number from the hash table. For instance:

LP 100 7

786237 //meaning insert an item with the key 786237 into the table.

980986

134237

-980986 //meaning delete the item with key 980986 from the table

-235137

340901

-680471

A file may contain several runs, for instance the following file has 3 runs:

LP 100 7 //First Run

786237

980986

134237

-980986

-235137

340901

-680471

QP 1000 3 //Second Run

797937

658208

-582937

SC 10 5 //Third Run

686296

687429

597929

700746

-358929

Output.

For each run in the file you need to output the average number of comparisons for the operations and also the work load of the table at the end.

How to calculate the number of comparison for each operation?

It’s very important that you follow the below guidelines to compute the average number of operations. If you fail to follow exactly according to the following specification your output won’t match the correct output and you will lose a great deal of marks.

For Separate Chaining hash the number of comparison for the insert operation is 0 because we insert each new item in the beginning of the corresponding chain. The number of comparisons for delete operation in a separate chaining hash table is the number of the nodes that is visited in the corresponding linked list. If the linked list is empty (i.e the location in the table for the key to be deleted is null) then the number of operation is 0.

For the Linear or Quadratic probing the number of comparisons is the number of the locations in the table that is visited when a key is inserted or deleted from the table. This include the items that were deleted and marked as available.

The output for the above input file with three runs must be as follows:

Run1

Average number of comparisons: 1.4285714285714286

Work load: 0.03

Run2

Average number of comparisons: 1.3333333333333333

Work load: 0.0020

Run3

Average number of comparisons: 0.4

Work load: 0.4

You can download more sample input and their corresponding output from here.

Implementation

Basically you need to implement two hash table classes one that implements the open address hashing and one that implements separate chaining. For the open address hash table you can use the following template.

public class HashNode<K, V>{

Private K key;

Private V value;

….

}

public class OpenAddressHashTable <K, V>{

private int tableSize;

private HashNode<K,V> [] table;

private probeType;

public OpenAddressHashTable(int ts, String probeType){

tableSize = ts;

table = new HashNode[ts];

this.probeType = probeType;

}

….

}

For the separate chaining hash table you can use the following template.

public class ChainNode<K, V>{

private K key;

private V value;

private ChainNode<K, V> next;

}

public class SeparateChainHashTable <K, V>{

private int tableSize = 101;

private ChainNode<K, V>[] table;

public SeparateChainHashTable (int ts){

tableSize = ts;

table = new ChainNode[ts];

}

….

}

For each run in the file you need to create a separate hash table object from the appropriate class and perform the operations in the run and print the statistics at the end.

Specific Requirements

· For Java programmers: You must name your main class (the class with the main method) hashTest.java.

Important Remarks

It’s very important that you make sure your program compiles without any problem (you may get 0 if your program didn't compile-see the submission section below).

Make sure that your input and output file format is exactly the same as the requirement stated previously (see the display information option).

Submission

Submit your source files only (for java users: .java and for c++ users: .cpp and .hpp or .h). You can use any compiler that you want but you need to make sure that your program will compile under the Linux. It’s easy, create a directory called src and copy all your source files in it then in a Linux shell type javac hashTest .java for compiling your program and java hashTest for running your program. For c++ user you need to run g++ *.cpp –o hashTest command to compile your program and then ./hashTest to run it. If you are writing in Java this should work without any problem however, your c++ source files from some compilers (for instance Visual C++) may not directly compile in Linux. Drop by my or your TA’s office hours if you have difficulty compiling in Linux. Zip all your source files in the src directory and submit it.