Lab 16: Hashing Part 2

A. Intro

Pull the lab 16 code from github and create a new IntelliJ project out of it, as usual.

Learning Goals

Today's exercises almost all focus on hashing of non-strings, to counter student misconceptions that hashing is used only with strings or even integers. In particular, there are exercises that involve memoization, a technique for avoiding repeated computation that is very important when calculating certain expensive functions.

B. Exercise: PhoneBook with HashMap

Complete the following exercise to cement your understanding of the last lab on hashing.

Add code to complete the PhoneBook class and possibly to the Person and PhoneNumber class. You must use HashMap from java.util! The point of this exercise is to give you practice with it.

Run the test cases in PhoneBookTest.java to check that the code works. Then answer the following questions to check your understanding of the desired behavior.

What type was the Key in the HashMap book used in the PhoneBook?

PhoneNumber Object
Incorrect.
String representing the name
Incorrect.
String representing the phone number
Incorrect.
PhoneBook Object
Incorrect.
Person Object
Correct!
Check Solution

What type was the Value in the HashMap book used in the PhoneBook?

PhoneNumber Object
Correct!
String representing the name
Incorrect.
String representing the phone number
Incorrect.
PhoneBook Object
Incorrect.
Person Object
Incorrect.
Check Solution

What happens when you try to add the same person to the phone book twice?

An exception is thrown
Incorrect.
The second entry is ignored because an entry already exists
Incorrect.
Duplicate entries are added to the HashMap
Incorrect.
The second entry's value replaces the first's
Correct!
Check Solution

What happens if you modify a PhoneNumber that is already stored in the Book?

You are now no longer able to look up the PhoneNumber for that Person object.
Incorrect.
The PhoneBook remembers the old entry and is not changed because they reference different objects
Incorrect.
An exception is thrown
Incorrect.
The entry is reflected in the Phone book because they reference the same object
Correct!
Check Solution

What happens if you modify a Person that is already stored in the Book?

The entry is reflected in the Phone book because they reference the same object and you can still look up the PhoneNumber
Incorrect.
The entry is reflected in the Phone book because they reference the same object, but you can no longer look up the PhoneNumber
Correct!
The PhoneBook creates a second entry for the second person and still references the old Object as well
Incorrect.
An exception is thrown
Incorrect.
Check Solution

Note: For each question ensure that you understand why the answer is so. Consult with your neighbors and TA, especially for the forth and fifth questions. Before moving on to the rest of the lab, take a moment to explain what happened in questions 4 and 5 to either your TA or a lab assistant.

C. Hashing Different Objects

Hashing Tic-tac-toe Boards

The code in TTThashTest.java stores all possible Tic-tac-toe boards into a java.util.HashSet object. Create a hashCode method in the TTTboard class. Now, test these two ways to hash Tic-tac-toe boards (submit the second approach to the autograder):

Toggle Solution

Hint about hashing a board as a nine-digit numeral:

This tic-tac-toe board ...

| X | X | O |
|---|---|---|
| X | O |   |
|---|---|---|
|   |   | O |

could be represented as ...

| 1 | 1 | 2 |
|---|---|---|
| 1 | 2 | 0 |
|---|---|---|
| 0 | 0 | 2 |

or as:

112120002 (base 3).

which is equal to:

1*3^8 + 1*3^7 + 2*3^6 + 1*3^5 + 2*3^4 + 0*3^3 + 0*3^2 + 0*3^1 + 2*3^0

which is equal to:

2*3^0 + 0*3^1 + 0*3^2 + 0*3^3 + 2*3^4 + 1*3^5 + 2*3^6 + 1*3^7 + 1*3^8

which is equal to:

2 + 3*(0 + 3*(0 + 3*(0 + 3*(2 + 3*(1 + 3*(2 + 3*(1 + 3*(1)))))))) 

Discussion: Two Hash Functions

Which hash function performed better? Suggest why.

Properties of Valid Hash Codes:

Properties of Good Hash Codes (non-exhaustive list):

Discussion: Random Hash Code

If I made my hashCode function return a random int equally spread over all possible ints, the hash codes would be quick to compute and spread over the ints as perfectly as possible. Why is this not a good hash code?

Discussion: Linked List Hash Code

Imagine a simple LinkedList class that has ListNode objects inside. The LinkedList class has the following hashCode method:

public int hashCode() {
    if (myHead == null) {
        return 17;
    } else {
        return myHead.hashCode();
    }
}

This class essentially relies on the hash codes of its nodes. So, consider the following three options for the hashCode method to put inside ListNode. For each, do you think it is a good hashCode method? Evalutate it according to the criteria given above.

In the ListNode class:

1.

    public int hashCode() {
        int returnValue = myItem.hashCode();
        if (myNext != null) {
            returnValue += myNext.myItem.hashCode();
        }
        return returnValue;
     }

2.

    public int hashCode() {
        int returnValue = myItem.hashCode();
        if (myNext != null) {
            returnValue += myNext.hashCode();
        } 
        return returnValue;
    } 

3.

public int hashCode() {
    int returnValue = super.hashCode(); // the machine address of this ListNode
    if (myNext != null) {
        returnValue += myNext.hashCode();
    } 
    return returnValue;
} 

D. Hashing for Memoization

Implementing Memoization Using Hashing

In the next few steps you'll implement memoization for fibonacci. Memoization allows redundant recursive calls to be eliminated by saving the result of each new call and then looking up the result instead of recomputing it.

This method can also be used to solve games:

Take for example a game-playing program that makes moves. It will search the tree of moves to find winning configurations:

Search for a winning move.
    If the configuration has been seen before, 
        return its value.
    If it's an immediately losing configuration,
        store the configuration and "loss" in the table
        and return "loss".
    If it's an immediately winning configuration,
        store the configuration and "win" in the table
        and return "win".
    Otherwise, for each possible move, do the following:
        make the move;
        make a recursive call on Search;
        store the configuration and the value in the table;
        if the value is "win", return it;
        otherwise unmake the move and try another.
    If all moves have been analyzed and none are winners,
        return "loss".

Exercise: Implement Memoization for Fibonacci

Modify the Fibonacci.java code to use memoization. We recommend that you use a HashMap even though you could keep track of previously computed values with an array.

--------

Suppose that 31 distinct integers are arranged in ascending order in an array named values. Suppose also that the following code is used in a method to locate an integer named key in the array.

int leftIndex = 0;
int rightIndex = values.length() - 1;
while (leftIndex <= rightIndex) {
    int middleIndex = (leftIndex+rightIndex)/2;
    if (values[middleIndex] == key) {
        return true;
    } else if (values[middleIndex] < key) {
        leftIndex = middleIndex+1;
    } else {
        rightIndex = middleIndex-1;
    }
}
return false;

The next two questions ask you to count the comparisons needed to find all 31 integers, and then to do the same thing with the values stored in a hash table.

Counting Binary Search Comparisons

The code in BinarySearch.java contains the code in the previous step, along with code to look up each element of the array. Modify the code to count the number of times a comparison is made between values[middleIndex] and key. (Correctness check: The total number of comparisons to find all the array elements is 227.)

Self-test: Counting hash table comparisons

Now suppose that the 31 integers are stored in a chained hash table with k chains, in which the hash function distributes the integers to chains as evenly as possible. We want to find every value in the hash table and count the number of comparisons necessary. What is the smallest value of k for which the total number of comparisons to find all 31 values in the hash table is less than the answer you computed in the previous part?

(Recall that 1 + 2 + ... + n = n(n+1)/2.)

Toggle Solution

Answer: 3

If you didn't get this right, try guess and check. Calculate how many elements are in each chain for 1 bucket (and the number of comparisons) then for 2 buckets (and the number of comparisons) etc.

E. Conclusion

Submission

Please turn in the following as lab 16:

Readings

The reading over the next two labs is DSA Ch 10.4-5.