Lab 10: Generics and Iterators

A. Intro

Learning Goals

This is a two-part lab. The first part introduces you to generic classes and methods. These appear throughout the java.util class library. You will be using them for your own code as well. We intend that the lab activities demystify the concept of a generic class and method and reduce the likelihood of puzzling syntax errors.

Remaining activities address the use and implementation of iterators, objects that control iteration through the items in a collection via three methods: a constructor, hasNext(are there more items yet to be iterated?, and next (return the next item in the iteration). Iterators are provided for every collection class in java.util. More generally, they provide a mechanism for unifying the operations of processing an array in several formats, handling input, and traversing more complicated structures.

B. Generics

Intro to Generics

We noticed in recent lab activities that Java uses both an item's static type and its dynamic type to determine how a given reference should be interpreted. The static type of a variable constrains types values it may contain, and is determined by what the compiler sees. The dynamic type is the type of the variable's value. Here's an example.

    ModNCounter ctr1 = new ModNCounter(4);
    Counter ctr2 = new ModCounter(8);

ctr1 has static type ModNCounter and ctr2 has static type Counter because that was the information given to the compiler. Both variables have dynamic type ModNCounter because that's the type of the contents of ctr1 and ctr2.

Suppose now that we wish to store a collection of counters, some with type Counter and some with type ModNCounter. Suppose further that we wish to store the counters in an ArrayList.

The ArrayList class can store a collection of any type of Java Object. Since any class inherits from Object, you might use it to store a collection of Strings, a collection of Integers, a collection of Dogs, or even a collection of other ArrayLists, and of course a collection of counters. Because you could potentially receive any type of Object when you access an ArrayList element, you have to cast the result of get if you don't want it to have static type Object.

ArrayList collection = new ArrayList();
collection.add(new ModNCounter(5));
ModNCounter ctr = (ModNCounter) collection.get(0); // Without the cast it's a compile time error!

This is rather messy.

A generic class lets us clean up this mess. The online Java Tutorial says why:

"In a nutshell, generics enable types to be parameters when defining classes and methods. Much like the more familiar formal parameters used in method declarations, type parameters provide a way for you to re-use the same code with different inputs. The difference is that the inputs to formal parameters are values, while the inputs to type parameters are types."

Here's the rewritten code.

ArrayList<ModNCounter> collection = new ArrayList<ModNCounter>();
collection.add(new ModNCounter(5));
ModNCounter ctr = collection.get(0);

Writing a Generic Class

Let's see how to define our own generic class. Suppose we have a class Box that can store one object of any type, using generics. It's a contrived example — the Box class isn't very useful — but hopefully it gives you an example of generics syntax. Here's the way we imagine using the Box class:

Box<String> b1 = new Box<String>();
b1.putItem("abc");

Box<Point> b2 = new Box<Point>();
b2.putItem(new Point(1, 2));

String s = b1.getItem();
Point p = b2.getItem();

Notice that we can put and get String objects from a Box<String> object, or we could put and get Point objects from a Box<Point> object.

Now we'll show you how you could write this class:

public class Box <T> {
    // instance variable declaration
    private T item;

    // put an item in the box
    public void putItem(T item) {
        this.item = item ;
    {

    // get the item from the box
    public T getItem() {
        return item ;
    }
}

What is this T that we see appearing all over the class definition? It looks like its being used as a type. You can see it is the type of item, it is the return type of getItem, and it is the type of the input to putItem. But you've never heard of a Java class called T, right?

That's because T is a generic type. You can think of a generic type as a variable type. When you create a box, you can "pass in" other types to take the place of T. When we write Box b1 = new Box<String>();, String will take the place of T everywhere. So you can imagine that for b1, its item has type String, its getItem returns type String, and its putItem takes an input of type String. For Box b2 = new Box<Point>();, everything is in terms of the Point class intead.

T can be thought of as just a variable, so you can give it whatever name you want. You could have written E instead of T, or even something like ItemType. It is convention to name single generic types with single uppercase letters, however.

To declare that a class is generic and that it will be using something like T as a generic type, all you do is include the <T> in the class header.

Self-test: Using Generics

Let's test that you understand how to use generics. For each of the following snippets of code, answer whether or not they will compile. Imagine that these lines of code exist in some file outside the Box class.

Box<String> b1 = new Box<Point>();
Yes
Incorrect.
No
Correct! Although the box class can take any type, each box object is committed to using only one.
Check Solution
Box<String> b2 = new Box<String>();
b2.putItem(new Point(1, 2));
Yes
Incorrect.
No
Correct! Since b2 is declared to take only String s, it cannot take a Point as input.
Check Solution
Box<T> b3 = new Box<String>();
Yes
Incorrect.
No
Correct! T is a variable type that exists only inside of the Box class. Since this code is not inside of the Box class, it makes no sense to reference T , because there is no way to tell what type T should be.
Check Solution
Box<Point> b4 = new Box<Point>();
b4.putItem(new TracedPoint(3, 4)); // TracedPoint extends Point.
Yes
Correct! b4 is meant to take only Point objects. A TracedPoint is a Point , so it is allowed to be passed in.
No
Incorrect.
Check Solution
Box<Point> b5 = new Box<TracedPoint>(); // TracedPoint extends Point.
Yes
Incorrect.
No
Correct! It's true that the code Point p = new TracedPoint(3, 4); does work, because Point is a superclass of TracedPoint . But, Box<Point> is not considered a superclass of Box<TracedPoint> . Think about why: If you could do this line, then the compiler would think you should be allowed to put any Point object into b5 , when really you can only put a TracedPoint .
Check Solution

Generic Interfaces

In addition to generic classes, we can also make generic interfaces. The syntax for writing a generic interface is exactly the same as that for writing a generic class, so if you understood the former, you should be able to transfer that knowledge to the latter.

Exercise: Make DLList Generic

Remember your DLList from lab 6, and how we had to have each node in the list store an Object so that it could hold anything, but if we took something out and used it, we would have to cast it? Let's make it more versatile by adding in generics. You can feel free to use your implementation from lab 6, or if you don't want to, the skeleton includes our solution which you can build off of.

With generics, you should be able to do this:

public static void main(String[] args) {
    DLList<String> l = new DLList<>();
    l.insertFront("Me");
    l.insertBack("ow~!");
    System.out.println(l.get(0) + l.get(1));
}

Note that I can simply write new DLList<>() without the dynamic type if I want it to be constructed with the same generic type as declared statically.

C. Intro to Iterators

Iterators Defined

In CS 61BL, we're going to encounter a variety of different data structures, or ways to organize data. You've already seen the array, which we used to represent a FixedSizeList, an ordered list of elements. Later, we'll see more complicated data structures such as trees, maps, priority queues, and graphs.

A common operation on a data structure is to process every item in it. For the FixedSizeList or index-based arrays, it is relatively straightforward how to do this. We wrote a for loop:

for (int k = 0; k < length of the array; k++) {
    process the kth array element...
}

For IntList, the pattern wildly differs from above.

IntList someList = ...
for (IntListNode p = someList.head; p != null; p = p.next() ) {
    int item = p.item;
    // process the item...
}

Evidently, we need to write two very different codes in order to do the exactly same thing. It would be nice if we can write one code that we can reuse for different data structures. In other words, we wish to abstract away the internal implementation of data structures from the operations on them.

Furthermore, if we use a different data structure, a for loop like the one above may not make sense. For example, what would it mean to access the kth item of a set, where order of items is not defined? We need a more abstract notion of processing every item in a data structure, something that allows us to check every item regardless of how they're organized. To do that, we're going to use something called an iterator.

Here's an example of how we might want to use an iterator on some arbitrary data structure:

Iterator iter = initIterator();
while (iter.hasNext()){ // loop while we have more items to process.
    value = iter.next(); // "fetch" the next item through the iterator.
    process value in some way...
}

The idea is that we process an item, then ask for the next one, then repeat until there is no next item. next doesn't necessarily return items in a specific order, like our old for loop, because the notion of linear order may not even apply to our data structure. The only thing next guarantees is that it will return every item exactly once, somehow.

We have abstracted away a lot behind the next method — if our data structure is very complicated, next may have to do a lot of work to figure out what item should be returned next.

Iterator Methods

Let's look at the iterator methods we used in the last step in more detail.

In order for next to know what value to return next, there must be some place-keeping information that keeps track of how far along the iteration is. The iterator may need an extra instance variable or two for this.

Exercise: Using an Iterator

Let's practice by using iterators. All java collections have one. Check out the class PointUtils, and fill it out. Complete the methods using the given iterator — Don't use a for loop to directly access the elements via indices!

It's okay if you're unfamiliar with the details of the List class or the LinkedList class that appear in the code. The point of an iterator is that you don't have to understand how these classes work in order for you to use the iterator. You just have to be familiar with the iterator methods.

An Iterator for FixedSizeList

Now that you've had some practice using an iterator, let's see how we can write our own. Here's an example of inserting the iterator methods into the FixedSizeList class from the previous lab:

public class FixedSizeList implements SimpleList {

    // instance variables
    int nextIndexToReturn;    // index of next item to return by iterator
    protected int[] values;   // list elements
    int count;                // number of array cells used by list

    public void initIterator() {
        nextIndexToReturn = 0;
    }

    public int next() {
        int valToReturn = values[nextIndexToReturn];
        nextIndexToReturn++;
        return valToReturn;
    }

    public boolean hasNext() {
        return nextIndexToReturn < count;
    }
        ...
}

We could build an FixedSizeList as follows.

public static void main(String [] args) {
    FixedSizeList list = new FixedSizeList(10);
    list.add(5);
    list.add(3);
    // list now contains the integers 5, 3;
    // thus values[0] is 5, values[1] is 3,
    // and count is 2.
        ...
}

Then we can use an iterator.

list.initIterator();
int m;
m = list.next();
// m now contains 5 and nextIndexToReturn contains 1
m = list.next();
// m now contains 3 and nextIndexToReturn contains 2

At this point, another call to next would be invalid since nextIndexToReturn is past the end of the list values.

The code maintains an important invariant: prior to any call to next, nextIndexToReturn contains the index of the next value in the list to return.

The hasNext Method

The hasNext method, as noted earlier, warns if the iteration runs out of elements. Most of the time we expect people to call hasNext() between calls to next(), but we can't count on it. The code on the previous page, which did not include a call to hasNext, is perfectly legal and works fine provided the list contains at least two elements.

An important feature of the code is that hasNext doesn't change any state. It only examines existing state (by comparing the progress of the iteration to the number of list elements). hasNext can then be called twice in a row and nothing should change, or it could be called not at all and the iteration should still work.

For example, if hasNext() returns true, we should be able to do this:

list.initIterator();
list.hasNext();
list.hasNext();
list.hasNext();
int m = list.next();

The fact that we called hasNext() multiple times shouldn't change anything.

Self-test: Iterators

Consider the following code, intended to be included in the FixedSizeList class. Note: this code is intentionally gross.

private int iteratorIndex = 0;
private boolean done = false;

public void initIterator() {
}

public boolean hasNext() {
    if (done) {
        return false;
    }
    if (iteratorIndex == count-1) {
        done = true;
    }
    return true;
}

public int next() {
    int rtn = values[iteratorIndex];
    iteratorIndex++;
    return rtn;
}

Suppose also that the FixedSizeList main method begins as follows.

public static void main (String [ ] args) {
    FixedSizeList list = new FixedSizeList(1);
    list.add(5);
    _______ ;

This self-test involves identifying the program segments that, when substituted for the blank line in the main method, perform correctly. (Note that "correct" behavior may involve a crash.)

list.initIterator();
boolean b;
int n;
b = list.hasNext();
b = list.hasNext();

Performs correctly?

Yes
Incorrect. Examine hasNext more carefully
No
Correct! hasNext changes the state. It will do something different the second time, which shouldn't happen
Check Solution
list.initIterator();
if (list.hasNext()) {
    System.out.println(list.next());
}
if (list.hasNext()) {
    System.out.println(list.next());
}

Works correctly?

Yes
Correct! Only the first if statement evaluates to true, as it should
No.
Incorrect.
Check Solution
list.initIterator();
System.out.println(list.next());
System.out.println(list.next());

Works correctly?

Yes
Correct! It crashes, as it should.
No
Incorrect.
Check Solution
list.initIterator();
if (list.hasNext()) {
    System.out.println(list.next());
}
list.initIterator();
if (list.hasNext()) {
    System.out.println(list.next());
}
Yes
Incorrect. Does initIterator really reset the iteration, as it should?
No
Correct! Notice initIterator does not reset the value of done
Check Solution
list.initIterator();
boolean b;
int n;
n = list.next();
b = list.hasNext();

Works correctly?

Yes
Incorrect. Trace through the calls to next and hasNext more carefully
No
Correct! hasNext will return true when it shouldn't
Check Solution

Discussion: Iterator Invariants

Link to the discussion

Consider the following iterator methods, slightly different from those we just encountered.

public void initIterator() {
    index = -1;
}

public int next() {
    index++;
    return values[index];
}

public boolean hasNext() {
    return (index + 1) < count;
}
  1. What's the invariant relation that's true between calls to next?
  2. In general, most experienced programmers prefer the organization introduced first over this organization. What might explain this preference?

D. Using Java's Iterator and Iterable

So far in this lab, we've been adding iterator methods directly to the classes you were iterating over. The more common approach is to use a separate iterator object that has those methods. But first, some motivation:

A More Abstract for Loop

You may have felt that writing while loops that repeatedly call next and hasNext is a bit of a hassle and seems kind of useless. After all, iterating over all items in a list in Python is comparatively much easier. Here's an example of printing every item in a list in Python:

l = ["a", "b", "c", "d"]
for item in l:
    print(item)

It turns out there is similar syntax in Java! Here's an example of it in use:

String[] l = {"a", "b", "c", "d"};
for (String item : l) {
    System.out.println(item);
}

This is a different kind of for loop than you have seen before. The for loop here is doing the exact same thing as in Python. It iterates through every item in l, temporarily storing each one in a variable called item (with static type String) on each iteration.

You may remember that you can use Python's for loop to iterate over many different kinds of objects. For example, this code does the same as above,

for letter in "abcd"
    print(letter)

Somehow, Python knows what the right thing to do is whether it is iterating over a list or a string. It can iterate over many other things as well.

Java's for loop can also be used on data structures other than arrays. Here's an example on an ArrayList.

ArrayList<String> l = new ArrayList<String>();
l.add("a");
l.add("b");
l.add("c");
l.add("d");

for (String item: l) {
    System.out.println(item);
}

Notice how this for loop abstracts away the underlying data structure. It works the same with arrays and ArrayLists! With a little set up, it turns out we'll be able to use this for loop on lots of other data structures too, and this is the real reason why we care about iterators. The ability to iterate over arbitrary data structure - and any Collection - with the same, simplistic loop structure is incredibly valuable.

Iterable

We saw in the previous part that Java has a for loop that can be used to process items in different kinds of data structures. But how does Java know how to iterate over different data structures, including complicated ones? It turns out that, in order to use Java's for loop to iterate over an object of a certain class, that class must implement the interface Iterable.

To implement Iterable, a class must define the following method:

public Iterator iterator();

This is the same method you saw being used in the PointUtils exercise. The iterator() method intializes an iteration by returning an object of type Iterator, which can then be used to iterate. What is Iterator? It turns out this is another interface!

(Note: Iterable and Iterator are generic interfaces. In the above example, our iteration returned items of static type String; this means the ArrayList<String> actually implemented Iterable<String> and defined the method public Iterator<String> iterator();).

Here are the methods required for an object to implement the interface Iterator<E>. Do they look familiar? These are just the methods you used in the PointUtils exercise.

public boolean hasNext();
public E next();

When you use Java's special for loop, it makes use of these methods automatically behind the scenes, so you don't have to go through the trouble of calling them yourself.

In the above example,

for (String item: l) {
    System.out.println(item);
}

is semantically equivalent to:

Iterator it = l.iterator()

while (it.hasNext()) {

String item = it.next();

    System.out.println(item);
}

Note: The method remove() deletes from the collection the last element returned by the iterator. You'll notice using the clean for loop syntax won't ever use the method remove(). The method is just there for you to call manually if you need it. When writing your own code, you may feel like you don't need the remove method either. That's okay: many Java classes just throw an exception in case someone calls the method, claiming that the class doesn't actually implement the method. In fact, since it's default it's even okay to leave it out entirely and it will throw an exception when called without you even having to specify that behavior yourself.

Exercise: An Iterator for Iterable DLList

First, if you haven't yet switched which partner is coding this lab, please switch now.

As mentioned before, it is standard practice to use a separate iterator object (and therefore a separate, typically nested class) as the actual Iterator. This separates out the idea of an Iterator and the Iterable class.

In our case, we want DLList<T> to implement Iterable<T>, to state that our list of T's can be iterated over. Modify the DLList class accordingly, adding the required override methods:

public Iterator<T> iterator()

If you don't know what should go in here yet, read on ahead and do the following part first.

Now, note that DLList itself does not implement Iterator. This is why we need a separate, nested, private class to be the iterator. Typically, this class is nested inside the data structure class itself so that it can access the internals of the object that instantiated the instance of the nested class. Suppose you call your nested, private class DLListIterator that implements Iterator<T> (notice that we don't need a generic type on the nested class name. Since it's already nested, the outer generic type still applies to the inner class). Write a basic iterator for DLList by adding the following methods to the DLListIterator class:

public boolean hasNext()
public T next()

satisfying the iterator requirement, and whatever constructor, instance variables, and additional code you need.

FAQ:

Q: What if someone calls next() when hasNext() returns false?

A: This violates the contract of your code and you can crash or do whatever you want. We won't be testing this. However, a common convention is to throw a NoSuchElementException.

A: Yes.

Q: Will hasNext() always be called before next()?

A: No.

Checklist

  1. Does your DLList object know anything about its DLListIterator's state? Information about iteration (index of the next item to return) should be confined to Iterator alone.
  2. Are multiple Iterators for same DLList object independent of each other? There can be multiple Iterators for a single DLList object, and one iterator's operation should not affect the state of another.
  3. Is your hasNext() method altering the state of your Iterator? It should not.

After you have modified your DLList class, feel free to write some test code to see if Java's enhanced for loop works as expected on your DLList.

E. Discussion on Concurrent Modification

For our lab, we regarded our data structure to be "frozen," while the Iterator was at work. In other words, we assumed that while we were operating on the iterators, the data structure would remain as is. However, this is not generally true in real life scenarios. Consider the following:

Iterator<BankAccount> it = accounts.iterator();
while (it.hasNext()) {
    // HERE, NEXT ACCOUNT WAS REMOVED SOMEHOW!!!
    checkValidity(it.next()); // This code breaks since the account was removed.
}

If all clients of the data structure were to only read, there would be no problem. However, if any were to modify the data structure while others are reading, this could break the fundamental invariant that next() returns the next item if hasNext() returns true!

To handle such situations, many Java Iterators throw ConcurrentModificationException if they detect that the data structure has been externally modified during the iterator's lifetime. This is called a "fail-fast" behavior. You can see the behavior if you try doing something like this:

ArrayList<String> lst = new ArrayList<>();
lst.add("Nyan");
lst.add("Meow");
lst.add("Nyaahh");
for (String s : lst) {
    lst.remove(1);
    System.out.println("s = " + s);
}

Discuss with your partner about how to implement this "fail-fast" property for a data structure. How can we propagate the modification to all existing iterators? If you want to see the Java standard library solution, run the above code, see the error message:

Exception in thread "main" java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)

and click on the link in the stack trace ArrayList.java:901 to see where the fail-fast happened. You can even see how Java does their nested Iterator implementation!

F. Further Exercises (Ungraded)

I. Generics

  1. Generics are not confined to a single type parameter. Search the source code for java.util.Map interface to see how Java handles multiple generic type parameters in a single class. Then, write a class called SimplePair, which takes two type parameters, one for key and one for value. It should look like:

    public SimplePair(K key, V val) {
        ...
    }
    
    public K getKey() {
        ...
    }
    
    public V getValue() {
        ...
    }
  2. Search about Java generic wildcards and how they are used.
  3. Search about super and extends keywords used in generics.

II. Iterators

  1. Write EvenIterator for DLList that returns only even integers in DLList.
  2. Write ReverseIterator for DLList that iterates over DLList in reverse order.
  3. Write RandomIterator for DLList that randomly returns the items in DLList. Remember, each item should be returned only once.

G. Conclusion

Summary

In our experience, the biggest problem encountered by users of generics are syntax errors. We hope that lab activities alerted you to the problems.

Various aspects of iterators cause trouble, mainly with the maintenance of the internal state of the iteration. The constructor initializes the state, typically to the position of the first item to be enumerated. Subsequent processing maintains a data invariant that relates the position of the most recently returned element to the number of items in the collection being enumerated. And a detail: the hasNext method should never change the state of the iterator.

Submission

Submit the following files as lab10: