Lab 06: Linked List Enhancements

FAQ

The FAQ for Lab 6 is located here.

Before You Begin

Pull the skeleton code from GitHub and open it on Intellij like usual.

Linked Lists

In today’s lab, we’re going to continue our discussion of linked lists by introducing a new way of organizing programs through encapsulation. By hiding our linked list behind an abstraction barrier, it will be more user-friendly.

We will first define a few terms.

Module: A set of methods that work together as a whole to perform some task or set of related tasks.
Encapsulated: A module is said to be encapsulated if its implementation is completely hidden, and it can be accessed by the user only through a documented interface.

Enhancement 1: Encapsulation

The basic implementation of IntList that we learned fom Lab 5 is impractical to use at scale. In it, the idea of a node in the list and the list itself are not separate. This means that someone who wants to use the IntList class would need to spend a lot of time adding null checks and handling errors that might come as a result of accidentally or intentionally modifying the item or next of a list. By giving users direct control to the IntList data structure’s internals, in some ways, we’ve actually made it harder to reason about the correctness of a program.

We want to separate the idea of the list as an entity from an entry in the list, much like an abstraction barrier. We’ll see that this design philosophy of encapsulation will come up over and over again for each of the data structures we study in the coming weeks.

Let’s update our lab 5 IntList with encapsulation, which we will now call a SLList (Singly Linked List). Note that in order to emphasize the encapsulation in box-and-pointer diagrams, we will be breaking the normal box-and-pointer rule that we don’t put Objects inside other Objects.

Note that throughout the lab, some of the images do not depict the size variable for simplicity. This variable is an instance variable for all SLLists even if it is not shown in our illustrations.

All operations on the list are handled through an instance of an SLList object, which keeps a single instance variable head in order to track the first node in the linked list. The SLList object operates directly on the IntListNodes that make up the SLList. Note that IntListNode does not need any methods (other than some utility methods such as toString() and equals(), which you will use later). The main effect of using encapsulation, by using a reference to the head of a list, is hiding the internal representation and operations of the list from users. Instead, the users could simply operate on list as a whole and the SLList representation will take care of the operations, including any null checks, size checks, or further operations.

Enhancement 2: Sentinel

Although encapsulation shields users of our data structure from having to deal with cumbersome null checks, we as the implementors of our linked list class still have to handle them. However, we can eliminate null checks and simplify our code greatly with usage of a sentinel node instead of the head pointer. The sentinel node always exists (even when our list is empty), and does not represent an actual node in the list. Its next field represents the front of the list, unless the list is empty. The sentinel’s item is some meaningless value (in this example, we have chosen 42, but it doesn’t matter what its value is; a sentinel’s item should never be accessed). The last node in the list should have its next pointing back at the sentinel, and an empty list is represented by just a sentinel node whose next points back to itself. In a box-and-pointer diagram, a empty singly-linked list with a sentinel looks like this:

and a 2-item singly-linked list with a sentinel looks like this:

This has many benefits. When iterating through the list, we do not need to worry about reaching a null pointer at any point; when our moving pointer reaches the sentinel, it’s the end of the list. When inserting and removing nodes, we don’t need to manually handle being near the end or beginning or being the first or last element inserted or removed of the list; the code can be written with disregard for null-checking edge cases and it will still work properly because of the sentinel!

In code, an implementation might look like this:

public class SLList {

    /**
     * IntListNode is a nested class that represents a single node in the
     * SLList, storing an item and a reference to the next IntListNode.
     */
    private static class IntListNode {
        /**
         * The access modifiers inside a private nested class are irrelevant:
         * both the inner class and the outer class can access these instance
         * variables and methods. 
         * Though making the instance variables of the inner class 
         private is better practice. 
         */
        public int item;
        public IntListNode next;

        public IntListNode(int item, IntListNode next) {
            this.item = item;
            this.next = next;
        }
    }

    /**
    *  sentinel is a dummy node that points to the first real node. That is,
    *  sentinel.item is a dummy value, while sentinel.next.item is our true
    *  first item.
    */
    private IntListNode sentinel;
    private int size;

    /* ... more methods to operate on ... */
}

The inner class is declared private to enforce our abstraction barrier, but what does the static modifier do? The static modifier prevents the inner class from accessing instance variables or methods of the outer class. (Note that the inner static class can still access static (class) variables or methods of the outer class). This saves a bit of memory and helps prevent mistakes by making sure we don’t modify anything we shouldn’t.

`toString` and `equals`

Just like yesterday, we will want a toString method so that our linked lists will be displayed in a more readable format in the debugger or when printing to the console. We will also want an equals method that checks whether or not list A and list B contain the same content, rather than checking if A and B refer to the same object. Here is an illustration of this difference:

These implementations have been provided for you this time. Take a brief look at their implementations in your skeleton file SLList.java to understand what we’ve changed.

Exercise: `add`

As a bit of practice working with these improved lists, implement the add method in SLList.java, which adds the int x to the list at the specified int index. Be sure to look at and utilize the given completed methods. We suggest you read the entire class before starting coding this problem.

In the case that index is greater than or equal to the current size of the list, x should just be added to the end of the list. For example, inserting into index 10 of an empty list should result in a list of size 1 with the inserted item in position 0.

public void add(int index, int x) {
    // TODO: YOUR CODE HERE
}

Check your solution with the tests provided in SLListTest.java. As always, feel free to add more testing methods!

Implement the add method.

Exercise: `reverse`

Over the next few steps, we’re going to be completing the method reverse in SLList using our current SLList implementation.

/* Reverses a linked list destructively and assigns sentinel.next to the head of the reversed
linked list.*/
public void reverse() {
    // TODO: YOUR CODE HERE
}

We will assume the restriction that the reversal is to be done in place, destructively, without creating any new nodes.

However, don’t implement the method just yet!

Testing `reverse`

Make sure that you’re familiar with how to test with the Truth library from the section on Testing in the last few labs.

Once again, we’re going to write the test before we write reverse, as part of our ongoing crusade of test-driven development (TDD).

Especially for tricky problems like this, thinking about test cases before writing code can help us write better code more quickly. This foresight hopefully will lead to a better outcome when compared to rushing ahead and trying to solve the problem without first having considered all the different scenarios.

Add tests in SLListTest.java to cover at least the following three situations:

General case: That the function returns a reversed list (for lists of size \(\ge 2\)).
Base case: That the function “reverses” a list of size 1 by completing without erroring.
Base case: That the function “reverses” a list of size 0 by completing without erroring.

Implementing `reverse`

public void reverse() {
    // TODO: YOUR CODE HERE
}

Try running the test you just wrote; it should fail. This is a good sign because it means our tests are working as expected.

Implement reverse recursively, and rerun the tests until all of them pass. A helper method that accepts an IntListNode will be helpful.

Click to Expand for Hint / Visual Walkthrough

This is a challenging problem. The following diagrams show the progression of reversing a linked list recursively using a helper method called reverseHelper.

Initial call to helper function

Given a reference to the true first node in our linked list, we will take the recursive leap of faith on the rest of the linked list as shown below.

Hit a base case(the end of the list)

Trusting our recursive leap of faith, we assign a temp variable to the returned value of our recursive call, which should have reversed the current frame’s L tail.

Set the end of the returned list to the current frame’s L.

Since the current frame is the “end” of our the reversed linked list so far, we set the current frame’s L tail to be the sentinel.

Return the front of the reversed list.

Update sentinel to point to the front of the reversed linked list.

By now, hopefully you are feeling more comfortable writing your own tests using the Truth library! Writing tests will be one of the most important skills you learn here at Berkeley, and getting test-infected will save you and your future colleagues an enormous amount of time and misery.

If you find the concept of test-driven development interesting, or just want to learn more about testing philosophy and the conversation around TDD, here are a couple interesting leisure readings:

Checking In on the State of TDD

A Response to “Why Most Unit Testing is Waste”

Implement the reverse method.

Enhancement 3: Doubly-Linked

There are some major issues, both efficiency-wise and code-simplicity-wise with the linked list implementations we’ve been working with so far:

It’s easy to insert into the front of the list, but requires a lot more work to insert into the back of the list.
If we want to remove a node in our linked list, even if we have a reference to the node we want to remove, we have to traverse until we find the node before it in order to remove it.

The solution to these problems is to use a doubly-linked list. Each node keeps track of the node after and before itself. If we aren’t using a sentinel, then instead of just a head pointer, the list object maintains both a head and tail pointer, to the front and back of the list respectively. It looks like this:

encapsulation

This adds a bit of complexity to the operations, but allows for constant time insertion to the front and back, and allows the user to traverse the list either forwards or backwards. In other words, regardless of the length of the list, the runtime of an insertion at the front or back of the list would take the same amount of time. We will be covering constant time insertion in much more detail once we get to the asymptotics analysis lab!

Think about how a doubly-linked list will compare to that of a singly-linked list, and discuss with someone how the doubly-linked list improves these runtimes. Make sure you understand this concept before continuing.

If we are using a sentinel, then it will look like this:

sentinel

Note that the sentinel’s item is marked with an “X”, again because that value is irrelevant. If our list is of non-primitive items, then we can make it null. If it is of primitive items (like the ints we’ve been using, then it can just be any arbitrary value (since primitives cannot be null)).

Here’s a more simplified visualization of the circular structure with DLList:

sentine2l

With all of these enhancements, the encapsulated doubly-linked list with a sentinel node is typically used in practice as the linked list representation, like in Java’s own standard library class, java.util.LinkedList.

Invariants

An invariant is a fact about a data structure that is guaranteed to be true (assuming there are no bugs in your code). This gives us a convenient checklist every time we add a feature to our data structure. Users are also guaranteed certain properties that they trust will be maintained. For example, an SLList with a sentinel node has at least the following invariants:

The sentinel reference always points to a sentinel node.
The front item (if it exists), is always at sentinel.next.item.
The size variable is always the total number of items in the list (doesn’t include the sentinel).
The last node’s next always points back at the sentinel node.

Be sure that any invariant broken while performing an operation on a data structure is restored by the end of the function, so that subsequent function calls can rest assured that they are acting on a valid data structure and that all of the instance variables hold the proper values.

Enhancement 4: Generic Lists

Our final alteration will be somewhat more minor than the other structural changes, but this will have a powerful impact on how generalizable our code is. So far we have considered our lists to be of some fixed type (SLLists contained ints). What if we want to use a list to store doubles instead? What if we then want to extend it further to be a list of String objects? Up until now we would have had to create a new copy of the class for each of these data types.

The solution to this is to use something called generics. The basic principle is that instead of writing many classes which are each specific to one type of data (e.g. ints, doubles, Strings, etc.), we will write one class which has a generic type. This generic type can be thought of as a placeholder type that will be filled in by a real type when we go to use the class. This is illustrated further with the example of how to generalize a DLList (doubly-linked list) by rewriting it with generics.

Generic DLList

How can we modify our DLList so that it can be a list of whatever objects we choose? The class definition for DLList with integers looks like this:

  public class DLList {
      public void addLast(int x) { ... }
      // more methods here
  }

We will change this to:

  public class DLList<T> {
      public void addLast(T x) { ... }
      // more methods here
  }

where T is a placeholder object type. Notice the angle bracket syntax. Also note that we don’t have to use T; any variable name is fine. In our DLList, our item is now of type T, and our methods now take T instances as parameters. We can also rename our IntNode class to TNode for accuracy.

Using Generic DLList

Recall that to create a DLList, we originally typed:

  DLList list = new DLList(10);

Before we did not need to specify the type for the DLList since it was always int. If we now want to create a DLList holding String objects (or any other type of objects), then we must specify what the placeholder T should be filled in with. An example of what that looks like :

  DLList<String> list = new DLList<>("cs61bl");
  list.addLast("rocks!")

On list creation, the compiler replaces all instances of T with String! Through the rest of the labs in the course we will continue to work with generics, as for most data structures, they will be more useful if we can use them with many types of data!

Recap

Yesterday, we worked with a basic IntList, which was difficult to use because knowledge of implementation details were necessary in order for its use.

Today, we solved this problem through encapsulation. By nesting our node class inside another class, we can setup an abstraction barrier to shield our users from the details, and we can provide utility functions to allow users to operate naively upon our data structure.

To eliminate the need to have if statements that protect us from NullPointerExceptions, we can use a sentinel.

For efficiency purposes, we can introduce backwards prev links as well to make it doubly-linked.

Finally, we generalized the code to work with any type of data through the use of generics.

Deliverables

As always, ensure you have git add, git commit, and git push origin main all of your required files before submission.

SLList.java
SLListTest.java

Though there are no associated deliverables, we strongly recommend reading the Doubly-Linked section twice as it is useful (hint: vital) for the project!!

If you haven’t already, work on implementing your LinkedListDeque for Project 1A in lab! You should now have all of the necessary skills to tackle the project.

FAQ