The worksheet today will be worth 1 point and the code submission will be worth 2 points.
Introduction
As usual, pull the files from the skeleton and make a new IntelliJ project.
Over the past few labs, we have analyzed the performance of algorithms for access and insertion into binary search trees under the assumption that the trees were balanced. Informally, that means that the paths from root to leaves are all roughly the same length, resulting in a runtime logarithmic to the number of items. Otherwise, we would have to worry about lopsided, unbalanced trees in which search is linear rather than logarithmic.
This balancing doesn’t happen automatically, and we have seen how to insert items into a binary search tree to produce worstcase search behavior. There are two approaches we can take to make tree balancing happen:
 Incremental balancing
 At each insertion or deletion we do a bit of work to keep the tree balance
 Allatonce balancing
 We don’t do anything to keep the tree balanced until it gets too lopsided, then we completely rebalance the tree.
In the activities of this segment, we will start by analyzing some tree balancing code. Then, we will explore how much work is involved in maintaining complete balance. Finally, we’ll move on to explore three kinds of balanced search trees: Btrees, redblack trees, and leftleaning redblack trees.
Exercise: sortedQueueToTree
Implementation
In BST.java
, complete the sortedQueueToTree
method which should build a balanced binary search tree out of an Queue
that returns the items in sorted order.
Warning: This problem is challenging. Don’t spend too much time on it and come back to it at the end of the lab. Talk to your lab partner and let the ideas mingle in the back of your mind as you work through the rest of lab.
Hint: Recursion! Think about the parameters and the order of recursive calls. The very first item in the Queue is also the first item in the inorder traversal of the tree and, therefore, the leftmost child of the balanced binary search tree.
Draw out a couple of cases by hand and break down what needs to be done. There is only one
Queue
object (it will be passed through recursive calls), so make sure you think about the order things are being returned from theQueue
and how to construct a tree from it!
You may notice that in our BST, the items are of type Object
. However, in previous labs we have seen that the items must be Comparable
. Why is this the case? It’s because you shouldn’t need to ever do any comparisons yourself. Just trust that the order of the items returned from the Queue
is correct.
234 Trees
The first balanced search tree we’ll examine is the 234 tree. 234 trees guarantee height in any case. That is, the tree is always balanced by height.
234 trees are named that way because each nonleaf node can have either 2, 3, or 4 children. Additionally, any nonleaf node must have one more child than it does keys. That means that a node with 1 key must have 2 children, 2 keys with 3 children, and 3 keys with 4 children.
Here’s an example of a 234 tree:
Notice that it looks like a binary search tree, except each node can contain 1, 2, or 3 items rather than just 1. In addition, the tree also follows the ordering invariants of the binary search tree. For example, take the [2, 4] node. Each key in the subtree rooted at the first child has items less than 2. Each key in the second subtree has items between 2 and 4. Each key in the third subtree has items greater than 4. This extends to nodes with more or less keys as well. We can take advantage of this to construct a search algorithm similar to the search algorithm for binary search trees.
Discussion: 234 Tree Height
What does the invariant that a 234 tree has one more child than keys tell you about the length of the paths from the root to leaf nodes? How does that keep the 234 tree balanced? Discuss this with your partner.
Insertion into a 234 Tree
Although searching in a 234 tree is like searching in a BST, inserting a new item is a little different.
Like a BST, we always insert the new key at a leaf node. We must find the correct place for the key that we insert to go by traversing down the tree, and then insert the new key into the appropriate place in the existing leaf. However, unlike in a BST, nodes themselves in a 234 tree can be modified as a result of insertion if a leaf node is able to accept more keys.
Suppose we have the following 234 tree:
If we were to insert 10
into the tree, we first traverse down the tree until we find the proper leaf node to insert it into. We subsequently find the leaf node that contains 8
and 11
. That node still has room (because a node can contain up to 3 keys), and we can insert 10
between 8
and 11
, leaving us with a three key node, as shown below.
However, what if the leaf node we choose to insert into already has 3 keys? Even though we’d like to put the new item there, it won’t fit because nodes can have no more than 3 keys. What should we do?
The solution is to push up the middle key in the node to the parent, which will split up the node into two single nodes. Then, we will add the key that we are trying to insert into the correct place.
It is possible that this will cause the parent or the parent’s parent to have too many keys. In cases like this, we will repeat this process until our tree has no more nodes with 4 keys in it.
Let’s try to insert 9 into the 234 tree above.
We must first find which leaf to insert into. 9 is smaller than 20, so we go down to the left. Then we find the leaf that we would like to insert into. Because this node has 3 keys, [8, 10, 11], we push up the middle key (10), then split the 8 and the 11 into separate nodes. We insert 9 into the node with 8 in it, as seen below.
There is one little special case we have to worry about when it comes to inserting, and that’s if the root is a 3key node. Roots don’t have a parent to push the middle key to. In that case, we still push up the middle key and instead of giving it to the parent, we make it the new root. Only when this happens does the height of the 234 tree increase.
Worksheet: 1. BTrees
Complete this exercise on your worksheet.
Removal from a 234 Tree
We will not be covering how to remove from a 234 tree in this course, but notes on how to do so are in Shewchuk’s notes. The idea behind removal is similar to removal from a BST. Like in insertion, we modify nodes as we find what to delete. Instead of getting rid of 3key nodes, remove will create 3key nodes. Again, you will not be responsible on understanding the exact mechanics of the removal algorithm.
Btrees
A 234 tree is a special case of a structure called a Btree. Other types of Btrees include 23 trees, 2345 trees (often called 25 trees), etc. As you can see, what varies among Btrees is the number of keys/subtrees per node.
Btrees with lots of keys per node (on the order of thousands) are especially useful for organizing storage of disk files, for example, in an operating system or database system. Since retrieval from a hard disk is quite slow compared to computation operations involving main memory, we generally try to process information in large chunks in such a situation, in order to minimize the number of disk accesses. The directory structure of a file system could be organized as a Btree so that a single disk access would read an entire node of the tree. This would consist of, say, the text of a file or the contents of a directory of files.
RedBlack Trees
We saw that 234 trees are balanced, guaranteeing that a path from the root to any leaf is . However, 234 trees are notoriously difficult and cumbersome to code, with numerous corner cases for common operations.
What’s used in practice is a redblack tree (in fact, it is the tree behind Java’s TreeSet
and TreeMap
). A redblack tree is a usual binary search tree, but has additional invariants related to “coloring” each node red or black. This in fact creates a mapping between 234 trees and redblack trees!
The consequence is quite astounding: redblack trees maintain the balance of Btrees while inheriting all normal binary search tree operations (a redblack tree is a binary search tree after all) with additional housekeeping. This selfbalancing quality is why Java’s TreeMap
and TreeSet
are implemented as redblack trees!
234 Tree ↔ RedBlack Tree
Notice that a 234 tree can have 1, 2, or 3 elements per node, and 2, 3, or 4 pointers to its children. Consider the following Btree node:
parent

4 7 10
   
a b c d
a
, b
, c
, and d
are pointers to its children nodes/subtrees. a
will have elements less than 4, b
will have elements between 4 and 7, c
will have elements between 7 and 10, and d
will have elements greater than 10.
Let us transform this into a “section” in redblack tree that represents above node:
parent (red or black)

7 (black)
/ \
/ \
4 (red) 10 (red)
/ \ / \
a b c d
For each node in the 234 tree, we have a group of redblack tree nodes (or colored binary search tree nodes) that correspond to it. Each element in a node of a 234 tree will correspond to the following in a redblack tree:
 Middle element ↔ a black node that is reachable from parent.
 Left element ↔ a red node on the left of the middle element’s black node.
 Right element ↔ a red node on the right of the middle element’s black node.
As a general rule, any red node in a redblack tree will be in the same node as its parent in a Btree.
Notice that a
, b
, c
, and d
are also “sections” that corresponds to a 234 tree node. Thus, the subtrees will have black root nodes and other colored nodes that follow the above correspondence.
How about a 234 node with two elements?
parent

5 9
  
a b c
This node can be transformed as:
parent

5 (black)
/ \
a 9 (red)
/ \
b c
or
parent

9 (black)
/ \
5 (red) c
/ \
a b
Both are valid.
Trivially, for 234 node with only one element, it will translate into a single black node in redblack tree.
Discussion: Correspondence
It may be hard to see at first that the balancedness of 234 tree still holds on redblack tree, and that a redblack tree is still a binary search tree.
Given a 234 tree, how can we apply the correspondence rule above to determine the maximum height of the corresponding redblack tree? If each 4child node in the 234 tree is equivalent to two red nodes and one black node in a redblack tree, what can we say about the height of the redblack tree?
Then, discuss with your partner about which of the following binary search tree operations we can use on redblack trees without any modification.
 Insertion
 Deletion
 Search (is
k
in the tree?)  Range Queries (return all items between
a
andb
)
Exercise: Constructor
Read the code in RedBlackTree.java
and BTree.java
.
Then, in RedBlackTree.java
, implement buildRedBlackTree
which returns the root node of the redblack tree which has a onetoone mapping to the given 234 tree. For a 234 tree node with 2 elements in a node, you must use the left element as the black node to pass the autograder tests. For the example above, you must use the first representation of redblack tree where 5 is the black node.
If you’re stuck, refer to the example conversions shown above to help you write this method!
LeftLeaning RedBlack Trees
In the last section, we examined 234 trees their relationship to redblack trees. Here, we want to extend our discussion a little bit more and examine 23 trees and leftleaning redblack trees.
23 trees are Btrees, just like 234 trees. However, they can have up to 2 elements and 3 children, whereas 234 trees could have one more of each. Naturally, we can come up with a redblack tree counterpart. However, since we can only have nodes with 1 or 2 elements, we can either have a single black node (for a oneelement 23 node) or a “section” with a black node and a red node (for a twoelement 23 node). As seen in the previous section, we can either let the red node be a left child or a right child. For these trees, we will choose to always let the red node be the left child of the black node. This leads to the name, “leftleaning” redblack tree. An example of a node containing 5 and 9 is shown below:
parent

9 (black)
/
5 (red)
/
a
The advantages of LLRB tree over the usual redblack tree is the ease of implementation. Since there are less special cases for each “section” that represents a 23 node, the implementation is much simpler.
Normal binary search tree insertions and deletions can break the redblack tree invariants, so we need additional operations that can “restore” the redblack tree properties. In LLRB trees, there are two key operations that we use to restore the properties: rotations and color flips.
Rotations
Consider the following tree:
parent

7
/ \
1 c
/ \
a b
a
is a subtree with all elements less than 1, b
is a subtree with elements between 1 and 7, and c
is a subtree with all elements greater than 7. Now, let’s take a look at another tree:
parent

1
/ \
a 7
/ \
b c
There are few key things to notice:
 The root of the tree has changed from 7 to 1.
a
,b
, andc
are still correctly placed. That is, their items do not violate the binary search tree invariants. The height of the tree can change by 1.
Here, we call this transition from the first tree to the second a “right rotation on 7”.
Now, convince yourself that a “left rotation on 1” on the second tree will give us the first tree. The two operations are symmetric, and both maintain the binary search tree property!
Worksheet: 2. LLRBs
Complete this exercise on your worksheet.
Exercise: Rotations
Now we have seen that we can rotate the tree to balance it without violating the binary search tree invariants. Now, we will implement it ourselves!
In RedBlackTree.java
, implement rotateRight
and rotateLeft
. For your implementation, make the new root have the color of the old root, and color the old root red.
Hint: The two operations are symmetric. Should the code significantly differ? If you find yourself stuck, take a look at the examples that are shown above!
Color Flip
Now we consider the color flip operation that is essential to LLRB tree implementation. Given a node, this operation simply flips the color of itself, and the left and right children. However simple it may look now, we will examine its consequences later on.
For now, take a look at the implementation provided in RedBlackTree.java
.
Insertion
Finally, we are ready to put the puzzles together and see how insertion works on LLRB trees!
Say we are inserting x
.
 If the tree is empty, let
x
be the root with black color.  Otherwise, do the normal binary search tree insertion, and color
x
red. Then, restore LLRB tree properties.
But, how do we “restore LLRB tree properties”? Let’s list out all the cases that can occur, and how we will deal with each of them.
Case 1
Let’s assume that our new node x
is the only child of a black node. That is:
parent (black)
/
x (red) or
parent (black)
\
x (red)
Since we decided to make our tree left leaning, we know that the first tree is the valid form. If we end up with the second tree (x
> parent
) we can simply apply rotateLeft
on parent
to get the first tree.
Case 2
Now, let’s consider the case when our new node x
becomes a child of a black node which already has a left child, or a child of a red node. This is like trying to insert x
into a 23 tree node that already has 2 items and 3 children!
Here, we have to deal with 3 different cases, and we will label them case A, B, C.
 Case 2.A
x
ends up as the right child of the black node. 5 (black) / \ 1 (red) x (red)
Notice that for case 2.A, the resulting section is the same as a 23 tree node with one extra element:
 1 5 x
To fix it, we “split” the 23 node into two halves, “pushing” up the middle element to its parent:
 5 (sibling)    1 x (nephews)
Analogously, for our LLRB section, we can apply
flipColor
on5
. This results in: 5 (red) / \ 1 (black) x (black)
Notice that this exactly models the 23 node we desired. 5 is now a red node, which means that is is now part of the “parent 23 node section.” Now, if 5 as a new red node becomes a problem, we can recursively deal with it as we are dealing with
x
now. Also, note that the root of the whole tree should always be black, and it is perfectly fine for the root to have two black children. It is simply a root 23 node with single element and two children, each with single element. Case 2.B
x
ends up as the left child of the red node. 5 (black) / 1 (red) / x (red)
In this case, we can apply
rotateRight
on5
, which will result in: 1 (black) / \ x (red) 5 (red)
This should look familiar, since it is exactly case 2.A that we just examined before! After a rotation, our problem reduces to solving case 2.A. Convince yourself that rotation performed here correctly handles the color changes and maintains the binary search tree properties.
 Case 2.C
x
ends up as the right child of the red node. 5 (black) / 1 (red) \ x (red)
In this case, we can apply
rotateLeft
on1
, which will result in: 5 (black) / x (red) / 1 (red)
This also should look familiar, since it is exactly case 2.B that we just examined. We just need one more rotation and color flip to restore LLRB properties.
Deletion
Deletion deals with many more corner cases and is generally more difficult to implement. For time’s sake, deletion is left out for this lab.
Exercise: insert
Now, we will implement insert
in RedBlackTree.java
. We have provided you with most of the logic structure, so all you need to do is deal with normal binary search tree insertion and handle case A, B, and C. Make sure you follow the steps from all the cases very carefully! The root of the RedBlackTree
should always be black.
Use the helper methods that have already been provided for you in the skeleton files (flipColors
and isRed
) and your rotateRight
and rotateLeft
methods to simplify the code writing!
If you’re stuck, write a recursive helper method similar to how we’ve seen insert
implemented in previous labs!
 What should the return value be? (Hint: It’s not
void
.)  What should the arguments be?
In addition, think about the similarities between cases 2.A, 2.B, and 2.C, and think about how you can integrate those similarities to simplify your code. Feel free to discuss all these points with your lab partner and the lab staff.
Discussion: insert
Runtime
We have seen that even though LLRB trees guarantee that the tree will be almost balanced, the LLRB tree insert
operation requires many rotations and color flips. Examine the procedure for insert
and convince yourself and your partner that insert
still takes as in balanced binary search trees.
Hint: How long is the path from root to the new leaf? For each node along the path, are additional operations limited to some constant number? What does that mean?
Other Balanced Trees
AVL Trees
AVL trees (named after their Russian inventors, Adel’sonVel’skii and Landis) are heightbalanced binary search trees, in which information about tree height is stored in each node along with the item. Restructuring of an AVL tree after insertion is done via a familiar process of rotation, but without color changes.
Splay Trees
Another type of selfbalancing BST is called the splay tree. Like other selfbalancing trees (AVL, redblack), a splay tree uses rotations to keep itself balanced. However, for a splay tree, the notion of what it means to be balanced is different. A splay tree doesn’t care about differing heights of subtrees, so its shape is less constrained. All a splay tree cares about is that recently accessed nodes are near the top. Upon insertion or access of an item, the tree is adjusted so that item is at the top. Upon deletion, the item is first brought to the top and then deleted.
Deliverables
 Complete the
sortedQueueToTree
method fromBST.java
.  Complete the following methods in
RedBlackTree.java
:buildRedBlackTree
rotateRight
androtateLeft
insert
 Submit the worksheet to your lab TA by the end of your section.