Why we care about balanced (binary) trees
Beyond representing information with binary relationships, an important reason we use binary trees is that, if we’ve organized the information in just the right way, searching for it is fast. For example, suppose we have the following set of numbers: { 0, 1, 2, 3, 4, 5, 6, 7 }.
We could build the following tree:
0
\
1
\
2
\
3
\
4
\
5
\
6
\
7
While this is a binary tree, it’s not a very good one. Computer scientists often like to ask: how much work do we need to do? Suppose we want to find the node storing the number 7. To find it, we’d have to do 7 traversal operations, examining all 8 nodes along the way. Is 7 == 0? No. Is 7 < 0? No. Go right. Is 7 == 1? No. Is 7 < 1? No. Go right. And so on… until we get to the last node, which is equal to 7.
In fact, you might be thinking “this tree looks awfully like a list”, which you already know has undesirable worst-case properties for searching: O(n), where n is the size of the list. Indeed, when a tree “degenerates” this way, that’s also the worst case for a tree. The time it takes to find an arbitrary element in the worst case is O(n), because in that case, a tree is essentially a list.
Contrast the worst case with the following tree:
4
/ \
2 6
/ \ / \
1 3 5 7
/
0
For this tree, we have to do at most 3 traversal operations to find the deepest node, 0. Clearly, if we have to store the numbers 0 through 7 in a tree, this is a better tree, because for any given node, we’ll spend less time looking than in the worst-case tree above. In fact, while you may reorganize the nodes in this better tree (e.g., suppose 2 were at the root), there is no binary tree more “efficient” than this one. How do I know? Because it’s balanced.
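To make the comparison concrete, here is a minimal sketch of binary search tree lookup that counts how many nodes get examined. It uses a hypothetical shape-only `Node` class invented for illustration, not the BinaryTree.java implementation mentioned at the end of this article:

```java
// A sketch of BST search that counts visited nodes, assuming a
// hypothetical Node class (not the BinaryTree.java referenced below).
class SearchDemo {
    static class Node {
        int value;
        Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value; this.left = left; this.right = right;
        }
    }

    // Returns how many nodes we examine while searching for target,
    // or -1 if target is absent from the tree.
    static int nodesVisited(Node root, int target) {
        int visited = 0;
        Node current = root;
        while (current != null) {
            visited++;
            if (target == current.value) return visited;
            current = (target < current.value) ? current.left : current.right;
        }
        return -1;
    }

    // The degenerate tree from the text: 0 -> 1 -> ... -> 7, all right children.
    static Node degenerate() {
        Node node = null;
        for (int i = 7; i >= 0; i--) node = new Node(i, null, node);
        return node;
    }

    // The balanced tree from the text: root 4, with 0 as the deepest node.
    static Node balanced() {
        return new Node(4,
            new Node(2, new Node(1, new Node(0, null, null), null),
                        new Node(3, null, null)),
            new Node(6, new Node(5, null, null), new Node(7, null, null)));
    }

    public static void main(String[] args) {
        System.out.println(nodesVisited(degenerate(), 7)); // examines all 8 nodes
        System.out.println(nodesVisited(balanced(), 7));   // examines only 3 nodes
    }
}
```

Searching the degenerate tree for 7 touches every node; searching the balanced tree touches only the nodes on one short root-to-leaf path.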
Tree “height”
We often refer to the length of the longest path from the root to any leaf in the tree as tree height. If this bothers you (because, let’s face it… the tree is upside-down), just flip the tree right-side up in your head. You’ll get used to it eventually.
I’ve referred to height in a hand-wavy way up to this point, but let’s be a little more formal. The height of a tree can be defined recursively:
edgeHeight(tree) =
    -1, if the tree is empty, or
    max(edgeHeight(tree.left), edgeHeight(tree.right)) + 1, otherwise
This definition is for a property we call “edge height”. For edge height, think of the base case: the tree is empty. It has a height of -1. That’s a little weird, right? Fortunately, it all makes sense once you look at a non-empty tree. Suppose you have the following tree:
2
In other words, suppose you have a tree with just a root–no children. What is the height of the left subtree? By the edgeHeight definition, it is -1. The same is true for the right subtree: -1. So what is the height of the node 2? max(-1, -1) + 1 = 0. This makes sense, because the tree has no edges.
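The recursive definition transcribes almost directly into code. Here is a sketch using the same kind of hypothetical shape-only `Node` class as before (not the BinaryTree.java implementation mentioned at the end):

```java
// A direct transcription of the recursive edge-height definition,
// assuming a hypothetical Node class invented for illustration.
class HeightDemo {
    static class Node {
        Node left, right;
        Node(Node left, Node right) { this.left = left; this.right = right; }
    }

    // edgeHeight(empty) = -1; otherwise 1 + the taller subtree's height.
    static int edgeHeight(Node tree) {
        if (tree == null) return -1; // base case: the empty tree
        return Math.max(edgeHeight(tree.left), edgeHeight(tree.right)) + 1;
    }

    public static void main(String[] args) {
        Node lone = new Node(null, null);             // just a root, like the node 2
        System.out.println(edgeHeight(null));         // -1
        System.out.println(edgeHeight(lone));         // 0
        System.out.println(edgeHeight(new Node(lone, null))); // 1
    }
}
```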
A cute theorem
Given the above definition, you might be wondering about the converse question. Before, we asked “How can we measure the height of a tree?” Now we want to know: “How tall must a tree be in order to store \(n\) items?” It turns out that we can find out:

To store \(n\) nodes, \(h \geq \lfloor \log_2 n \rfloor\)

where \(h\) is the edge height of the tree. Note that the question above is really asking about a lower bound on \(h\), which is why the theorem above is an inequality. What kind of tree shape ends up being the shortest? It turns out that the most compact, i.e., the most balanced, tree ends up being the shortest.
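One way to sanity-check the bound numerically: a perfectly packed binary tree of edge height \(h\) holds \(2^{h+1} - 1\) nodes, so the shortest tree that can hold \(n\) nodes has edge height exactly \(\lfloor \log_2 n \rfloor\). Here is a small sketch computing that minimum with integer arithmetic (the helper name is made up for illustration):

```java
// Numeric check of the bound h >= floor(log2 n): minEdgeHeight is a
// hypothetical helper that computes floor(log2 n), the edge height of
// the shortest binary tree holding n nodes.
class BoundDemo {
    static int minEdgeHeight(int n) {
        // floor(log2 n) via repeated doubling (assumes n >= 1).
        int h = 0;
        while ((1 << (h + 1)) <= n) h++;
        return h;
    }

    public static void main(String[] args) {
        System.out.println(minEdgeHeight(1)); // 0: a lone root
        System.out.println(minEdgeHeight(7)); // 2: a perfectly full 7-node tree
        System.out.println(minEdgeHeight(8)); // 3: the 8-node tree above
    }
}
```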
If you don’t see why right away, read on for a long-winded analogy.
Tree “balance” and your little brother’s annoying hobby
Suppose your little brother is really quite the avid tree enthusiast, and he cares deeply about your trees. He’s worried about insects eating the trees. He wants you to help him hang little insect traps on every tree branch. Sadly, you wore the wrong pants today, and your pocket can only hold one trap. So that means that you need to climb up the tree from the bottom for every branch on the tree in order to hang a trap. Grr. Little brothers.
There are two trees. They both have the same number of branches (edges). But one of the trees is taller. If you want to minimize your work, which tree do you climb? Let’s use the trees we mentioned before.
For the first tree, which grows really straight and tall, you need to climb, from the bottom, 8 times. But the amount of work is (climb 0 branches to get to 0) + (climb 1 branch to get to 1, climb down) + (climb 2 branches to get to 2, climb down) + (climb 3 branches to get to 3, climb down) + (climb 4 branches to get to 4, climb down) + (climb 5 branches to get to 5, climb down) + (climb 6 branches to get to 6, climb down) + (climb 7 branches to get to 7, climb down) = 28 branches to climb. Phew!
0
\
1
\
2
\
3
\
4
\
5
\
6
\
7
For the second tree, you also need to climb from the bottom 8 times. But this tree is a little more well-rounded. You need to (climb 0 branches to get to 4) + (climb 1 branch to get to 2) + (climb 1 branch to get to 6) + (climb 2 branches to get to 1) + (climb 2 branches to get to 3) + (climb 2 branches to get to 5) + (climb 2 branches to get to 7) + (climb 3 branches to get to 0) = 13 branches to climb. As a budding computer scientist you definitely know how to count, and consequently you know that 13 < 28. You pick the second tree.
4
/ \
2 6
/ \ / \
1 3 5 7
/
0
But since you’re a budding computer scientist it also makes you wonder… is there a way to hang 8 insect traps and be able to climb fewer than 13 branches? In this insane, make-believe world, each branch splits into only two other branches of course (i.e., the tree is binary). So the answer is “no”, because the tree is “balanced.” It’s already as short as it can be. You aren’t just a lazy older sibling, you’re optimally lazy.
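The climbing arithmetic above is just the sum of every node’s depth (its distance from the root). Here is a sketch that checks both totals, again using a hypothetical shape-only `Node` class rather than the BinaryTree.java implementation below:

```java
// Verifies the 28 vs. 13 climbing totals: the total number of branches
// climbed equals the sum of each node's depth, using a hypothetical
// shape-only Node class (values omitted; only the shape matters).
class ClimbDemo {
    static class Node {
        Node left, right;
        Node(Node left, Node right) { this.left = left; this.right = right; }
    }

    // Sums each node's depth over the whole tree.
    static int totalClimb(Node tree, int depth) {
        if (tree == null) return 0;
        return depth + totalClimb(tree.left, depth + 1)
                     + totalClimb(tree.right, depth + 1);
    }

    // The degenerate chain of 8 nodes (depths 0 through 7).
    static Node chain() {
        Node node = null;
        for (int i = 0; i < 8; i++) node = new Node(null, node);
        return node;
    }

    // The balanced 8-node tree from the text.
    static Node balanced() {
        return new Node(
            new Node(new Node(new Node(null, null), null), new Node(null, null)),
            new Node(new Node(null, null), new Node(null, null)));
    }

    public static void main(String[] args) {
        System.out.println(totalClimb(chain(), 0));    // 0+1+...+7 = 28
        System.out.println(totalClimb(balanced(), 0)); // 0+1+1+2+2+2+2+3 = 13
    }
}
```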
So how do we know this? We know because the second tree is balanced. And again, since we’ve talked up until this point about “balance” in a hand-wavy kind of way, let’s get a little more formal. You will not be surprised that there’s an elegant recursive definition for balance:
isBalanced(tree) = true
    if and only if
    - case 1: the tree is empty, or
    - case 2: isBalanced(tree.left) is true,
      and isBalanced(tree.right) is true,
      and | edgeHeight(tree.left) - edgeHeight(tree.right) | <= 1
Sure enough, we can walk through computing this for the second tree and see that it is balanced. Give it a try!
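Like edgeHeight, the definition transcribes directly into code. Here is a sketch, again over a hypothetical shape-only `Node` class rather than the BinaryTree.java implementation below (recomputing the height at every node is wasteful but mirrors the definition exactly):

```java
// A direct transcription of the recursive isBalanced definition,
// assuming a hypothetical shape-only Node class.
class BalanceDemo {
    static class Node {
        Node left, right;
        Node(Node left, Node right) { this.left = left; this.right = right; }
    }

    static int edgeHeight(Node tree) {
        if (tree == null) return -1;
        return Math.max(edgeHeight(tree.left), edgeHeight(tree.right)) + 1;
    }

    // Case 1: empty trees are balanced. Case 2: both subtrees are
    // balanced and their edge heights differ by at most 1.
    static boolean isBalanced(Node tree) {
        if (tree == null) return true;                          // case 1
        return isBalanced(tree.left) && isBalanced(tree.right)  // case 2
            && Math.abs(edgeHeight(tree.left) - edgeHeight(tree.right)) <= 1;
    }

    public static void main(String[] args) {
        // The balanced 8-node tree from the text (shape only).
        Node balanced = new Node(
            new Node(new Node(new Node(null, null), null), new Node(null, null)),
            new Node(new Node(null, null), new Node(null, null)));
        System.out.println(isBalanced(balanced)); // true

        // A degenerate right-leaning chain of 3 nodes.
        Node chain = new Node(null, new Node(null, new Node(null, null)));
        System.out.println(isBalanced(chain)); // false
    }
}
```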
It turns out that the amount of time it takes to climb to a branch in the worst case is always \(O(\log_2n)\) if a tree is balanced. For big trees, \(O(\log_2n)\) is a lot smaller than \(O(n)\). The overview in the Wikipedia article on self-balancing binary search trees has a nice, clear derivation of this logarithmic time complexity.
OK, OK, enough of this stupid story. Where’s the code?
Here’s an example. It uses the following BinaryTree.java implementation.