Binary trees

The trees in computer science
have the curious tendency to grow downwards…

[Computer Science trees grow from the root down]

A binary tree is a data structure more general than a linked list.  This chapter introduces some basics operations on binary trees.  The next chapter, Binary search trees, will discuss a fundamental application.

Table of contents:

Nodes and their children

A binary tree is a set of records that satisfy certain conditions.  The conditions will not be given explicitly, but they will become clear in the context.  The records will be called nodes (but could also be called cells).  Each node has an address.  We shall assume, for now, that each node has only three fields:  an integer number and two pointers to nodes.  The nodes can, then, be defined as follows:

   typedef struct record {
      int   contents;
      node *lft;
      node *rght;
   } node;

[cell of binary tree]

The contents field is the payload of the node;  the other two fields define the structure of the tree.  The lft field of each node contains NULL or the address of another node.  An analogous assumption holds for the rght field.

If the lft field of a node P is the address of a node L, we shall say that L is the left child of P.  Similarly, if P.rght is equal to &R, we shall say that R is the right child of P.  If a node C is a child (left or right) of P, we shall say that P is the parent of C.  A leaf is a node that has no children.

When talking about binary trees, it is very convenient to blurr the distinction between a node and its address.  So, if x is a pointer to a node (that is, if x is of type *node), we say  consider the node x  instead of  consider the node whose address is x.

The figure below shows two examples of binary trees. On the left, a curious binary tree in nature. On the right, a schematic representation of a binary tree whose nodes contain the numbers 2, 7, 5, etc.

[complete-binary-tree-in-nature.png] Binary_tree.png

Exercises 1

  1. List the conditions that a set of nodes must satisfy to be considered a binary tree.

Trees and subtrees

Suppose that x is a node.  A descendant of x is any node that can be reached by iterating the instructions  x = x->lft  and  x = x->rght  in any order.  (Of course these instructions can only be iterated while x is not NULL. We are assuming that NULL is eventually reached.)

A node x together with all its descendants is a binary tree.  We say that x is the root of the tree.  If x has a parent, this tree is a subtree of some larger tree.  If x is NULL, the tree is empty.

For any node x, the node  x->lft  is the root of the left subtree of  x  and  x->rght  is the root of the right subtree of x.

The address of a tree

The address of a binary tree is the address of its root.  In verbal communication, it is convenient to blurr the distinction between trees and their addresses:  we say  consider a tree r  instead of  consider a tree whose root has address r.  This convention suggests the introduction of the alternative name tree for the data type pointer-to-node:

typedef node *tree;

Given this convention, we can give a recursive definition of binary tree:  a pointer-to-node r is a binary tree if

  1. r is NULL  or
  2. r->lft  and  r->rght  are binary trees.

Many algorithms on trees become simpler when written in recursive style.

Exercises 2

  1. Sequences of parentheses.  Binary trees have a very intimate relation with certain well-formed sequences of parentheses. Discuss this relation.
  2. Arithmetic expressions.  Binary trees can be used, very naturally, to represent arithmetic expressions (such as ((a+b)*c-d)/(e-f)+g, for example).  Discuss the details.

Left-root-right traversal

A binary tree can be traversed (i.e., scanned) in many different ways. One particularly important order of traversal is left-root-right, also known as inorder traversal, or infix traversal.  The left-root-right traversal visits

  1. the left subtree of the root, in left-root-right order,
  2. then the root,
  3. and finally the right subtree of the root, in left-root-right order.

In the following figure, the nodes are numbered in the order in which they are visited by the left-root-right traversal.

[arvore binária]

Here is a recursive function that does the left-root-right traversal of a binary tree:

// Receives the root r of a binary tree
// and prints the contents of its nodes
// in left-root-right order.

void leftrootright (tree r) {
   if (r != NULL) {
      leftrootright (r->lft);
      printf ("%d\n", r->contents);
      leftrootright (r->rght); 
   }
}

Writing an iterative version of this function is an excellent exercise. The version below uses a stack of nodes. The sequence of nodes on the stack is a kind of to do list:  each node x on the stack is a reminder that we must still print the contents of x and the contents of the right subtree of x.  The element at the top of the stack can be a NULL, but all the other elements are non NULL.

void i_leftrootright (tree r) {
   createstack ();  // stack of nodes 
   push (r);
   while (true) {
      x = pop ();
      if (x != NULL) {
         push (x);
         push (x->lft);
      }
      else {
         if (stackisempty ()) break;
         x = pop ();
         printf ("%d\n", x->contents);
         push (x->rght);
      }
   }
   freestack ();
}

(The code would be slightly simpler — but a little less readable — if it were to manipulate the stack directly.)  The figure below is the trace of execution of the i_leftrootright function. Each line of the table gives the state of things at the beginning of an iteration: on the left is the stack and on the right the printed nodes.  The value NULL is indicated by -.

          5        
        /   \      
                   
     3         8   
   /   \     /   \ 
  1     4   6     9
 / \         \     
0   2         7    
     stack        print
     5               
     5 3             
     5 3 1           
     5 3 1 0         
     5 3 1 0 -       
     5 3 1 -      0
     5 3 2        1
     5 3 2 -       
     5 3 -        2
     5 4          3
     5 4 -         
     5 -          4
     8            5
     8 6           
     8 6 -         
     8 7          6
     8 -          7
     9            8
     9 -           
     -            9

The exercises below discuss two other orders of traversal of a binary tree:  the root-left-right traversal and the left-right-root traversal.  The first one is also known as preorder traversal, or prefix traversal.  The second one is also known as postorder traversal, or postfix traversal.  (By the way, see the exercise about infix, postfix, and prefix notations below.)

Exercises 3

  1. Consider the function i_leftrootright. Is it true that the sequence of nodes on the stack is a path that begins at some node and follows the left and right pointers in some order?
  2. Check that the code below is equivalent to the function i_leftrootright.  The code uses a stack of nodes, all different from NULL, plus a node x (which is a candidate to be pushed onto the stack or NULL).
    void i_leftrootright (tree r) {
       node *x = r;
       createstack ();
       while (true) {
          while (x != NULL) {
             push (x);
             x = x->lft; }
          if (stackisempty ()) break;
          x = pop ();
          printf ("%d\n", x->contents);
          x = x->rght; }
       freestack (); }
    
  3. Number of nodes.  Write a function to calcule the number of nodes of a binary tree.
  4. Leaves.  Write a function to print, in left-root-right order, the contents of the leaves of a binary tree.
  5. Root-left-right traversal.  Write a function to do the root-left-right (i.e., prefix) traversal of a binary tree.  (This traversal is also known as depth-first search.)
  6. Left-right-root traversal.  Write a function to do the left-right-root (i.e., postfix) traversal of a binary tree.
  7. Infix, postfix, and prefix notations.  Show that a left-root-right traversal of the tree of an arithmetic expression corresponds exactly to an infix representation of the expression.  Show that a left-right-root traversal of the tree of an arithmetic expression corresponds exactly to the representation of the expression in postfix notation.  Show that a root-left-right traversal of the tree of an arithmetic expression corresponds to prefix notation.

Height and depth

The height of a node x in a binary tree is the distance between x and its most distant descendant. More precisely, the height of x is the number of steps in the longest path that leads from x to a leaf. The paths that this definition refers to are those obtained by iterating the instructions x = x->lft and x = x->rght, in any order.

The height of a tree is the height of the root of the tree.  A tree with only one node has height 0. The figure shows a tree of height 3.

[arvore binária]

Here is how the height of a tree with root r may be computed:

// Returns the height of the binary tree r.

int height (tree r) {
   if (r == NULL) 
      return -1; // height of empty tree
   else {
      int he = height (r->lft);
      int hd = height (r->rght);
      if (he < hd) return hd + 1;
      else return he + 1;
   }
}

What is the relation between the height, say h, and the number of nodes of a binary tree?  If the tree has n nodes then

lg (n)  ≤  h  ≤  n−1 ,

where  lg (n)  denotes  ⌊log n,  i.e., the floor of log n.  A binary tree of height n−1 is a trunk without branches: each node has at most one child.  At the other extreme, a tree of height lg (n) is quasi complete: all the levels are full except perhaps the last one.

      n    lg(n)
      4       2
      5       2
      6       2
     10       3
     64       6
    100       6
    128       7
   1000       9
   1024      10
1000000      19

A binary tree is balanced if, for each of its nodes, the left and right subtrees have nearly the same height. The height of a balanced binary tree with n nodes is close to  log n.  (The tree in the example above is balanced).  We should work with trees that are balanced whenever possible. But this is not easy if the tree grows and contracts during the execution of your program.

Depth.  The depth of a node s in a binary tree having root r is the distance from r to s. More precisely, the depth of s is the length of the only path that goes from r to s. For example, the depth of r is 0 and the depth of r->lft is 1.

A tree is balanced if all its leaves have nearly the same depth.

Exercises 4

  1. Draw a binary tree with 17 nodes that has the smallest possible height. Repeat with the greatest possible height.
  2. Write an iterative function to calculate the height of a binary tree.
  3. Write a function to print, one per line, the contents of the nodes of a binary tree. Indent each line in proportion to the depth of the node. Here is an example of a tree and the corresponding output (the characters '-' represent NULL):
             555                555       
           /     \                 333    
                                      111 
        333       888                    -
       /   \         \                   -
     111   444       999              444 
                                         -
                                         -
                                   888    
                                      -   
                                      999 
                                         -
                                         -
    
  4. Level traversal.  The level traversal, or breadth-first traversal, of a binary tree visits all the nodes at depth 0, then all the nodes at depth 1, then all the nodes at depth 2, and so on.  For each depth, the nodes are visited from left to right.  Write a function that will print the contents of the nodes as it traverses the tree in level order.  (Hint: Use a queue of nodes. See the calculation of distances in a graph.)
  5. Let's say that h is the height and d is the depth of a node x in a binary tree.  Is it true that h + d is equal to the height of the tree?
  6. Write a function that will compute the depth of a given node in a binary tree.
  7. Write a function that will decide whether a given binary tree is quasi complete.
  8. From heap to tree.  Write a function to transform a heap v[1..n] into a (quasi complete) binary tree.
  9. From tree to heap.  A binary tree is hierarchical if  x->contents ≥ x->lft->contents  and  x->contents ≥ x->rght->contents  for every node x, provided the children exist.  Is it possible to transform any hierarchical quasi complete tree into a heap?
  10. Hierarchization.  A leftist binary tree is one having the following property: for every node x, if x->lft == NULL then also x->rght == NULL.  Write a function that will move the contents fields of a leftist tree from one node to another so that the tree becomes hierarchical.  (Hint: Emulate the function that transforms an array into a heap. In particular, emulate the sieve function.)
  11. AVL trees.  A tree is balanced in the AVL sense if, para each node x, the heights of the subtrees rooted at x->lft and x->rght differ by at most 1. Write a function to decide whether a given tree is balanced in the AVL sense. Try to write your function so that it visits each node at most once.

Parent fields

In some applications (see the next section) it is convenient to have direct access to the parent of each node. To achieve this, we must add a prnt field to each node:

typedef struct record {
   int            contents;
   struct record *prnt;
   struct record *lft, *rght;
} node;

Since the root of a tree has no parent, we need to take a design decision about the value of the prnt field in this case. If r is the root, we could set r->prnt = NULL. But it is better to set

r->prnt = r

when r is the root. Of course the root will be the only node of the tree to have this property.

Exercises 5

  1. Write a function to correctly fill in all the prnt fields of a binary tree.
  2. Priority queue.  Write an implementation of a priority queue using a hierarchical tree with prnt fields as the underlying structure.  A hierarchical tree is a quasi complete binary tree such that x->contents ≥ x->lft->contents and x->contents ≥ x->rght->contents for every node x (provided the children exist).  The implementation must provide functions to create an empty queue, to delete a largest element from the queue, and to insert a new element into the queue.  (Hint: see the implementation of priority queue represented by a heap.)

First and last nodes

Consider the following problem: find the first node of a tree in the left-root-right order.  Here is a function that solves the problem:

// Receives a nonempty binary tree r and
// returns the first node of the tree in
// the left-root-right traversal.

node *first (tree r) {  
    while (r->lft != NULL) 
       r = r->lft;
    return r;
}

It is not difficult to write an analogous function to find the last node in the left-root-right traversal.

Next and previous nodes

Suppose x is a node of a binary tree. Our problem:  find the successor of x, i.e., the next node after x in the left-root-right order. We assume, of course, that x is not NULL.

The following function solves the problem assuming that the nodes have prnt fields.  The function returns the successor of x or returns NULL if there is no sucessor.

// Receives a node x. Returns the successor of
// x in the left-root-right traversal or NULL
// if x is the last node. The function assumes
// that x != NULL.

node *nextnode (node *x) {  
    if (x->rght != NULL) {
       node *y = x->rght; 
       while (y->lft != NULL) y = y->lft;
       return y;                                  // A
    }
    while (x->prnt != NULL && x->prnt->rght == x) // B 
       x = x->prnt;                               // B
    return x->prnt;
}

(Note that the function does not need to know where the root of the tree is.)  On line A, y is the first node (in the left-root-right order) of the subtree whose root is x->rght. The lines B move x up the tree while it is someone's right child.

Exercises 6

  1. Write a recursive version of the function first.
  2. Write a function to find the last node of a tree in the left-root-right order.
  3. Write a function to receive a node x of a binary tree and find the predecessor or x, i.e., the node that precedes x in the left-root-right order.
  4. Write a function to make a left-root-right traversal of a binary tree using the functions first and nextnode.