Trees (ADT)

Trees are “an abstract model of a hierarchical structure. A tree consists of nodes with a parent-child relation.”

*Data Structures and Algorithms in Java, Goodrich, Tamassia, Goldwasser

Fundamental Operation Value returned Effect
size() Number of nodes in the tree -
isEmpty() Whether the tree is empty -
iterator() An iterator for the tree -
positions() An iterable container of all nodes in the tree -
root() The root node -
parent(p) The parent of the node p -
children(p) An iterable container of the children of the node p -
numChildren(p) The number of children of the node p -
isInternal(p) Whether the node p is an internal node (node with at least 1 child) -
isExternal(p) Whether the node p is an external node (node with no children) -
isRoot(p) Whether the node p is a root node (node without parent) -
insert(p,e) - Add a node of value e as a child of the node p
update(p,e) - Update the value of the node p to be e
delete(p) The value of the node p Delete the node p

The methods for insertion, deletion, and searching are more complicated, and so are outlined in more detail in the binary search tree section

Tree Traversals

There are various ways a tree can be traversed. Shown here is a figure of a binary tree.

In-order (Left, Root, Right). DBE A FCG

Pre-order (Root, Left, Right). A BDE CFG

Post-order (Left, Right, Root). DEB FGC A

Breadth First/Level Order. ABCDEFG

We will come back to breadth first traversal in a later topic (Breadth First Search). For now we will focus on the first 3.

In-Order Traversal

For every node, we print the left child, the node itself, then the right child. Since this is a recursive function, if we start at a node n, the algorithm will start from the left-most child of the tree, then that child’s parent then its sibling and on for the entire tree that the n is the root of.

Function inOrder(n)
  if n != null
    inOrder(n.leftChild())
    Print n
    inOrder(n.rightChild())

Note that the above algorithm applies only to binary trees, for a more general form of in-order traversal, there will need to be an additional definition of what makes a node a “left child”. This can either be that left child nodes have a smaller value than the parent/root, or left children are just the first m number of nodes etc.

Pre-order traversal

Each node is printed before its descendants, and descendants are taking in ascending order

Function preOrder(n)
  if n != null
    Print n
    For each child m of n
      preOrder(n)

Post-order traversal

Each node is printed after its descendants, and descendants are taking in ascending order

Function postOrder(n)
  if n != null
    For each child m of n
      postOrder(n)
    Print n

Binary trees (ADT)

Binary trees are a specialised tree where each node has at most two children, called left and right

Fundamental Operations Value returned Effect
left(p) The left child of node p -
right(p) The right child of node p -
sibling(p) The sibling of node p -

Properties

A binary tree with \(n\) nodes, \(e\) external nodes, \(i\) internal nodes, and a height \(h\) has the properties

\[\begin{gather} e = i + 1 \tag1 \\\\ n = 2e - 1 \tag2 \\\\ h \leq i \tag3 \\\\ h \leq \frac{(n-1)}{2} \tag4 \\\\ e \leq 2^h \tag5 \\\\ h \geq log_2 e \tag6 \\\\ h \geq log_2 (n+1) - 1 \iff n = 2^{h+1} -1 \tag7 \end{gather}\]

As mentioned earlier, Binary Trees by definition have a discrete middle node, and inherently support in-order traversal.

Implementations

There are two common concrete implementations of binary trees

  • Linked structure
  • Array based

Linked structure

In the linked structure implementation, each node is an object which stores its value, references to its child nodes (and sometimes a reference to its parent), as shown in the diagram below:

binaryTreeLinkedStructure

This has a linear space complexity irrespective of the balance of the tree, and has a lookup time of \(O(log_2n)\) for lookup operations.

Array based

In the array based implementation, node values are stored in an array, and their children can be found at indices based on arithmetic operations of their own index

  • \[index(root) = 0\]
  • If \(l\) is the left child of \(n\), then \(index(l) = 2 \cdot index(n) + 1\)
  • If \(r\) is the right child of \(n\), then \(index(r) = 2 \cdot index(n) + 2\)

This can be very inefficient for unbalanced trees, for example, a tree which is just a “line” of nodes would grow with \(O(2^n)\) space, but it has a similarly good lookup time of \(O(log_2n)\)