Trees

  • Depth of a node: the length (no. of edges) of the branch to the root
  • Height of a tree: the maximal depth (no. of layers - 1)
  • Perfect tree: when all leaves are on the same depth h, there are $2^{h+1} - 1$ nodes which all all reachable in $h$ steps
  • Complete $n$-tree: each vertex except for for the leafs has $n$ children, in total $\geq 2^h$ nodes (no gaps in array representation)
  • Binary tree: every node has at most two children
  • Balenced binary tree: the left and right subtrees of every node differ in height by no more than 1
  • #edges = #nodes - 1 (because every node execpt for the root has an edge to its parent)

Tree traversal #

On a binary tree, in which: Node (N), Left (L), Right (R):

  • Preorder (NLR)
  • Postorder (LRN)
  • Inorder (LNR)

Search trees #

Search tree property: if $x$ in node $k$ then:

  • Everything in the left subtree of $k$ is smaller than $x$
  • Everything in the right subtree of $k$ is greater than $x$

Operations: search, add, remove are in $\mathcal{O}(h)$

Expression trees #

In prefix notation:

  • Operator before argumentens
  • Never needs parentheses!
  • Corresponds to preorder traversal

In infix notation:

  • The 'usual' way of notating math
  • Corresponds to inorder traversal

Tries #

A standard trie $T$ on a collection of words $W$, is a tree with the following properties:

  • The root of T is empty, and every other node contains a letter
  • The children of a node T contains different letters and are in alphabetical order
  • The branches in T from the root correspond exactly with the words in W

Compressed trie: strings are concatenated

Compact trie: nodes store range of indicies referencing positions in word

Let $n$ be the sum of lengths of the words and $m$ be the number of words:

TrieNumber of nodesMemory use
Standard trie$\mathcal{O}(n)$$\mathcal{O}(n)$
Compressed trie$\mathcal{O}(m)$$\mathcal{O}(n)$
Compact trie$\mathcal{O}(m)$$\mathcal{O}(m)$

Suffix trie: store only suffixes of substrings since every substring of a string is the prefix of a suffix.

Implementation of a trie data structure in C #

Using an array:

#include <stdbool.h>
#define N 26

typedef struct trieNode *trie;
typedef struct trieNode {
  bool isEnd;
  trie next[N];
} trieNode;

Using a linked list:

typedef trieNode *trie;
struct trieNode {
    char letter;
    trie nextVert; // Pointer next child (NULL if none)
    trie nextHor; // Pointer to next sibling (NULL if none)
};

Depending on your data either implementation can be more memory efficient. However the lookup times for entries in a liked list are slower because the list has to be traversed.

Heaps #

  • Heap property: for each node v, its descendants have a value that is smaller or equal than the value of v.
  • enqueue / removeMax in $\mathcal{O}(\lg(n))$
  • Restoring heap property by:
    • upheap: keep swapping with parent untill heap property is maintained.
    • downheap: move node down untill it is smaller than its anchestors.
  • Efficiently implemented as an array (since the heap is a complete tree)

Heap Sort #

Construct a heap in $\mathcal{O}(n)$ and then keep removing the min/max in $\mathcal{O}(n\lg(n))$.