A binary tree is a hierarchical structure where each node has at most two children, called left and right. Every node except the root has exactly one parent. Trees are the natural fit for hierarchical data — file systems, parse trees, decision trees, organizational charts.
Vocabulary
- Root: the top node with no parent
- Leaf: a node with no children
- Depth of a node: number of edges from the root to that node (root has depth 0)
- Height of a node: number of edges on the longest downward path to a leaf
- Subtree: a node together with all its descendants
A tree's height bounds the cost of many operations. A tree of nodes has height between (balanced) and (degenerate — every node has one child, essentially a linked list).
Common shapes
- Full: every node has 0 or 2 children
- Complete: every level filled except possibly the last, and the last is filled left to right
- Perfect: every internal node has exactly 2 children and all leaves are at the same depth. A perfect tree of height has exactly nodes.
- Balanced: every node's two subtrees differ in height by at most 1. Operations stay .
Node representations
Two common layouts. Linked: each node holds value, left pointer, right pointer (and sometimes parent). Array (implicit): for complete trees, store node at array index ; its children are at and . The array layout is used for heaps — it's cache-friendly and needs no pointer storage.
Why trees show up everywhere
A tree is the simplest non-linear data structure. Once you can do linear (arrays, lists) and tree, you can compose them into nearly anything: graphs are trees-plus-cycles, tries are letter-keyed trees, syntax trees compile programs, B-trees back databases, Merkle trees back blockchains.