Swift Algorithms & Data Structures

Heaps

In the previous essay, we reviewed Dijkstra's algorithm for searching a graph. Originally published in 1959, this popular technique for finding the shortest path is an elegant solution to a complex problem. The design involved many parts including graph traversal, custom data structures and the greedy approach.

When designing programs, it's great to see them work. With Dijkstra, the algorithm did allow us to find the shortest path between a source vertex and destination. However, our approach could be refined to be more efficient. In this essay, we'll enhance our algorithm with the addition of binary heaps.

How it works

In its basic form, a heap is just an array. However, unlike an array, we visualize it as a tree. The term visualize implies we use processing techniques normally associated with recursive data structures. This shift in thinking has numerous advantages. Consider the following:

//a simple array of unsorted integers
let numberList : Array<Int> = [8, 2, 10, 9, 11, 7]

As shown, numberList can be easily represented as a heap. Starting at index 0, items fill a corresponding spot as a parent or child node. Each parent also has two children with the exception of index 2.

The array visualized as a 'nearly complete' binary tree.

The array visualized as a "nearly complete" binary tree.

Since the arrangement of values is sequential, a simple pattern emerges. For any node, we can accurately predict its position using these formulas:

formula1 formula2

Sorting Heaps

An interesting feature of heaps is their ability to sort data. As we've seen, many algorithms are designed to sort entire datasets. When sorting heaps, nodes can be arranged so each parent contains a lesser value than its children. In computer science, this is called a min-heap.

A heap structure that maintains the min-heap property.

Fig. 2. A heap structure that maintains the min-heap property.

Exploring The Frontier

With Dijkstra, we used a concept called the frontier. Coded as a simple array, we compared the total weight of each path to find the shortest path.

//obtain the best path
var bestPath: Path = Path()
while(frontier.count != 0) {
    //support path changes using the greedy approach
    bestPath = Path()

    var x: Int = 0
    var pathIndex: Int = 0

    for (x = 0; x < frontier.count; x++) {
        var itemPath: Path = frontier[x]

        if (bestPath.total == nil) || (itemPath.total < bestPath.total) {
            bestPath = itemPath
            pathIndex = x
        }
    } //end for

While it accomplished our goal, we applied a brute force technique. In other words, we examined every path to find the shortest path. This code is said to run in linear time or O(n). If the frontier contained a million rows, how would this impact the algorthim's overall performance?

The Heap Structure

Let's create a more efficient frontier. Named PathHeap, the class will extend the functionality of an Array.

//a basic min-heap data strcture
public class PathHeap {
    private var heap: Array

    //the number of frontier items
    var count: Int {
        return self.heap.count
    }

    init() {
        heap = Array()
    }
}

PathHeap includes two properties - Array and Int. To support good design (e.g., encapsulation), the heap has been declared a private property. To track the number of items, the count has also been declared as a computed property.

Building the Queue

Finding the best path more efficiently than O(n) will require a new way of thinking. We can improve our algorithm's performance to O(1) through a "bubble-up" approach. Using our heapsort formulas, this involves "swapping" index values so the smallest item is positioned at the root.

//sort shortest paths into a min-heap (heapify)
func enQueue(key: Path) {
    heap.append(key)

    var childIndex: Float = Float(heap.count) - 1
    var parentIndex: Int! = 0

    //calculate parent index
    if (childIndex != 0) {
        parentIndex = Int(floorf((childIndex - 1) / 2))
    }

    //use the bottom-up approach
    while (childIndex != 0) {
        var childToUse: Path = heap[Int(childIndex)]
        var parentToUse: Path = heap[parentIndex]

        //swap child and parent positions
        if (childToUse.total < parentToUse.total) {
            heap.insert(childToUse, atIndex: parentIndex)
            heap.removeAtIndex(Int(childIndex) + 1)

            heap.insert(parentToUse, atIndex: Int(childIndex))
            heap.removeAtIndex(parentIndex + 1)
        }

        //reset indices
        childIndex = Float(parentIndex)

        if (childIndex != 0) {
            parentIndex = Int(floorf((childIndex - 1) / 2))
        }
    } //end while
} //end function

The enQueue method accepts a single path as a parameter. Unlike other sorting algorithm's, our primary goal isn't to sort each item, but to find the smallest value. This means we can increase our efficiency by comparing a subset of values.

A heap structure that maintains the min-heap property.

Fig. 3. A heap structure that maintains the min-heap property.

The enQueue process compares a newly added value with its parent in a process called "bubbling-up".

Fig. 4 . The enQueue process compares a newly added value with its parent in a process called "bubbling-up".

The compare / swap process continues recursively until the smallest value is positioned at the root.

Fig. 5. The compare / swap process continues recursively until the smallest value is positioned at the root.

Since the enQueue method maintains the min-heap property (as new items are added), we all but eliminate the task of finding the shortest path. Here, we implement a basic peek method to retrieve the root-level item:

//obtain the minimum path
func peek() -> Path! {
    if (heap.count > 0) {
        var shortestPath: Path = heap[0]
        return shortestPath
    }
    else {
        return nil
    }
}

The Results

With the frontier refactored, let's see the applied changes. As new paths are discovered, they are automatically sorted by the frontier. The PathHeap count forms the base case for our loop condition and the bestPath is retrieved using the peek method.

//construct the best path
var bestPath: Path = Path()

while(frontier.count != 0) {
    //use the greedy approach to obtain the best path
    bestPath = Path()
    bestPath = frontier.peek()
}

Want to see the entire algorithm? Here's the source.