Prim's algorithm for Minimum Spanning Tree
January 24, 2017
Given an undirected weighted graph, a minimum spanning tree (MST) is a subset of the edges of the graph which form a tree and have the minimum total edge weight. For a MST to exist, the graph must be connected (that is, every pair of nodes must be reachable from each other).
Example of Minimum Spanning Tree. Total edge weight = 5 + 8 + 8 + 4 + 11 + 6 = 42
Prim’s algorithm builds the minimum spanning tree one vertex at a time. To see what that means, watch the following video.
Video with step-by-step example
Algorithm steps
- Start at an arbitrary node. Mark the node visited. Mark all other nodes unvisited.
- While there is an unvisited node
- Consider all edges which connect a visited node to an unvisited node. Out of these edges, pick the one with the minimum weight.
- Mark the unvisited node corresponding to the edge visited. Add edge to minimum spanning tree.
- Done!
Pseudo code
The following is the pseudo code function definition to implement prim’s algorithm in python:
define prims_algorithm(graph):
# initialization
unvisited = set(all nodes in graph)
heap = empty heap
MST = empty set
# start at an arbitrary node
source = choose node at random
unvisited.remove(source) # mark visited
for neighbor of source:
# add edge to heap with priority=edge weight
heap.insert_with_priority(edge_weight(source, neighbor), (source, neighbor))
# while there is an unvisited node
while not unvisited.empty():
# if heap is empty, graph is disconnected
if heap.empty(): return INFINITY
# get minimum-weight edge to an unvisited node from heap
# (note: all edges in heap have at-least one end = visited node)
edge_weight, (previous, node) = heap.pop()
if node not in unvisited: continue # already visited
# found new reachable node. mark visited and update MST
unvisited.remove(node)
MST.add((previous, node))
# add all edges from this node to the heap
for neighbor of node:
# add edge to heap with priority=edge weight
heap.insert_with_priority(edge_weight(node, neighbor), (node, neighbor))
return MST
Implementation
The following is the python code for prim’s algorithm:
from collections import defaultdict
import heapq
# initialisation
graph = defaultdict(list)
visited = [0]*100
heap = []
n = 4
# adding the following edges
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 4), (1, 3)]
weight = [1, 4, 2, 6, 7, 5, 3];
for i in range (0, len(edges)):
x = edges[i][0]
y = edges[i][1]
w = weight[i];
#Adding the weighted edges
graph[x].append((w,y))
graph[y].append((w,x))
# start at an arbitrary node - here 0
# add edge to heap with edge weight priority
for i in range(0,len(graph[0])):
heapq.heappush(heap,(graph[0][i],0))
visited[0] = 1
cost = 0
mst = []
# while there is an unvisited node
# here we consider that the given graph is connected
while len(mst) < n:
# get minimum-weight edge to an unvisited node from heap
# heap[0] gives the top element in the min heap with edge weight priority
w = heap[0][0][0]
previousnode = heap[0][1]
node = heap[0][0][1]
heapq.heappop(heap)
# already visited
if visited[node] == 1:
continue
# found new reachable node. Mark visited and update MST
visited[node] = 1
cost = cost + w
mst.append((node, previousnode))
# add all edges from this node to the heap
for i in range(0, len(graph[node])):
heapq.heappush(heap, (graph[node][i], node))
print("Minimum cost of spanning tree is: " + str(cost))
print("Following are the edges in MST: ")
for i in range(0,len(mst)):
print(str(mst[i][0]) + " " + str(mst[i][1]))
The following is the implementation in Cpp of Prim’s algorithm to build a minimum spanning tree.
#include <bits/stdc++.h>
using namespace std;
int vis[1001];
int main()
{
int n = 5, m;
vector<pair<int, int> >graph[n];
vector<pair<int, int> >mst;
pair<int, int>edges[] = { {0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 0}, {1, 4}, {1, 3} };
int weight[] = {1, 4, 2, 6, 7, 5, 3};
m = sizeof(weight) / sizeof(weight[0]);
// Build the adjacency list
for(int i = 0; i < m; i++)
{
graph[edges[i].first].push_back({edges[i].second, weight[i]});
graph[edges[i].second].push_back({edges[i].first, weight[i]});
}
/*
priority queue in cpp is
max heap by default. We need
min heap for prim's algorithm.
We store (weight, (u, v))
*/
priority_queue< pair<int, pair<int, int> >, vector< pair<int, pair<int, int> > >,greater< pair<int, pair<int, int> > > >pq;
/*
We can start with any random node.
WLOG we start with node 0
*/
for(auto i: graph[0])
pq.push({i.second, {i.first, 0}});
vis[0] = 1;
int cost = 0;
while(!pq.empty())
{
// get minimum-weight edge to an unvisited node from heap
// pq.top() gives the (weight,(node,previousnode)) with min weight priority
int w = pq.top().first;
int node = pq.top().second.first;
int previousnode = pq.top().second.second;
pq.pop();
// already visited
if(vis[node])continue;
// found new reachable node. Mark visited and update MST
vis[node] = 1;
cost += w;
mst.push_back({node, previousnode});
// add all edges from this node to the heap
for(auto i: graph[node])
pq.push({i.second, {i.first, node}});
}
cout << "Minimum cost of spanning tree is: " << cost << endl;
cout << "Following are the edges in MST: " << endl;
for(auto i: mst)
cout << i.first << " " << i.second << endl;
return 0;
}
Run-time analysis
Given a graph with $n$ nodes and $m$ edges, Prim’s algorithm has a run-time of O($m \log m$). Since $m \leq n^2$, this is also O($m \log n$). The most expensive operation is heap inserts and heap pops, and we do each operation $m$ times (once per edge).
Correctness of Prim’s algorithm
The cut property
The main idea behind both Prim’s algorithm and Kruskal’s algorithm is the cut property.
Consider any subset S of all vertices V of the graph. Consider all edges going from a vertex in S to a vertex outside S. The cut property states that, if $e$ is the edge with minimum weight among all these edges, then $e$ must be part of the minimum spanning tree.
Proof of cut property: The proof is quite simple. Assume the contrary, that is, the MST of the graph does not include $e$. Since the minimum spanning tree spans all vertices, it has at-least one edge $e'$ going from S to outside S. If this edge is replaced by $e$, we get a spanning tree with lesser weight. This is a contradiction, since the original tree was a minimum spanning tree.
Correctness of Prim’s algorithm
In Prim’s algorithm, in each step, we consider a cut and choose the minimum weight edge crossing the cut. Each of these edges must be part of the MST, as proved above.