Dijkstra's algorithm (Shortest path algorithm)

May 07, 2017

Dijkstra’s algorithm is an efficient single-source shortest path algorithm. That is, it helps us in finding the shortest distance from the source vertex to all the other vertices in the graph. As opposed to breadth-first search, it efficiently solves the single-source shortest path problem for weighted graphs (graph with weighted edges). For the algorithm to work, all edge weights must be non negative.

Video with Step-by-Step Example

This video describes the problem statement and shows what the Dijkstra’s algorithm does step-by-step on a well chosen example.

Algorithm steps

Given a graph and source, Dijkstra’s algorithm calculates the distance of the shortest path from source to every node in the graph. The algorithm steps are as follows :-

Mark all nodes as unvisited.
Assign every node a tentative distance. distance(source) = 0. distance(node) = infinity for all the nodes.
Set the source as the current node.
Mark current node as visited. (This ensures each node is visited exactly once).
Consider all unvisited neighbors of the current node:
Calculate a new tentative distance for the neighbor for the path through the current node. tentativedistance = distance(current node) + edgeweight(current node, neighbor)
If the newly calculated distance is better (smaller) than the current value of distance(neighbor), update distance(neighbor).
Set current node = unvisited node with the smallest tentative distance. This node’s tentative distance is now final and known for sure. Repeat from step 4. If there are no more unvisited nodes, or distance for all unvisited nodes = infinity (happens for disconnected graphs), then stop.

Pseudo code

Following is the pseudo code in Python.

define dijkstra(graph, source):
    # initialization 
    distance[node] = INFINITY for all nodes in graph
    previous[node] = UNKNOWN for all nodes in graph
    visited[node] = False for all nodes in graph
    heap = empty heap 
    # start from source  
    distance[source] = 0
    heap.insert_with_priority(distance[source], source)
    while not heap.empty(): # at-least one unvisited node
        # unvisited vertex with minimum distance
        node = heap.pop() 
        if visited[node]: continue 
        visited[node] = True # mark visited 
        for each unvisited neighbor of node:
            # calculate distance of new path 
            new_distance = distance[node] + edge_weight(node, neighbor)
            # check if we found a better path than current best 
            if distance[neighbor] is None or new_distance < distance[neighbor]:
                # if yes, update distance[] and previous[]
                distance[neighbor] = new_distance
                previous[neighbor] = node 
                heap.insert_with_priority(distance[neighbor], neighbor)
    return distance, previous

Code implementation

The following is the implementation of Dijkstra’s Algorithm in cpp. The default edges and their weights are the same as in the video given above.

#include <bits/stdc++.h>
using namespace std;
const int N = 100;
const int INF = 1e9;
vector<pair<int,int> >adj_list[N];
int main() 
{
  // Number of nodes and edges
  int n = 5, m = 7;
  vector<int>dist(n,INF);
  /*
    We will be adding the 
    following 7 edges.
  */
  pair<int, int> edges[] = {{0, 1}, {1, 2}, {2, 4}, {4, 3}, {3, 0}, {3, 1}, {1, 4}};
  int weight[] = {6, 5, 5, 1, 1, 2, 2};
  for (int i = 0; i < m; i++)
  {
    int x = edges[i].first;
    int y = edges[i].second;
    int w = weight[i];
    // As graph is undirected
    adj_list[x].push_back({y,w});
    adj_list[y].push_back({x,w});
  }
  priority_queue< pair<int, int>, vector< pair<int, int> >, greater< pair<int, int> > >pq;
  int src = 0;
  pq.push({0, src}); // {distance from source, source}
  dist[src] = 0;
  cout << "Shortest distance from source vertex " << src << " to " << endl;
  while (!pq.empty())
  {
    int node = pq.top().second;
    pq.pop();
    for (auto i:adj_list[node])
    {
      int v = i.first;
      int w = i.second;
      if (dist[v] > dist[node] + w)
      {
        dist[v] = dist[node] + w;
        pq.push({dist[v], v});
      }
    }
  }
  for (int i = 0; i < n; i++)
  {
    cout << "Vertex " << i << ": " << dist[i] << endl;
  }
  return 0;
}

The following is the implementation of Dijkstra’s Algorithm in python. Similar to the code in cpp, the default edges and their weights are the same as in the video given above.

from collections import defaultdict
import heapq
# initialisation
n = 5
graph = defaultdict(list)
dist = [float('inf')]*n
heap = []
# adding the following edges
edges = [(0, 1), (1, 2), (2, 4), (4, 3), (3, 0), (3, 1), (1, 4)]
weight = [6, 5, 5, 1, 1, 2, 2];
for i in range (0, len(edges)):
    x = edges[i][0]
    y = edges[i][1]
    w = weight[i];
    #Adding the weighted edges
    graph[x].append((w, y))
    graph[y].append((w, x))
src = 0
# add edge to heap with edge weight priority
heapq.heappush(heap, (0, src))
dist[0] = 0
while len(heap) > 0:
    # heap[0] gives the top element in the min heap
    node = heap[0][1]
    heapq.heappop(heap)
    for i in range(0, len(graph[node])):
        nextnode = graph[node][i][1]
        w = graph[node][i][0]
        if dist[nextnode] > dist[node] + w:
            dist[nextnode] = dist[node] + w;
            heapq.heappush(heap, (dist[nextnode], nextnode))
print("Shortest distance from source vertex " + str(src) + " to vertex")
for i in range(0,n):
    print(str(i) + ": " + str(dist[i]))

Why does the algorithm work?

The main invariant that Dijkstra’s algorithm maintains is the following: At the beginning of each iteration, for all visited nodes distance(node) is the optimal shortest-path distance. For all unvisited nodes, distance(node) is the shortest-path distance via visited nodes only.

We can prove this by induction.

Initially, this is true when we mark the source as visited and finish iterating over its neighbors.

Now, the unvisited node with the smallest distance must also have the correct shortest distance. (why? hint: use invariant above; this also assumes graph weights are positive). So when we choose this node and mark it visited, the first half of the invariant (regarding visited nodes) is still true. Right after this, we calculate tentative distances for all it’s neighbors and update them if we find a better distance. Hence, the second half of the invariant (regarding unvisited nodes) is also true.

Run-time analysis

For a graph with $n$ nodes and $m$ edges, the runtime of Dijkstra’s algorithm is O( $m \log n$ ) - assuming we use heaps to find the “unvisited vertex with minimum distance”.

This is because each node gets visited exactly once, and each edge is relaxed exactly once. Hence, we insert into the heap $m$ times (once per edge), which takes O( $\log n$ ) time per operation.

All other operations (apart from heap insert and pop) take O( $m$ ) time total, since we perform O(1) work per edge. This is dominated by the time taken by the heap operations.

CommonLounge Archive