My page on graph search shows a few simple graph algorithms, mostly about search in graphs. On this page we present some further algorithms for graphs. Unlike the first page, here we are dealing with weighted graphs. In a weighted graph, the edges between nodes have different weights, shown above as the thickness of the lines between the circles. The weights can have different meanings, depending on the algorithm in question.
We will start off with minimum spanning trees, where the weight resembles the cost of adding an edge to a tree. Later on we will present algorithms for finding shortest paths in graphs, where the weight represents a length between two nodes. For simplicity the weights of the edges are chosen to be between 1 and 4. These can be directly translated into thicknesses of the line representing the edges.
A spanning tree is a subset of the edges of a connected graph that connects all nodes, but has no cycles. A connected graph is a graph without disconnected parts that can't be reached from other parts of the graph. All graphs used on this page are connected. A minimum spanning tree (MST) is such a spanning tree that is minimal with respect to the edge weights, as in the total sum of edge weights. In the panel above, the green edges are part of a minimum spanning tree that was found by Kruskal’s algorithm. If you have a close look, you can see that all nodes can be reached by the MST. The next two panels show algorithms for finding an MST.
Kruskal’s algorithm works as such: start with every node in a separate singleton tree. All trees together make up a forest. The algorithm then considers each edge, sorted by non-decreasing order of weight, and only adds an edge to the MST if it connects two previously unconnected trees in the forest. So every step merges two trees into one, and hence reduces the number of trees by one. Once the entire forest consists only of one tree, an MST has been found. This strategy is of the so-called ‘greedy’ type, but it finds a globally optimal solution for this problem.
Prim’s algorithm uses a different greedy strategy than Kruskal’s algorithm. It starts at a random node, and grows a single tree node by node until it has been turned into an MST. The key point is how the next edge to expand on is chosen: from the list of all edges bridging the current spanning tree and the rest of the graph, the edge with the least weight is chosen. Ties can be broken randomly. Note that Prim’s algorithm and Kruskal’s algorithm find the same minimum total weight (as reported after the run below the graphic), but don’t necessarily use the exact same edges in their MSTs (but often do).
Dijkstra’s algorithm computes what is called the ‘single-source shortest paths’ problem: For a given source node we want to know how far the total shortest distance to any other node in the graph is. Hereby the edge weights we had previously, are now considered lengths of distance between the nodes. Thicker lines indicate longer distances between the nodes. The geometric distance between the nodes as they are displayed is not relevant here, only the distance defined by the edges. The computed shortest distances are displayed as the numbers inside the explored nodes, where the green filled node is the starting point and has zero distance to itself.
The algorithm can be loosely characterised as a weighted breadth-first search. To choose which edge to traverse next, consider the edges between the explored and unexplored nodes, sorted by the distance to the new node in nondescending order. If you have a look at the animation above, note how the distance of the newly added nodes never decreases from the last one, often just increasing by one. This is a very similar strategy to Prim’s algorithm, and indeed the traversed edges in green here make up a spanning tree, just not an MST.
One hidden assumption of Dijkstra’s algorithm is that all edge lengths have to be positive or zero for the algorithm to work. So if we are modelling a road network for example, this makes sense as there are no negative drive times there. But we could want to have a weighted graph that includes negative edge lengths for other applications. In this case we can no longer rely on the simple Dijkstra’s algorithm to find shortest paths. In the example above, negative-length edges are indicated in red, the thickness of the lines now indicates magnitude of length, so a thick red line is quite negative in value. We have now chosen a directed graph, in which the edges have a direction, because paths through an undirected graph with negative edges aren't well defined.
The Bellman-Ford algorithm can find single-source shortest paths in a graph with negative edge lengths. But this only works if there are no negative cycles, which are cycles where the path length adds up to a negative value. In their presence, any path that moves around the cycle can become arbitrarily negative, just by cycling around the negative cycle. If this is the case, the algorithm has to terminate and indicate the presence of a negative cycle. On the panel above we always work with a graph that has no negative cycles. To see the opposite case, have a look at the next panel.
The starting condition for the algorithm is that all nodes except for the source node (green) have a distance of infinity (∞). It then loops over all nodes in any order and considers the currently shortest path to the node and updates the distance (denoted as a small green number in the node circle). At the end of one iteration through the node (from left to right) the numbers are updated from the new values that where just computed. In the beginning not much changes as most new distances are still infinity. After a while the shorter, non-infinite distances start to spread from the source node. If the distances don't change despite having processed all nodes, the algorithm has become stable and returns the resulting distances. Some nodes might remain at infinity if there is no path from the source node to them.
There are two possible indicators for a negative cycle: the first is that after repeating the loop over all nodes as often as there are nodes in the graph, and the algorithm does not become stable, i.e. distances are still changing, then there must be at least one negative cycle in the graph. Another, sufficient but not necessary condition is that the source node takes on negative values. This also means there is a negative cycle in the graph. These cases are demonstrated in the panel above, where we always show a graph with negative cycles present. We have speeded up the algorithm display a bit, because detecting negative cycles can take a while.
If you enjoyed this page, there are more algorithms and data structures to be found on the main page. In particular there is more about Graph search.