This page introduces another asymptotically optimal sorting algorithm: Mergesort. The working principle of Mergesort is remarkably simple: just start with the elements in separate subarrays and repeatedly merge them while maintaining the sort property. This will result in ever longer, sorted subarrays until the entire array is sorted. Mergesort is on par for speed with Quicksort and Heapsort, two other general-purpose sorting algorithms. The main drawback of Mergesort is that it does not work in place, but requires an additional array of the same size as the input to work with.
In the panel above, two arrays are merged into one, larger array while maintaining the sorting. This is done by choosing the next element to move into the new array from the beginning of either input array. The choice of which array to take the next element from is made by applying the sort comparison between the two candidate elements. In this case we are sorting the array to an ascending order, so the smaller of the two possible next elements is chosen to move into the output array. The merge step only works because the input arrays have already been sorted by the same ordering.
To sort the entire array, we start by placing all elements virtually into a separate subarray, and then start merging, as shown in the previous panel. As the output of a merge can't overwrite the input arrays immediately, we place the merge results into a second array. Once all the merged data is in the other array we can merge larger subarrays back into the original array. This goes to and fro until one array contains the sorted elements, as can be seen above.
The green shading of the number boxes signifies the end of a sorted subarray. In the beginning, all elements are in an array of their own, and all boxes are green. Later on, the single green boxes delimit the subarrays. We are done sorting, when only the last box is green. Note that the array length is not a power of 2, so the last subarray is a bit shorter than the others. This isn't a problem for Mergesort though, as we can merge arrays of different lengths.
There is a good way to speed up this sorting procedure: if we have access to multiple processor cores, they can sort the subarrays separately in parallel. This is possible because we know the boundaries between the subarrays, and can merge them independently. Of course the number of separate merge threads is reduced over the course of the sorting, as the number of subarrays is reduced. This procedure is on display above, note how much faster the early steps of the algorithm are performed.
For the end of this page, I want to show a larger array being sorted with Mergesort. Here we are back to single-threaded sorting and the array contains 128 numbers, which is a power of 2, so you can observe the symmetry of sorting such an even number of elements. I have also clocked up the speed by a factor of 2.5.
If you enjoyed this page, there are more algorithms and data structures to be found on the main page.