Monthly Archives: March 2016

Array Sorting Algorithms in the Java Library

The class java.util.Arrays provides overloaded static methods for sorting arrays of primitive types (byte, char, int, long, float, double) as well as objects. The entire array or a range within the array can be specified.

For each type the library provides two overloaded methods: a single-threaded sort() method and a multithreaded parallelSort() implementation.

It is interesting to analyse the time complexity of the algorithms used in the library and compare its performance against that of a text book merge-sort implementation.

Single-threaded Sorting of Primitive Type Arrays

The Java SE 8 API documentation provides the following notes regarding the implementation of the sorting algorithms for primitive types:

“The sorting algorithm is a Dual-Pivot Quicksort by Vladimir Yaroslavskiy, Jon Bentley, and Joshua Bloch. This algorithm offers O(n log(n)) performance on many data sets that cause other quicksorts to degrade to quadratic performance, and is typically faster than traditional (one-pivot) Quicksort implementations.”

If you look at the library source code you will find that the methods actually switch to different algorithms based on the size of the input array. For instance for tiny arrays of less than 50 elements it switches to the insertion-sort algorithm which is very efficient when the number of inversions is small (that is the number of pairs of elements that are out of order).

Single-threaded Sorting of Object Arrays

The library provides an overloaded sort() method for arrays of objects that implement the java.lang.Comparable interface and another that takes a java.util.Comparator implementation as argument if you need to specify custom ordering.

As for the algorithm used the Java SE 8 API documentation provides the following implementation notes:

“This implementation is a stable, adaptive, iterative merge-sort that requires far fewer than n log(n) comparisons when the input array is partially sorted, while offering the performance of a traditional merge-sort when the input array is randomly ordered. If the input array is nearly sorted, the implementation requires approximately n comparisons. Temporary storage requirements vary from a small constant for nearly sorted input arrays to n/2 object references for randomly ordered input arrays”

As noted before the implementation actually switches algorithms based on the size of the input array.

Multithreaded Array Sorting

The algorithm used by the overloaded parallelSort() methods for both primitive and object types still relies on the single-threaded implementations but leverages the Fork-Join framework to obtain parallelism as described in the API documentation:

“The sorting algorithm is a parallel sort-merge that breaks the array into sub-arrays that are themselves sorted and then merged. When the sub-array length reaches a minimum granularity, the sub-array is sorted using the appropriate Arrays.sort method. If the length of the specified array is less than the minimum granularity, then it is sorted using the appropriate Arrays.sort method. The algorithm requires a working space no greater than the size of the original array. The ForkJoin common pool is used to execute any parallel tasks”

Performance Comparison

We will now compare the execution time of the library sorting algorithms for object arrays against a textbook implementation of merge-sort . The source code is on GitHub.

The data used for the tests are arrays of random Integer objects generated by a utility class. The size of the arrays is 10^6 or 1,000,000 elements with a bound value of 10^7 or 10,000,000.

The running time is measured by taking the difference between System.currentTimeMillis() calls before and after the execution of each algorithm.

The JUnit tests are executed by the Maven Surefire Plugin from the command line on a Windows 7 Professional machine based on Intel Core i5-3320M CPU running @ 2.60 GHz clock speed.

These are the results in milliseconds:

Run Merge-Sort JDK sort() JDK parallelSort()
1 966 510 239
2 958 533 241
3 971 505 241
4 989 507 237
5 965 493 237

Whilst the absolute running time is not necessarily indicative as it is heavily influenced by environmental factors, the relative running times should give a good indication of performance differences between the different algorithms when executed under the same conditions.

The full source code can be found on this GitHub repository setup as a Maven project. The development environment used is described here.