Think about the complexity of the functions you call

Last night I wrote a basic Radix sorting algorithm in Kotlin that can sort an IntArray. As a quick CS refresher, Radix sort is an O(n) complexity sorting algorithm that only works with int keys. CS theory tells us that the best possible complexity for a general-purpose sorting algorithm is O(n log n). This means that a basic Radix sort algorithm should be able to out-perform any general-purpose algorithm on a large enough input size (n).

In this post I'll be comparing my Radix sort algorithm vs. the built-in general-purpose Arrays.sort() algorithm.

Here's my initial Radix sort implementation:

fun IntArray.radixSort(): IntArray {  
  var result = this
  val max = getMax()

  var place = 1
  while (max / place > 0) {
    result = result.countingSort(place)
    place *= 10
  }

  return result
}

fun IntArray.countingSort(place: Int): IntArray {  
  val result = IntArray(size)
  val count = IntArray(10)

  (0 until result.size)
    .map { (this[it] / place) % 10 }
    .forEach { count[it] += 1 }

  (1 until count.size)
    .forEach { count[it] += count[it - 1] }

  for (i in size - 1 downTo 0) {
    val digit = (this[i] / place) % 10
    result[count[digit] - 1] = this[i]
    count[digit]--
  }

  return result
}

fun IntArray.getMax(): Int {  
  var mx = this[0]
  for (i in 1..size - 1)
    if (this[i] > mx)
      mx = this[i]
  return mx
}

The code works which is cool and all, but is it fast? More specifically, is it faster than just using the built in Arrays.sort() function?

Let's sort 10,000,000 ints and find out!

Sorting...  
Quick sort took 335ms.  
Radix sort took 479ms.  

Nope, it's slower.

Either CS is a lie, or I'm a bad coder (turns out I'm a bad coder).

Thing is, this code was translated from Java and then had some Intellij auto-suggestions applied to it. I essentially transformed imperative-style loops into ranges with higher order functions applied, e.g. this hotness:

(0 until result.size)
  .map { (this[it] / place) % 10 }
  .forEach { count[it] += 1 }

Sadly, this had a cost to it. Here's the (functionally equivalent) original loop:

for (i in 0 until result.size) {  
  val digit = (this[i] / place) % 10
  count[digit] += 1
}

Running our program with the original loop yields the following output:

Sorting...  
Quick sort took 330ms  
Radix sort took 247ms  

Just by reverting that quick-fix, I achieved the goal of making a conditionally faster sorting algorithm than the built in one.

I got curious as to why this was the case, so I dove into the compiled bytecode for both implementations.

The standard loop essentially produces something that looks like this (translated into Java):

IntRange range = RangesKt.until(0, result.length);  
int i = range.getFirst();  
int n = range.getLast();  
while (i <= n) {  
  int digit = $receiver[i] / place % 10;
  ++count[digit];
  ++i;
}

Nothing expensive here. The only difference is an additional allocation of the IntRange but allocations are cheap.

Now let's translate the compiled bytecode of the slower version into Java code:

Iterable range = (Iterable) RangesKt.until(0, result.length);  
Collection digits = new ArrayList(10);  
Iterator rangeIterator = range.iterator();

while (rangeIterator.hasNext()) {  
  int i = ((IntIterator) rangeIterator).nextInt();
  Integer digit = Integer.valueOf($receiver[i] / place % 10);
  digits.add(digit);
}

Iterator digitsIterator = digits.iterator()  
while (digitsIterator.hasNext()) {  
  Number digit = (Number) digitsIterator.next();
  ++count[digit.intValue()]
}

The differences here are pretty obvious:

  1. We now have an extra loop of size n!
  2. We store the digits in a collection which can't hold primitives. Hence we have to box a lot of integers!
  3. (Minor) there are a few other allocations (rangeIterator, digits, and digitsIterator).

This is actually really obvious when you read the functions that we're calling in the slow version, e.g: map and forEach. It's almost in plain English that we're iterating twice now instead of once.

It always pays to think about the cost of the functions we're calling. I just got baited by the pretty Kotlin code and blindly followed Intellij's advice.

Brad Campbell

Android application developer, currently working as the lead on the ANZ goMoney NZ applications. All of the opinions and code on this blog are my own, and not that of ANZ.