Java – Threaded Radix Sort

I have been working on different variations of Radix Sort. At first I used chaining, which was really slow. Then I moved on to a count sort using val % (10 * pass), and most recently I split each value into its bytes and count sort those, which also lets me sort negative values.

I wanted to try it with multithreading, and can only get it to work about half the time. I was wondering if someone can help look at my code, and see where I’m going wrong with the threading. I have each thread count sort each byte. Thanks:

public class radixSort {

    public int[] array;
    public int arraySize, arrayRange;

    public radixSort (int[] array, int size, int range) {
        this.array = array;
        this.arraySize = size;
        this.arrayRange = range;
    }

    public int[] RadixSort() {
        // one thread per byte of the int (pass 0 = least significant byte)
        Thread[] threads = new Thread[4];
        for (int i=0;i<4;i++)
            threads[i] = new Thread(new Radix(arraySize, i));
        for (int i=0;i<4;i++)
            threads[i].start();
        for (int i=0;i<4;i++)
            try {
                threads[i].join();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        return array;
    }

    class Radix implements Runnable {
        private int pass, size;
        private int[] tempArray, freqArray;

        public Radix(int size, int pass) {
            this.pass = pass;
            this.size = size;
            this.tempArray = new int[size];
            this.freqArray = new int[256];
        }

        public void run() {
            int temp, i, j;
            synchronized(array) {
                // count how often each byte value occurs in this pass
                for (i=0;i<size;i++) {
                    if (array[i] <= 0) temp = array[i] ^ 0x80000000;
                    else temp = array[i] ^ ((array[i] >> 31) | 0x80000000);
                    j = temp >> (pass << 3) & 0xFF;
                    freqArray[j]++;
                }
                // turn the counts into end positions
                for (i=1;i<256;i++)
                    freqArray[i] += freqArray[i-1];
                // place elements back to front so the pass stays stable
                for (i=size-1;i>=0;i--) {
                    if (array[i] <= 0) temp = array[i] ^ 0x80000000;
                    else temp = array[i] ^ ((array[i] >> 31) | 0x80000000);
                    j = temp >> (pass << 3) & 0xFF;
                    tempArray[--freqArray[j]] = array[i];
                }
                for (i=0;i<size;i++)
                    array[i] = tempArray[i];
            }
        }
    }
}


Besides the naming (by Java convention a class name should start with a capital letter and a method name with a lowercase one), I can see that every thread synchronizes on the array for its entire pass. So the work is in fact not parallel at all.

There is a basic problem with this approach. To get a benefit from multithreading, you need to give each thread a task that does not overlap with the other threads' tasks. By synchronizing on the array you have made it so that only one thread does work at a time, meaning you get all the overhead of threads with none of the benefit.

Think of ways to partition the task so that threads work in parallel. For example, if the first pass sorts on the highest bit, then after that pass all the items with a 1 high bit will be in one part of the array, and those with a 0 high bit will be in the other. You could have one thread work on each part of the array without synchronizing.

Note that your Runnable has to change completely so that it does one pass on a specified subset of the array and then spawns threads for the next pass.
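
A rough sketch of that partitioning idea, under some assumptions of my own: it splits on the top byte rather than on a single bit, flips the sign bit so that negative values land in the lower buckets, and finishes each bucket with Arrays.sort instead of further radix passes, purely to keep the example short. The names MsdPartitionSketch and topByte are made up for illustration.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class MsdPartitionSketch {

    // Hypothetical helper: most significant byte, with the sign bit flipped
    // so that negative values get smaller bucket numbers than positive ones.
    static int topByte(int v) {
        return ((v ^ 0x80000000) >>> 24) & 0xFF;
    }

    static void sort(int[] a) {
        // Count bucket sizes, then turn them into bucket start positions.
        int[] start = new int[257];
        for (int v : a) start[topByte(v) + 1]++;
        for (int b = 0; b < 256; b++) start[b + 1] += start[b];
        // Now start[b] is the first index of bucket b and start[256] == a.length.

        // Single-threaded scatter into the buckets.
        int[] next = Arrays.copyOf(start, 256);
        int[] out = new int[a.length];
        for (int v : a) out[next[topByte(v)]++] = v;
        System.arraycopy(out, 0, a, 0, a.length);

        // The buckets cover disjoint index ranges, so they can be sorted
        // in parallel without any synchronization.
        IntStream.range(0, 256).parallel()
                 .forEach(b -> Arrays.sort(a, start[b], start[b + 1]));
    }
}
```

Because every bucket only holds values with the same top byte, and the buckets are already in the right order relative to each other, sorting each range independently leaves the whole array sorted.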

I am pretty sure that you can’t really parallelize Radix Sort, at least not in the way you are trying to. Someone pointed out that you can do it by divide-and-conquer if you first order by the highest bits, but the variant you are using (least-significant-digit first) processes the lower-order bits first, so you can’t divide-and-conquer it that way. The array can basically be completely permuted after each pass, and each pass depends on the result of the previous one.
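
To illustrate why the order of the passes matters, here is a small sketch with made-up names (PassOrderSketch, countingPass, sortWithPassOrder). It assumes non-negative values that fit in two bytes and runs the same stable counting pass in two different orders. If the threads in the question acquire the lock in an arbitrary order, something like the second case may be what happens in the failing runs.

```java
import java.util.Arrays;

final class PassOrderSketch {

    // Stable counting sort of a by one byte (pass 0 = lowest byte).
    // Assumes non-negative values, so no sign handling is needed here.
    static void countingPass(int[] a, int pass) {
        int[] freq = new int[257];
        for (int v : a) freq[((v >>> (pass * 8)) & 0xFF) + 1]++;
        for (int i = 0; i < 256; i++) freq[i + 1] += freq[i];
        int[] out = new int[a.length];
        for (int v : a) out[freq[(v >>> (pass * 8)) & 0xFF]++] = v;
        System.arraycopy(out, 0, a, 0, a.length);
    }

    // Runs the given passes, in the given order, on a copy of the input.
    static int[] sortWithPassOrder(int[] input, int... passes) {
        int[] a = input.clone();
        for (int p : passes) countingPass(a, p);
        return a;
    }

    public static void main(String[] args) {
        int[] data = {0x0201, 0x0102, 0x0101, 0x0202}; // 513, 258, 257, 514
        // Low byte first, then high byte: prints [257, 258, 513, 514]
        System.out.println(Arrays.toString(sortWithPassOrder(data, 0, 1)));
        // High byte first, then low byte: prints [257, 513, 258, 514]
        System.out.println(Arrays.toString(sortWithPassOrder(data, 1, 0)));
    }
}
```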

Guys, prove me wrong, but I think it’s inherently impossible to parallelize this algorithm the way you are trying to. Maybe you can parallelize the (count) sorting that is done inside each pass, but be aware that ++ is not an atomic operation.
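
One way to sidestep the non-atomic ++ when parallelizing inside a pass is to give every worker its own frequency array and combine them afterwards. The following is only a sketch under my own assumptions: the names (ParallelCountSketch, digit, runAll) are hypothetical, workers must be at least 1, the passes still run strictly one after another, and I have not measured whether it is actually faster than a single-threaded sort.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelCountSketch {

    // Byte number `pass` of v, with the sign bit flipped so negatives sort first.
    static int digit(int v, int pass) {
        return ((v ^ 0x80000000) >>> (pass * 8)) & 0xFF;
    }

    // Submits all tasks and waits until every one of them has finished.
    static void runAll(ExecutorService pool, List<Runnable> tasks) {
        List<Future<?>> futures = new ArrayList<>();
        for (Runnable t : tasks) futures.add(pool.submit(t));
        for (Future<?> f : futures) {
            try {
                f.get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    static void sort(int[] a, int workers) {
        int n = a.length;
        int chunk = (n + workers - 1) / workers;
        int[] bounds = new int[workers + 1];   // worker w owns [bounds[w], bounds[w+1])
        for (int w = 0; w <= workers; w++) bounds[w] = (int) Math.min((long) n, (long) w * chunk);

        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int[] src = a, dst = new int[n];
            for (int pass = 0; pass < 4; pass++) {   // passes stay sequential
                final int p = pass;
                final int[] in = src, out = dst;
                int[][] counts = new int[workers][256];   // one histogram per worker

                // Step 1: each worker counts only its own chunk, so no shared ++.
                List<Runnable> countTasks = new ArrayList<>();
                for (int w = 0; w < workers; w++) {
                    final int id = w;
                    countTasks.add(() -> {
                        for (int i = bounds[id]; i < bounds[id + 1]; i++) counts[id][digit(in[i], p)]++;
                    });
                }
                runAll(pool, countTasks);

                // Step 2: prefix sum ordered by bucket, then by worker. This gives
                // every worker a private output range per bucket and keeps the pass stable.
                int offset = 0;
                for (int b = 0; b < 256; b++) {
                    for (int w = 0; w < workers; w++) {
                        int c = counts[w][b];
                        counts[w][b] = offset;
                        offset += c;
                    }
                }

                // Step 3: each worker scatters its chunk into its private ranges.
                List<Runnable> scatterTasks = new ArrayList<>();
                for (int w = 0; w < workers; w++) {
                    final int id = w;
                    scatterTasks.add(() -> {
                        for (int i = bounds[id]; i < bounds[id + 1]; i++) out[counts[id][digit(in[i], p)]++] = in[i];
                    });
                }
                runAll(pool, scatterTasks);

                int[] t = src; src = dst; dst = t;   // output becomes next pass's input
            }
            // After four passes (an even number of swaps) the result is back in a.
        } finally {
            pool.shutdown();
        }
    }
}
```

It would be called as ParallelCountSketch.sort(data, 4). The design choice here is to trade some extra memory (one 256-entry histogram per worker) for not needing any locks or atomic counters inside the loops.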