Tuesday, January 12, 2010

Reversable sorting algorithm CONTEST

Most of you guys out there know a few things about sorting, there are lots of sorting algorithms and from what I know Quick Sort is the fastest.
More information about sort algorithms can be found on Wikipedia(click for link).

As you know a sort algorithm can sort an array of bytes/chars/integers/etc. in ascending or descending order, but what IF I wish to sort a array of bytes/chars/integers/etc.(of variable length) in any order(ascending or descending) and store some information that helps me restore it to initial state?

Contest rules:
- Develop a sort algorithm which sorts a array of bytes/chars/integers/etc.(in ascending and/or descending order) which stores information that helps you restore the array in initial state.
- The sort information must NOT use more than half the size of the array(i.e. we have a array of bytes of 4096 elements, the sort information must not use more than 2048 bytes)
- If you use code written by other people you must enter a comment before the code in the source code(i.e. this code is created by AUTHOR NAME|link to website on next line the code appears).
- The algorithm must work on any number of elements in a array

The application/unit/source code must be sent to me at this e-mail address: duminicadorin{AT}gmail{DOT}com(you can send me e-mail if you do not understand any of the above rules -- comments are also welcomed).

The winning code will be the fastest algorithm or the algorithm which stores information about the sort in less space.

1st place: a 4GB memory stick and a customized T-Shirt
2nd place: a 2GB memory stick
I will request your address in order to send you the prize IF AND ONLY IF YOU WIN 1st or 2nd place.

NOTE: that the contest ends Friday on 22 Jan. 2010.


  1. Nice challenge, I hope I'll have time to comptete...

    By the way, it rReminds me a cool story. I asked once on a forum on information theory whether a sorted list of words contains more or less information than the same list, but unsorted. I was pretty convinced a sorted dictionary or phone book contains more information since you can retrieve data much easily than if it wasn't sorted. But I got this answer : "an contains information list more unsorted" :-)

  2. I wish you good luck in the competition!

  3. 10 mins ago I thought it was impossible, now I know how to do it!

    Just need a day or two with 50 hours/day, or a time machine ...

  4. :) the contest is open until 22. Jan.(this month) 2010 this month, you got more than enough time from what your saying...

  5. Well I had more than enough time to think about it, but I didn't code anything. Because in fact there is no solution.

    Sorting an array of n records with a good algorithm looses n.log(n) bits of information, which are the results of the n.log(n) comparisons. You could imagine storing those and write an "unsort" function, but it's even simpler to realize that the original index of each element also requires log(n) bits, so a simple table of the initial index of each elements also takes n.log(n) bits !

    So in short there is no way to fulfil the requirement that "The sort information must NOT use more than half the size of the array(i.e. we have a array of bytes of 4096 elements, the sort information must not use more than 2048 bytes)" : for 4096 elements, you need at least 4096*log(4096)=4096*12=6K bits.

    At one point I thought one could use the Burrows-Wheeler transform (http://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform), which makes data "more compressible", but it doesn't perform a real sort.

    However, I believe there might be specific solutions based on compression for data with high redundancy.

    Also, I'd be very pleased to be wrong and would really love to see a solution to this problem, because it would be equivalent to a sort algorithm with O(n) complexity.

  6. @drgoulu Oh man! if you could have just e-mailed the this in time you would have been declared the winner, unfortunately no one e-mailed, I'm am not sure about your calculations, but the point is that there is no solution from what I am awear.
    I already told that the solution is "out of the box" by that I mean that anyone who participated in the contest should have thought first if it's possible first and then start to waste time, anyways this is a good exercise for beginners, "think first, act after".
    I really like the way you think, I hope you participate on Delphi Torrent client contest because it's doable :) and it should not take too much time, have fun and I'm sorry you didn't e-mailed me, because you would have won first prize!

  7. Hmm.. this should be quite easy to do. His calculations are probably not correct, nor is quicksort a correct sorting algorithm. Quicksort has a worst case running time of n^2.

    A sorting algorithm which has a constant running time of N * Log N will solve this problem in 1850 bytes of stored sorting information.

    Reverse sorting is quite easy to implement for any comparision algorithm.

    The real question here is why would somebody want such an algorithm:

    Compression comes to mind.

    The sorting elements would be easy to compress with RLE.

    Then only the sorting information needs to be stored.

    So let's assume the 256 bytes are compressed to the following:

    256 byte content values at most.

    These can be reduceed to just 1 bit indicating if the byte value/index was in the content or not.

    So this leaves 256 bits which would consume at most 32 bytes.

    This then leaves the lengths to be encoded for the RLE.

    Let's assume 3 bytes per length value which would be enough to describe a video codec resolution of 1920x1200. This would cost at most: 3 * 256 = 768

    So total consumption of RLE compression: 32 + 768 bytes.
    1850 bytes must be added towards this:
    total: 2650

    So this is a pretty good compression for such a small byte array.

    Let's examine what happens if a true video codec window/resolution would be compressed:

    1920x1200 = 2304000 pixels.

    2.198.873.944 bytes would be needed to store sorting information. (n*log n grows exponentially).

    This is close to 2GB of sorting information. multiplied by 3 color channels would be 6 GB.

    While 1920x1200x3 would be close to only 180 MB if encoded rawly.

    So it's pretty clear... that this idea is bad for compression purposes.

    It grows out of control.

    So any sorting algorithm will have to be N... linear to be of any use.

    Only algorithm that comes close to that is radix/bucket sort... but it would require more bits... than just swap yes/no per comparision.

    So at first glance this contest seems pretty useless :)

  8. Then again... if the video frame is split up into blocks of 4096 bytes.

    It may still be usefull... interesting...


Blogroll(General programming and Delphi feeds)