
LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   C: A Fastest Array Sorting, faster than bubble sort ? (https://www.linuxquestions.org/questions/programming-9/c-a-fastest-array-sorting-faster-than-bubble-sort-4175469391/)

Xeratul 07-12-2013 06:30 AM

C: A Fastest Array Sorting, faster than bubble sort ?
 
Hi,

For sorting an array of strings, the simplest and best-known algorithm is bubble sort.

Would you mind proposing code that does the same thing but faster?

I don't sort on the whole string, just the first few characters, as shown below:

many thanks

Code:

void sort_bubble( void ) {
  int i , j , inversion ;
  char stringtmpswap[PATH_MAX];

  mvprintw(0,0,"Sorting File (Bubble Sort). Please wait (actually long)...");
  refresh();

  /* one complete bubble sort per character column, from column 6 down to 0 */
  for(j=6 ; (j>=0) ; j--)
  {
    inversion=1 ;
    while( inversion == 1 ) {
      inversion = 0 ;
      /* stop one short of the last entry: the loop body reads
         datalistfile[i+1] (entries assumed to sit at indices
         0..datalistfile_max_numberitems-1) */
      for(i=0 ; (i<datalistfile_max_numberitems-1) ; i++)
      {
        if ( datalistfile[i][j]  >  datalistfile[i+1][j]  )
        {
          strncpy( stringtmpswap , datalistfile[i] , PATH_MAX);
          strncpy(  datalistfile[i] , datalistfile[i+1]  , PATH_MAX);
          strncpy(  datalistfile[i+1] , stringtmpswap  , PATH_MAX);
          inversion=1;
        }
      }
    }
  }
  mvprintw(1,0, "Sorted!! ");
  refresh();
}


mina86 07-12-2013 06:35 AM

http://en.wikipedia.org/wiki/Sorting_algorithm

psionl0 07-12-2013 06:47 AM

Quicksort, Mergesort and Heapsort all run faster than bubble sort.

Also, if datalistfile is an array of character pointers rather than character arrays, then you wouldn't need to do all those strncpy calls. You could just swap pointers datalistfile[i] and datalistfile[i+1].

Xeratul 07-12-2013 11:39 AM

Quote:

Originally Posted by psionl0 (Post 4989019)
Quicksort, Mergesort and Heapsort all run faster than bubble sort.

Also, if datalistfile is an array of character pointers rather than character arrays, then you wouldn't need to do all those strncpy calls. You could just swap pointers datalistfile[i] and datalistfile[i+1].

Hi, thanks. Could you explain what you mean by this, please?
"You could just swap pointers datalistfile[i] and datalistfile[i+1]"

psionl0 07-12-2013 06:31 PM

Quote:

Originally Posted by Xeratul (Post 4989167)
Hi, thanks. Could you explain what you mean by this, please?
"You could just swap pointers datalistfile[i] and datalistfile[i+1]"

Code:

char *datalistfile[datalistfile_max_numberitems], *t;

. . . // allocate memory for strings, read strings, start sort etc

t = datalistfile[i];
datalistfile[i] = datalistfile[i+1];
datalistfile[i+1] = t;

. . .

Of course, unless you are running bubble sort, you would not necessarily be swapping adjacent strings.

EDIT: Looking at your original code, I see that you have written
Code:

for(j=6 ; (j>=0) ; j--)
{
  ... // complete bubble sort
}

No wonder your sort is "(actually long)..."! You are running a complete bubble sort 7 times in succession, once for each character position.

Much better to use
Code:

...
if (strncmp(datalistfile[i] , datalistfile[i+1], 7) > 0) ...
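
Putting the two suggestions together, a minimal sketch of the whole function might look like this (assuming datalistfile is a global array of char pointers holding datalistfile_max_numberitems entries, which is my reading of the OP's code rather than something shown in the thread):

Code:

/* Sketch: bubble sort with pointer swaps and one 7-character
   comparison per pair instead of 7 full sorting passes. */
void sort_bubble(void)
{
  int i, inversion = 1;
  char *t;

  while (inversion) {
    inversion = 0;
    /* stop one short of the last entry: the body reads datalistfile[i+1] */
    for (i = 0; i < datalistfile_max_numberitems - 1; i++) {
      if (strncmp(datalistfile[i], datalistfile[i+1], 7) > 0) {
        t = datalistfile[i];                  /* swap the pointers,      */
        datalistfile[i] = datalistfile[i+1];  /* not the string contents */
        datalistfile[i+1] = t;
        inversion = 1;
      }
    }
  }
}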


konsolebox 07-12-2013 07:22 PM

Quicksort is my personal favorite. The concept is discussed well in the book I bought, A Book on C, 4th Edition. I even made an implementation of it for Bash here, which I now use in my scripts.
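
In C you often don't even have to write it yourself: the standard library's qsort takes a comparison callback (the C standard doesn't say which algorithm qsort uses internally, but implementations are typically quicksort variants or similarly fast). A minimal sketch with made-up file names, comparing only the first 7 characters as in the original post:

Code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* qsort passes pointers to the array elements, so for an array of
   char * the callback arguments are really char ** in disguise. */
static int cmp_first7(const void *a, const void *b)
{
  const char *sa = *(const char * const *)a;
  const char *sb = *(const char * const *)b;
  return strncmp(sa, sb, 7);  /* only the first 7 chars, as in the OP */
}

int main(void)
{
  const char *names[] = { "zebra.txt", "apple.txt", "mango.txt" };
  size_t n = sizeof names / sizeof names[0];

  qsort(names, n, sizeof names[0], cmp_first7);

  for (size_t i = 0; i < n; i++)
    puts(names[i]);
  return 0;
}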

ta0kira 07-12-2013 07:36 PM

Quote:

Originally Posted by konsolebox (Post 4989361)
Quicksort is my personal favorite. The concept is discussed well in the book I bought, A Book on C, 4th Edition. I even made an implementation of it for Bash here, which I now use in my scripts.

Quicksort is great, unless you have non-identical elements that compare as equal, in which case you might consider mergesort, which is stable. Contrary to all of the examples I've seen, mergesort can be done without recursion, although it always requires a buffer the same size as the array being sorted, and it always performs the maximum number of steps regardless of the element values.
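
For the curious, here is a minimal sketch of that non-recursive (bottom-up) form, on an int array for brevity: runs of width 1, 2, 4, ... are merged on each pass, and buf is the full-size scratch buffer mentioned above.

Code:

#include <stdlib.h>
#include <string.h>

/* Sketch: bottom-up (non-recursive) mergesort on ints; runs of
   width 1, 2, 4, ... are merged from a[] into buf[] on each pass. */
static void merge_sort(int *a, size_t n)
{
  int *buf = malloc(n * sizeof *a);
  if (buf == NULL)
    return;  /* real code would report the allocation failure */

  for (size_t width = 1; width < n; width *= 2) {
    for (size_t lo = 0; lo < n; lo += 2 * width) {
      size_t mid = lo + width < n ? lo + width : n;
      size_t hi = lo + 2 * width < n ? lo + 2 * width : n;
      size_t i = lo, j = mid, k = lo;

      /* merge a[lo..mid) and a[mid..hi) into buf[lo..hi);
         "<=" keeps equal elements in input order (stability) */
      while (i < mid && j < hi)
        buf[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
      while (i < mid) buf[k++] = a[i++];
      while (j < hi)  buf[k++] = a[j++];
    }
    memcpy(a, buf, n * sizeof *a);
  }
  free(buf);
}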

Kevin Barry

konsolebox 07-12-2013 08:36 PM

Quote:

Originally Posted by ta0kira (Post 4989362)
Quicksort is great, unless you have non-identical elements that compare as equal

That depends on the data and the implementation, I guess. I haven't seen any trouble with quicksort. If you're referring to objects sorted by identity and serialization, I guess that would apply, but for strings or numbers, which are the common case, I don't think it will.

---- Add ----

And seeing it compared with mergesort, I see mergesort as a more conservative, less aggressive algorithm, which isn't to my taste. I think it was also discussed in A Book on C along with Quicksort, and if it was, I must have had the same reasons for picking Quicksort over it.

As for the problem of possibly having non-identical elements that compare as equal, the fault would lie with the comparison function and not really with Quicksort itself.

And thanks to this discussion I found this interesting algorithm: http://en.wikipedia.org/wiki/Introsort.

ta0kira 07-13-2013 06:51 AM

Quote:

Originally Posted by konsolebox (Post 4989382)
As for the problem of possibly having non-identical elements that compare as equal, the fault would lie with the comparison function and not really with Quicksort itself.

Not true. Suppose I have a spreadsheet and I want to sort rows by the values in column A. Suppose rows X and Y have the same value for A, and X comes before Y. With a stable algorithm, X will remain before Y, whereas with an unstable algorithm their order will depend on artifacts of the algorithm. So, with a stable algorithm I can sort with A, then B, then C, and the result will be the same as sorting with C, B, and A all at once, whereas with an unstable algorithm the result will be the same as sorting with only C.

Kevin Barry

psionl0 07-13-2013 11:30 PM

Quote:

Originally Posted by konsolebox (Post 4989361)
Quicksort is my personal favorite.

Quicksort seems to be a universal favourite. However, like all comparison-based sorts, the minimum time it takes to sort a large set of elements is proportional to N * log(N), where N is the number of elements. This is much faster than the N^2 time taken by more primitive sorts like Insertion Sort or Bubble Sort, but we can do better.

Bucket Sort or Counting Sort can be made to run in linear time since they don't rely on comparisons. Counting sort is also stable since it doesn't change the order of two elements with the same key.
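
A minimal sketch of a counting sort on single-byte keys; note how the output is filled from the back so that equal keys keep their original order:

Code:

#include <stddef.h>

/* Sketch: stable counting sort of n bytes with keys 0..255,
   O(n + 256) time and no comparisons at all. */
static void counting_sort(const unsigned char *in, unsigned char *out, size_t n)
{
  size_t count[256] = {0};

  for (size_t i = 0; i < n; i++)        /* histogram of the keys */
    count[in[i]]++;

  for (int k = 1; k < 256; k++)         /* prefix sums: count[k] becomes the  */
    count[k] += count[k - 1];           /* index one past the last slot for k */

  for (size_t i = n; i-- > 0; )         /* walk backwards to keep it stable */
    out[--count[in[i]]] = in[i];
}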

Radix Sort is a method of sorting large integers or strings by doing a counting sort digit by digit starting (usually) with the least significant digit first. http://www.jimloy.com/computer/radix.htm shows a simple demonstration of how a radix sort works.
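
Interestingly, the OP's loop over character positions 6 down to 0 already has exactly this LSD structure, with a (stable) bubble sort as the per-character pass. Replacing that pass with a counting pass gives a real radix sort. A sketch, assuming every string has at least 7 characters:

Code:

#include <stdlib.h>
#include <string.h>

/* Sketch: LSD radix sort of string pointers on their first 7
   characters, one stable counting pass per character position.
   Assumes every string is at least 7 characters long. */
static void radix_sort7(const char **a, size_t n)
{
  const char **tmp = malloc(n * sizeof *a);
  size_t count[256];

  if (tmp == NULL)
    return;

  for (int pos = 6; pos >= 0; pos--) {  /* least significant position first */
    memset(count, 0, sizeof count);
    for (size_t i = 0; i < n; i++)
      count[(unsigned char)a[i][pos]]++;
    for (int k = 1; k < 256; k++)
      count[k] += count[k - 1];
    for (size_t i = n; i-- > 0; )       /* backwards keeps each pass stable */
      tmp[--count[(unsigned char)a[i][pos]]] = a[i];
    memcpy(a, tmp, n * sizeof *a);
  }
  free(tmp);
}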

konsolebox 07-14-2013 07:43 PM

Quote:

Originally Posted by ta0kira (Post 4989582)
Not true. Suppose I have a spreadsheet and I want to sort rows by the values in column A. Suppose rows X and Y have the same value for A, and X comes before Y. With a stable algorithm, X will remain before Y, whereas with an unstable algorithm their order will depend on artifacts of the algorithm.

Well, like I said, it would depend on the comparison function. I don't see it as essential to preserve the original order of two equal values unless the object they belong to, such as a linked list of cells within a row, would be affected. If that is the case, the comparison function would also have to consider the original order of those two objects within that list. If the objects are ordered through sort IDs instead of lists, which is rare, then the function would also have to consider the ID.

If an initial mapping is still required to make that work, then perhaps you're correct.

Quote:

So, with a stable algorithm I can sort with A, then B, then C, and the result will be the same as sorting with C, B, and A all at once, whereas with an unstable algorithm the result will be the same as sorting with only C.
It just seems that this is easier to implement, or more natural, with stable sorts.

konsolebox 07-14-2013 08:05 PM

Quote:

Originally Posted by psionl0 (Post 4989940)
Quicksort seems to be a universal favourite. ... but we can do better.

Well, Quicksort was invented in 1960, so it's no longer unusual to see better algorithms or improved variations of it. I did say that it was my favorite, but that doesn't mean I'm not expecting better algorithms to surpass it.

Then again, if I consider the balance between implementation complexity and speed, for ordinary elements I would go for Quicksort.

Mergesort seems worth considering for objects and linked lists.

sundialsvcs 07-14-2013 11:10 PM

Dr. Knuth wrote an entire book called Sorting and Searching.

mina86 07-15-2013 07:41 AM

Really? No one has mentioned that QuickSort is O(n^2) yet?[1] In the worst case it degenerates to selection sort. That's why I much prefer Heap Sort (when I don't care whether the sort is stable; and even if I do, I would probably just number the elements and sort by a double key).

[1] Unless one uses the median-of-medians approach, but that's just too ugly for my taste.
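
That numbering trick is easy to sketch: carry each element's original position alongside it and use it as a tiebreaker, which makes any comparison sort behave stably (item and cmp_stable below are made-up names, not from the thread):

Code:

#include <stdlib.h>
#include <string.h>

/* Sketch: make an unstable sort (qsort here) behave stably by
   numbering the elements and breaking ties on the original index. */
struct item {
  const char *str;  /* the actual sort key */
  size_t      idx;  /* position before sorting, filled with 0..n-1 */
};

static int cmp_stable(const void *a, const void *b)
{
  const struct item *x = a, *y = b;
  int c = strcmp(x->str, y->str);
  if (c != 0)
    return c;
  /* equal keys: fall back to the original order */
  return (x->idx > y->idx) - (x->idx < y->idx);
}

Since no two items share an idx, the comparator never reports a tie, so equal keys come out in exactly their input order.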

johnsfine 07-15-2013 08:32 AM

Quote:

Originally Posted by sundialsvcs (Post 4990379)
Dr. Knuth wrote an entire book called Sorting and Searching.

I learned about sorting from that book long ago, but I wouldn't recommend it. He does a poor job of describing the methods, focusing instead on analyzing their performance. For a mathematician planning to invent a new sorting method, that focus on analysis is an important starting point.

For a software engineer planning to select and code a sorting method, you want a clear description of the method and a clear statement of the performance bottom line reached from all that analysis. Knuth's pseudocode implementations are all lame, and while his performance-analysis math may be better than anything I could do, it is just a distraction from the algorithm and bottom-line performance that I would want.

Before the internet, Knuth was the best you could easily find in an ordinary library. After the internet existed, it was time to forget Knuth.

