C: A Fastest Array Sorting, faster than bubble sort ?

ProgrammingThis forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices

Welcome to LinuxQuestions.org, a friendly and active Linux Community.

You are currently viewing LQ as a guest. By joining our community you will have the ability to post topics, receive our newsletter, use the advanced search, subscribe to threads and access many other special features. Registration is quick, simple and absolutely free. Join our community today!

Note that registered members see fewer ads, and ContentLink is completely disabled once you log in.

If you have any problems with the registration process or your account login, please contact us. If you need to reset your password, click here.

Having a problem logging in? Please visit this page to clear all LQ-related cookies.

Introduction to Linux - A Hands on Guide

This guide was created as an overview of the Linux Operating System, geared toward new users as an exploration tour and getting started guide, with exercises at the end of each chapter.
For more advanced trainees it can be a desktop reference, and a collection of the base knowledge needed to proceed with system and network administration. This book contains many real life examples derived from the author's experience as a Linux system and network administrator, trainer and consultant. They hope these examples will help you to get a better understanding of the Linux system and that you feel encouraged to try out things on your own.

Click Here to receive this Complete Guide absolutely free.

Also, if datalistfile is an array of character pointers rather than character arrays, then you wouldn't need to do all those strncpy's. You could just swap pointers datalistfile[i] and datalistfile[i+1]

Also, if datalistfile is an array of character pointers rather than character arrays, then you wouldn't need to do all those strncpy's. You could just swap pointers datalistfile[i] and datalistfile[i+1]

Hi, thanks. In which way would you mean please?
"You could just swap pointers datalistfile[i] and datalistfile[i+1]"

No wonder your sort is "(actually long)..."! You are running a complete bubble sort 7 times in succession based on each individual character comparison.

Much better to use

Code:

...
if (strncmp(datalistfile[i] , datalistfile[i+1], 7) > 0) ...

Quicksort is my personal favorite. A concept of it has been discussed well on the book I bought which is A Book on C 4th Edition. I even made an implementation of it for Bash here which now I used in my scripts.

Quicksort is my personal favorite. A concept of it has been discussed well on the book I bought which is A Book on C 4th Edition. I even made an implementation of it for Bash here which now I used in my scripts.

Quicksort is great, unless you have non-identical elements that compare as equal, in which case you might consider mergesort, which is stable. Contrary to all of the examples I've seen, mergesort can be done without recursion, although it always requires a buffer the same size as the array being sorted and it always performs the max number of steps regardless of the element values.

Quicksort is great, unless you have non-identical elements that compare as equal

That depends on the material considered and implementation I guess. I haven't seen any trouble with quicksort. If you're referring to objects sorted by identity and serialization I guess that would apply. But for strings or numbers which are common I don't think it will.

---- Add ----

And seeing the comparison of it with mergesort, I see mergesort as a more conservative and less aggressive algorithm, which isn't my taste. I think it was also discussed in the book A Book on C along with Quicksort and if it was I must have had the same reasons why I picked Quicksort over it.

About the problem of possibly having non-identical elements that compare as equal, the fault would already be about the comparer function and not really Quicksort itself.

About the problem of possibly having non-identical elements that compare as equal, the fault would already be about the comparer function and not really Quicksort itself.

Not true. Suppose I have a spreadsheet and I want to sort rows by the values in column A. Suppose rows X and Y have the same value for A, and X comes before Y. With a stable algorithm, X will remain before Y, whereas with an unstable algorithm their order will depend on artifacts of the algorithm. So, with a stable algorithm I can sort with A, then B, then C, and the result will be the same as sorting with C, B, and A all at once, whereas with an unstable algorithm the result will be the same as sorting with only C.

Quicksort seems to be a universal favourite. However, like all comparison-based sorts, the minimum time taken to sort a large set of elements is proportional to N * Log(N) where N is the number of elements. This is much faster than the N^2 time taken by more primitive sorts like Insertion Sort or Bubble Sort but we can do better.

Bucket Sort or Counting Sort can be made to run in linear time since they don't rely on comparisons. Counting sort is also stable since it doesn't change the order of two elements with the same key.

Radix Sort is a method of sorting large integers or strings by doing a counting sort digit by digit starting (usually) with the least significant digit first. http://www.jimloy.com/computer/radix.htm shows a simple demonstration of how a radix sort works.

Not true. Suppose I have a spreadsheet and I want to sort rows by the values in column A. Suppose rows X and Y have the same value for A, and X comes before Y. With a stable algorithm, X will remain before Y, whereas with an unstable algorithm their order will depend on artifacts of the algorithm.

Well like I said it would depend on the comparer function. I don't see it essential to consider the original order of two similar values unless the object in which they belong like in a link list of a set of cells within a row would be affected. If that is the case the comparer function would have to consider as well the original order of those two objects within that list. If the object is arranged through sort IDs instead of lists which is rare then the function would also have to consider the ID.

If an initial mapping is still required to make that work then perhaps you're correct.

Quote:

So, with a stable algorithm I can sort with A, then B, then C, and the result will be the same as sorting with C, B, and A all at once, whereas with an unstable algorithm the result will be the same as sorting with only C.

It seems only that it would be easier to implement or is more natural on stable sorts.

Quicksort seems to be a universal favourite. but we can do better.

Well Quicksort has been invented in 1960 so it's no longer unusual to seee better algorithms or improved variations of it. I did said that it was my favorite, but it doesn't mean that I'm no longer expecting other better algorithms over it.

Yet again if I would consider the balance between complexity of implementation and speed, with natural elements I would go for Quicksort.

Mergesort seems to be something to be considered with objects and linked lists.

Really? No one mentioned that QuickSort is O(n^2) yet?[1] In the worst case it degenerates to select sort. That's why I much prefer Heap Sort (if I don't care if sorting is stable, but even if I do, I would probably just number the elements and sort by double key).

[1] Unless one uses median of medians approach, but then it's just ugly for my taste.

Dr. Knuth wrote an entire book called Sorting and Searching.

I learned about sorting from that book long ago. But I wouldn't recommend it.
He does a poor job of describing the methods. He focuses on the analysis of the performance. For a mathematician planning to invent a new sorting method, that focus on analysis is an important starting point.

For a software engineer planning to select and code a sorting method, you want a clear description of the method and a clear description of the performance bottom line reached from all that analysis. Knuth's pseudo code implementations are all lame and his performance analysis math may be better than I would do, but it is just a distraction from algorithm plus bottom line performance that I would want.

Before the internet, Knuth was the best you could easily find in an ordinary library. After the internet existed, it was time to forget Knuth.

LinuxQuestions.org is looking for people interested in writing
Editorials, Articles, Reviews, and more. If you'd like to contribute
content, let us know.