LinuxQuestions.org
Visit Jeremy's Blog.
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices


Reply
  Search this Thread
Old 01-30-2015, 10:15 AM   #1
andrew.comly
Member
 
Registered: Dec 2012
Distribution: Trisquel-Mini 7.0, Lubuntu 14.04, Debian lxde 8.0
Posts: 311
Blog Entries: 2

Rep: Reputation: 16
Question C programming index number confusion


C programming / selection sort / index number confusion

The below code uses functions to create a selection sort. It's source is http://www.sanfoundry.com/c-program-...ing-functions/

Code:
/*
   C program for SELECTION sort that uses the following functions to:
     a) Find maximum of elements
     b) Swap two elements
 */
#include<stdio.h>

int findmax(int b[],int size);
void swap(int b[],int size);

void main()
{
    int n;
    printf("Enter the value of n: ");
    scanf("%d",&n);

    int array[n];
    int i,j,temp;

    puts("Enter the array's numbers: ");
    for(i=0;i<n;i++)
    {
        scanf("%d",&array[i]);
    }
    printf("The unsorted array elements are: ");
    for(i=0;i<n;i++)
    {
        printf("%d ",array[i]);
    }
    printf("\n");
    //Selection Sort Begins
    swap(array,n);
    printf("  The sorted array elements are: ");
    for(i=0;i<n;i++)
    {
        printf("%d ",array[i]);
    }
    printf("\n");
}       //end main

//Function to find the maximum value
int findmax(int b[],int size)
{
    int max, j;
    max=0;

    for(j=1;j<=size;j++)
    {
        if(b[j]>b[max])
        {
            max=j;
        }
    }
    return(max);
}

void swap(int b[],int size)
{
    int temp, big, j;
    for(j=size-1;j>0;j--)       
    {
        big=findmax(b,j);
        temp=b[j];
        b[j]=b[big];
        b[big]=temp;
    }
    return;
}
Specifically, why does "int findmax(int b[],int size)" specify 'j' in "for(j=1;j<=size;j++)"?
Just think what 'j<=size' implies. if you have a 5 element array, e.g. int b[5], then int b[5] declares (as integer) the array b[] which is comprised of b[0], b[1], b[2], b[3] and finally b[4]. THERE IS NO b[size]!!!!! I even wrote another program where I enter in the elements of an array, then I enter in the element number. Enter in a '0' and you get back the 1st element(NOT the 0th element). Enter in a 2 and an array will give you back the 3rd element. For a array of size 5 (int array[5]), enter in a five, and then you're in trouble, the computer will produce an erroneous number. This is simple to see, but here the above program for some reason defies all c programming logic and states "for(j=1;j<=size...", which implies to retrieve the element n+1 in an array of size n.

Does anyone know why j<=size makes the program work yet when I edit the program to "j<size" the program doesn't sort correctly?
 
Old 01-30-2015, 10:23 AM   #2
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,851
Blog Entries: 1

Rep: Reputation: 1868Reputation: 1868Reputation: 1868Reputation: 1868Reputation: 1868Reputation: 1868Reputation: 1868Reputation: 1868Reputation: 1868Reputation: 1868Reputation: 1868
> Specifically, why does "int findmax(int b[],int size)" specify 'j' in "for(j=1;j<=size;j++)"?
> Just think what 'j<=size' implies.

It doesn't implies that, because 'size' isn't the size, it's the 'highest index'. You'd better use clearer variable names, eg:

Code:
//Function to find the maximum value
int findmax (const int b[], int highest_index)
{
    int maxpos, j;

    if (highest_index<1) exit (77);

    maxpos= 0;
    for(j=1;j<=highest_index;j++) {
        if(b[j]>b[maxpos]) {
            maxpos=j;
        }
    }
    return maxpos;
}

Last edited by NevemTeve; 01-30-2015 at 10:24 AM.
 
2 members found this post helpful.
Old 01-30-2015, 12:48 PM   #3
smallpond
Senior Member
 
Registered: Feb 2011
Location: Massachusetts, USA
Distribution: Fedora
Posts: 4,125

Rep: Reputation: 1260Reputation: 1260Reputation: 1260Reputation: 1260Reputation: 1260Reputation: 1260Reputation: 1260Reputation: 1260Reputation: 1260
The reason it works is that it starts at size-1

Code:
for(j=size-1;j>0;j--)
It is always comparing the whole list to the last element of the list. Which also means this is not a good algorithm for very large lists.
 
1 members found this post helpful.
Old 01-30-2015, 02:22 PM   #4
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,687

Rep: Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274
I would not say this code works. In some cases probably it produces the required result, but in some cases that code will simply die.
size means the number of elements in the array, and the index is running from 0 to size-1. Therefore the for loop should be:
Code:
for(j=0;j<size;j++)
but j=0 can be skipped in your case. Therefore:
Code:
//Function to find the maximum value
int findmax(int b[],int size)
{
    int max, j;
    max=0;

    # just exit in this case 
    if (size < 1) exit (77);
    
    for(j=1;j<size;j++)
    {
        if(b[j]>b[max])
        {
            max=j;
        }
    }
    return(max);
}
but in some cases - like in the original example you linked - you pass the index of the last item (that is not equal to the size, but less), and therefore that code will work.
 
1 members found this post helpful.
Old 01-30-2015, 03:45 PM   #5
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197
The key to the answer is that C doesn't care what a variable's name means. It only cares what a variable's usage means.

In swap() we have a variable named "size" and the way it is used, it actually is the size. But in findmax() we have another variable named "size" but based on the way it is used, it should have been named "last".

In each pass through the loop in swap() the number passed to findmax() is the index of the last unsorted position. It is not the size of the unsorted portion. It is one less than the size of the unsorted portion.

Bad variable names are very confusing to anyone who tries to read the code. So even though the code works, the author should be ashamed.

Quote:
Originally Posted by smallpond View Post
It is always comparing the whole list to the last element of the list.
Not in any obvious meaning of "always" or "last". I understand you basic objection. But you aren't expressing it very well.

Quote:
Which also means this is not a good algorithm for very large lists.
I'm sure it was never intended as a good algorithm for very large lists. I expect it was intended as an example of implementing a simple algorithm in C (meant to teach C programming, not to teach sorting algorithms).

As an example used to teach C programming, it is terrible for several reasons. But not because it is a poor sorting algorithm. Selecting best algorithms probably should not be part of early teaching of C. Best algorithm is largely a distraction from learning the basics of competent coding.

Using meaningful, rather than misleading, variable names should be part of teaching C coding from the start.

Expressing ranges as first,limit in most cases and first,size in others and almost never as first,last is an important aspect of decent C style, that should be learned at first by example and rote, because using first,last for a range (or using last alone when first is implied) is a very bad practice whose flaws will not be explainable to beginners nor matter in trivial examples. But if you are in the habit of using last (rather than size or limit) to end a range, by the time you reach serious work where that habit creates a mess, it will be too late to change.

Using int rather than unsigned for inherently non negative quantities is another bad habit, best avoided from the start.

Quote:
Originally Posted by pan64 View Post
but in some cases that code will simply die.
That is a best a strange point of view, rather than a true statement. In effect, you are saying the function is coded incorrectly and then used incorrectly in a way that makes it work. You are saying the function could be called in other ways that would make it crash. But that is true of most functions. C is not intended for strictly defensive programming.

I see that as a badly misnamed variable. "size" in that function is not a size. You see the variable name as a contract, which is then violated by both the function itself and its caller.

Last edited by johnsfine; 01-30-2015 at 04:12 PM.
 
1 members found this post helpful.
Old 01-31-2015, 07:53 AM   #6
andrew.comly
Member
 
Registered: Dec 2012
Distribution: Trisquel-Mini 7.0, Lubuntu 14.04, Debian lxde 8.0
Posts: 311

Original Poster
Blog Entries: 2

Rep: Reputation: 16
Talking Thanks everyone, Core problem solved!

Thanks to all, I was indeed mislead by the example's use certain misleading words used as variables. More specifically the formal function definition for int findmax(int b, size), the "size" variable mixed me up. I thought the variable going into this function was "size". Actually the variable going into the function findmax was "j" from the swap function's "j" variable, which as all know so well from the previous line of code is "for(j=size-1;j>0;j--)" thus this variable only ranges from size-1 to 1, thus a span falling short of the array's size by 1 element! Naturally then Sanfoundry had written "j<=size" in findmax function's "for(j=1;j<=size;j++)" statement, and not "j<size".

I should have some more time tomorrow evening to look some more at johnsfine and pan64 at more detail to enable me to make the program even better for pedagogical purposes, but the core question has most certainly been solved.

Thanks everyone so much!

Last edited by andrew.comly; 02-01-2015 at 05:58 AM. Reason: grammar
 
Old 02-01-2015, 08:21 AM   #7
andrew.comly
Member
 
Registered: Dec 2012
Distribution: Trisquel-Mini 7.0, Lubuntu 14.04, Debian lxde 8.0
Posts: 311

Original Poster
Blog Entries: 2

Rep: Reputation: 16
johnsfine,
Thanks so much for your pointers, and the final program is as below.

Code:
/*
   C program for SELECTION sort that uses the following functions to:
     a) Find maximum of elements
     b) Swap two elements
 */
#include<stdio.h>

unsigned findmax(int b[],unsigned si);
void swap(int b[],unsigned limit);

void main()
{
    unsigned lmt;       //limit of array b's size
    printf("Enter the value of the array: ");
    scanf("%d",&lmt);

    int array[lmt];
    unsigned i;

    puts("Enter the array's numbers: ");
    for(i=0;i<lmt;i++)
    {
        scanf("%d",&array[i]);
    }
    printf("The unsorted array elements are: ");
    for(i=0;i<lmt;i++)
    {
        printf("%d ",array[i]);
    }
    printf("\n");
    //Selection Sort Begins
    swap(array,lmt);
    printf("  The sorted array elements are: ");
    for(i=0;i<lmt;i++)
    {
        printf("%d ",array[i]);
    }
    printf("\n");
}       //end main

//Function to find the maximum value
unsigned findmax(int b[],unsigned si)
{
    unsigned maxpos, m;     //m= findmax's index
    maxpos=0;

    for(m=1;m<=si;m++)
    {
        if(b[m]>b[maxpos])
        {
            maxpos=m;
        }
    }
    return(maxpos);
}

//Function to swap variables
void swap(int b[],unsigned limit)
{
    int temp, big;
    unsigned si;             //swap's index]
    for(si=limit-1;si>0;si--)
        /*
        1)'limit-1' because an array a[n] only has n elements going from a[0],a[1],...,a[n-1]
        2)si>0  and not si>=0 because 5 elements at the most only need 4 rounds of swaping(just experiment with cut out numbers), hence n elements at most need n-1 times of swapping.
        */
    {
        big=findmax(b,si);
        temp=b[si];
        b[si]=b[big];
        b[big]=temp;
    }
    return;
}

/*
TESTING
I thought about making both findmax and swap both go forward rather than the current way where swap goes backwards, but that way then at the beginning of the loop then find max gets fed a really low swap index number which ends up blocking findmax from returning the absolute maximum early on.  If the abs. max isn't swapped early on then each future loop will get stuck sorting the abs. max not to the absolute end memory address but merely one more memory address upward each time, causing the remaining numbers to remain totally unsorted.
*/
Thanks soooo much!!
 
Old 02-01-2015, 08:23 AM   #8
andrew.comly
Member
 
Registered: Dec 2012
Distribution: Trisquel-Mini 7.0, Lubuntu 14.04, Debian lxde 8.0
Posts: 311

Original Poster
Blog Entries: 2

Rep: Reputation: 16
Question

Quote:
Originally Posted by pan64 View Post
I would not say this code works. In some cases probably it produces the required result, but in some cases that code will simply die.
How specifically doesn't the code work, i.e. could you give an example?
 
Old 02-01-2015, 09:56 AM   #9
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197
I don't think changing the name of "size" in swap() to "limit" was helpful. As the size of the unsorted portion, I think the name "size" was just as meaningful as "limit".

I'm less happy with changing the name of "size" in findmax() to "si". That does get rid of the agressively wrong name. But what does "si" mean? If you keep the bad design in which the variable represents a last index, you should make that clearer by naming it "last". Passing a "last" index in C is bad practice, but naming it unclearly makes that worse, not better. The better choice is to change the code to not pass a last index:

Code:
/*
   C program for SELECTION sort that uses the following functions to:
     a) Find maximum of elements
     b) Swap two elements
 */
#include<stdio.h>

unsigned findmax(int b[],unsigned size);  /* Takes location and size of unsorted portion */
void swap(int b[],unsigned size);  /* Takes location and size of unsorted portion */

void main()
{
    unsigned size;       //array b's size
    printf("Enter the value of the array: ");
    scanf("%d",&size);

    int array[size];
    unsigned i;

    puts("Enter the array's numbers: ");
    for(i=0;i<size;i++)
    {
        scanf("%d",&array[i]);
    }
    printf("The unsorted array elements are: ");
    for(i=0;i<size;i++)
    {
        printf("%d ",array[i]);
    }
    printf("\n");
    //Selection Sort Begins
    swap(array,size);
    printf("  The sorted array elements are: ");
    for(i=0;i<size;i++)
    {
        printf("%d ",array[i]);
    }
    printf("\n");
}       //end main

//Function to find the maximum value
unsigned findmax(int b[],unsigned size)
{
    unsigned maxpos, m;     //m= findmax's index
    maxpos=0;

    for(m=1;m<size;m++)
    {
        if(b[m]>b[maxpos])
        {
            maxpos=m;
        }
    }
    return(maxpos);
}

//Function to swap variables
void swap(int b[],unsigned size)
{
    int temp, big;
    unsigned si;             //swap's index]
    for(si=size;si>1;si--)
        /*
        si>1  and not si>0 because 5 elements at the most only need 4 rounds of swaping(just experiment with cut out numbers), hence n elements at most need n-1 times of swapping.
        */
    {
        big=findmax(b,si);
        temp=b[si-1];
        b[si-1]=b[big];
        b[big]=temp;
    }
    return;
}
Quote:
I thought about making both findmax and swap both go forward rather than the current way where swap goes backwards, but that way then at the beginning of the loop then find max gets fed a really low swap index number which ends up blocking findmax from returning the absolute maximum early on. If the abs. max isn't swapped early on then each future loop will get stuck sorting the abs. max not to the absolute end memory address but merely one more memory address upward each time, causing the remaining numbers to remain totally unsorted.
The fact that you are finding the max each time is tied to the fact that the loop in swap() run backwards. The loop in findmax (or if you changed it to findmin) can run in either direction. But the loop in swap() must run in the direction consistent with the choice to find max or min each time.

BTW, I think using unsigned instead of int is good practice when quantities are inherently unsigned. But it does require some extra care. One of your comments about >=0 implied you had missed that detail. This program did not need any >=0 checks on unsigned loops. But that need is quite common, and becomes and extra detail to worry about when using unsigned. I think it is worth it, but others likely disagree:

The following is a very common loop:
Code:
    for (i=size-1; i>=0; i--)
The values of i are inherently non negative, but when you use unsigned instead on int, the loop doesn't work. The invalid (negative) value of i is needed in the test to terminate the loop.

I code that loop in a lot of serious functions, where the flaws of using int instead of unsigned matter, so I use a form that probably looks strange the first time:

Code:
   for (i = size; i-- > 0; )
Within the body of the loop, i goes through the correct set of values (size-1 downto 0 inclusive), and has whatever advantages are gained by using an unsigned variable for the unsigned quantity. If you aren't used to seeing that form of loop, it will be confusing at first glance, which is a real disadvantage (other people need to be able to follow my code). But I think the advantages end up outweighing that disadvantage.

Last edited by johnsfine; 02-01-2015 at 10:14 AM.
 
1 members found this post helpful.
Old 02-01-2015, 08:06 PM   #10
andrew.comly
Member
 
Registered: Dec 2012
Distribution: Trisquel-Mini 7.0, Lubuntu 14.04, Debian lxde 8.0
Posts: 311

Original Poster
Blog Entries: 2

Rep: Reputation: 16
Quote:
Originally Posted by johnsfine View Post
The following is a very common loop:
Code:
for (i=size-1; i>=0; i--)
...
I code that loop in a lot of serious functions, where the flaws of using int instead of unsigned matter, so I use a form that probably looks strange the first time:

Code:
   for (i = size; i-- > 0; )
Within the body of the loop, i goes through the correct set of values (size-1 downto 0 inclusive),
I have two questions:
1) Why doesn't the code "i = size;" include 'size'?
2) The above code works well in the swap function, but when I adapt it for the findmax function and write 'for(m = 1; m-->s; )' I get the following result:
Code:
Enter the array's size: 5
Enter the array's numbers: 
5
4
3
2
1
The unsorted array elements are: 5 4 3 2 1 
The sorted array elements are: 4 3 2 1 5
Should the syntax 'for(s = size; s-- >0; )' only be used for unsigned index numbers approaching 0(as to avoid testing a negative value that can't inherently exist), and never anything else?

Last edited by andrew.comly; 02-01-2015 at 08:30 PM.
 
Old 02-01-2015, 08:42 PM   #11
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197
Each of your questions implies you are failing to see the basic operation i-- in that less usual context.

i-- is a basic C operation, saying to decrement i, but use the value before decrementing for the current expression.

So in: for (i = size; i-- > 0; ) { body }
in each iteration, the value of i before it is decremented is compared to zero, but the value of i after it is decremented is used within body.

Quote:
1) Why doesn't the code "i = size;" include 'size'?
The test expression of a for() is computed at the beginning of each iteration, including the first iteration. So the i-- in the test expression is executed before the first iteration and so i is changed from size to size-1 before the body of the loop is executed the first time.

Quote:
2) The above code works well in the swap function, but when I adapt it for the findmax function and write 'for(m = 1; m-->s; )' I get the following result:
The m-- part of the test expression decrements m, so obviously it means the loop is running backward.

Quote:
Should the syntax 'for(s = size; s-- >0; )' only be used for unsigned index numbers approaching 0(as to avoid testing a negative value that can't inherently exist), and never anything else?
Pretty much. It decrements the index, so it should be used only in cases where decrementing the index is what you want. It is a non obvious syntax whose benefit is avoiding the need to test against negative. So it could be used for signed values, but that would be less readable for no reason.

So the only good usage of that syntax is as you said: for unsigned index numbers approaching 0(as to avoid testing a negative value that can't inherently exist)

Last edited by johnsfine; 02-01-2015 at 08:45 PM.
 
1 members found this post helpful.
Old 02-02-2015, 03:05 AM   #12
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,687

Rep: Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274Reputation: 7274
Quote:
Originally Posted by andrew.comly View Post
How specifically doesn't the code work, i.e. could you give an example?
That is quite simple: b[size] is invalid, since the last element of the array is b[size-1]. Trying to access to a non-existent member may cause crash and/or other strange things (like overwriting another variable)
 
  


Reply

Tags
c programming, selection, sorting


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
Version number confusion donallen Slackware 4 10-01-2014 08:34 AM
[SOLVED] Compare array index number against element in bash rewtnull Programming 10 11-01-2011 02:53 PM
C++ find array index for largest number. joshp Programming 7 03-15-2009 10:56 AM
mysql version number confusion (SuSE 10.0) itchy99 Linux - Software 2 02-06-2006 10:58 AM
Confusion on number of ISO's for 9.1 deejayqf Slackware 2 01-11-2004 08:38 PM

LinuxQuestions.org > Forums > Non-*NIX Forums > Programming

All times are GMT -5. The time now is 10:17 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration