LinuxQuestions.org


lemon09 08-29-2009 03:11 AM

undefined in c
 
hello,

seemingly the following code:

Code:


int i=10,j=20
i= i++ + j++

should produce an output of '30'... when in fact the output is '31'.

here i should mention that this is one of the things left undefined in C (K&R doesn't say anything about what happens in such a situation).

nearly all the compilers (at least those which i have tested) produce an output of '31'.

can you please explain how compilers handle such a situation?

in Java, however, you get the correct answer, i.e., 30.
why do the results differ when the code is actually the same?

Telemachos 08-29-2009 06:05 AM

I believe that C and its compilers don't want you to modify the same item (the variable i) more than once in a single expression. If you add a third variable, everything is fine:

Code:

telemachus ~ $ cat count.c
#include <stdio.h>

int main (void) {
    int i = 10;
    int j = 20;
    int k = 0;
    int x = 10;
    int y = 20;

    k = i++ + j++;
    printf("The total is %d.\n", k);

   
    x = x++ + y++;
    printf("The total is %d.\n", x);

    return 0;
}
telemachus ~ $ gcc -Wall --pedantic count.c -o count
count.c: In function ‘main’:
count.c:14: warning: operation on ‘x’ may be undefined
telemachus ~ $ ./count
The total is 30.
The total is 31.

When the compiler says "operation on 'x' may be undefined", I usually take that as a sign that I did something wrong and need to change it.

Sergei Steshenko 08-29-2009 08:12 AM

Quote:

Originally Posted by lemon09 (Post 3661698)
hello,

seemingly the following code:

Code:


int i=10,j=20
i= i++ + j++

should produce an output of '30'... when in fact the output is '31'.

here i should mention that this is one of the things left undefined in C (K&R doesn't say anything about what happens in such a situation).

nearly all the compilers (at least those which i have tested) produce an output of '31'.

can you please explain how compilers handle such a situation?

in Java, however, you get the correct answer, i.e., 30.
why do the results differ when the code is actually the same?

Why do you think it should? I.e., could you justify your expectations with clauses from, say, the C99 standard?

jlinkels 08-29-2009 09:01 AM

When i++ is used, it is said that variable i is increased 'after evaluation'. If it were increased before evaluation, the outcome would be 32 as both i and j are increased.

The difference that i seems to be increased before evaluation is obviously caused by using i as the result as well. Which is also demonstrated by the example of Telemachos.

It would be interesting to see the assembly code produced by the compiler, could someone do that? (I know it is possible, but I forgot long ago how to do that)

I am also wondering if there is some specification stating that this kind of evaluation is correct/not correct/undefined when the result is one of the operands.

Last, unless your goal is to write obfuscated code, it is best to avoid such statements. However from a scientific point of view it is certainly interesting to look into it.

jlinkels

johnsfine 08-29-2009 09:20 AM

Quote:

Originally Posted by lemon09 (Post 3661698)
can you please explain how compilers handle such a situation?

in Java, however, you get the correct answer, i.e., 30.
why do the results differ when the code is actually the same?

You can look at i=i++ + j++; as several separate operations, such as

1) temp = i + j
2) i++
3) j++
4) i = temp

But the sequence of those operations is not fully defined. We know 1 comes before any of 2, 3 or 4, but we don't know whether 2 comes before or after 3 or 4.

It is perfectly legitimate for the compiler to choose the sequence:
temp = i + j
i = temp
j++
i++

So 30 is not "the correct answer". It is one possible answer. 31 is an equally valid answer.

Sergei Steshenko 08-29-2009 09:24 AM

Quote:

Originally Posted by jlinkels (Post 3661941)
When i++ is used, it is said that variable i is increased 'after evaluation'. If it were increased before evaluation, the outcome would be 32 as both i and j are increased.

The difference that i seems to be increased before evaluation is obviously caused by using i as the result as well. Which is also demonstrated by the example of Telemachos.

It would be interesting to see the assembly code produced by the compiler, could someone do that? (I know it is possible, but I forgot long ago how to do that)

I am also wondering if there is some specification stating that this kind of evaluation is correct/not correct/undefined when the result is one of the operands.

Last, unless your goal is to write obfuscated code, it is best to avoid such statements. However from a scientific point of view it is certainly interesting to look into it.

jlinkels

AFAIK, this is a classic example of behavior left undefined by the standard, and the compiler warns about it. Such code should be rejected.

mrtiller 08-29-2009 11:14 AM

please see

http://www.parashift.com/c++-faq-lit...html#faq-39.15

on sequence points, or the corresponding section of the C FAQ (the discussion applies to both languages).

paulsm4 08-29-2009 11:38 AM

I hate it when people want a reasonable explanation for completely unreasonable cases like this example.

To add insult to injury, Java does *not* "give the correct answer". The snippet as posted won't even *compile* in Java (and it may or may not compile in C - "it depends").

Why would anyone want to waste any time on silly, nonsensical, pointless questions like this?

If there's a pressing *need* to experiment in the land of "undefined behavior" - why not experiment on your own car (or any other handy motor vehicle)? Drain all the oil. Drive it around as long as you can. Does the engine seize up? Or does it just explode in a ball of flame? Why one and not the other?

Sigh...

wje_lq 08-29-2009 12:53 PM

Quote:

Originally Posted by paulsm4 (Post 3662068)
Why would anyone want to waste any time on silly, nonsensical, pointless questions like this?

Because it's fun to play with mudpies.

When children do pointless things in play, it's easy to smile at them. But this is their work. This is the way they join battle with the universe. Their work is just as serious as our work.

When we play with undefined behavior in a C program, we are reappropriating for ourselves the wisdom we once had as children.

An aside: Toy C programs are cheap. Much cheaper than automobile engines.

Sergei Steshenko 08-29-2009 02:02 PM

Quote:

Originally Posted by paulsm4 (Post 3662068)
I hate it when people want a reasonable explanation for completely unreasonable cases like this example.

To add insult to injury, Java does *not* "give the correct answer". The snippet as posted won't even *compile* in Java (and it may or may not compile in C - "it depends").

Why would anyone want to waste any time on silly, nonsensical, pointless questions like this?

If there's a pressing *need* to experiment in the land of "undefined behavior" - why not experiment on your own car (or any other handy motor vehicle)? Drain all the oil. Drive it around as long as you can. Does the engine seize up? Or does it just explode in a ball of flame? Why one and not the other?

Sigh...

There is a widespread opinion that if one faces such an interview question, one had better run from a company that asks such questions and expects any answer other than "undefined behavior".

ta0kira 08-29-2009 04:19 PM

This must be unique to integral types, i.e. some sort of built-in operator ambiguity. Take the following example, compiled with g++:
Code:

#include <stdio.h>


class fake_int
{
public:
        fake_int(int vVal = 0) : self(++instance), value(vVal)
        { fprintf(stderr, "[create %u (%i)]\n", self, value); }


        fake_int &operator = (const fake_int &rRight)
        {
        fprintf(stderr, "[set %u to %u (%i -> %i)]\n", self, rRight.self,
          value, rRight.value);
        value = rRight.value;
        return *this;
        }


        fake_int operator + (const fake_int &rRight) const
        {
        fprintf(stderr, "[add %u and %u (%i and %i)]\n", self, rRight.self,
          value, rRight.value);
        return fake_int(value + rRight.value);
        }


        fake_int operator ++ (int)
        {
        fprintf(stderr, "[post-increment %u (%i -> %i)]\n", self, value, value + 1);
        fake_int temp(value);
        ++value;
        return temp;
        }


        int get_value() const
        { return value; }


private:
        static unsigned int instance;

        const unsigned int self;
        int value;
};


unsigned int fake_int::instance = 0;


int main(int argc, char *argv[])
{
        fake_int i(10), j(20);
        i = i++ + j++;
        fprintf(stderr, "final value: %i\n", i.get_value());
}

gcc apparently doesn't react well to this sort of thing in general. For example:
Code:

#include <stdio.h>


int main()
{
        int i, j;

        i=10; j=20;
        i= i++ + j++;
        fprintf(stderr, "%i\n", i);

        *(unsigned char*) &i= 0x11;
        fprintf(stderr, "%i\n", i);
}

The statement *(unsigned char*) &i= 0x11; (underlined in the original post) actually makes i++ in the line i= i++ + j++; (italicized in the original post) cause a (different?) malfunction on my system, even though the byte write itself is valid as far as I know. I get 11 and 17, whereas commenting out the byte write gives me 31 and 31. Removing ++ from i++ in the i= i++ + j++; line gives me 30 and 17. This is obviously something one shouldn't mess around with in real code.
Kevin Barry

edit:
If that weren't enough, try this one. I get 30!
Code:

#include <stdio.h>


int main()
{
        int i, j;

        i=10; j=20;
        fprintf(stderr, "%i\n", (i = i++ + j++));
}


ta0kira 08-29-2009 04:29 PM

Quote:

Originally Posted by paulsm4 (Post 3662068)
Why would anyone want to waste any time on silly, nonsensical, pointless questions like this?

If someone happens upon something like this, whether intentionally or accidentally, they (like OP) will probably find the outcome counterintuitive. In a way OP is asking what it is about C and/or gcc that allows such an outcome, which alludes to the internal function of the compiler, to say the least. If the statement were compiled as one would expect from programming-language theory, the statement probably shouldn't result in 31; however, it does, which means something unexpected (to us) is going on with the compiler. Although "undefined" means "you'll get what I give you", that doesn't mean the gcc maintainers are snickering about how they can make this statement return 31; there must be an underlying logic that was intended to deal with something else, yet it expresses itself in the context in question.
Kevin Barry

ntubski 08-29-2009 06:04 PM

Quote:

If the statement were compiled as one would expect from programming-language theory, the statement probably shouldn't result in 31; however, it does, which means something unexpected (to us) is going on with the compiler.
I'm not sure what theory you refer to here, but 31 makes sense to my intuition. My understanding of x++ is that the increment happens after the statement is evaluated, so
Code:

i= i++ + j++;
/* is equivalent to */
i = i + j; i += 1; j += 1;

It just so happens that this corresponds to what gcc compiles to, but as mrtiller already pointed out, this is a well-known violation of the standard. The key phrase is "SEQUENCE POINTS". It's in the C FAQ too.

Quote:

This must be unique to integral types, i.e. some sort of built-in operator ambiguity. Take the following example, compiled with g++:
Overloaded operators are functions, and function calls are sequence points.

Quote:

*(unsigned char*) &i= 0x11;
Doing just (void) &i; is enough to get 11; it seems that taking the address of a variable causes gcc to generate different (and more convoluted) code for incrementing. In particular, it uses LEA (on a register holding the old value of i) instead of ADD.

Quote:

If that weren't enough, try this one. I get 30!
Well, you weren't printing i, you were printing the value of the assignment expression, which is the value of the sum, so of course you get 30.

Code:

#include <stdio.h>
int main()
{
    int i, j;
    i=10; j=20;
    fprintf(stderr, "%i\n", (i = i++ + j++));
    fprintf(stderr, "%i\n", i);
}

This outputs
30
31

jlinkels 08-29-2009 06:59 PM

Quote:

Originally Posted by paulsm4 (Post 3662068)
I hate it when people want a reasonable explanation for completely unreasonable cases like this example.

I am not sure that it is so unreasonable. It compiles without errors, so why not try to explain what goes wrong?

Quote:

Originally Posted by paulsm4 (Post 3662068)
Why would anyone want to waste any time on silly, nonsensical, pointless questions like this?

Because when you understand something, it is much easier to remember. It also increases general understanding, which might help to avoid similar mistakes in the future. Great inventions were made by trying to understand why something didn't work as expected.

Personally, I spend some of my time playing around with totally useless things; weird C constructs were among them when I was still regularly programming in C. And ...uhm if you look on the internet for totally useless scientific or technical experiments, I bet you'll find quite a few hits.

jlinkels

ta0kira 08-29-2009 06:59 PM

Quote:

Originally Posted by ntubski (Post 3662335)
I'm not sure what theory you refer to here, but 31 makes sense to my intuition. My understanding of x++ is that the increment happens after the statement is evaluated, so

What sense does it make to pull the post-increment operator entirely out of the statement it exists in? It's a unary operator after all, not to mention that's a violation of operator precedence*. The only thing of lower precedence than assignment is comma, and post-increment is ahead of addition. Additionally, the standard post-increment operation is to copy the object, increment the value, and return the copy (that's the theory I'm talking about.)
Quote:

Originally Posted by ntubski (Post 3662335)
Doing just (void) &i; is enough to get 11; it seems that taking the address of a variable causes gcc to generate different (and more convoluted) code for incrementing. In particular, it uses LEA (on a register holding the old value of i) instead of ADD.


Well, you weren't printing i, you were printing the value of the assignment expression, which is the value of the sum, so of course you get 30.

Code:

#include <stdio.h>
int main()
{
    int i, j;
    i=10; j=20;
    fprintf(stderr, "%i\n", (i = i++ + j++));
    fprintf(stderr, "%i\n", i);
}

This outputs
30
31

It was merely observational information to demonstrate inconsistencies. Clearly the post-increment operations should occur within the (), which doesn't seem to be the case. Because this is undefined, however, *it makes sense to optimize out the stack space used by actually copying and incrementing in favor of putting i into a register for the addition, then incrementing at the end.
Kevin Barry

