Programming: This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
I recently implemented an evolutionary algorithm which creates valid machine-code expressions. This obviously speeds up my experiments a great deal. However, when I optimise the C++ code that invokes the evolved machine-code instructions, the machine code is executed incorrectly.
After a great deal of trial and error, I isolated the cause of the incorrect execution to GCC's optimisation passes. For example, when using -O1 optimisation I have to include the flag "-fno-tree-fre" to get the code to execute correctly. When using -O3 optimisation I have to include the flag "-fno-gcse" to get the code to execute correctly.
Unfortunately, these fixes are not consistent, and the incorrect execution has started to occur again in a far more complex context.
I'm certain that the problem arises from GCC believing that my dynamically generated code is dead code of some sort, and that the manner in which GCC attempts to remove the perceived dead code differs in each context.
This probably accounts for the inconsistency of the compiler flags in solving the problem. Is anyone aware of a pragma / compiler directive which can tell GCC to disable optimisation for a specific area of code?
I have also tried explicitly passing every individual flag that -O3 enables, but this does not produce the same result as simply specifying -O3 at compile time. Does anyone understand why this is so?
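For reference, GCC 4.4 and later do provide per-function optimisation control, via a function attribute and a pragma pair. A minimal sketch (the function names here are invented for illustration; only the attribute and pragma spellings are GCC's):

```c
#include <assert.h>

/* 1. Per-function attribute: this one function is compiled at -O0,
   so GCC will not fold, move, or eliminate the indirect call. */
__attribute__((optimize("O0")))
int call_unoptimized(int (*fn)(void)) {
    return fn();
}

/* 2. Pragma pair: every function between push and pop is built at -O0. */
#pragma GCC push_options
#pragma GCC optimize ("O0")
int region_unoptimized(int x) {
    return x + 1;
}
#pragma GCC pop_options

/* Hypothetical stand-in for a pointer to generated code. */
static int stub(void) { return 41; }
```

Both mechanisms override whatever -O level the rest of the file is compiled with, which is closer to what the original question asks for than juggling individual -fno-* flags.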
I don't think so, but you can always isolate that source in its own file and build it with -O0. That's essentially what you're looking for. The problem with optimizing a single part of the source is that it might use headers that are optimized in other cases. You might also want to use __attribute__ ((packed)) for all structures involved, in case for some strange reason the difference in optimization causes them to be laid out differently.
Why use optimization in the first place, then? If there are static parts of your program that need it, I suggest building them into a library, so that the only dynamic code is the code you generate.
I'm inclined to agree, but if it were just a case of avoiding the problem I wouldn't be here at the forums. I'm simply irked that I do not understand exactly why this is happening. In my experience such issues rarely go away, and it's usually best to get to the bottom of them.
What's really driving me nuts is that in some cases the volatile keyword works and in some cases it doesn't. I'm working backwards from a failed example to a working one in the hope that I'll be able to isolate the problem.
OK, that's done. My only consistent strategy is to declare volatile: the function pointer (to the dynamic code), the return value of that function, and any variables which control loops that the function is executed in.
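That three-part volatile strategy can be sketched like this. An ordinary function stands in for the generated code here (a hypothetical stub); what matters is where the volatile qualifiers sit:

```c
/* Stand-in for dynamically generated machine code. */
static int generated_stub(void) { return 7; }

/* Volatile function pointer: the compiler must reload it before every
   call and cannot assume anything about the callee or fold the call. */
int (* volatile gen_fn)(void) = generated_stub;

int run_generations(int n) {
    volatile int result = 0;                /* volatile return slot  */
    for (volatile int i = 0; i < n; ++i)    /* volatile loop control */
        result = gen_fn();
    return result;
}
```

Note that the volatile applies to the pointer itself (`* volatile`), not to the pointed-to function; qualifying the pointee would not stop the compiler from caching the pointer value.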
I can understand the return value and the function pointer, but surely the looping variables should be implicitly treated as volatile once they operate on volatile variables? Anyway, shit happens.
So what type of code is generated? It sounds to me like it might be non-standard code or that it relies on certain expectations that are otherwise construed as "undefined" by the various standards out there. Do you have an example of the generated code?
The generated code always follows this pattern:
save the frame pointer;
set the frame pointer to the stack pointer;
if-statements with floating-point or integer conditional arguments, returning various values;
put the return value in eax;
pop the frame pointer.
I had thought that the code might be breaking some caller/callee-save convention, because some of my debugging seemed to indicate this; however, disabling that feature did not result in correctly executed code.