LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   need help to further *optimize* my small C program that produces absolute path (https://www.linuxquestions.org/questions/programming-9/need-help-to-further-%2Aoptimize%2A-my-small-c-program-that-produces-absolute-path-744726/)

konsolebox 08-04-2009 09:21 AM

Again guys if you find a good alternative to the library calls in the code just tell me. Asm is a good option but not yet. I want to know what's best to have it done.. in C. Thanks again all.

johnsfine 08-04-2009 09:40 AM

Quote:

Originally Posted by SciYro (Post 3630479)
2) As a side note, using a switch statement usually produces less effiecent code for the CPU, using a switch statement in a loop while trying to make that loop faster is probably bad.

If you need the actual more than two way dispatch, switch might be the most speed optimal way to do it and it is usually the most readable.

I'm not sure whether you are suggesting redesign to avoid needing a three way branch or you are suggesting that two if statements would be faster for a three way branch (it usually isn't).

I haven't looked carefully enough to be certain but contrast
Code:

                switch (*p) {
                case '/':  ...
                case '.':  ...
                default:  ... }

with
Code:

          switch (dotsCount) {
                case 1:  ...
                case 2:  ...
                default:  ... }

The first of those looks fundamental to the underlying problem and there is nothing the programmer knows about this three way switch that the compiler doesn't know. So I don't think you could code anything better than the switch.
In the second case the programmer may know things about the likely or possible values that would allow a double if statement to be faster than a switch, and the whole three way branch might not be as fundamental to the problem. So I'm not sure, but maybe something faster than the switch could be coded.

A switch is often less efficient than if statements because the compiler is usually forced to generate a "default" path even if you don't make one explicit and even if the problem doesn't require it. But in these examples the problem does require a default path and it is explicit in the source code.

ntubski 08-04-2009 08:38 PM

Quote:

Originally Posted by konsolebox (Post 3630473)
I'll be using it in a shell script program that manipulates paths one at a time. The script is still slow and probably there's no other way to make it faster because of dependencies and limitations of shells but just in case that a little additional speed will help...

Why don't you ask for help optimizing the shell script...

Quote:

Originally Posted by SciYro
By 'library calls' I ment calls that went to the kernel, such as outputting text. String/array calls should be fine, and probably faster then a equivelent you can come up with.

The libc I/O routines do buffering in order to avoid multiple system calls. The output of this program is usually pretty short, there is probably only a single system I/O call.

konsolebox 08-07-2009 12:43 PM

Quote:

Originally Posted by ntubski (Post 3631636)
Why don't you ask for help optimizing the shell script...

Lately I've been busy finishing the whole project. Ok I'll think about posting the code and asking for help when I'm finished. It's just too bad that I still can't decide what's the best license for the code. Maybe not use a license at all and use CC0. Or just send it to the public domain. Right now I still can't find any license that gives total safety for the software and freedom for the users. Maybe there's none.
Quote:

Originally Posted by ntubski (Post 3631636)
The libc I/O routines do buffering in order to avoid multiple system calls. The output of this program is usually pretty short, there is only probably only a single system I/O call.

Now I guess that's the end of the line. Perhaps there's nothing more I can do to make the code faster.


By the way I made some comparisons with my original code and a version of the code where I made modifications that includes 'p[-1]' and 'token = p + 1'.

The gcc compiler (verson 4.1.2) generates the same output from both codes when optimization flag is enabled.

But when the outputs are not optimized, the modified version appeared to be faster. Both still have the same number of instructions and both also use increment statements (INCL and INC) but there was a difference in size and the targets. The compiler understood my original code as something that implies direct modification to the memory (like INCL) and not something that uses the registers first. With this form, it's probably slower right? It was probably quicker if the compiler did not recognize p + 1 as "INC AX".

You guys are right. Maybe I really shouldn't over-underestimate compilers.

P[-1] by the way also appears to yield faster code. ntubski and johnsfine was right all along. I'll change my code with those and have it finalized. Thanks for the help guys. I'll take a note of all the things you posted here.


And by the way I still can't understand some of johnsfine's statements. I know the concept of caches but what are cache misses and what's a partial tail merge? :)

Maybe I'll study those when I have time and after I finish the project. The first working prototype of its first and maybe final protocol (at last after many attempts to make it complete and perfect according to its requirement) is soon to be finished. ;)

johnsfine 08-07-2009 01:19 PM

Quote:

Originally Posted by konsolebox (Post 3634945)
I know the concept of caches but what are cache misses

Any time the thing you want doesn't happen to be in the cache and must be brought in from slower memory.

Quote:

what's a partial tail merge?
Both compilers (advanced optimization switches) and compiler theory documents are inconsistent. But two related concepts use the term "tail merge".

1) Flow of control. If the last action before a point where flow of control joins is identical on both sides, the compiler can reduce code size and maybe cache misses by joining earlier than the source code indicates. This is most common and most obvious in the join after if/then/else.
Compare:
Code:

if (a) { foo(x,y,z); } else { p=something; foo(p,y,z); }
vs.
Code:

p=x; if (!a) { p=something; } foo(p,y,z);
If those two sequences have the same net effect, the programmer probably knows it, but the compiler might not understand later uses of p well enough to know it, but anyway probably can't make that transformation.

In the first case, the last operation before joining is calling foo() so the compiler could join it early, but the benefits are tiny and an optimizer would be unlikely to guess joining early was right.

Since parameters are normally set up right to left, there is a more obvious early join in
Code:

if (a) { foo(x,y,z); } else { p=something; foo(x,y,p); }
The compiler can join after setting up z or p and use common code to set up y and x and then call. That is something the compiler could optimize but the programmer couldn't.

Now consider
Code:

p=x; if (a) {foo(p,y,z);} else { p=something; foo(p,y,z);}
Now the compiler can see the whole potential merge easily and can even see how to save a branch instruction by merging away the whole then part. But usually if you want to go that far to help the optimizer you would go the rest of the way and merge it yourself.

2) The other meaning of "tail merge" is when the last operation of a function (before returning) is to call a function with a compatible parameter stack and the same return value (turns out to be very common in advanced C++ programming). The compiler can change the code to jump to the other function instead of calling it, and thus save the extra layer of return.

konsolebox 08-08-2009 08:14 AM

Got it. I now know what you mean. It seems that there are parts of the code that only compilers can optimize.

By the way, regarding my second switch statement, do you think it's better if I just do something like:
Code:

} else if (dotsCount && dotsCount != 1) {
  if (dotsCount == 2) {
    ...
  } else {
    ...
  }
}

or

Code:

} else if (dotsCount) {
  if (dotsCount != 1) {
    if (dotsCount == 2) {
      ...
    } else {
      ...
    }
  }
}

?

johnsfine 08-08-2009 08:30 AM

I don't feel like studying your source code enough to decide the optimal way to branch on dotsCount. But I doubt that either of the above would be better than just using a switch.

What's wrong with
Code:

} else if (dotsCount >= 2) {
  if (dotsCount == 2) {
    ...
  } else {
    ...
  }
}

That is probably better than a switch. The optimizer usually figures out in that construct that one of the two comparisons is redundant so you have two branches based on one comparison.

konsolebox 08-08-2009 08:42 AM

Thanks for figuring that out. I was always focusing on something like 'n |/&/^ m' so i forgot simple things like > or >=.

ntubski 08-08-2009 08:02 PM

If we're still in the micro-optimization game, then I would suggest optimizing for the common case: directories which are not "." or "..". This way for most of the time the code doesn't have to look out for dots, just slash and end of string. EDIT: And by putting a '/' at the end of workBuffer a lot of checks for end of string can be removed. By the way, I uncovered a bug when I tried this out:
Code:

workBufferSize += cwdLength + 1; // +1 is needed for the '/' between cwd and sourceArg


Code:

    // ensure that workBuffer ends in "/", to reduce checks for '\0'
    // exercise for reader: rewrite so realloc isn't needed
    endWithSlash = (sourceArg[sourceArgSize-2] == '/');
    if (!endWithSlash) {
        workBufferSize += 1;
        workBuffer = realloc(workBuffer, workBufferSize);
        workBuffer[workBufferSize-2] = '/';
        workBuffer[workBufferSize-1] = '\0';
    }

    t = 0;

    for (p = workBuffer + 1; *p; ++p) {
        if (p[0] == '.') {
            if (p[1] == '/') {
                // "." means same dir
                p += 1;
                continue;
            } else if (p[1] == '.') {
                if (p[2] == '/') {
                    // ".." means go up one dir
                    if (t > 0) --t;
                    p += 2;
                    continue;
                }
            }
        } else if (p[0] == '/') {
            continue;
        }

        // not "." or ".."
        CHECK_TOKENS(t);
        tokens[t++] = p;
        do {
            ++p;
        } while (p[0] != '/');
        p[0] = '\0';
    }

    for (i = 0; i < t; i++) {
        putchar('/');
        fputs(tokens[i], stdout);
    }
    if (endWithSlash) putchar('/');
    else if (t == 0) fputs("/.", stdout);
    putchar('\n');


konsolebox 08-10-2009 02:18 AM

Thanks for the fix. That was a bug indeed. I'll take a look at the new code as well. Later.

Sergei Steshenko 08-10-2009 03:20 AM

Quote:

Originally Posted by konsolebox (Post 3637390)
Thanks for the fix. That was a bug indeed. I'll take a look at the new code as well. Later.

For example, Perl has a bunch of modules for path manipulation.

Your performance bottleneck is not "C", it's your shell script.

konsolebox 08-10-2009 03:53 AM

Quote:

Originally Posted by Sergei Steshenko (Post 3637435)
Your performance bottleneck is not "C", it's your shell script.

I already know that before I posted the code in this thread. I just want to finish the code optimized and leave it like that forever and never be bothered to play around with it again. The program is not only useful for my current script. It can also be used in my other scripts and many other things. I just want to make it fast that's all.

Some folks say that I'm wasting so much time to this code but actually I only created the code in an hour or two and posted it. I also took time making replies to this thread and analyzing the code but it was fun anyway.

I spent most of the time on my scripts and not on this program.

Now regarding my shell script, I think it is already in its most optimized form but I know that I can never be sure with it so I also plan to post it later but the script is part of a project and I still can't find a proper license for it. There's nothing much in the project but I just want to be careful.

Regardless, thanks for your reminders.

Sergei Steshenko 08-10-2009 06:02 AM

Quote:

Originally Posted by konsolebox (Post 3637475)
I already know that before I posted the code in this thread. I just want to finish the code optimized and leave it like that forever and never be bothered to play around with it again. The program is not only useful for my current script. It can also be used in my other scripts and many other things. I just want to make it fast that's all.

Some folks say that I'm wasting so much time to this code but actually I only created the code in an hour or two and posted it. I also took time making replies to this thread and analyzing the code but it was fun anyway.

I spent most of the time on my scripts and not on this program.

Now regarding my shell script, I think it is already in its most optimized form but I know that I can never be sure with it so I also plan to post it later but the script is part of a project and I still can't find a proper license for it. There's nothing much in the project but I just want to be careful.

Regardless, thanks for your reminders.


And why is the script a shell one in the first place ?

konsolebox 08-10-2009 07:15 AM

Quote:

Originally Posted by Sergei Steshenko
And why is the script a shell one in the first place ?

It's a helper script for shell scripts.

Sergei Steshenko 08-10-2009 07:47 AM

Quote:

Originally Posted by konsolebox (Post 3637707)
It's a helper script for shell scripts.

Is it source'd or called ?


All times are GMT -5. The time now is 10:25 AM.