LinuxQuestions.org
Visit the LQ Articles and Editorials section
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices

Reply
 
Search this Thread
Old 08-04-2009, 09:21 AM   #16
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 15

Original Poster
Rep: Reputation: 233Reputation: 233Reputation: 233

Again guys if you find a good alternative to the library calls in the code just tell me. Asm is a good option but not yet. I want to know what's best to have it done.. in C. Thanks again all.
 
Old 08-04-2009, 09:40 AM   #17
johnsfine
Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,107

Rep: Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114
Quote:
Originally Posted by SciYro View Post
2) As a side note, using a switch statement usually produces less effiecent code for the CPU, using a switch statement in a loop while trying to make that loop faster is probably bad.
If you need the actual more than two way dispatch, switch might be the most speed optimal way to do it and it is usually the most readable.

I'm not sure whether you are suggesting redesign to avoid needing a three way branch or you are suggesting that two if statements would be faster for a three way branch (it usually isn't).

I haven't looked carefully enough to be certain but contrast
Code:
		switch (*p) {
		case '/':  ...
                case '.':  ...
		default:   ... }
with
Code:
          switch (dotsCount) {
                case 1:   ...
                case 2:   ...
                default:  ... }
The first of those looks fundamental to the underlying problem and there is nothing the programmer knows about this three way switch that the compiler doesn't know. So I don't think you could code anything better than the switch.
In the second case the programmer may know things about the likely or possible values that would allow a double if statement to be faster than a switch, and the whole three way branch might not be as fundamental to the problem. So I'm not sure, but maybe something faster than the switch could be coded.

A switch is often less efficient than if statements because the compiler is usually forced to generate a "default" path even if you don't make one explicit and even if the problem doesn't require it. But in these examples the problem does require a default path and it is explicit in the source code.
 
Old 08-04-2009, 08:38 PM   #18
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 2,494

Rep: Reputation: 850Reputation: 850Reputation: 850Reputation: 850Reputation: 850Reputation: 850Reputation: 850
Quote:
Originally Posted by konsolebox View Post
I'll be using it in a shell script program that manipulates paths one at a time. The script is still slow and probably there's no other way to make it faster because of dependencies and limitations of shells but just in case that a little additional speed will help...
Why don't you ask for help optimizing the shell script...

Quote:
Originally Posted by SciYro
By 'library calls' I ment calls that went to the kernel, such as outputting text. String/array calls should be fine, and probably faster then a equivelent you can come up with.
The libc I/O routines do buffering in order to avoid multiple system calls. The output of this program is usually pretty short, there is probably only a single system I/O call.

Last edited by ntubski; 08-07-2009 at 05:16 PM. Reason: only 1 only
 
Old 08-07-2009, 12:43 PM   #19
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 15

Original Poster
Rep: Reputation: 233Reputation: 233Reputation: 233
Quote:
Originally Posted by ntubski View Post
Why don't you ask for help optimizing the shell script...
Lately I've been busy finishing the whole project. Ok I'll think about posting the code and asking for help when I'm finished. It's just too bad that I still can't decide what's the best license for the code. Maybe not use a license at all and use CC0. Or just send it to the public domain. Right now I still can't find any license that gives total safety for the software and freedom for the users. Maybe there's none.
Quote:
Originally Posted by ntubski View Post
The libc I/O routines do buffering in order to avoid multiple system calls. The output of this program is usually pretty short, there is only probably only a single system I/O call.
Now I guess that's the end of the line. Perhaps there's nothing more I can do to make the code faster.


By the way I made some comparisons with my original code and a version of the code where I made modifications that includes 'p[-1]' and 'token = p + 1'.

The gcc compiler (verson 4.1.2) generates the same output from both codes when optimization flag is enabled.

But when the outputs are not optimized, the modified version appeared to be faster. Both still have the same number of instructions and both also use increment statements (INCL and INC) but there was a difference in size and the targets. The compiler understood my original code as something that implies direct modification to the memory (like INCL) and not something that uses the registers first. With this form, it's probably slower right? It was probably quicker if the compiler did not recognize p + 1 as "INC AX".

You guys are right. Maybe I really shouldn't over-underestimate compilers.

P[-1] by the way also appears to yield faster code. ntubski and johnsfine was right all along. I'll change my code with those and have it finalized. Thanks for the help guys. I'll take a note of all the things you posted here.


And by the way I still can't understand some of johnsfine's statements. I know the concept of caches but what are cache misses and what's a partial tail merge?

Maybe I'll study those when I have time and after I finish the project. The first working prototype of its first and maybe final protocol (at last after many attempts to make it complete and perfect according to its requirement) is soon to be finished.
 
Old 08-07-2009, 01:19 PM   #20
johnsfine
Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,107

Rep: Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114
Quote:
Originally Posted by konsolebox View Post
I know the concept of caches but what are cache misses
Any time the thing you want doesn't happen to be in the cache and must be brought in from slower memory.

Quote:
what's a partial tail merge?
Both compilers (advanced optimization switches) and compiler theory documents are inconsistent. But two related concepts use the term "tail merge".

1) Flow of control. If the last action before a point where flow of control joins is identical on both sides, the compiler can reduce code size and maybe cache misses by joining earlier than the source code indicates. This is most common and most obvious in the join after if/then/else.
Compare:
Code:
if (a) { foo(x,y,z); } else { p=something; foo(p,y,z); }
vs.
Code:
p=x; if (!a) { p=something; } foo(p,y,z);
If those two sequences have the same net effect, the programmer probably knows it, but the compiler might not understand later uses of p well enough to know it, but anyway probably can't make that transformation.

In the first case, the last operation before joining is calling foo() so the compiler could join it early, but the benefits are tiny and an optimizer would be unlikely to guess joining early was right.

Since parameters are normally set up right to left, there is a more obvious early join in
Code:
if (a) { foo(x,y,z); } else { p=something; foo(x,y,p); }
The compiler can join after setting up z or p and use common code to set up y and x and then call. That is something the compiler could optimize but the programmer couldn't.

Now consider
Code:
p=x; if (a) {foo(p,y,z);} else { p=something; foo(p,y,z);}
Now the compiler can see the whole potential merge easily and can even see how to save a branch instruction by merging away the whole then part. But usually if you want to go that far to help the optimizer you would go the rest of the way and merge it yourself.

2) The other meaning of "tail merge" is when the last operation of a function (before returning) is to call a function with a compatible parameter stack and the same return value (turns out to be very common in advanced C++ programming). The compiler can change the code to jump to the other function instead of calling it, and thus save the extra layer of return.

Last edited by johnsfine; 08-07-2009 at 01:20 PM.
 
Old 08-08-2009, 08:14 AM   #21
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 15

Original Poster
Rep: Reputation: 233Reputation: 233Reputation: 233
Got it. I now know what you mean. It seems that there are parts of the code that only compilers can optimize.

By the way, regarding my second switch statement, do you think it's better if I just do something like:
Code:
} else if (dotsCount && dotsCount != 1) {
  if (dotsCount == 2) {
    ...
  } else {
    ...
  }
}
or

Code:
} else if (dotsCount) {
  if (dotsCount != 1) {
    if (dotsCount == 2) {
      ...
    } else {
      ...
    }
  }
}
?

Last edited by konsolebox; 08-08-2009 at 08:15 AM.
 
Old 08-08-2009, 08:30 AM   #22
johnsfine
Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,107

Rep: Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114Reputation: 1114
I don't feel like studying your source code enough to decide the optimal way to branch on dotsCount. But I doubt that either of the above would be better than just using a switch.

What's wrong with
Code:
} else if (dotsCount >= 2) {
  if (dotsCount == 2) {
    ...
  } else {
    ...
  }
}
That is probably better than a switch. The optimizer usually figures out in that construct that one of the two comparisons is redundant so you have two branches based on one comparison.
 
Old 08-08-2009, 08:42 AM   #23
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 15

Original Poster
Rep: Reputation: 233Reputation: 233Reputation: 233
Thanks for figuring that out. I was always focusing on something like 'n |/&/^ m' so i forgot simple things like > or >=.
 
Old 08-08-2009, 08:02 PM   #24
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 2,494

Rep: Reputation: 850Reputation: 850Reputation: 850Reputation: 850Reputation: 850Reputation: 850Reputation: 850
If we're still in the micro-optimization game, then I would suggest optimizing for the common case: directories which are not "." or "..". This way for most of the time the code doesn't have to look out for dots, just slash and end of string. EDIT: And by putting a '/' at the end of workBuffer a lot of checks for end of string can be removed. By the way, I uncovered a bug when I tried this out:
Code:
 workBufferSize += cwdLength  + 1; // +1 is needed for the '/' between cwd and sourceArg


Code:
    // ensure that workBuffer ends in "/", to reduce checks for '\0'
    // exercise for reader: rewrite so realloc isn't needed
    endWithSlash = (sourceArg[sourceArgSize-2] == '/');
    if (!endWithSlash) {
        workBufferSize += 1;
        workBuffer = realloc(workBuffer, workBufferSize);
        workBuffer[workBufferSize-2] = '/';
        workBuffer[workBufferSize-1] = '\0';
    }

    t = 0;

    for (p = workBuffer + 1; *p; ++p) {
        if (p[0] == '.') {
            if (p[1] == '/') {
                // "." means same dir
                p += 1;
                continue;
            } else if (p[1] == '.') {
                if (p[2] == '/') {
                    // ".." means go up one dir
                    if (t > 0) --t;
                    p += 2;
                    continue;
                }
            }
        } else if (p[0] == '/') {
            continue;
        }

        // not "." or ".."
        CHECK_TOKENS(t);
        tokens[t++] = p;
        do {
            ++p;
        } while (p[0] != '/');
        p[0] = '\0';
    }

    for (i = 0; i < t; i++) {
        putchar('/');
        fputs(tokens[i], stdout);
    }
    if (endWithSlash) putchar('/');
    else if (t == 0) fputs("/.", stdout);
    putchar('\n');

Last edited by ntubski; 08-08-2009 at 09:56 PM. Reason: more hax ;)
 
Old 08-10-2009, 02:18 AM   #25
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 15

Original Poster
Rep: Reputation: 233Reputation: 233Reputation: 233
Thanks for the fix. That was a bug indeed. I'll take a look at the new code as well. Later.
 
Old 08-10-2009, 03:20 AM   #26
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 453Reputation: 453Reputation: 453Reputation: 453Reputation: 453
Quote:
Originally Posted by konsolebox View Post
Thanks for the fix. That was a bug indeed. I'll take a look at the new code as well. Later.
For example, Perl has a bunch of modules for path manipulation.

Your performance bottleneck is not "C", it's your shell script.
 
Old 08-10-2009, 03:53 AM   #27
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 15

Original Poster
Rep: Reputation: 233Reputation: 233Reputation: 233
Quote:
Originally Posted by Sergei Steshenko View Post
Your performance bottleneck is not "C", it's your shell script.
I already know that before I posted the code in this thread. I just want to finish the code optimized and leave it like that forever and never be bothered to play around with it again. The program is not only useful for my current script. It can also be used in my other scripts and many other things. I just want to make it fast that's all.

Some folks say that I'm wasting so much time to this code but actually I only created the code in an hour or two and posted it. I also took time making replies to this thread and analyzing the code but it was fun anyway.

I spent most of the time on my scripts and not on this program.

Now regarding my shell script, I think it is already in its most optimized form but I know that I can never be sure with it so I also plan to post it later but the script is part of a project and I still can't find a proper license for it. There's nothing much in the project but I just want to be careful.

Regardless, thanks for your reminders.

Last edited by konsolebox; 08-10-2009 at 04:02 AM.
 
Old 08-10-2009, 06:02 AM   #28
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 453Reputation: 453Reputation: 453Reputation: 453Reputation: 453
Quote:
Originally Posted by konsolebox View Post
I already know that before I posted the code in this thread. I just want to finish the code optimized and leave it like that forever and never be bothered to play around with it again. The program is not only useful for my current script. It can also be used in my other scripts and many other things. I just want to make it fast that's all.

Some folks say that I'm wasting so much time to this code but actually I only created the code in an hour or two and posted it. I also took time making replies to this thread and analyzing the code but it was fun anyway.

I spent most of the time on my scripts and not on this program.

Now regarding my shell script, I think it is already in its most optimized form but I know that I can never be sure with it so I also plan to post it later but the script is part of a project and I still can't find a proper license for it. There's nothing much in the project but I just want to be careful.

Regardless, thanks for your reminders.

And why is the script a shell one in the first place ?
 
Old 08-10-2009, 07:15 AM   #29
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 15

Original Poster
Rep: Reputation: 233Reputation: 233Reputation: 233
Quote:
Originally Posted by Sergei Steshenko
And why is the script a shell one in the first place ?
It's a helper script for shell scripts.
 
Old 08-10-2009, 07:47 AM   #30
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 453Reputation: 453Reputation: 453Reputation: 453Reputation: 453
Quote:
Originally Posted by konsolebox View Post
It's a helper script for shell scripts.
Is it source'd or called ?
 
  


Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
java applet not loading image with relative path but loads image with absolute path amolgupta Programming 2 07-20-2009 01:58 PM
how to get absolute path in a C program, Amiya Rath Linux - Newbie 2 10-15-2008 07:46 PM
Absolute path qspares Linux - Software 6 10-02-2007 03:11 AM
Image Path reference in Linux (Absolute path) javabuddy Linux - General 7 06-05-2006 07:45 AM
relative to absolute path vishalbutte Programming 4 01-14-2006 03:17 PM


All times are GMT -5. The time now is 12:30 AM.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration