Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game. |
| Notices |
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
You are currently viewing LQ as a guest. By joining our community you will have the ability to post topics, receive our newsletter, use the advanced search, subscribe to threads and access many other special features. Registration is quick, simple and absolutely free. Join our community today!
Note that registered members see fewer ads, and ContentLink is completely disabled once you log in.
Are you new to LinuxQuestions.org? Visit the following links:
Site Howto |
Site FAQ |
Sitemap |
Register Now
If you have any problems with the registration process or your account login, please contact us. If you need to reset your password, click here.
Having a problem logging in? Please visit this page to clear all LQ-related cookies.
 |
GNU/Linux Basic Guide
This 255-page guide will provide you with the keys to understand the philosophy of free software, teach you how to use and handle it, and give you the tools required to move easily in the world of GNU/Linux. Many users and administrators will be taking their first steps with this GNU/Linux Basic guide and it will show you how to approach and solve the problems you encounter.
Click Here to receive this Complete Guide absolutely free. |
|
 |
|
08-04-2009, 09:21 AM
|
#16
|
|
Senior Member
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,046
Original Poster
Rep: 
|
Again guys if you find a good alternative to the library calls in the code just tell me. Asm is a good option but not yet. I want to know what's best to have it done.. in C. Thanks again all.
|
|
|
|
08-04-2009, 09:40 AM
|
#17
|
|
Senior Member
Registered: Dec 2007
Distribution: Mepis, Centos
Posts: 4,674
|
Quote:
Originally Posted by SciYro
2) As a side note, using a switch statement usually produces less effiecent code for the CPU, using a switch statement in a loop while trying to make that loop faster is probably bad.
|
If you need the actual more than two way dispatch, switch might be the most speed optimal way to do it and it is usually the most readable.
I'm not sure whether you are suggesting redesign to avoid needing a three way branch or you are suggesting that two if statements would be faster for a three way branch (it usually isn't).
I haven't looked carefully enough to be certain but contrast
Code:
switch (*p) {
case '/': ...
case '.': ...
default: ... }
with
Code:
switch (dotsCount) {
case 1: ...
case 2: ...
default: ... }
The first of those looks fundamental to the underlying problem and there is nothing the programmer knows about this three way switch that the compiler doesn't know. So I don't think you could code anything better than the switch.
In the second case the programmer may know things about the likely or possible values that would allow a double if statement to be faster than a switch, and the whole three way branch might not be as fundamental to the problem. So I'm not sure, but maybe something faster than the switch could be coded.
A switch is often less efficient than if statements because the compiler is usually forced to generate a "default" path even if you don't make one explicit and even if the problem doesn't require it. But in these examples the problem does require a default path and it is explicit in the source code.
|
|
|
|
08-04-2009, 08:38 PM
|
#18
|
|
Senior Member
Registered: Nov 2005
Distribution: Debian
Posts: 2,017
|
Quote:
Originally Posted by konsolebox
I'll be using it in a shell script program that manipulates paths one at a time. The script is still slow and probably there's no other way to make it faster because of dependencies and limitations of shells but just in case that a little additional speed will help...
|
Why don't you ask for help optimizing the shell script...
Quote:
|
Originally Posted by SciYro
By 'library calls' I ment calls that went to the kernel, such as outputting text. String/array calls should be fine, and probably faster then a equivelent you can come up with.
|
The libc I/O routines do buffering in order to avoid multiple system calls. The output of this program is usually pretty short, there is probably only a single system I/O call.
Last edited by ntubski; 08-07-2009 at 05:16 PM.
Reason: only 1 only
|
|
|
|
08-07-2009, 12:43 PM
|
#19
|
|
Senior Member
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,046
Original Poster
Rep: 
|
Quote:
Originally Posted by ntubski
Why don't you ask for help optimizing the shell script...
|
Lately I've been busy finishing the whole project. Ok I'll think about posting the code and asking for help when I'm finished. It's just too bad that I still can't decide what's the best license for the code. Maybe not use a license at all and use CC0. Or just send it to the public domain. Right now I still can't find any license that gives total safety for the software and freedom for the users. Maybe there's none.
Quote:
Originally Posted by ntubski
The libc I/O routines do buffering in order to avoid multiple system calls. The output of this program is usually pretty short, there is only probably only a single system I/O call.
|
Now I guess that's the end of the line. Perhaps there's nothing more I can do to make the code faster.
By the way I made some comparisons with my original code and a version of the code where I made modifications that includes 'p[-1]' and 'token = p + 1'.
The gcc compiler (verson 4.1.2) generates the same output from both codes when optimization flag is enabled.
But when the outputs are not optimized, the modified version appeared to be faster. Both still have the same number of instructions and both also use increment statements (INCL and INC) but there was a difference in size and the targets. The compiler understood my original code as something that implies direct modification to the memory (like INCL) and not something that uses the registers first. With this form, it's probably slower right? It was probably quicker if the compiler did not recognize p + 1 as "INC AX".
You guys are right. Maybe I really shouldn't over-underestimate compilers.
P[-1] by the way also appears to yield faster code. ntubski and johnsfine was right all along. I'll change my code with those and have it finalized. Thanks for the help guys. I'll take a note of all the things you posted here.
And by the way I still can't understand some of johnsfine's statements. I know the concept of caches but what are cache misses and what's a partial tail merge?
Maybe I'll study those when I have time and after I finish the project. The first working prototype of its first and maybe final protocol (at last after many attempts to make it complete and perfect according to its requirement) is soon to be finished. 
|
|
|
|
08-07-2009, 01:19 PM
|
#20
|
|
Senior Member
Registered: Dec 2007
Distribution: Mepis, Centos
Posts: 4,674
|
Quote:
Originally Posted by konsolebox
I know the concept of caches but what are cache misses
|
Any time the thing you want doesn't happen to be in the cache and must be brought in from slower memory.
Quote:
|
what's a partial tail merge?
|
Both compilers (advanced optimization switches) and compiler theory documents are inconsistent. But two related concepts use the term "tail merge".
1) Flow of control. If the last action before a point where flow of control joins is identical on both sides, the compiler can reduce code size and maybe cache misses by joining earlier than the source code indicates. This is most common and most obvious in the join after if/then/else.
Compare:
Code:
if (a) { foo(x,y,z); } else { p=something; foo(p,y,z); }
vs.
Code:
p=x; if (!a) { p=something; } foo(p,y,z);
If those two sequences have the same net effect, the programmer probably knows it, but the compiler might not understand later uses of p well enough to know it, but anyway probably can't make that transformation.
In the first case, the last operation before joining is calling foo() so the compiler could join it early, but the benefits are tiny and an optimizer would be unlikely to guess joining early was right.
Since parameters are normally set up right to left, there is a more obvious early join in
Code:
if (a) { foo(x,y,z); } else { p=something; foo(x,y,p); }
The compiler can join after setting up z or p and use common code to set up y and x and then call. That is something the compiler could optimize but the programmer couldn't.
Now consider
Code:
p=x; if (a) {foo(p,y,z);} else { p=something; foo(p,y,z);}
Now the compiler can see the whole potential merge easily and can even see how to save a branch instruction by merging away the whole then part. But usually if you want to go that far to help the optimizer you would go the rest of the way and merge it yourself.
2) The other meaning of "tail merge" is when the last operation of a function (before returning) is to call a function with a compatible parameter stack and the same return value (turns out to be very common in advanced C++ programming). The compiler can change the code to jump to the other function instead of calling it, and thus save the extra layer of return.
Last edited by johnsfine; 08-07-2009 at 01:20 PM.
|
|
|
|
08-08-2009, 08:14 AM
|
#21
|
|
Senior Member
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,046
Original Poster
Rep: 
|
Got it. I now know what you mean. It seems that there are parts of the code that only compilers can optimize.
By the way, regarding my second switch statement, do you think it's better if I just do something like:
Code:
} else if (dotsCount && dotsCount != 1) {
if (dotsCount == 2) {
...
} else {
...
}
}
or
Code:
} else if (dotsCount) {
if (dotsCount != 1) {
if (dotsCount == 2) {
...
} else {
...
}
}
}
?
Last edited by konsolebox; 08-08-2009 at 08:15 AM.
|
|
|
|
08-08-2009, 08:30 AM
|
#22
|
|
Senior Member
Registered: Dec 2007
Distribution: Mepis, Centos
Posts: 4,674
|
I don't feel like studying your source code enough to decide the optimal way to branch on dotsCount. But I doubt that either of the above would be better than just using a switch.
What's wrong with
Code:
} else if (dotsCount >= 2) {
if (dotsCount == 2) {
...
} else {
...
}
}
That is probably better than a switch. The optimizer usually figures out in that construct that one of the two comparisons is redundant so you have two branches based on one comparison.
|
|
|
|
08-08-2009, 08:42 AM
|
#23
|
|
Senior Member
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,046
Original Poster
Rep: 
|
Thanks for figuring that out. I was always focusing on something like 'n |/&/^ m' so i forgot simple things like > or >=.
|
|
|
|
08-08-2009, 08:02 PM
|
#24
|
|
Senior Member
Registered: Nov 2005
Distribution: Debian
Posts: 2,017
|
If we're still in the micro-optimization game, then I would suggest optimizing for the common case: directories which are not "." or "..". This way for most of the time the code doesn't have to look out for dots, just slash and end of string. EDIT: And by putting a '/' at the end of workBuffer a lot of checks for end of string can be removed. By the way, I uncovered a bug when I tried this out:
Code:
workBufferSize += cwdLength + 1; // +1 is needed for the '/' between cwd and sourceArg
Code:
// ensure that workBuffer ends in "/", to reduce checks for '\0'
// exercise for reader: rewrite so realloc isn't needed
endWithSlash = (sourceArg[sourceArgSize-2] == '/');
if (!endWithSlash) {
workBufferSize += 1;
workBuffer = realloc(workBuffer, workBufferSize);
workBuffer[workBufferSize-2] = '/';
workBuffer[workBufferSize-1] = '\0';
}
t = 0;
for (p = workBuffer + 1; *p; ++p) {
if (p[0] == '.') {
if (p[1] == '/') {
// "." means same dir
p += 1;
continue;
} else if (p[1] == '.') {
if (p[2] == '/') {
// ".." means go up one dir
if (t > 0) --t;
p += 2;
continue;
}
}
} else if (p[0] == '/') {
continue;
}
// not "." or ".."
CHECK_TOKENS(t);
tokens[t++] = p;
do {
++p;
} while (p[0] != '/');
p[0] = '\0';
}
for (i = 0; i < t; i++) {
putchar('/');
fputs(tokens[i], stdout);
}
if (endWithSlash) putchar('/');
else if (t == 0) fputs("/.", stdout);
putchar('\n');
Last edited by ntubski; 08-08-2009 at 09:56 PM.
Reason: more hax ;)
|
|
|
|
08-10-2009, 02:18 AM
|
#25
|
|
Senior Member
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,046
Original Poster
Rep: 
|
Thanks for the fix. That was a bug indeed. I'll take a look at the new code as well. Later.
|
|
|
|
08-10-2009, 03:20 AM
|
#26
|
|
Senior Member
Registered: May 2005
Posts: 4,392
|
Quote:
Originally Posted by konsolebox
Thanks for the fix. That was a bug indeed. I'll take a look at the new code as well. Later.
|
For example, Perl has a bunch of modules for path manipulation.
Your performance bottleneck is not "C", it's your shell script.
|
|
|
|
08-10-2009, 03:53 AM
|
#27
|
|
Senior Member
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,046
Original Poster
Rep: 
|
Quote:
Originally Posted by Sergei Steshenko
Your performance bottleneck is not "C", it's your shell script.
|
I already know that before I posted the code in this thread. I just want to finish the code optimized and leave it like that forever and never be bothered to play around with it again. The program is not only useful for my current script. It can also be used in my other scripts and many other things. I just want to make it fast that's all.
Some folks say that I'm wasting so much time to this code but actually I only created the code in an hour or two and posted it. I also took time making replies to this thread and analyzing the code but it was fun anyway.
I spent most of the time on my scripts and not on this program.
Now regarding my shell script, I think it is already in its most optimized form but I know that I can never be sure with it so I also plan to post it later but the script is part of a project and I still can't find a proper license for it. There's nothing much in the project but I just want to be careful.
Regardless, thanks for your reminders.
Last edited by konsolebox; 08-10-2009 at 04:02 AM.
|
|
|
|
08-10-2009, 06:02 AM
|
#28
|
|
Senior Member
Registered: May 2005
Posts: 4,392
|
Quote:
Originally Posted by konsolebox
I already know that before I posted the code in this thread. I just want to finish the code optimized and leave it like that forever and never be bothered to play around with it again. The program is not only useful for my current script. It can also be used in my other scripts and many other things. I just want to make it fast that's all.
Some folks say that I'm wasting so much time to this code but actually I only created the code in an hour or two and posted it. I also took time making replies to this thread and analyzing the code but it was fun anyway.
I spent most of the time on my scripts and not on this program.
Now regarding my shell script, I think it is already in its most optimized form but I know that I can never be sure with it so I also plan to post it later but the script is part of a project and I still can't find a proper license for it. There's nothing much in the project but I just want to be careful.
Regardless, thanks for your reminders.
|
And why is the script a shell one in the first place ?
|
|
|
|
08-10-2009, 07:15 AM
|
#29
|
|
Senior Member
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,046
Original Poster
Rep: 
|
Quote:
|
Originally Posted by Sergei Steshenko
And why is the script a shell one in the first place ?
|
It's a helper script for shell scripts.
|
|
|
|
08-10-2009, 07:47 AM
|
#30
|
|
Senior Member
Registered: May 2005
Posts: 4,392
|
Quote:
Originally Posted by konsolebox
It's a helper script for shell scripts.
|
Is it source'd or called ?
|
|
|
|
| Thread Tools |
Search this Thread |
|
|
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
All times are GMT -5. The time now is 09:19 AM.
|
|
LinuxQuestions.org is looking for people interested in writing
Editorials, Articles, Reviews, and more. If you'd like to contribute
content, let us know.
|
Latest Threads
LQ News
|
|