Programming
This forum is for all programming questions. The question does not have to be directly related to Linux, and any language is fair game.
That code is slightly slower, probably because you're doing extra math inside the loop. That's why I prefer to do the math before the loop. Technically, the compiler should be able to detect the invariant, hoist it out, and avoid the slight overhead.
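The loop-invariant point can be illustrated with a toy example (the function names here are mine, not from the thread):

```c
#include <stddef.h>

/* Invariant math recomputed on every iteration. */
size_t sum_scaled_slow(const int *a, size_t n, int rows, int cols)
{
    size_t sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += (size_t)a[i] * (size_t)(rows * cols); /* rows*cols each pass */
    return sum;
}

/* Same result, with the multiplication hoisted out of the loop. */
size_t sum_scaled_fast(const int *a, size_t n, int rows, int cols)
{
    size_t scale = (size_t)rows * (size_t)cols;      /* computed once */
    size_t sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += (size_t)a[i] * scale;
    return sum;
}
```

Any optimizing compiler will likely generate identical code for both, which is why benchmarking at -O2 is the only way to know whether the hand-hoisting matters.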
I think your test cases are too small to make real comparisons. I assume you're not concerned with saving yourself 0.1 seconds of personal time, so why not run some tests that take several minutes or an hour?
Quote:
Originally Posted by mina86
This will fail if the file is very big, say 1T.
This really depends on pointer size and resource limits on address space. You definitely can't always count on being allowed 1TB of address space. I prefer using mmap, but the real problem here is that there's no good reason to prevent the program from reading from a descriptor that corresponds to a pipe or socket (or to a file on a filesystem without mmap support).
As an example, piping a PRNG into this program (slight overhead from generating the random numbers):
100 MiB = 0.177s
1000 MiB = 1.708s
10000 MiB = 17.451s
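For the regular-file case, the mmap route mentioned above might look something like this (count_bytes_mmap is my name for the helper, not anything from the thread; real code would fall back to a plain read() loop whenever it returns -1):

```c
#include <sys/mman.h>
#include <sys/stat.h>
#include <stdint.h>
#include <stddef.h>
#include <unistd.h>

/* Histogram a regular file via mmap into count[], which must have 256
 * zero-initialized slots. Returns 0 on success, -1 on failure (e.g. fd
 * is a pipe or socket, or the filesystem lacks mmap support), in which
 * case the caller should fall back to a read() loop. */
int count_bytes_mmap(int fd, size_t count[256])
{
    struct stat st;
    if (fstat(fd, &st) < 0 || !S_ISREG(st.st_mode))
        return -1;                          /* not a regular file */
    size_t len = (size_t)st.st_size;
    if (len == 0)
        return 0;                           /* nothing to count */
    uint8_t *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        return -1;
    for (const uint8_t *cur = p, *end = p + len; cur < end; )
        ++count[*cur++];
    munmap(p, len);
    return 0;
}
```

Note that a huge file can still fail to map on a 32-bit machine even when S_ISREG succeeds, so the fallback path is not optional.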
That pretty much concludes my attempts to optimize this code. I think it is the best it can be, while remaining portable (no assembler code). This is completely solved now.
I'm sure the postincrement operations are optimized out, but just in case, try using preincrement. You could also directly increment a pointer to the buffer instead of using an index, e.g.
Code:
for (uint8_t *current = buffer, *end = buffer + fsize; current < end;)
{
    ++count[*current++];   /* tally this byte, then advance */
}
Lastly, trying to load the entire file at once isn't a good idea, since you could inadvertently DoS yourself, and it won't work if you want to read from a pipe, tty, or socket (i.e. what stdin is most of the time).
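A descriptor-agnostic version that works on pipes, ttys, and sockets as well as regular files could be sketched like this (count_bytes is my name for it, and the 64 KiB buffer is an arbitrary choice):

```c
#include <unistd.h>
#include <stdint.h>
#include <stddef.h>

/* Count byte frequencies from any readable descriptor; count[] must
 * have 256 zero-initialized slots. Returns the total number of bytes
 * read, or (size_t)-1 on read error. */
size_t count_bytes(int fd, size_t count[256])
{
    uint8_t buf[65536];
    size_t total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n == 0)
            break;               /* end of input */
        if (n < 0)
            return (size_t)-1;   /* real code should retry on EINTR */
        for (ssize_t i = 0; i < n; ++i)
            ++count[buf[i]];
        total += (size_t)n;
    }
    return total;
}
```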
I think the repeated system calls behind printf are what's really making it slow. Perhaps using multiple sprintf calls into a buffer first, then a single fwrite to stdout, would make it faster.
Edit: Oh, sorry, perhaps it won't really be helpful with just a small number of lines (256). You could consider itoa(), which is not standard, since it skips parsing the format string. Writing your own itoa() function would be even better; you just have to be careful. Something that returns the length of the generated string would also be nice, since it tells you where to write next in the buffer. This obviously isn't guaranteed portable, and output may differ on some architectures or platforms.
Last edited by konsolebox; 08-03-2013 at 03:33 AM.
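The buffered-output idea above might be sketched like this (u32_to_buf and print_counts are my names, and the sketch assumes each count fits in 32 bits):

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write n as decimal digits into out and return the length written,
 * so the caller knows where the next write should start. */
size_t u32_to_buf(uint32_t n, char *out)
{
    char tmp[10];                /* 2^32 - 1 has at most 10 digits */
    size_t len = 0;
    do {
        tmp[len++] = (char)('0' + n % 10);
        n /= 10;
    } while (n != 0);
    for (size_t i = 0; i < len; ++i)   /* digits came out reversed */
        out[i] = tmp[len - 1 - i];
    return len;
}

/* Format all 256 counters into one buffer, then emit a single fwrite. */
void print_counts(const size_t count[256])
{
    char buf[256 * 32];          /* generous upper bound per line */
    size_t pos = 0;
    for (int i = 0; i < 256; ++i) {
        pos += u32_to_buf((uint32_t)i, buf + pos);
        buf[pos++] = ' ';
        pos += u32_to_buf((uint32_t)count[i], buf + pos);
        buf[pos++] = '\n';
    }
    fwrite(buf, 1, pos, stdout);
}
```

Returning the length from the conversion routine is what makes the single-buffer approach work without any strlen calls.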
The output duration is going to be negligible with data sizes large enough to be important. Disk seek times and hardware interrupts will wash out all of that optimization.
Why are you using stdio.h for input? You're just adding unnecessary cycles to the input loop. Also, why not fstat the file to get the optimal block size (st_blksize) for input?
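The st_blksize suggestion could be sketched like this (pick_block_size is my name, and the 64 KiB fallback is an arbitrary choice for when the hint is missing):

```c
#include <sys/stat.h>
#include <stddef.h>

/* Pick a read-buffer size from the filesystem's preferred I/O size,
 * falling back to 64 KiB when fstat fails or the hint is implausible. */
size_t pick_block_size(int fd)
{
    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_blksize > 0)
        return (size_t)st.st_blksize;
    return 65536;
}
```

In practice, reading in multiples of st_blksize mostly matters for unbuffered read() loops; stdio already sizes its internal buffer from the same hint.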
You should try your program with a huge file that's actually on the filesystem. If your process is running at less than 100% CPU, then there's a good chance you're needlessly optimizing.