I've been asked to modify the code for the cat program below so that it preserves whitespace and does not segfault when a line is longer than the buffer. As written, the program produces correct output (albeit without whitespace). However, when I change the format from %s to %c (see the comment in the code),
the program outputs:
�ֿ,���̷ֿz �ֿ,���̷ֿz �ֿ,���̷ֿz
This occurs on my two home Ubuntu builds. On the version of Debian that I am supposed to get this to run on, the output is correct. Interestingly, removing the fprintf(".....%s....",....argv[i]) call in an if-branch that is never taken somehow fixes the problem on my Ubuntu builds. Any thoughts?
Code:
#include <stdio.h>
#include <stdlib.h>
#define SUCCESS 0
#define E_PARAM 1
int main(int argc, char **argv)
{
    int i, numread;
    FILE *in;
    char buf[100];
    if(argc==0)
    {
        fprintf(stderr, "ERROR: Not enough parameters.\n");
        fprintf(stderr, "Syntax: %s [file1] [file2] ... [fileN]\n", argv[0]);
        exit(E_PARAM);
    }
    for(i=1;i<argc;i++)
    {
        in = fopen(argv[i], "rt");
        if(in==NULL)
            fprintf(stderr, "\n%s: %s: No such file or directory\n", argv[0], argv[i]);
        /* Comment the line above and instead use to make it work:
        fprintf(stderr, "\n%s: Invalid file or directory\n", argv[0]);
        */
        else while(!feof(in))
        {
            numread=fscanf(in, "%s", buf); // CHANGE %s --> %c to BREAK
            if(numread>0 && numread != EOF)
                fprintf(stdout, buf);
        }
    }
    exit(SUCCESS);
}
Worse still, the program above, tweaked as explained so that it runs on my Ubuntu builds, produces gibberish when I run it on Debian.
I have attached the second part of this project (an attempt at optimizing the program by using lower-level I/O functions). It runs fine on the Ubuntu builds but produces gibberish on the Debian build that I will be evaluated on. I also find that my processing times are highly variable and generally higher than those of the high-level scanf/printf version. Is this because my test machine is virtualized? Is there an optimal buffer size, a way to avoid looping over the buffer to zero it, or a way to be more efficient in general?
Code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#define SUCCESS 0
#define ERR 1
#define BUFSIZE 32
int main(int argc, char **argv)
{
    int i,c;
    char buf[BUFSIZE] = "";
    if(argc==1)
    {
        fprintf(stderr, "ERROR: Not enough parameters.\n");
        fprintf(stderr, "Syntax: %s [file1] [file2] ... [fileN]\n", argv[0]);
        exit(ERR);
    }
    for(i=1;i<argc;i++)
    {
        int filedes = open(argv[i],O_RDONLY,0);
        if (filedes == -1) {
            fprintf(stderr, "\n%s: %s: No such file or directory\n", argv[0], argv[i]);
            exit(ERR);
        }
        struct stat statbuf;
        fstat(filedes,&statbuf);
        size_t offset = 0;
        while (1) {
            int x = pread(filedes, buf, sizeof(buf), offset);
            if ( x == 0 ) break;
            offset += x;
            x = write(1,buf,sizeof(buf));
            for (c = 0 ; c < BUFSIZE ; c++) {
                buf[c] = 0;
            }
        }
        close(filedes);
    }
    exit(SUCCESS);
}
Thanks in advance,
Neel
**EDIT**
The second program also produces gibberish when I use putchar instead of write. Why does this depend on the Linux build? I've tried both running the same binaries on each system and compiling them natively. Either way, I'm in programming babel.
In normal use argc is never zero: if no arguments are given, argc is 1, so the check should be argc==1. For the error handling in the case that fopen fails, see below. Furthermore, the streams are never closed; use fclose(3) for this.
This code is strange by itself. fscanf(in,"%s",buf) is very dangerous because you cannot know how long a word will be, and a word longer than the buffer overflows it. Changing to %c reads exactly one character (whitespace included) and stores it in buf[0]. But in that case the string is not terminated with 0, so fprintf does not know where it ends and outputs whatever is in memory until it encounters a 0. That is the gibberish you observe.
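As a rough illustration of a stdio loop that avoids both problems (no unbounded %s, no unterminated string), here is a minimal sketch. The helper name copy_stream and the 4096-byte buffer are assumptions made for this example and are not part of the original program. The loop tests the return value of fread rather than feof, and writes exactly as many bytes as were read.

```c
#include <stdio.h>

/* Copy every byte from `in` to `out`, whitespace included.
 * Returns the number of bytes copied, or -1 on a read or write error.
 * copy_stream is a hypothetical helper name used only for this sketch. */
long copy_stream(FILE *in, FILE *out)
{
    char buf[4096];   /* arbitrary size chosen for the sketch */
    size_t n;
    long total = 0;

    /* fread never stores more than sizeof buf bytes, so long lines cannot overflow. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        /* Write exactly the n bytes that were read, not the whole buffer. */
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        total += (long)n;
    }
    return ferror(in) ? -1 : total;
}
```

In a cat-like main, each file opened with fopen would be passed as `in` with stdout as `out`, followed by fclose.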
To your second program:
Code:
x = write(1,buf,sizeof(buf));
This will write sizeof(buf) bytes to stdout, but you cannot be sure that the buffer holds that many bytes. pread returns the number of bytes read; use that return value as the size argument to write. Otherwise the garbage behind the bytes actually read is written to stdout too, which is the gibberish you observe.
Furthermore, there is no need to call pread. Just use read, because you do not need to seek explicitly (read advances the file offset automatically). Also handle the case where read/pread fails (x<0).
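Putting these points together, a read/write copy loop might look like the sketch below. The helper name copy_fd and the 4096-byte buffer are assumptions for illustration, not part of the original program. It also retries on EINTR and loops on short writes, which goes slightly beyond what was said above.

```c
#include <errno.h>
#include <unistd.h>

/* Copy everything from in_fd to out_fd.
 * Returns 0 on success, -1 on error (errno is left set by read/write).
 * copy_fd is a hypothetical helper name used only for this sketch. */
int copy_fd(int in_fd, int out_fd)
{
    char buf[4096];   /* arbitrary size chosen for the sketch */
    ssize_t nread;

    for (;;) {
        nread = read(in_fd, buf, sizeof buf);
        if (nread == 0)
            return 0;              /* end of file */
        if (nread < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;             /* real read error */
        }

        /* Write only the nread bytes just read. write may accept fewer
         * bytes than requested, so loop until all of them are out. */
        ssize_t done = 0;
        while (done < nread) {
            ssize_t nw = write(out_fd, buf + done, (size_t)(nread - done));
            if (nw < 0) {
                if (errno == EINTR)
                    continue;
                return -1;         /* real write error */
            }
            done += nw;
        }
    }
}
```

Because only the bytes returned by read are written, the buffer never needs to be zeroed between iterations.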
Code:
for (c = 0 ; c < BUFSIZE ; c++) {
buf[c] = 0;
}
These lines are totally unnecessary.
Code:
if (filedes == -1) {
fprintf(stderr, "\n%s: %s: No such file or directory\n", argv[0], argv[i]);
exit(ERR);
}
If open fails, why do you assume that the file does not exist? Better use perror(3) or strerror(3) to get the real reason for the failure.
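A minimal sketch of that idea, using strerror(errno) to report the actual cause. The helper name open_for_reading and the message layout are assumptions made for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Open `path` read-only. On failure, print the real reason to stderr
 * and return -1 with errno preserved.
 * open_for_reading is a hypothetical helper name used only for this sketch. */
int open_for_reading(const char *prog, const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        int saved = errno;   /* fprintf may change errno, so keep a copy */
        /* strerror turns the error code into text such as
         * "No such file or directory" or "Permission denied". */
        fprintf(stderr, "%s: %s: %s\n", prog, path, strerror(saved));
        errno = saved;
    }
    return fd;
}
```

perror(path) would print a similar message with less control over the format.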
Quote:
Originally Posted by irmin
This code is strange by itself. fscanf(in,"%s",buf) is very dangerous because you cannot know how long a word will be, and a word longer than the buffer overflows it. Changing to %c reads exactly one character (whitespace included) and stores it in buf[0]. But in that case the string is not terminated with 0, so fprintf does not know where it ends and outputs whatever is in memory until it encounters a 0. That is the gibberish you observe.
That is the first program, which I was asked to improve. It runs correctly on my instructor's/university's computers. Moreover, it runs correctly on mine if I remove the error messages in the preceding if branch. I tried printf-ing only buf[0], but that threw a segfault.
Quote:
Originally Posted by irmin
To your second program:
Code:
x = write(1,buf,sizeof(buf));
This will write sizeof(buf) bytes to stdout, but you cannot be sure that the buffer holds that many bytes. pread returns the number of bytes read; use that return value as the size argument to write. Otherwise the garbage behind the bytes actually read is written to stdout too, which is the gibberish you observe.
Furthermore, there is no need to call pread. Just use read, because you do not need to seek explicitly (read advances the file offset automatically). Also handle the case where read/pread fails (x<0).
Code:
for (c = 0 ; c < BUFSIZE ; c++) {
buf[c] = 0;
}
These lines are totally unnecessary.
This code runs correctly on my computers (whereas the first program didn't once I changed %s to %c). The unnecessary for loop is used to zero out the buffer, so that if it is not filled by pread, it does not print unnecessary garbage. As such, it outputs correctly on my computers.
However, each program outputs only gibberish when run on the other set of computers (mine versus my university's). The only consistent difference between the two is that mine are running Ubuntu and the school's are running Debian.
The most inexplicable phenomenon is that when I remove "%s",argv[i] from the "No such file or directory" error, the first program runs as expected on my personal computers. On the Debian machines it runs either way.
The second program as written runs correctly on the Ubuntu machines but outputs complete gibberish on the Debian ones.
In all cases, gibberish means that nothing is output except the odd diamonds (which an octal dump reports as EOT, end-of-transmission markers).
Thank you for your quick reply and numerous corrections to my code. The ones I did not mention make sense to me, and I will make the appropriate changes.
Sorry for not specifying. This is a lab for my computer science course. The task was to find errors in the first program (explain why it can't deal with white space and why it segfaults), then modify the program to do these things (replacing %s with %c works in the CS laboratory), finally write a lower-level implementation to be more efficient than the original. Cat does work for my purposes, but the exercise is to give us a better understanding of C.
Some more testing revealed that a buffer size of 16 or 32 characters works best. I'm sure that number will be significantly higher once I remove the for-loop that zeros out the array. But at the moment, if I don't empty the array the last buffer is doubled in the output. I will try to replace pread by read as soon as I can catch a bit of sleep.
Quote:
The unnecessary for loop is used to zero out the buffer, so that if it is not filled by pread, it does not print unnecessary garbage. As such, it outputs correctly on my computers.
So you think that zeroing out the buffer keeps those bytes from being printed on the screen? The zeros are still written to the terminal. Under Ubuntu the terminal seems to ignore the zeros, but under Debian it seems to interpret them as EOT. I think the problem is solved by changing the number of bytes you write to the number of bytes actually read.
Quote:
Originally Posted by irmin
So you think that zeroing out the buffer keeps those bytes from being printed on the screen? The zeros are still written to the terminal. Under Ubuntu the terminal seems to ignore the zeros, but under Debian it seems to interpret them as EOT. I think the problem is solved by changing the number of bytes you write to the number of bytes actually read.
That makes sense and is easy enough to test. Since zeroing it out worked on Ubuntu, I assumed that write/printf/etc. interpreted 0 as \0. I'll do as you suggested.
Compiling in g++ works on both systems.
The -Wall -Wextra flags did not work on the Ubuntu system for getting Program #1 to compile.
Somehow recompiling got the programs to work on the remote box (Debian). I made no changes to the programs; they just work. I'm not going to question it; maybe I did something stupid while tired.
So the final question stands:
Why does commenting out the printf(stderr,"%s",argv[i]) allow me to fprintf(stdout,"%c",buff) in Ubuntu, while leaving it in (even though it is in an irrelevant for-loop that exits afterwards) causes fprintf to produce gibberish?
Again, thanks for the quick replies. Somehow with no real changes my lab is in acceptable state to be turned in. Your suggestions were very instructional. I'm just really puzzled as to why removing that argv[i] string is so destructive on both my ubuntu builds.
Quote:
That makes sense and is easy enough to test. Since zeroing it out worked on Ubuntu, I assumed that write/printf/etc. interpreted 0 as \0. I'll do as you suggested.
write does not treat a 0 byte specially; it writes it like any other byte. If it skipped zeros, you could never write binary data to a file correctly.
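To make this concrete, here is a small sketch that pushes a buffer containing zero bytes through a pipe and reports how many bytes come out the other end. The helper name bytes_through_pipe is an assumption for this example, and it assumes the data is small enough to fit in the pipe's buffer.

```c
#include <unistd.h>

/* Write `len` bytes of `data` into a pipe, then read them back into `out`.
 * Returns the number of bytes read back, or -1 on error.
 * bytes_through_pipe is a hypothetical helper name used only for this sketch;
 * it assumes `len` is small enough that the write cannot block. */
ssize_t bytes_through_pipe(const char *data, size_t len, char *out, size_t outsz)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    ssize_t nw = write(fds[1], data, len);   /* zero bytes are written like any others */
    close(fds[1]);

    ssize_t nr = (nw < 0) ? -1 : read(fds[0], out, outsz);
    close(fds[0]);
    return nr;
}
```

Passing the four bytes 'a', 0, 0, 'b' yields four bytes on the reading side, zeros included, so zero padding sent to a terminal is real output even when it happens to be invisible.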
Quote:
Why does commenting out the printf(stderr,"%s",argv[i]) allow me to fprintf(stdout,"%c",buff) in Ubuntu, while leaving it in (even though it is in an irrelevant for-loop that exits afterwards) causes fprintf to produce gibberish?
If you mean fprintf(stderr,"%s",argv[i]) instead of printf(stderr,"%s",argv[i]), then it should have no side effects.
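I cannot find a variable named buff in your source code. But if you mean buf, then fprintf(stdout,"%c",buff) is the wrong way to call it. The correct way would be:
fprintf(stdout,"%c",*buf) or fprintf(stdout,"%c",buf[0]). Otherwise the address of buf is passed where a character is expected, and typically one byte of that address is printed, which is garbage. Since that address depends on the stack layout, unrelated changes elsewhere in the program may well change what gets printed.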
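As an illustration of the %c variant done consistently on both sides, here is a minimal sketch that reads into a single char and prints that char. The helper name echo_chars is an assumption for this example and is not taken from the original program.

```c
#include <stdio.h>

/* Copy `in` to `out` one character at a time using the %c conversion.
 * Returns the number of characters copied.
 * echo_chars is a hypothetical helper name used only for this sketch. */
long echo_chars(FILE *in, FILE *out)
{
    char ch;          /* a single char, not an array */
    long count = 0;

    /* fscanf returns 1 for each character read; %c does not skip
     * whitespace, so spaces and newlines are preserved. */
    while (fscanf(in, "%c", &ch) == 1) {
        fprintf(out, "%c", ch);   /* pass the character itself, not a pointer */
        count++;
    }
    return count;
}
```

This is simple but slow for large files, since every character costs two formatted I/O calls.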