LinuxQuestions.org
Old 10-07-2019, 10:07 AM   #1
jsbjsb001
Senior Member
 
Registered: Mar 2009
Location: Earth, unfortunately...
Distribution: Currently: OpenMandriva. Previously: openSUSE, PCLinuxOS, CentOS, among others over the years.
Posts: 3,881

Rep: Reputation: 2063
What is so special about 34 or more spaces when reading text files with C code?


So basically I've written a little program that uses fgets() in a while loop to read each line of a text file. The while loop also has a nested for loop that scans through the array holding each string read by fgets(), looking for the hash symbol, which indicates a comment. When the nested for loop finds one, it sets my bool flag to "false", because the printf() statement in my while loop is only executed if that flag is "true" (which is what I intended).

I will admit that I got probably every error under the sun trying to get this program working, even just to print whatever doesn't have the hash symbol in front of it - I even got a "Bus Error" (don't know how I managed that, but that's a new one on me). Anyhow, I got it working and fixed the problems I was having after hours of sitting there trying to figure out how to get the bloody thing working. But there's one small problem, as usual...

Even if I have the hash symbol in front of something that should be ignored as a comment, if there are more than 33 spaces (not tabs, spaces) between the start of the line and the actual string, the string doesn't get ignored and still gets displayed. But as long as there aren't any more than 33 spaces between the start of the line and the string, it's fine, and the string gets ignored as intended. I've tried searching for an answer, but either this is a very unusual problem or I don't understand the solution; either way, I have no idea what the problem might even be.

Here's my code (I've got the commented out for loop there because I was checking if it was actually reading the spaces, and the actual string, and on both counts, it was);

Code:
// a program to skip to the next line in file if the comment char (#) is encountered

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

int main(void) {
    
   // char c;
    char content[373];
    char filename[10] = "test.txt";
    bool str = true;
    int i = 0;
    
    FILE *testfile; 

    if (( testfile = fopen(filename, "r")) == NULL ) {
       fprintf(stderr, "Input file cannot be read, aborting.\nDoes it exist?\n");
       return 1;
    }         
    
    while ( fgets(content, sizeof(content), testfile) != NULL ) {
                 //  for ( int j = 0; (c = fgetc(testfile)) != EOF; j++ ) {                        
                //          printf("content = %c\n", content[j]);
               //    }
           for ( i = 0; i < content[i]; i++ ) {                        
               if ( content[i] == '#' ) { 
                  str = false;                            
                  printf("found #\n");
                  continue;                            
               }                                              
           }
           if (str) {
              str = true;                     
              printf("Uncommented lines in file: %s\n", content);           
           }                            
           str = true;     
    }        
    
    fclose(testfile);
    
    return 0;
}
This is the text file that the program is reading from, where there are more than 33 spaces between the start of the "#comment 4" line and that string itself;

Code:
line 1
  #comment 1
  line 3 
                           # comment 2 
line 5   
# comment 3
                                 #comment 4
                  #comment 5
This is the output, which other than the "#comment 4" line, is correct and intended;

Code:
james@jamespc: practice> ./skip_line_if_comment 
Uncommented lines in file: line 1

found #
Uncommented lines in file:   line 3 

found #
Uncommented lines in file: line 5   

found #
Uncommented lines in file:                                  #comment 4

found #
But if there are no more than 33 spaces from the start of the line to the actual string in the same text file (test.txt), then there are no issues, even with spaces between the hash symbol and the string to be commented out and ignored;

Code:
james@jamespc: practice> ./skip_line_if_comment 
Uncommented lines in file: line 1

found #
Uncommented lines in file:   line 3 

found #
Uncommented lines in file: line 5   

found #
found #
found #
Any help would be good, thanks.

James
 
Old 10-07-2019, 10:34 AM   #2
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,258
Blog Entries: 24

Rep: Reputation: 4193
This does not look like it will do anything useful...

Code:
           for ( i = 0; i < content[i]; i++ ) {                        
               if ( content[i] == '#' ) { 
                  str = false;                            
                  printf("found #\n");
                  continue;                            
               }                                              
           }
I think you mean the length of the string that was read into the buffer, don't you?

This also causes it to fail at 32 spaces - do you see why? It is an important clue!

I would also recommend using a #define or variable to set the length of content, then use that same value to test the length of the buffer, like:

Code:
#define CONTENT_LEN 373
...
char content[CONTENT_LEN];
...
while ( fgets(content, CONTENT_LEN, testfile) != NULL ) {
... and as the limit of your test loop.
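A minimal sketch of a loop limited by the string contents rather than by `i < content[i]`, assuming the goal is to look at every character that fgets() stored. The helper name find_hash is made up for illustration and does not appear in the thread.

```c
/* Hypothetical helper (not from the thread): return the index of the first
 * '#' in a null-terminated line, or -1 if the line contains none.
 * The loop stops at the terminating '\0' that fgets() stores, so the
 * character values themselves never decide how far the scan goes. */
int find_hash(const char *content)
{
    for (int i = 0; content[i] != '\0'; i++) {
        if (content[i] == '#') {
            return i;   /* found it: no need to look further */
        }
    }
    return -1;          /* reached the end of the string without a '#' */
}
```

With this kind of bound, the number of leading spaces no longer changes whether the '#' is reached.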

Last edited by astrogeek; 10-07-2019 at 10:42 AM.
 
2 members found this post helpful.
Old 10-07-2019, 10:43 AM   #3
jsbjsb001
Senior Member
 
Registered: Mar 2009
Location: Earth, unfortunately...
Distribution: Currently: OpenMandriva. Previously: openSUSE, PCLinuxOS, CentOS, among others over the years.
Posts: 3,881

Original Poster
Rep: Reputation: 2063
Quote:
Originally Posted by astrogeek View Post
This does not look like it will do anything useful...

Code:
           for ( i = 0; i < content[i]; i++ ) {                        
               if ( content[i] == '#' ) { 
                  str = false;                            
                  printf("found #\n");
                  continue;                            
               }                                              
           }
I think you mean the length of the string that was read into the buffer, don't you?
...
Yeah, that's what I was trying to do - I couldn't think of how else to write the condition for the for loop though, that's why I wrote it like that. While I was sorta wondering if that might have had something to do with why it fails after 33 spaces, no, I'm not sure why though.

I'll add a define in there for the array size like you said. Thanks for your help astro!
 
Old 10-07-2019, 10:49 AM   #4
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,258
Blog Entries: 24

Rep: Reputation: 4193
Quote:
Originally Posted by jsbjsb001 View Post
While I was sorta wondering if that might have had something to do with why it fails after 33 spaces, no, I'm not sure why though.
Think of what your original code is comparing to (hint: ASCII) and what happens when there are more than 32 of them.

If you use a #define to set the buffer length, you will also need to modify your comparison test to check for the octothorpe or the terminating null character ('\0') added by fgets(). Think carefully about what must happen in each case - they are not the same.
 
1 members found this post helpful.
Old 10-08-2019, 04:29 AM   #5
jsbjsb001
Senior Member
 
Registered: Mar 2009
Location: Earth, unfortunately...
Distribution: Currently: OpenMandriva. Previously: openSUSE, PCLinuxOS, CentOS, among others over the years.
Posts: 3,881

Original Poster
Rep: Reputation: 2063
I think I sorta know what you mean about what happens when the loop gets to 32 based on this. But I'm honestly just not sure what you mean about the second thing you said about modifying the comparison test (I assume you mean the if statement you quoted before?), or why what must happen would be different depending on whether it was the hash sign or null. The only thing I can think of is, if it encounters the hash then it's got to go to the next line in the file and scan it. And the same for the null byte since fgets() puts it on the end of the line, so I once again just don't know.
 
Old 10-08-2019, 11:27 AM   #6
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (15 current), Slack15, Ubuntu studio, MX Linux, FreeBSD 13.1, WIn10
Posts: 10,342

Rep: Reputation: 2242
I do not understand why you're putting the string through another loop to search it for the char.
strchr() searches the string for the wanted character and returns NULL if it is not found.
Code:
// a program to skip to the next line in file if the comment char (#) is encountered

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc , char **argv) {

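    if (argc < 2)
    {
	fprintf(stderr, "No file name given.\nI need a filename.\n");
	return EXIT_FAILURE;
    }
    char *filename = strdup(argv[1]);
    if (filename == NULL) {
	fprintf(stderr, "Out of memory.\n");
	return EXIT_FAILURE;
    }
    FILE *testfile; 

   //Try open file
   if (( testfile = fopen(filename, "r")) == NULL ) {
         fprintf(stderr, "Input file cannot be read, aborting.\nDoes it exist?\n");
         free(filename);   // release the copy before bailing out
         return EXIT_FAILURE;
    }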
    char content[373];
       
    //Read file
    while ( fgets(content, sizeof(content), testfile) != NULL ) {
	if (strchr(content, '#' ) != NULL ) { 
	    printf("found #\n");
	}
    }        
    
    fclose(testfile);
    free(filename);
    return 0;
}

Last edited by BW-userx; 10-08-2019 at 11:32 AM.
 
1 members found this post helpful.
Old 10-08-2019, 11:32 AM   #7
rnturn
Senior Member
 
Registered: Jan 2003
Location: Illinois (SW Chicago 'burbs)
Distribution: openSUSE, Raspbian, Slackware. Previous: MacOS, Red Hat, Coherent, Consensys SVR4.2, Tru64, Solaris
Posts: 2,795

Rep: Reputation: 550
Quote:
Originally Posted by jsbjsb001 View Post

[snip]

Even if I have the hash symbol in front of something that should be ignored as a comment, if there are more than 33 spaces (not tabs, spaces) between the start of the line and the actual string, the string doesn't get ignored and still gets displayed. But as long as there aren't any more than 33 spaces between the start of the line and the string, it's fine, and the string gets ignored as intended. I've tried searching for an answer, but either this is a very unusual problem or I don't understand the solution; either way, I have no idea what the problem might even be.

[snip]
C doesn't care about the pound sign. It doesn't know that it indicates "comment". Your code needs to check for that character and either a.) ignore the entire line if it's the first non-whitespace character in the record or b.) more generally, split the record on the first pound sign and only process the left-hand portion---everything else after the pound sign was a comment.

Similarly, if I were writing some C to process TeX/LaTeX (or PostScript) source files, I'd have to deal with percent signs in a similar manner. Or exclamation points in X11 resource files. If you want to allow comments in your data files, you'll need to write code to recognize and deal with them.
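Option (b) above, splitting the record at the first pound sign, might be sketched as follows. This is only an illustration: strip_comment is a made-up name, and it assumes the line is a writable, null-terminated buffer such as the one fgets() fills.

```c
#include <string.h>

/* Hypothetical helper: cut the line at the first '#', so that only the
 * left-hand portion remains. The buffer is modified in place. */
void strip_comment(char *line)
{
    char *hash = strchr(line, '#');   /* NULL if the line has no '#' */
    if (hash != NULL) {
        *hash = '\0';                 /* everything from '#' onward is dropped */
    }
}
```

Note that this also drops the trailing newline of a line that contains a comment, so a caller that prints the result may want to add one back.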

HTH...
 
Old 10-08-2019, 01:16 PM   #8
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,258
Blog Entries: 24

Rep: Reputation: 4193
Quote:
Originally Posted by jsbjsb001 View Post
I think I sorta know what you mean about what happens when the loop gets to 32 based on this.
Try to move your understanding from the "sorta know" condition into the "Aha! Got it!" state. You must learn to do that instinctively, a few hundred times a day, to be an effective programmer. Instead of saying "I sorta know", try to explain it clearly in the fewest words, to yourself first, then in your reply. A single simple sentence should do.

To make the mental connection clear I would suggest drawing it out with little squares, each containing the ASCII value of the byte that was read into each position for a given line. Then run the loop in your head with i being the index (number) of each square...

Code:
for ( i = 0; i < content[i]; i++ )
And write out the values of i and content[i] for each trip through the loop...

Code:
i   content[i]
______________
0 < x
1 < x
2 < x
... where x is the ASCII number contained in the corresponding content square.

Make that a regular exercise: every time you begin to type "I sorta know..." in a reply, stop right there and develop the habit of exploring the thing further until you can type, "OK, I understand that! It works like this...". When you are writing real programs "sorta know" will block your path and lead you down endless blind alleys! Train your brain to trigger on "sorta know" and work out your own example to change that to "OK, I understand" before asking others. Then, if you can't make that work you have a solid question to ask others! (This is not intended to discourage you from asking for help, but to help you develop the necessary programming skill of changing uncertainty into certainty on your own - it is good exercise.)

Quote:
Originally Posted by jsbjsb001 View Post
But I'm honestly just not sure what you mean about the second thing you said about modifying the comparison test (I assume you mean the if statement you quoted before?), or why what must happen would be different depending on whether it was the hash sign or null. The only thing I can think of is, if it encounters the hash then it's got to go to the next line in the file and scan it. And the same for the null byte since fgets() puts it on the end of the line, so I once again just don't know.
Well, think of what can happen while you are testing each character for the octothorpe:

* The character might be '#' - what do you want to do in that case?
* The character might be some other non-space, non-# character - what to do in this case?
* The character might be the terminating null character ('\0') - what to do in that case?

Each case probably requires a different action, so you need to craft your comparison test to detect and correctly handle each possibility. But to define those actions you really need to specify just how you actually want it to work.

For example, do you want to define a comment as beginning with the first # in a line, or only lines in which # is the first non-whitespace character? Specify first, then write code to meet the specification.
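One way the separate cases could look in code, as a sketch only. It assumes the second specification (a comment is a line whose first non-whitespace character is '#'); the enum and the function name classify_line are invented for this illustration.

```c
#include <ctype.h>

/* Hypothetical result type: what kind of line was read. */
enum line_kind { LINE_BLANK, LINE_COMMENT, LINE_TEXT };

/* Walk the line one character at a time and decide on the first
 * character that settles the question. */
enum line_kind classify_line(const char *content)
{
    for (int i = 0; content[i] != '\0'; i++) {
        if (content[i] == '#') {
            return LINE_COMMENT;   /* '#' seen before any other text */
        }
        if (!isspace((unsigned char)content[i])) {
            return LINE_TEXT;      /* some other non-space character came first */
        }
        /* whitespace: nothing decided yet, keep scanning */
    }
    return LINE_BLANK;             /* hit '\0' having seen only whitespace */
}
```

Each case ends the scan in a different way, which is why a single `if ( content[i] == '#' )` test inside the loop is not enough on its own.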

Last edited by astrogeek; 10-08-2019 at 01:45 PM.
 
1 members found this post helpful.
Old 10-10-2019, 01:19 AM   #9
jsbjsb001
Senior Member
 
Registered: Mar 2009
Location: Earth, unfortunately...
Distribution: Currently: OpenMandriva. Previously: openSUSE, PCLinuxOS, CentOS, among others over the years.
Posts: 3,881

Original Poster
Rep: Reputation: 2063
Thank you BW. While your solution does work, my program shouldn't ignore a line as a comment if the hash is NOT preceding the string. So for example;

Code:
a line#
should NOT be ignored as a comment, but;

Code:
#a line
SHOULD be ignored as a comment. But the strchr() function still ignores "a line#" as a comment, because the hash is still there.
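The difference between the two tests can be sketched like this. It is an illustration under the assumption that only spaces and tabs may come before the hash; both function names are made up.

```c
#include <stdbool.h>
#include <string.h>

/* The strchr() approach: true if '#' appears anywhere in the line. */
bool has_hash_anywhere(const char *line)
{
    return strchr(line, '#') != NULL;
}

/* True only if '#' is the first character after any leading spaces/tabs.
 * strspn() returns how many leading characters belong to the given set. */
bool starts_with_hash(const char *line)
{
    size_t lead = strspn(line, " \t");
    return line[lead] == '#';
}
```

For "a line#" the first function returns true and the second returns false; for "#a line" both return true.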

Quote:
Originally Posted by rnturn View Post
C doesn't care about the pound sign. It doesn't know that it indicates "comment".
Yes, I know that, that's why I tried writing a program that recognizes the hash symbol, and therefore ignores what follows it on the same line.

Quote:
Your code needs to check for that character and either
Yes, that's what the for loop was setup for.

Quote:
a.) ignore the entire line if it's the first non-whitespace character in the record or b.) more generally, split the record on the first pound sign and only process the left-hand portion---everything else after the pound sign was a comment.
...
I'm not sure how to do that, nor exactly what you mean. So your comments weren't really helpful I'm sorry.

I tried what you said astrogeek, but I'm still not clear on exactly why once the loop gets past 32 it fails. The only thing I can tell is the obvious in that, it fails once it gets past 32.

I tried to modify the if statement, I tried adding more if statements to check for anything other than a hash symbol, but I've only made it worse. I just don't know how to express in code what you said about coding a different action depending on whether it sees a hash symbol, or null, or whatever other character. And the more things I try, the more confusing it gets, so I really don't know what to do at this point.

Here's my code as it stands now;

Code:
// a program to skip to the next line in file if the comment char (#) is encountered

#define  CONTENT_LEN 373
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

int main(void) {    
       
    char content[CONTENT_LEN];
    char filename[10] = "test.txt";
    bool str = false;
    int i = 0;
    
    FILE *testfile; 

    if ( ( testfile = fopen(filename, "r")) == NULL ) {
         fprintf(stderr, "Input file cannot be read, aborting.\nDoes it exist?\n");
         return 1;
    }         
    
    while ( fgets(content, CONTENT_LEN, testfile) != NULL ) {           
            printf("i = %i content[i] = content[%c]\n", i, content[i]);            
            for ( i = 0; i < content[i]; i++ ) {       
                 if ( content[i] != '#' ) {
                    ++content[i];
                    continue;
                 }
                 if ( content[i] == '#' ) {                                                    
                    str = false;
                    printf("found # content[%c] i = %i\n", content[i], i);
                    continue;                            
                 }                           
            }
            if (str) {                    
               str = true;   
               printf("Uncommented lines in file: %s\n", content);           
            }                            
            str = true;     
    }        
    
    fclose(testfile);
    
    return 0;
}
Here's what it does, which is even worse;

Code:
james@jamespc: practice> ./skip_line_if_comment
i = 0 content[i] = content[l]
i = 7 content[i] = content[e]
found # content[#] i = 2
i = 12 content[i] = content[
]
Uncommented lines in file: !!mjof!4!

i = 10 content[i] = content[ ]
found # content[#] i = 27
i = 36 content[i] = content[ ]
Uncommented lines in file: mjof!6!!!

i = 10 content[i] = content[3]
found # content[#] i = 0
i = 11 content[i] = content[ ]
Uncommented lines in file: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!       #comment 4

i = 32 content[i] = content[ ]
found # content[#] i = 18
i = 28 content[i] = content[
]
found # content[#] i = 25
 
Old 10-10-2019, 01:50 AM   #10
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,258
Blog Entries: 24

Rep: Reputation: 4193
Hi jsbjsb001!

You still have not understood what your loop is doing, so making changes inside it will only compound the problems, as you have discovered.

Consider my original post about your loop parameters...

Quote:
Originally Posted by astrogeek View Post
This does not look like it will do anything useful...

Code:
           for ( i = 0; i < content[i]; i++ ) {                        
               if ( content[i] == '#' ) { 
                  str = false;                            
                  printf("found #\n");
                  continue;                            
               }                                              
           }
I think you mean the length of the string that was read into the buffer, don't you?
...
You are comparing the index, i, to the ASCII value contained in the array location content[i]. You are still doing that in your current code. That makes no sense.

The point of my next post was to have you list out the value of the index, i, and the ASCII value of each array element, content[i], which should make the reason for the behavior after 32 spaces apparent. Your printf("i = %i content[i] = content[%c]\n", i, content[i]) statement for showing those values is not correct and is not even inside the loop, so it is not showing you anything useful. But let's not troubleshoot new code until we get the original working, so please do not try to fix it, just revert to your original code.

Here is what you would see if you did it as I suggested:

Code:
i   content[i]
______________
0 < 32
1 < 32
2 < 32
...
30 < 32
31 < 32
32 < 32
...at which point you should see why your loop terminated after 32 spaces, whose ASCII value is 32.

That answers your original question, "What is so special about 34 or more spaces...".

But it also tells you that your loop parameters are just wrong - you should not be comparing to the ASCII value of each location!
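The table above can be reproduced with a small sketch that runs the original loop condition and reports where it gives up. The name buggy_scan_stop is made up for illustration, and the sketch assumes an ASCII system where a space has the value 32.

```c
/* Hypothetical helper: run the original condition `i < content[i]` and
 * return the index at which it first becomes false. Characters at or
 * beyond that index are never examined by the original loop. */
int buggy_scan_stop(const char *content)
{
    int i;
    for (i = 0; i < content[i]; i++) {
        /* nothing to do here: only the stopping point is of interest */
    }
    return i;
}
```

For a line that starts with 33 or more spaces, content[32] is still a space, so the test becomes 32 < 32 and the loop ends before any '#' further along is seen. The same condition also ends the scan early on other lines, for example when the newline (value 10) sits at index 10 or later.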

In your current code you are additionally incrementing each of those values to 33 during the loop - what is the character with ASCII value 33? Do you see why you are now printing all those '!'s?
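The effect of that increment can be checked in isolation. A small sketch, assuming an ASCII system; bump_all is a made-up name that applies the same `++content[i]` step to every character of a string.

```c
/* Hypothetical helper: add 1 to every character of a null-terminated
 * string, the way `++content[i]` does for one array element. */
void bump_all(char *s)
{
    for (; *s != '\0'; s++) {
        ++*s;   /* ' ' (32) becomes '!' (33), 'l' becomes 'm', and so on */
    }
}
```

Applied to "line", this yields "mjof", which matches the garbled text in the output posted earlier.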

But you should not be comparing i to content[i] in your loop parameters - that is still just wrong.

As I said in my first post:

Quote:
I think you mean the length of the string that was read into the buffer, don't you?
So, see if you can make that change in the original code.

The important point to get here is that you need to understand why your loop is working incorrectly first, then based on that make the necessary change to make it work correctly, and only then make changes within the loop.

In other words, do not start by making code changes - that gets you no points and adds confusion!

Start by analyzing how it is working in your original code, and why that is wrong, then make the single specific code change to fix that one thing first - for the win!

So, go back to your original code so we do not add one problem on top of another, and see if you can get the loop to not abort after 32 spaces, as I have indicated.

This is all good exercise - but only if you work your way through it by understanding! See what you come up with!

Last edited by astrogeek; 10-10-2019 at 02:07 AM.
 
1 members found this post helpful.
Old 10-10-2019, 08:20 AM   #11
jsbjsb001
Senior Member
 
Registered: Mar 2009
Location: Earth, unfortunately...
Distribution: Currently: OpenMandriva. Previously: openSUSE, PCLinuxOS, CentOS, among others over the years.
Posts: 3,881

Original Poster
Rep: Reputation: 2063
Ok, I've done what you said about reverting the code back to what it was before, and I've put the printf() statement in the right place this time so it prints out for the whole for loop.

While I can see what you mean about it stopping at 32 if there's just spaces; it's still just not obvious to me why it's stopping at a blank space. I'm sorry, I've looked at the output of the program, the code, ASCII table, but it just isn't obvious to me why it's stopping at a blank space.

Here's my code as it stands now;

Code:
#define  CONTENT_LEN 373
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

int main(void) {    
       
    char content[CONTENT_LEN];
    char filename[10] = "test.txt";
    bool str = false;
    int i = 0;
    
    FILE *testfile; 

    if ( ( testfile = fopen(filename, "r")) == NULL ) {
         fprintf(stderr, "Input file cannot be read, aborting.\nDoes it exist?\n");
         return 1;
    }         
    
    while ( fgets(content, CONTENT_LEN, testfile) != NULL ) {                           
          for ( i = 0; i < content[i]; i++ ) {
              printf("i = %i content[i] = content[%c]\n", i, content[i]);
              if ( content[i] == '#' ) { 
                 str = false;                            
                 printf("found #\n");
                 continue;                           
              }                                          
          }
              if (str) {                    
                 str = true;   
                 printf("Uncommented lines in file: %s\n", content);           
              }                            
              str = true;     
    }        
    
    fclose(testfile);
    
    return 0;
}
Here's the new output;

Code:
james@jamespc: practice> ./skip_line_if_comment
i = 0 content[i] = content[l]
i = 1 content[i] = content[i]
i = 2 content[i] = content[n]
i = 3 content[i] = content[e]
i = 4 content[i] = content[ ]
i = 5 content[i] = content[1]
i = 6 content[i] = content[
]
i = 0 content[i] = content[ ]
i = 1 content[i] = content[ ]
i = 2 content[i] = content[#]
found #
i = 3 content[i] = content[c]
i = 4 content[i] = content[o]
i = 5 content[i] = content[m]
i = 6 content[i] = content[m]
i = 7 content[i] = content[e]
i = 8 content[i] = content[n]
i = 9 content[i] = content[t]
i = 10 content[i] = content[ ]
i = 11 content[i] = content[1]
i = 0 content[i] = content[ ]
i = 1 content[i] = content[ ]
i = 2 content[i] = content[l]
i = 3 content[i] = content[i]
i = 4 content[i] = content[n]
i = 5 content[i] = content[e]
i = 6 content[i] = content[ ]
i = 7 content[i] = content[3]
i = 8 content[i] = content[ ]
i = 9 content[i] = content[
]
Uncommented lines in file:   line 3 

i = 0 content[i] = content[ ]
i = 1 content[i] = content[ ]
i = 2 content[i] = content[ ]
i = 3 content[i] = content[ ]
i = 4 content[i] = content[ ]
i = 5 content[i] = content[ ]
i = 6 content[i] = content[ ]
i = 7 content[i] = content[ ]
i = 8 content[i] = content[ ]
i = 9 content[i] = content[ ]
i = 10 content[i] = content[ ]
i = 11 content[i] = content[ ]
i = 12 content[i] = content[ ]
i = 13 content[i] = content[ ]
i = 14 content[i] = content[ ]
i = 15 content[i] = content[ ]
i = 16 content[i] = content[ ]
i = 17 content[i] = content[ ]
i = 18 content[i] = content[ ]
i = 19 content[i] = content[ ]
i = 20 content[i] = content[ ]
i = 21 content[i] = content[ ]
i = 22 content[i] = content[ ]
i = 23 content[i] = content[ ]
i = 24 content[i] = content[ ]
i = 25 content[i] = content[ ]
i = 26 content[i] = content[ ]
i = 27 content[i] = content[#]
found #
i = 28 content[i] = content[ ]
i = 29 content[i] = content[c]
i = 30 content[i] = content[o]
i = 31 content[i] = content[m]
i = 32 content[i] = content[m]
i = 33 content[i] = content[e]
i = 34 content[i] = content[n]
i = 35 content[i] = content[t]
i = 0 content[i] = content[l]
i = 1 content[i] = content[i]
i = 2 content[i] = content[n]
i = 3 content[i] = content[e]
i = 4 content[i] = content[ ]
i = 5 content[i] = content[5]
i = 6 content[i] = content[ ]
i = 7 content[i] = content[ ]
i = 8 content[i] = content[ ]
i = 9 content[i] = content[
]
Uncommented lines in file: line 5   

i = 0 content[i] = content[#]
found #
i = 1 content[i] = content[ ]
i = 2 content[i] = content[c]
i = 3 content[i] = content[o]
i = 4 content[i] = content[m]
i = 5 content[i] = content[m]
i = 6 content[i] = content[e]
i = 7 content[i] = content[n]
i = 8 content[i] = content[t]
i = 9 content[i] = content[ ]
i = 10 content[i] = content[3]
i = 0 content[i] = content[ ]
i = 1 content[i] = content[ ]
i = 2 content[i] = content[ ]
i = 3 content[i] = content[ ]
i = 4 content[i] = content[ ]
i = 5 content[i] = content[ ]
i = 6 content[i] = content[ ]
i = 7 content[i] = content[ ]
i = 8 content[i] = content[ ]
i = 9 content[i] = content[ ]
i = 10 content[i] = content[ ]
i = 11 content[i] = content[ ]
i = 12 content[i] = content[ ]
i = 13 content[i] = content[ ]
i = 14 content[i] = content[ ]
i = 15 content[i] = content[ ]
i = 16 content[i] = content[ ]
i = 17 content[i] = content[ ]
i = 18 content[i] = content[ ]
i = 19 content[i] = content[ ]
i = 20 content[i] = content[ ]
i = 21 content[i] = content[ ]
i = 22 content[i] = content[ ]
i = 23 content[i] = content[ ]
i = 24 content[i] = content[ ]
i = 25 content[i] = content[ ]
i = 26 content[i] = content[ ]
i = 27 content[i] = content[ ]
i = 28 content[i] = content[ ]
i = 29 content[i] = content[ ]
i = 30 content[i] = content[ ]
i = 31 content[i] = content[ ]
Uncommented lines in file:                                        #comment 4

i = 0 content[i] = content[ ]
i = 1 content[i] = content[ ]
i = 2 content[i] = content[ ]
i = 3 content[i] = content[ ]
i = 4 content[i] = content[ ]
i = 5 content[i] = content[ ]
i = 6 content[i] = content[ ]
i = 7 content[i] = content[ ]
i = 8 content[i] = content[ ]
i = 9 content[i] = content[ ]
i = 10 content[i] = content[ ]
i = 11 content[i] = content[ ]
i = 12 content[i] = content[ ]
i = 13 content[i] = content[ ]
i = 14 content[i] = content[ ]
i = 15 content[i] = content[ ]
i = 16 content[i] = content[ ]
i = 17 content[i] = content[ ]
i = 18 content[i] = content[#]
found #
i = 19 content[i] = content[c]
i = 20 content[i] = content[o]
i = 21 content[i] = content[m]
i = 22 content[i] = content[m]
i = 23 content[i] = content[e]
i = 24 content[i] = content[n]
i = 25 content[i] = content[t]
i = 26 content[i] = content[ ]
i = 27 content[i] = content[5]
i = 0 content[i] = content[ ]
i = 1 content[i] = content[ ]
i = 2 content[i] = content[ ]
i = 3 content[i] = content[ ]
i = 4 content[i] = content[ ]
i = 5 content[i] = content[ ]
i = 6 content[i] = content[ ]
i = 7 content[i] = content[ ]
i = 8 content[i] = content[ ]
i = 9 content[i] = content[ ]
i = 10 content[i] = content[ ]
i = 11 content[i] = content[ ]
i = 12 content[i] = content[ ]
i = 13 content[i] = content[ ]
i = 14 content[i] = content[ ]
i = 15 content[i] = content[ ]
i = 16 content[i] = content[ ]
i = 17 content[i] = content[ ]
i = 18 content[i] = content[l]
i = 19 content[i] = content[i]
i = 20 content[i] = content[n]
i = 21 content[i] = content[e]
i = 22 content[i] = content[ ]
i = 23 content[i] = content[ ]
i = 24 content[i] = content[9]
i = 25 content[i] = content[#]
found #
 
Old 10-10-2019, 09:42 AM   #12
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (15 current), Slack15, Ubuntu studio, MX Linux, FreeBSD 13.1, WIn10
Posts: 10,342

Rep: Reputation: 2242
Quote:
Originally Posted by jsbjsb001
.....

[CODE]// a program to skip to the next line in file if the comment char (#) is encounted


Any help would be good, thanks.

James
Searching a file for the # (pound) sign and then skipping to the next line is, to me, only what it looks like the program is doing on the front end. Behind the scenes the entire line still has to be read, so the program can find the end of the comment line and know where and when to "skip" to the next line.

I'd open the file, read in the contents, and look for the #. When one is found, look for the end of the line; on the next line, check whether there is a # or not, then do with that line whatever it is you want to do.

Steps:
1. Open the file.
2. Read a line.
3. If a # is found:
4. find the end of the line;
5. when found, skip that line and go to step 2 (repeat).
6. If no # is found, print the line to output and go to step 2 (repeat).

This would eliminate the 32-spaces issue.

That would be my first approach to this problem.


Code:
if ( true )
    ; // a lone semicolon is an empty statement, so nothing is done here
else
    printf("something\n");
A lone semicolon can be used as an empty statement when you do not want anything done within an if statement, just in case you did not know.


Code:
while ( fgets(content, CONTENT_LEN, testfile) != NULL ) {

    //find the start of comment
    if (strchr(content, '#') != NULL ) {
        //now find end of comment or end line
        size_t len = strlen(content);
        if ((len>0) && (content[len-1] == '\n')) {
            //skip line
            printf( "comment here\n");
        }
    }
    else //print non comment lines to stdout; fgets() keeps the '\n', so none is added
        printf("%s", content);
}
It looks like everyone is stuck on that silly for loop.

Code:
 while ( fgets(content, CONTENT_LEN, testfile) != NULL ) { 
       for ( int i = 0; i < strlen(content); i++ )
	{ 
	    //find # start
	    if /* ( */ (content[i] == '#') // && (content[strlen(content)-1] == '\n'))
	     { //kick it out of the for loop and into the while loop
		   break;
	     }
	     else //print all none comment lines, one char at a time.
		printf("%c",content[i]);
	}
		// or this one too works
		
	/*
          
         //find the start of comment
           if (strchr(content, '#') != NULL ) {
			//now find end of comment or end line
			size_t len = 0;
			len = strlen(content);
			if ((len>0) && (content[len-1] == '\n')) {
				//skip line
				//printf( "comment here\n");
				;
			}
		}
		else //print non comment lines to stdout
			printf("%s\n", content);
			*/
    }
I'm not thinking too hard about the pitfalls, if any. The loose logic for me is: if there is a #, that is the start of a comment on a line, so kick out of the for loop, go to the next line, and check again.
This does not think about or work in the what-if where a comment looks like this:
Code:
some code here #now a comment
some code here
some code here
#comment
It also does not compensate with a newline, which would keep the code lined up for readability and for use if it is put into another file without the comments.
The end result would be:
Code:
some code here some code here
some code here
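One way to avoid the joined lines in the result above is to cut each line at the # but keep its line ending. This is only a minimal sketch: the helper name strip_comment is hypothetical (it is not from the thread), and it assumes that a # anywhere on the line starts a comment.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (name not from the thread): cut `line` at the first
 * '#', keeping a trailing newline so the next line is not joined onto
 * this one. Returns true if anything other than whitespace is left. */
bool strip_comment(char *line)
{
    size_t len = strlen(line);
    bool had_newline = (len > 0 && line[len - 1] == '\n');

    char *hash = strchr(line, '#');
    if (hash != NULL) {
        /* Drop spaces and tabs between the code and the '#' */
        while (hash > line && (hash[-1] == ' ' || hash[-1] == '\t'))
            hash--;
        /* Put the newline back only if some code remains on the line */
        if (had_newline && hash > line)
            *hash++ = '\n';
        *hash = '\0';
    }

    /* strspn() counts the leading whitespace; if it reaches the end of
     * the string, the line is empty or whitespace only. */
    return line[strspn(line, " \t\n")] != '\0';
}
```

In the fgets() loop this would be used as: call strip_comment(content), and only when it returns true print the line with printf("%s", content).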

Last edited by BW-userx; 10-10-2019 at 11:59 AM.
 
Old 10-10-2019, 10:47 AM   #13
GazL
LQ Veteran
 
Registered: May 2008
Posts: 6,882

Rep: Reputation: 4988
Remember, characters have a numeric value inside a computer.

Does this help your understanding?
Code:
#include <stdlib.h>
#include <stdio.h>

int main()
{
    char string[] = { 'A', 'B', 'C', ' ', '\0' };

    printf("string[0] = %1$c = %1$d\n", string[0]);
    printf("string[1] = %1$c = %1$d\n", string[1]);
    printf("string[2] = %1$c = %1$d\n", string[2]);
    printf("string[3] = %1$c = %1$d\n", string[3]);
    printf("string[4] = %1$c = %1$d\n", string[4]);

    return EXIT_SUCCESS;
}
Now, if that helps, try converting that example to use a for loop instead of 5 separate printfs. Then you'll know how to change the for loop in your program.
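For anyone following along, one possible conversion to a for loop might look like the sketch below. The function name print_char_values is an assumption made for illustration, and the loop runs up to and including the terminating '\0' so that it prints the same five elements as the example above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print every element of `s`, including the
 * terminating '\0', as a character and as its numeric value.
 * Returns the number of elements printed. */
size_t print_char_values(const char *s)
{
    size_t count = strlen(s) + 1;   /* +1 to include the '\0' */

    for (size_t i = 0; i < count; i++)
        printf("string[%zu] = %c = %d\n", i, s[i], s[i]);

    return count;
}
```

Calling print_char_values("ABC ") prints one line per element, with the number each character is stored as shown after the second equals sign.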
 
2 members found this post helpful.
Old 10-10-2019, 12:35 PM   #14
rnturn
Senior Member
 
Registered: Jan 2003
Location: Illinois (SW Chicago 'burbs)
Distribution: openSUSE, Raspbian, Slackware. Previous: MacOS, Red Hat, Coherent, Consensys SVR4.2, Tru64, Solaris
Posts: 2,795

Rep: Reputation: 550
Quote:
Originally Posted by jsbjsb001
SHOULD be ignored as a comment. But the strchr() function still ignores "a line#" as a comment, because the hash is still there.
Yes, but detecting a "#" that is not the first non-space character in the record should trigger a call to strcpy/strncpy to copy the contents of the buffer up to (but not including) the "#" into a secondary buffer. Or... you could also replace the "#" with a null ('\0') and make use of the C library subroutines that know how to manipulate zero-terminated strings (what some of us call an ASCIIZ string... well, at least us old DECcies). Then you'd pass that string off to the other processing you're doing for records that did not contain a "#".
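
A rough sketch of both options described above, for illustration only. The function names truncate_at_hash and copy_before_hash are made up here and are not from the thread; the copy version uses strcspn() and memcpy() rather than strncpy() so that the result is always zero-terminated.

```c
#include <string.h>

/* Option 1 (hypothetical name): overwrite the first '#' with '\0' so the
 * string ends just before the comment. Returns the remaining length. */
size_t truncate_at_hash(char *record)
{
    char *hash = strchr(record, '#');
    if (hash != NULL)
        *hash = '\0';
    return strlen(record);
}

/* Option 2 (hypothetical name): copy everything before the first '#'
 * into a secondary buffer `dst` of size `dst_size`. The copy is cut
 * short if it would not fit, and is always zero-terminated.
 * Returns the number of characters copied. */
size_t copy_before_hash(char *dst, size_t dst_size, const char *src)
{
    size_t n = strcspn(src, "#");   /* length up to '#' or end of string */

    if (dst_size == 0)
        return 0;
    if (n >= dst_size)
        n = dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

Either way, what is left is an ordinary zero-terminated string that can be handed to the same processing used for records without a "#".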

HTH...
 
Old 10-10-2019, 12:55 PM   #15
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,258
Blog Entries: 24

Rep: Reputation: 4193
Please everyone, let's refrain from offering better ways of checking for comment lines and help jsbjsb001 to understand the behavior of his particular loop code, whether or not we think it is the best way to perform the task.

jsbjsb001, you are getting closer, but you still have not done the thing I asked so the output does not make sense to you.

Here is your original loop with your printf() which I have modified to work the way I intended, and one line added - a similar printf() after the loop exits.

Code:
          for ( i = 0; i < content[i]; i++ ) {
              printf("%i < %i (%c) ? %s\n", i, content[i], content[i], i<content[i] ? "True" : "False");
              if ( content[i] == '#' ) {
                 str = false;
                 printf("found #\n");
                 continue;
              }
          }
          //We want to see the comparison AFTER the loop has exited - in other words why it ended
          printf("%i < %i (%c) ? %s (loop exited)\n", i, content[i], content[i], i<content[i] ? "True" : "False");
The change to the printf() explicitly shows the comparison of the index i to the integer (ASCII) value of the content element, just as specified in your loop parameters. The printf() after the loop has exited appends a "loop exited" message so you can clearly see the end of one loop and the beginning of the next.

I also use a simple test file:

Code:
No comment
      #comment
                                    #long comment
I wrote a simple test file so it will produce less output.

Make those specific changes to your original code and see if the output makes sense.

Do you see now why the comparison is exiting after 32 spaces?

Please compile these changes, and no others, and think about what you are seeing in the output. If you understand what is happening, please try to explain it in the fewest words. If you do not understand what is happening, just ask for more hints.

It is very important you understand and debug this loop methodically and clearly without other distractions.
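
For reference, the same comparison can be isolated in a small sketch. The function names are hypothetical and the second function is just one possible alternative condition, not necessarily the fix intended in this thread. It assumes an ASCII system, where a space has the value 32 and '#' has the value 35.

```c
#include <stdbool.h>
#include <string.h>

/* The loop condition from the thread: the index i is compared with the
 * numeric value of content[i]. Returns the index at which the loop
 * stopped and reports whether a '#' was seen before that. */
int scan_with_original_condition(const char *content, bool *found_hash)
{
    int i;

    *found_hash = false;
    for (i = 0; i < content[i]; i++) {
        if (content[i] == '#')
            *found_hash = true;
    }
    return i;
}

/* One possible alternative (an assumption, not from the thread): keep
 * going until the terminating '\0', whatever the character values are. */
int scan_to_terminator(const char *content, bool *found_hash)
{
    int i;

    *found_hash = false;
    for (i = 0; content[i] != '\0'; i++) {
        if (content[i] == '#')
            *found_hash = true;
    }
    return i;
}
```

With 36 leading spaces, the first function stops at index 32, where the test 32 < 32 is false, so the '#' at index 36 is never reached. The second function runs to the end of the string and does find it.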
 
2 members found this post helpful.
  



Tags
fgets, strings

