LinuxQuestions.org
Old 03-07-2014, 02:47 AM   #1
jack.sully
Member
 
Registered: Jul 2012
Posts: 38

Rep: Reputation: Disabled
Removing junk characters through C program


Hi,
I am trying to remove junk characters from a file with a C program, but the code is not working as intended: nothing is written to the output file.
Here is the code.
I entered the junk characters in junk_file by typing numeric values > 255.

-bash-4.1# cat number.c
#include <stdio.h>

main()
{
FILE *fp,*fp1;
fp=fopen("junk_file","r");
char ch;
fp1=fopen("output.c","w");

while((ch=getc(fp))!=EOF)
{
if(ch < '0' || ch > '255') /*unget all the characters beyond the ascii bounds */
ungetc(ch,fp);
else
putc(ch,fp1); /*get rest of the characters in the output file */
}
fclose(fp);
fclose(fp1);
return(0);
}

-bash-4.1# cat junk_file
¦+¦ÃtÃ

i#include<¦-§



Any suggestions?
 
Old 03-07-2014, 04:01 AM   #2
AnanthaP
Member
 
Registered: Jul 2004
Location: Chennai, India
Posts: 952

Rep: Reputation: 217
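(1) char and int can be freely mixed, BUT if you mean the numeric values 0 and 255, then you should remove the quotes around them: '0' is the character zero (ASCII 48), and '255' is a multi-character constant, not the number 255.
(2) The so-called junk characters are those < 32 or those > 127.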
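A minimal sketch of that range check, assuming "junk" means anything outside printable ASCII plus the newline. The name is_wanted is hypothetical, and treating 127 (DEL) as junk is an assumption of this sketch rather than something stated in the thread. The argument is an int and the bounds are plain numbers, with no quotes.

```c
/* Hypothetical helper: returns 1 for bytes worth keeping, 0 for junk.
   The argument is an int, the type getc() returns, so values 128..255
   are compared as they are instead of wrapping to negative numbers. */
int is_wanted(int c)
{
    if (c == '\n')              /* keep line breaks */
        return 1;
    return c >= 32 && c <= 126; /* printable ASCII, space through '~' */
}
```

In a read loop this would be used as if(is_wanted(ch)) putc(ch,fp1); with ch declared as int.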
OK
 
Old 03-07-2014, 04:14 AM   #3
jack.sully
Member
 
Registered: Jul 2012
Posts: 38

Original Poster
Rep: Reputation: Disabled
I have edited the code:

-bash-4.1# cat number.c
#include <stdio.h>

main()
{
FILE *fp,*fp1;
fp=fopen("junk_file","r");
char ch;
fp1=fopen("output.c","w");

while((ch=getc(fp))!=EOF)
{
if(ch < 32 || ch > 127) /*unget all the characters beyond the ascii bounds */
ungetc(ch,fp);
else
putc(ch,fp1); /*get rest of the characters in the output file */
}
fclose(fp);
fclose(fp1);
return(0);
}

The code is still not working. Is there any way to keep them from appearing in the output file?
 
Old 03-08-2014, 01:08 AM   #4
AnanthaP
Member
 
Registered: Jul 2004
Location: Chennai, India
Posts: 952

Rep: Reputation: 217
AFAIR, ungetc only pushes the character back onto the stream; it doesn't skip it, so the next getc returns the same character again.

So try this:
Quote:
{
if(ch >= 32 && ch <= 127)
putc(ch,fp1); /*put the required characters into the output file */
}
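The push-back behaviour can be checked with a small sketch like the one below. The function name reads_same_byte_again is hypothetical; it only uses getc and ungetc from the standard library.

```c
#include <stdio.h>

/* Reads one byte, pushes it back with ungetc(), and reads again.
   Returns 1 if both reads produced the same byte, which is why a
   getc/ungetc loop spins forever on the first junk byte it meets. */
int reads_same_byte_again(FILE *fp)
{
    int first = getc(fp);
    if (first == EOF)
        return 0;
    if (ungetc(first, fp) == EOF)   /* push the byte back onto the stream */
        return 0;
    int second = getc(fp);          /* returns the pushed-back byte */
    return first == second;
}
```

After the call the stream has moved forward by exactly one byte, so the pushed-back byte was re-read, not skipped.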
OK
 
Old 03-08-2014, 02:31 AM   #5
s.verma
Member
 
Registered: Oct 2013
Distribution: Debian Sid, Gentoo, Arch, Debian
Posts: 186
Blog Entries: 4

Rep: Reputation: 25
Dear jack.sully,

Quote:
Originally Posted by jack.sully
#include <stdio.h>

main()
{
FILE *fp,*fp1;
fp=fopen("junk_file","r");
char ch;
fp1=fopen("output.c","w");

while((ch=getc(fp))!=EOF)
{
if(ch < '0' || ch > '255') /*unget all the characters beyond the ascii bounds */
ungetc(ch,fp);
else
putc(ch,fp1); /*get rest of the characters in the output file */
}
fclose(fp);
fclose(fp1);
return(0);
}
First, why does your original program use the ungetc(int, FILE *) function? I ran and tested it, and it creates an infinite loop. It never terminates, most likely because it reads a character, pushes it back, and then reads the same character again, over and over.

Try using this code:
Code:
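#include <stdio.h>

int main(void)
{
FILE *fp,*fp1;
int ch; /* int, so EOF can be told apart from real bytes */
fp=fopen("junk_file","r");
fp1=fopen("output.c","w");
if(fp==NULL || fp1==NULL)
{
perror("fopen");
return(1);
}

while((ch=getc(fp))!=EOF)
{
if(ch > 0 && ch < 128) /*get desired ones: plain numbers, no quotes */
putc(ch,fp1); /* characters in the output file */
}
fclose(fp);
fclose(fp1);
return(0);
}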
Also, compiling your original code gives this warning:
number.c: In function ‘main’:
number.c:12:21: warning: multi-character character constant [-Wmultichar]
if(ch < '0' || ch > '255') /*unget all the characters beyond the ascii bounds */


So I agree with AnanthaP that you have to remove the quotes. Write 255, not '255'.

I also agree with AnanthaP's replacement code.

Last edited by s.verma; 03-08-2014 at 02:36 AM. Reason: Ref. recent post of AnanthaP.
 
Old 03-09-2014, 08:42 AM   #6
jack.sully
Member
 
Registered: Jul 2012
Posts: 38

Original Poster
Rep: Reputation: Disabled
Thanks a lot guys.

The code is working now (even though I didn't edit the code to fix the warning).
 
Old 03-09-2014, 10:14 AM   #7
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-35
Posts: 5,313

Rep: Reputation: 918
For shiggles, here's my ASCII scraper:
Code:
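#include <stdio.h>

int main(int argc, char *argv[])
{
int c;
FILE * fstream;

if(argc != 2)
{
  fprintf(stderr, "usage: %s file\n", argv[0]);
  return 1;
}
fstream = fopen(argv[1], "r");
if(fstream == NULL)
{
  perror(argv[1]);
  return 1;
}
c = fgetc(fstream);

while(c != EOF)
{
  if(c == 10)                                     /* newline: keep */
   printf("%c", c);
  if((c >= 0 && c <= 9) || (c >= 11 && c <= 31))  /* control characters: blank out */
   printf(" ");
  if(c >= 127)                                    /* DEL and non-ASCII bytes: blank out */
   printf(" ");
  if((c >= 32 && c <= 126) && (c != 59))          /* printable ASCII except ';': keep */
   printf("%c", c);
  if(c == 59)                                     /* ';' becomes a line break */
  {
   printf("\n");
  }

  c = fgetc(fstream);
}
fclose(fstream);
return 0;
}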
 
Old 03-10-2014, 12:07 AM   #8
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Instead of using "printf" (which is slow) why not use fputc?

And instead of all the "ifs" used, why not a switch?

Code:
...
while (EOF != (c = fgetc(stream))) {
    switch (c) {
    case 10:
    case 32 ... 58:     /* case ranges are a GNU C extension; keep the spaces around ... */
    case 60 ... 126:
        fputc(c, stdout);
        break;

    case 59:
        fputc('\n',stdout);
        break;

    default:
        fputc(' ',stdout);
    }
}
And watch out for those "ch > 255" tests - they are invalid.

ch is declared a "char", and here that means it is an 8-bit signed value (whether plain char is signed is implementation-defined; with gcc on x86 Linux it is). No signed character can hold 255: signed characters go from -128 to 127, so your test of "ch > 255" is never true.

The "c" in the example preceding mine is declared an int, which allows testing for EOF (a negative value, usually -1), and all other values returned are then between 0 and 255...
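A short sketch of why the int matters, assuming a stream that contains the byte 0xFF. The name count_bytes is hypothetical. With an int the loop counts that byte like any other. With a signed char the same byte would become -1 and look like EOF, and with an unsigned char the comparison with EOF would never succeed.

```c
#include <stdio.h>

/* Counts every byte in the stream. c is an int, so the byte 0xFF (255)
   and EOF (a negative value) stay distinct and the loop ends only at
   the real end of the file. */
long count_bytes(FILE *fp)
{
    long n = 0;
    int c;

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```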

Last edited by jpollard; 03-10-2014 at 12:15 AM.
 
Old 03-10-2014, 10:06 AM   #9
jack.sully
Member
 
Registered: Jul 2012
Posts: 38

Original Poster
Rep: Reputation: Disabled
-bash-4.1# cat junk_clear_new.c

#include <stdio.h>
main(int argc,char *argv[])
{
FILE *fp,*fp1;

if(argc!=2)
fprintf(stderr,"usage: %s file \n",argv[0]);

else
fp=fopen(argv[1],"r");

if(fp==NULL)
perror(argv[1]);
char ch;
fp1=fopen("output.c","w");

while((ch=getc(fp))!=EOF)
{
if(ch > 0 && ch < 255) /*get desired ones */
putc(ch,fp1); /* characters in the output file */
else if(ch=='\n')
printf("\n");
}
fclose(fp);
fclose(fp1);
return(0);
}
-bash-4.1#
-bash-4.1# cat try_junk
i#include<¦-§aa
t¦¦ y u
igg¦+qerté
-bash-4.1#
-bash-4.1# cat output.c
i#include<-aa
t y u
igg+qert
-bash-4.1#

Mission accomplished!
 
Old 03-10-2014, 12:13 PM   #10
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by jack.sully View Post
Mission accomplished!
Not really. You still have errors in the code...
 
Old 03-11-2014, 09:52 AM   #11
jack.sully
Member
 
Registered: Jul 2012
Posts: 38

Original Poster
Rep: Reputation: Disabled
Yes, you are quite right, jpollard.
The problem with my code was that, after removing the junk characters from the original file, I was not able to preserve their empty space in the output file.

-bash-4.1# cat try_junk
i#include<¦-§aa
t¦¦ y u
igg¦+qerté
-bash-4.1#
-bash-4.1# cat output.c
i#include<-aa
t y u
igg+qert
-bash-4.1#

The correct code is yours:

-bash-4.1# cat latest_junk_clear.c

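#include <stdio.h>

int main(int argc, char *argv[])
{
int ch;
FILE * fp;

if(argc != 2)
{
fprintf(stderr, "usage: %s file\n", argv[0]);
return 1;
}
fp= fopen(argv[1], "r");
if(fp == NULL)
{
perror(argv[1]);
return 1;
}
ch = fgetc(fp);

while(ch != EOF)
{
if(ch == 10) /* newline: keep */
printf("%c", ch);
if((ch >= 0 && ch <= 9) || (ch >= 11 && ch <= 31)) /* control characters: blank out */
printf(" ");
if(ch >= 127) /* DEL and non-ASCII bytes: blank out */
printf(" ");
if((ch >= 32 && ch <= 126) && (ch != 59)) /* printable ASCII except ';': keep */
printf("%c", ch);
if(ch == 59) /* ';' becomes a line break */
{
printf("\n");
}

ch = fgetc(fp);
}
fclose(fp);
return 0;
}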

-bash-4.1# cat try_junk
i#include<¦-§aa
t¦¦ y u
igg¦+qerté
-bash-4.1#

-bash-4.1# ./latest_junk try_junk
i#include< - aa
t y u
igg +qert
-bash-4.1#

Now it's fine.
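For reference, the same idea (replace each junk byte with a space so the layout is preserved) can be sketched as a reusable function. This is only an illustration under a few assumptions: the name scrub_stream is hypothetical, it uses fputc instead of printf as suggested earlier in the thread, and unlike the program above it leaves ';' alone instead of turning it into a line break.

```c
#include <stdio.h>

/* Copies in to out. Newlines and printable ASCII (32..126) pass through;
   every other byte becomes one space, so column positions survive.
   Returns the number of bytes that were replaced. */
long scrub_stream(FILE *in, FILE *out)
{
    long replaced = 0;
    int c;                      /* int, not char, so EOF stays distinct */

    while ((c = fgetc(in)) != EOF) {
        if (c == '\n' || (c >= 32 && c <= 126)) {
            fputc(c, out);
        } else {
            fputc(' ', out);
            replaced++;
        }
    }
    return replaced;
}
```

Calling scrub_stream(fp, stdout) on an opened input file prints the cleaned text to the terminal, and the return value tells how many junk bytes were blanked out.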

Thanks a lot
 