LinuxQuestions.org
Old 10-29-2013, 05:40 AM   #1
techie_san778
Member
 
Registered: Jul 2011
Posts: 90

Rep: Reputation: Disabled
How to remove the hidden characters from the file?


Hello friends!

When I run the following command:
$ man kill > man_kill.txt
and view the file using cat, the file is displayed correctly on the screen.
But when I open it in vi (on Linux) or any text editor (on Windows), I see many hidden characters that make the file unreadable.
Sample output when I view the file in vi:
O^HOP^HPT^HTI^HIO^HON^HNS^HS
_^Hp_^Hi_^Hd...


How do I get rid of these characters?
 
Old 10-29-2013, 08:00 AM   #2
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 9,882
Blog Entries: 13

Rep: Reputation: 4930
You can edit the file, copy the bad sequences and globally replace them with nothing. I do that when I have a DOS file that shows a visible ^M on each line in the editor. I don't know whether all of those letters are interfering data, but ^H is the BACKSPACE character. Either way, whatever the sequence is, you're in an editor, so you can copy or yank the interfering string and then globally search and replace it out. I'm not well versed in search/replace in vi, but if you can use GNU Emacs and see the problem in that editor, I can describe how to copy and then replace there.
 
Old 10-29-2013, 08:12 AM   #3
michaelk
Moderator
 
Registered: Aug 2002
Posts: 25,721

Rep: Reputation: 5914
To convert a man page to text...
Code:
man kill | col -b > kill.txt
The col -b command removes the backspaces, keeping only the last character written to each column position.
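
To make the backspace handling concrete, here is a minimal sketch in C of the overstrike collapse, assuming plain single-byte ASCII text. The function name strip_overstrike is hypothetical, and this is not how col itself is implemented; it only shows why O^HO becomes O and _^Hp becomes p.

```c
#include <stddef.h>

/* Hypothetical helper (not part of col): collapse overstrike sequences
 * in place. "X\bX" (bold) and "_\bX" (underline) both become "X".
 * Assumes single-byte characters; returns the new length of the data. */
size_t strip_overstrike(char *buf, size_t len)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\b') {
            if (out > 0)
                out--;              /* backspace erases the byte kept last */
        } else {
            buf[out++] = buf[i];    /* ordinary byte: keep it */
        }
    }
    return out;
}
```

Calling it on the bytes of O^HOP^HP leaves OP in the buffer and returns 2. Because it works byte by byte, it would mangle overstruck multi-byte (UTF-8) characters, which is one reason to prefer col -b on real man output.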

Last edited by michaelk; 10-29-2013 at 08:14 AM.
 
1 members found this post helpful.
Old 10-29-2013, 08:17 AM   #4
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-35
Posts: 5,313

Rep: Reputation: 918
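I do this sometimes to get rid of garbage characters:
Code:
#include <stdio.h>

/* Copy the file named in argv[1] to stdout, keeping newlines and
   printable ASCII and replacing every other byte with a space. */
int main(int argc, char *argv[])
{
    int c;
    FILE *fstream;

    if (argc < 2 || (fstream = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }

    while ((c = fgetc(fstream)) != EOF) {
        if (c == '\n' || (c >= 32 && c <= 126))
            putchar(c);      /* newline or printable ASCII: keep */
        else
            putchar(' ');    /* control or non-ASCII byte: blank out */
    }

    fclose(fstream);
    return 0;
}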
 
Old 10-29-2013, 08:58 AM   #5
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,879

Rep: Reputation: 7317
You may also try isascii() or isprint():
printf("%c", isprint(c) ? c : ' ');
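
As a rough illustration of the isprint() approach, the sketch below blanks out every byte that is neither printable nor a newline. The function name blank_unprintable is made up for this example, and the result depends on the current locale (the default "C" locale treats only 0x20 to 0x7E as printable).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: replace every byte that is neither printable
 * nor a newline with a space. Works in place on len bytes. */
void blank_unprintable(char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        /* isprint() must be given an unsigned char value (or EOF);
           passing a negative plain char is undefined behaviour. */
        unsigned char c = (unsigned char)buf[i];

        if (c != '\n' && !isprint(c))
            buf[i] = ' ';
    }
}
```

Note that this turns a backspace into a space instead of removing the overstruck letter, so on man output it leaves the doubled letters in place; it suits stray garbage bytes better than nroff overstrikes.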
 
1 members found this post helpful.
Old 10-29-2013, 09:34 AM   #6
vmccord
Member
 
Registered: Jun 2012
Location: Topeka, KS
Distribution: Mostly AWS
Posts: 71
Blog Entries: 31

Rep: Reputation: Disabled
http://www.linuxquestions.org/questi...ectory-461400/
 
Old 10-30-2013, 12:42 AM   #7
techie_san778
Member
 
Registered: Jul 2011
Posts: 90

Original Poster
Rep: Reputation: Disabled
Thanks for your replies. One thing that puzzles me is how cat is able to display the file (man_kill.txt) perfectly. But when I used cat -v to view the file, the hidden characters were displayed.
 
Old 10-30-2013, 06:12 AM   #8
michaelk
Moderator
 
Registered: Aug 2002
Posts: 25,721

Rep: Reputation: 5914
From the cat man page.
Quote:
-v, --show-nonprinting
The terminal window, not the cat command, processes the hidden characters that are embedded in the output, i.e. line feeds, carriage returns, backspaces, etc. Once the -v option converts the hidden characters to a visible notation, they are displayed as regular text.
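
The conversion that -v performs can be sketched in C as follows. This is a simplified illustration of the ^ and M- notation, not the actual cat source, and the function name visible_byte is an assumption made for the example.

```c
#include <stddef.h>

/* Render one byte in the notation cat -v uses. dst needs room for
 * 5 chars ("M-^X" plus the terminating NUL). Returns the number of
 * characters written, not counting the NUL. */
size_t visible_byte(unsigned char c, char dst[5])
{
    size_t n = 0;

    if (c == '\n' || c == '\t') {
        dst[n++] = (char)c;              /* LF and TAB pass through */
    } else {
        if (c >= 128) {                  /* high bit set: "M-" prefix */
            dst[n++] = 'M';
            dst[n++] = '-';
            c -= 128;
        }
        if (c < 32) {                    /* control byte: ^ plus letter */
            dst[n++] = '^';
            dst[n++] = (char)(c + 64);
        } else if (c == 127) {           /* DEL */
            dst[n++] = '^';
            dst[n++] = '?';
        } else {
            dst[n++] = (char)c;          /* printable ASCII */
        }
    }
    dst[n] = '\0';
    return n;
}
```

With this mapping a backspace (byte 8) comes out as ^H, which is why the overstrikes become visible text under cat -v but act as cursor movements when the raw bytes reach the terminal.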
 
Old 10-30-2013, 07:05 AM   #9
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 9,882
Blog Entries: 13

Rep: Reputation: 4930
Many good answers. Your latest question raised an interesting thought, along the lines of what schneidz recommended:

Look at the source for cat itself. Check out the function cook_cat(); it is exactly what you'd want as a program function to show or hide undesirable characters.

Source for cat.c
 
1 members found this post helpful.
Old 10-30-2013, 11:46 PM   #10
techie_san778
Member
 
Registered: Jul 2011
Posts: 90

Original Poster
Rep: Reputation: Disabled
When I use col -b to filter out the backspaces, one sequence still appears at several places in the file when I run:
$ cat -v man_kill.txt
The sequence is M-bM-^@M-^X--M-bM-^@M-^Y. The actual text is --, and it is surrounded by two hidden characters: M-bM-^@M-^X and M-bM-^@M-^Y. How can I remove these characters?
 
Old 10-31-2013, 03:17 PM   #11
michaelk
Moderator
 
Registered: Aug 2002
Posts: 25,721

Rep: Reputation: 5914
I do not see any control characters when piping via col -b.
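
For reference, M-bM-^@M-^X and M-bM-^@M-^Y are cat -v notation for the bytes E2 80 98 and E2 80 99, which are the UTF-8 encodings of the curly single quotes U+2018 and U+2019. They are not control characters, and whether they show up probably depends on the locale man runs under (running man with an ASCII locale may avoid them, but that is an assumption about the setup). A minimal sketch of replacing them with a plain ASCII apostrophe, with the hypothetical function name ascii_quotes:

```c
#include <stddef.h>

/* Hypothetical helper: replace the UTF-8 curly single quotes
 * (E2 80 98 and E2 80 99) with an ASCII apostrophe, in place.
 * Returns the new length of the data. */
size_t ascii_quotes(unsigned char *buf, size_t len)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
        if (i + 2 < len && buf[i] == 0xE2 && buf[i + 1] == 0x80 &&
            (buf[i + 2] == 0x98 || buf[i + 2] == 0x99)) {
            buf[out++] = '\'';      /* three-byte quote becomes one byte */
            i += 2;                 /* skip the two continuation bytes */
        } else {
            buf[out++] = buf[i];    /* copy everything else unchanged */
        }
    }
    return out;
}
```

Applied to the eight bytes shown by cat -v above, it leaves the four bytes '--' in the buffer. Other non-ASCII characters a man page may contain (dashes, bullets and so on) are left untouched by this sketch.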
 
Old 11-04-2013, 10:48 AM   #12
normanlinux
Member
 
Registered: Apr 2013
Location: S.E. England
Distribution: Arch
Posts: 161

Rep: Reputation: Disabled
All of the above begs the question: why the 0x46 do you want to format a man page with man, save that to a text file and view it in an editor?

If you want a printable copy, use man -t and pipe that to print (or save it as a PS file and print it later).

On the other hand, you could always vi the (ungzipped) raw man pages and admire all of those lovely roff commands. Typesetting, incidentally, is what persuaded AT&T of the viability of Unix: the ability to typeset manuals, with version control through SCCS, saved them a lot of angst (and cash).
 
Old 11-08-2013, 05:19 AM   #13
techie_san778
Member
 
Registered: Jul 2011
Posts: 90

Original Poster
Rep: Reputation: Disabled
@normanlinux, I don't want to print the file or save it as a PS file either. I want to save it to a file that does not contain any control characters.
Is it possible to filter out the hidden characters with any roff commands (troff, nroff, etc.)? If so, please suggest how to do that.
 
Old 11-08-2013, 07:15 AM   #14
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 9,882
Blog Entries: 13

Rep: Reputation: 4930
I say write a program to do that. Save the output, control characters included, to a file, and use a self-written program to fix that file. See the example mentioned before: in Source for cat.c, look for the cook_cat() function.
 
Old 11-08-2013, 03:43 PM   #15
normanlinux
Member
 
Registered: Apr 2013
Location: S.E. England
Distribution: Arch
Posts: 161

Rep: Reputation: Disabled
techie_san778 says "i don't want to print or save the file as a ps either. I want to save it to a file that does not contain any control characters. "

This begs the question - why?

If we knew what you *really* wanted we could help you. The control characters - generally ^H, which is backspace - were a way of making older terminals (and many recent ones) render text in bold or underlined, by overstriking a character with itself or with an underscore.

Depending on what you *really* want you could, for example, run the man -t already mentioned and pipe it to ps2pdf so you could read it on your tablet, or view the man page in Konqueror (man:/command) and save it as HTML - or copy and paste it into LibreOffice for a word-processor document and then save it as plain text, losing the control characters and giving a poor excuse for a man page.

What distinguishes unix - and hence linux - from every other system around is that where others have to say 'I wonder if there is a way to do ...' we have to say 'I wonder which of the ways to do ... is best for what I want at the moment'
 
  

