How to remove the hidden characters from the file?
Hello friends!
When I run the following command: $ man kill > man_kill.txt
and view the file using cat, the file is displayed correctly on the screen.
But when I use vi (in Linux) or any text editor (in Windows), I get many hidden characters which make the file unreadable.
Sample output when I use vi to view the file: O^HOP^HPT^HTI^HIO^HON^HNS^HS
_^Hp_^Hi_^Hd...
You can edit the file, copy the bad sequences, and globally replace them with nothing. I do that when I have a DOS file that shows a visible ^M on each line in the editor. I don't know whether all those letters are interfering data, but ^H is the BACKSPACE character. Either way, whatever the sequence is, you're in an editor, so you can copy or yank the interfering string and then globally search and replace it out. I'm not well versed in search/replace in vi, but if you can use GNU Emacs and see the problem in that editor, I can describe how to copy and then replace there.
Thanks for your replies. One thing that puzzled me is how cat is able to display the file (man_kill.txt) perfectly. But when I used cat -v to view the file, the hidden characters were displayed.
It is the terminal window, not the cat command, that processes the hidden characters embedded in the output, i.e. line feeds, carriage returns, backspaces, etc. Once the -v option converts the hidden characters to a viewable form, they are displayed as regular text.
Many good answers. Your latest question raises an interesting point, along the lines of what schneidz recommended:
The source for cat itself. Check out the function cook_cat(); it is exactly what you'd want as a program function to show or hide undesirable characters.
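To make the cat -v behaviour concrete, here is a rough Python sketch of the notation it produces. This is not a copy of the real cook_cat(); the function name show_nonprinting is made up for illustration, and it only covers the -v conversion (no -E or -T handling).

```python
def show_nonprinting(data: bytes) -> str:
    """Render bytes roughly the way `cat -v` does (illustrative sketch only)."""
    out = []
    for b in data:
        prefix = ""
        if b >= 128:
            # High bit set: print "M-" and then handle the low 7 bits.
            prefix = "M-"
            b -= 128
        if b in (9, 10) and not prefix:
            # Plain TAB and LF are passed through unchanged.
            out.append(chr(b))
        elif b < 32:
            # Control character in caret notation, e.g. 8 (backspace) -> ^H.
            out.append(prefix + "^" + chr(b + 64))
        elif b == 127:
            # DEL is shown as ^?.
            out.append(prefix + "^?")
        else:
            out.append(prefix + chr(b))
    return "".join(out)
```

With this, show_nonprinting(b"O\x08O") gives O^HO, which matches what vi and cat -v show for the bold letters in the man page output.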
When I use col -b to filter out the backspaces, one control sequence still appears in several places in the file when I use: $ cat -v man_kill.txt
The sequence is M-bM-^@M-^X--M-bM-^@M-^Y; the actual text is --, and there are 2 hidden characters surrounding the --: M-bM-^@M-^X and M-bM-^@M-^Y. How do I remove these characters?
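For what it's worth, those are not control characters. M-bM-^@M-^X and M-bM-^@M-^Y are how cat -v shows the byte sequences E2 80 98 and E2 80 99, which are the UTF-8 encodings of the left and right single quotation marks (U+2018 and U+2019). A minimal Python sketch that replaces or drops them, assuming the file is UTF-8; the function name fix_quotes is made up for illustration:

```python
def fix_quotes(text: str, remove: bool = False) -> str:
    """Replace curly single quotes with ASCII ', or drop them if remove=True."""
    # U+2018 = left single quotation mark, U+2019 = right single quotation mark.
    # Mapping a code point to None in str.translate() deletes that character.
    table = {ord(ch): (None if remove else "'") for ch in "\u2018\u2019"}
    return text.translate(table)


# The bytes that cat -v displayed as M-bM-^@M-^X--M-bM-^@M-^Y:
raw = b"\xe2\x80\x98--\xe2\x80\x99"
decoded = raw.decode("utf-8")
print(fix_quotes(decoded))               # '--'
print(fix_quotes(decoded, remove=True))  # --
```

Running man under the C locale (for example LC_ALL=C man kill) should usually avoid these quotes in the first place, since the formatter then has no reason to emit UTF-8, but treat that as something to try rather than a guarantee.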
All of the above beg the question: why the 0x46 do you want to format a man page with man, save that to a text file, and view it in an editor?
If you want a printable copy, use man -t and pipe that to print (or save it as a ps file and print later).
On the other hand, you could always vi the (ungzipped) raw man pages and admire all of those lovely roff commands. That typesetting capability, incidentally, is what persuaded AT&T of the viability of unix. The ability to typeset manuals, and have version control with sccs, saved them a lot of angst (and cash).
@normanlinux, I don't want to print or save the file as a ps either. I want to save it to a file that does not contain any control characters.
Is it possible to filter out the hidden characters with any roff commands (troff, nroff, etc.)? If yes, please suggest how to do that.
I say write a program to do that. Get the output into a file, which will have the control characters, and use a self-written program to fix that file. See the example mentioned before: in the source for cat.c, look for the cook_cat() function.
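As a starting point for such a program, here is a minimal Python sketch. It assumes the only problem is the overstrike sequences seen in the sample (a character, a backspace, then the character that replaces it), which is roughly what col -b cleans up. The name strip_overstrike is hypothetical.

```python
import re

# One character followed by a backspace (0x08): on a terminal the next
# character is printed on top of it, so only that next character survives.
_OVERSTRIKE = re.compile(r".\x08")


def strip_overstrike(text: str) -> str:
    """Remove 'X<backspace>' pairs so bold and underlined text becomes plain."""
    text = _OVERSTRIKE.sub("", text)
    # Drop any stray backspaces that were not part of a pair.
    return text.replace("\x08", "")


# The sample from the first post, with ^H written as \x08.
sample = "O\x08OP\x08PT\x08TI\x08IO\x08ON\x08NS\x08S\n_\x08p_\x08i_\x08d"
print(strip_overstrike(sample))
# OPTIONS
# pid
```

To clean the real file you would read man_kill.txt, pass its contents through strip_overstrike(), and write the result to a new file. This does not touch other escape sequences or non-ASCII characters, so check the result with cat -v afterwards.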
techie_san778 says "I don't want to print or save the file as a ps either. I want to save it to a file that does not contain any control characters."
This begs the question - why?
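If we knew what you *really* wanted we could help you. The control characters - generally ^H, which is backspace - were a way of making older terminals (and many recent ones) render text in bold (a character overstruck with itself, as in O^HO) or underlined (an underscore overstruck with the character, as in _^Hp).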
Depending on what you *really* want you could, for example, run the man -t already mentioned and pipe it to ps2pdf so you could read it on your tablet. Or you could view the man page in konqueror (man:/command) and save it as HTML, or copy and paste it into LibreOffice for a word-processor document and then save it as plain text, losing the control characters and giving a poor excuse for a man page.
What distinguishes unix - and hence linux - from every other system around is that where others have to say 'I wonder if there is a way to do ...' we have to say 'I wonder which of the ways to do ... is best for what I want at the moment'