Strange "characters" appearing in auto "created" man pages
Hello people
If I use:
Code:
man aptitude
Code:
install
Code:
man aptitude>aptitude.txt
Code:
install
|
Garbled symbols are a sure sign of a conflict in character encodings. Either the file is being created in an encoding that can't handle those characters, or it's being created correctly and the display program is set to use the wrong encoding. Do you get the same effect no matter what text reader or editor you use? If not, then my first guess is that the file is being created using UTF-8, but the text display is trying to use something else, such as Western European (ISO-8859-1).
If all programs show the same problem, then the source is likely the encoding used when the file is created, in which case I couldn't tell you offhand exactly why it's happening or how to fix it. The same command works just fine for me. Please run the "locale" command and post the results, so we can see what encoding your shell is set to. |
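As a quick way to narrow down which side is at fault, here is a minimal sketch that checks whether a saved file decodes cleanly as UTF-8. It assumes iconv is installed, and is_valid_utf8 is just a hypothetical helper name, not anything man provides. Note that a pure ASCII file also passes, since ASCII is a subset of UTF-8.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit status 0) if the named file decodes
# cleanly as UTF-8, fail otherwise. A UTF-8 to UTF-8 conversion is used only
# to make iconv validate the input; the converted output is discarded.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Example use (not run here):
#   man aptitude > aptitude.txt
#   is_valid_utf8 aptitude.txt && echo "decodes as UTF-8" || echo "not valid UTF-8"
```

If the file passes but an editor still shows garbage, the editor's encoding detection is the more likely suspect; if it fails, the problem is probably on the side that writes the file.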
Are you sure that a "troff" document can be converted to text just like that? I don't think so.
http://heirloom.sourceforge.net/doctools/troff.1b.html
http://vmlinux.org/cgi-bin/dwww?type...cation=TROFF/1 |
That is what the man command does: it converts troff documents to text in your terminal. Changing the encoding of your terminal to UTF-8 should resolve strange characters when reading a man page.
You may have a document that is intended to be printed instead of viewed in the terminal, but this wouldn't be the case for man pages. It may be better to do something like this:
Code:
man --pager=cat --encoding=utf8 ><topic>.txt <topic>
You could create a one-liner script in ~/bin/ or use an alias:
Code:
alias man2txt='man --pager=cat --encoding=utf8'
man2txt smb.conf
Code:
#!/bin/bash
topic="$1"
man --pager=cat --encoding=utf8 "$topic" >"${topic}.txt"
p.s. No, I didn't change my signature just for this post. I had it previously. |
Quote:
Code:
Sun Feb 28, 12:47 $ locale
I see UTF-8 in there. Geany tells me the file is ISO-8859-1, but the file I created by "copying" the terminal output to a text file is UTF-8 (without BOM), and both gedit and Geany show the strange characters in the first file. |
Quote:
I had to reinstall lately and was looking at the aptitude man pages when I decided that I wanted them as a text file, and that's the result. The strange thing is that about 50% of the time I get these strange characters. They are usually a single quote ( ' ), a double quote, though not the "text" ones ( " ) but the ones that look like a ( 66 ) and ( 99 ) if you get my drift, and the hyphen ( - ). A search and replace fixes it, but it is a "process" I could do without. |
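The search-and-replace step described above can be scripted. This is only a sketch: it assumes the offending characters are the UTF-8 typographic quotes (U+2018, U+2019, U+201C, U+201D) and the Unicode hyphen and minus sign (U+2010, U+2212), which may not match every case, and ascii_punct is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: read text on stdin and replace typographic quotes and
# Unicode hyphen/minus characters with their plain ASCII equivalents.
# Each character gets its own literal substitution (no bracket expressions),
# so the multi-byte sequences are matched as-is.
ascii_punct() {
    sed -e "s/‘/'/g" -e "s/’/'/g" \
        -e 's/“/"/g' -e 's/”/"/g' \
        -e 's/‐/-/g' -e 's/−/-/g'
}

# Example use (not run here):
#   man aptitude | ascii_punct > aptitude.txt
```

Running the text through a filter like this at the time the file is written avoids having to repeat the manual replace for every page.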
Quote:
Terminator is configured to use UTF-8, and I've never had the problem when "reading" in a terminal, only when reading the text file:
Code:
man program_name > program_name.txt
|
Quote:
I'm going to play with that though. I would much rather have text files; I can read them more easily and edit things (add notes etc., for personal use). Another thanks to you. |