LinuxQuestions.org
07-28-2009, 10:40 AM   #1
damianpfister
LQ Newbie
 
Registered: Jul 2009
Posts: 6

Rep: Reputation: 0
Saving output without control characters


I have a text log file which has a copy of everything typed on the console. When I cat this file it will show me the likes of:

server1:>ls
server1:>who
server1:>top


...and so on.

Now if I either cat -v or simply vi this file I get the following:

server1:>ll^Hs^M
server1:>which^H^H^H^H^Hwho^M
server1:>tap^H^Hop^M


Now it is obvious that ^H is the control character for BACKSPACE and ^M for RETURN. cat itself passes both through unchanged; it is the terminal that interprets them, so that the final command executed is shown and not all the errors (plus backspaces/enters).

It is easy enough to get rid of the ^M through sed or tr, but how do you get around all those ^H characters?

If I type in:

server1:>grep top outputfile.txt

(where outputfile.txt has the above 3 commands in it), it does not show me the last line, server1:>top, since the actual content of that line is server1:>tap^H^Hop^M
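
The mismatch can be reproduced without the real log. This is only a minimal sketch, assuming a POSIX shell; the helper name sample and its single line are made up for illustration and simply mirror the third command above.

```shell
# sample: hypothetical helper that emits one logged line the way it is
# stored on disk: "tap", two backspaces, "op", then CR LF.
sample() {
    printf 'server1:>tap\b\bop\r\n'
}

# grep searches the raw bytes, where the string "top" never occurs,
# so the count of matching lines is 0 (grep then exits non-zero).
sample | grep -c 'top' || true

# cat -v shows the control characters that the terminal normally hides.
sample | cat -v
```

Searching for the bytes that were really typed (tap, two backspaces, op) would match, which is rarely what a report script wants.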

This makes it difficult to manipulate the data in this text file since the file is not "true" text but a combination of text and control characters:

server1:>file outputfile.txt
outputfile.txt: ASCII text, with CRLF line terminators, with overstriking


Is there any way I can do a cat of this file and then redirect that to a file, with that file looking the same as it would to STDOUT (console/xterm)?

If I currently try to do:

server1:>cat outputfile.txt > newfile.txt

I am left with all those original control characters, which makes viewing in vi or parsing with grep difficult. Somewhere between cat and the console those control characters are actually interpreted rather than simply displayed....something I just cannot seem to replicate!

Any suggestions?

Last edited by damianpfister; 07-28-2009 at 11:15 AM.
 
07-28-2009, 11:17 AM   #2
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3590
'dos2unix', 'recode' (or tr, sed or vi)?
 
07-28-2009, 11:33 AM   #3
damianpfister
LQ Newbie
 
Registered: Jul 2009
Posts: 6

Original Poster
Rep: Reputation: 0
The problem is that a simple conversion is not what is needed - rather a translation. Converting the likes of ESC and carriage returns between dos and unix is straightforward enough with dos2unix, vi, tr, sed and the like (not sure about recode - never used that before).

When you actually cat the file, it displays the "interpreted" output based on all the control characters, thus "masking" the fact that there was one command typed followed by a bunch of backspaces and then another command, before RETURN was hit.

Converting ^M (Carriage return) is fine, but how do you convert ^H when it is essentially a backspace that signifies the previous character needs to be erased?

Another example
cat file.txt

Hello world!

cat -v file.txt

Hello everyone^H^H^H^H^H^H^H^Hworld!^M

I tried doing a cat of the file (without -v, so it shows up as I want it to, without control characters), then used xsel to "select" the output into the mouse copy buffer (middle-click paste), and then tried to output that to a file. No joy - I get the exact same thing.

Yet if I cat the file, select the text with the mouse and paste it (middle-click) into another file, it copies Hello world! rather than Hello everyone^H^H^H^H^H^H^H^Hworld!^M, which is what I want (no control characters)....but using a mouse is not practical (especially inside of a script).
 
07-28-2009, 01:40 PM   #4
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,577
Blog Entries: 31

Rep: Reputation: 1197
Hello damianpfister

It could be done in bash or maybe better by awk (how big are these files?). It would get complex if the user did command line editing, but the backspaces should be easy enough.

I guess the files are generated by the script command. That would make it a common problem; there are a lot of hits if you netsearch for "script command" and output.

There may also be font control characters (color, underlining etc.) and cursor positioning controls which would make it even more complex -- a very difficult problem to solve completely and generically (especially for all terminal types!) but maybe basic cleanup is "good enough".

Best

Charles
 
07-28-2009, 03:03 PM   #5
tredegar
LQ 5k Club
 
Registered: May 2003
Location: London, UK
Distribution: Debian "Testing"
Posts: 6,111

Rep: Reputation: 413
Errrr..... ... .. .

You seem to be running a key-logger. That you did not write yourself (or you'd know how to parse the files it produces, or make it produce "better" log files).

This sort of software is usually most unwelcome. And you have only this single post to your LQ name.

Can you give us some reasons why we should help you?

I expect your reasons will be very understandable, but ... .. .

This is a polite request for further information, and I expect it to be answered as such.
 
07-29-2009, 05:16 AM   #6
damianpfister
LQ Newbie
 
Registered: Jul 2009
Posts: 6

Original Poster
Rep: Reputation: 0
I am using the rootsh wrapper (http://sourceforge.net/projects/rootsh/) to monitor commands being executed by junior team members. The reason rootsh was chosen rather than script was for complete audit control, as script writes its output to the user's home directory (and is thus very visible and modifiable by them).

The log files are stored in a more secure location. I then run a custom script to pick out which Sysadmin logged into which server, at what date/time, and the commands they executed. A weekly report is then created for management to peruse - any complaints/issues are then dealt with between management and the Junior admin concerned.

So yes it is "keylogging" in the true sense of the word, but nothing sinister at all (everything within company policy).

The unfortunate thing is that rootsh logs absolutely everything - including ^H backspace keys and previous commands before they were erased by backspace.

It was suggested that I attempt to use col -b as a way around the whole backspace issue....plan on testing that out today.
 
07-29-2009, 12:15 PM   #7
tredegar
LQ 5k Club
 
Registered: May 2003
Location: London, UK
Distribution: Debian "Testing"
Posts: 6,111

Rep: Reputation: 413
Thanks for the explanation.
Maybe sed can help you, but my sed skills are almost zero.
I did try for you but, as I said, I'm hopeless at sed.
The closest I got was this (lifted from "sed one liners explained", number 84): sed 's/.^H//g', but that's not quite right (you are welcome to try it, and improve on it though).

How are you going to cope with other control sequences (e.g. Ctrl-C, Ctrl-D)?
Is there any config file for rootsh that can adjust its behaviour?
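
On the sed idea above: a single pass of s/.^H//g cannot cope with runs of backspaces. In tap^H^Hop it removes p^H, then resumes scanning after that match, so the second ^H is never paired with the a in front of it and the result is ta^Hop. One possible fix is to loop until nothing changes. The following is only a sketch, assuming a POSIX shell and sed; strip_backspaces is a made-up name, and it treats every ^H as "erase the previous character", which is what the thread asks for but not necessarily what a terminal would display. Escape sequences and command line editing are not handled.

```shell
# strip_backspaces: hypothetical filter, reads stdin and writes stdout.
# Note: bs and cr are plain (global) shell variables, since POSIX sh
# has no "local".
strip_backspaces() {
    bs=$(printf '\b')    # literal backspace character (^H)
    cr=$(printf '\r')    # literal carriage return (^M)
    # :a / ta  -> repeat the substitution until it no longer matches
    # 1st s    -> delete one ordinary character plus the ^H after it
    # 2nd s    -> drop any ^H left over with nothing to erase
    # 3rd s    -> drop the trailing carriage return
    sed -e ':a' \
        -e "s/[^${bs}]${bs}//" \
        -e 'ta' \
        -e "s/${bs}//g" \
        -e "s/${cr}\$//"
}
```

It could be used as strip_backspaces < outputfile.txt > newfile.txt, after which grep top newfile.txt should find the third command from the first post.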
 
07-30-2009, 05:34 AM   #8
damianpfister
LQ Newbie
 
Registered: Jul 2009
Posts: 6

Original Poster
Rep: Reputation: 0

Normally I would use sed to get rid of characters I do not want - especially ^M carriage returns. The problem is that simply doing this will not make the command display correctly.

For example:

Hello everyone^H^H^H^H^H^H^H^Hworld!^M

If I simply deleted all the ^H (backspaces) I would end up with:

Hello everyoneworld!^M

When what I really want is:

Hello world!^M

Those ^H backspaces are what was actually typed to delete the word everyone and then world was typed in to replace it.

**Update**
I have tried the col -b option and it appears to work!

cat file.txt
Hello world!

cat -v file.txt
Hello everyone^H^H^H^H^H^H^H^Hworld!^M

col -b < file.txt > newfile.txt
cat newfile.txt
Hello world!

cat -v newfile.txt
Hello world!

So essentially col -b acts as a filter of sorts, "interpreting" the backspace characters and only showing the final output not the initially deleted word nor the backspaces themselves.
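
According to the col man page, -b prints only the last character written to each column position, which is overstrike handling rather than true erasing. The sketch below imitates that rule in awk for plain ASCII lines so the difference is visible. It is an illustration only: overstrike is a made-up name, this is not how col is implemented, and tabs, escape sequences and multi-line motions are ignored.

```shell
# overstrike: hypothetical filter that keeps, for every column, the last
# character written there. Backspace and carriage return only move the
# cursor; they never delete anything.
overstrike() {
    awk '{
        col = 0; max = 0
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "\b") {
                if (col > 0) col--          # one column to the left
            } else if (c == "\r") {
                col = 0                     # back to the first column
            } else {
                cell[col++] = c             # overwrite this column
                if (col > max) max = col
            }
        }
        out = ""
        for (j = 0; j < max; j++) out = out cell[j]
        print out
    }'
}
```

With the simplified examples from this thread, tap^H^Hop comes out as top, but which^H^H^H^H^Hwho would come out as whoch under this rule, because the last two columns are never overwritten. If col -b gives clean results on the real logs, the logger presumably records more than bare backspaces; that is a guess, so it is worth checking a few lines with cat -v before relying on it.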
 
  

