LinuxQuestions.org
Old 11-09-2011, 03:28 PM   #1
lethalfang
LQ Newbie
 
Registered: Jun 2011
Location: San Francisco, CA
Posts: 11

Rep: Reputation: Disabled
How do I get rid of some weird stuff in a text file?


I exported a .csv from a spreadsheet, and each line seems to end with a couple of trailing spaces that I need to get rid of.
Here is the weird thing: what look like trailing spaces aren't actually space characters.

Code:
cat textfile

AAK1
AATK
ABL1
ABL2
ACTR2

Code:
cat -v textfile

AAK1M-BM-  
AATKM-BM-  
ABL1M-BM-  
ABL2M-BM-  
ACTR2M-BM-
Now, the question is, what the heck is "M-BM-" and how do I get rid of it?

Thanks in advance.

Last edited by lethalfang; 11-09-2011 at 06:25 PM.
 
Old 11-09-2011, 03:36 PM   #2
kbscores
Member
 
Registered: Oct 2011
Location: USA
Distribution: Red Hat
Posts: 259
Blog Entries: 9

Rep: Reputation: 32
Open the file in vi and run :set list

That turns on the display of special characters in the file. You can then delete them by hand in insert mode or remove them with a regular-expression substitution.

To hide the special characters again, use :set nolist
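
Outside the editor, a rough non-interactive way to spot the same lines is sketched below. It assumes a POSIX shell with grep and cat -v available; the function name show_non_ascii is made up for illustration, and the sample input is built with printf rather than taken from the real file.

```shell
# Print, with line numbers, every line that contains a byte which is
# neither printable ASCII nor ordinary whitespace. LC_ALL=C makes grep
# work on single bytes; cat -v makes the hidden bytes visible.
show_non_ascii() {
    LC_ALL=C grep -n '[^[:print:][:space:]]' "$@" | cat -v
}

# Sample input: line 1 ends in the bytes 0xC2 0xA0, line 2 is plain ASCII.
printf 'AAK1\302\240\nPLAIN\n' | show_non_ascii
```

Only the first sample line should be reported, since the second contains nothing outside plain ASCII.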
 
Old 11-09-2011, 04:06 PM   #3
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1950
The listing on this page for the Solaris version of cat has an extended explanation of the -v option.

http://www.softpanorama.org/Tools/cat.shtml

Code:
-v  Non-printing characters (with the exception of tabs, new-lines and form-feeds) are printed visibly. ASCII control characters (octal 000 - 037) are printed as ^n, where n is the corresponding ASCII character in the range octal 100 - 137 (@, A, B, C, . . ., X, Y, Z, [, \, ], ^, and _); the DEL character (octal 0177) is printed ^?. Other non-printable characters are printed as M-x, where x is the ASCII character specified by the low-order seven bits.
So these two characters appear to be outside the basic ASCII character set, but I have no idea exactly what they are.
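
Applying the rule quoted above suggests a likely identification. "M-B" is the letter B (0x42) with the high bit set, i.e. byte 0xC2, and "M-" followed by a space is a space (0x20) with the high bit set, i.e. byte 0xA0. The pair 0xC2 0xA0 is the UTF-8 encoding of U+00A0, the no-break space. The sketch below rebuilds one sample line from those bytes to check the reading; it assumes a POSIX shell with cat -v and od, and the variable names are only illustrative.

```shell
# Rebuild one sample line: "AAK1" followed by bytes 0xC2 0xA0 (octal 302 240).
sample=$(printf 'AAK1\302\240')

# cat -v should render the two high bytes in M- notation.
visible=$(printf '%s\n' "$sample" | cat -v)
printf '%s\n' "$visible"

# od dumps the raw bytes in hex; tr squeezes the dump into one string.
hex=$(printf '%s' "$sample" | od -An -tx1 | tr -d ' \n')
printf '%s\n' "$hex"
```

With GNU coreutils the first line printed is AAK1M-BM- followed by a space, matching the output in the original question, and the hex dump ends in c2a0. Running od on the real file would confirm whether it holds the same bytes.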

Assuming that the pattern is uniform, with every line having two extra characters, you could try removing them with sed:

Code:
sed 's/..$//' file
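
Note that ..$ removes whatever the last two characters happen to be, so a line without the padding would lose real data. A narrower sketch follows, assuming the stray bytes are the pair 0xC2 0xA0 (which cat -v prints as "M-BM- ") and that ordinary trailing blanks should go too. The names nbsp and strip_nbsp are made up for illustration.

```shell
# The two bytes to remove, built with printf so that no sed-specific
# escape syntax is needed.
nbsp=$(printf '\302\240')

# Delete every occurrence of that byte pair, then any trailing spaces or
# tabs. LC_ALL=C makes sed treat the input as plain bytes.
strip_nbsp() {
    LC_ALL=C sed "s/${nbsp}//g; s/[[:blank:]]*\$//" "$@"
}

# Sample input modelled on the lines in the question.
printf 'AAK1\302\240 \nACTR2\302\240\n' | strip_nbsp
```

Called as strip_nbsp file, it writes the cleaned text to standard output and leaves the original file untouched.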
 
Old 11-09-2011, 06:25 PM   #4
lethalfang
LQ Newbie
 
Registered: Jun 2011
Location: San Francisco, CA
Posts: 11

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by David the H.
Assuming that the pattern is uniform, with every line having two extra characters, you could try removing them with sed:

Code:
sed 's/..$//' file
Thanks. I have no idea where those weird characters came from, but your suggestion worked, so hooray!
 
  


All times are GMT -5.
