Linux - General This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place. |
11-09-2011, 02:28 PM
|
#1
|
|
LQ Newbie
Registered: Jun 2011
Location: San Francisco, CA
Posts: 11
Rep: 
|
How do I get rid of some weird stuff in a text file?
I exported a .csv file from a spreadsheet, and each line seems to end with a couple of trailing spaces that I need to get rid of.
Here is the weird thing: what look like trailing spaces aren't actually space characters.
Code:
cat textfile
AAK1
AATK
ABL1
ABL2
ACTR2
Code:
cat -v textfile
AAK1M-BM-
AATKM-BM-
ABL1M-BM-
ABL2M-BM-
ACTR2M-BM-
Now, the question is, what the heck is "M-BM-" and how do I get rid of it?
Thanks in advance.
Last edited by lethalfang; 11-09-2011 at 05:25 PM.
|
|
|
|
11-09-2011, 02:36 PM
|
#2
|
|
Member
Registered: Oct 2011
Location: USA
Distribution: Red Hat
Posts: 240
Rep:
|
Open the file in vi and run :set list
This makes normally invisible characters visible. You can then delete them by hand in the editor or with a regular-expression substitution.
To hide them again, run :set nolist
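For a non-interactive look at the same file, od can dump the raw bytes. This is only a sketch: the sample file is a throwaway built with printf, and the bytes 0xC2 0xA0 after each name are an assumption about what the real file contains.

```shell
# Build a small throwaway sample with the suspected bytes (0xC2 0xA0)
# after each name. The contents are an assumption, not the real file.
sample=$(mktemp)
printf 'AAK1\302\240\nAATK\302\240\n' > "$sample"

# od -c prints one entry per byte; bytes outside ASCII appear as octal numbers.
od -c "$sample"

# od -An -tx1 prints the same bytes in hexadecimal, without the offset column.
od -An -tx1 "$sample"
```

Anything in the dump that is not a letter, a digit, or \n is a candidate for the "weird stuff".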
|
|
|
|
11-09-2011, 03:06 PM
|
#3
|
|
Bash Guru
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,568
|
The listing on this page for the Solaris version of cat has an extended explanation of the -v option.
http://www.softpanorama.org/Tools/cat.shtml
Code:
-v Non-printing characters (with the exception of tabs, new-lines and form-feeds) are printed visibly. ASCII control characters (octal 000 - 037) are printed as ^n, where n is the corresponding ASCII character in the range octal 100 - 137 (@, A, B, C, . . ., X, Y, Z, [, \, ], ^, and _); the DEL character (octal 0177) is printed ^?. Other non-printable characters are printed as M-x, where x is the ASCII character specified by the low-order seven bits.
So these two characters appear to be outside of the basic ASCII character set, but I have no idea exactly what they are.
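Applying the M-x rule quoted above by hand gives a candidate answer. The sketch below assumes that the second "M-" in the listing is followed by a space character that is invisible on screen, and decode_meta is a hypothetical helper name, not a standard tool.

```shell
# Decode one "M-x" token from cat -v output back to its byte value.
# decode_meta is a hypothetical helper, written only for this illustration.
decode_meta() {
    # $1 is the single character that follows "M-".
    # printf '%d' "'c" yields the numeric code of the character c.
    code=$(printf '%d' "'$1")
    # M-x means: the low seven bits are those of x, and the high bit is set.
    printf '0x%02X\n' $(( code | 0x80 ))
}

decode_meta 'B'   # "M-B"                    prints 0xC2
decode_meta ' '   # "M-" followed by a space prints 0xA0
```

The byte pair 0xC2 0xA0 is the UTF-8 encoding of U+00A0, the no-break space, which would explain why it looks like a space but is not matched as one. The spreadsheet export is a plausible source, though nothing in the file itself confirms that.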
Assuming that the pattern is uniform, with every line having two extra characters, you could try removing them with sed:
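A minimal sketch of one such sed call, not necessarily the exact command meant above. It assumes the extra bytes are 0xC2 0xA0 (a UTF-8 no-break space) at the end of each line; strip_nbsp and the sample file are hypothetical names used only for illustration.

```shell
# strip_nbsp is a hypothetical helper: it removes any run of trailing
# UTF-8 no-break spaces (bytes 0xC2 0xA0) from each line of the named file.
strip_nbsp() {
    # Build the two-byte sequence with printf's octal escapes.
    nbsp=$(printf '\302\240')
    # LC_ALL=C makes sed treat the input as plain bytes.
    LC_ALL=C sed "s/\(${nbsp}\)*\$//" "$1"
}

# Try it on a throwaway file and write the result to a new file,
# leaving the input untouched.
sample=$(mktemp)
printf 'AAK1\302\240\nAATK\302\240\nABL1\n' > "$sample"
strip_nbsp "$sample" > "$sample.clean"

# The cleaned copy should no longer show any M-BM- in cat -v.
cat -v "$sample.clean"
```

Matching the specific bytes rather than "the last two characters" leaves lines that have no trailing junk unchanged.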
|
|
|
|
11-09-2011, 05:25 PM
|
#4
|
|
LQ Newbie
Registered: Jun 2011
Location: San Francisco, CA
Posts: 11
Original Poster
Rep: 
|
Thanks. I've no idea where those weirdo characters come from... but your suggestion worked, so hooray!
|
|
|
|
All times are GMT -5.