LinuxQuestions.org
02-04-2013, 08:30 PM   #1
NeonFlash
LQ Newbie
 
Registered: Jul 2012
Posts: 8

Rep: Reputation: Disabled
Remove Control Characters from a File


I want to delete all the control characters from my file using Linux bash commands.

Some control characters, especially 0x1A (Ctrl-Z, which some Windows programs treat as end-of-file), cause problems when I load the file into other software. I want to delete these.

Here is what I have tried so far:

This shows the control characters in the first lines of the file:


Code:
cat -v -e -t file.txt | head -n 10

^A+^X$
^A1^X$
^D ^_$
^E-^D$
^E-^S$
^E1^V$
^F%^_$
^F-^D$
^F.^_$
^F/^_$
^F4EZ$
^G%$
This lists the lines that contain control characters using grep:

Code:
$ cat file.txt | head -n 10 | grep '[[:cntrl:]]'
+
1

-
-
1
%
-
.
/
This matches the output of the cat command above.

Next, I ran the following command to show all lines not containing control characters, but it still shows the same output as above (the lines with control characters):

Code:
$ cat file.txt | head -n 10 | grep '[^[:cntrl:]]'
+
1

-
-
1
%
-
.
/
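The likely explanation is that `[^[:cntrl:]]` matches any line containing at least one non-control character, and every line above has one (`+`, `1`, `-`, and so on). To select lines with no control characters at all, the match can be inverted with `grep -v` instead. A small sketch, using a made-up sample file in place of file.txt:

```shell
# Hypothetical sample: two lines with control characters, one without.
sample=$(mktemp)
printf '\001+\030\nplain\n\005-\004\n' > "$sample"

# Count lines that contain at least one NON-control character.
# Every line qualifies, which is why the output looked unchanged.
with_noncntrl=$(grep -c '[^[:cntrl:]]' "$sample")

# Lines that contain no control character at all.
clean_lines=$(grep -v '[[:cntrl:]]' "$sample")

echo "$with_noncntrl"
echo "$clean_lines"
rm -f "$sample"
```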
Here is the output in hex format:

Code:
$ cat file.txt | head -n 10 | grep '[[:cntrl:]]' | od -t x2
0000000 2b01 0a18 3101 0a18 2004 0a1f 2d05 0a04
0000020 2d05 0a13 3105 0a16 2506 0a1f 2d06 0a04
0000040 2e06 0a1f 2f06 0a1f
0000050
As you can see, hex values such as 0x01 and 0x18 are control characters.
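Note that `od -t x2` prints two-byte words in host byte order, so on a little-endian machine `2b01` stands for the bytes 0x01 0x2b. A byte-by-byte dump is easier to read. A small sketch on a made-up one-line sample:

```shell
# Hypothetical sample line: ^A + ^X newline (0x01 0x2b 0x18 0x0a).
# -An drops the offset column, -t x1 prints one byte per field,
# so the bytes appear in file order regardless of endianness.
bytes=$(printf '\001+\030\n' | od -An -t x1)
echo "$bytes"
```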

I tried using the tr command to delete the control characters, but it also deletes \r and \n:

Code:
$ cat file.txt | tr -d "[:cntrl:]" >> test.txt

$ cat test.txt | wc -l
0
If I delete all control characters, I also delete the carriage return and newline, which together form the line ending on Windows.

Note: I want to delete all control characters except \r and \n. Otherwise everything ends up on a single line.

Thanks.

Last edited by NeonFlash; 02-04-2013 at 08:34 PM.
 
02-04-2013, 08:55 PM   #2
shivaa
Senior Member
 
Registered: Jul 2012
Location: Grenoble, Fr.
Distribution: Sun Solaris, RHEL, Ubuntu, Debian 6.0
Posts: 1,800
Blog Entries: 4

Rep: Reputation: 286
Can you try using awk?
Code:
awk '{gsub(/[[:cntrl:]]/,"",$0); print $0}' file.txt
 
02-04-2013, 09:05 PM   #3
joshp
LQ Newbie
 
Registered: Aug 2006
Location: Chicago IL
Distribution: Too many to list.
Posts: 27

Rep: Reputation: 1
Another option may be to use sed, something along the lines of:

Code:
sed 's/[[:cntrl:]]//g' file.txt
Not 100% sure that will work.
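A POSIX character class only works inside a bracket expression, so the pattern needs double brackets: `[[:cntrl:]]`. With single brackets, `[:cntrl:]` is either rejected by the tool or read as the literal characters `:`, `c`, `n`, `t`, `r`, `l`. Because sed works line by line, the newline itself survives, though a carriage return is still stripped. A quick check on a made-up sample line:

```shell
# Hypothetical input: "a", ^A, "b", carriage return, newline.
out=$(printf 'a\001b\r\n' | sed 's/[[:cntrl:]]//g' | od -An -t x1)

# Expect 61 62 0a: ^A and \r are gone, the trailing \n is kept.
echo "$out"
```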
 
02-04-2013, 09:24 PM   #4
NeonFlash
LQ Newbie
 
Registered: Jul 2012
Posts: 8

Original Poster
Rep: Reputation: Disabled
Thanks for the answers, I will try them out.

However, both of the command lines above will remove all the control characters. How do I exclude \r and \n from that?
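One possible approach, not taken from the replies above, is to give `tr -d` an explicit list of octal ranges that skips `\n` (012) and `\r` (015) instead of using `[:cntrl:]`. The sketch below wraps this in a function. The name `strip_ctrl` and the sample bytes are only illustrative, and note that tab (011) is deleted as well.

```shell
# strip_ctrl: delete ASCII control characters except LF (\012) and CR (\015).
# Ranges removed: 000-011, 013-014, 016-037, and DEL (177).
strip_ctrl() {
    LC_ALL=C tr -d '\000-\011\013\014\016-\037\177'
}

# Hypothetical sample: ^A + ^X, a 0x1A (Ctrl-Z) byte, then CR LF.
result=$(printf '\001+\030\032\r\n' | strip_ctrl | od -An -t x1)

# Only "+", CR and LF remain.
echo "$result"
```

For the file in question this would be something like `strip_ctrl < file.txt > test.txt`, after which `wc -l test.txt` should report the same line count as the original.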
 
  


All times are GMT -5. The time now is 08:21 PM.
