Remove Control Characters from a File
I want to delete all the control characters from my file using Linux bash commands.
There are some control characters, especially EOF (0x1A), that cause problems when I load my file in another program. I want to delete these. Here is what I have tried so far. This will list all the control characters:
Code:
cat -v -e -t file.txt | head -n 10
Code:
$ cat file.txt | head -n 10 | grep '[[:cntrl:]]'
Now, I ran the following command to show all lines not containing control characters, but it still shows the same output as above (lines with control characters):
Code:
$ cat file.txt | head -n 10 | grep '[^[:cntrl:]]'
Code:
$ cat file.txt | head -n 10 | grep '[[:cntrl:]]' | od -t x2
I tried using the tr command to delete the control characters, but it also deletes \r\n:
Code:
$ cat file.txt | tr -d "[:cntrl:]" >> test.txt
Note: I want to delete all the control characters except \r\n, since they are the newline characters on Windows. If I delete all the control characters, then everything will be on the same line. Thanks.
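A note on the grep attempt with '[^[:cntrl:]]': that pattern matches any line containing at least one non-control character, which is nearly every line, so the output looks unchanged. Inverting the match with -v is probably closer to what was intended. A small sketch on made-up sample data (the sample function and its contents are hypothetical, not taken from file.txt):

```shell
# Made-up sample: the second line contains a 0x1A byte (octal \032).
sample() { printf 'clean line\nbad\032line\n'; }

# Counts lines with at least one NON-control character: both lines match.
sample | LC_ALL=C grep -c '[^[:cntrl:]]'

# -v inverts the match: prints only lines with no control character at all.
sample | LC_ALL=C grep -v '[[:cntrl:]]'
```

Keep in mind that \r is itself a control character, so on a file with Windows line endings every line would match '[[:cntrl:]]' and the -v form would print nothing.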
Can you try using awk?
Code:
awk '{gsub(/[[:cntrl:]]/,"",$0); print $0}' file.txt
Another option may be to use sed, something along the lines of:
Code:
sed 's/[[:cntrl:]]//g' file.txt
Thanks for the answers, I will try them out.
However, both of the above command lines will remove all the control characters. How do I exclude \r and \n from that?
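One way to keep \r and \n might be to list the bytes to delete explicitly instead of relying on [:cntrl:]. This is only a minimal sketch, assuming the unwanted characters are single ASCII control bytes (0x00-0x1F and 0x7F); strip_ctrl is a hypothetical helper name, not something from this thread:

```shell
# strip_ctrl: hypothetical helper that reads stdin and writes stdout.
# Deletes every ASCII control byte except LF (\012) and CR (\015):
#   \000-\011  NUL through TAB
#   \013 \014  VT and FF
#   \016-\037  SO through US (this range includes 0x1A)
#   \177       DEL
strip_ctrl() {
    LC_ALL=C tr -d '\000-\011\013\014\016-\037\177'
}

# Example usage (file names are placeholders):
#   strip_ctrl < file.txt > cleaned.txt
```

Note that this also deletes tabs. If tabs should be kept, the first range could end at \010 instead of \011.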