LinuxQuestions.org


powah 09-18-2006 09:14 AM

remove \r and \n from a text file
 
How can I remove "\r" and "\n" from a text file with a script?
e.g.
The input is:
$ cat test.txt
test
it

is

$ od -txc test.txt
0000000 74736574 74690a0d 0a0d0a0d 0a0d7369
t e s t \r \n i t \r \n \r \n i s \r \n
0000020

The output should be:
$ od -txc test.txt
0000000 74736574 74690a0d 73690a0d 00000a0d
t e s t \r \n i t \r \n i s \r \n \0 \0
0000016

jim mcnamara 09-18-2006 09:45 AM

Your input and output are identical - \r and \n are in both.
Code:

tr -d '\r' < inputfile | tr -d '\n' > outputfile

powah 09-18-2006 12:54 PM

remove extra \r and \n from a text file
 
Quote:

Originally Posted by jim mcnamara
Your input and output are identical - \r and \n are in both.
Code:

tr -d '\r' < inputfile | tr -d '\n' > outputfile

To clarify, I want to remove the extra \r and \n from a text file, so that "\r\n\r\n" becomes "\r\n" while a single "\r\n" remains unchanged.

spirit receiver 09-18-2006 01:22 PM

The following command will remove all lines that contain \r\n only (and make a backup of the original file):
Code:

sed -i.bak -e '/^\r$/d' filename
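A quick way to check this against the sample from the first post is sketched below. It is only an illustration: the scratch directory from `mktemp -d`, the file names, and the variable names are assumptions, and the carriage return is passed to sed as a literal character in case a given sed does not understand `\r`.

```shell
#!/bin/sh
# Recreate the 16-byte sample from the first post in a scratch directory
# (hypothetical paths, created only for this check).
tmpdir=$(mktemp -d)
sample="$tmpdir/test.txt"
printf 'test\r\nit\r\n\r\nis\r\n' > "$sample"

# Hold a literal CR so the pattern does not depend on sed's \r escape.
cr=$(printf '\r')

# Delete every line that consists of a lone CR (an "empty" CRLF line).
sed -e "/^${cr}\$/d" "$sample" > "$tmpdir/out.txt"

# Compare sizes before and after; tr strips the padding some wc versions add.
before=$(wc -c < "$sample" | tr -d ' ')
after=$(wc -c < "$tmpdir/out.txt" | tr -d ' ')
echo "before: $before bytes, after: $after bytes"
# prints "before: 16 bytes, after: 14 bytes"

od -c "$tmpdir/out.txt"
rm -rf "$tmpdir"
```

The 16 and 14 byte counts correspond to the octal offsets 0000020 and 0000016 in the od listings of the first post.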

firstfire 09-18-2006 09:54 PM

Hi. Try this:
Code:

awk 'BEGIN{RS="\0";} {gsub(/(\r\n)+/,"\r\n");print}' file | od -txc

AnanthaP 09-19-2006 01:08 AM

Use sed s/\\r\\n\\r\\n/\\r\\n/g in a pipeline.

So a paired \r\n will become a single one.

End

firstfire 09-20-2006 09:40 AM

Hello.
Here is an excerpt from sed's info page (info sed):
Code:

3.3 Overview of Regular Expression Syntax
=========================================
 .  .  .
  `\CHAR'
    Matches CHAR, where CHAR is one of `$', `*', `.', `[', `\', or `^'.
    Note that the only C-like backslash sequences that you can
    portably assume to be interpreted are `\n' and `\\'; in particular
    `\t' is not portable, and matches a `t' under most implementations
    of `sed', rather than a tab character.
  .  .  .
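
Apart from which escapes a given sed interprets, it may help to see what the substitution suggested earlier does on the sample input. The sketch below is only an experiment under assumed names (scratch directory, file names, variables). sed reads one line at a time and removes the trailing newline before applying the script, so a pattern containing `\n` should find nothing to match here.

```shell
#!/bin/sh
# Scratch copy of the 16-byte sample from the first post (hypothetical paths).
tmpdir=$(mktemp -d)
sample="$tmpdir/test.txt"
printf 'test\r\nit\r\n\r\nis\r\n' > "$sample"

# Apply the substitution as posted. Each line reaches the script without its
# newline, so no single line can contain the sequence \r\n\r\n.
sed 's/\r\n\r\n/\r\n/g' "$sample" > "$tmpdir/plain.txt"

# Check whether anything changed at all.
if cmp -s "$sample" "$tmpdir/plain.txt"; then
    unchanged=yes
else
    unchanged=no
fi
plain_bytes=$(wc -c < "$tmpdir/plain.txt" | tr -d ' ')
echo "unchanged: $unchanged ($plain_bytes bytes)"
# prints "unchanged: yes (16 bytes)"

rm -rf "$tmpdir"
```

Matching across line ends in sed needs the following line pulled into the pattern space first (the N command), or a per-line test such as deleting lines that hold only a carriage return.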


AnanthaP 09-22-2006 07:26 AM

Yes, but once the file has been ported, and given that the characters are "visible" as \r \n (`od -c`) on a given *nix box, sed will work. That, I believe, is what the OP wanted.

End

sundialsvcs 09-22-2006 09:18 PM

As usual, lots of ways to do this...

The tr command example might deserve a little closer look because it's an unusual strategy (well, not for Unix)...

In the Unix world, many programs act as filters which take some input-stream, do something to it, and write the results to an output-stream. The output-stream of one filter is often piped to another filter, becoming its input-stream. Piping is indicated by the '|' character.

In the code:
Code:

tr -d '\r' < inputfile | tr -d '\n' > outputfile

we have two instances of the tr filtering-command, separated by a pipe. So we're actually going to have two instances of the tr command running at the same time, one feeding its output as input to the other.

The first part:
Code:

tr -d '\r' < inputfile

uses the '<' operator to specify that inputfile (whatever file that is) is to be used as the input. The command writes its modified output to its standard-output stream, where it's piped to become the input to the second tr command:
Code:

tr -d '\n' > outputfile

... which uses the '>' operator to specify its output-file.

So this approach solves the problem in a very Unix-like way: by running two small, generalized programs and piping them together to solve a problem.
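
To watch the two filters work together, the pipeline can be run on the sample from the first post. This is a minimal sketch: the scratch directory and the concrete values of `inputfile` and `outputfile` are assumptions made only for the demonstration.

```shell
#!/bin/sh
# Scratch copy of the 16-byte sample from the first post (hypothetical paths).
tmpdir=$(mktemp -d)
inputfile="$tmpdir/test.txt"
outputfile="$tmpdir/out.txt"
printf 'test\r\nit\r\n\r\nis\r\n' > "$inputfile"

# Two tr filters joined by a pipe: the first drops every CR, the second
# drops every LF, and the final result is redirected to outputfile.
tr -d '\r' < "$inputfile" | tr -d '\n' > "$outputfile"

result=$(cat "$outputfile")
bytes=$(wc -c < "$outputfile" | tr -d ' ')
echo "result: $result ($bytes bytes)"
# prints "result: testitis (8 bytes)"

rm -rf "$tmpdir"
```

Because every \r and \n is deleted, all lines are joined into one. That is a different result from collapsing only the doubled "\r\n" that the clarified question asks about.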

bigearsbilly 10-02-2006 06:02 AM

unix2dos, dos2unix,


All times are GMT -5.