Line feed or carriage return-linefeed? Aka \n or \r\n?
Hi: I have a text file, file1.txt, with Unix-style line terminators, which are simply line feeds (LF, ASCII code 0x0a), aka newlines in C jargon. It's a plain ASCII file. I had to attach it to a post in a forum where the users are predominantly Windows users but, ay, there's the rub, there can be Unix/Linux users too.
The thing is that in the DOS/Windows environment the standard line terminator is a carriage return (CR, ASCII code 0x0d) followed by a line feed. (I beg your pardon for my somewhat technical language, but in what subforum was I to speak about Windows?) Therefore I seem to be in a cul-de-sac, for I can't know beforehand who is going to click on the attachment link in the post: a Unix user or a DOS/Windows one. I.e., do I use Unix style or DOS style in file1.txt?
A DOS/Windows editor ignores the bare LFs, and the result is one very long line that is impractical to read.
This must be a trivial problem for an average Linux user, but it is not for me. Is there a way out?
EDIT
Maybe one solution is to add a CR before every LF. Perhaps Linux editors will just ignore the CR. But to put this conjecture to the test, I would need a program to do the work, or a pipe consisting of several very common commands, and, again, I do not know what these programs might be.
EDIT 2
This post definitely does not belong here. I'll ask a moderator to move it. Sorry for the inconvenience.
Windows Wordpad is capable of viewing Unix files correctly. I suppose any word processor should work, too. In fact, I think any Windows text editor smarter than Notepad should have no problem with Unix line endings. metapad, for example.
Many editors, both for Linux and Windows, will cope with either. I suspect that for most people it's a non-issue, as their editor does the right thing(tm). If you're posting to a forum full of non-technical users, who may very well use Notepad to open the file, then the safest choice is to use \r\n line endings.
There are loads of methods of converting from one to the other:
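As a minimal sketch of two such methods, assuming awk and GNU sed are installed (the file names are made up for the example, and `\r` in a sed replacement is a GNU extension that other sed implementations may not honor):

```shell
# Work in a scratch directory so nothing real gets overwritten.
cd "$(mktemp -d)"

# A small Unix-style sample file (hypothetical name).
printf 'line one\nline two\n' > sample-unix.txt

# awk: print every line followed by CR LF.
awk '{ printf "%s\r\n", $0 }' sample-unix.txt > sample-dos-awk.txt

# sed: append a CR to the end of every line (GNU sed assumed).
sed 's/$/\r/' sample-unix.txt > sample-dos-sed.txt

# Each output should be two bytes longer than the 18-byte input.
wc -c sample-unix.txt sample-dos-awk.txt sample-dos-sed.txt
```

Where the unix2dos utility is installed, it does the same job in one command.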
And I found one: I ran vim on the Unix-style file, issued a ':set ff=dos' command and vim, obediently, added the carriage returns. I told vim to save the file, exited, went to a Windows machine and... voilà! Before, Notepad (the editor the system invokes when I click a text file in the file manager) had made a mess of it; now the file was perfectly legible. No pipes, not even a single Linux command, only the great Vim (although it is a command in itself).
Personally, I would prefer a modest pipe, instead of using such a monster to do so humble a job.
Quote:
Maybe one solution is to add a CR before every LF. Perhaps Linux editors will just ignore the CR. But to put this conjecture to the test, I would need a program to do the work, or a pipe consisting of several very common commands, and, again, I do not know what these programs might be.
Quote:
Originally Posted by stf92
Personally, I would prefer a modest pipe, instead of using such a monster to do so humble a job.
Here's a plain text file with Unix line endings. I shall use od to disambiguate the file's contents.
Code:
$ od -Ad -c unix-file
0000000   l   i   n   e       o   n   e  \n   l   i   n   e       t   w
0000016   o  \n   l   i   n   e       t   h   r   e   e  \n
0000029
Here I abuse cat to force the file through a Bash loop.
Code:
cat unix-file | while read -r l; do printf '%s\r\n' "$l"; done > dos-file
Here's the Windows-friendly version the pipeline made, also disambiguated with od.
Code:
$ od -Ad -c dos-file
0000000   l   i   n   e       o   n   e  \r  \n   l   i   n   e       t
0000016   w   o  \r  \n   l   i   n   e       t   h   r   e   e  \r  \n
0000032
$
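If reading od dumps by eye gets tedious, the same check can be sketched as a count of lines that end in CR. This is only an illustration: the file is recreated here with the contents shown above, and the variable names are made up for the example.

```shell
# Work in a scratch directory so nothing real gets overwritten.
cd "$(mktemp -d)"

# Recreate the converted file from the od dump above.
printf 'line one\r\nline two\r\nline three\r\n' > dos-file

# Hold a literal CR in a variable; command substitution strips only
# trailing newlines, so the CR survives.
cr=$(printf '\r')

# Count the lines whose last character before the LF is a CR.
crlf_lines=$(grep -c "${cr}\$" dos-file)
total_lines=$(wc -l < dos-file)

echo "lines ending in CR: $crlf_lines of $total_lines"
```

If the two numbers match, every line in the file carries the DOS terminator.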
I'll make it into a script to deal with cases like these. Give me a little time; I'll study it and will surely have some questions about the pipe itself. Thanks for your instructive post.
cat unix-file|
while read -r 1;
do printf '%s \r \n' "$1";
done
>dos-file
OK. Here is the extent to which I have got it:
(a) Read can be true or false. When is it true? [FALSE]
(b) Every time I enter the do I write three chars!
(c) When 'read' reads a newline char, it will think there was a break in the line. So we need to make provisions to avoid this. And therefore the -r option. But I still do not understand the use of -r. I'll speak about this later.
GNU-Bash 3.1
Manual: written 2005 Dec 28
EDIT:
I am closer. Read reads a WHOLE line, but without the ending LF because of the -r switch. The line is now in var1. When writing var1, printf suffixes it with CR,LF.
Finally, builtin commands always return true, and read is a builtin command.
Quote:
cat unix-file|
while read -r 1;
do printf '%s \r \n' "$1";
done
>dos-file
Please use code tags when posting code or data. Putting your code into code tags preserves the formatting. They look like this:
[code]...some code...[/code]
I see you inserted some unneeded space characters in the printf '%s \r \n' statement. If your script contains those extra characters then they will be added to the output file, which is probably not the desired outcome.
I also note that you wrote 1 (one) instead of l (ell). If that's how you write your script then the results will not likely be what you expect.
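To make the effect of those stray spaces visible, here is a small sketch that feeds the same text through both format strings and compares the results. The sample text and variable names are chosen just for this illustration.

```shell
# Intended format: the text, then CR, then LF.
printf '%s\r\n' 'line one' | od -An -c

# Format with stray spaces: the text, a space, CR, another space, LF.
printf '%s \r \n' 'line one' | od -An -c

# The byte counts differ by exactly the two extra spaces.
good=$(printf '%s\r\n' 'line one' | wc -c)
bad=$(printf '%s \r \n' 'line one' | wc -c)
echo "intended format: $good bytes; with stray spaces: $bad bytes"
```

The second dump shows a space on either side of the \r, which is exactly what would end up in every line of the output file.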
Perhaps it would be better if I enclose the pipeline in a short script.
Code:
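#! /bin/bash
# Convert unix-file (LF line endings) to dos-file (CR LF line endings).
cat unix-file | {
    # With no variable name, read stores each line in $REPLY.
    while read -r
    do
        printf '%s\r\n' "$REPLY"
    done
} > dos-file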
Quote:
(a) Read can be true or false. When is it true? [FALSE]
The return code of read is 0 (zero) unless EOF is reached or something goes wrong. This loop's while is controlled by the return status of read. Since no error checking is done, the loop assumes (possibly wrongly) that any non-zero return value from read means the file has ended.
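One consequence worth sketching: if the last line of the input has no terminating newline, read still stores the text but returns non-zero, so a plain loop never processes that line. A common workaround is to also test whether the variable came back non-empty. The file name and counters below are made up for the illustration.

```shell
# Work in a scratch directory so nothing real gets overwritten.
cd "$(mktemp -d)"

# Three lines, but the last one lacks its terminating newline.
printf 'one\ntwo\nthree' > no-final-newline

# Plain loop: stops as soon as read returns non-zero.
plain=0
while read -r line; do
    plain=$((plain + 1))
done < no-final-newline

# Guarded loop: also runs the body when read failed but left text behind.
guarded=0
while read -r line || [ -n "$line" ]; do
    guarded=$((guarded + 1))
done < no-final-newline

echo "plain loop: $plain lines; guarded loop: $guarded lines"
```

The plain loop counts one line fewer than the guarded one, because the unterminated final line arrives together with the EOF status.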
Quote:
(c) When 'read' reads a newline char, it will think there was a break in the line. So we need to make provisions to avoid this. And therefore the -r option. But I still do not understand the use of -r.
Bash's read builtin normally treats \n (newline) characters as record terminators, removing them from the input line. This behavior is desirable here because we want to replace the line terminators with the \r\n sequence.
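Without any options, read treats \ (backslash) as an escape character: it strips the backslash and keeps the character that follows, and a backslash at the very end of a line joins that line to the next one. The -r option disables this behavior, and is desirable whenever you want to preserve existing \ (backslash) characters from the input file.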
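A small sketch of the difference, using a made-up sample line that contains backslashes (the file and variable names are only for illustration):

```shell
# Work in a scratch directory so nothing real gets overwritten.
cd "$(mktemp -d)"

# Single quotes keep the backslashes literal; %s writes them unchanged.
printf '%s\n' 'C:\temp\new' > backslashes

# Without -r the backslashes are consumed as escape characters.
read plain < backslashes

# With -r the line is stored exactly as written.
read -r raw < backslashes

printf 'without -r: %s\n' "$plain"
printf 'with -r:    %s\n' "$raw"
```

printf with %s is used for the output so that the display step does not interpret the backslashes a second time.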
Quote:
Finally, builtin commands always return true, and read is a builtin command.
Most commands, builtin or external, return 0 (zero) upon success, and non-zero for failure or special circumstances. Some notable exceptions are test and false. Bash's builtin false always returns 1 (one), and a return value of 1 from test simply means the tested expression was false, not that an error occurred.
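These return values can be inspected directly through $?. The sketch below captures them in variables whose names are invented for the example; the `||` form is used so the snippet also behaves under `set -e`.

```shell
# Start from 0 and overwrite only when a command returns non-zero.
status_true=0; status_false=0; status_test=0; status_read=0

true  || status_true=$?
false || status_false=$?

# The expression is simply false; that is not an error.
test 1 -eq 2 || status_test=$?

# Nothing to read: read hits EOF immediately and returns non-zero.
read -r line < /dev/null || status_read=$?

echo "true=$status_true false=$status_false test=$status_test read=$status_read"
```

So read, although a builtin, does return a non-zero status, and that status is what ends the while loop.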
@Telengard: I had to copy the pipe by hand, and 1 (one) is almost indistinguishable from l (ell) when using code tags. As for the extra space in the argument of printf, I can't find it. Maybe I'm decidedly shortsighted after all. Thanks again for the useful information, above all: What is while's argument, in bash, giving while? (not a question).
Quote:
@Telengard: I had to copy the pipe by hand, and 1 (one) is almost indistinguishable from l (ell) when using code tags. As for the extra space in the argument of printf, I can't find it.
Assuming you're browsing LQ in a graphical environment, you can use your mouse to copy the code into a text editor. I'll see about attaching the example script to this post.
Quote:
What is while's argument, in bash, giving while? (not a question).
Not sure what you mean, but I'll hazard a guess. If you don't give read a variable name as an argument then it automatically populates the variable $REPLY from the input line.
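A related detail, sketched here with made-up names: when read is given a variable name, it trims leading and trailing whitespace according to IFS, whereas Bash's nameless form that fills $REPLY keeps the line intact. Clearing IFS for that one command gives the named form the same whitespace-preserving behavior.

```shell
# Work in a scratch directory so nothing real gets overwritten.
cd "$(mktemp -d)"

# A line with leading and trailing spaces.
printf '   indented line   \n' > padded

# Named variable, default IFS: surrounding whitespace is trimmed.
read -r trimmed < padded

# Named variable with IFS emptied for this command only: nothing is trimmed.
IFS= read -r kept < padded

printf 'default IFS: [%s]\n' "$trimmed"
printf 'empty IFS:   [%s]\n' "$kept"
```

This matters for the conversion loop only if the text file contains lines with significant leading or trailing blanks.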
EDIT
I had to give the file an extension to attach it to this post, so it is named dosify.txt. That's a strange name for a Bash script, but it shouldn't matter once you chmod 'u+x' it.