LinuxQuestions.org

v3ct0r 07-12-2013 01:59 AM

linux windows end of line
 
I'm learning regular expressions and I'm quite confused. Everybody knows that in grep I can use '^[0-9]' to match a line that starts with a digit, and '[0-9]$' to match a line that ends with one.
The book I'm reading says that Windows uses ^M$ for its end of line, so I guessed that if I wrote a text file on Windows and copied it to Linux, I would see the ^M. But when I open it in vim, oops, it looks just the same as a regular line. How come?
Thanks for your advice!

acid_kewpie 07-12-2013 02:10 AM

If you run "file <myfilename.txt>" then it'll tell you what format it's written in. Vi can show windows format files "correctly" like you may be seeing, but it's not usually a default, you can tune the behaviour though:

http://vim.wikia.com/wiki/File_format

shm0 07-12-2013 02:11 AM

Vim can deal with those file formats according to http://vim.wikia.com/wiki/File_format.

v3ct0r 07-12-2013 02:27 AM

Quote:

Originally Posted by acid_kewpie (Post 4988899)
If you run "file <myfilename.txt>" then it'll tell you what format it's written in. Vi can show windows format files "correctly" like you may be seeing, but it's not usually a default, you can tune the behaviour though:

http://vim.wikia.com/wiki/File_format

It does tell me the file is ASCII text, with CRLF line terminators.
Does cat show windows format too?

Well, when I use :e ++ff=unix, I see the ^M at the end of each line. But now I'm wondering whether there is truly a '$' at the end of the line. If I type a '$' there myself, does that mean Vim will automatically convert it to '\$'?

Ser Olmy 07-12-2013 02:34 AM

Quote:

Originally Posted by xeechou (Post 4988909)
It does tell me the file is ASCII text, with CRLF line terminators.
Does cat show windows format too?

Anyway how can I see the difference between unix and windows?

The difference is the line terminators. Windows uses two characters, Carriage Return + Line feed (CRLF), while Unix uses Line Feed (LF).

The file command reports the type of line terminators used in the file. If you want to actually see the difference, use an editor that doesn't try to adapt to the format, like joe on Linux or Notepad on Windows. joe shows the CR control character as a "^M" at the end of every line. Notepad shows LF line terminated files as a single line.
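
As a rough illustration of what "type of line terminators" means at the byte level, here is a minimal Python sketch. It is not how the file command works internally; the function name count_line_endings and the sample byte strings are made up for this example.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF and bare CR terminators in raw bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LF not preceded by CR
    cr = data.count(b"\r") - crlf   # CR not followed by LF
    return {"CRLF": crlf, "LF": lf, "CR": cr}


# Hypothetical sample contents, as they would be stored on disk.
windows_text = b"line one\r\nline two\r\n"
unix_text = b"line one\nline two\n"

print(count_line_endings(windows_text))  # {'CRLF': 2, 'LF': 0, 'CR': 0}
print(count_line_endings(unix_text))     # {'CRLF': 0, 'LF': 2, 'CR': 0}
```

To try it on a real file, read it in binary mode (open(path, "rb")) so that Python does not translate the line endings before you count them.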

I don't think cat-ing a file with CRLF line terminators would show the difference. After all, CR is not a visible character.
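
The invisible CR is also likely why a pattern such as '[0-9]$' from the first post stops matching on a CRLF file: the last character before the line end is the CR, not the digit. A small sketch using Python's re module rather than grep, with made-up sample lines and variable names:

```python
import re

# What is left of a line from a CRLF file after splitting on "\n".
windows_line = "total 42\r"
unix_line = "total 42"

ends_with_digit = re.compile(r"[0-9]$")
tolerant = re.compile(r"[0-9]\r?$")  # allow an optional CR before the end

print(repr(windows_line))                          # 'total 42\r'
print(bool(ends_with_digit.search(unix_line)))     # True
print(bool(ends_with_digit.search(windows_line)))  # False
print(bool(tolerant.search(windows_line)))         # True
```

Here '$' is only an anchor for the end of the line; it is not a character stored in the file.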

v3ct0r 07-12-2013 02:55 AM

Another test I just did: I used ":e ++ff=mac" in vim, and now I see ^J at the start of each line. What does ^J mean?
^M means \r, and LF means \n.

I remember that in C programs you just use \n to end a line, so what is that ^J?

Ser Olmy 07-12-2013 03:04 AM

The ^ prefix means it's a control character. After the circumflex you see an upper case character corresponding to the actual number of the control character, where @ is 0, A is 1 and so on.

^J would be ASCII control character 10, which is Line Feed. It seems that the mode you selected interprets CR as the line terminator, and the LF ended up as the first character of the next line.
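
A minimal Python sketch of both points, assuming plain ASCII input. The helper names caret and show are invented for this example, and the split on CR only imitates what the "mac" mode appears to do.

```python
def caret(byte: int) -> str:
    """Caret notation for ASCII control codes (0-31) and DEL (127)."""
    if byte < 32 or byte == 127:
        return "^" + chr(byte ^ 0x40)  # 0 -> @, 1 -> A, 10 -> J, 13 -> M
    return chr(byte)


def show(data: bytes) -> str:
    """Render bytes with control characters made visible."""
    return "".join(caret(b) for b in data)


print(caret(10), caret(13))  # ^J ^M

# A CRLF file split on CR alone: the LF is left at the start of the next line.
crlf_data = b"one\r\ntwo\r\n"
for line in crlf_data.split(b"\r"):
    print(show(line))
# one
# ^Jtwo
# ^J
```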

v3ct0r 07-12-2013 03:12 AM

Damn it, I should have realized that.


All times are GMT -5.