LinuxQuestions.org
Old 11-03-2016, 11:21 AM   #1
cosmbrth
LQ Newbie
 
Registered: Aug 2013
Posts: 5

clean file from broken lines and join them together


Hello everyone,

I have an issue formatting some files. Every day I receive files with broken lines. Every complete line ends with ^M.

I have to format them manually: delete the false newlines and join the pieces back into a single line.

Example:

Received file:

line1line1-sameending^M
line2line2-sameending^M
line3
line3--sameending^M
line4line4-sameending^M
line5line5
-sameending^M
line6line6-sameending^M


I have to format it to:

line1line1-sameending^M
line2line2-sameending^M
line3line3--sameending^M
line4line4-sameending^M
line5line5-sameending^M
line6line6-sameending^M

When I run cat -E myfile, the output looks like this:

$line1line1-sameending^M
$line2line2-sameending^M
line3$
$line3--sameending^M
$line4line4-sameending^M
line5line5$
$-sameending^M
$line6line6-sameending^M
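
A side note on reading that output: cat -E only marks line feeds (with $), so the carriage returns are still sent raw to the terminal. A byte dump is less ambiguous. The following is a minimal sketch, assuming a made-up two-line sample written to a hypothetical file probe.txt; od and grep are standard tools, and nothing here comes from the real files.

```shell
# Two made-up lines: the first is broken (LF only), the second ends in CR LF.
printf 'line3\nline3--sameending\r\n' > probe.txt

# od -c prints every byte; carriage return and line feed show up as \r and \n.
od -c probe.txt

# Count the lines that really end in a carriage return.
cr=$(printf '\r')
grep -c "${cr}\$" probe.txt
```

A line that shows \n without a preceding \r in the dump is one of the false line breaks.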


I've tried several approaches without success, for example

"while IFS= read -r -n1 char; do echo "$char"; done < myfile"
and then trying to convert the multiple lines into a single line.

But I can't seem to solve it.

Do you have any ideas?
Thank you in advance,
 
Old 11-03-2016, 11:41 AM   #2
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278

Well, first off - make a backup.

Then I would use 'dos2unix' to remove the DOS carriage returns (^M). Note that dos2unix rewrites the file in place, hence the backup.

Code:
dos2unix yourfile
Then I would use sed to look for 'sameending' and join every line that does not end with it to the line that follows.

Code:
sed ':a;/sameending$/!{N;s/\n//;ba}' yourfile
Which gives this:

Code:
line1line1-sameending
line2line2-sameending
line3line3--sameending
line4line4-sameending
line5line5-sameending
line6line6-sameending
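
If dos2unix is not installed, the two steps above can be tried on throwaway data first. This is only a sketch under some assumptions: tr -d '\r' stands in for dos2unix, the awk line stands in for unix2dos in case the ^M endings have to be put back afterwards, GNU sed is assumed for the one-liner, and sample.txt, joined.txt and joined_crlf.txt are made-up file names.

```shell
# Build a small CRLF test file in which line3 and line5 are broken in two.
printf 'line1line1-sameending\r\nline2line2-sameending\r\nline3\nline3--sameending\r\nline5line5\n-sameending\r\n' > sample.txt

# Step 1: strip the carriage returns (stand-in for dos2unix).
# Step 2: keep appending the next line until the buffer ends in "sameending".
tr -d '\r' < sample.txt | sed ':a;/sameending$/!{N;s/\n//;ba}' > joined.txt

cat joined.txt

# Optional: put the carriage returns back (stand-in for unix2dos).
awk '{ printf "%s\r\n", $0 }' joined.txt > joined_crlf.txt
```

Working on copies like this also leaves the received file untouched until the result has been checked.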
 
Old 11-04-2016, 03:41 PM   #3
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,768

Here is another sed script. It works directly on your original file and keeps the ^M endings:
Code:
#!/bin/bash
eol=$'\015' # a ^M character
sed '
:L
# if the eol is found, branch to the end
/'"$eol"'$/b
# append the next line; join it (remove the NL character)
$!N; s/\n//
# on success branch to the :L
tL
' yourfile
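
Since the files arrive every day, the same sed program could be wrapped in a loop. What follows is a rough sketch, not a tested production script: the incoming directory, the *.txt pattern and the .bak suffix are all made-up names, the demo file is invented, and eol is set with printf instead of $'\015' only so that the snippet also runs in a plain sh.

```shell
# Carriage return (^M); same role as eol=$'\015' in the script above.
eol=$(printf '\r')

# Demo input (hypothetical): line3 and line5 are each split across two lines.
mkdir -p incoming
printf 'line1-sameending\r\nline3\nline3--sameending\r\nline5line5\n-sameending\r\n' > incoming/demo.txt

for f in incoming/*.txt; do
  # Keep the received file as <name>.bak and write the repaired version in its place.
  cp -- "$f" "$f.bak"
  # Lines ending in ^M are printed as they are; any other line gets the next one appended.
  sed '
:L
/'"$eol"'$/b
$!N; s/\n//
tL
' "$f.bak" > "$f"
done
```

The backups end in .txt.bak, so a second run does not pick them up as input.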
 
  

