LinuxQuestions.org
Old 03-15-2011, 11:01 AM   #1
linuxScriptGirl
LQ Newbie
 
Registered: Mar 2011
Posts: 4

Rep: Reputation: 0
Sed substitution using &


I have searched high and low and hope you can help me. I need to find each line in a file which does NOT begin with a double quote (") and append that line to the previous line.

I have been successful doing this using the following command:

cat filname.csv | sed -e :a -e '$!Ns/\n[^"]//;ta -e 'P;D' > newfilename.csv

My issue is the substitution. As you would expect, after the line is appended to the previous line, its first character is removed. I need it to not be removed. I tried
cat filname.csv | sed -e :a -e '$!Ns/\n[^"]/&/;ta -e 'P;D' > newfilename.csv

but it just hangs.

Goal:
Input:
"line 1"
line 2
Output with existing sed command is:
"line 1"ine 2

I need it to be "line 1"line 2.

Any help you can provide would be GREATLY appreciated!!
 
Old 03-15-2011, 11:59 AM   #2
Snark1994
Senior Member
 
Registered: Sep 2010
Location: Wales, UK
Distribution: Arch
Posts: 1,630
Blog Entries: 3

Rep: Reputation: 345
I can't quite follow your syntax (it doesn't seem to work on my prompt) but I can see what the problem is (I think).

You're calling
Code:
s/\n[^"]//
This will delete the newline and the character after it. You need to add a group:

Code:
s/\n([^"])/\1/
This replaces \n and a non-quote with the non-quote character.

Hope this helps,
 
Old 03-15-2011, 01:07 PM   #3
linuxScriptGirl
LQ Newbie
 
Registered: Mar 2011
Posts: 4

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by Snark1994
I can't quite follow your syntax (it doesn't seem to work on my prompt) but I can see what the problem is (I think).

You're calling
Code:
s/\n[^"]//
This will delete the newline and the character after it. You need to add a group:

Code:
s/\n([^"])/\1/
This replaces \n and a non-quote with the non-quote character.

Hope this helps,


==========================================================
Thanks for your suggestion.

I tried your \1 within the quotes and I get a message:
sed: -e expression #2, char 1: invalid reference \1 on `s' command's RHS.
Do you know what that means?
 
Old 03-15-2011, 01:44 PM   #4
Ignotum Per Ignotius
Member
 
Registered: Sep 2009
Location: Wales, UK
Distribution: Slackware
Posts: 67
Blog Entries: 1

Rep: Reputation: 40
Hi linuxScriptGirl.

...Seems like you are strangely attractive to Welsh slackers...

I think the answer (well, an answer) to your problem is to break it into three operations, since the newline is a bit of a pain. By translating the newline into some obscure character (i.e. a character which one could reliably assume will never appear in your input file), it becomes pretty straightforward. Here's my stab at it, anyway...

Code:
cat filename.csv | tr '\n' '' | sed 's/\([^\"]\)/ \1/g' | tr '' '\n' > filename.csv
I tried it on this file:

Code:
"line 1"
"line 2"
"line 3"
line 4
"line 5"
"line 6"
line 7
line 8
"line 9"
line 10
...and got this:

Code:
"line 1"
"line 2"
"line 3" line 4
"line 5"
"line 6" line 7 line 8
"line 9" line 10
The newline is changed into a space: you can easily eliminate this if you don't want it, by tweaking the sed script.
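
A minimal sketch of the same three-step idea, assuming the placeholder is the ASCII SOH control character (octal 001). That choice and the function name join_with_placeholder are made up for illustration and are not necessarily what the command above used; it only works if that byte never appears in the input.

```shell
# Hypothetical helper: reads stdin, writes the joined text to stdout.
join_with_placeholder() {
    # Assumption: byte 001 (SOH) never occurs in the input.
    sep=$(printf '\001')
    # 1. turn every newline into the placeholder, giving one long line
    # 2. where the placeholder is followed by a non-quote, swap it for a space
    # 3. turn the remaining placeholders back into newlines
    tr '\n' '\001' |
        sed "s/${sep}\([^\"]\)/ \1/g" |
        tr '\001' '\n'
}

printf '"line 1"\nline 2\n"line 3"\n' | join_with_placeholder
```

Sending the result to a new file rather than back onto the input keeps the original intact while you test.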

Nos da cariad...
 
Old 03-15-2011, 02:55 PM   #5
Snark1994
Senior Member
 
Registered: Sep 2010
Location: Wales, UK
Distribution: Arch
Posts: 1,630
Blog Entries: 3

Rep: Reputation: 345
Quote:
Originally Posted by Ignotum Per Ignotius
...Seems like you are strangely attractive to Welsh slackers...
Nah, we're strangely attractive to everyone else is what it is... isn't it?

EDIT: Darn, forgot to answer the question. You just need to escape the parentheses:
Code:
s/\n\([^"]\)/\1/
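
For the record, the "invalid reference \1" message appears because sed uses basic regular expressions by default, where plain ( ) are literal characters and only \( \) create a group. With extended regular expressions (-E, or -r on GNU sed) it is the other way round. A small sketch on made-up input; swap_bre and swap_ere are hypothetical names:

```shell
# Basic regex (sed's default): groups are written \( \).
swap_bre() { sed 's/\(a\)\(b\)/\2\1/'; }
# Extended regex (-E): groups are written ( ).
swap_ere() { sed -E 's/(a)(b)/\2\1/'; }

printf 'ab\n' | swap_bre   # prints ba
printf 'ab\n' | swap_ere   # prints ba
```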

Last edited by Snark1994; 03-15-2011 at 02:59 PM.
 
Old 03-15-2011, 06:00 PM   #6
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,485

Rep: Reputation: 1890
How about:
Code:
sed -r ':a /^"/{N;s/\n([^"])/\1/};ta' filname.csv > newfilname.csv
 
Old 03-15-2011, 06:12 PM   #7
Ignotum Per Ignotius
Member
 
Registered: Sep 2009
Location: Wales, UK
Distribution: Slackware
Posts: 67
Blog Entries: 1

Rep: Reputation: 40
Snark's answer's better than my quick 'n dirty effort, since it makes no assumptions about file content.

Follow his advice & ignore mine.

Quote:
I tried
Code:
cat filname.csv | sed -e :a -e '$!Ns/\n[^"]/&/;ta -e 'P;D' > newfilename.csv
but it just hangs.
...Out of interest, whence came this script? It doesn't work properly even with Snark's correction.

If you're interested in the post-mortem, there are a few typos in there: the odd number of single quotes means that the thing will go into interactive mode (you need a single quote after the ta); missing a semicolon after the N prompts sed to grumble about "extra characters after command", and lastly the ampersand causes it to grind to a halt after a couple of lines.
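
On the ampersand: in the replacement part of s, & stands for the whole matched text, so s/\n[^"]/&/ puts back exactly what it matched, newline included. The substitution still counts as successful, so the ta branch keeps being taken, which is presumably why the script never finishes. A small sketch of what & does, on made-up input; wrap_match and same_text are hypothetical names:

```shell
# & in the replacement expands to the entire match.
wrap_match() { sed 's/b/[&]/'; }
# Replacing a match with itself leaves the text unchanged,
# but sed still treats it as a successful substitution.
same_text() { sed 's/b/&/'; }

printf 'abc\n' | wrap_match   # prints a[b]c
printf 'abc\n' | same_text    # prints abc
```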

If you need some good ready-made SED scripts, this is a great place (particularly if you're as idle as I am) --- the script you need is a simple modification of this one

Code:
 # if a line begins with an equal sign, append it to the previous line
 # and replace the "=" with a single space
 sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D'
(which is listed a couple of screens down the page). The only modifications to make to it are to swap the = for a [^"] and to mark out the latter as a sub-expression (using \( \) and \1). Which gives you this

Code:
sed -e :a -e '$!N;s/\n\([^"]\)/\1/;ta' -e 'P;D'
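
Here is the same script split into one -e per command, as a sketch for anyone who wants to see what each piece does. The function name join_unquoted is made up for illustration, and the sample input is the test file from earlier in the thread.

```shell
# Hypothetical wrapper: reads stdin, writes the joined lines to stdout.
#   :a      label to jump back to
#   $!N     unless on the last line, append the next input line
#   s///    drop a newline that is followed by a non-quote, keeping that character
#   ta      if the substitution happened, go back to :a and try the next line
#   P;D     print and delete the first line of the pattern space, then restart
join_unquoted() {
    sed -e :a \
        -e '$!N' \
        -e 's/\n\([^"]\)/\1/' \
        -e ta \
        -e 'P;D'
}

printf '%s\n' '"line 1"' '"line 2"' '"line 3"' 'line 4' '"line 5"' \
    '"line 6"' 'line 7' 'line 8' '"line 9"' 'line 10' | join_unquoted
```

Unlike the tr version, this joins lines with nothing in between, so "line 3" and line 4 come out as "line 3"line 4.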
I should also point out that you can't simply pipe in your file and redirect the output back into the same file: the shell truncates the output file as the pipeline starts, typically before it has been read, so chances are you'll end up with an empty file.

If you want to change the file itself, use the -i switch to edit the file in place:

Code:
sed -i -e :a -e '$!N;s/\n\([^"]\)/\1/;ta' -e 'P;D' filename.csv
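
If your sed lacks -i, or takes it in a different form (BSD sed wants an argument after it), a temporary file does the same job. This is only a sketch; join_in_place is a hypothetical name and it assumes mktemp is available.

```shell
# Hypothetical helper: rewrite the file named in $1 via a temporary copy.
join_in_place() {
    jip_file=$1
    jip_tmp=$(mktemp) || return 1
    # Write the result to the temporary file first, so the input is never
    # truncated while sed is still reading it.
    if sed -e :a -e '$!N;s/\n\([^"]\)/\1/;ta' -e 'P;D' "$jip_file" > "$jip_tmp"; then
        # Copy the contents back so the original file keeps its permissions.
        cat "$jip_tmp" > "$jip_file"
    fi
    rm -f "$jip_tmp"
}
```

Usage would be join_in_place filename.csv, ideally tried on a copy of the file first.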
Hope this answers your question!

Wales over & out.
 
Old 03-16-2011, 05:45 PM   #8
Ignotum Per Ignotius
Member
 
Registered: Sep 2009
Location: Wales, UK
Distribution: Slackware
Posts: 67
Blog Entries: 1

Rep: Reputation: 40
linuxScriptGirl,

How did you get on? Did our suggestions do what you wanted, or do you require further assistance? If the former, then be a dear and mark the thread [SOLVED]; if the latter, let us have your questions...
 
Old 03-17-2011, 10:18 AM   #9
linuxScriptGirl
LQ Newbie
 
Registered: Mar 2011
Posts: 4

Original Poster
Rep: Reputation: 0

Quote:
Originally Posted by Snark1994
Nah, we're strangely attractive to everyone else is what it is... isn't it?

EDIT: Darn, forgot to answer the question. You just need to escape the parentheses:
Code:
s/\n\([^"]\)/\1/
=============================
Thanks so much. It is working now!
 
Old 03-17-2011, 10:20 AM   #10
linuxScriptGirl
LQ Newbie
 
Registered: Mar 2011
Posts: 4

Original Poster
Rep: Reputation: 0

Quote:
Originally Posted by Ignotum Per Ignotius
========================================
Thanks for your suggestion. I was able to be successful with the suggestion from Snark1994. I will keep this handy though for the future.
 
  


Tags
regular expressions, sed

