LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!
Old 07-15-2015, 02:21 PM   #1
LilLinuxLearner
LQ Newbie
 
Registered: Jul 2015
Posts: 2

Rep: Reputation: Disabled
sed command to delete text until match is found - for each line of csv


Hello everyone

I have a CSV file, and on each line I am trying to delete every character from the beginning of the line up to (but not including) the first occurrence of "2015".

My csv file structure is as follows:

Field1 , Field2 , Field3 , Field4
_________________________________
sometext1 , 2015-07-15 , sometext2, sometext3
sometext1 , 2015-07-14 , sometext2, sometext3
sometext1 , 2015-07-13 , sometext2, sometext3

I cannot use the cut command, or sed on the first occurrence of a comma, because the text in Field1 sometimes contains commas too, which complicates parsing. I figured that if I search for the first occurrence of the text 2015 on each line and replace all the preceding characters with nothing, that should work.

FYI, I want to do this for the FIRST occurrence of 2015 only. Another column also contains the text 2015, and I don't want any text before that second occurrence to be affected.

For example, if my original line is sometext1,#015,2015-07-10,sometext2,2015,sometext3, I want it to return 2015-07-10,sometext2,2015,sometext3
Does anyone know the sed command to do this?

Any help will be appreciated!

Thanks

Last edited by LilLinuxLearner; 07-15-2015 at 03:35 PM. Reason: Clarification
 
Old 07-15-2015, 02:58 PM   #2
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 2,473

Rep: Reputation: Disabled
sed "s/.*2015/2015/" should do the trick.

As long as you don't use the "greedy" option, sed will match anything up to and including the first occurrence of 2015.
 
Old 07-15-2015, 03:20 PM   #3
LilLinuxLearner
LQ Newbie
 
Registered: Jul 2015
Posts: 2

Original Poster
Rep: Reputation: Disabled
Hello Ser Olmy. I tried that command and it actually truncates all the columns until the last occurrence of 2015 and not the first

For example, when I run the command echo sometext1,#015,2015-07-10,sometext2,2015,sometext3 | sed "s/.*2015/2015/" it returns a value of 2015,sometext3. I actually want it to display 2015-07-10,sometext2,2015,sometext3 instead

Last edited by LilLinuxLearner; 07-15-2015 at 03:26 PM.
 
Old 07-15-2015, 04:14 PM   #4
Aia
Member
 
Registered: Jun 2006
Posts: 66

Rep: Reputation: 21
Quote:
Originally Posted by LilLinuxLearner
Hello Ser Olmy. I tried that command and it actually truncates all the columns until the last occurrence of 2015 and not the first

For example, when I run the command echo sometext1,#015,2015-07-10,sometext2,2015,sometext3 | sed "s/.*2015/2015/" it returns a value of 2015,sometext3. I actually want it to display 2015-07-10,sometext2,2015,sometext3 instead
sed doesn't have a non-greedy *, but Perl does.
Code:
perl -pe 's/^.*?2015/2015/'
Code:
perl -pe 's/^.*?(2015)/$1/'

Last edited by Aia; 07-15-2015 at 04:17 PM.
 
Old 07-15-2015, 05:12 PM   #5
allend
Senior Member
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware-current
Posts: 4,681

Rep: Reputation: 1566
From post #1
Quote:
My csv file structure is as follows:

Field1 , Field2 , Field3 , Field4
_________________________________
sometext1 , 2015-07-15 , sometext2, sometext3
sometext1 , 2015-07-14 , sometext2, sometext3
sometext1 , 2015-07-13 , sometext2, sometext3

I cannot use the cut command, or sed on the first occurrence of a comma, because the text in Field1 sometimes contains commas too, which complicates parsing.
and from post #3
Quote:
For example, when I run the command echo sometext1,#015,2015-07-10,sometext2,2015,sometext3 | sed "s/.*2015/2015/" it returns a value of 2015,sometext3. I actually want it to display 2015-07-10,sometext2,2015,sometext3 instead
This task would be better handled with awk. Awk can be used to read CSV files, as pointed out by grail in this post. http://www.linuxquestions.org/questi...5/#post5341814
Follow the link in that post for further details.
Also, despite what it says in that link, it is possible to use awk to process CSV files containing embedded newlines. http://www.linuxquestions.org/questi...8/#post5343293
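
For the narrow task in post #1, a much simpler awk sketch may be enough. Note this is not the CSV-parsing approach from the linked posts: it only looks for the literal string 2015 with index() and keeps the rest of the line from there. The function name first_2015 is made up for this example, and the sample line (with a comma inside Field1) is an assumed variation on the one in the thread.

```shell
# Hypothetical helper: print each line starting at its first "2015".
# Lines without "2015" (such as a header row) pass through unchanged.
first_2015() {
    awk '{
        pos = index($0, "2015")              # 1-based position of the first match, 0 if none
        if (pos > 0) $0 = substr($0, pos)    # drop everything before it
        print
    }' "$@"
}

out=$(printf '%s\n' 'some, text1,#015,2015-07-10,sometext2,2015,sometext3' | first_2015)
printf '%s\n' "$out"   # 2015-07-10,sometext2,2015,sometext3
```

It can also be given a file name, e.g. first_2015 file.csv. Because it does not parse fields, it will cut in the wrong place if Field1 itself ever contains the text 2015.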
 
Old 07-15-2015, 09:55 PM   #6
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 15,988

Rep: Reputation: 2217
Use the correct tool for the job, as pointed out by @Aia: perl is the elegant answer here, while others such as sed and awk are inelegant and/or fragile.
For example, if you can guarantee that no digit 2 occurs before the 2015 you want, you could easily construct a sed regex to accommodate this, but it would break as soon as the data changes.
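
A sketch of what such a sed regex might look like, and how it breaks. The second input line (item2,...) is invented to show the failure. The last command is a different trick that avoids the guarantee altogether, but it assumes GNU sed, where \n in the replacement inserts a newline.

```shell
# Works only while no '2' appears before the wanted 2015:
# [^2]* cannot cross a '2', so it is forced to stop at the first one.
ok=$(printf '%s\n' 'sometext1,#015,2015-07-10,sometext2,2015,sometext3' \
        | sed 's/^[^2]*2015/2015/')

# Fragile: a stray '2' in Field1 means the pattern no longer matches,
# and the line comes back unchanged.
bad=$(printf '%s\n' 'item2,2015-07-10,sometext2' \
        | sed 's/^[^2]*2015/2015/')

# GNU sed assumption: mark the first 2015 with a newline, then delete up to the mark.
marked=$(printf '%s\n' 'item2,2015-07-10,sometext2,2015' \
        | sed 's/2015/\n2015/; s/^.*\n//')

printf '%s\n' "$ok"       # 2015-07-10,sometext2,2015,sometext3
printf '%s\n' "$bad"      # item2,2015-07-10,sometext2
printf '%s\n' "$marked"   # 2015-07-10,sometext2,2015
```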
 
All times are GMT -5. The time now is 11:43 AM.