LinuxQuestions.org


threezerous 09-21-2012 10:55 AM

truncate to end of line after a specific string
 
I thought this would be easy to find on Google or figure out myself, but either I am not looking properly or I am missing the obvious. Apologies in advance if this has been answered elsewhere.

I want to delete all characters to end of line after a specific string in a file.

My file contents look like this (with variable-length lines):

aaaabbbbbbbbbbbcccc.hrpt <www-alpha>
dddddbbbbbbbbcccc.hrpt <www-beta>
pppppppppppppppbbbbbbbbbbcccc.hrpt <www-gamma>
xxxbbbbbcccc.hrpt <www-alpha>

The common string on every line starts with '.hrpt <www-'.
I want to delete everything from that common string to the end of the line, including the string itself, so my output file would look like this:

aaaabbbbbbbbbbbcccc
dddddbbbbbbbbcccc
pppppppppppppppbbbbbbbbbbcccc
xxxbbbbbcccc

I can easily do this using Excel on a Windows machine, but I would like to try it with shell scripting.
I played around a bit with the sed command, with little success.

Thanks much for any assistance.

porphyry5 09-21-2012 04:29 PM

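The following keeps the single character matched by '[^\.]' just before '.hrpt' and deletes the rest of the line. It only assumes that this one character is not itself a '.'; dots earlier in the line are left alone. If that were not so you could substitute '.*' for '[^\.]'.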
Code:

~ $ cat <tester
aaaabbbbbbbbbbbcccc.hrpt <www-alpha>
dddddbbbbbbbbcccc.hrpt <www-beta>
pppppppppppppppbbbbbbbbbbcccc.hrpt <www-gamma>
xxxbbbbbcccc.hrpt <www-alpha>
~ $ sed 's/\([^\.]\)\.hrpt.*/\1/' < tester
aaaabbbbbbbbbbbcccc
dddddbbbbbbbbcccc
pppppppppppppppbbbbbbbbbbcccc
xxxbbbbbcccc
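
A shorter variant is to match the whole literal marker from the question, '.hrpt <www-', and delete from there to the end of the line, with no capture group. This is only a sketch: the function name strip_marker is made up for illustration, and it assumes the marker appears exactly as written in the sample lines.

```shell
# strip_marker FILE
# Print FILE with everything from the literal ".hrpt <www-" to the end of
# each line removed. Lines that do not contain the marker pass through
# unchanged. (strip_marker is a hypothetical helper name.)
strip_marker() {
  sed 's/\.hrpt <www-.*//' "$1"
}
```

Running it as `strip_marker tester > output` would write the trimmed lines to a new file (here called output) and leave tester untouched.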


sycamorex 09-21-2012 04:49 PM

Here are 3 more possible solutions:
Code:

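sed 's/\..*//' file  # assumes the dot in '.hrpt' is the first dot on every line
awk -F\. '{ print $1 }' file  # same assumption: prints everything before the first dot
sed '/\.hrpt.*/s///' file  # the empty pattern in s/// reuses the address regex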
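
Since the original question asked about shell scripting in general, the same trimming can also be sketched without sed or awk, using the shell's own suffix-removal expansion. The names strip_marker_sh and marker are invented for this example, and the marker text is taken from the sample lines in the question.

```shell
# strip_marker_sh FILE
# Pure-shell version: read FILE line by line and cut each line at the first
# occurrence of the literal marker. (Hypothetical helper, for illustration.)
strip_marker_sh() {
  marker='.hrpt <www-'
  # "|| [ -n "$line" ]" also handles a last line with no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    # ${var%%pattern} removes the longest matching suffix; quoting
    # "$marker" makes it match literally, and * covers the rest of the line.
    printf '%s\n' "${line%%"$marker"*}"
  done < "$1"
}
```

For large files the sed one-liners are likely to be faster; the loop is mainly useful when the trimmed value is needed inside a larger script.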


