LinuxQuestions.org
Old 11-13-2009, 03:32 PM   #1
wesgarner
LQ Newbie
 
Registered: Nov 2009
Posts: 5

Rep: Reputation: 0
Question Using CAT / GREP to remove a set of lines from a file


I have a file that looks like this:

text
text
text
blah
blah
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk" system="true" ts="1217592000000" version="3187" userId="10024">
blah
blah
blah
sdfd
</package>
more text


I want to remove the line beginning with: <package name="com.google.android.apps.maps"
and every line after it up to and including: </package>

If that is not possible, can I instead delete the line starting with <package name="com.google.android.apps.maps" and the 4 lines after it?
 
Old 11-13-2009, 05:34 PM   #2
i92guboj
Gentoo support team
 
Registered: May 2008
Location: Lucena, Córdoba (Spain)
Distribution: Gentoo
Posts: 4,083

Rep: Reputation: 405
regexps are not really my strong point, but when you are dealing with complex matches, you should be looking at something like sed or awk rather than grep. I don't think this would be trivial, or even doable, with grep.
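
For what it's worth, sed can express this kind of job as a range delete. A minimal sketch, assuming the opening tag starts its line (optionally indented) and the closing tag sits on a line of its own; the sample contents and the variable names sample and result are made up for illustration:

```shell
# Build a small sample resembling the file in post #1 (contents are illustrative).
sample=$(mktemp)
cat > "$sample" <<'EOF'
text
blah
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk" system="true">
blah
sdfd
</package>
more text
EOF

# Delete from the line that opens the Maps package through the next
# line that holds the closing tag, inclusive.
# Single quotes stop the shell from touching the double quotes in the pattern.
result=$(sed '/^[[:space:]]*<package name="com\.google\.android\.apps\.maps"/,/^[[:space:]]*<\/package>/d' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

On a real file the same expression would be run against packages.xml with the output redirected to a new file. GNU sed also accepts the form /pattern/,+4d to delete the match plus the next four lines, but that is a GNU extension rather than standard sed.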
 
Old 11-13-2009, 05:43 PM   #3
choogendyk
Senior Member
 
Registered: Aug 2007
Location: Massachusetts, USA
Distribution: Solaris 9 & 10, Mac OS X, Ubuntu Server
Posts: 1,197

Rep: Reputation: 105
cat / grep ain't gonna do it.

awk could do it.

Here are a couple of threads with similar questions and their answers: http://www.linuxquestions.org/questi...d-bbbb-601433/ or http://www.linuxquestions.org/questi...m-file-670225/
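
A rough idea of how awk could do it, as a sketch only: keep a flag that is switched on at the opening tag of the unwanted package and switched off after the next closing tag. The function name remove_package and its argument order are hypothetical, and the match on the package name is a plain string comparison, not a regexp.

```shell
remove_package() {
    # $1 = package name, $2 = input file (hypothetical helper)
    awk -v name="$1" '
        # Start skipping at the opening tag of the named package.
        index($0, "<package name=\"" name "\"") { skip = 1 }
        # Print only the lines that are outside the skipped block.
        !skip { print }
        # Stop skipping once the closing tag has been seen (it is dropped too).
        skip && /<\/package>/ { skip = 0 }
    ' "$2"
}
```

Something like remove_package com.google.android.apps.maps packages.xml > packages-new.xml would then write the trimmed copy to a new file and leave the original untouched.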
 
Old 11-13-2009, 05:46 PM   #4
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Quote:
Originally Posted by wesgarner View Post
I have a file that looks like this:

text
text
text
blah
blah
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk" system="true" ts="1217592000000" version="3187" userId="10024">
blah
blah
blah
sdfd
</package>
more text


I want to remove the line beginning with: <package name="com.google.android.apps.maps"
and every line after it up to and including: </package>

If that is not possible, can I instead delete the line starting with <package name="com.google.android.apps.maps" and the 4 lines after it?

see here and here for how to do it with gawk. Set RS to "</package>", then do the substitution from "<package..." onwards.

experiment with it and post again if you hit problems.
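
The RS idea might look roughly like the sketch below. It rests on a few assumptions: a multi-character RS is an extension (gawk and mawk accept it, POSIX does not promise it), the function name remove_package_rs is hypothetical, and the closing tag is put back by hand so that the other packages survive. A file that ends exactly at a closing tag with no newline after it would lose that last tag.

```shell
remove_package_rs() {
    # $1 = package name, $2 = input file (hypothetical helper)
    awk -v name="$1" '
        BEGIN { RS = "</package>"; ORS = "" }
        {
            if (NR > 1) {
                if (dropped) {
                    # Swallow the newline that followed the removed closing tag.
                    sub(/^\n/, "")
                } else {
                    # The previous record was kept, so restore its closing tag.
                    print RS
                }
            }
            # Cut the record off where the unwanted package begins, if it does.
            start = index($0, "<package name=\"" name "\"")
            dropped = (start > 0)
            if (dropped) $0 = substr($0, 1, start - 1)
            print
        }
    ' "$2"
}
```

Each record is everything up to the next closing tag, so a record holds at most one opening tag and trimming from that tag to the end of the record removes exactly one package.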
 
Old 11-13-2009, 06:28 PM   #5
wesgarner
LQ Newbie
 
Registered: Nov 2009
Posts: 5

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by ghostdog74 View Post
see here and here for how to do it with gawk. Set RS to "</package>", then do the substitution from "<package..." onwards.

experiment with it and post again if you hit problems.

That would work fine, but one problem is that the original tag contains quote symbols. How would I get around this?
 
Old 11-13-2009, 06:30 PM   #6
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
Post cat+sed

cat can be used, but grep, probably not; and cat alone of course won't do much all by itself.

Since this resembles homework, and no experimentation has yet been demonstrated by the OP, AND because I like fiddling with shell commands to come up with convoluted ways of doing stuff, here's a cat+sed way of doing it that is convoluted and definitely not the "best" way to do it (and I'll leave it to the OP to decipher what's happening here) but I figure as mentioned above, awk/gawk are going to be what the OP really seeks.

Code:
sasha@reactor:~/test$ echo $(cat -E test) | sed 's:<package.*package>$ ::g;s/\$ /\n/g;s/\$//g'
note: I named my file "test", and this just prints the results to the screen, so if you want the results put in a new file, you'll want to redirect it there.
 
Old 11-13-2009, 07:07 PM   #7
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Quote:
Originally Posted by wesgarner View Post
That would work fine, but one problem is that the original tag contains quote symbols. How would I get around this?

Where are those quote symbols you mentioned? Please show the exact structure of the input file again.
 
Old 11-13-2009, 07:30 PM   #8
wesgarner
LQ Newbie
 
Registered: Nov 2009
Posts: 5

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by GrapefruiTgirl View Post
cat can be used, but grep, probably not; and cat alone of course won't do much all by itself.

Since this resembles homework, and no experimentation has yet been demonstrated by the OP, AND because I like fiddling with shell commands to come up with convoluted ways of doing stuff, here's a cat+sed way of doing it that is convoluted and definitely not the "best" way to do it (and I'll leave it to the OP to decipher what's happening here) but I figure as mentioned above, awk/gawk are going to be what the OP really seeks.

Code:
sasha@reactor:~/test$ echo $(cat -E test) | sed 's:<package.*package>$ ::g;s/\$ /\n/g;s/\$//g'
note: I named my file "test", and this just prints the results to the screen, so if you want the results put in a new file, you'll want to redirect it there.

No worries, this isn't homework. I am trying to use it to remove a package from a file on Android phones (packages.xml) where the package is listed like this:
-=MORE PACKAGES=-
<package name="posimotion.Tic_Tac_Toe" codePath="/data/app/posimotion.Tic_Tac_Toe.apk" system="false" ts="1256337690000" version="2" userId="10121">
<sigs count="1">
<cert index="22" key="3082025d308201c6a003020102020449385791300d06092a864886f70d01010505003073310b3009060355040613025 5533110300e06035504081307466c6f72696461311630140603550407130d446179746f6e612042656163683113301106035 5040a130a506f73694d6f74696f6e31133011060355040b130a506f73694d6f74696f6e3110300e06035504031307556e6b6 e6f776e301e170d3038313230343232323030315a170d3336303432313232323030315a3073310b300906035504061302555 33110300e06035504081307466c6f72696461311630140603550407130d446179746f6e61204265616368311330110603550 40a130a506f73694d6f74696f6e31133011060355040b130a506f73694d6f74696f6e3110300e06035504031307556e6b6e6 f776e30819f300d06092a864886f70d010101050003818d0030818902818100ab097db9114f300310a09934a1b81577a0aee 0a6a67434a93dcacf39a73722d091fad058c33e737a7df5dd6ef65f2587b2b945cbe05ee023e88edbc9266b5e1b990c8698e f9bce4be52abf4050c37f3aa0f44d7b2318448724ac712cd3f0d6f9b66f3195b8aab4a915a28fadcd2021a6419395cbdbe80 d86d147b8aac6b1aeb30203010001300d06092a864886f70d01010505000381810075ec770965346eb2dd85d2d95c9e5553f eb265107fa5a0d1b66825366cfebd011389426eeffd1182788b1b8fd97998584e15f1abccbaa14663279670875ad1df0a070 03b708b23dbc6d620b60015537cbec5707a8d9b3c5f59a27f17436143ef00e553b52cdf7aa2466082ddbd0f4c2e9357c3d09 3acfc18602ce378047fd282" />
</sigs>
<perms>
<item name="android.permission.READ_PHONE_STATE" />
<item name="android.permission.WRITE_EXTERNAL_STORAGE" />
</perms>
</package>
-=MORE PACKAGES=-
 
Old 11-13-2009, 07:31 PM   #9
wesgarner
LQ Newbie
 
Registered: Nov 2009
Posts: 5

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by ghostdog74 View Post
Where are those quote symbols you mentioned? Please show the exact structure of the input file again.

The exact first line of the input that needs to be removed:
<package name="com.google.android.apps.maps" codePath="/data/app/com.google.android.apps.maps.apk" system="true" ts="1258057560000" version="3232" userId="10029

Ends with:
</package>

(has quotes around the name, path, etc)
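
On the quoting question, a small sketch of the two usual options: put the whole pattern in single quotes, or use double quotes and escape the inner ones when a shell variable has to be expanded. The variable names line, pkg, count_single and count_double are invented for this example, and the sample line is shortened.

```shell
line='<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk">'

# Inside single quotes the shell passes the double quotes through untouched.
count_single=$(printf '%s\n' "$line" | grep -c '<package name="com\.google\.android\.apps\.maps"')

# To expand a variable, use double quotes and backslash-escape the inner ones.
# -F makes grep treat the pattern as a fixed string, so the dots are literal.
pkg='com.google.android.apps.maps'
count_double=$(printf '%s\n' "$line" | grep -cF "<package name=\"$pkg\"")

printf 'single-quoted: %s, double-quoted: %s\n' "$count_single" "$count_double"
```

Both commands count one matching line, so the quotes in the tag are not a problem once the pattern itself is quoted for the shell.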
 
Old 11-13-2009, 07:44 PM   #10
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
you did not try to experiment yourself did you?
Code:
$ more file
text
text
text
blah
blah
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk" system="true" ts="1217592000000" version="3187" userId="10024">
blah
blah
blah
sdfd
</package>
more text

$ awk 'BEGIN{RS="</package>"}{gsub(/<package.*/,"")}1' file
text
text
text
blah
blah


more text
 
Old 11-13-2009, 07:59 PM   #11
wesgarner
LQ Newbie
 
Registered: Nov 2009
Posts: 5

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by ghostdog74 View Post
you did not try to experiment yourself did you?
Code:
$ more file
text
text
text
blah
blah
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk" system="true" ts="1217592000000" version="3187" userId="10024">
blah
blah
blah
sdfd
</package>
more text

$ awk 'BEGIN{RS="</package>"}{gsub(/<package.*/,"")}1' file
text
text
text
blah
blah


more text

lol yes, but that is not what I am looking for...
I am trying to remove a package from a list of packages (notated by <package> and </package>)

So far this method is what I am working on:
LINESTART=$(grep -n "com.google.android.apps.maps" packages.xml | cut -d: -f1)
LINEEND=[What I need to implement]
sed "${LINESTART},${LINEEND}d" packages.xml > packages-new.xml
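
One way the missing end line could be found, offered as a sketch and not as the only approach: look for the first closing tag at or after the start line and turn its offset into an absolute line number. The function name delete_block_by_line is hypothetical, and the code assumes the closing tag for the package is the first </package> after the opening line.

```shell
delete_block_by_line() {
    # $1 = package name, $2 = input file (hypothetical helper)
    # Line number of the first opening tag for this package.
    start=$(grep -nF "<package name=\"$1\"" "$2" | head -n 1 | cut -d: -f1)
    # No such package: pass the file through unchanged.
    [ -n "$start" ] || { cat "$2"; return; }
    # Offset of the first closing tag, counted from the start line.
    offset=$(tail -n "+$start" "$2" | grep -n '</package>' | head -n 1 | cut -d: -f1)
    # No closing tag found: pass the file through unchanged.
    [ -n "$offset" ] || { cat "$2"; return; }
    end=$((start + offset - 1))
    # Delete the inclusive line range and print the remainder.
    sed "${start},${end}d" "$2"
}
```

Redirecting the output, as in delete_block_by_line com.google.android.apps.maps packages.xml > packages-new.xml, keeps the line breaks that echo with an unquoted variable would collapse.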
 
Old 11-13-2009, 08:19 PM   #12
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
don't understand. So what should the output be if the input XML is the sample you posted in post #1?
 
Old 11-14-2009, 10:21 AM   #13
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-35
Posts: 5,313

Rep: Reputation: 918
this is probably the hard way:
Code:
grep -n '<package' yum.lst | cut -d : -f 1 > top.lst
grep -n '</package' yum.lst | cut -d : -f 1 > bottom.lst
for pac in `paste -d , top.lst bottom.lst`
do
 sed `echo $pac`d yum.lst
done
edit: I think this might print unwanted lines on each iteration, since each pass only takes out the current group of unwanted lines... happy tweaking.
edit: maybe you can take out the first group of unwanted lines, then re-run and take out the next group (wash-rinse-repeat until there are no more groups left).
edit:
untested:
Code:
cp yum.lst yum.bak
grep -n '<package' yum.lst | cut -d : -f 1 > top.lst
grep -n '</package' yum.lst | cut -d : -f 1 > bottom.lst
while [ -s top.lst ]     # man test/ google bash while
do
 pac=`paste -d , top.lst bottom.lst | head -n 1`    # man paste/ head/ google bash variable/ backticks
 sed `echo $pac`d yum.lst > yum.out; mv yum.out yum.lst     # man sed/ google io redirection
 grep -n '<package' yum.lst | cut -d : -f 1 > top.lst     # man grep/ cut/ google io redirection
 grep -n '</package' yum.lst | cut -d : -f 1 > bottom.lst     # man grep/ cut/ google io redirection
done

Last edited by schneidz; 11-14-2009 at 11:00 AM.
 
Old 11-14-2009, 10:41 AM   #14
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Quote:
Originally Posted by schneidz View Post
this is probably the hard way:

the word is not "hard", but rather, it's "inefficient".
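
To make the efficiency point concrete, here is a sketch of a single-pass version that removes every package block without temporary line-number files or repeated scans. The function name strip_all_packages is hypothetical, and the patterns mirror the greps in the quoted post, so they match any line containing <package, including a tag such as <packages>.

```shell
strip_all_packages() {
    # $1 = input file; prints it without any <package ...> ... </package> blocks
    sed '/<package/,/<\/package/d' "$1"
}

# Illustrative input, not taken from a real file.
demo=$(mktemp)
printf '%s\n' 'a' '<package name="x">' 'in' '</package>' 'mid' '<package name="y">' 'in2' '</package>' 'b' > "$demo"
stripped=$(strip_all_packages "$demo")
printf '%s\n' "$stripped"
rm -f "$demo"
```

sed reads the file once and drops each range as it goes, where the loop rescans the whole file for every block it removes.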
 
  



Tags
cat, grep


