LinuxQuestions.org


wesgarner 11-13-2009 03:32 PM

Using CAT / GREP to remove a set of lines from a file
 
I have a file that looks like this:

text
text
text
blah
blah
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk" system="true" ts="1217592000000" version="3187" userId="10024">
blah
blah
blah
sdfd
</package>
more text


I was wanting to know how to remove any lines beginning with: <package name="com.google.android.apps.maps"
Then remove the lines until (and including): </package>

Or, if that is not possible, can I get it to remove the line starting with <package name="com.google.android.apps.maps" and the 4 lines after it?

i92guboj 11-13-2009 05:34 PM

regexps are not really my strong point, but when you are dealing with complex matches, you should be looking at something like sed or awk, rather than grep. I don't think this would be trivial or even doable with grep.

choogendyk 11-13-2009 05:43 PM

cat / grep ain't gonna do it.

awk could do it.

Here's a couple of threads with similar questions and their answers: http://www.linuxquestions.org/questi...d-bbbb-601433/ or http://www.linuxquestions.org/questi...m-file-670225/

ghostdog74 11-13-2009 05:46 PM

Quote:

Originally Posted by wesgarner (Post 3756163)
I have a file that looks like this:

text
text
text
blah
blah
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk" system="true" ts="1217592000000" version="3187" userId="10024">
blah
blah
blah
sdfd
</package>
more text


I was wanting to know how to remove any lines beginning with: <package name="com.google.android.apps.maps"
Then remove the lines until (and including): </package>

Or, if that is not possible, can I get it to remove the line starting with <package name="com.google.android.apps.maps" and the 4 lines after it?

see here and here for how to do it with gawk. Set RS to "</package>", then do substitution from "<package..." onwards.

experiment with it and post again if you hit problems.

wesgarner 11-13-2009 06:28 PM

Quote:

Originally Posted by ghostdog74 (Post 3756340)
see here and here for how to do it with gawk. Set RS to "</package>", then do substitution from "<package..." onwards.

experiment with it and post again if you hit problems.

That would work fine, but a problem is that the original tag contains quote symbols - how would I get around this?

GrapefruiTgirl 11-13-2009 06:30 PM

cat+sed
 
cat can be used, but grep, probably not; and cat alone of course won't do much all by itself.

Since this resembles homework, and no experimentation has yet been demonstrated by the OP, AND because I like fiddling with shell commands to come up with convoluted ways of doing stuff, here's a cat+sed way of doing it that is convoluted and definitely not the "best" way to do it (and I'll leave it to the OP to decipher what's happening here) but I figure as mentioned above, awk/gawk are going to be what the OP really seeks.

Code:

sasha@reactor:~/test$ echo $(cat -E test) | sed 's:<package.*package>$ ::g;s/\$ /\n/g;s/\$//g'
note: I named my file "test", and this just prints the results to the screen, so if you want the results put in a new file, you'll want to redirect it there.
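
A more direct sed form is an address range that deletes everything from the opening tag through the closing tag. The following is only a sketch: it assumes the opening tag and </package> each start a line of their own, as in the samples posted in this thread, and the file contents and the com.example.keep package are made up for illustration.

```shell
#!/bin/sh
# Sketch: delete one <package ...> ... </package> block with a sed address range.
# The sample file below is invented for the demonstration.
workdir=$(mktemp -d)
cat > "$workdir/packages.xml" <<'EOF'
text
<package name="com.example.keep" userId="10001">
keep me
</package>
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk">
blah
sdfd
</package>
more text
EOF

# The sed script is in single quotes, so the double quotes in the tag are literal.
# \. matches a literal dot; \/ escapes the slash inside the /.../ address.
result=$(sed '/^<package name="com\.google\.android\.apps\.maps"/,/^<\/package>/d' "$workdir/packages.xml")
printf '%s\n' "$result"
```

Redirect the sed output to a new file to keep the result. With GNU sed, the fixed-count variant asked about in the first post can be written as an address of the form /pattern/,+4d, but the range form above does not depend on how many lines the block has.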

ghostdog74 11-13-2009 07:07 PM

Quote:

Originally Posted by wesgarner (Post 3756372)
That would work fine, but a problem is that the original tag contains quote symbols - how would I get around this?

where are those quote symbols you mentioned, show an exact structure of the input file again.

wesgarner 11-13-2009 07:30 PM

Quote:

Originally Posted by GrapefruiTgirl (Post 3756375)
cat can be used, but grep, probably not; and cat alone of course won't do much all by itself.

Since this resembles homework, and no experimentation has yet been demonstrated by the OP, AND because I like fiddling with shell commands to come up with convoluted ways of doing stuff, here's a cat+sed way of doing it that is convoluted and definitely not the "best" way to do it (and I'll leave it to the OP to decipher what's happening here) but I figure as mentioned above, awk/gawk are going to be what the OP really seeks.

Code:

sasha@reactor:~/test$ echo $(cat -E test) | sed 's:<package.*package>$ ::g;s/\$ /\n/g;s/\$//g'
note: I named my file "test", and this just prints the results to the screen, so if you want the results put in a new file, you'll want to redirect it there.

No worries, this isn't homework. I am trying to use it to remove a package from a file on Android phones (packages.xml) where the package is listed like this:
-=MORE PACKAGES=-
<package name="posimotion.Tic_Tac_Toe" codePath="/data/app/posimotion.Tic_Tac_Toe.apk" system="false" ts="1256337690000" version="2" userId="10121">
<sigs count="1">
<cert index="22" key="3082025d308201c6a003020102020449385791300d06092a864886f70d01010505003073310b3009060355040613025 5533110300e06035504081307466c6f72696461311630140603550407130d446179746f6e612042656163683113301106035 5040a130a506f73694d6f74696f6e31133011060355040b130a506f73694d6f74696f6e3110300e06035504031307556e6b6 e6f776e301e170d3038313230343232323030315a170d3336303432313232323030315a3073310b300906035504061302555 33110300e06035504081307466c6f72696461311630140603550407130d446179746f6e61204265616368311330110603550 40a130a506f73694d6f74696f6e31133011060355040b130a506f73694d6f74696f6e3110300e06035504031307556e6b6e6 f776e30819f300d06092a864886f70d010101050003818d0030818902818100ab097db9114f300310a09934a1b81577a0aee 0a6a67434a93dcacf39a73722d091fad058c33e737a7df5dd6ef65f2587b2b945cbe05ee023e88edbc9266b5e1b990c8698e f9bce4be52abf4050c37f3aa0f44d7b2318448724ac712cd3f0d6f9b66f3195b8aab4a915a28fadcd2021a6419395cbdbe80 d86d147b8aac6b1aeb30203010001300d06092a864886f70d01010505000381810075ec770965346eb2dd85d2d95c9e5553f eb265107fa5a0d1b66825366cfebd011389426eeffd1182788b1b8fd97998584e15f1abccbaa14663279670875ad1df0a070 03b708b23dbc6d620b60015537cbec5707a8d9b3c5f59a27f17436143ef00e553b52cdf7aa2466082ddbd0f4c2e9357c3d09 3acfc18602ce378047fd282" />
</sigs>
<perms>
<item name="android.permission.READ_PHONE_STATE" />
<item name="android.permission.WRITE_EXTERNAL_STORAGE" />
</perms>
</package>
-=MORE PACKAGES=-

wesgarner 11-13-2009 07:31 PM

Quote:

Originally Posted by ghostdog74 (Post 3756400)
where are those quote symbols you mentioned, show an exact structure of the input file again.

The exact first line of the input needed to be removed:
<package name="com.google.android.apps.maps" codePath="/data/app/com.google.android.apps.maps.apk" system="true" ts="1258057560000" version="3232" userId="10029

Ends with:
</package>

(has quotes around the name, path, etc)

ghostdog74 11-13-2009 07:44 PM

you did not try to experiment yourself did you?
Code:

$ more file
text
text
text
blah
blah
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk" system="true" ts="1217592000000" version="3187" userId="10024">
blah
blah
blah
sdfd
</package>
more text

$ awk 'BEGIN{RS="</package>"}{gsub(/<package.*/,"")}1' file
text
text
text
blah
blah


more text
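
Note that the RS approach above blanks every <package> block and drops the </package> terminators. For a file holding several packages where only one should go, a line-by-line flag is one possible alternative. This is a sketch with plain awk, assuming the opening tag starts at the beginning of its line; the sample file, the com.example.keep package, and the variable names pkg and skip are made up for illustration.

```shell
#!/bin/sh
# Sketch: remove only the named package block, leaving other packages intact.
workdir=$(mktemp -d)
cat > "$workdir/packages.xml" <<'EOF'
text
<package name="com.example.keep" userId="10001">
keep me
</package>
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk">
blah
sdfd
</package>
more text
EOF

# pkg is passed with -v and compared as a literal string via index(),
# so the dots and quotes in the tag need no regex escaping.
result=$(awk -v pkg='<package name="com.google.android.apps.maps"' '
  index($0, pkg) == 1     { skip = 1 }   # opening tag of the unwanted package
  !skip                   { print }      # print everything outside that block
  skip && /^<\/package>/  { skip = 0 }   # closing tag ends the skipped block
' "$workdir/packages.xml")
printf '%s\n' "$result"
```

The awk program is wrapped in single quotes, so the double quotes in the tag reach awk unchanged.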


wesgarner 11-13-2009 07:59 PM

Quote:

Originally Posted by ghostdog74 (Post 3756420)
you did not try to experiment yourself did you?
Code:

$ more file
text
text
text
blah
blah
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk" system="true" ts="1217592000000" version="3187" userId="10024">
blah
blah
blah
sdfd
</package>
more text

$ awk 'BEGIN{RS="</package>"}{gsub(/<package.*/,"")}1' file
text
text
text
blah
blah


more text


lol yes but that is not what I am looking for...
I am trying to remove a package from a list of packages (notated by <package> and </package>)

So far this method is what I am working on:
LINESTART=$(grep -n "com.google.android.apps.maps" packages.xml | cut -d: -f1)
LINEEND=    # [What I need to implement]
sed "${LINESTART},${LINEEND}d" packages.xml > packages-new.xml
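
One way the missing end line could be computed is to search for the first </package> at or after the start line and convert its relative position back to a file line number. This is a sketch, not a tested fix for a real packages.xml: the sample file, the com.example.keep package, and the OFFSET variable are made up for illustration.

```shell
#!/bin/sh
# Sketch: find the start and end line numbers of one package block, then delete that range.
workdir=$(mktemp -d)
file="$workdir/packages.xml"
cat > "$file" <<'EOF'
text
<package name="com.example.keep" userId="10001">
keep me
</package>
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk">
blah
sdfd
</package>
more text
EOF

# Line number of the opening tag (first match only; -F treats the pattern as a fixed string).
LINESTART=$(grep -nF '<package name="com.google.android.apps.maps"' "$file" | head -n 1 | cut -d: -f1)
[ -n "$LINESTART" ] || { echo "package not found" >&2; exit 1; }

# Search only from LINESTART onward; grep -n then reports a position
# relative to LINESTART, so convert it back to a line number in the file.
OFFSET=$(tail -n +"$LINESTART" "$file" | grep -nF '</package>' | head -n 1 | cut -d: -f1)
LINEEND=$((LINESTART + OFFSET - 1))

# Write sed's output straight to the new file instead of passing it through echo.
sed "${LINESTART},${LINEEND}d" "$file" > "$workdir/packages-new.xml"
cat "$workdir/packages-new.xml"
```

Writing the sed output directly to the file matters: an unquoted echo $LINES would join all the lines into one.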

ghostdog74 11-13-2009 08:19 PM

don't understand. so what should the output be if the input xml is the sample you posted in post #1 ??

schneidz 11-14-2009 10:21 AM

this is probably the hard way:
Code:

grep -n '<package' yum.lst | cut -d : -f 1 > top.lst
grep -n '</package' yum.lst | cut -d : -f 1 > bottom.lst
for pac in $(paste -d , top.lst bottom.lst)
do
 sed "${pac}d" yum.lst
done

edit: i think this might print unwanted lines: each iteration prints the whole file with only the current group of unwanted lines taken out... happy tweaking.
edit: maybe you can take out the first group of unwanted lines, then re-run and take out the next group (wash-rinse-repeat until there are no more groups left).
edit:
untested:
Code:

cp yum.lst yum.bak
grep -n '<package' yum.lst | cut -d : -f 1 > top.lst
grep -n '</package' yum.lst | cut -d : -f 1 > bottom.lst
while [ -s top.lst ]    # man test/ google bash while
do
 pac=$(paste -d , top.lst bottom.lst | head -n 1)    # man paste/ head/ google bash variable/ command substitution
 sed "${pac}d" yum.lst > yum.out && mv yum.out yum.lst || break    # man sed/ google io redirection; stop instead of overwriting yum.lst if sed fails
 grep -n '<package' yum.lst | cut -d : -f 1 > top.lst    # man grep/ cut/ google io redirection
 grep -n '</package' yum.lst | cut -d : -f 1 > bottom.lst    # man grep/ cut/ google io redirection
done
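
The repeated passes can probably be avoided by turning the start,end pairs into a small sed script and applying it once, since every pair refers to line numbers in the original file. A sketch under the same assumptions as the loop above (balanced tags, each on its own line); the sample yum.lst contents and the ranges.sed file name are made up for illustration.

```shell
#!/bin/sh
# Sketch: delete every <package ...> ... </package> block in a single sed pass.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > yum.lst <<'EOF'
text
<package name="a">
one
</package>
between
<package name="b">
two
</package>
more text
EOF

# Collect the line numbers of the opening and closing tags.
grep -n '^<package ' yum.lst | cut -d : -f 1 > top.lst
grep -n '^</package>' yum.lst | cut -d : -f 1 > bottom.lst

# Turn each "start,end" pair into a sed delete command such as "2,4d".
paste -d , top.lst bottom.lst | sed 's/$/d/' > ranges.sed

# All ranges use line numbers of the unmodified input, so one pass is enough.
result=$(sed -f ranges.sed yum.lst)
printf '%s\n' "$result"
```

Because nothing is deleted until the final sed call, the line numbers never shift and no re-scan is needed.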


ghostdog74 11-14-2009 10:41 AM

Quote:

Originally Posted by schneidz (Post 3756964)
this is probably the hard way:

the word is not "hard", but rather, it's "inefficient".


All times are GMT -5.