Using CAT / GREP to remove a set of lines from a file
I have a file that looks like this:
text
text
text
blah
blah
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk" system="true" ts="1217592000000" version="3187" userId="10024">
blah
blah
blah
sdfd
</package>
more text
I was wanting to know how to remove any lines beginning with: <package name="com.google.android.apps.maps"
Then remove the lines until (and including): </package>
Or, if that is not possible, can I get it to remove the line starting with <package name="com.google.android.apps.maps" and the 4 lines after it?
Regexps are not really my strong point, but when you are dealing with complex matches, you should be looking at something like sed or awk rather than grep. I don't think this would be trivial, or even doable, with grep.
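For what it's worth, sed's address ranges cover this case fairly directly. The following is only a minimal sketch: the sample file name and its shortened contents are made up for illustration, and the two patterns are taken from the question.

```shell
# Hypothetical sample input, shortened from the file shown in the question.
cat > sample.xml <<'EOF'
text
blah
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk">
blah
sdfd
</package>
more text
EOF

# Delete every line from the opening tag of the Maps package through the
# first closing </package> tag that follows it. The dots are escaped so
# that they match literally.
sed '/<package name="com\.google\.android\.apps\.maps"/,/<\/package>/d' sample.xml > sample-new.xml

cat sample-new.xml
```

For the fallback of "that line plus the 4 lines after it", GNU sed also accepts a relative end address, as in `sed '/pattern/,+4d' file`. That form is a GNU extension and may not work in other sed implementations.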
See here and here for how to do it with gawk. Set RS to "</package>", then substitute away everything from "<package..." onwards in each record.
Experiment with it and post again if you hit problems.
cat can be used, but grep, probably not; and cat alone of course won't do much all by itself.
Since this resembles homework and the OP has not yet shown any experimentation, AND because I like fiddling with shell commands to come up with convoluted ways of doing stuff, here's a cat+sed way of doing it. It is convoluted and definitely not the "best" way to do it (and I'll leave it to the OP to decipher what's happening here), but as mentioned above, awk/gawk is probably what the OP really wants.
Code:
sasha@reactor:~/test$ echo $(cat -E test) | sed 's:<package.*package>$ ::g;s/\$ /\n/g;s/\$//g'
note: I named my file "test", and this just prints the results to the screen, so if you want the results put in a new file, you'll want to redirect it there.
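For anyone who would rather not decipher it alone, the pipeline can be run one stage at a time. This is only a sketch against a tiny made-up input (the file name test.txt and its contents are assumptions). It relies on GNU tools: `cat -E` and `\n` in the sed replacement are GNU features.

```shell
# Hypothetical sample file, much shorter than the real one.
printf 'text\n<package name="x">\nblah\n</package>\nmore text\n' > test.txt

# Stage 1: cat -E marks the end of every line with a "$".
cat -E test.txt

# Stage 2: the unquoted $(...) lets the shell split the output into words,
# and echo rejoins them with single spaces, so the whole file becomes one
# line with "$ " wherever a newline used to be.
echo $(cat -E test.txt)

# Stage 3: sed deletes from "<package" through "package>$ ", turns each
# remaining "$ " back into a newline, then drops the final "$".
result=$(echo $(cat -E test.txt) | sed 's:<package.*package>$ ::g;s/\$ /\n/g;s/\$//g')
printf '%s\n' "$result"
```

Two caveats: `.*` is greedy, so with several packages in the file everything from the first `<package` to the last `package>` is removed, and the unquoted expansion collapses runs of whitespace, so indentation is lost.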
No worries, this isn't homework. I am trying to use it to remove a package from a file on Android phones (packages.xml) where the package is listed like this:
-=MORE PACKAGES=-
<package name="posimotion.Tic_Tac_Toe" codePath="/data/app/posimotion.Tic_Tac_Toe.apk" system="false" ts="1256337690000" version="2" userId="10121">
<sigs count="1">
<cert index="22" key="3082025d308201c6a003020102020449385791300d06092a864886f70d01010505003073310b3009060355040613025 5533110300e06035504081307466c6f72696461311630140603550407130d446179746f6e612042656163683113301106035 5040a130a506f73694d6f74696f6e31133011060355040b130a506f73694d6f74696f6e3110300e06035504031307556e6b6 e6f776e301e170d3038313230343232323030315a170d3336303432313232323030315a3073310b300906035504061302555 33110300e06035504081307466c6f72696461311630140603550407130d446179746f6e61204265616368311330110603550 40a130a506f73694d6f74696f6e31133011060355040b130a506f73694d6f74696f6e3110300e06035504031307556e6b6e6 f776e30819f300d06092a864886f70d010101050003818d0030818902818100ab097db9114f300310a09934a1b81577a0aee 0a6a67434a93dcacf39a73722d091fad058c33e737a7df5dd6ef65f2587b2b945cbe05ee023e88edbc9266b5e1b990c8698e f9bce4be52abf4050c37f3aa0f44d7b2318448724ac712cd3f0d6f9b66f3195b8aab4a915a28fadcd2021a6419395cbdbe80 d86d147b8aac6b1aeb30203010001300d06092a864886f70d01010505000381810075ec770965346eb2dd85d2d95c9e5553f eb265107fa5a0d1b66825366cfebd011389426eeffd1182788b1b8fd97998584e15f1abccbaa14663279670875ad1df0a070 03b708b23dbc6d620b60015537cbec5707a8d9b3c5f59a27f17436143ef00e553b52cdf7aa2466082ddbd0f4c2e9357c3d09 3acfc18602ce378047fd282" />
</sigs>
<perms>
<item name="android.permission.READ_PHONE_STATE" />
<item name="android.permission.WRITE_EXTERNAL_STORAGE" />
</perms>
</package>
-=MORE PACKAGES=-
Where are those quote symbols you mentioned? Show the exact structure of the input file again.
The exact first line of the block that needs to be removed:
<package name="com.google.android.apps.maps" codePath="/data/app/com.google.android.apps.maps.apk" system="true" ts="1258057560000" version="3232" userId="10029
$ more file
text
text
text
blah
blah
<package name="com.google.android.apps.maps" codePath="/system/app/Maps.apk" system="true" ts="1217592000000" version="3187" userId="10024">
blah
blah
blah
sdfd
</package>
more text
$ awk 'BEGIN{RS="</package>"}{gsub(/<package.*/,"")}1' file
text
text
text
blah
blah
more text
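Note that the one-liner above strips every <package ...> block in the file, not only the Maps one, and that a multi-character RS is not guaranteed by POSIX awk (gawk supports it). If only one named package should go, a line-by-line flag is one possible alternative. The sketch below assumes a made-up, shortened sample file; the variable names `name` and `skip` are just illustrative.

```shell
# Hypothetical sample with two packages; only the Maps one should be removed.
cat > packages-sample.xml <<'EOF'
<package name="posimotion.Tic_Tac_Toe" version="2">
<perms>
</perms>
</package>
<package name="com.google.android.apps.maps" version="3187">
blah
</package>
more text
EOF

awk -v name="com.google.android.apps.maps" '
    # Start skipping at the opening tag of the named package.
    # index() does a literal match, so the dots need no escaping.
    index($0, "<package name=\"" name "\"") { skip = 1 }
    # Print lines only while outside the package being removed.
    !skip { print }
    # The closing tag ends the skipped block and is itself dropped.
    skip && /<\/package>/ { skip = 0 }
' packages-sample.xml > packages-sample-new.xml

cat packages-sample-new.xml
```

The order of the three rules matters: the closing-tag check comes last so that the </package> line is still suppressed before skipping is switched off.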
lol yes but that is not what I am looking for...
I am trying to remove a package from a list of packages (notated by <package> and </package>)
So far this method is what I am working on:
LINESTART=$(grep -n "com.google.android.apps.maps" packages.xml | cut -d: -f1)
LINEEND=[What I need to implement]
sed "${LINESTART},${LINEEND}d" packages.xml > packages-new.xml
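One way the missing end line could be filled in is to ask awk for the first </package> line at or after the start line. This is a sketch only: the sample file pkgs.xml and its contents are made up, and it assumes the package name appears in just one opening tag.

```shell
# Hypothetical sample file with three packages.
cat > pkgs.xml <<'EOF'
<package name="other.app" version="1">
<perms>
</perms>
</package>
<package name="com.google.android.apps.maps" version="3187">
<sigs count="1">
</sigs>
</package>
<package name="last.app" version="2">
</package>
EOF

# Line number of the opening tag (grep -F matches the string literally).
LINESTART=$(grep -nF '<package name="com.google.android.apps.maps"' pkgs.xml | head -n 1 | cut -d: -f1)

# Line number of the first closing tag at or after LINESTART.
LINEEND=$(awk -v start="$LINESTART" 'NR >= start && /<\/package>/ { print NR; exit }' pkgs.xml)

# Only delete when both line numbers were found; otherwise copy the file unchanged.
if [ -n "$LINESTART" ] && [ -n "$LINEEND" ]; then
    sed "${LINESTART},${LINEEND}d" pkgs.xml > pkgs-new.xml
else
    cp pkgs.xml pkgs-new.xml
fi

cat pkgs-new.xml
```

Redirecting sed's output straight into the new file avoids the `echo $LINES` step, which would join all lines into one because the variable is unquoted.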
grep -n '<package' yum.lst | cut -d : -f 1 > top.lst
grep -n '</package' yum.lst | cut -d : -f 1 > bottom.lst
for pac in `paste -d , top.lst bottom.lst`
do
sed `echo $pac`d yum.lst
done
edit: I think this might print unwanted lines, because each iteration prints the whole file with only the current group of unwanted lines taken out... happy tweaking.
edit: maybe you can take out the first group of unwanted lines, then re-run and take out the next group (wash-rinse-repeat until there are no more groups left).
edit:
untested:
Code:
cp yum.lst yum.bak
grep -n '<package' yum.lst | cut -d : -f 1 > top.lst
grep -n '</package' yum.lst | cut -d : -f 1 > bottom.lst
while [ -s top.lst ] # man test/ google bash while
do
pac=`paste -d , top.lst bottom.lst | head -n 1` # man paste/ head/ google bash variable/ backticks
sed `echo $pac`d yum.lst > yum.out; mv yum.out yum.lst # man sed/ google io redirection
grep -n '<package' yum.lst | cut -d : -f 1 > top.lst # man grep/ cut/ google io redirection
grep -n '</package' yum.lst | cut -d : -f 1 > bottom.lst # man grep/ cut/ google io redirection
done