"sed -n '/<text>/,/<\/text>/p' filename" should do it. Will need some work if both are on the one line.
And yes, go look at the one-liners on the sed site on sf.
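For anyone who wants to try the range form before pointing it at real data, here is a minimal sketch. The sample file and its contents are made up for illustration, and it assumes the opening and closing tags sit on different lines.

```shell
# Sample input; the contents are invented for this demonstration.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
header line
<text>
first line
second line
</text>
footer line
EOF

# Print every line from the first one matching <text>
# through the next one matching </text>, tags included.
result=$(sed -n '/<text>/,/<\/text>/p' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Note that the output still includes the two tag lines, since the range is inclusive.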
If I am not wrong, the OP wants to extract only the text between the tags, so I guess some more manipulation is required with the sed method.
Code:
awk '/<text>/,/<\/text>/' file   # equivalent to: sed -n '/<text>/,/<\/text>/p' file
Yeah, you might be right - I was thinking "inclusive" of tags.
Oh well.
As always, there are lots of ways of getting the job done. It won't take much to clean up, depending on what the OP actually wanted.
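One possible clean-up, sketched under the assumption that each tag sits on a line of its own: keep the same range, but delete the two tag lines inside it and print only what is left. The sample data is made up.

```shell
# Sample input; the contents are invented for this demonstration.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
header line
<text>
first line
second line
</text>
footer line
EOF

# Inside the <text>...</text> range, drop the tag lines and print the rest.
result=$(sed -n '/<text>/,/<\/text>/{
/<text>/d
/<\/text>/d
p
}' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

If a tag shares a line with content, that whole line is deleted by this version, so it is only suitable for the tags-on-their-own-lines case.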
I always like to find the shortest [least code required] method to do something, especially for someone like this: if they couldn't figure this out on their own, they probably don't understand everything that's going on in the code you provided. Not trying to knock you by any means, just providing insight into a simpler method. I know that when I am new to something, it drives me crazy to have people show me overcomplicated methods for doing something very simple; it makes it harder to understand well enough to do it on my own next time. I try to help people avoid the learning problems I had, not just give them a one-time fix for their problem.
If the tags and the contents are on the same line, then it can be done easily using sed:
sed -n '/<text>/,/<\/text>/s/.*<text>\(.*\)<\/text>/\1/p' file
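Here is a small variation to experiment with, not the exact command above: it drops the range address (the substitution only succeeds on lines that contain both tags anyway) and adds a trailing .* so that anything after the closing tag is discarded too. The sample lines are made up, and it assumes at most one <text>...</text> pair per line, because .* is greedy.

```shell
# Sample input; the contents are invented for this demonstration.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<title>ignored</title>
prefix <text>hello world</text> suffix
<text>second entry</text>
EOF

# Keep only what sits between <text> and </text> on the same line;
# lines where the substitution does not match are not printed.
result=$(sed -n 's/.*<text>\(.*\)<\/text>.*/\1/p' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```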
I've used something similar with k3b. If you save the project to a file, it actually creates a zip archive containing two files. One of them is named maindata.xml.
The xml file contains a catalog of backed up files. You could use this file to give you a list of names that are safe to delete because they are backed up.
In this case, because the source is an XML file, you need to watch for the entities &gt;, &lt; and &amp; and replace them with the characters >, < and & respectively. So three more sed commands are necessary.
Code:
jschiwal@hpamd64:~> sed -n '/^<url>/{
> s/^<url>\(.*\)<\/url>/\1/
> s/&gt;/>/g
> s/&lt;/</g
> s/&amp;/\&/g
> p
> }' maindata.xml
...
/home/jschiwal/Podcasts/50@10712b865b6a420bdea05b6cc5bfde98
/home/jschiwal/Podcasts/CrankyGeeks/crankygeeks.064.mp4
/home/jschiwal/Podcasts/CrankyGeeks/crankygeeks.066.mp4
/home/jschiwal/Podcasts/CrankyGeeks/<crankygeeks>&.067.mp4
/home/jschiwal/Podcasts/JM-001.ogg
/home/jschiwal/Podcasts/LQ-Podcast-050207.mp3
/home/jschiwal/Podcasts/LQ-Podcast-051207.mp3
Whatever method you use, it is best to test it out. You may have forgotten some patterns that can trip you up. The first time I did this I forgot about the reserved characters in xml, and files containing these characters weren't being deleted.
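One thing worth testing in particular is the order of the entity rules. A minimal sketch, using a made-up input string: if &amp; is decoded first, a literal "&lt;" that was stored as "&amp;lt;" gets decoded a second time.

```shell
# Made-up test string: in XML, "&amp;lt;" stands for the literal text "&lt;".
input='a &amp;lt; b &gt; c'

# &amp; decoded last: each entity is decoded exactly once.
good=$(printf '%s\n' "$input" | sed 's/&gt;/>/g;s/&lt;/</g;s/&amp;/\&/g')

# &amp; decoded first: the "&lt;" it produces is then decoded again.
bad=$(printf '%s\n' "$input" | sed 's/&amp;/\&/g;s/&gt;/>/g;s/&lt;/</g')

printf 'amp last:  %s\n' "$good"
printf 'amp first: %s\n' "$bad"
```

The first form keeps the literal "&lt;" intact, while the second turns it into "<", which is why the &amp; rule goes last.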
In composing this message, I added one sed rule at a time and tested it before going on to the next one. By simply pressing the up arrow in the shell and adding semicolons between the sed commands, I can convert this into a true one-liner:
Code:
sed -n '/^<url>/{s/^<url>\(.*\)<\/url>/\1/;s/&gt;/>/g;s/&lt;/</g;s/&amp;/\&/g;p}' maindata.xml
I hope I remember to change the filename back to "crankygeeks.067.mp4" after this demonstration!
I always like to find the shortest [least code required] method to do something.
That's the problem with one-liners in general, IMO. They are short and specific to a task, but not necessarily easy to understand for whoever reads or maintains them.
"quick and dirty" hacks are fine for ad hoc one-time needs.
In a corporate environment, it pays to have a better (and better documented) generic solution. Personally I prefer Perl in such a circumstance, but each to their own.
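Staying with sed rather than Perl, one way to get something more generic and documented is to wrap the script in a shell function. This is only a sketch: the function name extract_tag, the sample file and its paths are all hypothetical, and it assumes the tag name is a plain word, with each element starting at the beginning of a line and fitting on that one line.

```shell
# extract_tag TAG FILE  (hypothetical helper name)
# Print the contents of every <TAG>...</TAG> element that starts at the
# beginning of a line and fits on that line, decoding &gt;, &lt; and &amp;.
# Assumes TAG contains no slashes or regex metacharacters.
extract_tag() {
    tag=$1
    file=$2
    # Double quotes let the shell expand $tag, so backslashes meant for
    # sed are doubled. &amp; is decoded last to avoid double decoding.
    sed -n "/^<$tag>/{
s/^<$tag>\\(.*\\)<\\/$tag>/\\1/
s/&gt;/>/g
s/&lt;/</g
s/&amp;/\\&/g
p
}" "$file"
}

# Usage example with made-up data.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<url>/home/user/a &amp; b.mp4</url>
<size>123</size>
<url>/home/user/&lt;c&gt;.mp4</url>
EOF

result=$(extract_tag url "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

For anything beyond simple, line-oriented files like this, a real XML parser is the safer choice.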