Linux - Newbie
This Linux forum is for members that are new to Linux.
Just starting out and have a question?
If it is not in the man pages or the how-to's this is the place!
I need to extract rows 48 to 53 (which have "-" in all 5 columns).
This page gets updated daily and I need the names in the second column. Currently they are MSNG, SIBN, MSRS, RASP, MTLRP, MTLR. The page is http://www.nkcbank.ru/viewCatalog.do?menuKey=254
I used the curl command to get the HTML code but don't know how to extract the data I need. TIA
well, i'm not going to click that link.
so if you want to post some html, explain what you want to extract, and show us what you tried so far, we'll be more than happy to assist.
thanks ondoho and pan64, I did install html-xml-utils but I'm still a bit confused about how to extract the second column from rows marked with *
Ok..so why don't you post what you have written, show us a sample of the input data, and what you're wanting as the output data, and we can try to help. But we're not going to write your code for you, or click bank-website links in Russia. Post your code and relevant details.
Okay, sorry if I went against the forum norms. I tried to quote the HTML page source, but it is too long (480122 characters). I need information from the 1st table => the name in the 2nd column => for the rows which have all blanks (-).
I used the * special character (there are a total of 7 * signs in the page source) and tried to work out where I can extract the name in the second column from. I found that if * is at line number 3, then my required word is at line number 6, and so on.
I used the following command, which works correctly for me for now, but I know this logic/regex only survives as long as my word is 3 lines after the grepped * symbol.
awk | grep can be combined into one single awk script.
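As a rough illustration of that point: the actual command was not posted, so the sample data, the column number, the pattern and the function name `second_col_starting_with_m` below are all made up. Assuming a pipeline of the general shape `awk '{ print $2 }' | grep PATTERN`, the filter can move into awk itself:

```shell
# Hypothetical sample input: three whitespace-separated columns.
sample='1 MSNG -
2 ABCD 5
3 MTLR -'

# Two processes: awk prints column 2, grep keeps the names starting with M.
printf '%s\n' "$sample" | awk '{ print $2 }' | grep '^M'

# One process: awk tests the field itself and prints it only when it matches.
second_col_starting_with_m() {
    awk '$2 ~ /^M/ { print $2 }'
}
printf '%s\n' "$sample" | second_col_starting_with_m
```

Both pipelines print the same names. The single awk version also makes it easy to add further conditions on other fields later.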
The script you wrote does not check for the 5 occurrences of - (it checks for a * instead), which is not the same thing at all.
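A minimal sketch of what checking for the dashes could look like in plain awk. The real page source was never posted, so the sample HTML here is invented: it assumes lowercase `<tr>`/`<td>` tags, the name in the second cell, and exactly five value cells after it. The function name `names_with_five_dashes` is also just for illustration.

```shell
# Hypothetical table: number, name, then five value columns.
sample_html='<table>
<tr><td>1</td><td>GAZP</td><td>10</td><td>-</td><td>3</td><td>-</td><td>7</td></tr>
<tr><td>2</td><td>MSNG</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td></tr>
<tr>
  <td>3</td><td>MTLR</td>
  <td>-</td><td>-</td><td>-</td><td>-</td><td>-</td>
</tr>
</table>'

names_with_five_dashes() {
    awk '
    { row = row $0 }                       # collect lines until the row ends
    /<\/tr>/ {
        n = split(row, cell, /<td[^>]*>/)  # cell[1] is the text before the first <td>
        dashes = 0
        for (i = 2; i <= n; i++) {
            gsub(/<[^>]*>/, "", cell[i])   # drop closing tags and nested markup
            gsub(/^[ \t]+|[ \t]+$/, "", cell[i])
            if (cell[i] == "-") dashes++
        }
        if (dashes == 5) print cell[3]     # cell[3] holds the second column
        row = ""
    }'
}

printf '%s\n' "$sample_html" | names_with_five_dashes
```

Unlike a fixed line offset, this does not care whether a row is written on one line or spread over several. It is still regex-on-HTML, though, so it will break on uppercase tags or nested tables.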
shridhar22, please believe me, in the long run you'll be happier using html-xml-utils, which contain some commands that parse html - something that you're now trying to re-implement from scratch.
xmllint is actually even better, but harder to use.
it's probably easier to parse by css classes, so instead of looking for "the 1st table", you'd be looking for "a table that has the class xxxx"
you can upload the html code of the whole page somewhere else, so interested helpers can use that to see what you're trying to achieve.
i'm not a good coder, but i once made a weather forecast script that uses above mentioned utilities, if you want you can take a peek here.