LinuxQuestions.org
Old 10-16-2018, 04:17 AM   #31
Turbocapitalist
Senior Member
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 4,110
Blog Entries: 3

Rep: Reputation: 2000

Using the example data in #21 above, if you use a tool that can work with XPath, then the following might provide the link:

Code:
//td/div[span/img[contains(@src,"theoneiwant")]]/preceding-sibling::div[last()]/a/@href
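The last() is a little confusing, but preceding-sibling is a reverse axis: positions count backwards from the current div, so [last()] selects the preceding div that is farthest away, that is, the first div in the td.
The div[] clause could be loosened to accept arbitrary children, for example div[.//img[contains(@src,"theoneiwant")]], if the span is not always there in between.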
 
2 members found this post helpful.
Old 10-16-2018, 10:44 AM   #32
individual
Member
 
Registered: Jul 2018
Posts: 234

Rep: Reputation: 176
Quote:
Originally Posted by iammike2 View Post
logic is the following

Search for <a href="http://website.com/happy.php?id=

If found <a href="http://website.com/happy.php?id= search for if SAME line contains <strong>
(If so, then place id in variable)

Go on searching for theoneiwant

if found write the ID to text file.

If NOT FOUND theoneiwant stop when reaching </TR> and then start searching the next <TR></TR> section
Are you going to use the id after finding "theoneiwant"? If you're only going to write it to a file, there is no need to store it in a variable.
 
1 members found this post helpful.
Old 10-16-2018, 12:17 PM   #33
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware-current
Posts: 5,250

Rep: Reputation: 1903
Just for fun, a bash solution:
Code:
#!/bin/bash

FILE=filename
startstring='href="http://website.com/happy.php?id='
andstring="theoneiwant"
id=0   # 0 means no id is pending

while read line; do
  # Link line that also contains <strong>: remember its id and description
  if [[ "$line" =~ "$startstring"([0-9]+).*"<strong>"(.*)"</strong>" ]]; then
    id=${BASH_REMATCH[1]}
    desc=${BASH_REMATCH[2]}
  fi
  # Marker found while an id is pending: report it
  if [[ "$line" =~ "$andstring"  &&  "$id" != 0 ]] ; then
    echo "id=$id desc=$desc"
  fi
  # End of the table row: forget the pending id
  [[ "$line" =~ "</tr>" ]] && id=0
done < "$FILE"
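
For readers who have not met BASH_REMATCH before, here is a minimal sketch of what the first test in the script captures. The sample line is made up for illustration (the real markup from post #21 may look different); only startstring and the regex are taken from the script above.

```shell
#!/bin/bash
# Hypothetical sample line, not taken from the original data.
line='<td><a href="http://website.com/happy.php?id=12345"><strong>Some title</strong></a></td>'
startstring='href="http://website.com/happy.php?id='

# Quoted parts are matched literally; the unquoted parts are regex syntax.
# Each parenthesised group lands in BASH_REMATCH[1], BASH_REMATCH[2], ...
if [[ "$line" =~ "$startstring"([0-9]+).*"<strong>"(.*)"</strong>" ]]; then
  id=${BASH_REMATCH[1]}     # digits right after "id="
  desc=${BASH_REMATCH[2]}   # text between <strong> and </strong>
  echo "id=$id desc=$desc"
fi
```

BASH_REMATCH[0] holds the whole matched text, which can be handy when debugging a pattern.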

Last edited by allend; 10-16-2018 at 12:25 PM.
 
3 members found this post helpful.
Old 10-16-2018, 08:57 PM   #34
iammike2
LQ Newbie
 
Registered: Oct 2018
Distribution: Synology DSM
Posts: 23

Original Poster
Rep: Reputation: 1
Quote:
Originally Posted by allend View Post
Just for fun, a bash solution
Code:
#!/bin/bash

FILE=filename
startstring='href="http://website.com/happy.php?id='
andstring="theoneiwant"
id=0

while read line; do
  if [[ "$line" =~ "$startstring"([0-9]+).*"<strong>"(.*)"</strong>" ]]; then
    id=${BASH_REMATCH[1]}
    desc=${BASH_REMATCH[2]}
  fi
  if [[ "$line" =~ "$andstring"  &&  "$id" != 0 ]] ; then
    echo "id=$id desc=$desc"
  fi
  [[ "$line" =~ "</tr>" ]] && id=0
done < $FILE

Wow! That is exactly the one. Works perfectly!


I can't thank you enough!


 
Old 10-16-2018, 09:03 PM   #35
iammike2
LQ Newbie
 
Registered: Oct 2018
Distribution: Synology DSM
Posts: 23

Original Poster
Rep: Reputation: 1
Thanks again, @allend.

That works perfectly and finds all the ones I want.

Now I can finally turn off my PC at night and leave the NAS to do all the work.

Sorry to have bothered you guys with this, but it had been bugging me for some time and I couldn't find a solution.

Thanks again to all. It is really appreciated, and I take my hat off to you guys! Very helpful!


Last edited by iammike2; 10-16-2018 at 09:08 PM.
 
1 members found this post helpful.
Old 10-16-2018, 09:19 PM   #36
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 18,069

Rep: Reputation: 2899
Thank you allend. As usual, I keep forgetting about BASH_REMATCH; don't know why.
 
2 members found this post helpful.
Old 10-16-2018, 09:32 PM   #37
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware-current
Posts: 5,250

Rep: Reputation: 1903
Thanks for the kind words and thanks for the problem. It was a good exercise in bash syntax.
 
1 members found this post helpful.
Old 10-17-2018, 01:35 AM   #38
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 12,993

Rep: Reputation: 4096
It is definitely unsafe (I mean processing XML in bash, in general), but in this case it works. Actually, I like it.
Here is a little modification:
Code:
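#!/bin/bash

FILE=filename
startstring='href="http://website.com/happy.php?id='
andstring="theoneiwant"
id=0

# -r stops read from treating backslashes in the input as escapes
while read -r line; do
  # Link line that also contains <strong>: remember its id and description
  if [[ "$line" =~ "$startstring"([0-9]+).*"<strong>"(.*)"</strong>" ]]; then
    id=${BASH_REMATCH[1]}
    desc=${BASH_REMATCH[2]}
    continue
  fi
  # No id pending: nothing to look for on this line
  [[ "$id" = 0 ]] && continue

  # Report the pending id once, then wait for the next link line
  if [[ "$line" =~ "$andstring" ]] ; then
    echo "id=$id desc=$desc"
    id=0
  fi
done < "$FILE"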
Also, =~ treats its right-hand side as a regexp, so I'm not really sure how ? and other tricky characters are handled. See for example here: https://stackoverflow.com/questions/...string-in-bash
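
One thing to watch with the modified loop: it no longer resets id at </tr>, so if a row has a link without <strong> but does contain theoneiwant, the id from an earlier row could be reported. Below is a sketch, not anyone's posted solution, that keeps the continue shortcuts from this post and the </tr> reset from post #33. The function name find_ids and the sample rows are made up for illustration, and the plain-string tests use == with glob wildcards instead of =~ to sidestep the regexp question.

```shell
#!/bin/bash
startstring='href="http://website.com/happy.php?id='
andstring="theoneiwant"

# find_ids FILE: hypothetical helper, prints "id=... desc=..." per matching row
find_ids() {
  local file=$1 line id=0 desc=
  while IFS= read -r line; do
    if [[ "$line" =~ "$startstring"([0-9]+).*"<strong>"(.*)"</strong>" ]]; then
      id=${BASH_REMATCH[1]}
      desc=${BASH_REMATCH[2]}
      continue
    fi
    # End of the row: forget any pending id
    if [[ "$line" == *"</tr>"* ]]; then
      id=0
      continue
    fi
    [[ "$id" = 0 ]] && continue
    # Literal substring test; no regexp involved
    if [[ "$line" == *"$andstring"* ]]; then
      echo "id=$id desc=$desc"
      id=0
    fi
  done < "$file"
}

# Made-up sample: row 1 has no marker, row 2 has no <strong>, row 3 has both
sample=$(mktemp)
cat > "$sample" <<'EOF'
<tr>
<td><div><a href="http://website.com/happy.php?id=111"><strong>First</strong></a></div>
<div><span><img src="/img/other.png"></span></div></td>
</tr>
<tr>
<td><div><a href="http://website.com/happy.php?id=222">No strong here</a></div>
<div><span><img src="/img/theoneiwant.png"></span></div></td>
</tr>
<tr>
<td><div><a href="http://website.com/happy.php?id=333"><strong>Third</strong></a></div>
<div><span><img src="/img/theoneiwant.png"></span></div></td>
</tr>
EOF
result=$(find_ids "$sample")
echo "$result"
rm -f "$sample"
```

With this sample only the third row is reported. Without the </tr> reset, the second row would be reported with the id left over from the first row.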
 
2 members found this post helpful.
Old 10-17-2018, 02:21 AM   #39
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware-current
Posts: 5,250

Rep: Reputation: 1903
Quote:
it is definitely unsafe (I mean to process an xml in bash, in general).
Very good point. Did you hear the one about the goose that pasted the example file into nano, and had problems when the long lines were broken when saved?
Thanks for the efficiency improvements.

As for the tricky characters, I will defer to @grail, who first demonstrated the use of BASH_REMATCH to me. I remember a comment that keeping the tricky characters in a variable often avoids surprises.
 
Old 10-17-2018, 02:32 AM   #40
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 12,993

Rep: Reputation: 4096
Quote:
Originally Posted by allend View Post
Did you hear the one about the goose that pasted the example file into nano, and had problems when the long lines were broken when saved?
yes, of course.
Quote:
Originally Posted by allend View Post
As for the tricky characters, I will defer to @grail who first demonstrated the use of BASH_REMATCH to me. I remember a comment that keeping the tricky characters in a variable often avoids unexpected surprises.
That comes down to the quoting of the right-hand side of =~. In short, something like this:
Code:
[[ "$var" =~ "$pattern" ]]   # quoted: the content of the variable pattern is matched as-is, as a literal string
[[ "$var" =~ $pattern ]]     # unquoted: the content of pattern is used as a regexp
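
A small sketch of that difference, using a pattern that contains ? as in the script. The variable names and test strings are made up for illustration, and the behaviour described assumes bash 3.2 or later with default settings.

```shell
#!/bin/bash
pattern='happy.php?id='
text='href="http://website.com/happy.php?id=42"'

# Quoted: every character of $pattern is literal, including "." and "?"
if [[ "$text" =~ "$pattern" ]]; then literal=match; else literal=nomatch; fi

# Unquoted: $pattern is a regexp, so "." is any character and "p?" is an
# optional "p"; the literal "?" in $text has nothing to match against
if [[ "$text" =~ $pattern ]]; then regex=match; else regex=nomatch; fi

# The same regexp happily accepts a string that merely resembles the pattern
if [[ 'happyXphid=' =~ $pattern ]]; then loose=match; else loose=nomatch; fi

echo "quoted: $literal, unquoted: $regex, unquoted on lookalike: $loose"
```

This is why the script quotes "$startstring" and leaves only the parts meant as regexp, such as ([0-9]+), unquoted.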
 
2 members found this post helpful.
  

