I started teaching myself Python and am stuck trying to understand why I am not getting the output that I want. Long story short,
I am using pdb for debugging, and here is the function in which I am having my issue:
Code:
import re
...
...
...
def find_all_flvs(url):
    soup = BeautifulSoup(urllib2.urlopen(url))
    flvs = []
    for link in soup.findAll(onclick=re.compile("doShowCHys=1*")):
        link = str(link)
        vidnum = re.search("\d{5,6}.*&", link)
        vidurl = "http://www.blahblah.com/home/GetPlayerXML.aspx?lpk4=%s" % vidnum
        for hashval_url in BeautifulSoup(urllib2.urlopen(vidurl)).findAll("flv"):
            flvs.append(hashval_url.text)
    return flvs
I verified that my regex (\d{5,6}.*&) is correct against this sample string:
Code:
"/home/Player.aspx?lpk4=108148&playChapter=True\',960,540,94343);return false;"
produces:
which is what I want. However, when I step through the function in pdb and get to:
Code:
vidnum = re.search("\d{5,6}.*&", link)
this is what I end up with as the value of vidnum:
Code:
<_sre.SRE_Match object at 0xaaf8de8>
whereas I should be seeing:
so that it can simply be substituted into:
Code:
vidurl = "http://www.blahblah.com/home/GetPlayerXML.aspx?lpk4=%s" % vidnum
producing:
I have been through several URLs and cannot figure out what I am doing wrong:
http://www.tutorialspoint.com/python...xpressions.htm
??
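For reference, here is a minimal sketch of what seems to be going on, using only the sample string from above. re.search returns a match object (or None when nothing matches) rather than the matched text, so the text has to be pulled out with .group(). The second pattern with a capturing group is my own assumption about what is wanted (just the digits, without the trailing "&"); the names sample, matched_text and digits_match are made up for illustration.

```python
import re

# Sample string taken from the post above.
sample = "/home/Player.aspx?lpk4=108148&playChapter=True\',960,540,94343);return false;"

# re.search returns a match object (or None), not the matched text itself.
match = re.search(r"\d{5,6}.*&", sample)

# .group() returns the text of the whole match, here including the trailing "&".
matched_text = match.group() if match else None

# Assumption: only the digits belong in the URL, so capture them in a group
# and read group(1) instead of the whole match.
digits_match = re.search(r"(\d{5,6})&", sample)
vidnum = digits_match.group(1) if digits_match else None

vidurl = "http://www.blahblah.com/home/GetPlayerXML.aspx?lpk4=%s" % vidnum
print(vidurl)
```

Formatting the match object itself with %s is what puts the <_sre.SRE_Match object at ...> text into the URL. Checking for None before calling .group() avoids an AttributeError on links where the pattern does not match.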