python peewee: minor changes to get the job done

sayhello_to_the_world · 08-27-2014, 08:49 AM

Well, I am fairly new to Python and I want to store the results of a parsing job in a db. I heard of peewee, which is said to be very useful and handy for such tasks.

I want to use Python and peewee, and I think I have to do something like the following. After installing peewee correctly I ran the script, and now see what happened.

Code:

import urllib
import urlparse
import re
import peewee
import json

db = MySQLDatabase('cpan', user='root',passwd='rimbaud')

class User(Model):
    name = TextField()
    cname = TextField()
    email = TextField()
    url = TextField()

    class Meta:
        database = db # this model uses the cpan database


User.create_table() #ensure table is created


url = "http://search.cpan.org/author/?W"
html = urllib.urlopen(url).read()
for lk, capname, name in re.findall('<a href="(/~.*?/)"><b>(.*?)</b></a><br/><small>(.*?)</small>', html):
    alk = urlparse.urljoin(url, lk)

    data = { 'url':alk, 'name':name, 'cname':capname }

    phtml = urllib.urlopen(alk).read()
    memail = re.search('<a href="mailto:(.*?)">', phtml)
    if memail:
        data['email'] = memail.group(1)


data = json.load() #your json data file here

for entry in data: #assuming your data is an array of JSON objects
    user = User.create(name=entry["name"], cname=entry["cname"],
        email=entry["email"], url=entry["url"])
    user.save()

I got back this error.

Code:

Traceback (most recent call last):
  File "cpan5.py", line 10, in <module>
    db = MySQLDatabase('cpan', user='root',passwd='rimbaud')
NameError: name 'MySQLDatabase' is not defined
linux-70ce:/home/martin/perl #
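
For readers following along: a NameError like the one above is what Python raises when a bare name is used that was never bound in the current namespace. `import peewee` binds only the name `peewee`, so the class would have to be written as `peewee.MySQLDatabase(...)` or imported by name with `from peewee import MySQLDatabase`. The sketch below shows the same effect with the standard-library `json` module as a stand-in, so it runs without peewee or a database; the variable names are made up for illustration.

```python
import json  # binds only the module name "json"

# The bare name "loads" was never imported, so looking it up fails,
# just like the bare name MySQLDatabase after "import peewee".
try:
    loads("[1, 2]")
    bare_name_works = True
except NameError:
    bare_name_works = False

# Option 1: qualify the name with the module.
via_module = json.loads("[1, 2]")

# Option 2: import the name itself into this namespace.
from json import loads
via_name = loads("[1, 2]")

print(bare_name_works, via_module, via_name)
```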

Assuming this is all right now, I have set up the following.
So far so good, but it fails at a certain point.

Code:

import urllib
import urlparse
import re
# import peewee
import json
from peewee import *



#from peewee import MySQLDatabase ('cpan', user='root',passwd='rimbaud') 


db = MySQLDatabase('cpan', user='root',passwd='rimbaud') 

class User(Model):
    name = TextField()
    cname = TextField()
    email = TextField()
    url = TextField()

    class Meta:
        database = db # this model uses the cpan database


User.create_table() #ensure table is created


url = "http://search.cpan.org/author/?W"
html = urllib.urlopen(url).read()
for lk, capname, name in re.findall('<a href="(/~.*?/)"><b>(.*?)</b></a><br/><small>(.*?)</small>', html):
    alk = urlparse.urljoin(url, lk)

    data = { 'url':alk, 'name':name, 'cname':capname }

    phtml = urllib.urlopen(alk).read()
    memail = re.search('<a href="mailto:(.*?)">', phtml)
    if memail:
        data['email'] = memail.group(1)


data = json.load('emailyour json data file here

for entry in data: #assuming your data is an array of JSON objects
    user = User.create(name=entry["name"], cname=entry["cname"],
        email=entry["email"], url=entry["url"])
    user.save()

I guess that a data file must exist, one that has been created by the script during the parsing... is this right?

Code:

martin@linux-70ce:~/perl> python cpan_100.py
Traceback (most recent call last):
  File "cpan_100.py", line 47, in <module>
    data = json.load('emailyour json data file here
  File "/usr/lib/python2.7/json/__init__.py", line 286, in load
    return loads(fp.read(),
AttributeError: 'str' object has no attribute 'read'
martin@linux-70ce:~/perl>

Well, at the moment I do not know why I get so many errors.
I would be happy for any and all hints.

Love to hear from you.

TB0ne · 08-27-2014, 09:53 AM

Quote:

Originally Posted by sayhello_to_the_world

Well, I am fairly new to Python and I want to store the results of a parsing job in a db. I heard of peewee, which is said to be very useful and handy for such tasks. I want to use Python and peewee, and I think I have to do something like the following. After installing peewee correctly I ran the script, and now see what happened.

I got back this error.

Code:

Traceback (most recent call last):
  File "cpan5.py", line 10, in <module>
    db = MySQLDatabase('cpan', user='root',passwd='rimbaud')
NameError: name 'MySQLDatabase' is not defined
linux-70ce:/home/martin/perl #

Assuming this is all right now, I have set up the following. So far so good, but it fails at a certain point.

Does it fail or not??? Why are you ASSUMING it works? Did you correct the error in the script or not?

Quote:

I guess that a data file must exist, one that has been created by the script during the parsing... is this right?

..and why are you asking? Can you not see if a file is created? Did you LOOK for a file?

Quote:

Code:

martin@linux-70ce:~/perl> python cpan_100.py
Traceback (most recent call last):
  File "cpan_100.py", line 47, in <module>
    data = json.load('emailyour json data file here
  File "/usr/lib/python2.7/json/__init__.py", line 286, in load
    return loads(fp.read(),
AttributeError: 'str' object has no attribute 'read'
martin@linux-70ce:~/perl>

Well, at the moment I do not know why I get so many errors.

Did you not read the error message?? It is VERY clear...you're trying to load a file of the name "emailyour json data file here", didn't close your parens, and have a TOTALLY different syntax from what you had in the first script. You've been posting this same script/error set for over two months in other forums, and have had other people point this out. Why is this not clear?

Either fix the file name/syntax, or it won't work.

dugan · 08-27-2014, 12:16 PM

Quote:

data = json.load('emailyour json data file here

Well, this line, which was clearly pointed out in the error message, is obviously a) wrong and b) the problem. What is it supposed to do, and why do you think that what you have there will work?

If it was supposed to be an "example", then you were supposed to change the quoted text in the example to a file-like object, not to leave it as a string (with no closing quote).
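
To make the "file-like object" point concrete, here is a minimal sketch. It assumes Python 3 and uses `io.StringIO` as a stand-in for a real file, so nothing has to exist on disk; the file name `authors.json` and the sample record are invented for illustration and do not come from the original script.

```python
import io
import json

# json.load() wants an object with a .read() method, not a file name.
try:
    json.load("authors.json")  # a plain string has no .read()
    string_accepted = True
except AttributeError:
    string_accepted = False

# Any file-like object works; io.StringIO stands in for open("authors.json").
fake_file = io.StringIO('[{"name": "Example Author", "cname": "EXAMPLE"}]')
data = json.load(fake_file)

# With a real file on disk the usual pattern would be:
#     with open("authors.json") as fp:
#         data = json.load(fp)
print(string_accepted, data)
```

If the JSON text is already in a string rather than a file, `json.loads()` (with the trailing "s") is the call that accepts it.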

sayhello_to_the_world · 08-28-2014, 03:00 AM

Hello TBone, hello dugan

Many thanks for the replies and all the tips. I will do as advised and will name a file, e.g. like so

What is done here: we are passing a string to json.load. This call expects a "file-like" object,
so we can call open on a file and use the returned handle.

Or, what if we just use the results of the parser?

BTW: We are already populating the data object when parsing the HTML, so we can omit the data = json.load('email') line and simply access the data object directly in the for loop at the end as written.
That line was only added as an example; it is clear that we get the data from the parsing job
in the first place!

We also might want to do data = []
before the HTML parsing loop, and then we can do entry = { 'url':alk, 'name':name, 'cname':capname } and data.append(entry.copy()) within the loop.
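
The list idea can be sketched as follows. In the script as posted, `data` is rebound to a fresh dict on every pass, so only the last author survives, and iterating over a dict yields its keys (strings) rather than records. This is only a sketch under assumptions: it targets Python 3 (where `urljoin` lives in `urllib.parse`), it uses a made-up HTML snippet instead of fetching the real page, and it stores an empty string as a placeholder where the script would look up the e-mail address.

```python
import re
from urllib.parse import urljoin  # Python 3 home of urlparse.urljoin

# Made-up sample standing in for the downloaded author index page.
html = (
    '<a href="/~alice/"><b>ALICE</b></a><br/><small>Alice Example</small>'
    '<a href="/~bob/"><b>BOB</b></a><br/><small>Bob Example</small>'
)
base_url = "http://search.cpan.org/author/?W"
pattern = '<a href="(/~.*?/)"><b>(.*?)</b></a><br/><small>(.*?)</small>'

data = []  # one dict per author, filled inside the loop
for lk, capname, name in re.findall(pattern, html):
    entry = {"url": urljoin(base_url, lk), "name": name, "cname": capname}
    # The e-mail lookup may find nothing, so give the key a default value;
    # otherwise entry["email"] would raise KeyError later on.
    entry["email"] = ""
    data.append(entry)

for entry in data:
    print(entry["cname"], entry["url"], entry["email"])
```

Because a new dict is built on each pass, appending `entry` directly is enough here; the `.copy()` would only matter if one dict were reused across iterations.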

What do you say?

TB0ne · 08-28-2014, 08:58 AM

Quote:

Originally Posted by sayhello_to_the_world

Hello TBone, hello dugan
Many thanks for the replies and all the tips. I will do as advised and will name a file, e.g. like so

These aren't 'tips'...this is reading the VERY CLEAR ERROR MESSAGE. It told you specifically what the problem was, and where.

Quote:

What is done here: we are passing a string to json.load. This call expects a "file-like" object, so we can call open on a file and use the returned handle. Or, what if we just use the results of the parser?

BTW: We are already populating the data object when parsing the HTML, so we can omit the data = json.load('email') line and simply access the data object directly in the for loop at the end as written. That line was only added as an example; it is clear that we get the data from the parsing job in the first place!

We also might want to do data = [] before the HTML parsing loop, and then we can do entry = { 'url':alk, 'name':name, 'cname':capname } and data.append(entry.copy()) within the loop.

What do you say?

We say it's your program...write it however you'd like. This is much like your other threads, asking about Perl and other languages to do DB insertion/manipulation. Those programs were copied from other websites, and you just posted errors you got when trying to run them. If you're not going to actually learn how to write the code, then you really should hire someone to write it for you.

sayhello_to_the_world · 09-02-2014, 04:22 PM

hello dear TBone

tx for the hints and all your support!

We are already populating the data object when parsing the HTML. That said, I think that we can omit the data = json.load('email') line and simply access the data object directly in the for loop at the end as written. We are getting the data from the parsing process.

So we can go like this:

Code:

import urllib
import urlparse
import re
# import peewee
import json
from peewee import *


#from peewee import MySQLDatabase ('cpan', user='root',passwd='rimbaud') 


db = MySQLDatabase('cpan', user='root',passwd='rimbaud') 

class User(Model):
    name = TextField()
    cname = TextField()
    email = TextField()
    url = TextField()

    class Meta:
        database = db # this model uses the cpan database

        
User.create_table() #ensure table is created


url = "http://search.cpan.org/author/?W"
html = urllib.urlopen(url).read()
for lk, capname, name in re.findall('<a href="(/~.*?/)"><b>(.*?)</b></a><br/><small>(.*?)</small>', html):
    alk = urlparse.urljoin(url, lk)

    data = { 'url':alk, 'name':name, 'cname':capname }

    phtml = urllib.urlopen(alk).read()
    memail = re.search('<a href="mailto:(.*?)">', phtml)
    if memail:
        data['email'] = memail.group(1)


# data = json.load('email') #your json data file here

for entry in data: #assuming your data is an array of JSON objects
    user = User.create(name=entry["name"], cname=entry["cname"],
        email=entry["email"], url=entry["url"])
    user.save()

Note: I disabled this line in the code:

Code:

 # data = json.load('email') #your json data file here

Doing so, I went ahead and ran the code:

Code:

martin@linux-70ce:~/perl> python cpan_100.py
Traceback (most recent call last):
  File "cpan_100.py", line 27, in <module>
    User.create_table() #ensure table is created
  File "build/bdist.linux-i686/egg/peewee.py", line 3078, in create_table
  File "build/bdist.linux-i686/egg/peewee.py", line 2471, in create_table
  File "build/bdist.linux-i686/egg/peewee.py", line 2414, in execute_sql
  File "build/bdist.linux-i686/egg/peewee.py", line 2283, in __exit__
  File "build/bdist.linux-i686/egg/peewee.py", line 2406, in execute_sql
  File "/usr/lib/python2.7/site-packages/MySQLdb/cursors.py", line 174, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib/python2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
peewee.OperationalError: (1050, "Table 'user' already exists")
martin@linux-70ce:~/perl>

Well, it seems to be clear that I have some other issues now...
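
For what it is worth, the last line of the traceback says that the table 'user' already exists, presumably left over from an earlier run that got as far as creating it. At the SQL level the usual remedy is `CREATE TABLE IF NOT EXISTS`. The sketch below demonstrates that with the standard-library `sqlite3` module and an in-memory database as a stand-in for MySQL, so it is not the peewee code itself. peewee's table-creation methods also accept an option to skip existing tables; its name has changed between releases, so check the documentation for the installed version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database for the demo
conn.execute("CREATE TABLE user (name TEXT, cname TEXT, email TEXT, url TEXT)")

# A second plain CREATE TABLE fails, much like the 1050 error from MySQL.
try:
    conn.execute("CREATE TABLE user (name TEXT, cname TEXT, email TEXT, url TEXT)")
    second_create_failed = False
except sqlite3.OperationalError:
    second_create_failed = True

# IF NOT EXISTS turns the statement into a no-op when the table is there.
conn.execute(
    "CREATE TABLE IF NOT EXISTS user (name TEXT, cname TEXT, email TEXT, url TEXT)"
)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]
print(second_create_failed, tables)
```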

TB0ne · 09-02-2014, 06:29 PM

Quote:

Originally Posted by sayhello_to_the_world

hello dear TBone
tx for the hints and all your support!

And as you've been told MANY times, SPELL OUT YOUR WORDS, and don't use text speak.

Quote:

We are already populating the data object when parsing the HTML. That said, I think that we can omit the data = json.load('email') line and simply access the data object directly in the for loop at the end as written. We are getting the data from the parsing process. So we can go like this:

Code:

import urllib
import urlparse
import re
# import peewee
import json
from peewee import *
#from peewee import MySQLDatabase ('cpan', user='root',passwd='rimbaud') 
db = MySQLDatabase('cpan', user='root',passwd='rimbaud') 

class User(Model):
    name = TextField()
    cname = TextField()
    email = TextField()
    url = TextField()

    class Meta:
        database = db # this model uses the cpan database
        
User.create_table() #ensure table is created

url = "http://search.cpan.org/author/?W"
html = urllib.urlopen(url).read()
for lk, capname, name in re.findall('<a href="(/~.*?/)"><b>(.*?)</b></a><br/><small>(.*?)</small>', html):
    alk = urlparse.urljoin(url, lk)
    data = { 'url':alk, 'name':name, 'cname':capname }
    phtml = urllib.urlopen(alk).read()
    memail = re.search('<a href="mailto:(.*?)">', phtml)
    if memail:
        data['email'] = memail.group(1)

# data = json.load('email') #your json data file here
for entry in data: #assuming your data is an array of JSON objects
    user = User.create(name=entry["name"], cname=entry["cname"],
        email=entry["email"], url=entry["url"])
    user.save()

Note: I disabled this line in the code:

Code:

 # data = json.load('email') #your json data file here

Doing so, I went ahead and ran the code:

Code:

martin@linux-70ce:~/perl> python cpan_100.py
Traceback (most recent call last):
  File "cpan_100.py", line 27, in <module>
    User.create_table() #ensure table is created
  File "build/bdist.linux-i686/egg/peewee.py", line 3078, in create_table
  File "build/bdist.linux-i686/egg/peewee.py", line 2471, in create_table
  File "build/bdist.linux-i686/egg/peewee.py", line 2414, in execute_sql
  File "build/bdist.linux-i686/egg/peewee.py", line 2283, in __exit__
  File "build/bdist.linux-i686/egg/peewee.py", line 2406, in execute_sql
  File "/usr/lib/python2.7/site-packages/MySQLdb/cursors.py", line 174, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib/python2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
peewee.OperationalError: (1050, "Table 'user' already exists")
martin@linux-70ce:~/perl>

Well, it seems to be clear that I have some other issues now...

Right, because you aren't passing the right arguments to the functions, since you commented it out, rather than fixing the syntax problem(s). You've asked about how to do this in perl before, and now python. You've directly copied programs from others...haven't you tried to read/understand the code, so that YOU can write it, rather than asking us to customize a program for you?

sayhello_to_the_world · 09-06-2014, 04:34 PM

hello dear TBone,

Rightly said, you're right: all your preliminary thoughts and ideas are not bad, sure thing.

You convinced me to do more work to find out the issues. For now I will figure out what goes wrong with the database connection.

Code:

  File "cpan_100.py", line 27, in <module>
    User.create_table() #ensure table is created
  File "build/bdist.linux-i686/egg/peewee.py", line 3078, in create_table
  File "build/bdist.linux-i686/egg/peewee.py", line 2471, in create_table
  File "build/bdist.linux-i686/egg/peewee.py", line 2414, in execute_sql
  File "build/bdist.linux-i686/egg/peewee.py", line 2283, in __exit__
  File "build/bdist.linux-i686/egg/peewee.py", line 2406, in execute_sql
  File "/usr/lib/python2.7/site-packages/MySQLdb/cursors.py", line 174, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib/python2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler

I will come back and let you know all my findings.

Thanks for all your help - greetings

TB0ne · 09-06-2014, 08:49 PM

Quote:

Originally Posted by sayhello_to_the_world

hello dear TBone,

Rightly said, you're right: all your preliminary thoughts and ideas are not bad, sure thing.

You convinced me to do more work to find out the issues. For now I will figure out what goes wrong with the database connection.

I will come back and let you know all my findings.

Yes...right...like you've said several times before, after you've been "convinced" to show some effort, but don't ever actually seem to DO it. Just like you've never come back and posted your work or solutions.

Based on your posting history, I just don't believe you.