I am running into a weird error with spynner, though the question is a generic one. Spynner is a stateful web-browser module for Python. It works fine when it works, but on almost every run I get a failure saying this:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/spynner-2.16.dev0-py2.7.egg/spynner/browser.py", line 1651, in createRequest
    self.cookies,
AttributeError: 'Browser' object has no attribute 'cookies'
Segmentation fault (core dumped)
The problem here is that it's segfaulting and not letting me continue.
Looking at the code for spynner, I see that the cookies variable is in fact initialized in the __init__() function of the Browser class like this:
self.cookies = []
Now, on failure, it's really saying that __init__() has not run, since it's not seeing the cookies variable. I do not understand how that can be possible. Without restricting the question to the spynner module, can someone venture a guess as to how a Python object could fail with an error like this?
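For illustration, here is a minimal sketch of one generic way such an error can arise. It is not necessarily what happens inside spynner. The idea is that a method gets invoked on the object before the line in __init__() that sets the attribute has run, for example through a hook called by a base-class constructor. The names Base, Child, on_ready and try_build are hypothetical and exist only for this example.

```python
class Base(object):
    def __init__(self):
        # The base constructor calls a hook before the subclass
        # has finished setting up its own attributes.
        self.on_ready()

    def on_ready(self):
        pass


class Child(Base):
    def __init__(self):
        Base.__init__(self)   # on_ready() runs during this call...
        self.cookies = []     # ...so this attribute does not exist yet

    def on_ready(self):
        # Touches an attribute that __init__ has not created so far.
        return len(self.cookies)


def try_build():
    """Return the AttributeError message raised while building Child, or None."""
    try:
        Child()
    except AttributeError as exc:
        return str(exc)
    return None


message = try_build()
print(message)
```

The __init__() method was entered here, yet the attribute is still missing when the hook runs, so "the attribute is set in __init__()" does not by itself guarantee that every method call sees it.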
EDIT: I definitely would have pasted my code here, except it's not all in one place for me to show compactly. I should have done this earlier, but here is the overall structure and how I instantiate and use spynner.
    # helper class to get url data
    class C:
        def __init__(self):
            self.browser = spynner.Browser()

        def get_data(self, url):
            try:
                self.browser.load(url)
                return self.browser.html
            except:
                raise

    # class that does other stuff among saving url data to disk
    class B:
        def save_url_to_disk(self, url):
            urlObj = C()
            html = urlObj.get_data(url)
            # do stuff with html

    # class that drives everything
    class A:
        def do_stuff_and_save_url_data(self, url):
            fileObj = B()
            fileObj.save_url_to_disk(url)

    driver = A()
    # call this function for multiple URLs.
    driver.do_stuff_and_save_url_data(url)
The way I run it is ---
# xvfb-run python myfile.py
The segfault is probably caused by something else I am doing. Maybe it's because of the xvfb I am using and not handling properly? I don't know yet. I need to mention that I am relatively new to Python.
I noticed that when I run the code above with, say, 'http://www.google.com', I get the segfault every other time.
The code block of do_stuff_and_save_url_data() doesn't use the reference self: so the execution of this function doesn't depend on driver.
The code block of save_url_to_disk() also doesn't use the reference self: so the execution of this second function doesn't depend on the object fileObj.
Only the code block of get_data() uses the reference self, and more precisely the reference self.browser: so its execution and result depend on the attribute browser of the instance urlObj of class C. This attribute is in fact an instance of the spynner.Browser class.
In the end, you "do stuff with html" using only the data output by spynner.Browser().html. Creating driver and fileObj isn't necessary in any way.
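To make the point about self concrete, here is a small sketch with hypothetical names (Shouter and shout are not part of the question's code). A method whose body never reads self gives the same result for every instance, and the same result as an ordinary function.

```python
class Shouter(object):
    def shout(self, text):
        # 'self' is never used, so the instance contributes nothing.
        return text.upper()


def shout(text):
    # The same logic written as a plain function.
    return text.upper()


a = Shouter()
b = Shouter()

# Two distinct instances and the plain function all agree.
same = a.shout("hi") == b.shout("hi") == shout("hi")
print(same)
```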
Another point is that when the instruction driver.do_stuff_and_save_url_data(url) is executed, the method driver.do_stuff_and_save_url_data is first created, then executed, and finally "destroyed" (or more precisely, forgotten somewhere in RAM) because it hasn't been assigned to any identifier.
Then the identifier fileObj, which belongs to the local namespace of the function driver.do_stuff_and_save_url_data(), is lost too, which means the instance fileObj of class B is also lost for later use, since no identifier assigned to it remains alive.
It's the same for save_url_to_disk(): after the creation and execution of the method fileObj.save_url_to_disk(url), the object urlObj of class C, which contains a browser instance (an object created by spynner.Browser()), is lost: the created browser and all its data are lost.
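The loss of the local object can be observed with a weak reference. This is only a sketch: FakeBrowser and Holder are hypothetical stand-ins for spynner.Browser and class C, and the immediate cleanup shown relies on CPython's reference counting (the gc.collect() call is there as a precaution).

```python
import gc
import weakref


class FakeBrowser(object):
    """Hypothetical stand-in for spynner.Browser."""
    def __init__(self):
        self.cookies = []


class Holder(object):
    """Plays the role of class C in the question."""
    def __init__(self):
        self.browser = FakeBrowser()


observed = []


def save_url_to_disk(url):
    holder = Holder()  # local name, like urlObj in the question
    # Keep only a weak reference, which does not keep the browser alive.
    observed.append(weakref.ref(holder.browser))
    return url


save_url_to_disk("http://example.com")
gc.collect()

# The weak reference returns None once the browser has been destroyed.
browser_still_alive = observed[0]() is not None
print(browser_still_alive)
```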
I wonder whether this destruction of the browser instance after each execution of do_stuff_and_save_url_data() and save_url_to_disk() is what causes the cookies information to be destroyed before a later call.
So, in my opinion, your code only embeds two functions in the definitions of classes A and B, and they are used as plain functions, not as methods.
1/ I don't think it is a good coding pattern. When one wants only plain functions, they should be written outside of any class.
2/ The problem is that if operations are triggered by functions, a new browser is created each time these functions are called, even if they are dressed up as methods.
You will tell me that you want these functions to act on data provided by the browser created by spynner.Browser().
That's why I think that they should not be functions embedded in classes as they are now, but real methods attached to a stable instance of a browser. That is the aim of an object: to keep the data and the tools that process the data in the same namespace.
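The difference between the two styles can be sketched without spynner at all. FakeBrowser below is a hypothetical stand-in that counts how many times it is instantiated and records each loaded URL in its cookies list; Fetcher and fetch_with_new_browser are likewise invented for this example.

```python
class FakeBrowser(object):
    """Hypothetical stand-in for spynner.Browser that counts instances."""
    created = 0

    def __init__(self):
        FakeBrowser.created += 1
        self.cookies = []

    def load(self, url):
        # Pretend that each visited page leaves one cookie behind.
        self.cookies.append(url)
        return True


def fetch_with_new_browser(url):
    # Function style: a fresh browser per call, state is thrown away.
    browser = FakeBrowser()
    browser.load(url)
    return len(browser.cookies)


class Fetcher(object):
    # Method style: one browser lives as long as the Fetcher instance.
    def __init__(self):
        self.browser = FakeBrowser()

    def fetch(self, url):
        self.browser.load(url)
        return len(self.browser.cookies)


urls = ["http://a.example", "http://b.example", "http://c.example"]

per_call = [fetch_with_new_browser(u) for u in urls]
created_by_functions = FakeBrowser.created

fetcher = Fetcher()
shared = [fetcher.fetch(u) for u in urls]
created_by_fetcher = FakeBrowser.created - created_by_functions

print(per_call, created_by_functions)
print(shared, created_by_fetcher)
```

With the function style, every call starts from an empty cookies list; with a single long-lived instance, the state accumulates across calls and only one browser is ever created.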
All that said, I would personally write:
    import spynner

    class C(spynner.Browser):
        def __init__(self):
            spynner.Browser.__init__(self)

        def get_data(self, url):
            # load the page in this browser; the page source is then read
            # from self.html, the same attribute the question's code uses
            self.load(url)
            return self.html

        # method that does other stuff among saving url data to disk
        def save_url_to_disk(self, url):
            html = self.get_data(url)
            # do stuff with html

        # method that drives everything
        def do_stuff_and_save_url_data(self, url):
            self.save_url_to_disk(url)

    driver = C()
    # call this method for multiple URLs; the same browser is reused each time
    driver.do_stuff_and_save_url_data(url)
But I'm not sure I have understood all your considerations well, and I warn you that I didn't know spynner before reading your post. All that I've written could be wrong with respect to your real problem. Please keep a critical eye on my post.