
Python code not seeing a class variable initialized in the __init__() function

I am running into a weird error with spynner, though the question is a generic one. Spynner is a stateful web-browser module for Python. It works fine when it works, but on almost every run I get a failure like this:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/spynner-2.16.dev0-py2.7.egg/spynner/browser.py", line 1651, in createRequest
    self.cookies,
AttributeError: 'Browser' object has no attribute 'cookies'
Segmentation fault (core dumped)

The problem here is that it's segfaulting and not letting me continue.

Looking at the code for spynner I see that the cookies variable is in fact initialized in the __init__() function for the Browser class like this:

self.cookies = []

Now, the failure is really saying that __init__() did not run, since the object does not see the cookies variable. I do not understand how that can be possible. Without restricting yourself to the spynner module, can someone venture a guess as to how a Python object could fail with an error like this?

EDIT: I would definitely have pasted my code here, except that it's not all in one place for me to show compactly. I should have done this earlier, but here is the overall structure and how I instantiate and use spynner.

import spynner

# helper class to get url data
class C:
    def __init__(self):
        self.browser = spynner.Browser()

    def get_data(self, url):
        try:
            self.browser.load(url)
            return self.browser.html
        except:
            raise

# class that does other stuff among saving url data to disk
class B:
    def save_url_to_disk(self, url):
        urlObj = C()
        html = urlObj.get_data(url)
        # do stuff with html


# class that drives everything
class A:
    def do_stuff_and_save_url_data(self, url):
        fileObj = B()
        fileObj.save_url_to_disk(url)

driver = A()
# call this function for multiple URLs.
driver.do_stuff_and_save_url_data(url)

The way I run it is:

# xvfb-run python myfile.py

The segfault is probably caused by something else I am doing. Maybe it's because of the xvfb I am using and not handling properly? I don't know yet. I should mention that I am relatively new to Python.

I noticed that when I run the code above with, say, 'http://www.google.com', I get the segfault every other time.

The body of do_stuff_and_save_url_data() doesn't use the reference self, so the execution of this function doesn't depend on driver.

The body of save_url_to_disk() doesn't use the reference self either, so the execution of this second function doesn't depend on the object fileObj.

Only the body of get_data() uses self, and more precisely self.browser, so its execution and result depend on the browser attribute of urlObj, the instance of class C. This attribute is an instance of the spynner.Browser class.

In the end, you "do stuff with html" using only the data returned by spynner.Browser().html. Creating driver and fileObj isn't necessary in any way.


Another point is that when the instruction driver.do_stuff_and_save_url_data(url) is executed, the bound method driver.do_stuff_and_save_url_data is first created, then called, and finally discarded (more precisely, left for Python to reclaim), because it hasn't been assigned to any identifier.
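
The creation of a fresh method object on each access can be checked with a minimal sketch. The Driver class and its do_stuff method are hypothetical stand-ins for class A, not code from the question:

```python
class Driver(object):
    """Hypothetical stand-in for class A in the question."""

    def do_stuff(self, url):
        return "processed " + url


driver = Driver()

# Each attribute access builds a new bound-method object around the same function.
first = driver.do_stuff
second = driver.do_stuff

same_object = first is second   # False: two distinct bound-method objects
equal = first == second         # True: same function bound to the same instance

result = first("http://www.google.com")
```

The bound methods compare equal because they wrap the same function and instance, but neither outlives the names that refer to it.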

The identifier fileObj, which belongs to the local namespace of the function do_stuff_and_save_url_data(), is then lost too, which means that the B instance it named is also lost for later use, since no live identifier refers to it any more.

It's the same for save_url_to_disk(): after the method fileObj.save_url_to_disk(url) has been created and executed, the object urlObj of class C, which holds a browser instance (an object created by spynner.Browser()), is lost: the created browser and all its data are lost.

I wonder whether this destruction of the browser instance after each execution of do_stuff_and_save_url_data() and save_url_to_disk() is what destroys the cookies information before a later call needs it.
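
This loss of the local object and its browser can be observed with a small sketch. FakeBrowser and Holder are hypothetical stand-ins, not spynner classes, and the immediate cleanup shown here relies on CPython's reference counting:

```python
import gc
import weakref


class FakeBrowser(object):
    """Hypothetical stand-in for spynner.Browser; not the real class."""

    def __init__(self):
        self.cookies = []


class Holder(object):
    """Plays the role of class C: it owns one browser."""

    def __init__(self):
        self.browser = FakeBrowser()


tracked = []


def save_url_to_disk(url):
    holder = Holder()  # local name, gone when the function returns
    # A weak reference lets us watch the browser without keeping it alive.
    tracked.append(weakref.ref(holder.browser))
    return "html for " + url


save_url_to_disk("http://www.google.com")
gc.collect()  # not needed on CPython here, but makes the check explicit

# The weak reference now returns None: the browser object no longer exists.
browser_still_alive = tracked[0]() is not None
```

After the call returns, nothing refers to the holder or its browser any more, so both are reclaimed.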
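
As a purely generic illustration of that guess, here is one way an attribute set in __init__() can be missing later: some cleanup step removes it, and a late call still reaches the object. The TornDown class and its close() and create_request() methods are hypothetical; I have not checked whether spynner does anything like this:

```python
class TornDown(object):
    """Hypothetical class; not taken from spynner."""

    def __init__(self):
        self.cookies = []

    def close(self):
        # Hypothetical cleanup that removes the attribute from the instance.
        del self.cookies

    def create_request(self):
        # A late caller (for example a callback) still expects the attribute.
        return self.cookies


obj = TornDown()
obj.close()

try:
    obj.create_request()
    message = None
except AttributeError as exc:
    message = str(exc)  # "'TornDown' object has no attribute 'cookies'"
```

The error text has the same shape as the one in the traceback, even though __init__() did run.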


So, in my opinion, your code only wraps two functions in the definitions of classes A and B, and they are used as plain functions, not as methods.

1/ I don't think this is a good coding pattern. When one only wants plain functions, they should be written outside of any class.

2/ The problem is that if the operations are triggered by functions, a new browser is created each time these functions are called, even if they are dressed up as methods.

You will tell me that you want these functions to act on data provided by the browser created by spynner.Browser().
That's why I think they should not be functions embedded in classes, as they are now, but real methods attached to a stable instance of a browser. That is the purpose of an object: to keep the data and the tools that process the data in the same namespace.
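
The difference between the two patterns can be counted with a short sketch. FakeBrowser, fetch_with_new_browser and Fetcher are hypothetical names, the fake load() simply returns a string, and the last two URLs are only illustrative:

```python
class FakeBrowser(object):
    """Hypothetical stand-in for spynner.Browser that counts its instances."""

    created = 0

    def __init__(self):
        FakeBrowser.created += 1

    def load(self, url):
        return "<html>%s</html>" % url


def fetch_with_new_browser(url):
    # Pattern from the question: a fresh browser for every call.
    return FakeBrowser().load(url)


class Fetcher(object):
    # Pattern suggested here: one browser kept alive on the instance.
    def __init__(self):
        self.browser = FakeBrowser()

    def get_data(self, url):
        return self.browser.load(url)


urls = ["http://www.google.com", "http://example.com/a", "http://example.com/b"]

for url in urls:
    fetch_with_new_browser(url)
created_per_call = FakeBrowser.created   # 3: one browser per URL

FakeBrowser.created = 0
fetcher = Fetcher()
pages = [fetcher.get_data(url) for url in urls]
created_shared = FakeBrowser.created     # 1: the same browser serves every URL
```

With the shared instance, whatever state the browser accumulates stays available from one URL to the next.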


All that said, I would personally write:

import spynner

class C(spynner.Browser):
    def __init__(self):
        # run the parent initializer so the browser's own attributes are set up
        spynner.Browser.__init__(self)

    def get_data(self, url):
        # load the page, then return its HTML, as in the question's get_data()
        self.load(url)
        return self.html

    # method that does other stuff among saving url data to disk
    def save_url_to_disk(self, url):
        html = self.get_data(url)
        # do stuff with html

    # method that drives everything
    def do_stuff_and_save_url_data(self, url):
        self.save_url_to_disk(url)


driver = C()  # one browser instance, reused for every URL
driver.do_stuff_and_save_url_data(url)
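
The explicit call to spynner.Browser.__init__(self) matters: in general, a subclass that overrides __init__() without calling the parent's version ends up without the parent's attributes. A minimal sketch with hypothetical classes (Base stands in for any parent class and is not spynner code):

```python
class Base(object):
    """Hypothetical parent class that sets an attribute in __init__()."""

    def __init__(self):
        self.cookies = []

    def create_request(self):
        return self.cookies


class ForgetsParentInit(Base):
    def __init__(self):
        # Base.__init__ is never called, so self.cookies is never created.
        self.name = "no parent init"


class CallsParentInit(Base):
    def __init__(self):
        Base.__init__(self)  # same style as spynner.Browser.__init__(self)
        self.name = "parent init called"


try:
    ForgetsParentInit().create_request()
    error = None
except AttributeError as exc:
    error = str(exc)  # "'ForgetsParentInit' object has no attribute 'cookies'"

ok = CallsParentInit().create_request()  # [] because the parent initializer ran
```

This is one generic way to get an "object has no attribute" error for something the parent's __init__() sets. Whether it applies to your case depends on code I haven't seen.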

But I'm not sure I have understood all your considerations correctly, and I warn you that I didn't know spynner before reading your post. Everything I've written could be wrong with respect to your real problem. Please keep a critical eye on my post.
