Python OOP - web session

I have the following script:

import mechanize, cookielib, re ...
br = mechanize.Browser()
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)
br.addheaders = ....
# ... and do stuff

Because my script is growing very big, I want to split it into classes: one class to handle the web connection, one class to do stuff, and so on. From what I read, I need something like:

from web_session import * # this is my class handling the web connection (cookies + auth)
from do_stuff import * # I do stuff on web pages

and in my main, I have:

browser = Web_session()
stuff = Do_stuff()

The problem for me is that I lose the session cookies when I pass the browser to Do_stuff. Can anyone help me with a basic example of classes and their interaction? Let's say I log in to a site, browse a page, and want to do something like re.findall("something", on_that_page). Thanks in advance.

Update: Main Script:

br = WebBrowser()
br.login(myId, myPass)

WebBrowser class:

class WebBrowser():

    def __init__(self):
        self.browser = mechanize.Browser()
        cj = cookielib.LWPCookieJar()
        self.browser.set_cookiejar(cj)
        self.browser.addheaders = ....

    def login(self, username, password):
        self.username = username
        self.password = password
        self.browser.open(some site)
        self.browser.submit(username, password)

    def open(self, url):
        self.url = url
        self.browser.open(url)

    def read(self, url):
        self.url = url
        page = self.browser.open(url).read()
        return page

Current state: this part works perfectly and I can log in, but I lose the mechanize class "goodies" such as opening, posting to, or reading a URL. For example:

management = br.read("some_url.php")

All my cookies are gone (error: must be logged in).

How can I fix it?

The mechanize.Browser class already has all the functionality it seems you want to put in your "Web_session" class (side note: naming conventions and readability would recommend "WebSession" instead).

Anyway, you will retain your cookies if you keep the same Browser object across calls. If you really want another wrapper class, just create a mechanize.Browser when instantiating your Web_session class, and keep it as an object attribute (for example, as "self.browser").

But you most likely don't need to do that. Just create a Browser in the __init__ of your Do_stuff class, keep it as an instance attribute, and reuse it for all requests:

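import re
import urllib

import cookielib
import mechanize

class DoStuff(object):
    def __init__(self):
        self.browser = mechanize.Browser()
        cj = cookielib.LWPCookieJar()
        self.browser.set_cookiejar(cj)

    def login(self, login_url, credentials):
        # mechanize.Browser has no post(); open() with a data argument sends a POST.
        # login_url is a placeholder for the site's login address,
        # credentials is a dict of form field names and values.
        self.browser.open(login_url, urllib.urlencode(credentials))

    def match_text_at_page(self, url, text):
        # this will use the same cookies as are used in the login
        response = self.browser.open(url)
        return re.findall(text, response.read())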
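
If a separate session class is still wanted, one way to keep the cookies is to hand the same session object to every class that needs it. The following is only a sketch under stated assumptions, not the asker's code: it uses the Python 3 standard library (urllib.request and http.cookiejar) in place of mechanize, and the names WebSession, DoStuff, read and find are made up for the example.

```python
import re
import http.cookiejar
import urllib.request


class WebSession(object):
    """Owns the cookie jar and the opener: one instance per logged-in session."""

    def __init__(self):
        self.cookies = http.cookiejar.CookieJar()
        self.opener = urllib.request.build_opener(
            urllib.request.HTTPCookieProcessor(self.cookies))

    def read(self, url, data=None):
        # data=None sends a GET, bytes send a POST. Cookies set by the
        # server land in self.cookies and are sent on later requests.
        with self.opener.open(url, data) as response:
            return response.read().decode("utf-8", "replace")


class DoStuff(object):
    """Works on pages, but borrows the session instead of creating its own."""

    def __init__(self, session):
        self.session = session

    def find(self, pattern, url):
        return re.findall(pattern, self.session.read(url))


session = WebSession()
stuff = DoStuff(session)  # same object, therefore same cookies
```

Because DoStuff only stores a reference, any cookies collected by an earlier session.read(...) call (a login request, for instance) are sent again when stuff.find(...) fetches a page. Creating a second WebSession inside DoStuff would start with an empty cookie jar, which is one way to end up logged out.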

Never use the construct from X import * as in

from web_session import *
from do_stuff import *

It's ok when you are experimenting in an interactive session, but don't use it in your code.

Imagine the following: in web_session.py you have a function called my_function, which you use in your main module. In do_stuff.py you have the import statement from some_lib_I_found_on_the_net import *. Everything is fine, but after a while your program mysteriously fails. It turns out that you upgraded some_lib_I_found_on_the_net.py, and the new version contains a function called my_function. Because do_stuff pulled that name into its own namespace, from do_stuff import * passes it on to your main module, where it silently replaces the one imported from web_session. Your main program is suddenly calling some_lib_I_found_on_the_net.my_function instead of web_session.my_function.

Python has such nice support for separating concerns, but with that lazy construct you'll just shoot yourself in the foot. Besides, it's nice to be able to look at your code and see where every object comes from, which you can't with the *.
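
The scenario can be reproduced in a few lines. This is a minimal sketch that builds the modules in memory so no extra files are needed; the module names (demo_web_session, demo_third_party, demo_do_stuff) and the make_module helper are invented for the illustration.

```python
import sys
import types


def make_module(name, source):
    """Build an in-memory module and register it so it can be imported."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


make_module("demo_web_session",
            "def my_function():\n    return 'web_session'\n")
# The "upgraded" library now happens to define my_function as well.
make_module("demo_third_party",
            "def my_function():\n    return 'third_party'\n")
# do_stuff star-imports the library, so it re-exports my_function.
make_module("demo_do_stuff", "from demo_third_party import *\n")

# The main module star-imports both, in this order.
main_namespace = {}
exec("from demo_web_session import *\n"
     "from demo_do_stuff import *\n", main_namespace)

result = main_namespace["my_function"]()
print(result)  # third_party
```

The second star import wins without any warning, so the main module ends up calling the third-party function.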

If you want to avoid long names like web_session.my_function(), either do import web_session as ws and then call ws.my_function(), or do from web_session import my_function, ...

Even if you only import one single module in this way, it can bite you. I had colleagues who had something like...

...
import util
...
from matplotlib import *
...
(a few hundred lines of code)
...
x = util.some_function()
...

Suddenly, they got an AttributeError on the call to util.some_function, which had worked like a charm for years. However they looked at the code, they couldn't understand what was wrong. It took a long time before someone realized that matplotlib had been upgraded, and that it now contained a function called (you guessed it) util!
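
The same failure can be imitated without matplotlib. In this sketch both modules are in-memory stand-ins: demo_plotting_lib plays the upgraded package, and the contents of util are made up.

```python
import sys
import types

# Stand-in for the colleagues' own util module (hypothetical contents).
util = types.ModuleType("util")
util.some_function = lambda: 42

# Stand-in for the upgraded package, which now exports a public
# name that happens to be called "util".
plotting_lib = types.ModuleType("demo_plotting_lib")
plotting_lib.util = lambda: None
sys.modules["demo_plotting_lib"] = plotting_lib

namespace = {"util": util}  # the effect of "import util"
exec("from demo_plotting_lib import *", namespace)  # rebinds "util"

try:
    namespace["util"].some_function()
    error = None
except AttributeError as exc:
    # "util" now refers to a function, which has no some_function attribute.
    error = exc

print(type(error).__name__)  # AttributeError
```

With import demo_plotting_lib (or an explicit from ... import of the names actually needed), the name util would have kept pointing at the module.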

Explicit is better than implicit!

