
Object-oriented programming with web parsing in Python

I am teaching myself object-oriented programming and web parsing in Python. I want to create a class that will parse a web page. I have a problem and a question about my code.

I am trying to download a page using BeautifulSoup. I created a class and a function to download the page, but the page doesn't seem to download, and I'm not sure why. If someone could help me with this, that would be great. Here is the code:

from BeautifulSoup import BeautifulSoup
import urllib2

class parser():

    def __init__(self, url):
                self.url = "http://www.any_url"
                self.contents  = ''
                
        def download_page(self):
            
                page=urllib2.urlopen(self.url)
                soup = BeautifulSoup(page.read())

                page_find=soup.findAll()
                print page_find

if __name__ == '__main__':

    parser.download_page
    

Another issue I had was the indentation. Right now, it appears that my function download_page exists inside my constructor. I tried to keep my functions separate, but I kept getting errors because of my indents. I basically just kept hitting Tab until it all compiled. Could someone explain why this is happening? Is it really a problem?

I ask because whenever I look at object-oriented code in Python, the functions are usually indented more evenly.

I think the problem is that you aren't using classes correctly. Try something like:

class Parser(object):
 
    def __init__(self, url):
        ...

    def download_page(self):
        ...
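
On the indentation question: in the skeleton above, both def lines sit at the same depth, which is what makes both of them methods of the class. As a rough illustration (the class names Nested and Flat and the example URL are made up for this sketch, and it assumes Python 3), here is what changes when the second def is indented under __init__ instead:

```python
class Nested(object):
    def __init__(self, url):
        self.url = url

        # Indented under __init__, so this is only a local function that
        # exists while __init__ runs; it never becomes a method.
        def download_page(self):
            return self.url


class Flat(object):
    def __init__(self, url):
        self.url = url

    # Same indentation as __init__, so this is a method of the class.
    def download_page(self):
        return self.url


# Hypothetical URL, used only to build the instances; nothing is fetched.
nested_has_method = hasattr(Nested("http://www.example.com"), "download_page")
flat_has_method = hasattr(Flat("http://www.example.com"), "download_page")
print(nested_has_method, flat_has_method)
```

So a nested download_page really is a problem: instances of the class never get that method. Using a consistent four spaces per level (and not mixing tabs with spaces) avoids the errors you were working around.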

Then use:

parser = Parser(url) # create instance of the class
parser.download_page() # call instance method

At the moment, you are trying to call download_page on the class, not on an instance.
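
To make the difference concrete, here is a minimal sketch. The method body is a placeholder that returns a string instead of downloading anything, and the URL is made up, so it runs without network access:

```python
class Parser(object):
    def __init__(self, url):
        self.url = url

    def download_page(self):
        # Placeholder body: returns a string instead of fetching the page,
        # so the example runs without network access.
        return "downloading " + self.url


# Without parentheses this only looks the method up; nothing runs.
looked_up = Parser.download_page

# Calling it on the class itself fails: there is no instance to bind to self.
try:
    Parser.download_page()
    class_call_failed = False
except TypeError:
    class_call_failed = True

# Create an instance first, then call the method on it.
parser = Parser("http://www.example.com")  # hypothetical URL
result = parser.download_page()
```

Note that the line in your main block has no parentheses at all, so even with a correctly defined class it would only reference the method, not call it.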

That said, when you have a class with "two methods, one of which is __init__", you should probably stop writing classes.
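
As a rough sketch of what that could look like, the class collapses into a single function. This assumes Python 3, where urllib.request replaces the question's urllib2. The opener parameter and fake_opener are my own additions so the demo runs offline; they are not part of any library API. The BeautifulSoup parsing step is left out to keep the example dependency-free.

```python
import io
from urllib.request import urlopen  # Python 3 replacement for urllib2.urlopen


def download_page(url, opener=urlopen):
    """Return the raw bytes of the page at url.

    opener is a hypothetical hook added for this sketch so the function
    can be exercised without a network connection.
    """
    response = opener(url)
    try:
        return response.read()
    finally:
        response.close()


def fake_opener(url):
    # Stand-in for urlopen: returns a file-like object with canned HTML.
    return io.BytesIO(b"<html><body>hello</body></html>")


# Offline demo with a made-up URL; drop the opener argument to hit the network.
html = download_page("http://www.example.com", opener=fake_opener)
```

The returned bytes can then be handed to BeautifulSoup exactly as in the question, without any class involved.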
