Trouble executing my class crawler

Question

I'm a complete newbie to Python when it comes to scraping web data using a class, so apologies in advance for any serious mistakes. I've written a script that parses the text of the a tags on the Wikipedia main page. I tried my level best to write the code correctly, but for some reason it throws an error when I execute it. The code and the error are given below.

The script:

import requests
from lxml.html import fromstring

class TextParser(object):

    def __init__(self):
        self.link = 'https://en.wikipedia.org/wiki/Main_Page'
        self.storage = None

    def fetch_url(self):
        self.storage = requests.get(self.link).text

    def get_text(self):
        root = fromstring(self.storage)
        for post in root.cssselect('a'):
            print(post.text)

item = TextParser()
item.get_text()

The error:

Traceback (most recent call last):
  File "C:\Users\mth\AppData\Local\Programs\Python\Python35-32\testmatch.py", line 38, in <module>
    item.get_text()
  File "C:\Users\mth\AppData\Local\Programs\Python\Python35-32\testmatch.py", line 33, in get_text
    root = fromstring(self.storage)
  File "C:\Users\mth\AppData\Local\Programs\Python\Python35-32\lib\site-packages\lxml\html\__init__.py", line 875, in fromstring
    is_full_html = _looks_like_full_html_unicode(html)
TypeError: expected string or bytes-like object

Answer 1

You're executing the following two lines:

item = TextParser()
item.get_text()

When you initialize TextParser, self.storage is set to None. When you then call get_text(), it is still None, because nothing has called fetch_url() to populate it. fromstring() therefore receives None instead of a string, and that's why you get that error.
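The wording of the message is what Python's re module raises when a pattern is matched against something that is not a string, and the name _looks_like_full_html_unicode in the traceback suggests lxml runs that kind of check on its input. Here is a minimal sketch that reproduces the same failure without lxml or a network connection. The pattern and the helper try_match are illustrative assumptions, not lxml's actual code.

```python
import re

# Illustrative stand-in for the check lxml applies to its input.
# The exact pattern is an assumption, not copied from lxml.
looks_like_full_html = re.compile(r'^\s*<(?:html|!doctype)', re.I).match


def try_match(value):
    """Return the TypeError message raised for `value`, or None if matching worked."""
    try:
        looks_like_full_html(value)
    except TypeError as exc:
        return str(exc)
    return None


storage = None  # what self.storage holds before fetch_url() has run
error_message = try_match(storage)        # matching against None fails
ok_message = try_match('<html></html>')   # a real string is accepted
print(error_message)
```

Any non-string value fails the same way, so the fix is to make sure self.storage holds the page text before it reaches fromstring().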

However, if you change it to the following, self.storage gets populated with a string instead of None before get_text() runs:

item = TextParser()
item.fetch_url()
item.get_text()

If you want to call get_text without calling fetch_url yourself first, you can have get_text call it:

def get_text(self):
    self.fetch_url()
    root = fromstring(self.storage)
    for post in root.cssselect('a'):
        print(post.text)
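One thing to keep in mind with the version above is that every call to get_text downloads the page again. A possible variation, sketched here under some assumptions, fetches only when self.storage is still None. To keep the sketch runnable offline it differs from the original script in a few ways: the hypothetical fetcher argument stands in for requests.get(self.link).text, the stdlib HTMLParser (through the hypothetical AnchorTextCollector class) stands in for lxml's cssselect('a'), and get_text returns the texts instead of printing them.

```python
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Collects the text inside <a> tags (stdlib stand-in for cssselect('a'))."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # greater than 0 while inside an <a> element
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self.depth += 1
            self.texts.append('')

    def handle_endtag(self, tag):
        if tag == 'a' and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.texts[-1] += data


class TextParser(object):

    def __init__(self, fetcher):
        # `fetcher` is a hypothetical callable returning the page HTML as a
        # string, for example: lambda: requests.get(link).text
        self.fetcher = fetcher
        self.storage = None

    def fetch_url(self):
        self.storage = self.fetcher()

    def get_text(self):
        if self.storage is None:    # download only on first use
            self.fetch_url()
        collector = AnchorTextCollector()
        collector.feed(self.storage)
        collector.close()
        return collector.texts


calls = []  # records how many times the page was "downloaded"


def fake_fetcher():
    calls.append(1)
    return '<p><a href="/x">First</a> and <a href="/y">Second</a></p>'


item = TextParser(fake_fetcher)
first = item.get_text()
second = item.get_text()   # reuses self.storage, no second fetch
print(first)
```

Calling fetch_url() explicitly still forces a fresh download when you need one.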

Answer 1: accepted, 2017-10-18 20:56:36
