Python list object has no attribute error

Question

I am new to Python and I am trying to write a website scraper to get links from subreddits, which I can then pass to another class later on to automatically download images from Imgur.

In this code snippet, I am just trying to read the subreddit and scrape any Imgur links from the hrefs, but I get the following error:

AttributeError: 'list' object has no attribute 'timeout'

Any idea as to why this might be happening? Here is the code:

from bs4 import BeautifulSoup
from urllib2 import urlopen
import sys
from urlparse import urljoin

def get_category_links(base_url):
    url = base_url
    html = urlopen(url)
    soup = BeautifulSoup(html)
    posts = soup('a',{'class':'title may-blank loggedin outbound'})
    #get the links with the class "title may-blank "
    #which is how reddit defines posts
    for post in posts:
        print post.contents[0]
        #print the post's title

        if post['href'][:4] =='http':
            print post['href']
        else:
            print urljoin(url,post['href'])
        #print the url.  
        #if the url is a relative url,
        #print the absolute url.   


get_category_links(sys.argv)

Answer 1

Look at how you call the function:

get_category_links(sys.argv)

sys.argv here is a list of script arguments where the first item is the script name itself. This means that your base_url argument value is a list, which causes urlopen to fail:

>>> from urllib2 import urlopen
>>> urlopen(["I am", "a list"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
           │           │    │     └ <object object at 0x105e2c120>
           │           │    └ None
           │           └ ['I am', 'a list']
           └ <urllib2.OpenerDirector instance at 0x105edc638>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 422, in open
    req.timeout = timeout
    │             └ <object object at 0x105e2c120>
    └ ['I am', 'a list']
AttributeError: 'list' object has no attribute 'timeout'

You meant to get the second item from sys.argv (index 1) and pass it to get_category_links:

get_category_links(sys.argv[1])
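If the script is run without any argument, sys.argv[1] raises an IndexError instead. A minimal sketch of a more defensive lookup follows; the helper name base_url_from_argv and the usage message are made up for illustration and are not part of the original script:

```python
def base_url_from_argv(argv):
    """Return the first user-supplied argument (argv[1]) as the base URL.

    argv[0] is the script name, so a URL given on the command line
    sits at index 1. This helper is hypothetical, not from the question.
    """
    if len(argv) < 2:
        script = argv[0] if argv else "script"
        # Exit with a readable message instead of an IndexError.
        raise SystemExit("usage: %s <subreddit-url>" % script)
    return argv[1]


# Simulated argument list; the real script would pass sys.argv here.
demo_argv = ["scraper.py", "http://www.reddit.com/r/pics"]
print(base_url_from_argv(demo_argv))
```

In the script itself, the call would then become get_category_links(base_url_from_argv(sys.argv)).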

It's interesting, though, how cryptic and difficult to understand the error is in this case. It comes from the way the "url opener" works in Python 2.7. If the url value (the first argument) is not a string, it assumes it is a Request instance and tries to set a timeout value on it:

def open(self, fullurl, data=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
    # accept a URL or a Request object
    if isinstance(fullurl, basestring):
        req = Request(fullurl, data)
    else:
        req = fullurl
        if data is not None:
            req.add_data(data)

    req.timeout = timeout  # <-- FAILS HERE

Note that this behavior is unchanged in Python 3.6, the latest stable release at the time of writing.
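The type dispatch described in this answer can be reproduced without any network access. The sketch below is written for Python 3 and only mimics the branch in question; FakeRequest, open_sketch and checked_open_sketch are hypothetical stand-ins, not the library's actual classes or functions. The second function shows one way an explicit type check could produce a clearer error:

```python
class FakeRequest(object):
    """Stand-in for a Request object; it accepts arbitrary attributes."""

    def __init__(self, url):
        self.url = url


def open_sketch(fullurl, timeout=10):
    """Mimic the string-or-request dispatch (no network access)."""
    if isinstance(fullurl, str):
        req = FakeRequest(fullurl)
    else:
        # Anything that is not a string is assumed to be request-like.
        req = fullurl
    # A list cannot take new attributes, so this line raises AttributeError.
    req.timeout = timeout
    return req


def checked_open_sketch(fullurl, timeout=10):
    """Same dispatch, but reject unexpected types with a clearer message."""
    if not isinstance(fullurl, (str, FakeRequest)):
        raise TypeError("expected a URL string or request object, got %s"
                        % type(fullurl).__name__)
    return open_sketch(fullurl, timeout)


try:
    open_sketch(["I am", "a list"])
except AttributeError as exc:
    print("unchecked:", exc)

try:
    checked_open_sketch(["I am", "a list"])
except TypeError as exc:
    print("checked:", exc)
```

The unchecked version fails on the attribute assignment, far from the real mistake, while the checked version names the offending type at the point of the call.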

Answer 1: 4 (accepted), 2017-06-19 05:31:43
