
Error in Building WebScraper: TypeError: 'NoneType' object has no attribute '__getitem__'

I am building a web scraper that shows the top trending web pages. However, it always returns the following error.

Traceback (most recent call last):
  File "D:\Ceryx\webSearch.py", line 21, in <module>
    topl=webScraper(m)
  File "D:\Ceryx\webSearch.py", line 12, in webScraper
    hot = data['results'][0]['url']
TypeError: 'NoneType' object has no attribute '__getitem__'

Help!!

import re
import json
import urllib, urllib2

def webScraper(trends):
    query=urllib.urlencode({'q':trends})
    url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s' % query
    response = urllib.urlopen(url)
    extract = response.read()
    results = json.loads(extract)
    data = results['responseData']
    hot = data['results'][0]['url']
    return hot

response = urllib2.urlopen('http://www.google.com/trends/hottrends/atom/hourly')
html = response.read()
matchObj = re.findall(r'<a[^>]*?>(.*?)</a>', html)

print "Urls"
for m in matchObj:
    topl=webScraper(m)
    print m,topl

The error is on this line:

hot = data['results'][0]['url']

The error is raised when None is subscripted, so one of the following must be None:

data
data['results']
data['results'][0]

You can find out which one by printing them in sequence:

print 'data',data
print 'data[results]',data['results']
print 'data[results][0]',data['results'][0]
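
The prints above stop with the same TypeError as soon as they subscript a None value, so the last line printed tells you where the problem is. As an alternative, here is a minimal sketch of a helper that walks the same path and names the first None it meets. The function name first_none and the sample payloads are hypothetical and are not part of the original script.

```python
def first_none(obj, path):
    """Follow `path` through `obj` and return the expression text of the
    first value that is None, or None if every step holds a value."""
    label = 'data'
    current = obj
    if current is None:
        return label
    for key in path:
        current = current[key]
        label += '[{!r}]'.format(key)
        if current is None:
            return label
    return None


# Hypothetical payloads standing in for results['responseData'].
print(first_none(None, ['results', 0]))
print(first_none({'results': None}, ['results', 0]))
print(first_none({'results': [{'url': 'http://example.com/'}]}, ['results', 0]))
```

The first call reports data, the second reports data['results'], and the third returns None because nothing along the path is missing.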

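The million-dollar question is then how it ended up that way in the JSON in the first place, and what you need to do to handle it (or prevent it, if you have control over such things). :)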
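
One way to handle it is to check each level before subscripting and return None when the response carries no usable result. The sketch below assumes the service signals a failed or empty query by setting responseData to null in the JSON, which is consistent with the traceback but is not confirmed by the question. The function name first_result_url and the sample JSON strings are hypothetical.

```python
import json


def first_result_url(payload_text):
    """Return the URL of the first search result, or None if the
    response has no usable result."""
    results = json.loads(payload_text)
    data = results.get('responseData')
    if data is None:
        # Assumed failure case: the service returned "responseData": null.
        return None
    hits = data.get('results') or []
    if not hits or hits[0] is None:
        return None
    return hits[0].get('url')


# Hypothetical response bodies, not captured from the real service.
ok = '{"responseData": {"results": [{"url": "http://example.com/"}]}}'
failed = '{"responseData": null}'

print(first_result_url(ok))
print(first_result_url(failed))
```

In the loop over matchObj, the caller can then skip or log any trend for which the function returns None instead of crashing on the first bad response.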

