Parsing with BeautifulSoup, error message TypeError: coercing to Unicode: need string or buffer, NoneType found

I'm trying to scrape an Amazon page for data, and I get an error when I try to parse out where the seller is located. Here's my code:

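#imports (Python 2; assumes the bs4 and html5lib packages are installed)
import urllib2
from bs4 import BeautifulSoup

#getting the html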
request = urllib2.Request('http://www.amazon.com/gp/offer-listing/0393934241/')
opener = urllib2.build_opener()
#hiding that I'm a webscraper
request.add_header('User-Agent', 'Mozilla/5 (Solaris 10) Gecko')
#opening it up, putting into soup form
html = opener.open(request).read()
soup = BeautifulSoup(html, "html5lib")

#parsing for the seller info
sellers = soup.findAll('div', {'class' : 'a-row a-spacing-medium olpOffer'})
for eachseller in sellers:
    #parsing for price
    price = eachseller.find('span', {'class' : 'a-size-large a-color-price olpOfferPrice a-text-bold'})
    #parsing for shipping costs
    shippingprice = eachseller.find('span', {'class' : 'olpShippingPrice'})
    #parsing for condition
    condition = eachseller.find('span', {'class' : 'a-size-medium'})
    #parsing for seller name
    sellername = eachseller.find('b')
    #parsing for seller location
    location = eachseller.find('div', {'class' : 'olpAvailability'})

    #printing it all out
    print "price, " + price.string + ", shipping price, " + shippingprice.string + ", condition," + condition.string + ", seller name, " + sellername.string + ", location, " + location.string

I get this error message, pertaining to the print statement at the end: TypeError: coercing to Unicode: need string or buffer, NoneType found

I know that it's coming from this line - location = eachseller.find('div', {'class' : 'olpAvailability'}) - because the code works fine without that line, and I know that I'm getting NoneType because the line isn't finding anything. Here's the HTML from the section I'm looking to parse:

<div class="olpAvailability">
    In Stock. 
        Ships from WI, United States.
    <br/><a href="/gp/aag/details/ref=olp_merch_ship_9/175-0430757-3801038?ie=UTF8&amp;asin=0393934241&amp;seller=A1W2IX7T37FAMZ&amp;sshmPath=shipping-rates#aag_shipping">Domestic shipping rates</a>
         and <a href="/gp/aag/details/ref=olp_merch_return_9/175-0430757-3801038?ie=UTF8&amp;asin=0393934241&amp;seller=A1W2IX7T37FAMZ&amp;sshmPath=returns#aag_returns">return policy</a>.
</div>

I don't see what the problem is with the 'location' line of code, or why it's not pulling the data I want.

EDIT: I figured it out, but I don't know why. If I change the print statement to print location.find(text=True), it outputs the location that I want. Hope this helps somebody, someday.
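
A likely reason the edit works: in Beautiful Soup, .string is None whenever a tag has more than one child, and this div contains text, a br and two links. The find() call itself succeeds; it is location.string that yields the None. Here is a minimal sketch of that behaviour, with some assumptions: it is written in Python 3 syntax, it uses the standard-library "html.parser" instead of html5lib, the href values are shortened placeholders, and string=True is the newer spelling of text=True.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question; the href values are placeholders.
html = """
<div class="olpAvailability">
    In Stock.
        Ships from WI, United States.
    <br/><a href="#shipping">Domestic shipping rates</a>
     and <a href="#returns">return policy</a>.
</div>
"""

soup = BeautifulSoup(html, "html.parser")
location = soup.find("div", {"class": "olpAvailability"})

# The tag was found, but it has several children, so .string is None.
print(location.string)

# find(string=True) returns the first text node inside the tag.
first_text = location.find(string=True)
print(" ".join(first_text.split()))

# get_text() concatenates every text node; split/join collapses the whitespace.
full_text = " ".join(location.get_text().split())
print(full_text)
```

If only the "Ships from ..." part is wanted, the first text node is probably enough; get_text() is the safer choice when the text may be spread across nested tags.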

It seems like you are searching for the wrong class name:

<div class="a-column a-span3 olpDeliveryColumn" role="gridcell">
<p class="a-spacing-mini olpAvailability">
<ul class="a-unordered-list a-vertical olpFastTrack">
<li><span class="a-list-item">
            Ships from WI, United States.
        </span></li>
<li><span class="a-list-item">
<a href="/gp/aag/details?ie=UTF8&amp;asin=0393934241&amp;seller=A263RIO308P3G8&amp;sshmPath=shipping-rates#aag_shipping">Shipping rates</a>
                   and <a href="/gp/aag/details?ie=UTF8&amp;asin=0393934241&amp;seller=A263RIO308P3G8&amp;sshmPath=returns#aag_returns">return policy</a>.
        </span></li>
</ul>
</p>
</div>

Change this line in your code:

location = eachseller.find('div', {'class' : 'olpDeliveryColumn'})
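
Whichever class name matches the live page, the print line will still fail as soon as any single field is None, either because find() matched nothing or because .string had several children to choose from. One way to make the output robust is to substitute a placeholder for missing values before joining. This is only a sketch in Python 3 syntax: format_offer is a hypothetical helper, not part of Beautiful Soup, and the sample values are made up.

```python
def format_offer(fields, default="N/A"):
    """Join (label, value) pairs, substituting `default` for None values."""
    parts = []
    for label, value in fields:
        # Collapse runs of whitespace so scraped text prints on one line.
        text = default if value is None else " ".join(str(value).split())
        parts.extend([label, text])
    return ", ".join(parts)


# Hypothetical values standing in for price.string, shippingprice.string, etc.
line = format_offer([
    ("price", "$12.34"),
    ("shipping price", None),
    ("location", "  Ships from WI,\n United States. "),
])
print(line)
# price, $12.34, shipping price, N/A, location, Ships from WI, United States.
```

In the scraper, each value would come from the corresponding tag, for example location.get_text() if location is not None else None.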
