How to extract reviews from the iframe URL returned by the Amazon API in Python?
I am trying to use the Amazon API to extract the text content of the reviews for a given product, but I cannot work it out. Here is what I have:
result = api.item_lookup('B00062B6QY', ResponseGroup='Reviews',
                         TruncateReviewsAt=256, IncludeReviewsSummary=False)
iframeurl=result.xpath('//*[local-name()="IFrameURL"]/text()')[0].strip()
print iframeurl
reviews=requests.get(iframeurl)
reviews.raise_for_status()
#data = json.loads(reviews.text)
root = ET.fromstring(reviews.text)
print root
The output is:
http://www.amazon.com/reviews/iframe?akid=helloworld&alinkCode=xm2&asin=B00062B6QY&atag=welcomehome-20&exp=2014-01-28T19%3A06%3A20Z&summary=0&truncate=256&v=2&sig=HIDDEN%3D
Traceback (most recent call last):
  File "amazon_api_new.py", line 36, in <module>
    root = ET.fromstring(reviews.text)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1300, in XML
    parser.feed(text)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: mismatched tag: line 867, column 2
PS: I changed the printed iframeurl just to remove the API key details.
Edit: image from Firebug
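The iframe URL returns an HTML page, and HTML is often not well-formed XML, which is why the strict ElementTree parser fails with a mismatched tag error. Instead of using ElementTree, try loading reviews.text into lxml with its HTML parser, like this:
>>> from io import StringIO
>>> from lxml import etree
>>> parser = etree.HTMLParser()
>>> tree = etree.parse(StringIO(reviews.text), parser)
>>> result = etree.tostring(tree.getroot(),
...                         pretty_print=True, method="html")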
>>> print(result)
...
Of course, you can then use lxml's XPath support for further parsing.
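As a rough illustration of that XPath step, here is a minimal sketch in Python 3. It does not use the real Amazon page: SAMPLE_HTML is a made-up stand-in for reviews.text, and the class name "review" and the helper extract_review_texts are assumptions for this example only. The actual markup of the iframe page is not shown in the question, so the XPath expression would need to be adapted after inspecting it.

```python
import xml.etree.ElementTree as ET
from lxml import etree

# Hypothetical stand-in for reviews.text: the unclosed <p> and <br> tags are
# acceptable in HTML but make the document ill-formed as XML.
SAMPLE_HTML = """<html><body>
<div class="review"><p>Great product.<br> Works as described.</div>
<div class="review"><p>Stopped working after a week.</div>
</body></html>"""


def extract_review_texts(html_text):
    """Return the whitespace-normalized text of each review block."""
    root = etree.fromstring(html_text, etree.HTMLParser())
    # The class name "review" is an assumption that matches SAMPLE_HTML only.
    blocks = root.xpath('//div[@class="review"]')
    return [" ".join(block.xpath("string()").split()) for block in blocks]


# The strict XML parser rejects the same input that lxml's HTML parser accepts.
try:
    ET.fromstring(SAMPLE_HTML)
    strict_parse_ok = True
except ET.ParseError:
    strict_parse_ok = False

texts = extract_review_texts(SAMPLE_HTML)
print(strict_parse_ok)
print(texts)
```

With the code from the question, the call would be extract_review_texts(reviews.text), with the XPath changed to match whatever elements actually wrap the review text on the returned page.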