Web Scraping with Python BeautifulSoup 4
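I am new to web scraping and gave it a try after watching a few tutorial videos online. I decided to use Tripadvisor.com and try to collect data from the customer reviews.
This is what I came up with (code):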
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.tripadvisor.com.sg/Attraction_Review-g293916-d12033454-Reviews-SHOW_DC-Bangkok.html'

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")
containers = page_soup.findAll("div",{"class":"ui_column is-9"})

for container in containers:
    rating = container.div.div.div.span["class"]
    comment_container = container.p
    comment = comment_container[0]
    print("rating" + rating)
    print("comment" + comment)
This is the output of my code:
Traceback (most recent call last):
  File "trip_advisor.py", line 18, in <module>
    comment = comment_container[0]
  File "/anaconda/lib/python3.6/site-packages/bs4/element.py", line 1011, in __getitem__
    return self.attrs[key]
KeyError: 0
Can anyone help me solve this? Thanks.
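You cannot get at the contents of a <class 'bs4.element.Tag'> by indexing it directly. Square brackets on a Tag look up its HTML attributes (note the self.attrs[key] line in the traceback), which is why comment_container[0] raises KeyError: 0. To reach the child nodes of a Tag, use .contents: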
>>> container.p
<p class="partial_entry">I was there couple of weeks ago on the weekend. There was an event but it was not very crowded thought and I actually like it. What drawn my attention is the PUB on the roof top. A Pub in a department store sound pretty...<span class="taLnk ulBlueLinks" onclick="ta.prwidgets.call('handlers.clickExpand',event,this);">More</span></p>
>>> container.p.contents[0]
'I was there couple of weeks ago on the weekend. There was an event but it was not very crowded thought and I actually like it. What drawn my attention is the PUB on the roof top. A Pub in a department store sound pretty...'
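To see the difference without fetching the live page, here is a minimal sketch. The HTML string is a shortened, made-up stand-in for the review markup shown above, not the real page content.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet modeled on the review markup; not the real page
html = '<p class="partial_entry">Nice place...<span class="taLnk">More</span></p>'
p = BeautifulSoup(html, "html.parser").p

# Square brackets on a Tag look up HTML attributes, not children
print(p["class"])        # ['partial_entry']

# An integer key is treated as an attribute name too, hence KeyError: 0
try:
    p[0]
except KeyError as exc:
    print("KeyError:", exc)

# .contents is a plain list of child nodes, so integer indexing works
print(p.contents[0])     # Nice place...
```

Here p.contents[1] would be the trailing <span> with the "More" link, which is why taking only the first child gives the review text without it.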
Apart from this problem, I'm not sure your rating scraping really gives you what you need, but this fixes the main error:
for container in containers:
    rating = container.div.div.div.span["class"]
    comment_container = container.p.contents
    comment = comment_container[0]
    print("Rating: ", rating)
    print("Comment: " + comment)
This prints:
Rating: ['ui_bubble_rating', 'bubble_40']
Comment: I was there couple of weeks ago on the weekend. There was an event but it was not very crowded thought and I actually like it. What drawn my attention is the PUB on the roof top. A Pub in a department store sound pretty...
Rating: ['ui_bubble_rating', 'bubble_50']
Comment: Show dc is very fascinating place that you must to go. The mega complex is very special from the others mall in thailand. I think you can touching and feeling of the hapiness. I went there few days ago for find some dining and spending...
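The rating printed here is the raw list of CSS classes rather than a number. If a numeric score is what you are after, one possible sketch follows. It assumes that the bubble_NN class encodes the score times ten (bubble_40 meaning 4.0); that is a guess based on the output above, not something the page confirms. The helper name parse_bubble_rating is made up for this example.

```python
import re

def parse_bubble_rating(classes):
    """Return the rating as a float from a class list, or None if absent.

    Assumes a class such as 'bubble_40' means a score of 4.0.
    """
    for name in classes:
        match = re.fullmatch(r"bubble_(\d+)", name)
        if match:
            return int(match.group(1)) / 10
    return None

print(parse_bubble_rating(["ui_bubble_rating", "bubble_40"]))  # 4.0
print(parse_bubble_rating(["ui_bubble_rating", "bubble_50"]))  # 5.0
```

Inside the loop you could then call parse_bubble_rating(rating) on the class list before printing it.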