Get structured data from HTML using Python and Beautiful Soup
I am new to Python. I want my code to produce the result below:
Score  Positive     Negative
5      good         bad
7      interesting
3                   horrible
But my code outputs nothing. Where is the problem?
from bs4 import BeautifulSoup
text = """
... <body>
<div class="review">
<p class="pos">good</p>
<p class="neg">bad</p>
</div>
<div class="review">
<p class="pos">interesting</p>
</div>
<div class="review">
<p class="neg">horrible</p>
</div>
... </body>"""
soup = BeautifulSoup(text)
for parent in soup.find_all('div', attrs={'class': 'review'}):
    if parent.findNextSiblings('p', attrs={'class': 'pos'}):
        postive.append(parent.get_text())
    else:
        postive.append("")
    if parent.findNextSiblings('p', attrs={'class': 'neg'}):
        negtive.append(parent.get_text())
    else:
        negtive.append("")
The p tags are not siblings of the div tag with class review; they are its children, so search inside each div instead:
positive = []
negative = []
for div in soup.find_all('div', attrs={'class': 'review'}):
    # find() searches the div's descendants and returns None when nothing matches
    pos = div.find('p', {'class': 'pos'})
    positive.append(pos.get_text() if pos else '')
    neg = div.find('p', {'class': 'neg'})
    negative.append(neg.get_text() if neg else '')
print(positive)
print(negative)
Prints (shown as under Python 2; Python 3 drops the u prefixes):
[u'good', u'interesting', '']
[u'bad', '', u'horrible']
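For reference, here is a self-contained Python 3 sketch of the same idea. It first contrasts a sibling search with a child search on the first review div, then prints the two text columns row by row. The names html, extract_rows, sibling_hits and child_hits are my own, not from the question, and the Score column is left out because the scores do not appear anywhere in the posted HTML.

```python
from bs4 import BeautifulSoup

html = """
<body>
<div class="review">
<p class="pos">good</p>
<p class="neg">bad</p>
</div>
<div class="review">
<p class="pos">interesting</p>
</div>
<div class="review">
<p class="neg">horrible</p>
</div>
</body>"""

# Naming the parser explicitly avoids the "no parser specified" warning.
soup = BeautifulSoup(html, "html.parser")


def extract_rows(soup):
    """Return one (positive, negative) tuple per review div (hypothetical helper)."""
    rows = []
    for div in soup.find_all("div", class_="review"):
        pos = div.find("p", class_="pos")
        neg = div.find("p", class_="neg")
        rows.append((
            pos.get_text(strip=True) if pos else "",
            neg.get_text(strip=True) if neg else "",
        ))
    return rows


first_div = soup.find("div", class_="review")
# The p tags sit inside the div, so a sibling search starting at the div
# finds none of them, while a search within the div does.
sibling_hits = first_div.find_next_siblings("p", class_="pos")
child_hits = first_div.find_all("p", class_="pos")

rows = extract_rows(soup)
print("{:<12} {}".format("Positive", "Negative"))
for pos, neg in rows:
    print("{:<12} {}".format(pos, neg))
```

Keeping one tuple per div, rather than two parallel lists, makes it harder for the positive and negative values to drift out of alignment when a review is missing one of them.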