Basic Python/Beautiful Soup Parsing

Say I have used

date = r.find('abbr')

to get

<abbr class="dtstart" title="2012-11-16T00:00:00-05:00">November 16, 2012</abbr>

I just want to print November 16, 2012, but if I try

print date.string

I get

AttributeError: 'NoneType' object has no attribute 'string'

What am I doing wrong?

ANSWER: Here's my final working code, for learning purposes:

soup = BeautifulSoup(page)
calendar = soup.find('table',{"class" : "vcalendar ical"})

dates = calendar.findAll('abbr', {"class" : "dtstart"})
events = calendar.findAll('strong')

for i in range(1,len(dates)-1):
    print dates[i].string + ': ' + events[i].string
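As a hedged variation on the loop above, the two parallel lists can be paired with zip instead of indexed by hand. This is only a sketch: it assumes Python 3 and the newer bs4 package rather than the BeautifulSoup 3 import used in this thread, and the table markup and event names are made up, since the real page is not shown.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page, which the thread does not show.
page = """
<table class="vcalendar ical">
  <tr><td><abbr class="dtstart" title="2012-11-16T00:00:00-05:00">November 16, 2012</abbr></td>
      <td><strong>Concert</strong></td></tr>
  <tr><td><abbr class="dtstart" title="2012-11-17T00:00:00-05:00">November 17, 2012</abbr></td>
      <td><strong>Lecture</strong></td></tr>
</table>
"""

soup = BeautifulSoup(page, "html.parser")
calendar = soup.find("table", class_="vcalendar")

dates = calendar.find_all("abbr", class_="dtstart")
events = calendar.find_all("strong")

# zip pairs the n-th date with the n-th event and stops at the shorter list.
lines = [f"{d.get_text()}: {e.get_text()}" for d, e in zip(dates, events)]
for line in lines:
    print(line)
```

Note that this visits every pair, whereas range(1, len(dates)-1) in the loop above skips the first and the last entries. Whether that skipping is wanted depends on the page's layout.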

soup.find('abbr').string should work fine. There must be something wrong with date.

from BeautifulSoup import BeautifulSoup

doc = '<abbr class="dtstart" title="2012-11-16T00:00:00-05:00">November 16, 2012</abbr>'

soup = BeautifulSoup(doc)

for abbr in soup.findAll('abbr'):
    print abbr.string

Result:

November 16, 2012

Update based on code added to question:

You can't use the text parameter like that.

http://www.crummy.com/software/BeautifulSoup/documentation.html#arg-text

text is an argument that lets you search for NavigableString objects instead of Tags.

Either you're looking for text nodes, or you're looking for tags. A text node can't have a tag name.

Maybe you want ''.join([el.string for el in r.findAll('strong')])?
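To make the tag versus text-node distinction concrete, here is a small sketch. It assumes Python 3 and the newer bs4 package, where the text-node search argument is spelled string= (the text argument discussed above is the BeautifulSoup 3 name). The sample markup is invented for illustration.

```python
import re
from bs4 import BeautifulSoup

# Invented sample markup; not taken from the asker's page.
doc = "<p><strong>Opening night</strong> on <abbr>November 16, 2012</abbr></p>"
soup = BeautifulSoup(doc, "html.parser")

# Searching by tag name returns Tag objects.
tags = soup.find_all("strong")

# Searching by string returns NavigableString objects (text nodes), not tags.
text_nodes = soup.find_all(string=re.compile("November"))

tag_names = [t.name for t in tags]
node_types = [type(n).__name__ for n in text_nodes]
# A text node has no tag name of its own, but it knows its enclosing tag.
parents = [n.parent.name for n in text_nodes]

print(tag_names, node_types, parents)
```

If the tag that contains a given piece of text is what you are after, searching for the text node and then stepping up through .parent is one way to get there.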

The error message is saying that date is None. You haven't shown enough code to say why that is so. Indeed, using the code you posted in the most straightforward way should work:

import BeautifulSoup

content='<abbr class="dtstart" title="2012-11-16T00:00:00-05:00">November 16, 2012</abbr>'
r=BeautifulSoup.BeautifulSoup(content)
date=r.find('abbr')
print(date.string)
# November 16, 2012
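Since find returns None when nothing matches, one defensive pattern is to check the result before touching .string. The sketch below assumes Python 3 and the newer bs4 package rather than the BeautifulSoup 3 import shown above; tag_text is a hypothetical helper name, not part of the library.

```python
from bs4 import BeautifulSoup

doc = '<abbr class="dtstart" title="2012-11-16T00:00:00-05:00">November 16, 2012</abbr>'
soup = BeautifulSoup(doc, "html.parser")


def tag_text(root, name):
    """Hypothetical helper: text of the first <name> tag under root, or None if absent."""
    tag = root.find(name)
    # find() gives back None when there is no match, so guard before using .string.
    return tag.string if tag is not None else None


found = tag_text(soup, "abbr")      # the tag exists in doc
missing = tag_text(soup, "strong")  # no <strong> in doc, so no AttributeError, just None

print(found)
print(missing)
```

In the asker's case, a None result would suggest that r did not contain an abbr tag at the point where find was called, for example because r was a different element than expected.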
