
Beautiful Soup - handling errors

  1. I'd like to know how to handle the case where no href exists after <strong>Text:</strong>.

  2. Is there a better way to search for the content that follows <strong>Contact:</strong>?

http://pastebin.com/FYMxTJkf

How about findNext?

import re
from BeautifulSoup import BeautifulSoup

html = '''<strong>Text:</strong>   

        <a href='http://domain.com'>url</a>'''

soup = BeautifulSoup(html)
label = soup.find("strong" , text='Text:')
contact = label.findNext('a')

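# findNext returns None when no <a> follows the label, so check that first
if contact is not None and contact.get('href') is not None: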
    print contact
else:
    print "No href"

If you're looking specifically for an a tag with an href, use:

contact = label.findNext('a', attrs={'href' : True})
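
For what it's worth, here is a minimal sketch of the same idea with the missing-href case handled explicitly. It assumes the newer bs4 package (Beautiful Soup 4, where findNext is spelled find_next) and Python 3 rather than the BeautifulSoup 3 import used above. The helper name href_after_label and the two sample strings are made up for illustration.

```python
from bs4 import BeautifulSoup


def href_after_label(html, label_text):
    """Return the href of the first <a href=...> after a <strong> label, or None.

    Hypothetical helper, not part of Beautiful Soup itself.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Find the <strong> tag whose text is exactly label_text.
    label = soup.find("strong", string=label_text)
    if label is None:
        return None  # the label itself is missing
    # href=True matches only <a> tags that actually carry an href attribute.
    link = label.find_next("a", href=True)
    return link["href"] if link is not None else None


with_href = "<strong>Text:</strong>\n  <a href='http://domain.com'>url</a>"
without_href = "<strong>Text:</strong>\n  <a name='x'>url</a>"

print(href_after_label(with_href, "Text:"))     # http://domain.com
print(href_after_label(without_href, "Text:"))  # None
```

Returning None in both failure cases (no label, no link with an href) keeps the caller down to a single check.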

With this you won't need to condense whitespace. I imagine you did this because next was returning the whitespace after the label.
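
To illustrate the whitespace point, here is a small sketch, again assuming bs4 and Python 3 rather than the BeautifulSoup 3 API used in the answer. In bs4 the label found this way is the <strong> tag itself, so next_sibling is used here to show the whitespace node that sits between the label and the link; the variable names are only for illustration.

```python
from bs4 import BeautifulSoup

html = "<strong>Text:</strong>   \n\n        <a href='http://domain.com'>url</a>"
soup = BeautifulSoup(html, "html.parser")
label = soup.find("strong", string="Text:")

# The node right after the label is a whitespace-only string, not the link.
sibling = label.next_sibling
print(repr(sibling))

# find_next("a") walks forward past that string until it reaches an <a> tag.
link = label.find_next("a")
print(link["href"])
```

Because find_next filters by tag name, the whitespace-only string is skipped without any preprocessing of the HTML.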

