Parse HTML with Beautiful Soup

I have an HTML page:

    <a email="corporate@max.ru" href="http://www.max.ru/agent?message&to=corporate@max.ru" title="Click herе" class="mf_spIco spr-mrim-9"></a><a class="mf_t11" type="booster" href="http://max.ru/mail/corporate/">

I need to parse out the email string:

    soup = BeautifulSoup(data)
    string = soup.find("a",{"email": ""})
    print string

But it is not working. Where is the mistake?

Your mistake was in using the attrs dict to look for elements with an email attribute that is empty. Try this instead:

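    #!/usr/bin/env python
    # Python 2 with BeautifulSoup 3

    from BeautifulSoup import BeautifulSoup
    import urllib2

    # Fetch the page and parse it
    req = urllib2.urlopen('http://worldnuclearwar.ru')
    soup = BeautifulSoup(req)

    # email=True matches any <a> tag that has an email attribute
    print soup.find("a", email=True)["email"]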

This prints the email attribute of the first a element that has an email attribute. If you want all emails, try:

for link in soup.findAll("a", email=True):
    print link["email"]
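
For readers on Python 3, the same lookup can be sketched with the newer bs4 package. This is only an illustration, not part of the original answer: it assumes beautifulsoup4 is installed, uses the built-in "html.parser" backend, and parses a trimmed inline copy of the markup from the question instead of fetching a URL. The names html, first_email and emails are made up for the example.

```python
from bs4 import BeautifulSoup  # assumption: the beautifulsoup4 package is installed

# Trimmed copy of the markup from the question (second <a> closed for tidiness).
html = (
    '<a email="corporate@max.ru" '
    'href="http://www.max.ru/agent?message&to=corporate@max.ru" '
    'class="mf_spIco spr-mrim-9"></a>'
    '<a class="mf_t11" type="booster" href="http://max.ru/mail/corporate/"></a>'
)

soup = BeautifulSoup(html, "html.parser")

# email=True matches any <a> that has an email attribute, whatever its value.
first = soup.find("a", email=True)
first_email = first["email"] if first is not None else None
print(first_email)

# find_all is the bs4 spelling of findAll; collect every email attribute.
emails = [link["email"] for link in soup.find_all("a", email=True)]
print(emails)
```

Checking the result of find for None before indexing avoids a TypeError on pages where no a element carries an email attribute.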
