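Can't grab an email address from a webpage using requests
I've written a Python script that uses requests to fetch an email address from a webpage. When I run the script, it grabs some unintelligible content instead of the email.
I've tried: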
import requests
from bs4 import BeautifulSoup

link = 'https://www.marinetraffic.com/lv/maritime-companies/profile/74391/company_name:africa%20stable%20logistics%20co.ltd'

def get_email(link):
    res = requests.get(link, headers={"User-Agent": "Mozilla/5.0"})
    soup = BeautifulSoup(res.text, "lxml")
    email = soup.select_one("td:contains('Email:') ~ td > script").text
    return email

if __name__ == '__main__':
    print(get_email(link))
Output I'm getting:
/*<![CDATA[*/eval("var a=\"tfUyBZIrTc@WN-gjhbunO78.R2XSDk1lGKwY0zFPpvmasVC_o436iJMqe9xAHE5+LQd\";var b=a.split(\"\").sort().join(\"\");var c=\"4MCq-vAiaUaqUHd\";var d=\"\";for(var e=0;e<c.length;e++)d+=b.charAt(a.indexOf(c.charAt(e)));document.getElementById(\"e372346296\").innerHTML=\"<a rel=\\\"nofollow\\\" class=\\\"text-primary text-light\\\" href=\\\"mailto:\"+d+\"\\\">\"+d+\"</a>\"")/*]]>*/
Desired output:
info@aslc.co.tz
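How can I parse that email address using requests?
If you look at the page, that is the actual content of the script tag you selected. What you actually want is the inner text of the a tag, so try replacing
email = soup.select_one("td:contains('Email:') ~ td > script").text
with
email = soup.select_one("td:contains('Email:') ~ td > a").text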
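One caveat: requests does not execute JavaScript, and the script in the question's output builds the a tag at runtime, so that tag may be missing from the HTML that requests receives. If so, a possible fallback is to redo the script's character substitution in Python. The sketch below is not part of the original answer: `decode_email` and `sample` are hypothetical names, and it assumes the script keeps the `var a` / `var c` layout seen in the question's output and that the key is plain ASCII.

```python
import re

def decode_email(script_text):
    """Reproduce the page's JavaScript substitution in Python (a sketch)."""
    # Pull the key ("a") and the encoded address ("c") out of the script.
    # The quotes may be backslash-escaped because the code sits inside eval("...").
    key_match = re.search(r'var a=\\?"(.+?)\\?"', script_text)
    enc_match = re.search(r'var c=\\?"(.+?)\\?"', script_text)
    if not (key_match and enc_match):
        return None  # layout differs from what this sketch assumes
    key = key_match.group(1)
    encoded = enc_match.group(1)
    # JS: b = a.split("").sort().join("")
    alphabet = "".join(sorted(key))
    # JS: d += b.charAt(a.indexOf(c.charAt(e)))
    # Characters missing from the key are skipped, as charAt(-1) yields "".
    return "".join(alphabet[key.index(ch)] for ch in encoded if ch in key)

# Key and encoded string copied from the output shown in the question.
sample = (
    r'eval("var a=\"tfUyBZIrTc@WN-gjhbunO78.R2XSDk1lGKwY0zFPpvmasVC_o436iJMqe9xAHE5+LQd\";'
    r'var b=a.split(\"\").sort().join(\"\");var c=\"4MCq-vAiaUaqUHd\";")'
)
print(decode_email(sample))  # info@aslc.co.tz
```

In the question's script, this could be called on the text of the script tag that the original selector already returns, for example `decode_email(soup.select_one("td:contains('Email:') ~ td > script").text)`, assuming the page still serves the same obfuscation.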