Get the content attribute of a span with BeautifulSoup

I have parsed an HTML page using BeautifulSoup:

authors = soup.find_all("span", itemprop = 'author')
for author in authors:
    print(author)

and I got the authors:

<span content="Oliver" itemprop="author"></span>
<span content="Jack" itemprop="author"></span>

How can I get the value of the content attribute?

I tried:

for author in authors:
    print(author.content)

But this prints None.

To get the value of the content attribute, index the tag like a dictionary:

for author in authors:
    print(author["content"])

Alternatively, you can use a list comprehension to store all the authors in the all_authors variable (as a list):

all_authors = [x["content"] for x in authors]
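For reference, here is a minimal self-contained sketch of the difference between the two access styles. The sample markup is copied from the output in the question; the html.parser backend and the names dot_access and safe_authors are assumptions made for illustration. Dot access on a tag searches for a child tag with that name, which is why author.content gives None here, while square brackets read the HTML attribute.

```python
from bs4 import BeautifulSoup

# Sample markup modelled on the output shown in the question.
html = """
<span content="Oliver" itemprop="author"></span>
<span content="Jack" itemprop="author"></span>
"""

# "html.parser" is an assumption; the question does not say which parser was used.
soup = BeautifulSoup(html, "html.parser")
authors = soup.find_all("span", itemprop="author")

# Dot access looks for a child tag named <content>, not an HTML attribute,
# so it finds nothing inside these empty spans.
dot_access = [author.content for author in authors]

# Square brackets read the HTML attribute; a missing attribute raises KeyError.
all_authors = [author["content"] for author in authors]

# .get() reads the same attribute but returns None when it is missing.
safe_authors = [author.get("content") for author in authors]

print(dot_access)
print(all_authors)
print(safe_authors)
```

If some spans might lack the attribute, author.get("content") is probably the safer of the two, since it avoids the KeyError.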

Hope this helps!

You are close:

for author in authors:
    print(author["content"])

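If you are not sure that every element with itemprop=author also has a content attribute, you can chain both attribute selectors (AND syntax) so that only elements carrying both attributes are matched before you access content: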

authors = [i['content'] for i in soup.select('[itemprop=author][content]')]
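A small sketch of how that selector behaves when one span has no content attribute. The extra span in the sample markup, the html.parser backend, and the names with_content and via_find_all are assumptions added for illustration. The find_all variant with content=True is shown only as a possible equivalent, not as part of the answer above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the middle span deliberately has no content attribute.
html = """
<span content="Oliver" itemprop="author"></span>
<span itemprop="author"></span>
<span content="Jack" itemprop="author"></span>
"""

soup = BeautifulSoup(html, "html.parser")

# Chained attribute selectors act as AND: itemprop must equal "author"
# and a content attribute must be present.
with_content = [tag["content"] for tag in soup.select("[itemprop=author][content]")]

# A possible find_all equivalent: content=True matches any tag that has the attribute.
via_find_all = [
    tag["content"]
    for tag in soup.find_all("span", itemprop="author", content=True)
]

print(with_content)
print(via_find_all)
```

Both lists skip the span without a content attribute, so no KeyError is raised.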
