Find a tag by specific text in Python BeautifulSoup

Do you know how to search for specific text in Python's BeautifulSoup, in order to find the tags (or, better, the full path to the tags) that contain a given string? The common way of using BS is, for example:

import requests
from bs4 import BeautifulSoup

url = "https://elementy.ru/novosti_nauki"


website = requests.get(url)
results = BeautifulSoup(website.content, 'html.parser')

and then you can query for all tags with certain properties, such as header, class, etc. However, I want to go a different way and find the location of a specific text inside this structure. Doing this on the plain HTML text is really inconvenient. Any ideas how to do this with BS? Thanks

You could use a CSS selector:

soup.select(':-soup-contains("that")')

or, as an alternative, re.compile():

import re
# string= is the current name of the older text= argument (bs4 4.4.0+)
soup('p', string=re.compile('that'))
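The two approaches are not quite equivalent on nested markup, which matters if you want the innermost tag. The following is a minimal sketch, assuming a Soup Sieve version recent enough to support `:-soup-contains` and `:-soup-contains-own`; the HTML snippet and variable names are made up for illustration.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: the matching <p> is wrapped in a <div>.
html = '<div><p>pattern that we like</p><p>other</p></div>'
soup = BeautifulSoup(html, 'html.parser')

# :-soup-contains also matches ancestors, because an element's text
# includes the text of all its descendants.
all_matches = [t.name for t in soup.select(':-soup-contains("that")')]

# :-soup-contains-own only considers text directly inside the element.
own_matches = [t.name for t in soup.select(':-soup-contains-own("that")')]

# string= with a compiled regex matches tags whose .string matches.
regex_matches = [t.name for t in soup.find_all('p', string=re.compile('that'))]

print(all_matches, own_matches, regex_matches)
```

If only the innermost tag is wanted, `:-soup-contains-own` or the `string=` filter is probably the closer fit.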

Example

from bs4 import BeautifulSoup

html = '''
<p>some content</p>
<p>pattern that we like</p>
<p>some content</p>
'''
soup = BeautifulSoup(html, 'html.parser')

soup.select(':-soup-contains("that")')

Output

[<p>pattern that we like</p>]
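The question also asks for the full path to the matching tag. One possible way, sketched here under assumptions, is to find the matching text nodes and walk up through `.parents`. The helper `tag_path` and the sample markup are hypothetical and not part of the original answer.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup for illustration.
html = '''
<div id="main">
  <p>some content</p>
  <p>pattern that we like</p>
</div>
'''
soup = BeautifulSoup(html, 'html.parser')

def tag_path(tag):
    """Return a path such as 'div > p' from the document root down to tag."""
    names = [tag.name]
    for parent in tag.parents:
        # The BeautifulSoup object itself is the last parent; skip it.
        if parent.name == '[document]':
            break
        names.append(parent.name)
    return ' > '.join(reversed(names))

# find_all(string=...) returns the matching text nodes; .parent is the
# tag that directly contains each one.
paths = [tag_path(node.parent)
         for node in soup.find_all(string=re.compile('that'))]
print(paths)
```

The path here uses tag names only; ids, classes, or sibling positions could be added to `tag_path` if a unique locator is needed.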
