
Correct Variable Usage within BeautifulSoup find() function

How do I use variables correctly in BeautifulSoup's .find() function?

There is probably a simple solution, but the official documentation doesn't cover it and it eludes me. I'm on the latest stable Python and the latest BeautifulSoup as of yesterday.

The find() function seems to fail when I pass string variables as parameters.

HTML: <h3 id="ourId">Something</h3>

For example:

prodTitle = page.find("h3", {"id": "ourId"}).get_text(strip=True)

That code works.

tag="h3"
attrib="id"
element="ourId"
prodTitle = page.find(tag, {attrib: element}).get_text(strip=True)

The above fails. Whenever I use a variable within find(), the call fails with this error:

AttributeError: 'NoneType' object has no attribute 'get_text'

I also tried attrib=element and hard-coding the quotation marks, and that fails too.

What am I missing here?

UPDATE:

This code works:

prodTitle = page.find(str(tag), {str(attrib): "ourId"}).get_text(strip=True)

This code does not work, with or without str() or added quotes:

prodTitle = page.find(str(tag), {str(attrib): str(element)}).get_text(strip=True)
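The update narrows things down: the literal "ourId" matches while the variable does not, which suggests the value held by element at run time is not exactly "ourId". The question does not show where element comes from, so the sketch below only illustrates one possible cause, a stray trailing newline (an assumption made for illustration), and how repr() would expose it.

```python
from bs4 import BeautifulSoup

html = '<h3 id="ourId">Something</h3>'
soup = BeautifulSoup(html, "html.parser")

# Hypothetical: the value was read from a file or user input and kept a newline.
element = "ourId\n"

# repr() makes hidden whitespace visible when debugging.
print(repr(element))

# String attribute values are matched exactly, so nothing is found here.
# find() returns None in that case, which is what leads to the AttributeError.
no_match = soup.find("h3", {"id": element})

# Stripping the value restores the match.
match = soup.find("h3", {"id": element.strip()})
title = match.get_text(strip=True) if match is not None else None
print(title)
```

If repr() shows the expected string with nothing extra, the cause lies elsewhere, for example in the page content that was actually parsed.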

This is one way you can use variables in bs4 (the f-strings used here require Python 3.6 or later):

from bs4 import BeautifulSoup

html = '<h3 id="ourId">Something</h3>'

soup = BeautifulSoup(html, 'html.parser')

prodTitle = soup.find("h3", {"id": "ourId"}).get_text(strip=True)

tag="h3"
attrib="id"
element="ourId"

var_prodTitle = soup.find(f"{tag}", {f"{attrib}" : f"{element}"}).text.strip()
print('prodTitle', prodTitle)
print('_______________')
print('var_prodTitle', var_prodTitle)

Result:

prodTitle Something
_______________
var_prodTitle Something
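As a side note, the f-strings are not required when the variables already hold strings: find() receives the same values either way. Here is a minimal sketch passing the variables directly and guarding against a missing element. The helper name find_text is hypothetical and introduced only for illustration.

```python
from bs4 import BeautifulSoup

html = '<h3 id="ourId">Something</h3>'
soup = BeautifulSoup(html, "html.parser")

tag = "h3"
attrib = "id"
element = "ourId"


def find_text(soup, tag, attrib, value):
    """Hypothetical helper: stripped text of the first match, or None."""
    node = soup.find(tag, attrs={attrib: value})
    # find() returns None when nothing matches, so check before get_text().
    return node.get_text(strip=True) if node is not None else None


plain = find_text(soup, tag, attrib, element)       # variables passed as-is
missing = find_text(soup, tag, attrib, "noSuchId")  # no element has this id

print(plain)
print(missing)
```

Returning None instead of chaining .get_text() directly avoids the AttributeError and lets the caller decide how to handle a missing element.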
