Parsing text with bs4 works with selenium but does not work with requests in Python
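This code works and returns the single number I want, but it is too slow: it takes 10 seconds to complete. I need to run it 4 times, so every run wastes 40 seconds.
`from selenium import webdriver
from bs4 import BeautifulSoup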
options = webdriver.FirefoxOptions()
options.add_argument('--headless')
driver = webdriver.Firefox(options=options)
driver.get('https://warframe.market/items/ivara_prime_blueprint')
html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')
price_element = soup.find('div', {'class': 'row order-row--Alcph'})
price2=price_element.find('div',{'class':'order-row__price--hn3HU'})
price = price2.text
print(int(price))
driver.close()`
This code, on the other hand, does not work. It returns None.
`import requests
from bs4 import BeautifulSoup
url='https://warframe.market/items/ivara_prime_blueprint'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')
price_element=soup.find('div', {'class': 'row order-row--Alcph'})
price2=price_element.find('div',{'class':'order-row__price--hn3HU'})
price = price2.text
print(int(price))`
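The first thing that came to mind was adding a user agent, but that did not help. When I print(soup) I get the HTML, but when I parse it further I start getting None, even though the commands are the same as in the selenium example.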
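The data is loaded dynamically from a <script> tag, so BeautifulSoup cannot see the rendered order rows (it does not execute JavaScript). The raw HTML that requests receives contains the orders only as JSON inside that tag.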
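To make the point above concrete, here is a minimal offline sketch. It deliberately swaps BeautifulSoup for the standard library's `html.parser` so it runs without third-party packages, and it uses a hypothetical, heavily trimmed `SAMPLE_HTML` string as a stand-in for what `requests.get(url).text` might return. The class name `ScriptJSONExtractor` is also made up for illustration.

```python
import json
from html.parser import HTMLParser


class ScriptJSONExtractor(HTMLParser):
    """Collect the text of one <script> tag and count order-row <div>s."""

    def __init__(self, script_id):
        super().__init__()
        self.script_id = script_id
        self._inside = False   # True while we are inside the target <script>
        self.chunks = []       # text pieces found inside the target <script>
        self.order_rows = 0    # <div> tags whose class mentions "order-row"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("id") == self.script_id:
            self._inside = True
        elif tag == "div" and "order-row" in (attrs.get("class") or ""):
            self.order_rows += 1

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


# Assumption: a tiny stand-in for the unrendered page, not the real markup.
SAMPLE_HTML = """
<html><body>
<div id="application"></div>
<script type="application/json" id="application-state">
{"payload": {"orders": [{"user": {"ingame_name": "Rogue_Monarch"}}]}}
</script>
</body></html>
"""

parser = ScriptJSONExtractor("application-state")
parser.feed(SAMPLE_HTML)
embedded = json.loads("".join(parser.chunks))

print(parser.order_rows)  # 0: no order rows exist before JavaScript runs
print(embedded["payload"]["orders"][0]["user"]["ingame_name"])  # Rogue_Monarch
```

In this sketch the order-row count is zero, which mirrors why `soup.find(...)` returns None on the requests response, while the same information is still reachable through the embedded JSON.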
For example, to get the data you can use:
import json
import requests
from bs4 import BeautifulSoup
url = "https://warframe.market/items/ivara_prime_blueprint"
headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"
}
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, "html.parser")
script_tag = soup.select_one("#application-state")
json_data = json.loads(script_tag.string)
# Uncomment the line below to see all the data
# from pprint import pprint
# pprint(json_data)
for data in json_data["payload"]["orders"]:
    print(data["user"]["ingame_name"])
This prints:
Rogue_Monarch
Rappei
KentKoes
Tenno61189
spinifer14
Andyfr0nt
hollowberzinho
You can treat the data as a dict and access its keys and values.
Because the JSON is very large, I recommend using an online JSON viewer to browse all of it.
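Since the question asks for a single price rather than user names, the following sketch shows one way to walk the `orders` list as plain dicts. The field names `order_type`, `platinum`, and `user["status"]` are assumptions about the payload (only `user["ingame_name"]` appears above), so check them against the real JSON first. The helper `lowest_sell_price` and the sample data are hypothetical.

```python
def lowest_sell_price(orders, online_only=True):
    """Return the lowest asking price among sell orders, or None if none match.

    Assumed order shape (verify against the real payload):
    {"order_type": "sell" | "buy", "platinum": int, "user": {"status": str, ...}}
    """
    prices = []
    for order in orders:
        if order.get("order_type") != "sell":
            continue  # skip buy orders
        status = order.get("user", {}).get("status")
        if online_only and status != "ingame":
            continue  # skip sellers who are not currently in game
        prices.append(order["platinum"])
    return min(prices) if prices else None


# Made-up sample data in the assumed shape, for illustration only.
sample_orders = [
    {"order_type": "sell", "platinum": 25, "user": {"ingame_name": "A", "status": "ingame"}},
    {"order_type": "sell", "platinum": 18, "user": {"ingame_name": "B", "status": "offline"}},
    {"order_type": "buy", "platinum": 10, "user": {"ingame_name": "C", "status": "ingame"}},
    {"order_type": "sell", "platinum": 22, "user": {"ingame_name": "D", "status": "ingame"}},
]

print(lowest_sell_price(sample_orders))                     # 22
print(lowest_sell_price(sample_orders, online_only=False))  # 18
```

If the assumed field names hold, calling `lowest_sell_price(json_data["payload"]["orders"])` on the parsed page data would replace the Selenium lookup with a single HTTP request.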