Unable to scrape two fields from a webpage using requests
I'm trying to scrape two fields from this webpage using requests. My selectors are accurate, but I can't fetch the content because it is generated dynamically and isn't available in the page source, so the selectors below serve only as placeholders. I know how to grab the two fields using selenium, but I'd like to know how to grab them using requests.
Fields that I'm after: the total number of bids and the highest lockup.
I've tried with:
import requests
from bs4 import BeautifulSoup
url = "https://www.namebase.io/domains/unite"
with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'
    r = s.get(url)
    soup = BeautifulSoup(r.text, "lxml")
    total_bids = soup.select_one("[class='domain-highlights__container'] [class*='text_type_h4']").text
    highest_lockup = soup.select_one("[class='desktop-bid-card__right'] > [class*='text_type_h3']").text
    print(total_bids, highest_lockup)
How can I grab the two fields using requests?
The data is loaded via JavaScript, so it never appears in the HTML that requests receives. You can still use the requests module to obtain the same data as JSON from the API that the page calls.
For example:
import requests
url = 'https://www.namebase.io/api/domains/get/unite'
data = requests.get(url).json()
# uncomment this to print all data:
# import json
# print(json.dumps(data, indent=4))
no_bids = len(data['bids'])
highest = float(data['highestStakeAmount'] / 1_000_000)
print('No. bids', no_bids)
print('Highest lockup', highest)
Prints:
No. bids 6
Highest lockup 5.0
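If you want something a bit more defensive, here is a minimal sketch that separates fetching from parsing. It assumes the endpoint follows the same pattern for other domain names (only the `unite` URL is confirmed above) and that the `bids` and `highestStakeAmount` keys may be missing for some domains; the helper names `summarize` and `fetch_domain` are made up for illustration.

```python
# Assumption: other domain names use the same URL pattern as 'unite'.
API_URL = "https://www.namebase.io/api/domains/get/{name}"


def summarize(data):
    """Return (number of bids, highest lockup) from the decoded JSON dict.

    Missing or null keys are treated as "no bids" / zero rather than
    raising KeyError or TypeError.
    """
    bids = data.get("bids") or []
    raw_amount = data.get("highestStakeAmount") or 0
    # Same scaling as the snippet above: divide by one million.
    return len(bids), raw_amount / 1_000_000


def fetch_domain(name, timeout=10):
    """Download the JSON payload for one domain name."""
    # Imported here so summarize() can be used without requests installed.
    import requests

    response = requests.get(API_URL.format(name=name), timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return response.json()


# Example call (performs a network request):
# no_bids, highest = summarize(fetch_domain("unite"))
# print("No. bids", no_bids)
# print("Highest lockup", highest)
```

Keeping the parsing in its own function means you can check it against a saved response without hitting the site every time.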
EDIT: I found the API URL using Firefox Developer Tools.