Scraping data from a website using Python
Is it possible to extract the data from the graphs on this website using Python code? https://xsi.xeneta.com/
Yes, assuming the data exists in the page's HTML, you could use requests to fetch the page and then extract the data you want. It would look something like:
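import requests

# Fetch the page; the timeout keeps the call from hanging on a slow server
page = requests.get(url="https://xsi.xeneta.com/", timeout=10)
page.raise_for_status()  # fail early on an HTTP error status
data = page.content  # raw response body as bytes
print(data)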
This would at least give you a starting point for whatever processing you want to do.
Some functions that might be helpful here: https://www.w3schools.com/python/ref_requests_response.asp
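As a rough sketch of the "extract the data you want" step, the snippet below pulls one attribute off every matching tag using only the standard library. The names `AttributeCollector` and `collect_attribute` and the sample HTML are made up for illustration; in practice you would pass in `page.text`. If the page builds its charts with JavaScript, the fetched HTML may not contain the data at all.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collect every value of one attribute (e.g. "src") on one tag (e.g. "iframe")."""

    def __init__(self, tag, attribute):
        super().__init__()
        self.tag = tag
        self.attribute = attribute
        self.values = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser lower-cases names
        if tag == self.tag:
            for name, value in attrs:
                if name == self.attribute and value is not None:
                    self.values.append(value)


def collect_attribute(html, tag, attribute):
    """Return all values of `attribute` found on `tag` elements in `html`."""
    parser = AttributeCollector(tag, attribute)
    parser.feed(html)
    parser.close()
    return parser.values


# Made-up HTML standing in for page.text; the real page will differ.
sample_html = (
    '<div><iframe src="https://example.com/chart/1"></iframe>'
    '<iframe src="https://example.com/chart/2"></iframe></div>'
)
print(collect_attribute(sample_html, "iframe", "src"))
```

The same helper works for any tag and attribute pair, so it can be pointed at whatever element turns out to hold the data.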
If you inspect the graph, you'll see it's nested inside an iframe. I grabbed the first graph and navigated directly to its URL instead of going through xsi.xeneta.com. You can also see that there's a lot of data in the data-json attribute, so this code prints that data using Selenium.
Install the dependencies:
pip install selenium
pip install webdriver-manager
Code:
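from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

# Selenium 4 style: the driver path is passed through a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.implicitly_wait(5)  # wait up to 5 seconds for elements to appear
driver.get("https://xsi-short.xeneta.com/xsic/chart/asia-europe/")
# find_element_by_xpath was removed in Selenium 4; use find_element(By.XPATH, ...)
canvas = driver.find_element(By.XPATH, '//*[@id="chart-visualization-b9948b5ccd27f73bf764abe4a935c502"]')
print(canvas.get_attribute("data-json"))
driver.quit()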
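The attribute value comes back as a plain string. Assuming it really holds valid JSON (the attribute name suggests so, but that is an assumption), a minimal sketch for turning it into Python objects could look like this. The helper name `parse_data_json` and the sample payload are invented for illustration; the real structure of the chart data is not known here.

```python
import json


def parse_data_json(raw):
    """Decode a data-json attribute value; return None if it is missing or not valid JSON."""
    if not raw:
        return None
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None


# Invented payload for illustration; in the script above this would be
# canvas.get_attribute("data-json"), whose real structure is not known here.
sample_raw = '{"labels": ["2021-01", "2021-02"], "values": [100.0, 104.5]}'
chart = parse_data_json(sample_raw)
print(chart["labels"], chart["values"])
```

Returning None instead of raising keeps the scraping script running if the attribute is empty or the page layout changes; raise instead if you would rather fail loudly.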