Scraping data from a website using Python

Question

Is it possible to extract the data from the graphs on this website using Python code? https://xsi.xeneta.com/

Answer 1

Yes, assuming the data is present in the page's HTML, you can use requests to fetch the page and then extract the data you want. It would look something like this:

import requests
page = requests.get(url="https://xsi.xeneta.com/")
data = page.content
print(data)

This gives you at least a starting point for whatever processing you want to do.

Some functions that might be helpful here: https://www.w3schools.com/python/ref_requests_response.asp

Answer 2

If you inspect the graph, you'll see it's nested inside an iframe. I took the first graph and navigated directly to the iframe's URL instead of xsi.xeneta.com. You can also see that there's a lot of data in the data-json attribute, so this code prints that data using selenium.
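To find the iframe URL without opening the browser inspector, one option is to scan the page's HTML for iframe tags. The following is a minimal sketch using only the standard library; the names IframeCollector and find_iframe_sources and the sample markup are made up for illustration. It assumes the iframes appear in the static HTML; if the site injects them with JavaScript, they will not show up this way.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IframeCollector(HTMLParser):
    """Collects the src attribute of every <iframe> tag it sees."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "iframe":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative and protocol-relative URLs against the page URL.
            self.sources.append(urljoin(self.base_url, src))


def find_iframe_sources(html, base_url):
    """Return absolute URLs of all iframes found in an HTML string."""
    collector = IframeCollector(base_url)
    collector.feed(html)
    return collector.sources


# Hypothetical markup for illustration; the real page will differ.
sample = (
    '<div><iframe src="//example.com/chart/a/"></iframe>'
    '<iframe src="/chart/b/"></iframe></div>'
)
print(find_iframe_sources(sample, "https://xsi.xeneta.com/"))
```

In practice you would pass in the text of the fetched page, for example requests.get("https://xsi.xeneta.com/").text, together with the page URL as base_url.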

Install:

pip install selenium
pip install webdriver-manager

Code:

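from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

# Selenium 4 style: the driver path is passed through a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.implicitly_wait(5)  # wait up to 5 seconds for elements to appear
driver.get("https://xsi-short.xeneta.com/xsic/chart/asia-europe/")
canvas = driver.find_element(By.XPATH, '//*[@id="chart-visualization-b9948b5ccd27f73bf764abe4a935c502"]')
print(canvas.get_attribute("data-json"))
driver.quit()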
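The data-json attribute is a string, so it still has to be decoded before you can work with the numbers. Below is a rough sketch of that step. It assumes the attribute holds valid JSON; the helper names and the sample payload (the "dates" and "values" keys) are invented for illustration, since the actual structure on the Xeneta page is not shown here.

```python
import json
from html.parser import HTMLParser


class DataJsonCollector(HTMLParser):
    """Collects and decodes every data-json attribute in an HTML string."""

    def __init__(self):
        super().__init__()
        self.payloads = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "data-json" and value:
                # HTMLParser has already unescaped entities such as &quot;.
                self.payloads.append(json.loads(value))


def extract_data_json(html):
    """Return the decoded data-json payloads found in an HTML string."""
    collector = DataJsonCollector()
    collector.feed(html)
    return collector.payloads


# Hypothetical markup and keys; inspect the real attribute to see its layout.
sample = (
    '<div id="chart" data-json="{&quot;dates&quot;: [&quot;2021-01-01&quot;], '
    '&quot;values&quot;: [100.0]}"></div>'
)
payloads = extract_data_json(sample)
print(payloads[0]["values"])
```

With the selenium code above, the equivalent would be json.loads(canvas.get_attribute("data-json")), or you could pass driver.page_source to extract_data_json.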

Answer 1
0 2021-12-21 18:34:24

Answer 2
0 2021-12-21 18:39:07