
Scraping data from a website using Python

Is it possible to extract the data behind the graphs on this website using Python? https://xsi.xeneta.com/

Yes, assuming the data is present in the page's HTML, you could use requests to fetch the page and then extract the data you want. It would look something like this:

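import requests

# Fetch the page and raise an error on a non-2xx response
page = requests.get("https://xsi.xeneta.com/", timeout=10)
page.raise_for_status()
data = page.text  # decoded HTML; page.content would give the raw bytes
print(data)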

This at least gives you a starting point for whatever processing you want to do.

The attributes and methods of the response object are listed here, and some of them may be helpful: https://www.w3schools.com/python/ref_requests_response.asp
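As one possible next step after fetching the HTML, the sketch below uses the standard library's html.parser to collect iframe URLs and any data-json attribute values. The AttrCollector class, the collect helper, and the sample markup are made up for illustration. If the real page builds its charts with JavaScript, the HTML returned by requests may not contain these elements at all.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Collect iframe URLs and data-json attribute values from HTML."""

    def __init__(self):
        super().__init__()
        self.iframe_srcs = []
        self.json_blobs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Remember where each iframe points
        if tag == "iframe" and attrs.get("src"):
            self.iframe_srcs.append(attrs["src"])
        # Remember any embedded JSON payload, whatever the tag
        if attrs.get("data-json"):
            self.json_blobs.append(attrs["data-json"])


def collect(html):
    """Return (iframe_srcs, json_blobs) found in an HTML string."""
    parser = AttrCollector()
    parser.feed(html)
    return parser.iframe_srcs, parser.json_blobs


# Made-up sample markup (an assumption), not the real page
sample_html = (
    '<div><iframe src="https://example.com/chart/"></iframe>'
    '<div id="c" data-json="[1, 2]"></div></div>'
)
srcs, blobs = collect(sample_html)
print(srcs)
print(blobs)
```

With a real response you would call collect(page.text), then fetch each iframe URL and run collect on that HTML as well.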

If you inspect the graph, you'll see it's nested inside an iframe. I took the first graph and navigated directly to the iframe's URL instead of xsi.xeneta.com. The chart element also holds a lot of data in its data-json attribute, so the code below prints that attribute using Selenium.

Install:

pip install selenium
pip install webdriver-manager

Code:

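from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

# Selenium 4 style: pass the driver path through a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.implicitly_wait(5)  # wait up to 5 seconds for elements to appear
driver.get("https://xsi-short.xeneta.com/xsic/chart/asia-europe/")
# find_element_by_xpath was removed in Selenium 4.3; use find_element(By.XPATH, ...)
canvas = driver.find_element(By.XPATH, '//*[@id="chart-visualization-b9948b5ccd27f73bf764abe4a935c502"]')
print(canvas.get_attribute("data-json"))
driver.quit()  # close the browser when done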
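The attribute comes back as a string, so to work with the values you would probably decode it with the json module. This is a minimal sketch: parse_chart_json is a hypothetical helper, and the sample payload is invented, since the answer does not show the structure of the real data-json value.

```python
import json


def parse_chart_json(raw):
    """Decode a data-json attribute value; return None if absent or invalid."""
    if not raw:
        return None
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None


# Hypothetical payload (an assumption); the real structure may differ
sample = '[{"date": "2021-01-01", "value": 100.0}]'
parsed = parse_chart_json(sample)
print(parsed)
```

In the Selenium script you would pass canvas.get_attribute("data-json") to the helper in place of the sample string.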

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint one, please cite the site URL or the original address.

 