How to web scrape via XPath from dynamic charts with multiple criteria options?

Question

I am very new to scraping and to programming in general, which is why I am asking for help with the following issue. There is a website at the URL below, and I need to get data from its dynamic charts. The code has to loop through all the required days for which data is shown, and through all the elements containing the data.

The first issue is that I need to somehow get the data located at the XPath. The second is that I have to write the loop to get all the required information.

from selenium import webdriver
import requests
import pandas as pd
import time

url = "https://www.oree.com.ua/index.php/control/results_mo/DAM"

browser = webdriver.PhantomJS(executable_path="C:/ProgramData/Anaconda3/Lib/site-packages/phantomjs-2.1.1-windows/bin/phantomjs")
browser.get(url)
time.sleep(2)

elements = browser.find_elements_by_xpath("html/body/div[5]/div[1]/div[3]/div[3]/div/div/table/tbody/tr[1]/td[3]/text()")
for element in elements:
    print(element)

browser.quit()

Answer 1

I'm not sure Selenium is required here. You can get the data directly from this mixed HTML/JSON object (change the date according to your needs):

https://www.oree.com.ua/index.php/PXS/get_pxs_hdata/04.04.2020/DAM/1
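Since the question asks for a loop over all the required days, here is a minimal sketch of how the URLs for a date range could be generated. It only varies the date segment of the URL above and leaves the rest untouched. The function name `build_urls` is hypothetical, and the date format is assumed to be dd.mm.yyyy (the example 04.04.2020 does not settle the order of day and month).

```python
from datetime import date, timedelta

# URL pattern copied from the answer; only the date segment is varied here.
BASE_URL = "https://www.oree.com.ua/index.php/PXS/get_pxs_hdata/{day}/DAM/1"


def build_urls(start, end):
    """Return one URL per calendar day from start to end, inclusive."""
    urls = []
    current = start
    while current <= end:
        # Assumption: the date segment is formatted as dd.mm.yyyy.
        urls.append(BASE_URL.format(day=current.strftime("%d.%m.%Y")))
        current += timedelta(days=1)
    return urls


# Example: print the URLs for three consecutive days.
for url in build_urls(date(2020, 4, 4), date(2020, 4, 6)):
    print(url)
```

Each generated URL can then be fetched in turn, for example with `requests.get(url)`.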

Then query the returned table with this XPath:

//tbody//tr/td[i]

Where i is the column of interest. The range of i is 3-7: column 3 is "Sales volume, MW.h", column 4 is "Purchase volume, MW.h", and so on.
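To illustrate looping over columns 3-7 with that XPath, here is a rough sketch using only the standard library, whose `xml.etree.ElementTree` supports the small XPath subset needed here. It assumes the table HTML has already been pulled out of the response and is well-formed; how the HTML is embedded in the JSON is not shown above, and real-world HTML may need a more lenient parser such as `lxml.html`. The sample rows are made-up placeholders, not real market data, and the names `column_values` and `all_columns` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up placeholder table, NOT real market data. The real response layout
# is not shown in the answer, so this only mimics a table with 7 columns.
SAMPLE_TABLE = """
<table>
  <tbody>
    <tr><td>1</td><td>x</td><td>10.0</td><td>11.0</td><td>12.0</td><td>13.0</td><td>14.0</td></tr>
    <tr><td>2</td><td>x</td><td>20.0</td><td>21.0</td><td>22.0</td><td>23.0</td><td>24.0</td></tr>
  </tbody>
</table>
"""


def column_values(table_html, i):
    """Return the text of the i-th cell (1-based) of every row in tbody."""
    root = ET.fromstring(table_html)
    # ElementTree paths are relative, hence the leading "." before "//".
    cells = root.findall(f".//tbody//tr/td[{i}]")
    # itertext() also collects text from any tags nested inside the cell.
    return ["".join(cell.itertext()).strip() for cell in cells]


def all_columns(table_html, first=3, last=7):
    """Map each column index in first..last to its list of cell values."""
    return {i: column_values(table_html, i) for i in range(first, last + 1)}


for index, values in all_columns(SAMPLE_TABLE).items():
    print(index, values)
```

The resulting dictionary can be passed to `pandas.DataFrame` if a table is wanted, since all columns have one value per row.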

Output for sales volume (04/04/2020):

Answer 1
0 2020-04-05 01:36:53
