
Python/Selenium: find elements with specific tags, classes, and first child by XPath

I need to extract elements of an HTML page into a single list, linearly (preserving their order of appearance on the page). Picking the elements individually, into separate lists, looks like this:

date_select = driver.find_elements(By.XPATH, "//tr[@class='dayHeader']//h5")
time_select = driver.find_elements(By.XPATH, "//span[@class='col0Item']")
class_select = driver.find_elements(By.XPATH, "//span[@class='col1Item']")
duration_select = driver.find_elements(By.XPATH, "//span[@class='col4Item']")

I've managed to get as far as the following:

output = driver.find_elements(By.XPATH, "//*[contains(@class, 'dayHeader') or contains(@class, 'col0Item') or contains(@class, 'col1Item') or contains(@class, 'col4Item')]")

But the problem is that this expression uses a wildcard for the tag and does not account for the first child (the h5 inside the dayHeader row) that date_select targets. So output ends up holding a lot of unwanted elements.

How can I select exactly the date/time/class/duration elements, in page order, with a single line?

I actually managed to figure out one way:

output = driver.find_elements(By.XPATH, "//tr[contains(@class,'dayHeader')]//h5 | //span[contains(@class,'col0Item') or contains(@class,'col1Item') or contains(@class,'col4Item')]")

The solution was explained in "Specifying multiple conditions in xpath". The | ("pipe") operator is discussed in "Pipe character in Python"; note that here the | sits inside the XPath string, where it is the XPath union operator that combines the two node sets.
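Once everything is in one flat list, it usually has to be split back into days and rows. The following is only a minimal sketch of one way to do that, assuming each h5 starts a new day and each col0Item span starts a new row (the page in the question may differ). The names Day, FIELDS, and group_schedule are hypothetical; the only Selenium features relied on are the element's tag_name, text, and get_attribute("class").

```python
from dataclasses import dataclass, field


@dataclass
class Day:
    """One date header plus the rows that follow it (hypothetical container)."""
    date: str
    rows: list = field(default_factory=list)


# Maps the class names from the question to field names (field names are assumptions).
FIELDS = {"col0Item": "time", "col1Item": "class", "col4Item": "duration"}


def group_schedule(elements):
    """Group a flat list of elements, already in page order, into days and rows."""
    days = []
    for el in elements:
        if el.tag_name == "h5":
            # A date header opens a new day.
            days.append(Day(date=el.text))
            continue
        classes = (el.get_attribute("class") or "").split()
        name = next((FIELDS[c] for c in classes if c in FIELDS), None)
        if name is None or not days:
            continue  # not one of the expected spans, or no date seen yet
        rows = days[-1].rows
        # Assumption: a time cell (col0Item) always opens a new row.
        if name == "time" or not rows:
            rows.append({})
        rows[-1][name] = el.text
    return days


# Intended use with the list from the union XPath above:
#   days = group_schedule(output)
```

The function only reads attributes, so it can be tried without a browser by passing in any objects that expose tag_name, text, and get_attribute.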
