Scraping SVG data by CSS Selector and Id (Selenium)
I'm looking to scrape a label from an SVG that only appears on mouse hover. I'm working with this link, and the data I need sits behind the [+] expand button at the right of each table row. When you press [+] expand, an SVG table pops up that shows elements that contain elements. When you hover over each of the elements, an element called "Capacity Impact" appears with a value for each of the bars. These values are the ones I want to scrape.
See a screenshot below.
So far, my code successfully opens each of the [+] expand buttons and identifies the polygons, but I can't get to the labels using either XPath or CSS selectors. See code below.
driver.get(url)
table_button_xpath = "//table[@class='data-view-table redispatching dataTable']//tr//td[@class = 'button-column']//a[@class='openIcon pre-table-button operation-detail-expand small-button ui-button-light ui-button ui-widget ui-corner-all ui-button-text-only']"
driver.find_element(By.ID, "close-button").click()
driver.find_element(By.ID, "cookieconsent-button").click()
# open up all the "+" buttons
table_buttons = driver.find_element(By.XPATH, table_button_xpath)
for i in range(1, 10):
    driver.find_element(By.XPATH, table_button_xpath).click()
# find all the polygons
polygons = driver.find_elements(By.TAG_NAME, 'path')
label_xpath = "//*[name()='svg']//*[name()='g' and @id = 'ballons')]//*[name()='g']//*[name()='tspan']"
for polygon in polygons:
    action.move_to_element(polygon)
    labels_by_xpath = driver.find_elements(By.XPATH, label_xpath)
    labels_by_css_selector = driver.find_elements(By.CSS_SELECTOR, "svg>#ballons>g>text>tspan")
Both labels_by_xpath and labels_by_css_selector return a list of 0 elements. I've tried many versions of both the XPath and CSS selector approaches, along with using WebDriverWait, but I can't get either to return the capacity impact values.
The HTML screenshot is also copied below (to be clear, the number I need to scrape is the "50" text in the tag).
Any help is appreciated. Thank you, Sophie
The solution to your problem is with the locator. Here is the updated locator to select the desired element.
CSS Selector:
svg>[id^='balloons']>g:nth-child(2)>text:nth-child(2)>tspan
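One visible difference between this locator and the ones in the question is the id test: the question matches the exact id 'ballons', while this one uses [id^='balloons'], which matches any id that starts with 'balloons'. The sketch below illustrates that difference on a small piece of static markup using only Python's standard library. The SVG snippet and the id "balloons-1" are hypothetical, invented for illustration and not taken from the real page, and the helper tspans_under is likewise an assumption of this example.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

# Hypothetical markup: the real page's structure and ids may differ.
SAMPLE = """
<svg xmlns="http://www.w3.org/2000/svg">
  <g id="balloons-1">
    <g>
      <text><tspan>Capacity Impact</tspan></text>
      <text><tspan>50</tspan></text>
    </g>
  </g>
</svg>
""".strip()


def tspans_under(root, id_test):
    """Collect tspan texts at svg > g > g > text > tspan,
    keeping only top-level groups whose id passes id_test."""
    found = []
    for group in root.findall(f"{SVG_NS}g"):
        if id_test(group.get("id", "")):
            path = f"{SVG_NS}g/{SVG_NS}text/{SVG_NS}tspan"
            found.extend(t.text for t in group.findall(path))
    return found


root = ET.fromstring(SAMPLE)

# Exact match on the misspelled id, as in "#ballons": nothing is found.
exact_typo = tspans_under(root, lambda i: i == "ballons")

# Prefix match, as in "[id^='balloons']": both labels are found.
prefix = tspans_under(root, lambda i: i.startswith("balloons"))

print(exact_typo)
print(prefix)
```

This only mimics the selector logic on static markup. In the browser the labels exist only while a hover tooltip is showing, so the locator still has to be evaluated after the hover has actually happened.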
Try this to get the element Capacity 50:
x = driver.find_elements(By.CSS_SELECTOR, "svg>[id^='balloons']>g>text>tspan")
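For completeness, here is a minimal sketch of how the hover and the lookup might fit together. It is not a tested solution for this particular page. It assumes the selector above matches the tooltip text, that the bars are the path elements found in the question, and that driver is an already-initialised Selenium WebDriver with the page open and the rows expanded. Note that in the snippet as posted, action.move_to_element(polygon) is never followed by .perform(), so the mouse move is only queued and never sent to the browser; that may be part of why the labels never showed up. The function name collect_hover_labels and the timeout value are hypothetical.

```python
LABEL_SELECTOR = "svg>[id^='balloons']>g>text>tspan"


def collect_hover_labels(driver, timeout=5):
    """Hover over every <path> element and return, per hover,
    the list of tooltip texts that became present."""
    # Imported here so this sketch can be loaded even where
    # Selenium is not installed; normally these go at module top.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver import ActionChains
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    results = []
    for polygon in driver.find_elements(By.TAG_NAME, "path"):
        # perform() is what actually sends the queued move to the browser.
        ActionChains(driver).move_to_element(polygon).perform()
        try:
            tspans = WebDriverWait(driver, timeout).until(
                EC.presence_of_all_elements_located(
                    (By.CSS_SELECTOR, LABEL_SELECTOR)
                )
            )
        except TimeoutException:
            # No tooltip appeared for this element; skip it.
            continue
        results.append([t.text for t in tspans])
    return results
```

Calling labels = collect_hover_labels(driver) after the [+] buttons have been clicked should then give one list of strings per bar that produced a tooltip. Since find_elements(By.TAG_NAME, 'path') may also pick up paths that are not bars, some filtering of the results is probably needed.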