How to extract multiple texts from span elements using Python Selenium?
I am trying to extract all the texts in the span elements of the following HTML into a list using Selenium WebDriver, like this:
['1a', '1b', '1c', '2a', ' ', ' ', '3a', '3b', '3c', '4a', ' ', ' ']
Does anyone know how to do this?
HTML:
<tr style="background-color:#999">
<td><b style="white-space: nowrap;">table_num</b></td>
<td style="text-align:center;">
<span style="flex: 1;display: flex;flex-direction: column;">
<span>1a</span>
<span>1b</span>
<span>1c</span>
</span>
</td>
<td style="text-align:center;">
<span style="flex: 1;display: flex;flex-direction: column;">
<span>2a</span>
<span> </span>
<span> </span>
</span>
</td>
<td style="text-align:center;">
<span style="flex: 1;display: flex;flex-direction: column;">
<span>3a</span>
<span>3b</span>
<span>3c</span>
</span>
</td>
<td style="text-align:center;">
<span style="flex: 1;display: flex;flex-direction: column;">
<span>4a</span>
<span> </span>
<span> </span>
</span>
</td>
</tr>
Here is one way: use the XPath below, which will give you all the required spans.
//span[contains(@style,"column")]/span
Once you have all the spans, you have to extract the text from each of them.
If a span's text is empty, either skip it or add it to the list anyway, depending on whether you want the blank entries in the result.
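The two steps above could be sketched roughly like this. The names `span_texts`, `scrape_spans` and `keep_blank` are made up for illustration, and the sketch assumes you read `textContent` so that the blank spans come back as `' '`, as in the expected list in the question; adjust it if you prefer `.text`.

```python
def span_texts(elements, keep_blank=True):
    """Return the text of each element, optionally dropping blank entries.

    `elements` is any iterable of objects exposing get_attribute(), such as
    the list returned by driver.find_elements().
    """
    texts = []
    for element in elements:
        # textContent keeps the raw text, including a lone space.
        # element.text returns only the rendered text, which is usually an
        # empty string for a whitespace-only span.
        text = element.get_attribute("textContent") or ""
        if keep_blank or text.strip():
            texts.append(text)
    return texts


def scrape_spans(driver, keep_blank=True):
    """Locate the inner spans with the XPath above and collect their text."""
    # Imported here so span_texts() stays usable without Selenium installed.
    from selenium.webdriver.common.by import By

    spans = driver.find_elements(By.XPATH, '//span[contains(@style,"column")]/span')
    return span_texts(spans, keep_blank=keep_blank)
```

With an already opened `driver`, `scrape_spans(driver)` keeps the blank entries and `scrape_spans(driver, keep_blank=False)` drops them.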
As per the HTML, to extract all the texts from the <span> elements into a list you have to induce WebDriverWait for visibility_of_all_elements_located(), and using a list comprehension you can use either of the following locator strategies:
Using CSS_SELECTOR and the text attribute:
driver.get("application url")
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tr[style^='background'] > td > span > span")))])
Using XPATH and get_attribute("innerHTML"):
driver.get("application url")
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tr[starts-with(@style, 'background')]/td/span/span")))])
Note: both snippets assume an existing driver and need the following imports:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Just remove the predicate [1] from your XPath, so it becomes:
//td[contains(.,'table_num')]/following-sibling::td
And to be more precise you could use:
//td[contains(.,'table_num')]/following-sibling::td/span/span
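If you also want to keep the texts grouped per cell, the first XPath can be combined with a relative one, something like the sketch below. `texts_by_cell` and `CELLS_XPATH` are illustrative names, not part of Selenium, and reading `textContent` is an assumption made so that the blank spans are kept as `' '`.

```python
try:
    from selenium.webdriver.common.by import By
    XPATH = By.XPATH
except ImportError:
    # By.XPATH is the plain string "xpath"; this fallback only exists so the
    # helper below can be tried with stand-in objects when Selenium is absent.
    XPATH = "xpath"

CELLS_XPATH = "//td[contains(.,'table_num')]/following-sibling::td"


def texts_by_cell(driver):
    """Return one list of span texts per <td> that follows the label cell."""
    rows = []
    for cell in driver.find_elements(XPATH, CELLS_XPATH):
        # "./span/span" is evaluated relative to the current cell,
        # not against the whole page.
        spans = cell.find_elements(XPATH, "./span/span")
        rows.append([span.get_attribute("textContent") or "" for span in spans])
    return rows
```

For the HTML in the question this should give one inner list per cell after the table_num cell; flatten it with a nested list comprehension if you need the single list from the question.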