How to extract the texts from the span tag as per the html using selenium and Python
How to extract multiple texts from span elements using python selenium?
I am trying to use Selenium WebDriver to extract all the text inside the span elements of the HTML below into a list:
['1a', '1b', '1c', '2a', ' ', ' ', '3a', '3b', '3c', '4a', ' ', ' ']
Does anyone know how to do this?
HTML:
<tr style="background-color:#999">
<td><b style="white-space: nowrap;">table_num</b></td>
<td style="text-align:center;">
<span style="flex: 1;display: flex;flex-direction: column;">
<span>1a</span>
<span>1b</span>
<span>1c</span>
</span>
</td>
<td style="text-align:center;">
<span style="flex: 1;display: flex;flex-direction: column;">
<span>2a</span>
<span> </span>
<span> </span>
</span>
</td>
<td style="text-align:center;">
<span style="flex: 1;display: flex;flex-direction: column;">
<span>3a</span>
<span>3b</span>
<span>3c</span>
</span>
</td>
<td style="text-align:center;">
<span style="flex: 1;display: flex;flex-direction: column;">
<span>4a</span>
<span> </span>
<span> </span>
</span>
</td>
</tr>
Here is one approach: use the XPath below. It will give you all the required spans.
//span[contains(@style,"column")]/span
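Once you have all the spans, extract the text from each one.
If a span's text is empty, either skip it or add it to the list as a blank entry.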
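The two steps above (locate the spans, then read each one's text) can be sketched as a small helper. This is a minimal sketch, not the answerer's code: `span_texts`, `keep_blank` and `blank` are hypothetical names, and the elements are assumed to come from `driver.find_elements(By.XPATH, '//span[contains(@style,"column")]/span')`. Selenium's `.text` usually trims surrounding whitespace, so a blank span may arrive as an empty string; the helper maps any whitespace-only value to a single space to match the list in the question.

```python
def span_texts(elements, keep_blank=True, blank=" "):
    """Return the text of each element, in the order given.

    `elements` is any iterable of objects exposing a `.text` attribute,
    for example the WebElements returned by driver.find_elements(...).
    Whitespace-only texts are replaced by `blank` when keep_blank is True
    and skipped otherwise. (All names here are illustrative.)
    """
    texts = []
    for element in elements:
        text = element.text
        if text.strip():
            # Non-blank text is kept exactly as Selenium reported it.
            texts.append(text)
        elif keep_blank:
            # Blank span: record a placeholder instead of dropping it.
            texts.append(blank)
    return texts
```

Calling `span_texts(spans, keep_blank=False)` would drop the blank entries instead, which is the "ignore" option mentioned above.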
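Per the HTML, to extract all the text from the <span> elements into a list, you need to induce WebDriverWait for visibility_of_all_elements_located() and use a list comprehension. You can use either of the following locator strategies:
Using CSS_SELECTOR and the text attribute: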
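from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver.get("application url")
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tr[style^='background'] > td > span > span")))])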
Using XPATH and get_attribute("innerHTML"):
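from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver.get("application url")
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tr[starts-with(@style, 'background')]/td/span/span")))])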
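The wait-then-read pattern in this answer can also be wrapped in a reusable function. The following is only a sketch under stated assumptions: `read_value` and `wait_for_span_values` are hypothetical names, `driver` is an already-created WebDriver, and returning an empty list on timeout is a design choice of this sketch, not something the answer prescribes.

```python
def read_value(element, attribute=None):
    """Return element.text, or the named attribute (e.g. "innerHTML") if given."""
    if attribute is None:
        return element.text
    return element.get_attribute(attribute)


def wait_for_span_values(driver, locator, timeout=20, attribute=None):
    """Wait until every element matching `locator` is visible, then read each one.

    `locator` is a (By.<strategy>, "selector") tuple.
    Returns an empty list if the wait times out.
    """
    # Imported lazily so read_value() stays usable without Selenium installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    try:
        elements = WebDriverWait(driver, timeout).until(
            EC.visibility_of_all_elements_located(locator)
        )
    except TimeoutException:
        return []
    return [read_value(element, attribute) for element in elements]
```

A call might look like `wait_for_span_values(driver, (By.XPATH, '//span[contains(@style,"column")]/span'), attribute="innerHTML")`. If the blank spans on the page are not treated as displayed, the visibility condition would never succeed; in that case `EC.presence_of_all_elements_located` is a less strict condition to try.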
Simply remove the predicate [1] from your XPath, and it becomes:
//td[contains(.,'table_num')]/following-sibling::td
More precisely, you can use:
//td[contains(.,'table_num')]/following-sibling::td/span/span
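Because the locator without `[1]` matches every `td` that follows the `table_num` cell, the texts can also be collected cell by cell. The sketch below assumes an existing `driver`; `texts_per_cell`, `XPATH` and `CELLS_XPATH` are hypothetical names. The strategy is passed as the plain string "xpath", which is the value of Selenium's `By.XPATH`, so the block needs no imports.

```python
XPATH = "xpath"  # same string as selenium.webdriver.common.by.By.XPATH

# Every <td> after the one containing 'table_num' (no [1], so all siblings match).
CELLS_XPATH = "//td[contains(.,'table_num')]/following-sibling::td"


def texts_per_cell(driver):
    """Return one list of span texts per following-sibling cell."""
    cells = driver.find_elements(XPATH, CELLS_XPATH)
    return [
        # "./span/span" is evaluated relative to the current cell.
        [span.text for span in cell.find_elements(XPATH, "./span/span")]
        for cell in cells
    ]
```

Flattening the result, for example with `[t for cell in texts_per_cell(driver) for t in cell]`, gives a single list like the one the `/span/span` form of the XPath returns directly.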