How to extract the values of the alt attribute within img with Selenium Python
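I have the following problem, which I have not been able to solve. I need to extract the text that appears in the alt attribute of the images. The id always changes, and so does the alt it contains.
I noticed that the id only changes in this part: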
//*[@id="j_id1102997597_32d04ef7:0:j_id1102997597_32d04d36:**2**:j_id1102997597_32d04d7a:**2**:j_id1102997597_32d04d8e:j_id415163359_4cfbfc60"]
//*[@id="j_id1102997597_32d04ef7:0:j_id1102997597_32d04d36:**1**:j_id1102997597_32d04d7a:**1**:j_id1102997597_32d04d8e:j_id415163359_4cfbfc60"]
//*[@id="j_id1102997597_32d04ef7:0:j_id1102997597_32d04d36:**0**:j_id1102997597_32d04d7a:**0**:j_id1102997597_32d04d8e:j_id415163359_4cfbfc60"]
Either way, I still have not been able to extract the values.
You can use BeautifulSoup for this (install it with pip install bs4):
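from bs4 import BeautifulSoup

# Parse the HTML that Selenium has already rendered.
soup = BeautifulSoup(browser.page_source, 'html.parser')
# Collect every <img> inside the first div with class "text-center".
images = soup.select_one('div.text-center').select('img')
for image in images:
    # .get() returns None when an <img> has no alt attribute.
    print(image.get('alt'))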
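If installing bs4 is not an option, the same idea can be sketched with the standard library's html.parser instead of BeautifulSoup. This is only an illustration under assumptions: AltCollector, extract_alts and sample_html are hypothetical names, the sample markup stands in for browser.page_source, and unlike the snippet above it collects every img on the page rather than only those inside div.text-center.

```python
from html.parser import HTMLParser


class AltCollector(HTMLParser):
    """Collect the alt attribute of every <img> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.alts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt is not None:
                self.alts.append(alt)


def extract_alts(html):
    """Return the alt values of all <img> tags in the given HTML string."""
    parser = AltCollector()
    parser.feed(html)
    parser.close()
    return parser.alts


# Hypothetical markup standing in for browser.page_source.
sample_html = """
<div class="text-center">
  <img id="a:0:b" alt="7">
  <img id="a:1:b" alt="3"/>
  <img id="a:2:b">
</div>
"""
print(extract_alts(sample_html))
```

With a live session you would pass browser.page_source to extract_alts. Images without an alt attribute are skipped rather than reported as None.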
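To extract and print the values of the alt attribute, you need to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following locator strategies:
Using CSS_SELECTOR: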
print([my_elem.get_attribute("alt") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.matriz table > tbody tr.filaMatriz > td > img[alt]")))])
Using XPATH:
print([my_elem.get_attribute("alt") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='matriz']//table/tbody//tr[@class='filaMatriz']/td/img[@alt]")))])
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
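The question observes that the three sample ids differ only in two index segments. If you would rather target the elements by id, the locators can be generated per index. The following is a rough sketch, assuming that both indexes always move together (as in the three samples) and that the remaining segments stay stable for the lifetime of the page, which the question does not guarantee. ID_TEMPLATE and xpath_for_index are hypothetical names introduced for illustration.

```python
# Id template copied from the question; the two {i} slots are the only parts
# that differ between the three sample ids.
ID_TEMPLATE = (
    "j_id1102997597_32d04ef7:0:"
    "j_id1102997597_32d04d36:{i}:"
    "j_id1102997597_32d04d7a:{i}:"
    "j_id1102997597_32d04d8e:j_id415163359_4cfbfc60"
)


def xpath_for_index(i):
    """Return the id-based XPath for index i (hypothetical helper)."""
    return '//*[@id="{}"]'.format(ID_TEMPLATE.format(i=i))


# Print the locators for the three indexes shown in the question.
for i in range(3):
    print(xpath_for_index(i))
```

Each generated string can be passed to a By.XPATH lookup. Because generated ids like these tend to change between page loads, the class-based locators shown above are likely the more robust choice.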