Web scraping with Selenium in Python - not retrieving all elements

Question

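I'm trying to use Selenium to web-scrape coinmarketcap.com, but I can only retrieve the first 10 altcoins in the list. I read that //div[contains(concat(' ', normalize-space(@class), ' '), 'class name')] should do the trick, but it isn't working. Can anyone help? I also know that coinmarketcap has an API, but I just wanted to try another way.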


from selenium import webdriver

driver = webdriver.Chrome(r'C:\Users\Ejer\PycharmProjects\pythonProject\chromedriver')
driver.get('https://coinmarketcap.com/')

Crypto = driver.find_elements_by_xpath("//div[contains(concat(' ', normalize-space(@class), ' '), 'sc-16r8icm-0 sc-1teo54s-1 lgwUsc')]")
#price = driver.find_elements_by_xpath('//td[@class="cmc-link"]')
#coincap = driver.find_elements_by_xpath('//td[@class="DAY"]')

CMC_list = []
for c in range(len(Crypto)):
    CMC_list.append(Crypto[c].text)
print(CMC_list)

driver.close()

Answer 1

To retrieve the first 10 altcoins in the list, you need to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following locator strategies:

Using CSS_SELECTOR and get_attribute("innerHTML"):

driver.get('https://coinmarketcap.com/')
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table.cmc-table tbody tr td > a p[color='text']")))[:10]])

Using XPATH and the text attribute:

driver.get('https://coinmarketcap.com/')
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[contains(@class, 'cmc-table')]//tbody//tr//td/a//p[@color='text']")))[:10]])

Console Output:

 ['Bitcoin', 'Ethereum', 'XRP', 'Tether', 'Litecoin', 'Bitcoin Cash', 'Chainlink', 'Cardano', 'Polkadot', 'Binance Coin']

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
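
To make the idea behind WebDriverWait(driver, 20).until(...) concrete, here is a minimal pure-Python sketch of the polling pattern: call a condition repeatedly until it returns something truthy, or give up once the timeout has passed. This is a simplified illustration, not Selenium's actual implementation. The names wait_until, WaitTimeout and FakePage are hypothetical and only stand in for the real wait, its timeout exception, and a page whose rows render late.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # Truthy result (e.g. a non-empty list of rows): hand it back.
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


class FakePage:
    """Stand-in for a page whose rows only show up after a few polls."""

    def __init__(self, ready_after):
        self.calls = 0
        self.ready_after = ready_after

    def visible_rows(self):
        self.calls += 1
        if self.calls < self.ready_after:
            return []  # nothing rendered yet: falsy, so the wait keeps polling
        return ["Bitcoin", "Ethereum", "XRP"]


page = FakePage(ready_after=3)
names = wait_until(page.visible_rows, timeout=2.0, poll_interval=0.01)
print(names[:2])  # slicing the result mirrors the [:10] in the answer
```

In the answer's code, the condition is EC.visibility_of_all_elements_located(...), which returns the list of matching elements once all of them are visible, and the 20 passed to WebDriverWait is the timeout in seconds.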

Answer 1
0 Accepted 2020-12-07 22:25:36
