
Scraping data with Selenium from a table with scrolldown

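I want to scrape the website gate.io. I want to list all the coins/tokens in the table on the left so that I can click on each one and get the long/short ratio displayed on the right of this page. The problem is that I cannot get the list of coins/tokens from the table. Here is what I did: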

!pip install selenium
!apt-get update
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin
import sys
import logging
from selenium.webdriver.remote.remote_connection import LOGGER
LOGGER.setLevel(logging.WARNING)
sys.path.insert(0,'/usr/lib/chromium-browser/chromedriver')
from selenium import webdriver
from selenium.webdriver.common.by import By
from tqdm import tqdm_notebook as tqdm
import pandas
import json
import pprint

# Headless Chrome settings for a notebook environment
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument("user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36")

# Selenium 4 style: pass options=; chromedriver is found on PATH (copied to /usr/bin above)
wd = webdriver.Chrome(options=chrome_options)
wd.get("https://www.gate.io/en/trade/BTC_USDT")

# Locate the USDT market table, then its rows, and print each row id
table = wd.find_element(By.XPATH, "//*[@id='marketlist_usdt']")
rows = table.find_elements(By.XPATH, "tbody/tr[@class=' border-box']")
for row in rows:
    print(row.get_attribute("id"))

Your element path appears to be incorrect: [image: element path 1] [image: element path 2]
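One way to check a suspect path is to count how many elements each candidate XPath matches. The sketch below is only an illustration: `count_matches` and `CANDIDATES` are hypothetical names, and the second XPath is an assumption about what a more tolerant match might look like, not a verified selector for gate.io. It relies on the fact that `@class='...'` compares the whole attribute string, so a row carrying extra classes or different whitespace would not match the first form.

```python
XPATH = "xpath"  # same string value as selenium.webdriver.common.by.By.XPATH


def count_matches(driver, xpaths):
    """Return {xpath: number of matching elements} for each candidate."""
    return {xp: len(driver.find_elements(XPATH, xp)) for xp in xpaths}


CANDIDATES = [
    # Exact comparison against the whole class attribute, as in the question.
    "//*[@id='marketlist_usdt']/tbody/tr[@class=' border-box']",
    # Token-style comparison that tolerates extra classes and whitespace
    # (an assumed alternative, not checked against the live page).
    "//*[@id='marketlist_usdt']//tr[contains("
    "concat(' ', normalize-space(@class), ' '), ' border-box ')]",
]

# Usage with a live driver (here called wd, as in the question):
# for xp, n in count_matches(wd, CANDIDATES).items():
#     print(n, xp)
```

A count of zero for every candidate would suggest that the rows are not in the DOM yet (for example, still loading) rather than that the class test is wrong.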

Try using find_element_by_css_selector (find_element(By.CSS_SELECTOR, ...) in Selenium 4); this works for me:

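from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Selenium 4 style: the chromedriver path is passed through a Service object
chrome = webdriver.Chrome(service=Service('MY_PATH'))
chrome.get("https://www.gate.io/en/trade/BTC_USDT")


class Ask:
    """Wraps one <li> of the ask list; values are read from the page on access."""

    def __init__(self, driver):
        # Despite the name, this is the <li> WebElement, not the browser
        self.driver = driver

    def __repr__(self):
        return f'{self.price} | {self.volume} | {self.total}'

    @property
    def price(self):
        return self.driver.find_element(By.CSS_SELECTOR, 'span.price').text

    @property
    def volume(self):
        return self.driver.find_element(By.CSS_SELECTOR, 'span.volume').text

    @property
    def total(self):
        return self.driver.find_element(By.CSS_SELECTOR, 'span.total').text


# One Ask per row of the ask list
ask_list = [Ask(a) for a in chrome.find_elements(By.CSS_SELECTOR, 'ul#ul-ask-list li')]

for a in ask_list:
    print(a)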
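
For the scrolldown part of the question, a possible approach is to scroll the table's container in steps and collect row ids until a pass finds nothing new. This is a minimal sketch under assumptions: `collect_row_ids`, the pause, and the round limit are hypothetical, and the selectors in the usage comment are placeholders that have not been verified against gate.io. It only uses `find_element`, `find_elements`, `get_attribute`, and `execute_script`, and passes the locator strategy as the plain string that `By.CSS_SELECTOR` stands for.

```python
import time

CSS = "css selector"  # same string value as By.CSS_SELECTOR in Selenium


def collect_row_ids(driver, container_css, row_css, pause=0.5, max_rounds=50):
    """Scroll a container step by step and gather the id of every row seen."""
    container = driver.find_element(CSS, container_css)
    seen = []  # ordered, de-duplicated row ids
    for _ in range(max_rounds):
        before = len(seen)
        for row in driver.find_elements(CSS, row_css):
            row_id = row.get_attribute("id")
            if row_id and row_id not in seen:
                seen.append(row_id)
        # Scroll the container itself (not the window) by one visible height.
        driver.execute_script(
            "arguments[0].scrollTop += arguments[0].clientHeight;", container
        )
        time.sleep(pause)  # give the page a moment to render more rows
        if len(seen) == before:
            break  # this pass found nothing new, so assume the end was reached
    return seen


# Usage with a live driver (placeholder selectors, adjust after inspecting the page):
# ids = collect_row_ids(chrome, "SCROLL_CONTAINER_CSS", "#marketlist_usdt tbody tr")
# print(len(ids), ids[:5])
```

If the page re-renders rows while scrolling, reading attributes can raise a stale element error; retrying the pass is one way to handle that, which this sketch leaves out.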

