
How to bypass an error in Selenium Python web scraping

I have written the following code for web scraping. Without the if/else part, the code works fine and does what I intend. I have a list of URLs that I want to scrape, and if the element does not exist on a given URL, I need to skip that URL and move on to the next one. I managed to skip the URLs that have no element, but my normal scraping in the else branch then does not work as it should. Any help?

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
import time
from selenium.common.exceptions import TimeoutException
from selenium.common.exceptions import NoSuchElementException  

urls= [
'http://www.marketsmojo.com/Stocks?StockId=1002687&Exchange=0'
]
f= open("lolly.txt","a+")
browser=webdriver.Chrome()
browser.maximize_window()
browser.get('http://www.marketsmojo.com/Stocks?StockId=565016&Exchange=0')
browser.find_element_by_xpath("//*[@id='step-0']/a/i").click()
for url in urls:
    browser.get(url)
    browser.execute_script("window.scrollTo(10,9500);")
    browser.implicitly_wait(2000)
    if browser.find_element_by_xpath("//div[contains(.,' No Shareholding data available ')]"):
        continue
    else:
        add=browser.find_element_by_css_selector('#btnShareholdingDashboardFullDetails')
        SearchButton = browser.find_element_by_css_selector('#btnShareholdingDashboardFullDetails')
        Hover = ActionChains(browser).move_to_element(add).move_to_element(SearchButton)
        Hover.click().perform()
        browser.find_elements_by_css_selector('#allquarters > div > table')
        add1 = browser.find_element_by_css_selector('#AllQuarters')
        SearchButton1 = browser.find_element_by_css_selector('#AllQuarters')
        Hover1 = ActionChains(browser).move_to_element(add).move_to_element(SearchButton1)
        Hover1.click().perform()
        data = []
        for tr in browser.find_elements_by_css_selector('#allquarters > div > table'):
            ths = tr.find_elements_by_tag_name('th')
            tds = tr.find_elements_by_tag_name('td')
            if ths: 
                data.append([th.text for th in ths])
            if tds: 
                data.append([td.text for td in tds])
            f.write(str(data))

    
browser.quit()

Sorry for the trouble, everyone. It was just the timeout setting that caused the hiccup; the code itself is fine.
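For readers who hit the same problem, here is a minimal sketch of the skip-and-continue pattern, not the original poster's final code. It relies on `find_elements` (plural) returning an empty list when nothing matches, whereas `find_element` raises `NoSuchElementException`, so the check can sit in a plain `if`. The function name `scrape_tables` and the constant names are hypothetical; the XPath and CSS selector are copied from the question. The hover and click steps from the question are left out to keep the sketch short. The locator strategies are passed as plain strings, which are the values of Selenium's `By.XPATH` and `By.CSS_SELECTOR`.

```python
# Locator strategy strings. These are the values of selenium's By.XPATH and
# By.CSS_SELECTOR, spelled out so this helper needs no import of its own.
XPATH = "xpath"
CSS = "css selector"

# Selectors copied from the question.
NO_DATA_XPATH = "//div[contains(.,' No Shareholding data available ')]"
TABLE_CSS = "#allquarters > div > table"


def scrape_tables(browser, urls):
    """Return {url: rows} for pages that have data; skip pages with the no-data notice.

    `browser` is assumed to be a Selenium WebDriver (for example webdriver.Chrome()).
    """
    results = {}
    for url in urls:
        browser.get(url)
        # find_elements returns [] instead of raising NoSuchElementException,
        # so a missing notice simply makes this condition false.
        if browser.find_elements(XPATH, NO_DATA_XPATH):
            continue  # nothing to scrape on this page, move on to the next URL
        rows = []
        for table in browser.find_elements(CSS, TABLE_CSS):
            for tr in table.find_elements(CSS, "tr"):
                cells = tr.find_elements(CSS, "th, td")
                if cells:  # ignore rows without header or data cells
                    rows.append([cell.text for cell in cells])
        results[url] = rows
    return results
```

With a real driver this would be called as `scrape_tables(browser, urls)`. Note that `implicitly_wait` takes seconds, so `browser.implicitly_wait(2000)` lets every lookup for a missing element block for up to 2000 seconds. A small value such as `browser.implicitly_wait(5)` (the exact number is an assumption) keeps the no-data check quick on pages that do have data.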

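Notice: The technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.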

 