BeautifulSoup Python Selenium - Wait for tweets to load before scraping a website

Question

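I'm trying to scrape a website to extract tweet links (specifically DW in this case), but I can't get any data because the tweets don't load immediately, so the request runs before they have had time to load. I tried a request timeout as well as time.sleep(), with no luck. After those two options, I tried using Selenium to load the page locally and give it time to load, but I can't seem to get it to work. I believe this can be done with Selenium. Here is what I have tried so far: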

        links = 'https://www.dw.com/en/vaccines-appear-effective-against-india-covid-variant/a-57344037'
        driver.get(links)
        delay = 30  # seconds
        try:
            WebDriverWait(driver, delay).until(EC.visibility_of_all_elements_located((By.ID, "twitter-widget-0")))
        except Exception:  # e.g. a TimeoutException if the widget never becomes visible
            pass
        tweetSource = driver.page_source
        tweetSoup = BeautifulSoup(tweetSource, features='html.parser')
        linkTweets = tweetSoup.find_all('a')
        for linkTweet in linkTweets:
            tweetURL = linkTweet.get('href')
            if tweetURL is None:  # skip anchors without an href
                continue
            if "twitter.com" in tweetURL and "status" in tweetURL:
                # Run getTweetID function
                tweetID = getTweetID(tweetURL)
                newdata = [tweetID, date_tag, "DW", links, title_tag, "News", ""]
                # Write to dataframe
                df.loc[len(df)] = newdata
                print("working on tweetID: " + str(tweetID))

It would be great if someone could get Selenium to find the tweet!

Answer 1

The tweet is inside an iframe, so you first need to switch to that iframe before you can read its content:

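from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

iframe = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "twitter-widget-0"))
)
driver.switch_to.frame(iframe)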
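
For completeness, here is a rough sketch of how the iframe switch could be combined with the BeautifulSoup parsing from the question. Treat it as an illustration under assumptions, not a tested solution for this page: the function names are made up for this example, `tweet_id_from_url` is only a hypothetical stand-in for the question's `getTweetID` (which isn't shown), and it assumes the rendered tweet contains at least one link with `/status/` in its `href` and that `driver.page_source` returns the source of the frame you have switched into.

```python
from urllib.parse import urlparse

from bs4 import BeautifulSoup


def tweet_links_from_html(html):
    """Return tweet status URLs found in <a href=...> tags, in order, no duplicates."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):  # href=True skips anchors without href
        href = anchor["href"]
        if "twitter.com" in href and "/status/" in href and href not in links:
            links.append(href)
    return links


def tweet_id_from_url(url):
    """Hypothetical stand-in for getTweetID: the path segment right after 'status'."""
    parts = urlparse(url).path.strip("/").split("/")
    if "status" in parts[:-1]:
        return parts[parts.index("status") + 1]
    return None


def collect_tweet_links(driver, frame_id="twitter-widget-0", timeout=10):
    """Switch into the tweet iframe, parse its source, then switch back."""
    # Imported here so the two helpers above can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    # Waits until the iframe exists and switches into it in one step.
    wait.until(EC.frame_to_be_available_and_switch_to_it((By.ID, frame_id)))
    try:
        # Assumption: the loaded tweet exposes at least one status link.
        wait.until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "a[href*='/status/']"))
        )
        return tweet_links_from_html(driver.page_source)
    finally:
        # Always return to the top-level document, even if the wait timed out.
        driver.switch_to.default_content()
```

After `driver.get(links)`, you would call `collect_tweet_links(driver)` and pass each returned URL to your own ID function. A page with several embedded tweets would presumably have more than one widget iframe, so the call may need to be repeated with a different `frame_id` for each.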

Answer 1 status: 0, accepted, 2021-04-29 10:23:42