Python Selenium WebDriver does not consistently select an element even though it exists

Question

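I am working on a web scraper that collects the src links from the source tags in an HTML document and adds them to a list.

The site's videos are nested under a load of divs, but ultimately every page comes down to: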

<video type="video/mp4" poster="someimagelink" preload="metadata" crossorigin="anonymous">
    <source type="video/mp4" src="somemp4link">
</video>

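My current approach is to log in to the site, go to the page that contains the links to the video pages, visit each video page one by one, and then try to find the source tag and add it to a list.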

import time
import requests
from bs4 import BeautifulSoup
from selenium import webdriver

browser = webdriver.Firefox()

# Log in and load the page that lists the video page links (omitted; this part works fine)

soup = BeautifulSoup(browser.page_source)
for i in range(3):
    browser.get(soup('a', {'class': 'subject__item'})[i]['href'])
    vsoup = BeautifulSoup(browser.page_source)
    print(vsoup('source'))
    browser.get('pageWithVideoPages')

    # This doesn't add to a list yet; it just goes to the video page,
    # tries to find the source tag, and prints it out.
    # Then it goes back to the original page and the loop starts again.

But what actually happens is that I get:

[<source src="themp4link" type="video/mp4"></source>]
[]
[]
[]

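So the first one works, and then all the rest return an empty list... as if there were no source tag, yet checking manually in the inspector shows that a source tag is there.

Running it again, I now get: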

[<source src="http://themp4link" type="video/mp4"></source>]
[]
[<source src="http://themp4link" type="video/mp4"></source>]

The site needs JavaScript enabled to load its content (which is why I am using WebDriver for this)... could that be related?

Any help is much appreciated!

Answer 1

You probably need to wait for the web element you are looking for. You should look into WebDriverWait.

Solution 1
1 Accepted 2016-07-12 13:50:36