
Scraping image, URL, description from page

I am trying to get image and video URLs from https://www.google.com/trends/home/all/IN

Here is the code:

driver = webdriver.PhantomJS('/usr/local/bin/phantomjs')
driver.set_window_size(1124, 850)
driver.get("https://www.google.com/trends/home/all/IN")
trend = {}
def getGooglerends():
    try:
        # Does this line make any sense?
        #element = WebDriverWait(driver, 20).until(lambda driver: driver.find_elements_by_class_name('md-list-block ng-scope'))
        for s in driver.find_elements_by_class_name('md-list-block ng-scope'):
            print s.find_element_by_tag_name('img').get_attribute('src')
            print s.find_element_by_tag_name('img').get_attribute('alt')
            print s.find_elements_by_class_name('image-wrapper ng-scope').get_attribute('href')
    except:
        getNDTVTrends()
getGooglerends()

This gives:

WebDriverException: Message: {"errorMessage":"Compound class names not permitted","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"111","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:57213","User-Agent":"Python-urllib/2.7"},"httpVersion":"1.1","method":"POST","post":"{\"using\": \"class name\", \"sessionId\": \"648251c0-1cc7-11e5-bf1c-4ff79ddbdce4\", \"value\": \"md-list-block ng-scope\"}","url":"/elements","urlParsed":{"anchor":"","query":"","file":"elements","directory":"/","path":"/elements","relative":"/elements","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/elements","queryKey":{},"chunks":["elements"]},"urlOriginal":"/session/648251c0-1cc7-11e5-bf1c-4ff79ddbdce4/elements"}}
Screenshot: available via screen

Any suggestions about this error?

Compound class names not permitted

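Basically, this means that the class name you pass cannot contain spaces. "md-list-block ng-scope" is two class names separated by a space, and the class-name locator accepts only one. You need to switch to another kind of selector, such as CSS or XPath.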
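As a minimal sketch of the CSS route, a space-separated class string can be turned into a selector that chains each class with a dot. The helper names compound_class_to_css and find_by_classes are hypothetical, not part of Selenium, and the sketch assumes the class names contain no characters that need CSS escaping. The only Selenium call used is find_elements with By.CSS_SELECTOR.

```python
def compound_class_to_css(class_names):
    """Turn 'a b' into '.a.b', which matches elements carrying every listed class."""
    parts = class_names.split()
    if not parts:
        raise ValueError("expected at least one class name")
    return "".join("." + part for part in parts)


def find_by_classes(driver, class_names):
    """Hypothetical wrapper: locate elements by several classes through a CSS selector."""
    # Imported lazily so compound_class_to_css works without Selenium installed.
    from selenium.webdriver.common.by import By
    return driver.find_elements(By.CSS_SELECTOR, compound_class_to_css(class_names))


if __name__ == "__main__":
    print(compound_class_to_css("md-list-block ng-scope"))
```

With a live driver, find_by_classes(driver, 'md-list-block ng-scope') would take the place of the find_elements_by_class_name call that raised the exception. Note also that find_elements returns a list, so get_attribute has to be called on each element of the result rather than on the list itself.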

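I'm not sure exactly what you are trying to select, but as an example, the following XPath selects the list of items that has that class: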

//div[@class="homepage-trending-stories generic-container ng-scope"]/md-list[@class="md-list-block ng-scope"]
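An expression like the one above compares the whole class attribute as a single string, so it stops matching if the page adds, removes, or reorders a class. A possible alternative, sketched here under the assumption that matching on individual class tokens is what you want, builds an XPath that tests each class separately. The helper names class_predicate and xpath_with_classes are hypothetical.

```python
def class_predicate(class_name):
    """XPath test that the class attribute contains class_name as a whole token."""
    return "contains(concat(' ', normalize-space(@class), ' '), ' {} ')".format(class_name)


def xpath_with_classes(tag, class_names):
    """Build an XPath matching any <tag> element that carries every listed class."""
    names = class_names.split()
    if not names:
        raise ValueError("expected at least one class name")
    predicates = " and ".join(class_predicate(name) for name in names)
    return "//{}[{}]".format(tag, predicates)


if __name__ == "__main__":
    print(xpath_with_classes("md-list", "md-list-block ng-scope"))
```

The resulting string can be passed to driver.find_elements(By.XPATH, ...), with By imported from selenium.webdriver.common.by.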
