Python WebDriver: find_element() not finding the element

I'm learning the very basics of web scraping by following Chapter 12 of Automate the Boring Stuff with Python, but I'm having an issue with the find_element() method. When I use the method to look for an element with the class name 'card-img-top cover-thumb', it doesn't return any matches. However, the code does work for URLs other than the example in the book.

I have had to make quite a few changes to the code as written in order to get it to do anything. I've posted the full code on GitHub, but to summarise:

  • The book says to use 'find_element_by_*' methods, but these were producing deprecation messages that directed me to use find_element() instead.

  • To use this other method, I import 'By'.

  • I also import 'Service' from 'selenium.webdriver.chrome.service' because ChromeDriver doesn't work otherwise.

  • I also define options with webdriver.ChromeOptions() that hide certain error messages about a faulty device, which apparently you're just supposed to ignore?

  • I put the code from the book into a function with 'url' and 'classname' arguments so I can test different URLs without having to edit the code repeatedly.

Here is the 'business part' of the code:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Path to the local ChromeDriver executable.
s = Service(r'C:\Users\antse\AppData\Local\Chrome_WebDriver\chromedriver.exe')

# Hide ChromeDriver's console logging messages.
op = webdriver.ChromeOptions()
op.add_experimental_option('excludeSwitches', ['enable-logging'])

def FNC_GET_CLASS_ELEMENT_FROM_PAGE(URL, CLASSNAME):
    # Open the page and report the tag of the first element with the given class.
    browser = webdriver.Chrome(service=s, options=op)
    browser.get(URL)
    try:
        elem = browser.find_element(By.CLASS_NAME, CLASSNAME)
        print('Found <%s> element with that class name!' % (elem.tag_name))
    except Exception:
        print('Was not able to find an element with that name.')

FNC_GET_CLASS_ELEMENT_FROM_PAGE('https://inventwithpython.com', 'card-img-top cover-thumb')

Expected output: Found <img> element with that class name!

Since the code does work when I look at a site like Wikipedia, I wonder if there have been changes to the HTML of the page that prevent the scrape from working properly?

Link to the book chapter HERE.

I appreciate any advice you can give me!

You can't pass multiple classes to find_element with By.CLASS_NAME. It accepts a single class name, so a space-separated value such as 'card-img-top cover-thumb' will not match. So replace this:

FNC_GET_CLASS_ELEMENT_FROM_PAGE('https://inventwithpython.com', 'card-img-top cover-thumb')

with this:

FNC_GET_CLASS_ELEMENT_FROM_PAGE('https://inventwithpython.com', 'card-img-top')

If you really want to use both classes, then take a look at this answer, which explains things in detail.
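
One common way to match on both classes is a compound CSS selector, in which each class name is prefixed with a dot and the names are joined without spaces. The sketch below is only an illustration: the helper names classes_to_css_selector and find_by_all_classes are hypothetical and not part of Selenium, and it assumes the class names contain no characters that would need CSS escaping.

```python
def classes_to_css_selector(class_attr):
    """Turn a space-separated class attribute into a compound CSS selector.

    Hypothetical helper; assumes plain class names that need no CSS escaping.
    """
    names = class_attr.split()
    if not names:
        raise ValueError('expected at least one class name')
    return ''.join('.' + name for name in names)


def find_by_all_classes(browser, class_attr):
    """Find the first element carrying every class in class_attr.

    Hypothetical helper; 'browser' is an already created Selenium WebDriver.
    """
    # Imported here so the selector helper above works without Selenium installed.
    from selenium.webdriver.common.by import By
    return browser.find_element(By.CSS_SELECTOR, classes_to_css_selector(class_attr))


print(classes_to_css_selector('card-img-top cover-thumb'))  # .card-img-top.cover-thumb
```

Inside the function from the question, the lookup would then be browser.find_element(By.CSS_SELECTOR, '.card-img-top.cover-thumb') instead of the By.CLASS_NAME call.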
