Python Selenium find_elements_by_class_name Error
I am scraping a Google results page that returns links to LinkedIn profiles.
I want to collect the links on the page and put them in a Python list.
The problem is that I can't seem to extract them from the page properly, and I don't know why.
The Google results page displays 10 entries like the following:
Mary Smith - Director of Talent Acquisition ...
https://www.linkedin.com › marysmith
Anytown, Arizona 500+ connections ... Experienced Talent Acquisition Director, with a
demonstrated history of working in the marketing and advertising ...
The source code looks like this:
<div data-hveid="CAIQAA" data-ved="2ahUKEwjLv6HMr4HmAhWluVkKHfjfA1EQFSgAMAF6BAgCEAA">
<div class="rc">
<div class="r">
<a href="https://www.linkedin.com/in/marysmith" ping="/url?sa=t&source=web&rct=j&url=https://www.linkedin.com/in/marysmith&ved=2ahUKEwjLv6HMr4HmAhWluVkKHfjfA1EQFjABegQIAhAB">
<h3 class="LC20lb"><span class="S3Uucc">Mary Smith - Director of Talent Acquisition, Culture Curator ...</span></h3><br>
<div class="TbwUpd">
<cite class="iUh30 bc">https://www.linkedin.com › marysmith</cite>
</div>
</a>
...
In my script I'm using Selenium and find_element_by_class_name() to collect all the instances of the links to LinkedIn. The one in the above example is https://www.linkedin.com › marysmith. It is one line of code where I use driver.find_element_by_class_name() with the particular class name:
linkedin_urls = driver.find_element_by_class_name("iUh30 bc")
However, I get the following error:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[name="iUh30 bc"]"}
I've tried various permutations and other classes, but it won't work. If I use the XPath for one of those links, the script WILL return that single link.
What am I doing wrong?
Websites like Google and Facebook generate their page sources automatically and assign obfuscated class names that can change from one page load to the next, which is why you are getting "no such element". To work around this, try to locate elements by tags or attributes that stay constant.
Try something like:
#<cite class="iUh30 bc">https://www.linkedin.com › mary-smith-mckenzie-8b660799</cite>
driver.find_elements_by_xpath("//cite[contains(text(),'›') and contains(text(),'linkedin.com')]")
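The same idea (matching on something stable instead of a class name) can also be applied to the anchor's href, which is where the full profile URL lives in the markup from the question. This is only a sketch: the function name collect_linkedin_hrefs and the 'linkedin.com/in/' substring are assumptions made for illustration, and driver is assumed to be an already-started Selenium WebDriver. It uses the generic find_elements(by, value) call with the locator strategy string "xpath".

```python
def collect_linkedin_hrefs(driver):
    """Return the unique LinkedIn profile URLs found on the current page.

    `driver` is assumed to be a Selenium WebDriver that has already
    loaded the results page. The XPath below is an assumption based on
    the sample markup in the question, not a guaranteed Google layout.
    """
    # "xpath" is the locator strategy string behind By.XPATH.
    anchors = driver.find_elements(
        by="xpath",
        value="//a[contains(@href, 'linkedin.com/in/')]",
    )
    hrefs = []
    for anchor in anchors:
        href = anchor.get_attribute("href")
        # Skip anchors without an href and drop duplicates, keeping order.
        if href and href not in hrefs:
            hrefs.append(href)
    return hrefs
```

Calling linkedin_urls = collect_linkedin_hrefs(driver) after the page has loaded gives a plain Python list, which is what the question asks for.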
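That method is known to be unreliable with compound class names such as "iUh30 bc", which is really two classes separated by a space. Try a CSS selector instead: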
driver.find_element_by_css_selector(".iUh30.bc")
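Note that find_element (singular) returns only the first match, while the question wants every link. A minimal sketch of collecting all of them follows; the helper names class_attr_to_css and collect_texts_by_class are hypothetical, and driver is assumed to be an already-started Selenium WebDriver. It uses the generic find_elements(by, value) call with the locator strategy string "css selector".

```python
def class_attr_to_css(class_attr):
    """Convert a class attribute value into a CSS selector.

    A value like "iUh30 bc" names two classes, so the selector must
    chain them with dots and no space between them.
    """
    names = class_attr.split()
    if not names:
        raise ValueError("class attribute is empty")
    return "".join("." + name for name in names)


def collect_texts_by_class(driver, class_attr):
    """Return the visible text of every element carrying all given classes.

    `driver` is assumed to be a Selenium WebDriver with the page loaded.
    """
    selector = class_attr_to_css(class_attr)
    # "css selector" is the locator strategy string behind By.CSS_SELECTOR.
    elements = driver.find_elements(by="css selector", value=selector)
    return [element.text for element in elements]
```

With the class from the question, linkedin_urls = collect_texts_by_class(driver, "iUh30 bc") would build the selector for both classes and return a list with one entry per matching cite element.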
Thanks for the replies.
I evidently corrupted something on my computer, because when I transferred the script to another machine and ran it, it worked fine as I had coded it.
I appreciate the replies. I am not fluent in CSS, and your responses will indeed help me going forward.