Find href attributes with find_elements_by_class_name in Selenium with Python
Using Selenium with Python.
I've got pages full of links with a class called item-title. I'm trying to iterate through the pages and compile a list of all the link texts and the href attributes that go with them. I want to output the titles and links to a CSV file. Here is my code:
myLinks = driver.find_elements_by_class_name("item-title")
for link in myLinks:
    out.write(link.text)
    out.write(",")
    out.write(link.get_attribute("href"))
    out.write("\n")
The line that outputs the href value gives the following error:
TypeError: expected a character buffer object
I then tried the following:
myLinks = driver.find_elements_by_class_name("item-title")
for link in myLinks:
    out.write(link.text)
    out.write(",")
    out.write(str(link.get_attribute("href")))
    out.write("\n")
The error went away and the link text is coming through okay, but now the href is coming through as 'None'.
Edit to add the HTML:
<div class="item-title">
<span class="icons-pinned"></span>
<span class="icons-solved"></span>
<span class="icons-locked"></span>
<span class="icons-moved"></span>
<span class="icons-type"></span>
<span class="icons-reply"></span>
<a href="/mylink">My title</a>
</div>
I think I see the issue now. The <a> is a child element of the div, so I need to target that, don't I?
As per the HTML you have shared, the href attribute is not on the node matched by find_elements_by_class_name("item-title"), which is the <div>. Rather, both the link text and the href belong to the descendant <a> tag. Hence, instead of find_elements_by_class_name("item-title"), we have to use either find_elements_by_xpath or find_elements_by_css_selector, as follows:
Using find_elements_by_css_selector:
myLinks = driver.find_elements_by_css_selector("div.item-title > a")
for link in myLinks:
    out.write(link.get_attribute("innerHTML"))
    out.write(",")
    out.write(link.get_attribute("href"))
    out.write("\n")
Using find_elements_by_xpath:
myLinks = driver.find_elements_by_xpath("//div[@class='item-title']/a")
for link in myLinks:
    out.write(link.get_attribute("innerHTML"))
    out.write(",")
    out.write(link.get_attribute("href"))
    out.write("\n")
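Since the goal is a CSV file, it may be worth letting Python's csv module build each row instead of writing the commas by hand, so that a title containing a comma does not break the columns. The following is only a sketch under stated assumptions: write_links and FakeAnchor are hypothetical names that are not part of Selenium, and FakeAnchor merely stands in for a WebElement so the example runs without a browser.

```python
import csv
import io


def write_links(anchors, out):
    """Write one CSV row (title, href) per anchor and return the row count.

    `anchors` is any iterable of objects exposing `.text` and
    `.get_attribute(name)`, which is the interface Selenium WebElements offer.
    """
    writer = csv.writer(out, lineterminator="\n")
    rows = 0
    for anchor in anchors:
        href = anchor.get_attribute("href")
        if href is None:
            # No href on this element (e.g. the wrapping <div>), so skip it
            # rather than writing the string 'None' into the file.
            continue
        writer.writerow([anchor.text.strip(), href])
        rows += 1
    return rows


class FakeAnchor:
    """Hypothetical stand-in for a WebElement, used only for this demo."""

    def __init__(self, text, href):
        self.text = text
        self._attrs = {"href": href}

    def get_attribute(self, name):
        return self._attrs.get(name)


buffer = io.StringIO()
count = write_links(
    [
        FakeAnchor("My title", "/mylink"),
        FakeAnchor("Title, with comma", "/other"),
        FakeAnchor("No link here", None),
    ],
    buffer,
)
print(buffer.getvalue())
```

With a real driver, the same function could presumably be called as write_links(driver.find_elements_by_css_selector("div.item-title > a"), out). In Selenium 4 the find_elements_by_* helpers were replaced by driver.find_elements(By.CSS_SELECTOR, "div.item-title > a"), with By imported from selenium.webdriver.common.by.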