
Fetch links having no href attribute: Selenium-Python

Original question: 2017-09-28 07:07:02 · Tags: python, selenium, web-crawler

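I am currently trying to crawl an entire website to a specified crawl depth using selenium-python. I started with Google and planned to keep developing the code while crawling it.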

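The way it works is: if the page is "www.google.com" and it contains 15 links, then once all the links are fetched they are stored in a dictionary with "www.google.com" as the key and the list of 15 links as the value. Each of the 15 links is then taken from that dictionary entry, and the crawl continues recursively.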
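The scheme described above can be sketched roughly as follows. This is only an illustration, not the asker's actual code: the names `crawl`, `get_links`, and `selenium_links` are hypothetical, and the link fetcher is passed in as a callable so the recursion can be tried without a browser.

```python
from collections.abc import Callable


def crawl(start_url: str, max_depth: int,
          get_links: Callable[[str], list[str]]) -> dict[str, list[str]]:
    """Depth-limited recursive crawl; returns {page_url: [links found on it]}."""
    site_map: dict[str, list[str]] = {}

    def visit(url: str, depth: int) -> None:
        # Stop at the depth limit and never fetch the same page twice.
        if depth > max_depth or url in site_map:
            return
        site_map[url] = get_links(url)
        for link in site_map[url]:
            visit(link, depth + 1)

    visit(start_url, 0)
    return site_map


def selenium_links(driver, url: str) -> list[str]:
    """Hypothetical fetcher: the href of every anchor found on `url`."""
    from selenium.webdriver.common.by import By  # imported lazily

    driver.get(url)
    anchors = driver.find_elements(By.TAG_NAME, "a")
    # get_attribute returns None when the attribute is missing,
    # so anchors without an href are skipped here.
    return [href for a in anchors if (href := a.get_attribute("href"))]
```

With a real driver this might be called as `crawl("https://www.google.com", 2, lambda u: selenium_links(driver, u))`. Note that the fetcher only follows `href` values, which is exactly the limitation the question is about.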

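The problem is that the crawler only moves forward through the href attribute of each link it finds on a page. But not every link has an href attribute.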

For example: when the crawl reaches the "My Account" page, the footer contains a "Help and Feedback" item whose outerHTML is <span role="button" tabindex="0" class="fK1S1c" jsname="ngKiOe">Help and Feedback</span>.
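To make the distinction concrete, here is a minimal sketch that separates elements with an href from those without one. The helper name `split_by_href` is hypothetical; it relies only on `WebElement.get_attribute`, which returns `None` when the attribute is absent.

```python
def split_by_href(elements):
    """Partition elements into (hrefs to follow, elements without an href)."""
    navigable, script_driven = [], []
    for element in elements:
        # None (or an empty string) means there is no URL to crawl to.
        href = element.get_attribute("href")
        if href:
            navigable.append(href)
        else:
            script_driven.append(element)
    return navigable, script_driven
```

With a live driver, the candidates could come from something like `driver.find_elements(By.CSS_SELECTOR, "a, [role='button']")`; the span above would end up in the second list.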

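So what I am not sure about is what can be done in this situation. It is important to identify links that are heavily JavaScript/AJAX-backed: such an element has no URL to follow, but clicking it opens a modal window, a dialog, or something similar.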

1 Answer

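You may need to find the design pattern the site uses for links. For example, a link may be an anchor tag, or, as in your case, a span.

It depends on how the web page is designed, that is, on how the developers mark up the HTML elements with attributes and identifiers.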

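For example: if the developers decided to use a common class value for all links that are not anchor tags, it is easy to identify all of those elements.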
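As an illustration of that idea, the sketch below builds one CSS selector covering ordinary anchors plus elements that share a known class or ARIA role. The helper name `link_like_selector` is hypothetical, and whether a class such as `fK1S1c` (taken from the question) is really shared by all such elements, or stays stable over time, is an assumption that has to be checked against the actual page.

```python
def link_like_selector(common_classes=(), roles=("button", "link")):
    """Build a CSS selector for anchors and anchor-less 'links'."""
    parts = ["a[href]"]
    # Elements the developers tagged with a shared class.
    parts += [f".{name}" for name in common_classes]
    # Elements exposing a link-like ARIA role.
    parts += [f"[role='{role}']" for role in roles]
    return ", ".join(parts)
```

The result can be passed to `driver.find_elements(By.CSS_SELECTOR, link_like_selector(("fK1S1c",)))`.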

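You can also write a script that collects all elements with the expected tag name (for example, span) and tries clicking each one, while inspecting the back-end responses or logs. If a click produces additional responses or log entries, there is code behind that element, which tells us it is not a static element.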
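One possible way to try this, sketched under several assumptions: the browser is Chrome, and performance logging was enabled when the driver was created (for example with `options.set_capability("goog:loggingPrefs", {"performance": "ALL"})`), so that `driver.get_log("performance")` returns DevTools events as JSON strings. The function names are hypothetical.

```python
import json


def network_methods(log_entries):
    """Extract DevTools method names from Chrome 'performance' log entries."""
    methods = []
    for entry in log_entries:
        # Each entry wraps the DevTools event as a JSON string.
        message = json.loads(entry["message"])["message"]
        methods.append(message["method"])
    return methods


def click_triggers_activity(driver, element):
    """Click `element`; report whether the URL changed or a request was sent."""
    url_before = driver.current_url
    driver.get_log("performance")  # discard entries recorded before the click
    element.click()
    methods = network_methods(driver.get_log("performance"))
    return (driver.current_url != url_before
            or "Network.requestWillBeSent" in methods)
```

In practice a short wait after the click may be needed before reading the log, since requests can fire asynchronously. A dialog that opens purely client-side sends no request, so a `False` result does not prove the element is static.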

