
Python Selenium: How to get an anchor tag's href value only if the anchor tag contains a certain attribute value

I want to get GitHub repository links from GitHub search results. Right now, my code gets the links for both the username and the repository. How do I get only the repository links by targeting the anchor tag's attribute values?

My code:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time

path = r"C:\programs\chromedriver.exe"  # raw string so the backslashes are not read as escapes
driver = webdriver.Chrome(path)
url = 'https://github.com/topics/flutter-apps'

driver.get(url)

links_list = []

headings = driver.find_elements_by_class_name('f3')


for heading in headings:
    links = heading.find_elements_by_tag_name('a')
    for l in links:
        links_list.append(l.get_attribute('href'))


print(links_list)

This is the HTML I want to get links from.

    <h1 class="f3 text-gray text-normal lh-condensed">
      <a data-hydro-click="{&quot;event_type&quot;:&quot;explore.click&quot;,&quot;payload&quot;:{&quot;click_context&quot;:&quot;REPOSITORY_CARD&quot;,&quot;click_target&quot;:&quot;OWNER&quot;,&quot;click_visual_representation&quot;:&quot;REPOSITORY_OWNER_HEADING&quot;,&quot;actor_id&quot;:49521558,&quot;record_id&quot;:484656,&quot;originating_url&quot;:&quot;https://github.com/topics/ios&quot;,&quot;user_id&quot;:49521558}}"
        data-hydro-click-hmac="7b69680b468dda1b4e10ddab19c8034fd4c530bc57957662d8be320d79cc38f1"
        data-ga-click="Explore, go to repository owner, location:explore feed" href="/vsouza">
        vsouza
      </a> /
      <a data-hydro-click="{&quot;event_type&quot;:&quot;explore.click&quot;,&quot;payload&quot;:{&quot;click_context&quot;:&quot;REPOSITORY_CARD&quot;,&quot;click_target&quot;:&quot;REPOSITORY&quot;,&quot;click_visual_representation&quot;:&quot;REPOSITORY_NAME_HEADING&quot;,&quot;actor_id&quot;:49521558,&quot;record_id&quot;:21700699,&quot;originating_url&quot;:&quot;https://github.com/topics/ios&quot;,&quot;user_id&quot;:49521558}}"
        data-hydro-click-hmac="c38ef14c5a72214b8e946bde857c36653301cb96a15a6b1108242526485221b8"
        data-ga-click="Explore, go to repository, location:explore feed" href="/vsouza/awesome-ios" class="text-bold">
        awesome-ios
      </a>
    </h1>

Of the two anchor elements, I want the href value of the one that has this attribute and value: data-ga-click="Explore, go to repository, location:explore feed"

To get that specific link, pass the data-ga-click attribute in your XPath so that only the repository anchor matches.

for heading in headings:
    links = heading.find_elements_by_xpath('.//a[@data-ga-click="Explore, go to repository, location:explore feed"]')
    for l in links:
        links_list.append(l.get_attribute('href'))

Or use a CSS selector.

for heading in headings:
    links = heading.find_elements_by_css_selector('a[data-ga-click="Explore, go to repository, location:explore feed"]')
    for l in links:
        links_list.append(l.get_attribute('href'))
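
The find_elements_by_* helpers used above were removed in Selenium 4 in favor of find_elements(by, value). As a rough sketch of the same idea in that style, the helper below queries the whole page with the CSS selector instead of looping over headings. The names REPO_SELECTOR and collect_repo_links are hypothetical and introduced only for illustration, and the attribute value is taken from the question's HTML, so it will only match while GitHub's markup still carries it.

```python
# Hypothetical constant: the attribute value is copied from the question's HTML.
REPO_SELECTOR = 'a[data-ga-click="Explore, go to repository, location:explore feed"]'


def collect_repo_links(driver):
    """Return the href of every anchor on the current page matching REPO_SELECTOR."""
    # "css selector" is the string value of selenium's By.CSS_SELECTOR;
    # passing By.CSS_SELECTOR here is equivalent.
    anchors = driver.find_elements("css selector", REPO_SELECTOR)
    return [anchor.get_attribute("href") for anchor in anchors]
```

After driver.get(url), calling collect_repo_links(driver) should return a list like the links_list built in the loops above, assuming the page has finished loading.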

If you want only the a tags with that value inside each heading, start the XPath with . so the search is limited to the heading's descendants, and match on the data-ga-click attribute value.

heading.find_elements_by_xpath('.//a[@data-ga-click="Explore, go to repository, location:explore feed"]')
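
To see what the leading . and the attribute predicate do without opening a browser, here is a small sketch that runs the same kind of relative XPath against a trimmed copy of the question's HTML. It swaps Selenium for the standard library's xml.etree.ElementTree, which supports only a subset of XPath. SNIPPET and REPO_XPATH are hypothetical names, and the tracking attributes have been dropped from the markup for brevity.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the question's markup (tracking attributes omitted).
SNIPPET = """
<h1 class="f3">
  <a data-ga-click="Explore, go to repository owner, location:explore feed" href="/vsouza">vsouza</a> /
  <a data-ga-click="Explore, go to repository, location:explore feed" href="/vsouza/awesome-ios">awesome-ios</a>
</h1>
"""

heading = ET.fromstring(SNIPPET)

# ".//a" searches only the descendants of `heading`; the predicate keeps
# anchors whose data-ga-click attribute equals the repository value exactly.
REPO_XPATH = ".//a[@data-ga-click='Explore, go to repository, location:explore feed']"

repo_hrefs = [a.get("href") for a in heading.findall(REPO_XPATH)]
print(repo_hrefs)  # ['/vsouza/awesome-ios']
```

The owner anchor is skipped because its attribute value contains "repository owner", and the predicate requires an exact match.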
