
Python selenium - get text and href

Let's say I have multiple divs looking like this:

<div class="class1">
    <div class="class2">
        <div class="class3">Text1</div>
        <div class="class4">
            <a href="https://somelink"><h2>Text2</h2></a>
            <p class="class5">Text3 <span class="class6"> Text4 </span></p>
        </div>
    </div>
</div>

For each div I can get Text1, Text2, Text3, and Text4:

elements = driver.find_elements_by_xpath("//div[@class='class1']/*")
for e in elements:
    print(e.text)
    print('------------------------------------------')

But how do I additionally get the value of href?

I would like to have: https://somelink, Text1, Text2, Text3, Text4

Why not do this?

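elements = driver.find_elements_by_xpath("//div[@class='class1']/*")
res = []
for e in elements:
    res.append(e.text)
    # e is the "class2" div, which has no href of its own,
    # so read the attribute from the <a> nested inside it.
    for a in e.find_elements_by_xpath(".//a"):
        href = a.get_attribute('href')
        if href is not None:
            res.insert(0, href)
print(", ".join(res))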

Try it like this:

elements = driver.find_elements_by_xpath("//div[@class='class1']/*")  # matches each "class2" div
for e in elements:
    print(e.text)
    # The leading "." scopes the search to e; "//a" matches the link at any depth inside it.
    link = e.find_element_by_xpath(".//a").get_attribute("href")
    print(link)
    print('------------------------------------------')
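
A minimal sketch of the same idea that also tolerates blocks without a link: `find_element` raises `NoSuchElementException` when nothing matches, whereas `find_elements` returns an empty list. The helper name `rows_with_href` is hypothetical. It uses the generic `find_elements(by, value)` call and passes the locator strategy as the plain string `"xpath"` (the value of `By.XPATH`), so it does not depend on the `find_elements_by_*` helpers.

```python
def rows_with_href(driver):
    """Return one 'href, text, text, ...' string per child of div.class1."""
    rows = []
    # "xpath" is the string value of selenium.webdriver.common.by.By.XPATH.
    for block in driver.find_elements("xpath", "//div[@class='class1']/*"):
        # .text holds the rendered text of the block; split it into non-empty lines.
        parts = [line.strip() for line in block.text.splitlines() if line.strip()]
        # find_elements returns [] instead of raising when the block has no <a>.
        anchors = block.find_elements("xpath", ".//a")
        if anchors:
            href = anchors[0].get_attribute("href")
            if href:
                parts.insert(0, href)  # put the link first, as the question asks
        rows.append(", ".join(parts))
    return rows
```

With a live driver this could be used as `print("\n".join(rows_with_href(driver)))`. Because `.text` reflects how the page is rendered, Text3 and Text4 may end up in the same part when they are displayed on one line.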

I think you'll find your answer here: Python Selenium - get href value

Basically, it will look like this:

driver.find_element_by_css_selector('div.class4 > a').get_attribute('href')
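
`find_elements` (plural) returns a list, so collecting the href of every match takes a comprehension rather than a single `get_attribute` call. A short sketch, assuming the same `div.class4 > a` selector; the function name `all_hrefs` is made up for illustration, and `"css selector"` is the string value of `By.CSS_SELECTOR`, passed to the generic `find_elements(by, value)` call because the `find_elements_by_*` helpers were removed in Selenium 4.3.

```python
def all_hrefs(driver, selector="div.class4 > a"):
    """Return the href of every element matched by the CSS selector."""
    # "css selector" is the string behind selenium's By.CSS_SELECTOR constant.
    anchors = driver.find_elements("css selector", selector)
    # get_attribute returns None when the attribute is missing; skip those.
    return [href for href in (a.get_attribute("href") for a in anchors) if href]
```

Calling `all_hrefs(driver)` on a page with several of the blocks from the question should give one link per block, in document order.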


 