
How to get the parent of an element with XPath in Selenium (Python)

I have a script that collects all the images I want from a webpage; I then need the link that encloses each image.

Currently I click on every image, read the current page URL, then go back and continue. This is slow. Each image is wrapped by an a tag, but I don't know how to retrieve that tag. Getting the link from the tag directly would be easier and faster. I attach the HTML code and my Python code!

HTML code

<div class="col-xl col-lg col-md-4 col-sm-6 col-6">
<a href="URL I WANT TO GET ">
<article>
<span class="year">2017</span>
<span class="quality">4K</span>
<span class="imdb">6.7</span>
<img width="190" height="279" src="THE IMAGE URL" class="img-full wp-post-image" alt="" loading="lazy"> <h2>TITLE</h2>
</article>
</a></div>
<div class="col-xl col-lg col-md-4 col-sm-6 col-6">
<a href="URL I WANT TO GET 2">
<article>
<span class="year">2019</span>
<span class="quality">4K</span>
<span class="imdb">8.0</span>
<img width="190" height="279" src="THE IMAGE URL 2" class="img-full wp-post-image" alt="" loading="lazy"> <h2>TITLE</h2>
</article>
</a></div>

Python code

self.driver.get(category_url)
WebDriverWait(self.driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, 'div.archivePaging')))  # a div to see if page is loaded
movies_buttons = self.driver.find_elements(By.CSS_SELECTOR, 'img.img-full.wp-post-image')  # find_elements_by_* was removed in Selenium 4

print("Getting all the links!")
for movie in movies_buttons:
    self.driver.execute_script("arguments[0].scrollIntoView();", movie)
    movie.click()
    WebDriverWait(self.driver, 10).until(EC.visibility_of_element_located((By.CLASS_NAME, 'infoFilmSingle')))
    print(self.driver.current_url)
    self.driver.back()
    WebDriverWait(self.driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, 'div.archivePaging')))
            

Note that this code doesn't work right now, because after navigating I'm calling a movie object that belongs to the old page. That isn't the real problem, though: if I could just read the link, I wouldn't need to change page, so the session wouldn't change.

An example based on what I understand you want to do: you want to get the href of the parent a tag.

Example

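from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Selenium 4 takes the chromedriver path through a Service object
driver = webdriver.Chrome(service=Service(executable_path=r'C:\Program Files\ChromeDriver\chromedriver.exe'))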

html_content = """
<div class="col-xl col-lg col-md-4 col-sm-6 col-6">
    <a href="https://www.link1.de">
        <article>
            <span class="year">2017</span>
            <span class="quality">4K</span>
            <span class="imdb">6.7</span>
            <img width="190" height="279" src="THE IMAGE URL" class="img-full wp-post-image" alt="" loading="lazy"> <h2>TITLE</h2>
        </article>
    </a>
</div>
<div class="col-xl col-lg col-md-4 col-sm-6 col-6">
    <a href="https://www.link2.de">
        <article>
            <span class="year">2019</span>
            <span class="quality">4K</span>
            <span class="imdb">8.0</span>
            <img width="190" height="279" src="THE IMAGE URL 2" class="img-full wp-post-image" alt="" loading="lazy"> <h2>TITLE</h2>
        </article>
    </a>
</div>
"""
driver.get("data:text/html;charset=utf-8,{html_content}".format(html_content=html_content))

Locate the image elements by their class, then walk up the element structure with .. steps; in this case /../.. goes from the img up to article and then to the enclosing a tag.

from selenium.webdriver.common.by import By

# img -> article -> a: two ".." steps reach the enclosing link
a_tags = driver.find_elements(By.XPATH, "//img[contains(@class,'img-full wp-post-image')]/../..")
for a_tag in a_tags:
    print(a_tag.get_attribute('href'))
driver.close()

Output

https://www.link1.de/
https://www.link2.de/
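
A variation on the same idea, as a minimal sketch rather than a definitive implementation: the /../.. path only works while exactly two levels separate the img from its a tag, whereas the XPath ancestor axis picks the nearest enclosing a regardless of depth. The helper name collect_parent_links and the constant LINK_XPATH are hypothetical, and driver is assumed to be a Selenium 4 WebDriver that has already loaded the page. The locator strategy is passed as the plain string "xpath", which is the value of Selenium's By.XPATH.

```python
from typing import List

# For every matching <img>, select its nearest enclosing <a>.
# Unlike "/../..", "ancestor::a[1]" does not depend on how many
# levels sit between the image and the link.
LINK_XPATH = "//img[contains(@class, 'wp-post-image')]/ancestor::a[1]"


def collect_parent_links(driver, xpath: str = LINK_XPATH) -> List[str]:
    """Return the href of each <a> selected by `xpath` (hypothetical helper).

    `driver` is assumed to be a Selenium 4 WebDriver. "xpath" is the
    string value of selenium.webdriver.common.by.By.XPATH.
    """
    links = []
    for anchor in driver.find_elements("xpath", xpath):
        href = anchor.get_attribute("href")
        if href:  # skip anchors that carry no href
            links.append(href)
    return links
```

In the question's script this would be called once after the page-load wait, for example links = collect_parent_links(self.driver), so the click and back cycle (and the stale element problem it causes) is no longer needed.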
