Select an element in Selenium, Python, not by using XPath
I am trying to scrape a website, and I need to select only the ul element inside the div with the class "Slider__SliderWrapper-sc-143uniy-0 jrPmnS". However, there are many div tags with the same class, so the only way to pick out the ul I need is by looking at the href of the a tag inside the h2. I can't use XPath, because the div tags keep changing position.
<div>
<h2><a class="slider-components__SectionLink-sc-1r2bduf-3 jchpWs" href="rightOne">Right!</a></h2>
<div class="Slider__SliderWrapper-sc-143uniy-0 jrPmnS">
<ul class="Slider__List-sc-143uniy-1 MTYOL">
the right ul
</ul>
</div>
</div>
<div>
<h2><a class="slider-components__SectionLink-sc-1r2bduf-3 jchpWs" href="wrongOne">Something else</a></h2>
<div class="Slider__SliderWrapper-sc-143uniy-0 jrPmnS">
<ul class="Slider__List-sc-143uniy-1 MTYOL">
the wrong ul
</ul>
</div>
</div>
I thought about using a CSS selector, but I don't know how. Any help?
You definitely CAN use XPath to match on the href attribute AND its contents:
//a[contains(@href,'rightOne')]
And for the ul:
//h2/a[contains(@href,'rightOne')]/../following-sibling::div/ul
Try this XPath:
//a[@href='rightOne']/../following-sibling::div/ul
Explanation:
A css_selector will not work here, because you are depending on the a tag and first have to traverse upwards in the DOM. We use /.. for that (alternatively you can use /parent::h2), then move to the following sibling div with /following-sibling::div, and finally select its ul child.
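For reference, the XPath above could be used from Selenium's Python bindings roughly as follows. This is only a sketch: the helper names `ul_xpath_for_href` and `find_slider_list` are made up for illustration, and it assumes Selenium 4 with a `driver` that has already loaded the page.

```python
def ul_xpath_for_href(href):
    """Build the XPath shown above for a given href value.

    Assumption: href contains no single quote, since it is placed
    inside a single-quoted XPath string literal.
    """
    return f"//a[@href='{href}']/../following-sibling::div/ul"


def find_slider_list(driver, href):
    """Return the ul element that follows the h2 whose link has this href."""
    # Imported here so the XPath helper above works without Selenium installed.
    from selenium.webdriver.common.by import By

    return driver.find_element(By.XPATH, ul_xpath_for_href(href))


# Usage, assuming a running WebDriver session on the target page:
#   ul = find_slider_list(driver, "rightOne")
#   print(ul.text)
```

Because the expression starts from the a tag and not from a fixed index, it should keep working when the outer div blocks change order.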
You cannot get a parent element with a CSS selector; classic CSS selectors have no way to step from an element up to its parent. See "Is there a CSS parent selector?".
In your case you would need to get the parent of a[href=rightOne] and then get the ul of the following sibling.
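The three steps (link, parent, following sibling's ul) can be sketched without any selector language at all. The example below is an illustration only: it uses the standard library's `xml.etree.ElementTree` instead of Selenium or BeautifulSoup, runs on a well-formed copy of the markup from the question wrapped in a made-up `root` element, and `find_ul_for_href` is a hypothetical helper name. Real pages are often not well-formed XML, so treat it as a demonstration of the traversal and not as a scraper.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the question's markup, wrapped in a root element.
HTML = """
<root>
  <div>
    <h2><a class="slider-components__SectionLink-sc-1r2bduf-3 jchpWs" href="rightOne">Right!</a></h2>
    <div class="Slider__SliderWrapper-sc-143uniy-0 jrPmnS">
      <ul class="Slider__List-sc-143uniy-1 MTYOL">the right ul</ul>
    </div>
  </div>
  <div>
    <h2><a class="slider-components__SectionLink-sc-1r2bduf-3 jchpWs" href="wrongOne">Something else</a></h2>
    <div class="Slider__SliderWrapper-sc-143uniy-0 jrPmnS">
      <ul class="Slider__List-sc-143uniy-1 MTYOL">the wrong ul</ul>
    </div>
  </div>
</root>
"""


def find_ul_for_href(root, href):
    """Find the ul in the first div that follows the parent of a[href=href]."""
    # ElementTree elements have no parent pointer, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for a in root.iter("a"):
        if a.get("href") != href:
            continue
        h2 = parent_of[a]                # step 1: go up to the h2
        siblings = list(parent_of[h2])   # all children of the outer div
        # Step 2: look only at the siblings that come after the h2.
        for sibling in siblings[siblings.index(h2) + 1:]:
            if sibling.tag == "div":
                return sibling.find("ul")  # step 3: the ul child (or None)
    return None


root = ET.fromstring(HTML)
ul = find_ul_for_href(root, "rightOne")
print(ul.text.strip())
```

The lookup depends only on the href, so the order of the outer div blocks does not matter.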
With CSS you could use one of these locators, which select by the position of the outer div rather than by the href:
div:nth-child(1) .Slider__SliderWrapper-sc-143uniy-0.jrPmnS>.Slider__List-sc-143uniy-1.MTYOL
Or
div:nth-child(1) .Slider__SliderWrapper-sc-143uniy-0.jrPmnS>ul
If there are no restrictions on selectors, I would pick any of the XPaths proposed in the other two answers.
But if you are using a library such as BeautifulSoup, you will have to use CSS selectors, as it does not support XPath. In that case, use the ones I proposed.