How to best select the correct element (Python 3, Selenium)

I'm trying to select a particular element from a page using Python 3 and Selenium.

The page consists of a long list (hundreds of items), all formatted like this:

[image]

The HTML for this table looks like this:

[image]

And when I expand the element for the particular item I'm trying to click on, it looks like this (link obscured for privacy):

[image]

So far I have been searching for the element I need using:

titleField = 'Zombie Apocalypse'   
searchBuilder = "//*[contains(text(), '" + titleField + "')]"
searchForBook = browser.find_elements_by_xpath(searchBuilder)
searchForBook[0].click()

which works some of the time. I run into problems when two items have the same name or when the title contains an apostrophe, and sometimes I can't figure out why it didn't work at all.

Is there a better way to select an individual element out of that table than the way I'm using? I will have the title of the item ahead of time, but not the ID number. The ID number is the information I'm trying to scrape.

I'm also fine with the search returning the URL of the item, because the ID number is contained in that URL, so I can just pull it from there. But the title isn't in the URL, so I don't know how to search for it.

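You should quote the text before inserting it into the XPath expression. quoteattr picks a quote character that does not clash with the title, so a title containing an apostrophe still produces a valid XPath string literal. Note that the string returned by quoteattr already includes the surrounding ' or ", so do not add your own. Because quoteattr is meant for XML attributes, it also replaces &, < and > with entities, which XPath does not decode.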

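from xml.sax.saxutils import quoteattr

# quoteattr returns the title already wrapped in ' or ", chosen so it does not clash with the title
# Caveat: it also turns &, < and > into XML entities, which XPath will not decode
titleField = quoteattr('Zombie Apocalypse')
searchBuilder = "//*[contains(text(), " + titleField + ")]"
# Selenium 4 spells this call browser.find_elements(By.XPATH, searchBuilder)
searchForBook = browser.find_elements_by_xpath(searchBuilder)
searchForBook[0].click()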
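
For titles that contain both ' and ", or characters such as &, one common alternative is to build the XPath string literal by hand with concat(), since XPath 1.0 has no escape syntax inside string literals. The following is only a sketch of that idea: xpath_literal and title_xpath are hypothetical helper names, and title_xpath assumes the title is the visible text of a link (an a element), which the screenshots in the question do not confirm.

```python
def xpath_literal(text: str) -> str:
    """Return text as an XPath 1.0 string literal (hypothetical helper)."""
    if "'" not in text:
        return "'" + text + "'"
    if '"' not in text:
        return '"' + text + '"'
    # Both quote characters are present: split on the apostrophe and
    # glue the single-quoted pieces back together with concat(),
    # using a double-quoted apostrophe as the separator.
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join("'" + p + "'" for p in parts) + ")"


def title_xpath(title: str) -> str:
    """Build an exact-match XPath for a link whose text is the title.

    Assumption: the title is the text of an <a> element. Adjust the
    element name to whatever the real page uses.
    """
    return "//a[normalize-space(.) = " + xpath_literal(title) + "]"


if __name__ == "__main__":
    for title in ("Zombie Apocalypse", "Zombie's Apocalypse", 'It\'s "Z" day'):
        print(title_xpath(title))
```

With Selenium, the result would be passed to the same call as above, for example browser.find_elements_by_xpath(title_xpath('Zombie Apocalypse')). Calling get_attribute('href') on each match returns the link URL that holds the ID, and looping over all matches instead of taking [0] makes duplicate titles visible. The exact match on normalize-space(.) may also help with the unexplained misses, because contains(text(), ...) only inspects the first text node of each element in XPath 1.0.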
