XPath: Select specific item from property string
I'm trying to drill down to the XPath of a URL that sits inside a longer string. I've gotten down to each of the listed blocks, but can't seem to get any further than the long string of properties.
Example code:
<div class="abc class">
<a class="123" title="abc" keys="xyz" href="url string">
Right now I have:
.//*[@id='content']/div/div[1]/a
That only retrieves the whole string of data, from class through href. What would I need to retrieve just the "url string" from that part? Would this need to be accomplished with a subsequent 'for' loop in the Python code?
A pure XPath solution would involve just adding @href to the expression:
.//*[@id='content']/div/div[1]/a/@href
In Python, assuming you are using lxml.html, you can get the attribute from each matched element using .attrib:
for link in root.xpath(".//*[@id='content']/div/div[1]/a"):
    print(link.attrib['href'])
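If it helps to try this without installing anything, here is a minimal sketch using the standard library's xml.etree.ElementTree as a stand-in for lxml.html. The surrounding markup (the html/body wrapper, the id="content" container, the closing tags, and the link text) is assumed for illustration; only the inner div and a tags come from the question. ElementTree supports only a subset of XPath and cannot evaluate a trailing /@href, so the attribute is read from each matched element instead:

```python
import xml.etree.ElementTree as ET

# Assumed page structure: only the inner <div> and <a> come from the question.
PAGE = """<html><body>
<div id="content"><div>
<div class="abc class">
<a class="123" title="abc" keys="xyz" href="url string">link</a>
</div>
</div></div>
</body></html>"""

root = ET.fromstring(PAGE)

# Select the <a> elements, then read the href attribute from each one.
links = root.findall(".//*[@id='content']/div/div[1]/a")
hrefs = [link.get("href") for link in links]
print(hrefs)  # ['url string']
```

Note that ElementTree needs well-formed XML, so for real-world HTML a forgiving parser such as lxml.html is likely the better fit.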
Try to avoid the positional index (div[1]). If your class name is unique, you can match on it instead:
//*[@id='content']/div/div[@class='abc class']/a[@keys='xyz']/@href
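To illustrate why the attribute match can hold up better than the index, here is a rough sketch, again with the standard library's xml.etree.ElementTree rather than lxml. The page below is hypothetical: it adds a made-up "banner" div ahead of the target block. ElementTree does not accept an absolute path on an element, so the expressions start with .// here, and the attribute is read with .get because /@href is not supported:

```python
import xml.etree.ElementTree as ET

# Hypothetical page: an extra "banner" <div> now precedes the target block.
PAGE = """<html><body>
<div id="content"><div>
<div class="banner"><a href="ad link">ad</a></div>
<div class="abc class">
<a class="123" title="abc" keys="xyz" href="url string">link</a>
</div>
</div></div>
</body></html>"""

root = ET.fromstring(PAGE)

# The positional index now lands on the banner block.
by_position = root.findall(".//*[@id='content']/div/div[1]/a")
# Matching on the class and keys attributes still finds the intended link.
by_attribute = root.findall(
    ".//*[@id='content']/div/div[@class='abc class']/a[@keys='xyz']"
)

positional_hrefs = [a.get("href") for a in by_position]
attribute_hrefs = [a.get("href") for a in by_attribute]
print(positional_hrefs)  # ['ad link']
print(attribute_hrefs)   # ['url string']
```

Keep in mind that @class='abc class' is an exact string comparison, so it stops matching if the class list changes order or gains another class.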
Hope it helps :)