
XPath: Using string functions on a node-set?

I am wondering whether it is possible to use string functions on a node-set, specifically the substring() function in XPath 1.0.

The page I am scraping details from has a node-set of 5 URLs, returned by the following XPath location path:

//div[@class='titles cf']/a[not(contains(div,'Sold'))]/@href

Unfortunately, the URLs are in '//www.example.com' format, and I need them in 'www.example.com' format (without the leading slashes). I have tried:

substring(//div[@class='example example-1']/a[not(contains(div,'Sold'))]/@href, 3)

However, this only returns 1 result. I need all 5 returned without the leading slashes. My guess is that you cannot use this kind of string function on a node-set, but I am hoping someone can shed light on this and help me achieve the desired result.

If there are alternative methods of achieving the same result, I'm all ears.

Thanks

Pure XPath 1.0 has no way to apply a function to each item of an arbitrary node-set. When a string function such as substring() is given a node-set, only the first node in document order is converted to a string, which is why you get a single result. You will need to drop down to the host language you use XPath with, process the nodes separately, and call substring() on each one from there, e.g. in XSLT:

<xsl:for-each select="//div[@class='titles cf']/a[not(contains(div,'Sold'))]/@href"><xsl:value-of select="substring(., 3)"/></xsl:for-each>
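If the host language happens to be Python, the same per-node loop could look roughly like the sketch below. It uses only the standard library's xml.etree.ElementTree, which supports just a subset of XPath, so the not(contains(div,'Sold')) test is re-implemented in Python rather than evaluated as XPath. The sample markup and the helper names (first_div_text, unsold_hrefs) are made up for illustration and are not taken from the page in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup standing in for the scraped page.
SAMPLE = """
<html><body>
  <div class="titles cf">
    <a href="//www.example.com/item-1"><div>Available</div></a>
    <a href="//www.example.com/item-2"><div>Sold</div></a>
    <a href="//www.example.com/item-3"><div>Available</div></a>
  </div>
</body></html>
""".strip()


def first_div_text(anchor):
    """String value of the first <div> child, or '' if there is none.

    This mirrors how XPath 1.0 converts the node-set `div` to a string
    inside contains(div, 'Sold').
    """
    div = anchor.find("div")
    return "" if div is None else "".join(div.itertext())


def unsold_hrefs(root):
    """Collect href values of unsold links, dropping the leading '//'."""
    # ElementTree handles the attribute predicate and the child step,
    # but not not(), contains() or a trailing /@href step.
    anchors = root.findall(".//div[@class='titles cf']/a")
    return [
        a.get("href")[2:]  # same effect as substring(., 3)
        for a in anchors
        if a.get("href") is not None and "Sold" not in first_div_text(a)
    ]


root = ET.fromstring(SAMPLE)
hrefs = unsold_hrefs(root)
print(hrefs)
```

A library with full XPath 1.0 support would let you keep the original location path and only do the substring step in the loop. The point is the same either way: the iteration happens in the host language, not in XPath.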

In XPath 2.0 and later you can either use the function call as the last step, e.g.

//div[@class='titles cf']/a[not(contains(div,'Sold'))]/@href/substring(., 3)

or use a for ... return expression, e.g.

for $href in //div[@class='titles cf']/a[not(contains(div,'Sold'))]/@href return substring($href, 3)

but there is no such option in pure XPath 1.0.

Of course, if you know you have exactly five items, then depending on how you use XPath (the host language or the tool) you might be able to use five separate expressions, one per position, e.g.

substring((//div[@class='titles cf']/a[not(contains(div,'Sold'))]/@href)[1], 3)
substring((//div[@class='titles cf']/a[not(contains(div,'Sold'))]/@href)[2], 3)

and so on up to [5].
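If the tool only accepts XPath 1.0 expressions that return a single string, the per-position expressions could be generated rather than typed by hand. The following is a minimal Python sketch of that idea. The helper name indexed_substring_exprs is hypothetical, and the count of 5 comes from the question, so it has to be known in advance.

```python
# Location path from the question.
PATH = "//div[@class='titles cf']/a[not(contains(div,'Sold'))]/@href"


def indexed_substring_exprs(path, count, start=3):
    """Build one XPath 1.0 expression per position: (path)[1] .. (path)[count]."""
    # XPath positions are 1-based, hence range(1, count + 1).
    return [f"substring(({path})[{i}], {start})" for i in range(1, count + 1)]


exprs = indexed_substring_exprs(PATH, 5)
for expr in exprs:
    print(expr)
```

Each generated string is then evaluated separately by whatever XPath 1.0 engine the tool provides.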
