Extract date from text inside html tags using XPATH
Extract date inside html tag using xpath substring
I have tried using the substring functions in XPath on this element:
<span id="latestReplyLine"><a href="#comment-965609" class="lastScroll js-latest-reply">Latest reply</a> on May 22, 2019 by John Stoltzfus</span>
I am using the XPath query below to extract the text:
/span[@id="latestReplyLine"]/text()[substring-after(substring-before(.,' by '), ' on ')]
Expected result:
"May 22, 2019"
But I am getting:
"on May 22, 2019 by John Stoltzfus"
Any idea?
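You were missing the right string by one space (' on ' instead of 'on'). Note also that a predicate in square brackets only filters nodes: a non-empty string counts as true, so the original query returns the whole text node rather than the extracted substring.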
An improved XPath expression is the following:
normalize-space(substring-after(substring-before(string(/span[@id='latestReplyLine']),'by'), 'on'))
This will give you the right result.
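
If it helps to see the evaluation step by step, here is a rough Python sketch of what the expression computes. It is not an XPath engine: the helpers substring_before, substring_after and normalize_space are hypothetical names written here to approximate the XPath 1.0 string functions, and the markup is parsed with the standard library's ElementTree on the assumption that the snippet is well-formed and the span is the root element.

```python
import xml.etree.ElementTree as ET

# The markup from the question, assumed to be well-formed XML.
HTML = ('<span id="latestReplyLine"><a href="#comment-965609" '
        'class="lastScroll js-latest-reply">Latest reply</a>'
        ' on May 22, 2019 by John Stoltzfus</span>')


def substring_before(s, sep):
    # Approximates XPath substring-before(): "" when sep is not found.
    # Assumes sep is non-empty.
    head, found, _ = s.partition(sep)
    return head if found else ""


def substring_after(s, sep):
    # Approximates XPath substring-after(): "" when sep is not found.
    # Assumes sep is non-empty.
    _, found, tail = s.partition(sep)
    return tail if found else ""


def normalize_space(s):
    # Approximates XPath normalize-space(): trim the ends and collapse
    # internal runs of whitespace to a single space.
    return " ".join(s.split())


span = ET.fromstring(HTML)

# string(/span[...]) is the concatenation of all descendant text.
text = "".join(span.itertext())
date = normalize_space(substring_after(substring_before(text, "by"), "on"))
print(date)

# The original query used the same functions inside a predicate. The
# predicate value is a non-empty string, which XPath treats as true, so
# the text node passes the filter and is returned unchanged.
text_node = span.find("a").tail
predicate_value = substring_after(substring_before(text_node, " by "), " on ")
selected = text_node if predicate_value else None
print(repr(selected))
```

The first print shows the value produced when the string functions wrap the whole expression; the second shows that using them as a predicate leaves the text node as it was, which matches the output in the question.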