
XQuery: parsing text with <a> tags

I am using XQuery to extract content from HTML pages. The HTML body has this kind of structure:

 <td>
      <a href ="hw1">xyz </a>
          Hello world 1 
        <a href="hw2">Helloworld 2</a>
          Helloworld 3         
 </td>

My XQuery expression for extracting the text is as follows:

  //a[starts-with(@href,'hw1')]/following-sibling::text()

This expression gives me:

Helloworld 1 Helloworld 2 Helloworld 3 Helloworld 1 Helloworld 2 Helloworld 3

I would like the output in this form instead: Helloworld 1 Helloworld 2 Helloworld 3, or Helloworld 1 Helloworld 3.

How do I specify that the text enclosed by the tags should be parsed as well?

I'm not really clear on what you're looking for, but

let $content := 
 <td>
      <a href ="hw1">xyz </a>
          Hello world 1 
        <a href="hw2">Helloworld 2</a>
          Helloworld 3         
 </td>

return $content/text()

gives you the text nodes directly under the <td>. I don't see a difference between what you're getting and what you want... perhaps your post lost some formatting?
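
If the goal is one of the two outputs listed in the question, the following is a minimal sketch of how each could be produced. It assumes the same <td> fragment as above. The helper local:join-text is a hypothetical name introduced only for this illustration; it trims each node's string value and skips whitespace-only nodes before joining.

```xquery
(: Hypothetical helper, not part of the original post: joins the
   whitespace-normalized string values of the given nodes with a
   single space, skipping nodes that contain only whitespace. :)
declare function local:join-text($nodes as node()*) as xs:string {
  string-join(
    for $n in $nodes
    let $s := normalize-space($n)
    where $s ne ''
    return $s,
    ' ')
};

let $content :=
  <td>
    <a href="hw1">xyz </a>
    Hello world 1
    <a href="hw2">Helloworld 2</a>
    Helloworld 3
  </td>
let $start := $content/a[starts-with(@href, 'hw1')]
return (
  (: only the text nodes that follow the hw1 link as siblings;
     the text inside the hw2 link is not included :)
  local:join-text($start/following-sibling::text()),

  (: every following sibling node, so the string value of the
     hw2 link is included as well :)
  local:join-text($start/following-sibling::node())
)
```

The difference is only the node test: following-sibling::text() selects text nodes that are siblings of the matched <a>, while following-sibling::node() also selects the later <a> element, whose string value is the text it encloses.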
