XPath - Difference between node() and text()

I'm having trouble understanding the difference between text() and node(). From what I understand, text() would be whatever is between the tags, so for <item>apple</item> it is apple. node() would be whatever that node actually is, which would be item.

But then I've been assigned some work where one question asks me to "Select the text of all items under produce" and a separate question asks "Select all the manager nodes in all departments".

How is the output supposed to look for text() as opposed to node()?

Snippet of XML:

<produce>
 <item>apple</item>
 <item>banana</item>
 <item>pepper</item>
</produce>

<department>
 <phone>123-456-7891</phone>
 <manager>John</manager>
</department>

Of course, there are more departments and more managers, but this is just a snippet.

Any help would be much appreciated!

text() and node() are node tests, in XPath terminology.

Node tests operate on a set of nodes (on an axis, to be exact) and return the ones that are of a certain type. When no axis is mentioned, the child axis is assumed by default.

There are all kinds of node tests:

  • node() matches any node (the least specific node test of them all)
  • text() matches text nodes only
  • comment() matches comment nodes
  • * matches any element node
  • foo matches any element node named "foo"
  • processing-instruction() matches PI nodes (they look like <?name value?>).
  • Side note: The * also matches attribute nodes, but only along the attribute axis. @* is a shorthand for attribute::*. Attributes are not part of the child axis, which is why a normal * does not select them.

This XML document:

<produce>
    <item>apple</item>
    <item>banana</item>
    <item>pepper</item>
</produce>

represents the following DOM (simplified):

root node
   element node (name="produce")
      text node (value="\n    ")
      element node (name="item")
         text node (value="apple")
      text node (value="\n    ")
      element node (name="item")
         text node (value="banana")
      text node (value="\n    ")
      element node (name="item")
         text node (value="pepper")
      text node (value="\n")
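To check the tree above yourself, here is a minimal sketch using Python's standard xml.dom.minidom module. It is not an XPath engine; the helper names describe and dump are made up for this illustration, and the output format only imitates the listing above.

```python
from xml.dom import minidom

XML = """<produce>
    <item>apple</item>
    <item>banana</item>
    <item>pepper</item>
</produce>"""

doc = minidom.parseString(XML)   # the Document object plays the role of the root node
produce = doc.documentElement    # the document element <produce>


def describe(node):
    """Return a one-line description of a DOM node (hypothetical helper)."""
    if node.nodeType == node.TEXT_NODE:
        return f"text node (value={node.data!r})"
    if node.nodeType == node.ELEMENT_NODE:
        return f'element node (name="{node.tagName}")'
    return f"other node (type={node.nodeType})"


def dump(node, depth=0):
    """Collect indented descriptions of a node and all of its descendants."""
    lines = ["   " * depth + describe(node)]
    for child in node.childNodes:
        lines.extend(dump(child, depth + 1))
    return lines


tree_lines = dump(produce)
print("\n".join(tree_lines))
```

The whitespace-only text nodes show up between the item elements because the parser keeps the line breaks and indentation of the source document.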

So with XPath:

  • / selects the root node
  • /produce selects a child element of the root node if it has the name "produce". (This is called the document element: the single top-level element of the document. The root node, by contrast, represents the document itself. Document element and root node are often confused, but they are not the same thing.)
  • /produce/node() selects any type of child node beneath /produce (i.e. all 7 children)
  • /produce/text() selects the 4 (!) whitespace-only text nodes
  • /produce/item[1] selects the first child element named "item"
  • /produce/item[1]/text() selects all child text nodes (there's only one, "apple", in this case)

And so on.
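The counts in the list above can be reproduced with a rough emulation of the node tests. Python's standard library has no full XPath engine, so this sketch filters minidom child nodes by type instead. The functions child_nodes, child_text_nodes and child_elements are hypothetical helpers, not part of any XPath API.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<produce>\n"
    "    <item>apple</item>\n"
    "    <item>banana</item>\n"
    "    <item>pepper</item>\n"
    "</produce>"
)
produce = doc.documentElement


def child_nodes(parent):
    """Rough equivalent of the node() test on the child axis."""
    return list(parent.childNodes)


def child_text_nodes(parent):
    """Rough equivalent of the text() test on the child axis."""
    return [n for n in parent.childNodes if n.nodeType == n.TEXT_NODE]


def child_elements(parent, name=None):
    """Rough equivalent of * (name=None) or of a name test such as item."""
    return [
        n for n in parent.childNodes
        if n.nodeType == n.ELEMENT_NODE and name in (None, n.tagName)
    ]


all_children = child_nodes(produce)               # like /produce/node()
whitespace = child_text_nodes(produce)            # like /produce/text()
first_item = child_elements(produce, "item")[0]   # like /produce/item[1]
first_text = child_text_nodes(first_item)         # like /produce/item[1]/text()

print(len(all_children), len(whitespace), first_text[0].data)
```

With a real XPath processor you would pass the expressions as strings; the point here is only that node() keeps every child while text() keeps the text nodes alone.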

So, your questions:

  • "Select the text of all items under produce": /produce/item/text() (3 nodes selected)
  • "Select all the manager nodes in all departments": //department/manager (1 node selected)

Notes

  • The default axis in XPath is the child axis. You can change the axis by prefixing a different axis name. For example: //item/ancestor::produce
  • Element nodes have text values. When you evaluate an element node as a string, its textual contents are returned. In this example, /produce/item[1]/text() and string(/produce/item[1]) yield the same string.
  • Also see this answer where I outline the individual parts of an XPath expression graphically.
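The first two notes can be illustrated with a small sketch, again on top of minidom rather than a real XPath engine. string_value and ancestors are hypothetical helpers that imitate string() and the ancestor axis, and the document is shortened to two items.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<produce><item>apple</item><item>banana</item></produce>"
)
first_item = doc.getElementsByTagName("item")[0]


def string_value(node):
    """Concatenate all descendant text nodes, like XPath string() on an element."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(string_value(child) for child in node.childNodes)


def ancestors(node, name):
    """Walk parentNode links, like the ancestor:: axis combined with a name test."""
    found = []
    parent = node.parentNode
    while parent is not None:
        if parent.nodeType == parent.ELEMENT_NODE and parent.tagName == name:
            found.append(parent)
        parent = parent.parentNode
    return found


item_string = string_value(first_item)              # like string(/produce/item[1])
produce_string = string_value(doc.documentElement)  # like string(/produce)
produce_ancestors = ancestors(first_item, "produce")

print(item_string, produce_string, [a.tagName for a in produce_ancestors])
```

For an element with a single text child the string value and the text node carry the same characters; for an element with nested elements the string value is the concatenation of all descendant text.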

For me, the difference mattered a lot when I faced this scenario (here is my story):

<?xml version="1.0" encoding="UTF-8"?>
<sentence id="S1.6">When U937 cells were infected with HIV-1, 
        
    <xcope id="X1.6.3">
        <cue ref="X1.6.3" type="negation">no</cue> 
                        
                        induction of NF-KB factor was detected
        
    </xcope>
                    
, whereas high level of progeny virions was produced, 
        
    <xcope id="X1.6.2">
        <cue ref="X1.6.2" type="speculation">suggesting</cue> that this factor was 
        <xcope id="X1.6.1">
            <cue ref="X1.6.1" type="negation">not</cue> required for viral replication
        </xcope>
    </xcope>.

</sentence>

I needed to extract the text between the tags and aggregate (by concatenation) the text contained in the inner tags.

/node() did the job, while /text() did only half of it.

/text() only returned the text that is not inside inner tags, because inner tags are not "text nodes". You may think, "just extract the text inside the inner tags with an additional XPath". However, it then becomes challenging to put the text back in its original order, because you don't know where to place the aggregated text from the inner nodes.

  1. When U937 cells were infected with HIV-1,
  2. no induction of NF-KB factor was detected
  3. , whereas high level of progeny virions was produced,
  4. suggesting that this factor was not required for viral replication
  5. .

Finally, /node() did exactly what I wanted, because it also returns the inner elements, in document order, so the text from the inner tags is included.
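The difference can be sketched with Python's minidom on an abbreviated version of the sentence (ids and the deeper nesting are dropped here, which is my simplification). The helper string_value is hypothetical and imitates the XPath string value of a node. The list comprehensions stand in for /sentence/text() and /sentence/node(); they are not real XPath calls.

```python
from xml.dom import minidom

# Abbreviated version of the sentence above (ids and some nesting dropped).
XML = (
    "<sentence>When U937 cells were infected with HIV-1, "
    '<xcope><cue type="negation">no</cue>'
    " induction of NF-KB factor was detected</xcope>"
    ", whereas high level of progeny virions was produced."
    "</sentence>"
)
sentence = minidom.parseString(XML).documentElement


def string_value(node):
    """Concatenate all descendant text nodes of a node."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(string_value(child) for child in node.childNodes)


# Like /sentence/text(): direct text children only, the <xcope> content is lost.
text_only = [n.data for n in sentence.childNodes if n.nodeType == n.TEXT_NODE]

# Like /sentence/node(): every child in document order, elements included.
all_parts = [string_value(n) for n in sentence.childNodes]

print(text_only)
print(all_parts)
print("".join(all_parts))
```

Because the child nodes come back in document order, joining their string values rebuilds the sentence without any guessing about where the inner text belongs.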
