
XPath: how to get text from descendant nodes

I have something like this:

<div id="m0">
...
 <tr>
  <td></td>
  <td></td>
  <td>Radio</td>
 </tr>
</div>

<div id="m1">
...
<tr>
  <td></td>
  <td></td>
  <td> 
    <a>TV channel</a>
    <font color="#555555">...</font>
  </td>
</tr>
<tr>
  <td></td>
  <td></td>
  <td>
     <i> </i>
  </td>
</tr>
<tr>
  <td></td>
  <td></td>
  <td> 
     <i> Other channel </i>
  </td>
</tr>

I want to get this as the result: ['Radio', 'TV channel', ' ', 'Other channel']

I have tried: ch_nodes=tree.xpath('//div[@id="%s"]/table[@class= "fl"]/tr/td[3]/descendant-or-self::*'%div)

After that, I get the text of each node, but this also returns nodes I do not want, such as the <font> content.

I have also tried this: ch_nodes=tree.xpath('//div[@id="%s"]/table[@class= "fl"]/tr/td[3]/descendant-or-self::*[2]'%div) but it does not return the td's own content when the td has no child nodes.

How can I get ['Radio', 'TV channel', ' ', 'Other channel']?

To get the first text node from each tr:

$x("//table//tr//*[1]/text()")

If you want the first non-empty text node from each tr:

$x("//table//tr//*[boolean(string-length(normalize-space(text())))][1]/text()")
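For the Python side of the question, the same idea can be sketched with the standard library's xml.etree.ElementTree instead of lxml. This is only an illustration under stated assumptions: the sample markup is wrapped in a hypothetical <root> element, the part elided as "..." is assumed to be the table with class "fl" that the question's XPath refers to, and the helper names third_cell_text and channel_names are made up for this example. For each third td it takes the cell's own text if that is non-blank, and otherwise the text of the first child element, so the <font> content is never picked up.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the markup in the question. The <root> wrapper and
# the <table class="fl"> elements are assumptions, not part of the original sample.
SAMPLE = """
<root>
<div id="m0">
 <table class="fl">
  <tr><td></td><td></td><td>Radio</td></tr>
 </table>
</div>
<div id="m1">
 <table class="fl">
  <tr><td></td><td></td><td>
    <a>TV channel</a>
    <font color="#555555">...</font>
  </td></tr>
  <tr><td></td><td></td><td>
    <i> </i>
  </td></tr>
  <tr><td></td><td></td><td>
    <i> Other channel </i>
  </td></tr>
 </table>
</div>
</root>
"""


def third_cell_text(td):
    """Return the cell's own text if non-blank, else the first child's text."""
    own = (td.text or "").strip()
    if own:
        return own
    first_child = next(iter(td), None)  # first child element, if any
    if first_child is None:
        return ""
    text = first_child.text or ""
    # Strip surrounding whitespace, but keep a whitespace-only value as-is.
    return text.strip() or text


def channel_names(root, div_id):
    """Collect one value per row from the third td of the given div's table."""
    path = ".//div[@id='%s']/table[@class='fl']/tr/td[3]" % div_id
    return [third_cell_text(td) for td in root.findall(path)]


root = ET.fromstring(SAMPLE)
names = channel_names(root, "m0") + channel_names(root, "m1")
print(names)
```

Doing the "own text or first child" choice in Python rather than in a single XPath expression keeps the path simple (it stops at td[3]) and makes the fallback rule explicit. With lxml, the same td-level path could presumably be passed to tree.xpath and the per-cell helper reused, since lxml elements expose .text and iteration in the same way.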


 