
Extracting text from HTML using XPath

Here's a bit of an awkward one (I think). I have the following HTML and am trying to extract the words London and Paris using XPath.

<h2 class="results-title">
    <span class="light">Flights from </span> London
    <span class="light">to</span> Paris    
</h2>

The nearest I can get is with the following:

//h2[@class='results-title']//span

This gives the following results:

Flights from to

Any suggestions greatly appreciated.

I am not sure if this will always work in a larger context, but it works for the given toy problem:

//h2/text()

This selects the text nodes that are direct children of the h2 (the text outside the span elements), which contain London and Paris.
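To see which text nodes that expression picks up, here is a minimal Python sketch. It is an illustration under assumptions, not the answerer's own code: the standard library's xml.etree.ElementTree only supports a subset of XPath and cannot evaluate text() directly, so the sketch locates the h2 with a path expression and then reads the same direct-child text nodes through the .text and .tail attributes. The <body> wrapper and the helper name direct_text_nodes are hypothetical, added only for illustration.

```python
import xml.etree.ElementTree as ET

# The snippet from the question, wrapped in a <body> element (an assumption)
# so that the h2 has to be located with a path expression.
SNIPPET = """<body>
<h2 class="results-title">
    <span class="light">Flights from </span> London
    <span class="light">to</span> Paris
</h2>
</body>"""


def direct_text_nodes(element):
    """Return the non-blank text nodes that are direct children of element.

    Hypothetical helper: it mirrors what text() selects on the element.
    """
    # element.text is the text before the first child; each child's .tail
    # is the text that follows that child inside the parent.
    pieces = [element.text] + [child.tail for child in element]
    return [p.strip() for p in pieces if p and p.strip()]


root = ET.fromstring(SNIPPET)
# ElementTree supports attribute predicates such as [@class='...'].
h2 = root.find(".//h2[@class='results-title']")
cities = direct_text_nodes(h2)
print(cities)  # ['London', 'Paris']
```

The raw text nodes include surrounding whitespace and a whitespace-only node before the first span, which is why the sketch strips and filters them. With a full XPath 1.0 engine, the same filtering can be expressed in the query itself, for example //h2[@class='results-title']/text()[normalize-space()].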
