Python/XPath: get instances of text in arbitrary elements
Given the following:
<table>
<tr>
<td>
<div>Text 1</div>
</td>
<td>
Text 2
</td>
<td>
<div>
<a href="#">Text 3</a>
</div>
</td>
</tr>
<tr>
...
</tr>
</table>
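Given the table above, how would I extract all of the text? Note that the number of nested elements is arbitrary, so I can't simply look at the zeroth, first, and second siblings.
I'm looking for a general way to extract the text.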
In [1]: d="""<table>
...: <tr>
...: <td>
...: <div>Text 1</div>
...: </td>
...: <td>
...: Text 2
...: </td>
...: <td>
...: <div>
...: <a href="#">Text 3</a>
...: </div>
...: </td>
...: </tr>
...: <tr>
...: ...
...: </tr>
...: </table>"""
In [3]: from lxml import etree
In [4]: f = etree.HTML(d)
In [5]: f.xpath('normalize-space(string(/table))')
Out[5]: ''
In [6]: f.xpath('normalize-space(string(//table))')
Out[6]: 'Text 1 Text 2 Text 3 ...'
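The empty result for `/table` in the session above is most likely because `etree.HTML` wraps the fragment in `<html><body>`, so `<table>` is no longer the document root and only `//table` matches. If the goal is a list of the individual text pieces rather than one joined string, a minimal sketch could select the text nodes directly with `//table//text()`. The variable names and the shortened sample markup here are illustrative, not from the original post.

```python
from lxml import etree

html = """<table>
<tr>
<td><div>Text 1</div></td>
<td>
Text 2
</td>
<td><div><a href="#">Text 3</a></div></td>
</tr>
</table>"""

root = etree.HTML(html)

# etree.HTML adds <html><body> around the fragment, so the root is <html>.
print(root.tag)

# //table//text() selects every text node under the table, at any depth.
# Whitespace-only nodes (line breaks between tags) are filtered out.
texts = [t.strip() for t in root.xpath('//table//text()') if t.strip()]
print(texts)
```

This works regardless of how deeply each piece of text is nested, since `//` descends through any number of intermediate elements.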
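I would use the following (when the document is parsed with `etree.HTML`, which wraps the fragment in `<html><body>`, use `//table` instead of `/table`):
normalize-space(string(/table))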
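For comparison, the absolute path `/table` does match when `<table>` really is the document root. A small sketch, assuming the markup is well-formed XML and is parsed with `etree.fromstring` rather than the HTML parser (the sample string and variable names are illustrative):

```python
from lxml import etree

xml = """<table>
<tr>
<td><div>Text 1</div></td>
<td>
Text 2
</td>
<td><div><a href="#">Text 3</a></div></td>
</tr>
</table>"""

# Parsed as XML, <table> is the root element, so /table matches it.
root = etree.fromstring(xml)

# string() concatenates all descendant text; normalize-space() trims it
# and collapses each run of whitespace into a single space.
joined = root.xpath('normalize-space(string(/table))')
print(joined)
```

Real-world HTML is often not well-formed XML, in which case the HTML parser with `//table` is the safer choice.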