
In lxml.html, how do I grab the text, children and content of children of a node?

Original post: 2011-08-26 18:49:26 5 2 python / lxml

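I'm using lxml.html in Python. I have an XPath expression that fetches the text of a node, but what I need is all of the text, including the tags of the children and their content. How can I accomplish this?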

2 answers

The text_content method of an Element returns the text of the element, including the text content of its children, without markup.
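To make the difference concrete, here is a minimal sketch. The HTML fragment and variable names are made up for illustration and are not from the original question. It compares the element's own .text, text_content(), and lxml.html.tostring(), which may be closer to what the question asks for, since it keeps the child tags.

```python
import lxml.html

# Hypothetical HTML fragment, made up for illustration.
html = '<div id="post">Hello <b>bold <i>nested</i></b> world</div>'
node = lxml.html.fromstring(html)

# .text holds only the text that comes before the first child element.
direct_text = node.text

# text_content() joins the text of the element and all of its descendants,
# dropping the markup.
all_text = node.text_content()

# tostring() serializes the element with its child tags left in place.
with_markup = lxml.html.tostring(node, encoding="unicode")

print(repr(direct_text))
print(repr(all_text))
print(with_markup)
```

If only the text is needed, text_content() is usually enough; tostring() is the option to reach for when the child tags themselves have to survive.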

I'm not sure which tags you are using, so I made some up.

You can try:

import lxml.html
result = lxml.html.parse(url).xpath("//tr/td/a/text()")  # url: address or path of the HTML document

//tr selects every tr node in the document that matches, no matter where it is in the tree.

You can use this ('//') expression to get hold of the children's tags.
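As a rough sketch of how '//' reaches into child tags, the example below uses a made-up table (the markup and variable names are assumptions, not from the original post). It contrasts /text(), which only sees text directly inside the a element, with //text(), which also descends into its children.

```python
import lxml.html

# Hypothetical markup, made up for illustration.
html = (
    '<table>'
    '<tr><td><a href="/a">first <em>link</em></a></td></tr>'
    '<tr><td><a href="/b">second</a></td></tr>'
    '</table>'
)
doc = lxml.html.fromstring(html)

# /text() returns only the text nodes that are direct children of <a>.
direct = doc.xpath("//tr/td/a/text()")

# //text() also returns text nodes nested inside child tags such as <em>.
nested = doc.xpath("//tr/td/a//text()")

# Alternatively, select the elements and call text_content() on each one.
per_link = [a.text_content() for a in doc.xpath("//tr/td/a")]

print(direct)
print(nested)
print(per_link)
```

Selecting the elements first and then calling text_content() keeps the text grouped per link, whereas //text() returns one flat list of text nodes.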

Related questions

How can I preserve <br> as newlines with lxml.html text_content() or equivalent?

Extract text with lxml.html

pythons lxml.html, grab all at once

How to get text from HTML element by using lxml.html

BeautifulSoup/LXML.html: delete tag and its children if child looks like x

How do I use lxml and python to traverse the <body> of a html document along with its children

How to remove insignificant whitespace in lxml.html?

Python: Injecting HTML content into a tag using `lxml.html`

Search for special HTML characters in text of lxml.html elements

python lxml.html: pull preceding text in html docstring


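Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you reproduce them, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.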

