How can I extract text content only from the root element - java, com.gargoylesoftware.htmlunit.html
I can't find any way to extract text content only from the root element using com.gargoylesoftware.htmlunit.html. Here is an example:
<td>
W 03:10 PM-04:25 PM
<strong>
<br>
Hybrid (50%+ in-person)
</strong>
</td>
I want to extract the text content of the root element ("td" in this case), but the call below also extracts the text content of the child elements, which I don't want:
private void extractTextContent(HtmlElement htmlElement) {
    String content = htmlElement.getTextContent();
    System.out.println(content);
}
Output:
W 03:10 PM-04:25 PMHybrid (50%+ in-person)
Desired output:
W 03:10 PM-04:25 PM
I also tried the asText() method, but that doesn't give me the desired output either. I couldn't find anyone with the same question about com.gargoylesoftware.htmlunit.html. Is there a method that extracts text content only from the root element?
EDIT: Thank you for the answer. I used the same idea of removing the child node to get my desired output. Here is the Java syntax:
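private void extractTextContent(HtmlElement htmlElement) {
    // Remove the last child element (<strong> in the example) so that its
    // text no longer contributes to the parent's text content.
    // Note: this modifies the DOM of the loaded page.
    DomNode child = htmlElement.getLastElementChild();
    if (child != null) {
        child.remove();
    }
    String content = htmlElement.getTextContent().trim();
    System.out.println(content);
}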
You can try removing the child nodes before fetching textContent.
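private void extractTextContent(HtmlElement htmlElement) {
    // Remove the last child element (<strong> in the example) so that its
    // text no longer contributes to the parent's text content.
    // Note: this modifies the DOM of the loaded page.
    DomNode child = htmlElement.getLastElementChild();
    if (child != null) {
        child.remove();
    }
    String content = htmlElement.getTextContent().trim();
    System.out.println(content);
}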
I have edited my answer to use the Java syntax provided by @XYZ.
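If modifying the page is undesirable, a non-destructive variant may be worth considering (this is an alternative to the removal approach above, not part of the original answer): walk the direct children of the element and keep only the text nodes. HtmlUnit's DomNode implements org.w3c.dom.Node, so a helper written against the standard DOM interfaces should accept an HtmlElement as well. The sketch below uses the JDK's XML parser as a stand-in so that it runs without HtmlUnit; the names OwnTextDemo, ownText and parseRoot are made up for illustration, and the <br> is written as <br/> only because the stand-in parser requires well-formed XML.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class OwnTextDemo {

    // Concatenates only the text nodes that are direct children of `node`;
    // text inside child elements (such as <strong>) is skipped.
    static String ownText(Node node) {
        StringBuilder sb = new StringBuilder();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                sb.append(child.getNodeValue());
            }
        }
        return sb.toString().trim();
    }

    // Hypothetical helper for this demo only: parses a well-formed XML
    // fragment with the JDK parser so the example runs without HtmlUnit.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse fragment", e);
        }
    }

    public static void main(String[] args) {
        Element td = parseRoot(
                "<td>\nW 03:10 PM-04:25 PM\n<strong>\n<br/>\nHybrid (50%+ in-person)\n</strong>\n</td>");
        System.out.println(ownText(td)); // prints "W 03:10 PM-04:25 PM"
    }
}
```

With HtmlUnit, passing the HtmlElement for the td straight to ownText should behave the same way, and the element keeps its children, so later calls to getTextContent() still see the full content.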