JTidy Node.findBody(): How to use it?
I'm trying to do XHTML DOM parsing with JTidy, and it seems to be a rather counterintuitive task. In particular, there's a method to parse HTML:
Node Tidy.parse(Reader, Writer)
And to get the <body /> of that Node, I assume I should use
Node Node.findBody(TagTable)
Where should I get an instance of that TagTable? (The constructor is protected, and I haven't found a factory that produces one.)
I use JTidy 8.0-SNAPSHOT.
I found there's a much simpler way to extract the body:
Tidy tidy = new Tidy(); tidy.setXHTML(true); tidy.setPrintBodyOnly(true);
Then run tidy on the Reader-Writer pair with tidy.parse(reader, writer), and only the body content is written out.
As simple as it should be.
You could use the parseDOM method instead, which gives you an org.w3c.dom.Document back:
// parseDOM is an instance method, so call it on a Tidy object
Tidy tidy = new Tidy();
Document document = tidy.parseDOM(reader, writer);
Node body = document.getElementsByTagName("body").item(0);
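One thing to watch with that lookup: item(0) returns null when the document has no body element, so it is worth guarding. The lookup itself is plain org.w3c.dom, so here is a rough sketch you can try without JTidy on the classpath. Note the assumptions: the class name BodyLookup and the helpers findBody and parseXhtml are made up for illustration, and parseXhtml uses the JDK's own DocumentBuilder as a stand-in for the Document you would get from parseDOM (it only accepts well-formed XHTML, unlike JTidy).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class BodyLookup {

    /** Returns the first body element, or null if the document has none. */
    public static Node findBody(Document document) {
        NodeList bodies = document.getElementsByTagName("body");
        return bodies.getLength() > 0 ? bodies.item(0) : null;
    }

    /**
     * Hypothetical stand-in for the parseDOM call: parses well-formed XHTML
     * with the JDK parser. With JTidy you would use the Document it returns.
     */
    public static Document parseXhtml(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("Could not parse input", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseXhtml(
                "<html><head><title>t</title></head><body><p>Hello</p></body></html>");
        Node body = findBody(doc);
        if (body == null) {
            System.out.println("no body element");
        } else {
            System.out.println(body.getNodeName() + ": " + body.getTextContent());
        }
    }
}
```

With a JTidy Document the same findBody helper should apply unchanged, since it only relies on the standard DOM interfaces.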