Accessing HTML Tidy with MarkLogic's Java API

I'm refactoring a Java EE project to use MarkLogic, and would like to take advantage of MarkLogic's built-in HTML Tidy functionality. Is it possible to use HTML Tidy from the MarkLogic Java API? Or will I need a third-party API that lets me run XQuery commands directly?

Or is this a fool's errand, and I should just use HTML Tidy in my own code?

See the com.marklogic.client.example.cookbook.DocumentWriteTransform example distributed with the Java API. It uses a server-side XQuery transform to call xdmp:tidy() when a document is written.

The example Erik cited installs an XQuery transform. Its name is html2xhtml.xqy, and it's packaged somewhere in the distribution. The example has one method for installing the transform and then an example of how to invoke it. The invocation part is at line 126.

writeMgr.write(docId, writeHandle, transform);

Just above that you'll see how the transform is created and configured.

The idea is that you can use REST (via the Java API) to install the transform at /v1/transforms/html2xhtml and then invoke it during a document PUT (using this Java write method) with the transform name as a parameter.
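To make the two REST calls concrete, here is a minimal sketch that only builds the requests, using the JDK's own java.net.http classes rather than the MarkLogic Java API. The /v1/transforms/html2xhtml path and the transform parameter on the document PUT come from the description above. The base URL, the document URI, the content types and the class and method names are assumptions for illustration, and authentication is left out entirely.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds (but does not send) the two REST requests.
final class TidyTransformRequests {

    // PUT target for installing a transform, e.g. /v1/transforms/html2xhtml
    static URI installUri(String base, String transformName) {
        return URI.create(base + "/v1/transforms/" + transformName);
    }

    // PUT target for writing a document through the named transform
    static URI writeUri(String base, String docUri, String transformName) {
        String encoded = URLEncoder.encode(docUri, StandardCharsets.UTF_8);
        return URI.create(base + "/v1/documents?uri=" + encoded
                + "&transform=" + transformName);
    }

    // Request that uploads the XQuery source of the transform.
    // The content type is an assumption; check it against your server.
    static HttpRequest installRequest(String base, String name, String xquery) {
        return HttpRequest.newBuilder(installUri(base, name))
                .header("Content-Type", "application/xquery")
                .PUT(HttpRequest.BodyPublishers.ofString(xquery))
                .build();
    }

    // Request that writes an HTML document and asks the server to apply
    // the transform (which is where xdmp:tidy() would run).
    static HttpRequest writeRequest(String base, String docUri,
                                    String name, String html) {
        return HttpRequest.newBuilder(writeUri(base, docUri, name))
                .header("Content-Type", "text/html")
                .PUT(HttpRequest.BodyPublishers.ofString(html))
                .build();
    }
}
```

Sending these with java.net.http.HttpClient would also require credentials for the REST server, which this sketch does not cover. In practice the Java API's writeMgr.write(docId, writeHandle, transform) call shown above wraps the second request for you.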

So the reference you're looking for is in the XQuery transform, not the Java source file.

I would be inclined to leave tidy in the Java layer, as long as you are planning to keep Java in the picture anyway. Running tidy in the JVM gives you more control: you can install whatever version of jtidy you like, and even patch it yourself. Also, tidy can be fairly CPU-intensive, so running it in the JVM layer would keep it from competing with database queries.

Of course, you might have other strong incentives to run tidy in MarkLogic. For example, you might be planning to allow direct REST integration with your MarkLogic code.
