What libraries will parse a DTD using PHP

I need to parse DTDs using PHP and am hoping there's a simple library to help out. Each DTD has numerous <!ENTITY... and <!-- Comment... elements, which I need to act upon.

Note that I do not need to validate anything against these DTDs, simply parse them as data files in their own right.

A few options I've looked at:

James Clark's SP, which is an option of last resort, but I'd like to avoid the complexity of building, installing, and configuring code external to PHP. I'm not sure it's even possible in my situation.

PEAR has an XML_DTD_Parser, which requires installing and configuring PEAR and a number of PEAR modules. I'm not sure that is possible either, and I would rather avoid it. Has anyone used it with success? EDIT: I've since learned that XML_DTD_Parser discards comments, so it is not a valid option for my needs.

PHP XML Classes has the class_path_parser, which another site suggested, but it fails to read ENTITY elements. It appears to be using PHP's built-in XML parsing capabilities, which use Expat.

PHP's DOMDocument will validate against a DTD, so it must be able to read them, though at first glance I don't see how to get at the DTD parser directly.

None of the standard XML parsers for PHP give access to general entities*, and few give access to comments. PHP's built-in XML Parser uses Expat, but does not expose the full Expat API; in particular, a handler for entities cannot be set. There is a PHP bug filed to add this.

AFAICT, the only way to handle comments and general entities in a DTD parser is to write your own parser, either by hand or using one of the lexer and parser generators available for PHP (e.g. PHP_LexerGenerator and PHP_ParserGenerator, among others).
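As a rough illustration of the "by hand" route, here is a minimal sketch of a scanner that walks a DTD string and collects comments and <!ENTITY declarations in document order. The function name scanDtd and the shape of the returned array are my own assumptions, not part of any library. It tracks quotes so that a ">" inside an entity value does not end the declaration, but it does not interpret other declarations (ELEMENT, ATTLIST, conditional sections), so treat it as a starting point only.

```php
<?php
// Hypothetical helper (not from any library): returns comments and
// entity declarations from a DTD string, in the order they appear.
function scanDtd(string $dtd): array
{
    $items = [];
    $pos = 0;
    $len = strlen($dtd);

    while ($pos < $len) {
        $start = strpos($dtd, '<!', $pos);
        if ($start === false) {
            break; // no more declarations or comments
        }

        if (substr($dtd, $start, 4) === '<!--') {
            $end = strpos($dtd, '-->', $start + 4);
            if ($end === false) {
                break; // unterminated comment: stop scanning
            }
            $text = substr($dtd, $start + 4, $end - $start - 4);
            $items[] = ['type' => 'comment', 'text' => trim($text)];
            $pos = $end + 3;
        } elseif (substr($dtd, $start, 8) === '<!ENTITY') {
            // Find the closing '>' that is not inside a quoted literal.
            $i = $start + 8;
            $quote = null;
            while ($i < $len) {
                $ch = $dtd[$i];
                if ($quote !== null) {
                    if ($ch === $quote) {
                        $quote = null; // closing quote
                    }
                } elseif ($ch === '"' || $ch === "'") {
                    $quote = $ch; // opening quote
                } elseif ($ch === '>') {
                    break;
                }
                $i++;
            }
            if ($i >= $len) {
                break; // unterminated declaration: stop scanning
            }
            $text = substr($dtd, $start + 8, $i - $start - 8);
            $items[] = ['type' => 'entity', 'text' => trim($text)];
            $pos = $i + 1;
        } else {
            $pos = $start + 2; // some other declaration: skip past "<!"
        }
    }

    return $items;
}

// Example call on a made-up DTD fragment.
print_r(scanDtd("<!-- Copyright -->\n<!ENTITY copy \"&#169;\">"));
```

Each entity item holds the raw declaration body (name plus value or external ID), which you would still need to split further depending on what you want to do with it.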

* PHP's Expat wrapper (XML Parser) does give access to notation declarations, which are similar to, but not the same as, general entities.

I don't know how useful this will be...

If I understand what you're looking for, you want a means to extract the <!ENTITY and <!-- Comment "nodes" from a DTD in order to act on them. Very interesting. Here's where my brain went:

  • Use the DOMDocument class directly. It looks as if there's no distinct way of getting at the DTD data if you treat the DTD as the source.
  • Use SimpleXML in the same way. Ditto.
  • Use the XML Parser in, again, the same way, but use some of the entity declaration handler functions to get information out. I think this requires more foresight and is probably not what you need. (Although I could be wrong.)
  • Use preg_match_all, or the like, to grab your values based on the patterns. Not too dissimilar to other thoughts in the world.
  • Use XSLT to nix everything but what you need. The .xsl to remove all non-comments would be pretty easy to manage. It's quite possible you could just output them in a format that's easier to parse (say, in a better XML structure). Entities may require handling via PHP's XSL processor. I'm a little rusty on entities.
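For the preg_match_all idea, a minimal sketch might look like the following. The function name extractDtdParts and the two patterns are assumptions of mine for illustration; they only cover comments and entities whose value is a quoted literal, so entities declared with SYSTEM or PUBLIC are skipped. Being purely pattern-based, it will also pick up an entity declaration that sits inside a comment.

```php
<?php
// Hypothetical helper (not from any library): pattern-based extraction of
// comments and literal-valued entity declarations from a DTD string.
function extractDtdParts(string $dtd): array
{
    // Comments: everything between <!-- and -->, across line breaks.
    preg_match_all('/<!--(.*?)-->/s', $dtd, $c);

    // Entities: optional "%" marks a parameter entity. The branch-reset
    // group (?|...) puts the value in group 3 for either quote style.
    preg_match_all(
        '/<!ENTITY\s+(%\s+)?(\S+)\s+(?|"([^"]*)"|\'([^\']*)\')\s*>/',
        $dtd,
        $e,
        PREG_SET_ORDER
    );

    $entities = [];
    foreach ($e as $m) {
        $entities[] = [
            'name'      => $m[2],
            'parameter' => $m[1] !== '', // true when declared with "%"
            'value'     => $m[3],
        ];
    }

    return [
        'comments' => array_map('trim', $c[1]),
        'entities' => $entities,
    ];
}

// Example call on a made-up DTD fragment.
$sample = <<<'DTD'
<!-- Inline elements -->
<!ENTITY % inline "em | strong">
<!ENTITY copy "&#169;">
DTD;

print_r(extractDtdParts($sample));
```

This is quick to write and needs nothing outside core PHP, but if the DTDs contain ">" or "-->" inside entity values, a quote-aware scanner would be the safer choice.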

Regardless, I hope some of this helps.
