
How do I just parse a DTD file in Java, one line at a time and without any validation?

I have been given an invalid DTD file that contains duplicate element declarations, and the duplicates are not identical:

<!ELEMENT Data (Name, address?)>
<!ELEMENT Data (Name, age)>

I need to write a utility that reads the DTD and merges the duplicate declarations, like this:

<!ELEMENT Data (Name, address?, age)>

I can't seem to find a Java library that lets me parse just one element at a time (like SAX).

What I am really after is reading <!ELEMENT Data (Name, address?)> into a data structure such as a Map of arrays or something similar.

Any pointers will be much appreciated.

It seems to me you have to read all the DTD ELEMENT declarations before merging anything; otherwise you can't pair them up as you have shown in your example.
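A minimal sketch of that pairing step, assuming each declaration has already been reduced to a name plus a flat, comma-separated list of children. The class name DtdElementMerger and its methods are hypothetical, made up for illustration. Children are compared as plain strings, so "address" and "address?" would count as different entries, and nested groups or choices are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: collects every declaration first, then emits one
// merged declaration per element name.
class DtdElementMerger {
    // element name -> union of its children, kept in first-seen order
    private final Map<String, Set<String>> merged = new LinkedHashMap<>();

    /** Records one declaration; repeated children for the same name are dropped. */
    void add(String name, List<String> children) {
        merged.computeIfAbsent(name, k -> new LinkedHashSet<>()).addAll(children);
    }

    /** Writes each merged entry back out as a flat sequence declaration. */
    List<String> toDeclarations() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : merged.entrySet()) {
            out.add("<!ELEMENT " + e.getKey() + " ("
                    + String.join(", ", e.getValue()) + ")>");
        }
        return out;
    }

    public static void main(String[] args) {
        DtdElementMerger merger = new DtdElementMerger();
        merger.add("Data", Arrays.asList("Name", "address?"));
        merger.add("Data", Arrays.asList("Name", "age"));
        // prints <!ELEMENT Data (Name, address?, age)>
        merger.toDeclarations().forEach(System.out::println);
    }
}
```

The LinkedHashMap and LinkedHashSet are there only to keep the output in the order the declarations were first seen.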

Because DTD descriptions can have arbitrary nesting of (...), regular expressions can't help you in theory. As a practical matter, most DTD ELEMENT declarations have only one or two layers of (...), so regular expressions might work. If your problem largely looks like what you have shown, you can do this with just string hacking and hand-fix the rest. (Reading single lines won't cut it; ELEMENT descriptions can span multiple lines and end with "...>", and you'll have to find that.)
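The string-hacking route could look roughly like the sketch below. It assumes the whole DTD has been read into one String first (for example with Files.readString on Java 11+), so a declaration that spans several lines is still matched up to its closing ">". The class DtdElementScanner and its methods are hypothetical names. Parameter entities and conditional sections are ignored, and anything that is not a flat (a, b, c) group is rejected so it can be fixed by hand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: regex-based extraction, no validation of any kind.
class DtdElementScanner {
    // "<!ELEMENT", a name, then everything up to the next '>' (may span lines).
    private static final Pattern ELEMENT =
            Pattern.compile("<!ELEMENT\\s+(\\S+)\\s+([^>]*)>");

    /** Returns {name, contentModel} pairs in document order. */
    static List<String[]> scan(String dtd) {
        // Drop comments first so commented-out declarations are not picked up.
        String text = dtd.replaceAll("(?s)<!--.*?-->", "");
        List<String[]> found = new ArrayList<>();
        Matcher m = ELEMENT.matcher(text);
        while (m.find()) {
            // Collapse line breaks and indentation inside the content model.
            String model = m.group(2).trim().replaceAll("\\s+", " ");
            found.add(new String[] { m.group(1), model });
        }
        return found;
    }

    /** Splits a flat "(a, b?, c)" model into its children. */
    static List<String> flatChildren(String model) {
        boolean flat = model.startsWith("(") && model.endsWith(")")
                && model.indexOf('(', 1) < 0 && model.indexOf('|') < 0;
        if (!flat) {
            throw new IllegalArgumentException("not a flat sequence, fix by hand: " + model);
        }
        List<String> children = new ArrayList<>();
        for (String part : model.substring(1, model.length() - 1).split(",")) {
            children.add(part.trim());
        }
        return children;
    }

    public static void main(String[] args) {
        String dtd = "<!-- sample -->\n"
                + "<!ELEMENT Data (Name,\n    address?)>\n"
                + "<!ELEMENT Data (Name, age)>\n";
        for (String[] decl : scan(dtd)) {
            System.out.println(decl[0] + " -> " + flatChildren(decl[1]));
        }
    }
}
```

Each pair returned by scan can then be grouped by element name to merge the duplicates.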

If you want a reliable automated approach, you need what amounts to a program transformation system. DTDs are a particular type of formal system; you need a tool that can read instances of the formal description, give you access to read and update the data structures that represent the instance (typically called Abstract Syntax Trees), and write the results back out as valid source text.

Not in Java, but our DMS Software Reengineering Toolkit is such a program transformation engine. It has an XML front end that is capable of parsing DTDs, and in fact we've built code generators using those DTDs.
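To make the parse / update / write-back cycle concrete, here is a toy recursive-descent parser for ELEMENT content models in Java. It is a hand-rolled illustration, not DMS and not any existing library's API. The names ContentModelParser and Node are hypothetical. It handles nested groups, "," and "|" separators, and the "?", "*", "+" suffixes, but not EMPTY, ANY, or parameter entities.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy parser: content model text -> small AST -> text again.
class ContentModelParser {
    /** AST node: a leaf when name != null, otherwise a group of kids. */
    static final class Node {
        String name;                       // leaf name, or null for a group
        char sep = ',';                    // ',' for a sequence, '|' for a choice
        final List<Node> kids = new ArrayList<>();
        String occurs = "";                // "", "?", "*" or "+"

        @Override public String toString() {
            if (name != null) return name + occurs;
            StringBuilder sb = new StringBuilder("(");
            for (int i = 0; i < kids.size(); i++) {
                if (i > 0) sb.append(sep == ',' ? ", " : " | ");
                sb.append(kids.get(i));
            }
            return sb.append(')').append(occurs).toString();
        }
    }

    private final String src;
    private int pos = 0;

    private ContentModelParser(String src) { this.src = src; }

    static Node parse(String model) {
        ContentModelParser p = new ContentModelParser(model);
        Node n = p.particle();
        p.skipSpace();
        if (p.pos != p.src.length()) throw p.error("trailing text");
        return n;
    }

    // particle := ( '(' particle ((',' | '|') particle)* ')' | Name ) ('?' | '*' | '+')?
    private Node particle() {
        skipSpace();
        Node n = new Node();
        if (peek() == '(') {
            pos++;
            n.kids.add(particle());
            skipSpace();
            while (peek() == ',' || peek() == '|') {
                char c = src.charAt(pos++);
                if (n.kids.size() > 1 && c != n.sep) throw error("mixed ',' and '|'");
                n.sep = c;
                n.kids.add(particle());
                skipSpace();
            }
            if (peek() != ')') throw error("expected ')'");
            pos++;
        } else {
            int start = pos;
            while (pos < src.length() && "(),|?*+ \t\r\n".indexOf(src.charAt(pos)) < 0) pos++;
            if (pos == start) throw error("expected a name");
            n.name = src.substring(start, pos);
        }
        if (pos < src.length() && "?*+".indexOf(src.charAt(pos)) >= 0) {
            n.occurs = String.valueOf(src.charAt(pos++));
        }
        return n;
    }

    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    private void skipSpace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    private IllegalArgumentException error(String msg) {
        return new IllegalArgumentException(msg + " at offset " + pos + " in: " + src);
    }

    public static void main(String[] args) {
        Node root = parse("(Name, (address | phone)?, age*)");
        System.out.println(root.kids.size());  // number of top-level children
        System.out.println(root);              // the model written back out
    }
}
```

Because the tree keeps nesting explicit, a merge can walk and edit the kids lists directly instead of pattern-matching on text.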

