
Extract XML tag from text file

Original post: 2015-06-06 10:04:31 · 0 1 · Tags: java, xml, text, extract

My intention is to extract single or nested XML tags from a text file. My input file mixes plain text and XML (in my case, HTML). What I want to do is scan the input, discarding everything until an XML tag is reached, then extract that whole element (with everything nested inside it), and continue this way until the whole file is processed. Before attempting this on my own, I'd like to know whether there is a Java library I'm not aware of that could help me.

Thank you all.

1 Answer

You need to parse the XML and build its corresponding DOM tree. Check out a Java XML-DOM parser tutorial.
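One caveat: a DOM parser expects well-formed XML, so the plain text surrounding the tags would likely have to be stripped first. Here is a minimal sketch of how that could look. It is not a definitive implementation: the class and method names (`TagExtractor`, `extract`, `toDom`) are hypothetical, and it assumes every element is properly closed (no HTML void tags such as a bare `<br>`), with no `<` or `>` inside attribute values, comments, or CDATA sections. It scans for tags with a regex, tracks nesting depth to find where each top-level element ends, and then hands each fragment to the standard `DocumentBuilder`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
class TagExtractor {
    // Groups: 1 = "/" for a closing tag, 2 = tag name,
    // 3 = attributes, 4 = "/" for a self-closing tag.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z][\\w:.-]*)([^<>]*?)(/?)>");

    /** Returns each top-level element in the mixed text, nested content included. */
    static List<String> extract(String text) {
        List<String> elements = new ArrayList<>();
        Matcher m = TAG.matcher(text);
        int depth = 0;   // how many elements are currently open
        int start = -1;  // offset where the current top-level element begins
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosing = !m.group(4).isEmpty();
            if (closing) {
                if (depth == 0) {
                    continue; // stray closing tag in the plain text: ignore it
                }
                depth--;
                if (depth == 0) {
                    elements.add(text.substring(start, m.end()));
                }
            } else if (selfClosing) {
                if (depth == 0) {
                    elements.add(m.group()); // e.g. <br/> on its own
                }
            } else {
                if (depth == 0) {
                    start = m.start(); // everything before this was discarded
                }
                depth++;
            }
        }
        return elements;
    }

    /** Parses one extracted fragment into a DOM tree. */
    static Document toDom(String fragment) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(fragment)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML: " + fragment, e);
        }
    }

    public static void main(String[] args) {
        String input = "noise <b>hi <i>x</i></b> more text <p>y</p> end";
        for (String fragment : extract(input)) {
            Document doc = toDom(fragment);
            System.out.println(doc.getDocumentElement().getTagName() + " -> " + fragment);
        }
    }
}
```

The depth counter does not check that opening and closing tag names match, so malformed input will only be caught later by `toDom`. If the input is real-world HTML rather than strict XML, a lenient HTML parser such as jsoup may be a better fit than this approach.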

Related questions

Extract text from inner tag in Selenium

HtmlUnit: Extract text from a <span> after an <a> tag

Need to extract text from img tag

Regex to extract string from xml text

Extract text from JSON file

extract text from a pdf file

How can I extract all PCDATA (text) from an XML file in Java?

How to extract a list of string from xml file?

cannot extract time data from XML file

extract information from xml file as RDF triples


