How to parse an XML file that has non-XML data in it

I am working with some XML in C# and am having trouble parsing a file because of its format. The file contains non-XML data, and I have no control over its format. The file is "test.xml" (see below). I am only concerned with the XML portion of the data, but I am unsure of the best way to access it. Any thoughts or recommendations would be greatly appreciated.

Test data -1
Smith, 2234

@@*j

Random--

@<?xml version="1.0" encoding="utf-16"?>
<ConfigMessage xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.Test.com/schemas/Test.test.Config">
  <Config>
    <Version>10</Version>
    <Build>00520</Build>
    <EnableV>false</EnableV>
    <BuildL>22</BuildL>
    <BuildP>\\testpath\test</BuildP>
  </Config>
</ConfigMessage>
@

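I can suggest the following solution: open the pseudo-XML as a plain text file and read its full contents. Then use a regex to extract the XML document (the part of the original file that can be converted to XML, i.e. [start tag | any characters | end tag]), load it into an XDocument (in memory), and parse it like a normal XML file.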
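
A minimal sketch of that regex idea, under some assumptions: the class and method names (`PseudoXmlRegexExample`, `ExtractXml`) are hypothetical, the file holds a single XML document, and the root element name does not reappear as a nested element (the lazy match stops at the first closing tag with the root's name). The pattern itself is only one possible choice, not a general XML recognizer.

```csharp
using System.IO;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class PseudoXmlRegexExample
{
    // Hypothetical helper: finds the first XML document embedded in arbitrary text.
    public static XDocument ExtractXml(string rawText)
    {
        // An optional XML declaration, then an opening tag whose name is captured
        // as "root", then everything up to the first closing tag with that name.
        const string pattern =
            @"(<\?xml.*?\?>\s*)?<(?<root>[A-Za-z_][\w.:\-]*)\b[^>]*>.*?</\k<root>\s*>";

        // Singleline lets '.' match newlines, since the XML spans several lines.
        Match match = Regex.Match(rawText, pattern, RegexOptions.Singleline);
        if (!match.Success)
            throw new InvalidDataException("No XML block found in the input text.");

        // Parsing from a string: the encoding named in the declaration is not
        // used to decode anything here, because the text is already decoded.
        return XDocument.Parse(match.Value);
    }
}
```

Usage might look like `var doc = PseudoXmlRegexExample.ExtractXml(File.ReadAllText("test.xml"));`. If the file is UTF-16 without a byte order mark, the encoding may need to be passed explicitly, for example `File.ReadAllText("test.xml", Encoding.Unicode)`. Because the sample declares a default namespace, element lookups need it: with `XNamespace ns = "http://www.Test.com/schemas/Test.test.Config";`, the expression `doc.Root.Element(ns + "Config").Element(ns + "Version").Value` yields `10` for the test data above.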

Read the whole file into a string, then take the substring that runs from the first '<' to the last '>' found in the file. From there you can treat it as normal XML. If non-XML content is scattered throughout the file, though, you will need additional logic to detect where each XML "block" starts and stops.
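
The first-'<' to last-'>' approach could be sketched roughly as follows. The names `PseudoXmlTrimExample` and `ExtractBetweenAngleBrackets` are hypothetical, and the sketch assumes the surrounding junk contains no '<' or '>' characters of its own, which holds for the test data above but is not guaranteed in general.

```csharp
using System.IO;
using System.Xml.Linq;

public static class PseudoXmlTrimExample
{
    // Hypothetical helper: keeps only the text between the first '<' and the
    // last '>' (inclusive) and parses that slice as XML.
    public static XDocument ExtractBetweenAngleBrackets(string rawText)
    {
        int start = rawText.IndexOf('<');
        int end = rawText.LastIndexOf('>');
        if (start < 0 || end <= start)
            throw new InvalidDataException("No '<' ... '>' span found in the input text.");

        // Substring takes a start index and a length, hence the + 1.
        string xml = rawText.Substring(start, end - start + 1);

        // Throws XmlException if the slice is not well-formed XML, for example
        // when stray '<' or '>' characters appear in the non-XML parts.
        return XDocument.Parse(xml);
    }
}
```

It can be called as `PseudoXmlTrimExample.ExtractBetweenAngleBrackets(File.ReadAllText("test.xml"))`. This version is simpler than a regex but less forgiving: any angle bracket in the non-XML text before or after the document ends up inside the slice and makes parsing fail.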
