
What is the fastest way to combine two XML files into one?

If I have two strings, xml1 and xml2, which both represent XML in the same format, what is the fastest way to combine them? The format is not important, but I just want to know how I can get rid of or ?

xml1:

<?xml version="1.0" encoding="utf-8"?>
<AllNodes>
   <NodeA>
      <NodeB>test1</NodeB>
      <NodeB>test2</NodeB>
   </NodeA>
</AllNodes>

xml2:

<?xml version="1.0" encoding="utf-8"?>
<AllNodes>
   <NodeA>
      <NodeB>test6</NodeB>
      <NodeB>test7</NodeB>
   </NodeA>
   <NodeA>
      <NodeB>test99</NodeB>
      <NodeB>test23</NodeB>
   </NodeA>
</AllNodes>

and end up with something like this:

<?xml version="1.0" encoding="utf-8"?>
<AllNodes>
   <NodeA>
      <NodeB>test1</NodeB>
      <NodeB>test2</NodeB>
   </NodeA>
   <NodeA>
      <NodeB>test6</NodeB>
      <NodeB>test7</NodeB>
   </NodeA>
   <NodeA>
      <NodeB>test99</NodeB>
      <NodeB>test23</NodeB>
   </NodeA>
</AllNodes>

The easiest way to do this is using LINQ to XML. You can use either Union or Concat depending on your needs.

var xml1 = XDocument.Load("file1.xml");
var xml2 = XDocument.Load("file2.xml");

//Combine and remove duplicates
var combinedUnique = xml1.Descendants("AllNodes")
                          .Union(xml2.Descendants("AllNodes"));

//Combine and keep duplicates
var combinedWithDups = xml1.Descendants("AllNodes")
                           .Concat(xml2.Descendants("AllNodes"));
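Since the question starts from two strings and wants a single root, here is a minimal sketch of the Concat variant that parses both strings and puts the children of each root under one new root. The class and method names (LinqMerge, Merge) are made up for illustration. Note that Union on XElement compares by reference unless you pass a comparer such as XNode.EqualityComparer, so on its own it will not drop elements that merely have the same content.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class LinqMerge
{
    // Returns a new root element (named like the first input's root)
    // holding the child elements of both inputs, first input first.
    public static XElement Merge(string xml1, string xml2)
    {
        XElement root1 = XDocument.Parse(xml1).Root;
        XElement root2 = XDocument.Parse(xml2).Root;

        // Elements that already have a parent are copied when added here,
        // so the two source trees are left untouched.
        return new XElement(root1.Name, root1.Elements().Concat(root2.Elements()));
    }
}
```

Calling LinqMerge.Merge(xml1, xml2).ToString() gives the merged markup without an XML declaration; wrap the result in an XDocument and Save it if you need one.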

An XSLT transformation could do it:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="pXml1" select="''" />
  <xsl:param name="pXml2" select="''" />
  <xsl:param name="pRoot" select="'root'" />

  <xsl:template match="/">
    <xsl:variable name="vXml1" select="document($pXml1)" />
    <xsl:variable name="vXml2" select="document($pXml2)" />

    <xsl:element name="{$pRoot}">
      <xsl:copy-of select="$vXml1/*/*" />
      <xsl:copy-of select="$vXml2/*/*" />
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>

Pass in the names of the files as parameters, as well as the name of the new root element.

Apply it to any XML document, e.g. an empty one.
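For anyone running the stylesheet from C#, a rough sketch with XslCompiledTransform might look like the following. The class name XsltMerge, the method signature and the dummy input document are my own assumptions; the parameter names pXml1, pXml2 and pRoot come from the stylesheet above. The document() function is disabled by default, so it has to be switched on through XsltSettings, and a resolver is passed to Transform so the two files can be opened.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical wrapper around the stylesheet shown above.
public static class XsltMerge
{
    public static string Merge(string stylesheetPath, string file1, string file2, string rootName)
    {
        var xslt = new XslCompiledTransform();
        // Enable document(), keep embedded scripts disabled.
        xslt.Load(stylesheetPath, new XsltSettings(true, false), new XmlUrlResolver());

        // Pass absolute file URIs so document() does not depend on the stylesheet location.
        var args = new XsltArgumentList();
        args.AddParam("pXml1", "", new Uri(Path.GetFullPath(file1)).AbsoluteUri);
        args.AddParam("pXml2", "", new Uri(Path.GetFullPath(file2)).AbsoluteUri);
        args.AddParam("pRoot", "", rootName);

        var output = new StringBuilder();
        // The input document is irrelevant to this stylesheet, so a dummy one is used.
        using (XmlReader input = XmlReader.Create(new StringReader("<dummy/>")))
        using (XmlWriter writer = XmlWriter.Create(output, xslt.OutputSettings))
        {
            // The last argument resolves the URIs given to document().
            xslt.Transform(input, args, writer, new XmlUrlResolver());
        }
        return output.ToString();
    }
}
```

Usage would be something like XsltMerge.Merge("merge.xslt", "file1.xml", "file2.xml", "AllNodes"), where merge.xslt is whatever name you saved the stylesheet under.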

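This is the fastest and cleanest way to merge XML files.

XElement xFileRoot = XElement.Load("file1.xml");
XElement xFileChild = XElement.Load("file2.xml");
// Add only the children of the second root; adding xFileChild itself
// would nest a second <AllNodes> inside the first one.
xFileRoot.Add(xFileChild.Elements());
xFileRoot.Save("file1.xml");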

If you can guarantee this format, you can combine them by doing string manipulation:

  • Read the first file and keep everything before "</AllNodes>".
  • Read the second file and remove everything up to and including "<AllNodes>".
  • Combine those strings.

This should be the fastest way since no parsing is needed.

const string RelevantTag = "AllNodes";
const string OpenTag = "<" + RelevantTag + ">";
const string CloseTag = "</" + RelevantTag + ">";

// Keep everything in the first file before the closing root tag.
string xml1 = File.ReadAllText(xmlFile1);
xml1 = xml1.Substring(0, xml1.LastIndexOf(CloseTag));

// Keep everything in the second file after the opening root tag.
string xml2 = File.ReadAllText(xmlFile2);
xml2 = xml2.Substring(xml2.IndexOf(OpenTag) + OpenTag.Length);

File.WriteAllText(xmlFileCombined, xml1 + xml2);

That said, I would always prefer the safe way to the fast way.

If you want to use XmlDocument, try this:

var lNode = lDoc1.ImportNode(lDoc2.DocumentElement.FirstChild, true);
lDoc1.DocumentElement.AppendChild(lNode);

var doc1 = XDocument.Load("file1.xml");
var doc2 = XDocument.Load("file2.xml");
doc1.Root.Add(doc2.Root.Elements());
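Note that the XmlDocument snippet above imports only the first child of the second document. If the second file has several NodeA elements, as in the question, a loop over all children is needed. A minimal sketch, assuming both inputs are already available as strings (the names XmlDocumentMerge and Merge are made up):

```csharp
using System.Xml;

// Hypothetical helper, not part of any library.
public static class XmlDocumentMerge
{
    public static string Merge(string xml1, string xml2)
    {
        var doc1 = new XmlDocument();
        doc1.LoadXml(xml1);
        var doc2 = new XmlDocument();
        doc2.LoadXml(xml2);

        // ImportNode makes a deep copy owned by doc1 and leaves doc2 unchanged,
        // so it is safe to iterate over doc2's children while appending.
        foreach (XmlNode child in doc2.DocumentElement.ChildNodes)
        {
            doc1.DocumentElement.AppendChild(doc1.ImportNode(child, true));
        }
        return doc1.OuterXml;
    }
}
```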

The best solution for me, based on Jose Basilio's answer, slightly modified:

var combinedUnique = xml1.Descendants()
    .Union(xml2.Descendants());
combinedUnique.First().Save(fullName); // fullName: path of the output file

You have two basic options:

  1. Parse the XML, combine the data structures, and serialize back to XML.

  2. If you know the structure, use some basic string manipulation to hack it. For example, in the example above you could take the contents of AllNodes from the two XML blocks, put them in a single AllNodes block, and be done.

In my case the main solution did not work well. The difference was that I had a list of thousands of files, and when I took one element and tried to merge it into the first element I got an OutOfMemory exception. I added an empty template with an empty row (NodeA in this case) to work around the odd memory problem, and it then ran smoothly.

I save the document in a separate step.

// Create an empty document with the same root and one empty row element (NodeA);
// everything will be merged into it.
XDocument xmlDocTemplate = GetXMLTemplate();
List<XElement> lstxElements = GetMyBunchOfXML();

foreach (var xmlElement in lstxElements)
{
    xmlDocTemplate
        .Root
        .Descendants("NodeA")
        .LastOrDefault()
        .AddAfterSelf(xmlElement.Descendants("NodeA"));
}
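If memory is the limiting factor with thousands of files, one alternative (not what the answer above does) is to avoid building a tree at all and stream each file's top-level children straight into the output with XmlReader and XmlWriter. This is only a sketch under the assumption that every input has a single root whose child elements should be copied as-is; the names StreamingMerge and Merge and the default root name are illustrative.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of any library.
public static class StreamingMerge
{
    public static void Merge(IEnumerable<string> inputPaths, TextWriter output, string rootName = "AllNodes")
    {
        var readerSettings = new XmlReaderSettings { IgnoreWhitespace = true };
        var writerSettings = new XmlWriterSettings { Indent = true };

        using (XmlWriter writer = XmlWriter.Create(output, writerSettings))
        {
            writer.WriteStartElement(rootName);
            foreach (string path in inputPaths)
            {
                using (XmlReader reader = XmlReader.Create(path, readerSettings))
                {
                    reader.MoveToContent(); // position on this file's root element
                    reader.Read();          // step inside the root
                    while (!reader.EOF)
                    {
                        if (reader.NodeType == XmlNodeType.Element && reader.Depth == 1)
                            writer.WriteNode(reader, false); // copies the element and advances the reader
                        else
                            reader.Read();                   // skip comments, the closing root tag, etc.
                    }
                }
            }
            writer.WriteEndElement();
        }
    }
}
```

Only one child element of one file is in flight at a time, so memory use should stay roughly flat regardless of how many files are merged.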

If I were doing this (using C#), I would create a class that I can deserialize this XML to (you can use xsd.exe to do this), and then loop through all the nodes in the object representing the first piece of XML and "Add" them to the AllNodes property of the object representing the second XML.

Then serialize the second object back out to XML, and it should look like your third example.
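A rough sketch of that idea with XmlSerializer follows. The two classes are hand-written stand-ins for what xsd.exe would generate, so their names and members (AllNodes.Nodes, NodeA.Values, SerializerMerge) are assumptions rather than real generated code. The first document's nodes are inserted at the front so the order matches the third example in the question.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hand-written stand-ins for xsd.exe output (hypothetical names).
[XmlRoot("AllNodes")]
public class AllNodes
{
    [XmlElement("NodeA")]
    public List<NodeA> Nodes = new List<NodeA>();
}

public class NodeA
{
    [XmlElement("NodeB")]
    public List<string> Values = new List<string>();
}

public static class SerializerMerge
{
    public static string Merge(string xml1, string xml2)
    {
        var serializer = new XmlSerializer(typeof(AllNodes));
        var first = (AllNodes)serializer.Deserialize(new StringReader(xml1));
        var second = (AllNodes)serializer.Deserialize(new StringReader(xml2));

        // Put the first document's nodes ahead of the second document's nodes.
        second.Nodes.InsertRange(0, first.Nodes);

        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, second);
            return writer.ToString();
        }
    }
}
```

The serializer adds its usual xsi/xsd namespace declarations to the root element, so the output will not be byte-for-byte identical to the hand-written example.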

Since you asked for the fastest:

If (and only if) the XML structure is always consistent (this is pseudocode):

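string xml1 = //get xml1 somehow
string xml2 = //get xml2 somehow
// Strip the declaration and the root tags from xml1 (tag names are case-sensitive).
xml1 = xml1.Replace("<?xml version=\"1.0\" encoding=\"utf-8\"?>", "");
xml1 = xml1.Replace("<AllNodes>", "");
xml1 = xml1.Replace("</AllNodes>", "");
// Splice what is left of xml1 in right after xml2's opening root tag.
xml2 = xml2.Replace("<AllNodes>", "<AllNodes>\n" + xml1);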

It's a giant hack, but it's fast. Expect to see it on TheDailyWTF when your colleagues find it.
