
What is the most efficient way in C# to merge more than 2 xml files with the same schema together?

I have several fairly large XML files that represent data exported from a system, to be used by a 3rd party vendor. I was chopping the results at 2,500 records per XML file because the files become huge and unmanageable otherwise. However, the 3rd party vendor has asked me to combine all of these XML files into a single file. There are 78 of these XML files and they total over 700MB in size! Crazy, I know... so how would you go about combining these files to accommodate the vendor using C#? Hopefully there is a really efficient way to do this without reading in all of the files at once using LINQ :-)

I'm going to go out on a limb here and assume that your XML looks something like:

<records>
  <record>
    <dataPoint1/>
    <dataPoint2/>
  </record>
</records>

If that's the case, I would open a file stream and write the <records> part, then sequentially open each XML file and write all lines (except the first and last) to disk. That way you don't have huge strings in memory and it should all be very, very quick to code and run.

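using System.Collections.Generic;
using System.IO;

public void ConsolidateFiles(List<string> files, string outputFile)
{
  // "using" flushes and closes the writer; without it the tail of the output can be lost.
  using (var output = new StreamWriter(File.Open(outputFile, FileMode.Create)))
  {
    output.WriteLine("<records>");
    foreach (var file in files)
    {
      using (var input = new StreamReader(File.Open(file, FileMode.Open)))
      {
        string line;
        while ((line = input.ReadLine()) != null)
        {
          // Skip each file's XML declaration and root tags.
          // Assumes each of these sits on a line of its own.
          if (!line.Contains("<?xml") &&
              !line.Contains("<records>") &&
              !line.Contains("</records>"))
          {
            output.WriteLine(line);
          }
        }
      }
    }
    output.WriteLine("</records>");
  }
}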
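
If the files are not laid out with one tag per line, the same streaming idea can be sketched with XmlReader and XmlWriter instead of raw text lines. This is only a rough sketch under the same guess about the schema (a <records> root containing <record> children); the XmlMerger class and MergeRecords method are made-up names for illustration, not part of any library. It copies one <record> at a time, so memory use should stay small regardless of the total size.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class XmlMerger
{
    // Hypothetical helper: copies every <record> element from each input file
    // into a single <records> root without loading whole files into memory.
    public static void MergeRecords(IEnumerable<string> files, string outputFile)
    {
        var settings = new XmlWriterSettings { Indent = true };
        using (XmlWriter writer = XmlWriter.Create(outputFile, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("records");   // assumed root element name

            foreach (string file in files)
            {
                using (XmlReader reader = XmlReader.Create(file))
                {
                    reader.MoveToContent();        // skip the declaration, land on the root
                    reader.Read();                 // step inside the root
                    while (!reader.EOF)
                    {
                        if (reader.NodeType == XmlNodeType.Element && reader.Name == "record")
                        {
                            // WriteNode copies the whole element and leaves the
                            // reader positioned just after it.
                            writer.WriteNode(reader, false);
                        }
                        else
                        {
                            reader.Read();         // whitespace, end tags, anything else
                        }
                    }
                }
            }

            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
```

Calling XmlMerger.MergeRecords(files, "combined.xml") with the list of 78 file paths would then produce one document. Because it parses the XML rather than matching text, it does not care how the input is formatted, at the cost of being somewhat slower than plain line copying.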

Use DataSet.ReadXml(), DataSet.Merge(), and DataSet.WriteXml(). Let the framework do the work for you.
Something like this:

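  using System.Collections.Generic;
  using System.Data;
  using System.Xml;

  public void Merge(List<string> xmlFiles, string outputFileName)
  {
     // Note: the DataSet accumulates every record from every file in memory.
     DataSet complete = new DataSet();

     foreach (string xmlFile in xmlFiles)
     {
        // Dispose the reader so each input file handle is released promptly.
        using (XmlTextReader reader = new XmlTextReader(xmlFile))
        {
           DataSet current = new DataSet();
           current.ReadXml(reader);
           complete.Merge(current);
        }
     }

     complete.WriteXml(outputFileName);
  }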

For further description and examples, take a look at this article from Microsoft.


 