
How to extract information from XML using C#

I have an XML document, and I am using T-SQL to extract information from it and save it in tables. The problem is that I extract multiple XML files and have multiple datasets to save in the database. I was wondering whether I could extract the data from these XML files in my C# code using LINQ, build a list, and use a bulk insert, rather than having to send the XML to a stored procedure each time. My T-SQL code to extract the information:

select  x.i.value('ReportCell[1]/Value[1]', 'varchar(250)') as AccountName,
        x.i.value('ReportCell[1]/Attributes[1]/ReportCellAttribute[1]/Value[1]', 'varchar(250)') as AccountId,
        x.i.value('ReportCell[2]/Value[1]', 'varchar(250)') as Amount
from    @xml.nodes('//Cells') as x(i)
where   x.i.value('../RowType[1]', 'varchar(250)') = 'Row'
  and   x.i.value('ReportCell[1]/Attributes[1]', 'varchar(250)') is not null

The XML file is:

<?xml version="1.0" encoding="utf-16"?>
<Report xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <ReportID>BalanceSheet</ReportID>
  <ReportName>Balance Sheet</ReportName>
  <ReportType>BalanceSheet</ReportType>
  <ReportTitles>
    <string>Balance Sheet</string>
    <string>Ulysses It 6</string>
    <string>As at 31 October 2016</string>
  </ReportTitles>
  <ReportDate>18 January 2017</ReportDate>
  <UpdatedDateUTC>2017-01-18T01:07:41.654Z</UpdatedDateUTC>
  <Fields />
  <Rows>
    <ReportRow>
      <RowType>Header</RowType>
      <Cells>
        <ReportCell>
          <Value />
        </ReportCell>
        <ReportCell>
          <Value>31 Oct 2016</Value>
        </ReportCell>
        <ReportCell>
          <Value>31 Oct 2015</Value>
        </ReportCell>
      </Cells>
    </ReportRow>
    <ReportRow>
      <RowType>Section</RowType>
      <Title>Assets</Title>
      <Rows />
    </ReportRow>
    <ReportRow>
      <RowType>Section</RowType>
      <Title>Bank</Title>
      <Rows>
        <ReportRow>
          <RowType>Row</RowType>
          <Cells>
            <ReportCell>
              <Value>Ulysses Six</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>9e44f52a-90f4-4e9f-88f5-2dd9a33fc1c6</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
            <ReportCell>
              <Value>486000.00</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>9e44f52a-90f4-4e9f-88f5-2dd9a33fc1c6</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
            <ReportCell>
              <Value>0.00</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>9e44f52a-90f4-4e9f-88f5-2dd9a33fc1c6</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
          </Cells>
        </ReportRow>
        <ReportRow>
          <RowType>SummaryRow</RowType>
          <Cells>
            <ReportCell>
              <Value>Total Bank</Value>
            </ReportCell>
            <ReportCell>
              <Value>486000.00</Value>
            </ReportCell>
            <ReportCell>
              <Value>0.00</Value>
            </ReportCell>
          </Cells>
        </ReportRow>
      </Rows>
    </ReportRow>
    <ReportRow>
      <RowType>Section</RowType>
      <Title>Current Assets</Title>
      <Rows>
        <ReportRow>
          <RowType>Row</RowType>
          <Cells>
            <ReportCell>
              <Value>Accounts Receivable</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>25e89097-8895-445c-8315-7efe50dc3be7</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
            <ReportCell>
              <Value>375000.00</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>25e89097-8895-445c-8315-7efe50dc3be7</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
            <ReportCell>
              <Value>0.00</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>25e89097-8895-445c-8315-7efe50dc3be7</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
          </Cells>
        </ReportRow>
        <ReportRow>
          <RowType>SummaryRow</RowType>
          <Cells>
            <ReportCell>
              <Value>Total Current Assets</Value>
            </ReportCell>
            <ReportCell>
              <Value>375000.00</Value>
            </ReportCell>
            <ReportCell>
              <Value>0.00</Value>
            </ReportCell>
          </Cells>
        </ReportRow>
      </Rows>
    </ReportRow>
    <ReportRow>
      <RowType>Section</RowType>
      <Title />
      <Rows>
        <ReportRow>
          <RowType>SummaryRow</RowType>
          <Cells>
            <ReportCell>
              <Value>Total Assets</Value>
            </ReportCell>
            <ReportCell>
              <Value>861000.00</Value>
            </ReportCell>
            <ReportCell>
              <Value>0.00</Value>
            </ReportCell>
          </Cells>
        </ReportRow>
      </Rows>
    </ReportRow>
    <ReportRow>
      <RowType>Section</RowType>
      <Title>Liabilities</Title>
      <Rows />
    </ReportRow>
    <ReportRow>
      <RowType>Section</RowType>
      <Title>Current Liabilities</Title>
      <Rows>
        <ReportRow>
          <RowType>Row</RowType>
          <Cells>
            <ReportCell>
              <Value>GST</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>81ee7772-593d-48d3-851d-0bd68149d527</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
            <ReportCell>
              <Value>78273.51</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>81ee7772-593d-48d3-851d-0bd68149d527</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
            <ReportCell>
              <Value>0.00</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>81ee7772-593d-48d3-851d-0bd68149d527</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
          </Cells>
        </ReportRow>
        <ReportRow>
          <RowType>SummaryRow</RowType>
          <Cells>
            <ReportCell>
              <Value>Total Current Liabilities</Value>
            </ReportCell>
            <ReportCell>
              <Value>78273.51</Value>
            </ReportCell>
            <ReportCell>
              <Value>0.00</Value>
            </ReportCell>
          </Cells>
        </ReportRow>
      </Rows>
    </ReportRow>
    <ReportRow>
      <RowType>Section</RowType>
      <Title />
      <Rows>
        <ReportRow>
          <RowType>SummaryRow</RowType>
          <Cells>
            <ReportCell>
              <Value>Total Liabilities</Value>
            </ReportCell>
            <ReportCell>
              <Value>78273.51</Value>
            </ReportCell>
            <ReportCell>
              <Value>0.00</Value>
            </ReportCell>
          </Cells>
        </ReportRow>
      </Rows>
    </ReportRow>
    <ReportRow>
      <RowType>Section</RowType>
      <Title />
      <Rows>
        <ReportRow>
          <RowType>Row</RowType>
          <Cells>
            <ReportCell>
              <Value>Net Assets</Value>
            </ReportCell>
            <ReportCell>
              <Value>782726.49</Value>
            </ReportCell>
            <ReportCell>
              <Value>0.00</Value>
            </ReportCell>
          </Cells>
        </ReportRow>
      </Rows>
    </ReportRow>
    <ReportRow>
      <RowType>Section</RowType>
      <Title>Equity</Title>
      <Rows>
        <ReportRow>
          <RowType>Row</RowType>
          <Cells>
            <ReportCell>
              <Value>Current Year Earnings</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>abababab-abab-abab-abab-abababababab</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
            <ReportCell>
              <Value>324545.13</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>abababab-abab-abab-abab-abababababab</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
                <ReportCellAttribute>
                  <Value>7/1/2016</Value>
                  <Id>fromDate</Id>
                </ReportCellAttribute>
                <ReportCellAttribute>
                  <Value>10/31/2016</Value>
                  <Id>toDate</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
            <ReportCell>
              <Value>0.00</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>abababab-abab-abab-abab-abababababab</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
                <ReportCellAttribute>
                  <Value>7/1/2015</Value>
                  <Id>fromDate</Id>
                </ReportCellAttribute>
                <ReportCellAttribute>
                  <Value>10/31/2015</Value>
                  <Id>toDate</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
          </Cells>
        </ReportRow>
        <ReportRow>
          <RowType>Row</RowType>
          <Cells>
            <ReportCell>
              <Value>Retained Earnings</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>86d10e93-151d-4b89-bb65-6df9bbacd2e3</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
            <ReportCell>
              <Value>458181.36</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>86d10e93-151d-4b89-bb65-6df9bbacd2e3</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
                <ReportCellAttribute>
                  <Value />
                  <Id>fromDate</Id>
                </ReportCellAttribute>
                <ReportCellAttribute>
                  <Value>10/31/2016</Value>
                  <Id>toDate</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
            <ReportCell>
              <Value>0.00</Value>
              <Attributes>
                <ReportCellAttribute>
                  <Value>86d10e93-151d-4b89-bb65-6df9bbacd2e3</Value>
                  <Id>account</Id>
                </ReportCellAttribute>
                <ReportCellAttribute>
                  <Value />
                  <Id>fromDate</Id>
                </ReportCellAttribute>
                <ReportCellAttribute>
                  <Value>10/31/2015</Value>
                  <Id>toDate</Id>
                </ReportCellAttribute>
              </Attributes>
            </ReportCell>
          </Cells>
        </ReportRow>
        <ReportRow>
          <RowType>SummaryRow</RowType>
          <Cells>
            <ReportCell>
              <Value>Total Equity</Value>
            </ReportCell>
            <ReportCell>
              <Value>782726.49</Value>
            </ReportCell>
            <ReportCell>
              <Value>0.00</Value>
            </ReportCell>
          </Cells>
        </ReportRow>
      </Rows>
    </ReportRow>
  </Rows>
</Report>

First, you should define POCO classes for your purpose, for instance:

public class Report {
    // your POCO definition here
    public string ReportID { get; set; }   // "BalanceSheet" in the sample, so a string rather than a number
    // ...
}

You can do so automatically with Visual Studio (for example with Edit > Paste Special > Paste XML As Classes); all the required classes will be generated for you without writing a single line of code. The classes can then be used with XmlSerializer:

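// requires: using System.IO; using System.Xml; using System.Xml.Serialization;
XmlSerializer serializer = new XmlSerializer(typeof(Report));
Report r;   // declared outside the using block so it stays in scope afterwards
using (FileStream fs = new FileStream(FilePath, FileMode.Open))
using (XmlReader reader = XmlReader.Create(fs))
{
    r = (Report)serializer.Deserialize(reader);
}

// you can access whatever info you want, e.g.
// r.ReportID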
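
As a rough illustration of what such classes might look like for the sample above, here is a minimal hand-written sketch. The class shapes are inferred from the element names in the XML; `ReportLoader` and `Parse` are hypothetical names, and elements that are not listed (such as `ReportTitles`) are simply left unmapped.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical POCOs mirroring the element names in the sample XML.
public class Report
{
    public string ReportID { get; set; }
    public string ReportName { get; set; }
    public List<ReportRow> Rows { get; set; }
}

public class ReportRow
{
    public string RowType { get; set; }
    public string Title { get; set; }
    public List<ReportCell> Cells { get; set; }
    public List<ReportRow> Rows { get; set; }   // sections nest further rows
}

public class ReportCell
{
    public string Value { get; set; }
    public List<ReportCellAttribute> Attributes { get; set; }
}

public class ReportCellAttribute
{
    public string Value { get; set; }
    public string Id { get; set; }
}

public static class ReportLoader
{
    // Deserializes report XML held in a string into the classes above.
    public static Report Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Report));
        using (var reader = new StringReader(xml))
        {
            return (Report)serializer.Deserialize(reader);
        }
    }
}
```

With these in place, something like `ReportLoader.Parse(File.ReadAllText(path))` (where `path` is whatever file you are loading) would return a `Report` whose `Rows` can be queried with LINQ.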

Try the following:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using System.IO;
using System.Data;

namespace ConsoleApplication42
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
            XElement doc;
            // Dispose both readers once the document has been loaded.
            using (StreamReader reader = new StreamReader(FILENAME, Encoding.UTF8))
            {
                reader.ReadLine();  // skip the XML declaration line
                XmlReaderSettings settings = new XmlReaderSettings();
                settings.ConformanceLevel = ConformanceLevel.Fragment;
                using (XmlReader xReader = XmlReader.Create(reader, settings))
                {
                    doc = XElement.Load(xReader);
                }
            }

            DataSet ds = new DataSet();

            string[] columnHeaders = null;
            foreach (XElement table in doc.Element("Rows").Elements("ReportRow"))
            {
                string type = (string)table.Descendants("RowType").FirstOrDefault();

                switch (type)
                {
                    case "Header" :
                        columnHeaders = table.Descendants("Value").Select(x => (string)x).ToArray();
                        break;
                    case "Section":
                        string name = (string)table.Element("Title");
                        DataTable dt = new DataTable(name);
                        for (int index = 0; index < columnHeaders.Length; index++)
                        {
                            if (index == 0)
                            {
                                dt.Columns.Add(columnHeaders[index], typeof(string));
                            }
                            else
                            {
                                dt.Columns.Add(columnHeaders[index], typeof(decimal));
                            }
                        }
                        ds.Tables.Add(dt);
                        foreach (XElement datarow in table.Descendants("ReportRow"))
                        {
                            DataRow newRow = dt.Rows.Add();
                            string[] values = datarow.Descendants("ReportCell").Elements("Value").Select(x => (string)x).ToArray();
                            for (int index = 0; index < values.Length; index++)
                            {
                                switch (index)
                                {
                                    case 0 :
                                        newRow[index] = values[index];
                                        break;

                                    default :
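                                        // Parse with the invariant culture so "486000.00" reads the same on any machine.
                                        newRow[index] = decimal.Parse(values[index], System.Globalization.CultureInfo.InvariantCulture);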
                                        break;
                                }
                            }
                        }
                        break;
                }
            }
        }
    }
}
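
For comparison, the T-SQL query from the question can also be translated fairly directly into LINQ to XML. The following is only a sketch under that reading of the query: `BalanceSheetExtractor` and `Extract` are hypothetical names, and every value is kept as a string, much like the `varchar(250)` columns in the original.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class BalanceSheetExtractor
{
    // Returns one tuple per ReportRow of type "Row" whose first cell
    // has an Attributes element, mirroring the WHERE clause of the query.
    public static List<(string AccountName, string AccountId, string Amount)> Extract(XElement report)
    {
        return report
            .Descendants("ReportRow")
            .Where(row => (string)row.Element("RowType") == "Row")
            // Collect the ReportCell children of each row's Cells element.
            .Select(row => row.Elements("Cells").Elements("ReportCell").ToList())
            .Where(cells => cells.Count > 0 && cells[0].Element("Attributes") != null)
            .Select(cells => (
                AccountName: (string)cells[0].Element("Value"),
                // First ReportCellAttribute of the first cell holds the account id.
                AccountId: (string)cells[0].Element("Attributes")
                                           .Element("ReportCellAttribute")
                                           ?.Element("Value"),
                // Second cell holds the amount; null if the row has only one cell.
                Amount: (string)cells.ElementAtOrDefault(1)?.Element("Value")))
            .ToList();
    }
}
```

Rows such as "Net Assets" (type `Row` but without attributes) and all `SummaryRow` entries are skipped, which matches what the T-SQL filter does.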

I finally managed to extract what I need, and it is shorter than the answer above. It requires a few steps.

var balancesheet = (from a in bs.Rows
                    where a.RowType == "Section"
                       && a.Rows.Count > 0
                       // Where() never returns null, so test for matching rows with Any()
                       && a.Rows.Any(x => x.RowType == "Row")
                    select new
                    {
                        // keep only the cells that carry attributes (the account rows)
                        ReportCells = a.Rows
                            .SelectMany(x => x.Cells.Where(s => s.Attributes != null))
                            .ToList()
                    }).ToList();

List<ReportCell> allCells = balancesheet
    .Where(z => z.ReportCells.Count > 0)
    .SelectMany(x => x.ReportCells)
    .ToList();

List<BalanceSheet> allBalanceSheetInfo = new List<BalanceSheet>();

// Each account row contributes three cells: name, current amount, prior amount.
// The i + 1 bound guards against a trailing partial group.
for (int i = 0; i + 1 < allCells.Count; i += 3)
{
    var accountAttribute = allCells[i].Attributes.FirstOrDefault();
    allBalanceSheetInfo.Add(new BalanceSheet()
    {
        AccountId = accountAttribute != null ? new Guid(accountAttribute.Value) : Guid.Empty,
        AccountName = allCells[i].Value,
        Amount = Decimal.Parse(allCells[i + 1].Value)
    });
}

Now I can insert the list into the database. But I am starting to question whether to do this in C# or just pass the XML to my T-SQL code.
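
If the C# route is kept, one possible way to prepare the list for a bulk insert is to copy it into a `DataTable` first. This is a minimal sketch, not a tested pipeline: the `BalanceSheet` class below is a guess at the type used in the loop above (only its three property names come from the post), and `BalanceSheetTable`, `ToDataTable`, and the column names are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical row type; the property names follow the BalanceSheet
// objects built in the loop above.
public class BalanceSheet
{
    public Guid AccountId { get; set; }
    public string AccountName { get; set; }
    public decimal Amount { get; set; }
}

public static class BalanceSheetTable
{
    // Copies the list into a DataTable whose columns are assumed to
    // match the destination table in the database.
    public static DataTable ToDataTable(IEnumerable<BalanceSheet> items)
    {
        var table = new DataTable("BalanceSheet");
        table.Columns.Add("AccountId", typeof(Guid));
        table.Columns.Add("AccountName", typeof(string));
        table.Columns.Add("Amount", typeof(decimal));

        foreach (var item in items)
        {
            table.Rows.Add(item.AccountId, item.AccountName, item.Amount);
        }
        return table;
    }
}
```

The resulting table could then be handed to `SqlBulkCopy.WriteToServer(DataTable)` with `DestinationTableName` set to the target table, so that all rows from all XML files go to the server in one batch instead of one stored procedure call per file.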
