Large XML reading using XmlTextReader is very slow

I'd like to think my problem is not unusual, given that my XML file is only 3 MB. The XML contains close to 60 thousand records. I am having a hard time reducing the processing time. Currently it takes 7-8 minutes to read the file and insert the records into DataTables. (Please note that I am NOT inserting into a database yet, so database transactions are not the issue here.)

Here is the code I wrote. Any suggestions for reducing the processing time will be greatly appreciated.

    XmlTextReader reader = new XmlTextReader(destFile);

    while (reader.Read())
    {
        switch (reader.NodeType)
        {
            case XmlNodeType.Element:
                // Dispatch on the element name; each helper adds one row to its table.
                if (reader.Name == "Report")
                {
                    FileDataTable = UpdateReportTable(FileDataTable, reader);
                }
                else if (reader.Name == "Name")
                {
                    NameTable = UpdateNameTable(NameTable, reader);
                }
                else if (reader.Name == "Entries")
                {
                    EntriesTable = UpdateEntriesTable(EntriesTable, reader);
                }

                reader.MoveToElement();
                break;
            case XmlNodeType.Text:
                break;
            case XmlNodeType.EndElement:
                break;
        }
    }

Then I have the following function to load values into the DataTable. The "Entries" elements take 90% of the time, so I am posting that code; the other functions are similar.

    private static DataTable UpdateEntriesTable(DataTable entries, XmlTextReader reader)
    {
        DataRow row = entries.NewRow();

        for (int attInd = 0; attInd < reader.AttributeCount; attInd++)
        {
            reader.MoveToAttribute(attInd);
            if (reader.Name == "refDataId") { row["DataId"] = Convert.ToInt32(reader.Value); }
        }

        reader.MoveToElement();
        reader.Read();
        row["DataCount"] = Convert.ToInt32(reader.Value);
        row["LastModifiedOn"] = DateTime.Now;
        try
        {
            entries.Rows.Add(row);
            entries.AcceptChanges();
        }
        catch (Exception ex)
        {
            log.Error(ex.Message);
            return entries;
        }
        return entries;
    }

It looks like you're saving each entity to the database as you go. That can be pretty slow going, especially if you have to open a connection, save data, and then close the connection again.

I'd suggest wrapping all the entity changes in a bulk update, so you only have to open a connection to the database and write to it once. You can add all of your entities to the DataTable and, once you're done processing, execute AcceptChanges(). That would likely save you a ton of time.
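
To illustrate the suggestion, here is a minimal sketch that adds every row first and calls AcceptChanges() only once at the end. It assumes the XML contains elements shaped like `<Entries refDataId="10">5</Entries>`; the `EntriesLoader` class and its `Load` method are hypothetical names made up for this example, and the `BeginLoadData()`/`EndLoadData()` calls are an optional extra that the suggestion above does not mention. The likely reason this helps is that each AcceptChanges() call has to walk the rows already in the table, so calling it once per row gets more expensive as the table grows.

```csharp
using System;
using System.Data;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the original code.
static class EntriesLoader
{
    public static DataTable Load(TextReader xml)
    {
        var entries = new DataTable("Entries");
        entries.Columns.Add("DataId", typeof(int));
        entries.Columns.Add("DataCount", typeof(int));
        entries.Columns.Add("LastModifiedOn", typeof(DateTime));

        DateTime loadedAt = DateTime.Now;

        // Optional: suspend notifications, index maintenance and constraints
        // while the rows are being added.
        entries.BeginLoadData();
        try
        {
            using (var reader = new XmlTextReader(xml))
            {
                while (reader.Read())
                {
                    if (reader.NodeType != XmlNodeType.Element || reader.Name != "Entries")
                        continue;

                    DataRow row = entries.NewRow();
                    row["DataId"] = Convert.ToInt32(reader.GetAttribute("refDataId"));

                    // Assumption: the element is not empty, so the next node is its text.
                    reader.Read();
                    row["DataCount"] = Convert.ToInt32(reader.Value);
                    row["LastModifiedOn"] = loadedAt;

                    entries.Rows.Add(row); // no AcceptChanges() here
                }
            }
        }
        finally
        {
            entries.EndLoadData();
        }

        // Commit all pending row changes in a single pass.
        entries.AcceptChanges();
        return entries;
    }
}
```

With the file path from the question, this could be called as `DataTable entriesTable = EntriesLoader.Load(new StreamReader(destFile));`. Whether it removes the whole 7-8 minutes would need to be measured on the real file.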
