Large XML reading using XmlTextReader is very slow
I'd like to think my problem is not unusual, given that my XML file is only 3 MB. It contains close to 60 thousand records. I am having a hard time reducing the processing time: it currently takes close to 7-8 minutes to read the file and insert the records into DataTables. (Please note that I am NOT inserting into a database yet, so database transactions are not the issue here.)
Here is the code I wrote. Any suggestions for reducing the processing time will be greatly appreciated.
    XmlTextReader reader = new XmlTextReader(destFile);
    while (reader.Read())
    {
        switch (reader.NodeType)
        {
            case XmlNodeType.Element:
                // Dispatch on the element name; each helper adds one row to its table.
                if (reader.Name == "Report")
                {
                    FileDataTable = UpdateReportTable(FileDataTable, reader);
                }
                else if (reader.Name == "Name")
                {
                    NameTable = UpdateNameTable(NameTable, reader);
                }
                else if (reader.Name == "Entries")
                {
                    EntriesTable = UpdateEntriesTable(EntriesTable, reader);
                }
                reader.MoveToElement();
                break;
            case XmlNodeType.Text:
                break;
            case XmlNodeType.EndElement:
                break;
        }
    }
Then I have the following function to get the values into a DataTable. The "Entries" elements take 90% of the time, so I am posting that code; the other functions are similar.
    private static DataTable UpdateEntriesTable(DataTable entries, XmlTextReader reader)
    {
        DataRow row = entries.NewRow();
        for (int attInd = 0; attInd < reader.AttributeCount; attInd++)
        {
            reader.MoveToAttribute(attInd);
            if (reader.Name == "refDataId") { row["DataId"] = Convert.ToInt32(reader.Value); }
        }
        reader.MoveToElement();
        reader.Read();
        row["DataCount"] = Convert.ToInt32(reader.Value);
        row["LastModifiedOn"] = DateTime.Now;
        try
        {
            entries.Rows.Add(row);
            entries.AcceptChanges();
        }
        catch (Exception ex)
        {
            log.Error(ex.Message);
            return entries;
        }
        return entries;
    }
It looks like you're saving each entity to the database as you go. That can be pretty slow going, especially if you have to open a connection, save data, and then close the connection again.
I'd suggest attempting to wrap all the entity changes up in a bulk update, so you only have to open a connection to the database and write to it once. You can add all of your entities to the DataTable and then call AcceptChanges() once, after you're done processing. That would likely save you a ton of time.
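To make the suggestion concrete, here is a minimal sketch of the "Entries" path with AcceptChanges() moved out of the per-row code. As far as I know, DataTable.AcceptChanges() visits every row in the table, so calling it once per added row makes the total work grow roughly quadratically with the row count; that is an inference on my part, not something measured against the original file. The class name EntriesLoader, the method LoadEntries, and the column types are hypothetical, and the sketch assumes each Entries element holds its count as plain text, as the question's code implies.

```csharp
using System;
using System.Data;
using System.IO;
using System.Xml;

static class EntriesLoader
{
    // Hypothetical helper: collects every <Entries refDataId="...">count</Entries>
    // element into a DataTable and commits the rows once at the end.
    public static DataTable LoadEntries(TextReader xml)
    {
        // Assumed schema, mirroring the column names used in the question.
        var entries = new DataTable("Entries");
        entries.Columns.Add("DataId", typeof(int));
        entries.Columns.Add("DataCount", typeof(int));
        entries.Columns.Add("LastModifiedOn", typeof(DateTime));

        using (XmlReader reader = XmlReader.Create(xml))
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element || reader.Name != "Entries")
                    continue;

                DataRow row = entries.NewRow();
                // GetAttribute looks the attribute up by name, so no attribute loop is needed.
                row["DataId"] = Convert.ToInt32(reader.GetAttribute("refDataId"));
                // Advance to the element's text node, as the question's code does.
                reader.Read();
                row["DataCount"] = Convert.ToInt32(reader.Value);
                row["LastModifiedOn"] = DateTime.Now;
                entries.Rows.Add(row); // no AcceptChanges() here
            }
        }

        // Commit all added rows in a single pass.
        entries.AcceptChanges();
        return entries;
    }
}
```

It could be called as EntriesLoader.LoadEntries(File.OpenText(destFile)). The same change should apply to the Report and Name helpers if they follow the same pattern.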