
Best approach to reading large files

I'm currently working on a program that reads and writes an XML file. While this is a simple task, I'm concerned about future issues.

My code reads the streamed data from the XML and checks every <x> element until one that matches a criterion is found. This works quite fast, since the file currently has only about 100 <x> elements, but as more elements are added the task will become much slower, especially if the matching element is the last one in a very large file.
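
For reference, a minimal sketch of that kind of streaming scan, assuming a file called data.xml, <x> elements carrying an id attribute, and "42" as the match criterion (all of these names are placeholders, not part of the actual program):

using System;
using System.Xml;

class StreamingScan
{
    static void Main()
    {
        // Stream the file so only the current element is held in memory;
        // "data.xml", the element name "x" and the "id" attribute are placeholders.
        using (XmlReader reader = XmlReader.Create("data.xml"))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "x")
                {
                    // Stop as soon as the first element matching the criterion is found.
                    if (reader.GetAttribute("id") == "42")
                    {
                        Console.WriteLine("Match: " + reader.ReadOuterXml());
                        break;
                    }
                }
            }
        }
    }
}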

What approach should I take to minimize the impact of this? I was thinking about splitting the file into smaller ones (containing up to 1000 elements each) and reading from several of those files at the same time. Is this a proper approach?

I'm coding in C#, in case it's relevant for a language-specific approach.

You should use one of the available XML APIs of .NET. Which one depends on the size of the XML files. In this question there is a discussion of XDocument (LINQ to XML) versus XmlReader. To summarize: if your file fits in memory, use XDocument; if not, use XmlReader.
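
As an illustration of the in-memory XDocument route (only appropriate while the file stays small enough to load whole), here is a sketch using the same placeholder names as above:

using System;
using System.Linq;
using System.Xml.Linq;

class InMemoryQuery
{
    static void Main()
    {
        // Loads the whole document into memory; file name, element name
        // and the "id" criterion are placeholders.
        XDocument doc = XDocument.Load("data.xml");

        XElement match = doc.Descendants("x")
                            .FirstOrDefault(e => (string)e.Attribute("id") == "42");

        Console.WriteLine(match != null ? match.ToString() : "No match found");
    }
}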

This sounds like a batch process in your case. Maybe this link will help you: https://www.codeproject.com/Articles/1155341/Batch-Processing-Patterns-with-Taskling. I have never done this in C#, only in Java, but it's a good way to solve this kind of task. Hope it helps.
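
I can't vouch for Taskling's API, but as a rough sketch of the batch idea in plain C# (grouping the <x> elements into fixed-size batches of 1000, the size suggested in the question; file and element names are placeholders):

using System;
using System.Collections.Generic;
using System.Xml.Linq;

class BatchSketch
{
    // Splits a sequence into fixed-size batches.
    static IEnumerable<List<XElement>> Batch(IEnumerable<XElement> source, int size)
    {
        var batch = new List<XElement>(size);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<XElement>(size);
            }
        }
        if (batch.Count > 0)
            yield return batch;
    }

    static void Main()
    {
        // "data.xml" and the element name "x" are placeholders.
        XDocument doc = XDocument.Load("data.xml");

        int batchNumber = 0;
        foreach (var batch in Batch(doc.Descendants("x"), 1000))
        {
            batchNumber++;
            Console.WriteLine($"Processing batch {batchNumber} with {batch.Count} elements");
            // Each batch could then be processed independently, e.g. on its own task.
        }
    }
}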
