
Is using TStringList to load huge text file the best way in Delphi?

What is the best way to load data from a huge text file in Delphi? Is there a component that can load text files very fast?

Let's say I have a text file that contains a database stored in fixed-length format. It contains 150 fields, each at least 50 characters long.

1. I need to load it into memory.
2. I need to parse it and probably store it in a memdataset for processing.

My questions:

1. Is it enough to use the TStringList.LoadFromFile method?
2. Is there a better component for manipulating the text file?
3. Should I use low-level reading from the text file?

Thank you in advance.

TStringList is never the optimal way of working with lots of text, but it's the simplest. If you've got small files on your hands you can use TStringList without issues. Even if you have large files (not huge files) you might implement a version of your algorithm using TStringList for testing purposes, because it's simple and easy to understand.

If your files are large, as they probably are since you call them "databases", you need to look into alternative technologies that will enable you to read only as much as you need from the database. Look into:

  • TFileStream
  • Memory mapped files.

Don't look at the old "file"-based APIs still available in Delphi; they are outdated.

I'm not going to go into details on how to access text using those methods because we've recently had two similar questions on SO:

How Can I Efficiently Read The First Few Lines of Many Files in Delphi

and

Fast Search to see if a String Exists in Large Files with Delphi

Since you're working with fixed-length records, you can build an access class based on TList, with a TWriter and TReader, that takes your record layout into account. You'll have none of the overhead of a TStringList (not that it's a bad thing, but if you don't need it, why have it?) and you can build your own record access into the class. Ultimately it depends on what you are trying to accomplish with the data once you have it loaded into memory. While TStringList is easy to use, it isn't as efficient as "rolling your own".
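A minimal sketch of such an access class follows. It is a simplification, not the TList/TReader/TWriter design described above: it reads records straight from a TFileStream on demand. The unit name, the class name TFixedRecordFile, and its members are all hypothetical, and it assumes every record is exactly ARecordSize bytes (including any line terminator).

```pascal
unit FixedRecordFile;

interface

uses
  Classes, SysUtils;

type
  // Hypothetical class: random access to fixed-length records in a file.
  TFixedRecordFile = class
  private
    FStream: TFileStream;
    FRecordSize: Integer;
    function GetCount: Int64;
  public
    constructor Create(const FileName: string; ARecordSize: Integer);
    destructor Destroy; override;
    // Returns the raw bytes of record Index (0-based).
    function ReadRecord(Index: Int64): AnsiString;
    property Count: Int64 read GetCount;
  end;

implementation

constructor TFixedRecordFile.Create(const FileName: string; ARecordSize: Integer);
begin
  inherited Create;
  if ARecordSize <= 0 then
    raise Exception.Create('Record size must be positive');
  FRecordSize := ARecordSize;
  FStream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
end;

destructor TFixedRecordFile.Destroy;
begin
  FStream.Free;
  inherited;
end;

function TFixedRecordFile.GetCount: Int64;
begin
  // A trailing partial record, if any, is not counted.
  Result := FStream.Size div FRecordSize;
end;

function TFixedRecordFile.ReadRecord(Index: Int64): AnsiString;
begin
  if (Index < 0) or (Index >= Count) then
    raise Exception.CreateFmt('Record index %d out of range', [Index]);
  SetLength(Result, FRecordSize);
  // Fixed-length records make the byte offset a simple multiplication.
  FStream.Position := Index * FRecordSize;
  FStream.ReadBuffer(Result[1], FRecordSize);
end;

end.
```

Because only the requested record is read, memory use stays constant no matter how large the file is. Whether that beats loading everything at once depends on how often each record is revisited.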

However, efficiency in data manipulation may not be that much of an issue, as you are using text files to hold a database. If you just need to read in and make decisions based on data in the file, the more flexible TList may be overkill.

I recommend sticking with TStringList if you find it convenient for your problem. Optimization is a separate concern that can be dealt with later.

If you do need to optimize TStringList, declare a descendant class that overrides the TStrings.LoadFromStream method. Because you know the structure of your files, you can make the loading practically as fast as possible.
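One way such a descendant might look is sketched below. The class name TFixedRecordStrings and its RecordSize property are hypothetical. The sketch assumes Delphi 2009 or later, where TStrings has the two-argument LoadFromStream overload taking a TEncoding; check in your RTL source that LoadFromFile and the one-argument overload route through it. It also assumes records are exactly RecordSize bytes, and it still loads the whole stream into memory.

```pascal
unit FixedRecordStrings;

interface

uses
  Classes, SysUtils;

type
  // Hypothetical descendant: one list item per fixed-length record,
  // split by byte offset instead of by scanning for line breaks.
  TFixedRecordStrings = class(TStringList)
  private
    FRecordSize: Integer;
  public
    procedure LoadFromStream(Stream: TStream; Encoding: TEncoding); overload; override;
    property RecordSize: Integer read FRecordSize write FRecordSize;
  end;

implementation

procedure TFixedRecordStrings.LoadFromStream(Stream: TStream; Encoding: TEncoding);
var
  Bytes: TBytes;
  Total, Offset: Integer;
begin
  if FRecordSize <= 0 then
    raise Exception.Create('RecordSize must be set before loading');
  if Encoding = nil then
    Encoding := TEncoding.Default;  // assumption: ANSI data

  // Read the rest of the stream in one call (limits the file to under 2 GB).
  Total := Integer(Stream.Size - Stream.Position);
  SetLength(Bytes, Total);
  if Total > 0 then
    Stream.ReadBuffer(Bytes[0], Total);

  BeginUpdate;
  try
    Clear;
    Capacity := Total div FRecordSize;  // allocate the item array once
    Offset := 0;
    // A trailing partial record, if any, is ignored.
    while Offset + FRecordSize <= Total do
    begin
      Add(Encoding.GetString(Bytes, Offset, FRecordSize));
      Inc(Offset, FRecordSize);
    end;
  finally
    EndUpdate;
  end;
end;

end.
```

The expected gain comes from skipping the line-break scan and from setting Capacity up front. Both are worth measuring against the stock TStringList on real data before adopting this.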

It is not entirely clear from your question why you need to load the entire file into memory before going on to create an in-memory data set. Are you conflating the two issues? That is, because you need to create an in-memory data set, do you think you first need to load the source data entirely into memory? Or is there some initial pre-processing of the source file which is only possible with the entire file loaded in memory? This is unlikely, and even if it is the case, loading the whole file isn't necessary with a navigable stream object such as a TFileStream.

But I think the answer you are looking for is right there in the question.

If you are loading this file in order to parse it and populate/initialise a further data structure (the data set) for further processing, then using an existing high level data structure is an unnecessary and potentially costly (in terms of time) step.

Use the lowest level means of access that provides the capabilities you need.

In this case a TFileStream will likely provide the best balance of convenience and ease of use.
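As a rough illustration, the console program below streams the file one record at a time and slices fields out by offset. The field count of 150 comes from the question. The width of exactly 50 bytes per field, the absence of record separators, the program name, and the GetField helper are assumptions made for the sketch. The file name is taken from the first command-line argument.

```pascal
program ParseFixed;

{$APPTYPE CONSOLE}

uses
  Classes, SysUtils;

const
  FieldCount = 150;  // from the question
  FieldWidth = 50;   // assumption: every field is exactly 50 bytes
  RecordSize = FieldCount * FieldWidth;

type
  TRecordBuffer = array[0..RecordSize - 1] of AnsiChar;

// Hypothetical helper: returns field FieldIndex (0-based) of a record,
// with trailing padding removed.
function GetField(const Rec: TRecordBuffer; FieldIndex: Integer): string;
var
  Raw: AnsiString;
begin
  SetString(Raw, PAnsiChar(@Rec[FieldIndex * FieldWidth]), FieldWidth);
  Result := TrimRight(string(Raw));
end;

var
  Stream: TFileStream;
  Rec: TRecordBuffer;
  RecNo: Integer;
begin
  Stream := TFileStream.Create(ParamStr(1), fmOpenRead or fmShareDenyWrite);
  try
    RecNo := 0;
    // Only one record is held in memory at a time; stop at a short read.
    while Stream.Read(Rec, SizeOf(Rec)) = SizeOf(Rec) do
    begin
      // This is where the fields would be copied into the in-memory dataset.
      Writeln(RecNo, ': ', GetField(Rec, 0));
      Inc(RecNo);
    end;
  finally
    Stream.Free;
  end;
end.
```

Reading one 7,500-byte record per call keeps the code simple. If profiling shows the per-call overhead matters, reading a block of many records at once is the usual next step.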

