What is the equivalent 'streams' code of TStringList.SaveToFile and which is better for large amounts of data?
The following console application uses TStringList.SaveToFile to write multiple lines to a text file:
program Project1;

{$APPTYPE CONSOLE}
{$R *.res}

uses
  System.SysUtils,
  System.Classes;

var
  i: Integer;
  a, b, c: Single;
  myString: String;
  myStringList: TStringList;

begin
  try
    Randomize;
    myStringList := TStringList.Create;
    for i := 0 to 1000000 do
    begin
      a := Random;
      b := Random;
      c := Random;
      myString := FloatToStr(a) + Char(9) + FloatToStr(b) + Char(9) + FloatToStr(c);
      myStringList.Add(myString);
    end;
    myStringList.SaveToFile('Output.txt');
    myStringList.Free;
    WriteLn('Done');
    Sleep(10000);
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
It takes around 3 seconds to write a >50MB file with 1000001 lines and seems to work fine. However, many people advocate using streams for such processes. What would the stream equivalent be, and what are the advantages/disadvantages of using it compared to TStringList.SaveToFile?
It may be faster to write directly to a stream, or it may not. I suggest you try it out and time both options. Writing to a stream looks like this:
for i := 0 to 1000000 do
begin
  a := Random;
  b := Random;
  c := Random;
  myString := FloatToStr(a) + Char(9) + FloatToStr(b) + Char(9) +
    FloatToStr(c) + sLineBreak;
  Stream.WriteBuffer(myString[1], Length(myString)*SizeOf(myString[1]));
end;
To have any hope of this version being fast, you need to use a buffered stream. Try this one: Buffered files (for faster disk access).
The code above will output UTF-16 text on modern Delphi. If you want to output ANSI text, simply declare myString as AnsiString.
I'll let you do the timing, but my guess is that this variant performs similarly to the string list. I suspect that the time is spent calling Random and FloatToStr. I expect that the file saving with the string list is already very fast.
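One way to do that timing is sketched below. It is only an illustration: the procedure names, the output file names and the use of TStopwatch from System.Diagnostics are assumptions rather than anything from the question, and the stream variant deliberately uses a plain unbuffered TFileStream so the cost of many small writes shows up in the numbers.

```pascal
program TimingSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.Classes,
  System.Diagnostics; // TStopwatch (assumed available, XE2 or later)

const
  LineCount = 1000001; // same number of lines as in the question

// Builds one tab-separated line of three random values.
function MakeLine: string;
var
  a, b, c: Single;
begin
  a := Random;
  b := Random;
  c := Random;
  Result := FloatToStr(a) + #9 + FloatToStr(b) + #9 + FloatToStr(c);
end;

// Hypothetical helper: collect everything in memory, then save once.
procedure ViaStringList(const FileName: string);
var
  i: Integer;
  List: TStringList;
begin
  List := TStringList.Create;
  try
    for i := 1 to LineCount do
      List.Add(MakeLine);
    List.SaveToFile(FileName);
  finally
    List.Free;
  end;
end;

// Hypothetical helper: write each line straight to an unbuffered stream.
procedure ViaStream(const FileName: string);
var
  i: Integer;
  Line: string;
  Stream: TFileStream;
begin
  Stream := TFileStream.Create(FileName, fmCreate);
  try
    for i := 1 to LineCount do
    begin
      Line := MakeLine + sLineBreak; // never empty, so Line[1] is safe
      Stream.WriteBuffer(Line[1], Length(Line) * SizeOf(Line[1]));
    end;
  finally
    Stream.Free;
  end;
end;

var
  SW: TStopwatch;
begin
  Randomize;

  SW := TStopwatch.StartNew;
  ViaStringList('Output_list.txt');
  Writeln('TStringList: ', SW.ElapsedMilliseconds, ' ms');

  SW := TStopwatch.StartNew;
  ViaStream('Output_stream.txt');
  Writeln('TFileStream: ', SW.ElapsedMilliseconds, ' ms');
end.
```

Both timings include the cost of Random and FloatToStr, so the difference between the two numbers is what reflects the I/O strategy. Note that the two files are not byte-identical: the stream variant writes UTF-16 as described above.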
Putting speed to one side, there is another benefit of this approach. In the string list approach, as per the code in the question, the entire content of the text file is stored in memory. And when you save the file, another copy is made as part of the save procedure. So you will have two copies of the entire file in memory.

In contrast, when saving directly to a stream, the only memory requirement is whatever buffer your stream class uses. For a 50MB file as per the question, there's likely no real problem with either approach. For a much larger file, you will run into out of memory errors if you try to hold the entire file in memory.
Personally though, I'd consider making use of the TStreamWriter class. This useful class separates the concerns of writing data (text, values etc.) from the concern of pushing to a stream. Your code would become:
Writer := TStreamWriter.Create(Stream); // use whatever stream you like
try
  for i := 0 to 1000000 do
  begin
    a := Random;
    b := Random;
    c := Random;
    Writer.WriteLine(FloatToStr(a) + Char(9) + FloatToStr(b) + Char(9) +
      FloatToStr(c));
  end;
finally
  Writer.Free;
end;
The TStreamWriter implements buffering with a 1KB buffer, so you can use TFileStream and expect to get reasonable performance.
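For completeness, here is a minimal sketch of that fragment as a whole program, with a TFileStream underneath the writer. The 64KB buffer size and the choice of TEncoding.UTF8 are assumptions made for illustration, and the three-argument constructor overload is assumed to be available in your Delphi version.

```pascal
program StreamWriterSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.Classes;

var
  i: Integer;
  a, b, c: Single;
  Stream: TFileStream;
  Writer: TStreamWriter;

begin
  try
    Randomize;
    Stream := TFileStream.Create('Output.txt', fmCreate);
    try
      // Encoding and buffer size are illustrative choices, not requirements.
      // Depending on the Delphi version, the writer may emit a byte order
      // mark before the first line.
      Writer := TStreamWriter.Create(Stream, TEncoding.UTF8, 64 * 1024);
      try
        for i := 0 to 1000000 do
        begin
          a := Random;
          b := Random;
          c := Random;
          Writer.WriteLine(FloatToStr(a) + #9 + FloatToStr(b) + #9 +
            FloatToStr(c));
        end;
      finally
        // Freeing the writer flushes any text still held in its buffer.
        // The writer does not own Stream here, so Stream is freed separately.
        Writer.Free;
      end;
    finally
      Stream.Free;
    end;
    Writeln('Done');
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```

The writer is freed before the stream so that buffered text reaches the file while the stream is still open.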
I would recommend that you choose the technique that leads to the most readable code. If performance becomes an issue, you can optimise that later. My personal preference would be for TStreamWriter. This gives very clean and readable code, yet also excellent separation of content generation from streaming. The performance is perfectly reasonable also.
A TFileStream based solution would look as follows, but there are some important points:

- The TFileStream code is slower. There's no buffering in TFileStream, and writing 20 bytes at a time to file is not efficient. The TStringList buffers everything in RAM and saves it all at once. That's optimal, but it uses a lot of RAM.
- For the TStringList-based variant, 50% of the time is spent in Random, as expected actually.
- For the TFileStream solution to become more efficient, you'd need to roll a buffering scheme so you'd write a reasonable amount to disk each time (example: 4KB).

Code:
program Project9;

{$APPTYPE CONSOLE}
{$R *.res}

uses
  SysUtils,
  Classes,
  DateUtils;

var
  i: Integer;
  a, b, c: Single;
  myString: AnsiString;
  StartTime: TDateTime;
  F: TFileStream;

begin
  try
    Randomize;
    StartTime := Now;
    F := TFileStream.Create('Output.txt', fmCreate);
    try
      for i := 0 to 1000000 do
      begin
        a := Random;
        b := Random;
        c := Random;
        // Build one ANSI line: three values separated by tabs, ending in CRLF
        myString := AnsiString(Format('%f'#9'%f'#9'%f'#13#10, [a, b, c]));
        // One small, unbuffered write per line
        F.WriteBuffer(myString[1], Length(myString));
      end;
    finally
      F.Free;
    end;
    // Elapsed seconds and milliseconds (meaningful for runs under a minute)
    WriteLn('Done. ', SecondOf(Now-StartTime), ':', MilliSecondOf(Now-StartTime));
    ReadLn;
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
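The buffering scheme mentioned in the points above is not shown in the answer. A rough sketch of one possible approach follows, assuming a fixed 4KB byte buffer that is flushed to the TFileStream whenever the next line would not fit. The names BufSize, FlushBuffer and WriteText are hypothetical helpers invented for this illustration.

```pascal
program BufferedWriteSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  Classes;

const
  BufSize = 4096; // illustrative size, following the "4KB" suggestion

var
  F: TFileStream;
  Buf: array[0..BufSize - 1] of Byte;
  Used: Integer = 0; // number of bytes currently held in Buf

// Writes whatever is pending in the buffer to the file.
procedure FlushBuffer;
begin
  if Used > 0 then
  begin
    F.WriteBuffer(Buf, Used);
    Used := 0;
  end;
end;

// Appends S to the buffer, flushing first if it would not fit.
procedure WriteText(const S: AnsiString);
var
  Len: Integer;
begin
  Len := Length(S);
  if Len = 0 then
    Exit;
  if Len > BufSize - Used then
    FlushBuffer;
  if Len > BufSize then
    F.WriteBuffer(S[1], Len) // larger than the whole buffer: write directly
  else
  begin
    Move(S[1], Buf[Used], Len);
    Inc(Used, Len);
  end;
end;

var
  i: Integer;
  a, b, c: Single;

begin
  try
    Randomize;
    F := TFileStream.Create('Output.txt', fmCreate);
    try
      for i := 0 to 1000000 do
      begin
        a := Random;
        b := Random;
        c := Random;
        WriteText(AnsiString(Format('%f'#9'%f'#9'%f'#13#10, [a, b, c])));
      end;
      FlushBuffer; // do not lose the final partial buffer
    finally
      F.Free;
    end;
    WriteLn('Done');
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```

The final FlushBuffer call matters: without it, up to one buffer's worth of text would never reach the file. Whether 4KB is the best size is something to measure rather than assume.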