I have a text file like this:
"01","AAA","AAAAA"
"02","BBB","BBBBB","BBBBBBBB"
"03","CCC"
"04","DDD","DDDDD"
I want to load this text file's data into a temp table in a Sybase database, so I wrote a program that reads the file line by line until EOF. When the file is small, reading line by line is fast. But when the file is very big (it can be more than 500 MB), reading line by line is too slow. I think the line-by-line method is not suitable for huge text files, so I need another way to load the data into the database. Any suggestions? Example code:
var
  myFile : TextFile;
  text : string;
begin
  // Open Test.txt for reading
  AssignFile(myFile, 'Test.txt');
  Reset(myFile);
  // Read each line and append it to the temp table
  while not Eof(myFile) do
  begin
    ReadLn(myFile, text);
    TempTable.Append;
    TempTable.FieldByName('Field1').AsString := Copy(text,2,2);
    TempTable.FieldByName('Field2').AsString := Copy(text,7,3);
    TempTable.FieldByName('Field3').AsString := Copy(text,13,5);
    TempTable.FieldByName('Field4').AsString := Copy(text,21,8);
    TempTable.Post;
  end;
  // Close the file
  CloseFile(myFile);
end;
Text files normally have a very small buffer (128 bytes by default). Look into using the SetTextBuf procedure to increase your performance.
var
  myFile : TextFile;
  text : string;
  myFileBuffer: Array[1..32768] of byte;
begin
  // Open Test.txt for reading with a 32 KB buffer
  AssignFile(myFile, 'Test.txt');
  SetTextBuf(myFile, myFileBuffer);
  Reset(myFile);
  // Read the file line by line
  while not Eof(myFile) do
  begin
    ReadLn(myFile, text);
  end;
  // Close the file
  CloseFile(myFile);
end;
Some general tips:
- Make sure TempTable is in memory, or use a fast database engine. Take a look at SQLite3 or other engines (like Firebird embedded, NexusDB or ElevateDB) as possible database alternatives;
- If you are not using a TTable but a true database, make sure you nest the inserts within a transaction;
- The FieldByName('...') method is known to be very slow: use local TField variables instead.
So your code may be:
var
  myFile : TextFile;
  myFileBuffer: array[word] of byte;
  text : string;
  Field1, Field2, Field3, Field4: TField;
begin
  // Set Field* local variables for speed within the main loop
  Field1 := TempTable.FieldByName('Field1');
  Field2 := TempTable.FieldByName('Field2');
  Field3 := TempTable.FieldByName('Field3');
  Field4 := TempTable.FieldByName('Field4');
  // Open Test.txt for reading
  AssignFile(myFile, 'Test.txt');
  SetTextBuf(myFile, myFileBuffer); // use 64 KB read buffer
  Reset(myFile);
  // Read each line and append it to the temp table
  while not Eof(myFile) do
  begin
    ReadLn(myFile, text);
    TempTable.Append;
    Field1.AsInteger := StrToInt(Copy(text,2,2));
    Field2.AsString := Copy(text,7,3);
    Field3.AsString := Copy(text,13,5);
    Field4.AsString := Copy(text,21,8);
    TempTable.Post;
  end;
  // Close the file
  CloseFile(myFile);
end;
You can achieve very high speed with embedded engines, with almost no size limit other than your storage. See for instance how fast we can add content to a SQLite3 database in our ORM: about 130,000 / 150,000 rows per second in a database file, including all ORM marshalling. I also found out that SQLite3 generates much smaller database files than the alternatives. If you want fast retrieval of any field, do not forget to define INDEXes in your database, if possible after the insertion of the row data (for better speed). For SQLite3, there is already an ID/RowID integer primary key available, which maps to your first data field, I suppose. This ID/RowID integer primary key is already indexed by SQLite3. By the way, our ORM now supports FireDAC / AnyDAC and its advanced Array DML feature.
In addition to what has already been said, I would also avoid using any TTable component. You would be better off using a TQuery-type component (depending on the access layer you're using). Something like this:
// Set the SQL text once, before the import loop
qryImport.SQL.Text := 'Insert Into MyTable Values (:Field1, :Field2, :Field3, :Field4)';

Procedure ImportRecord(Const pField1, pField2, pField3, pField4 : String);
Begin
  qryImport.Close;
  qryImport.Params[0].AsString := pField1;
  qryImport.Params[1].AsString := pField2;
  qryImport.Params[2].AsString := pField3;
  qryImport.Params[3].AsString := pField4;
  qryImport.ExecSQL;
End;
Hope this helps.
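The parameterised insert above can be combined with the transaction tip from the earlier answer. What follows is only a minimal sketch, assuming the BDE components from the DBTables unit (TDatabase and TQuery); other access layers expose similar but not identical calls. The unit name, the ImportFile procedure, the CommitEvery batch size and the MyTable layout are hypothetical, and the Copy offsets assume the fixed field widths shown in the question.

```pascal
unit ImportSketch; // hypothetical unit name

interface

uses
  DB, DBTables;

// Reads FileName line by line and inserts each row through a prepared,
// parameterised query, committing in batches instead of once per row.
procedure ImportFile(dbImport: TDatabase; qryImport: TQuery;
  const FileName: string);

implementation

uses
  SysUtils;

const
  CommitEvery = 10000; // hypothetical batch size, tune it for your server

procedure ImportFile(dbImport: TDatabase; qryImport: TQuery;
  const FileName: string);
var
  F: TextFile;
  Buf: array[0..65535] of Byte; // 64 KB read buffer
  Line: string;
  Pending: Integer;
begin
  qryImport.SQL.Text :=
    'insert into MyTable values (:Field1, :Field2, :Field3, :Field4)';
  qryImport.Prepare; // prepare the statement once, not once per row

  AssignFile(F, FileName);
  SetTextBuf(F, Buf);
  Reset(F);
  try
    Pending := 0;
    dbImport.StartTransaction;
    try
      while not Eof(F) do
      begin
        ReadLn(F, Line);
        // Assumption: fixed field widths, as in the question's sample
        qryImport.Params[0].AsString := Copy(Line, 2, 2);
        qryImport.Params[1].AsString := Copy(Line, 7, 3);
        qryImport.Params[2].AsString := Copy(Line, 13, 5);
        qryImport.Params[3].AsString := Copy(Line, 21, 8);
        qryImport.ExecSQL;
        Inc(Pending);
        if Pending >= CommitEvery then
        begin
          // Commit the current batch and open the next one
          dbImport.Commit;
          dbImport.StartTransaction;
          Pending := 0;
        end;
      end;
      dbImport.Commit; // commit the final, possibly partial, batch
    except
      dbImport.Rollback; // undo the batch that was in progress
      raise;
    end;
  finally
    CloseFile(F);
  end;
end;

end.
```

Committing in batches keeps each transaction small; rows from batches that were already committed stay in the table if a later batch fails, so a rerun would have to account for them.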
Another approach would be to use memory-mapped files (you can google them or go to torry.net to find implementations). This would not work well for files larger than 1 GB in Win32; in Win64 you can map virtually any file. It would expose the whole file as a PAnsiChar that you can scan like one large buffer, searching for #10 and #13 (alone or in pairs) and thus splitting the lines manually.
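To make the idea concrete, here is a rough sketch that calls the Win32 API directly (CreateFile, CreateFileMapping, MapViewOfFile) rather than a ready-made component. The unit name, ScanMappedLines and the TLineProc callback type are hypothetical names made up for illustration. The sketch assumes the file is smaller than 4 GB and that the whole file fits into the address space of the process.

```pascal
unit MapScanSketch; // hypothetical unit name

interface

type
  // Hypothetical callback type: receives each line without its line break
  TLineProc = procedure(const Line: AnsiString);

// Maps FileName read-only and calls OnLine once per line.
procedure ScanMappedLines(const FileName: string; OnLine: TLineProc);

implementation

uses
  Windows, SysUtils;

procedure ScanMappedLines(const FileName: string; OnLine: TLineProc);
var
  hFile, hMap: THandle;
  Size: DWORD;
  Base, P, PEnd, LineStart: PAnsiChar;
  Line: AnsiString;
begin
  hFile := CreateFile(PChar(FileName), GENERIC_READ, FILE_SHARE_READ, nil,
    OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
  if hFile = INVALID_HANDLE_VALUE then
    RaiseLastOSError;
  try
    Size := GetFileSize(hFile, nil); // sketch: files below 4 GB only
    if Size = $FFFFFFFF then
      RaiseLastOSError;
    if Size = 0 then
      Exit; // an empty file cannot be mapped
    hMap := CreateFileMapping(hFile, nil, PAGE_READONLY, 0, 0, nil);
    if hMap = 0 then
      RaiseLastOSError;
    try
      Base := MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0);
      if Base = nil then
        RaiseLastOSError;
      try
        P := Base;
        PEnd := Base;
        Inc(PEnd, Size);
        LineStart := P;
        while P < PEnd do
        begin
          if (P^ = #13) or (P^ = #10) then
          begin
            SetString(Line, LineStart, P - LineStart);
            OnLine(Line);
            // Treat a CR LF pair as a single line break
            if (P^ = #13) and (P + 1 < PEnd) and (P[1] = #10) then
              Inc(P);
            Inc(P);
            LineStart := P;
          end
          else
            Inc(P);
        end;
        // Last line when the file does not end with a line break
        if LineStart < PEnd then
        begin
          SetString(Line, LineStart, PEnd - LineStart);
          OnLine(Line);
        end;
      finally
        UnmapViewOfFile(Base);
      end;
    finally
      CloseHandle(hMap);
    end;
  finally
    CloseHandle(hFile);
  end;
end;

end.
```

The callback would be the place to split the fields and insert the row. Copying each line into an AnsiString keeps the sketch simple; parsing the fields directly from the mapped buffer would avoid that copy.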
If you use (or don't mind starting to use) the JEDI JVCL, they have a TJvCSVDataSet which allows you to simply use your CSV file like any other dataset in Delphi, including being able to define persistent fields and use "standard" Delphi database functionality:
JvCSVDataSet1.FileName := 'MyFile.csv';
JvCSVDataSet1.Open;
while not JvCSVDataSet1.Eof do
begin
  TempTable.Append; // Posts last appended row automatically;
                    // no need to call Post here.
  // Assumes TempTable has same # of fields in the
  // same order
  for i := 0 to JvCSVDataSet1.FieldCount - 1 do
    TempTable.Fields[i].AsString := JvCSVDataSet1.Fields[i].AsString;
  JvCSVDataSet1.Next;
end;
// Post the last row appended when the loop above exited
if TempTable.State in dsEditModes then
  TempTable.Post;
In Delphi 7 you can use TurboPower SysTools' TStAnsiTextStream to read and write in a line-oriented way, using the thread-safe TStream implementation rather than the unsafe old Pascal file interface. In later Delphi versions you will find something similar in the standard RTL (although the implementation is a little different), but Delphi 7 didn't offer much for text file manipulation.
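For those later Delphi versions, one RTL class of this kind is TStreamReader from the Classes unit (Delphi 2009 onward, so not available in Delphi 7). The sketch below only shows the reading loop; the file name, the ANSI encoding and the 64 KB buffer size are assumptions for illustration.

```pascal
program StreamReaderSketch; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  Reader: TStreamReader;
  Line: string;
  Count: Integer;
begin
  // Assumptions: 'Test.txt' is ANSI-encoded; 64 KB read buffer
  Reader := TStreamReader.Create('Test.txt', TEncoding.Default, False, 65536);
  try
    Count := 0;
    while not Reader.EndOfStream do
    begin
      Line := Reader.ReadLine;
      Inc(Count); // parse Line and insert it into the database here
    end;
    WriteLn(Count, ' lines read');
  finally
    Reader.Free;
  end;
end.
```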