
Fast CSV parser in C++

I am trying to read a .csv file with 20k+ lines, and each line has ~300 fields.

I am using my own code to read it line by line, split each line into fields, and convert each field to the corresponding data type (integer, double, etc.). The data is then passed to class objects through their constructors.

However, this is not very efficient: it takes about 1 minute to read the 20k+ lines and create the 20k+ objects.

I've searched for fast CSV parsers and found many options. I tried some of them, but I am not satisfied with their performance.

Does anyone have a better method to read large .csv files? Many thanks in advance.

An efficient way to parse, or for that matter process, a file is to read as much of it into memory as possible before you start parsing.

File I/O has been, since the dawn of computers, one of the slower parts of a computer system. As a rough illustration, parsing a piece of data may take about 1 microsecond, while reading it from a hard drive may take about 1 millisecond (1000 microseconds).

I've made programs faster by allocating a large array, reading a block of the file into it, processing the data in the array, and repeating until the entire file has been processed.
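Here is a minimal sketch of that idea, not a complete CSV parser. It assumes C++17, a file small enough to fit in memory in one read (so there is no repeat loop), and simple CSV with no quoted fields or embedded commas. The names `read_whole_file`, `split_fields`, `parse_buffer` and `to_int` are made up for this example; `to_int` only handles integers and stands in for whatever conversions your fields need.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <fstream>
#include <string>
#include <string_view>
#include <system_error>
#include <vector>

// Read the whole file into one contiguous buffer with a single large read.
// Returns an empty string if the file cannot be opened or is empty.
std::string read_whole_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return {};
    const std::streamsize size = in.tellg();
    if (size <= 0) return {};
    std::string buffer(static_cast<std::size_t>(size), '\0');
    in.seekg(0);
    in.read(buffer.data(), size);
    buffer.resize(static_cast<std::size_t>(in.gcount()));
    return buffer;
}

// Split one line into fields without copying; the views point into the buffer.
// Assumption: no quoted fields, so every comma is a separator.
void split_fields(std::string_view line, std::vector<std::string_view>& out) {
    out.clear();
    std::size_t start = 0;
    while (true) {
        const std::size_t comma = line.find(',', start);
        if (comma == std::string_view::npos) {
            out.push_back(line.substr(start));
            return;
        }
        out.push_back(line.substr(start, comma - start));
        start = comma + 1;
    }
}

// Walk the buffer line by line and hand each non-empty row to on_row.
template <typename RowHandler>
void parse_buffer(std::string_view data, RowHandler&& on_row) {
    std::vector<std::string_view> fields;  // reused for every line
    std::size_t pos = 0;
    while (pos < data.size()) {
        std::size_t eol = data.find('\n', pos);
        if (eol == std::string_view::npos) eol = data.size();
        std::string_view line = data.substr(pos, eol - pos);
        if (!line.empty() && line.back() == '\r') line.remove_suffix(1);  // CRLF
        if (!line.empty()) {
            split_fields(line, fields);
            assert(!fields.empty());  // split_fields always yields at least one field
            on_row(fields);
        }
        pos = eol + 1;
    }
}

// Convert one field to an integer without allocating.
// Returns false if the field is not entirely a valid integer.
bool to_int(std::string_view field, long& value) {
    const char* first = field.data();
    const char* last = first + field.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    return ec == std::errc() && ptr == last;
}
```

You would call it along the lines of `std::string buf = read_whole_file("data.csv"); parse_buffer(buf, [](const std::vector<std::string_view>& row) { /* build your object here */ });`. The views in `row` are only valid while `buf` is alive, and they are overwritten on the next line, so copy anything you need to keep.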

Another technique is called memory mapping, where the OS handles reading the file into memory as needed.
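As a rough sketch of memory mapping, assuming a POSIX system (Linux, macOS; Windows has a different API) and C++17: the hypothetical `MappedFile` class below maps a file read-only with `mmap` and exposes it as a `std::string_view`, so the parsing code can treat the file as one big array. `count_lines` is just a placeholder for real parsing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical RAII wrapper around a read-only memory mapping of a file.
class MappedFile {
public:
    explicit MappedFile(const char* path) {
        const int fd = ::open(path, O_RDONLY);
        if (fd < 0) return;
        struct stat st{};
        if (::fstat(fd, &st) == 0 && st.st_size > 0) {
            void* p = ::mmap(nullptr, static_cast<std::size_t>(st.st_size),
                             PROT_READ, MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED) {
                data_ = static_cast<const char*>(p);
                size_ = static_cast<std::size_t>(st.st_size);
            }
        }
        ::close(fd);  // the mapping stays valid after the descriptor is closed
    }

    ~MappedFile() {
        if (data_) ::munmap(const_cast<char*>(data_), size_);
    }

    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    // False if the file could not be opened or mapped, or was empty.
    bool ok() const { return data_ != nullptr; }

    // View over the mapped bytes; only valid while this object is alive.
    std::string_view view() const {
        assert(ok());
        return {data_, size_};
    }

private:
    const char* data_ = nullptr;
    std::size_t size_ = 0;
};

// Placeholder for real parsing: count newline characters in the mapped data.
std::size_t count_lines(std::string_view data) {
    return static_cast<std::size_t>(std::count(data.begin(), data.end(), '\n'));
}
```

Usage would look something like `MappedFile f("data.csv"); if (f.ok()) { auto n = count_lines(f.view()); }`. The OS pages the file in as the view is read, so nothing is copied into a buffer of your own.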

Please edit your post to show the code where the bottleneck is.
