
Efficiently reading a CSV file with Windows newlines on Linux in Python

The following works under Windows for reading CSV files line by line:

f = open(filename, 'r')
for line in f:
    ...

However, after copying the CSV file to a Linux server, it fails.

I should mention that performance is an issue, as the CSV files are huge. I am therefore concerned about the string copying involved in things like strip.

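Python has built-in support for Windows, Linux and Mac line endings:

# Python 2: 'U' enables universal newlines. Python 3 text mode does this by
# default, and the 'U' flag was removed in Python 3.11, so use plain 'r' there.
f = open(filename, 'rtU')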

for line in f:
    ...
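
On Python 3 the same effect needs no special mode flag: text mode with the default newline=None translates '\r\n' and '\r' to '\n' while reading. A minimal sketch of that, where the helper name read_lines and the temporary sample file are made up for illustration:

```python
import os
import tempfile


def read_lines(path):
    """Yield lines without their trailing newline, whatever the line-ending style."""
    # newline=None (the default) turns on universal newlines in text mode,
    # so '\r\n' and '\r' both arrive as '\n'.
    with open(path, "r", newline=None) as f:
        for line in f:
            yield line.rstrip("\n")


# Build a small file with Windows line endings (hypothetical sample data).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"a,b\r\n1,2\r\n")
    sample_path = tmp.name

lines = list(read_lines(sample_path))
os.remove(sample_path)
print(lines)
```

The rstrip call still copies each line, so if that cost matters it may be worth measuring before optimising further.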

If you really don't want slow string operations, you should convert the line endings before processing the files. You can either use dos2unix (can be found in the Debian package "tofrodos") or, more easily, transfer the files in FTP text mode, which should do the conversion automatically.

If performance matters, why not use csv.reader?

Hmm... you have a CSV file and you are using Python, so why not use the Python csv module to read the file?
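
The csv module suggestion might look roughly like the sketch below on Python 3. The helper name iter_rows and the sample data are assumptions for illustration; opening the file with newline='' follows the csv module's own documentation, and the reader then copes with '\r\n' itself.

```python
import csv
import os
import tempfile


def iter_rows(path):
    """Yield each CSV record as a list of strings."""
    # newline='' passes line endings through untouched so that the csv
    # reader, not the file object, decides where records end.
    with open(path, "r", newline="") as f:
        for row in csv.reader(f):
            yield row


# Hypothetical sample file with Windows line endings.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"name,qty\r\nwidget,3\r\n")
    csv_path = tmp.name

rows = list(iter_rows(csv_path))
os.remove(csv_path)
print(rows)
```

Because the reader yields already-split fields, no manual strip or split on each line is needed.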

The dos2unix utility will do this very efficiently. If the files are that large I would run that command as part of the copy.
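
If dos2unix is not available on the server, a rough stand-in can be written in Python. This is not dos2unix itself, only a sketch of the same CRLF-to-LF conversion done in large binary chunks; the function name crlf_to_lf and the chunk size are assumptions.

```python
import io


def crlf_to_lf(src, dst, chunk_size=1 << 20):
    """Copy binary stream src to dst, replacing every '\\r\\n' with '\\n'."""
    carry = b""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        chunk = carry + chunk
        # A '\r' at the very end might be the first half of a '\r\n' that
        # straddles two chunks, so hold it back until the next read.
        if chunk.endswith(b"\r"):
            carry = b"\r"
            chunk = chunk[:-1]
        else:
            carry = b""
        dst.write(chunk.replace(b"\r\n", b"\n"))
    # A lone '\r' at end of input is written through unchanged.
    dst.write(carry)


# Tiny in-memory demonstration; a small chunk size forces a split '\r\n'.
src = io.BytesIO(b"a,b\r\n1,2\r\n")
dst = io.BytesIO()
crlf_to_lf(src, dst, chunk_size=4)
print(dst.getvalue())
```

With real files, src and dst would be opened in 'rb' and 'wb' mode respectively.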

Actually, the most efficient way to read any file is in one big I/O operation. There isn't always enough RAM to do that, but the fewer I/O operations, the better.
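
When the file does not fit in RAM, a middle ground is to read it in large blocks and split the lines yourself. The sketch below assumes '\n' or '\r\n' line endings only (a lone '\r' is not treated as a line break), and the name iter_lines_chunked and the default block size are made up. Python's file objects already buffer reads, so whether this beats plain iteration would need to be measured.

```python
import io


def iter_lines_chunked(stream, chunk_size=1 << 20):
    """Yield lines as bytes, without line endings, from a binary stream."""
    tail = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parts = (tail + chunk).split(b"\n")
        # The last piece may be an incomplete line; keep it for the next block.
        tail = parts.pop()
        for part in parts:
            yield part.rstrip(b"\r")
    if tail:
        # Final line without a trailing newline.
        yield tail.rstrip(b"\r")


# In-memory demonstration with a deliberately tiny block size.
data = io.BytesIO(b"a,b\r\n1,2\r\n3,4")
result = list(iter_lines_chunked(data, chunk_size=4))
print(result)
```

For a real file, pass an object opened with open(filename, 'rb').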

