
readlines() in Python 2.7 vs 3.10

I wrote a script in Python 2.7 but want to switch to Python 3.10. The only problem is that readlines() doesn't produce the same results, which breaks my list comprehension. Below are the two versions and their results:


Python 2.7

file_to_open = open('file.csv', 'r') 
f = file_to_open.readlines()
print(len(f))

The result is 2001


Python 3.10

file_to_open = open('file.csv', 'r') 
f = file_to_open.readlines()
print(len(f))

The result is 10401


The CSV file does have 2001 rows, so that is the correct number. There must be some characters that create new lines, or something else that trips up the Python 3 version. Has anyone encountered this before?
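One way to check that hypothesis is to count the line-ending bytes in the raw file. The following is a minimal sketch, not part of the original script: the helper name count_line_endings and the sample data are hypothetical, and a small temporary file stands in for file.csv.

```python
import os
import tempfile


def count_line_endings(path):
    """Count CRLF pairs, bare LF, and bare CR bytes in a file's raw contents."""
    with open(path, 'rb') as fh:
        data = fh.read()
    crlf = data.count(b'\r\n')
    return {
        'crlf': crlf,
        'lf_only': data.count(b'\n') - crlf,
        'cr_only': data.count(b'\r') - crlf,
    }


# Hypothetical sample standing in for file.csv: two rows, with a stray
# '\r' inside a field of the second row.
with tempfile.NamedTemporaryFile('wb', suffix='.csv', delete=False) as tmp:
    tmp.write(b'id,note\n1,first\rpart\n')
    sample_path = tmp.name

counts = count_line_endings(sample_path)
print(counts)
os.remove(sample_path)
```

If 'cr_only' is non-zero, the file contains '\r' characters that are not part of a '\r\n' pair, which would be consistent with extra lines appearing in Python 3.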

It has to do with universal newlines and how Python 2 and 3 handle them. Python 3 opens text files with universal newlines enabled by default, so a lone '\r' counts as a line ending, whereas Python 2's plain 'r' mode did not treat it that way. The CSV file had extra '\r' characters within the fields. So I opened the file with the 'b' option to bypass universal newlines. That made each line come back as bytes, so I had to convert each line back to a str and then use re.sub to remove the '\r' characters. Below is the list comprehension that ended up working perfectly.

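import re

# Binary mode splits lines on b'\n' only; decode each line to str, then drop stray '\r'.
# Assumes the file is UTF-8; pass the real encoding to decode() otherwise.
with open('file.csv', 'rb') as fh:
    f = [re.sub('\r', '', line.decode('utf-8')) for line in fh]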
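As an alternative to binary mode, Python 3's open() accepts a newline argument, and passing newline='\n' makes only '\n' end a line while keeping the lines as str. The sketch below is an illustration rather than the code used above; the sample data and the temporary file are hypothetical stand-ins for file.csv.

```python
import os
import tempfile

# Hypothetical sample: a header plus one row whose field contains a stray '\r'.
with tempfile.NamedTemporaryFile('wb', suffix='.csv', delete=False) as tmp:
    tmp.write(b'id,note\n1,first\rpart\n')
    path = tmp.name

# Default text mode uses universal newlines, so the bare '\r' also ends a line.
with open(path, 'r') as fh:
    default_lines = fh.readlines()

# With newline='\n', only '\n' ends a line and '\r' is passed through
# untranslated, so it can be stripped with a plain str.replace().
with open(path, 'r', newline='\n') as fh:
    lf_lines = [line.replace('\r', '') for line in fh]

os.remove(path)

print(len(default_lines))  # 3
print(len(lf_lines))       # 2
```

This avoids the bytes-to-str conversion, although the encoding still has to match the file (open() accepts an encoding argument for that).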


 