
Pandas.read_csv(), how to read every character as a new element

I have a huge text file (MyTextFile.txt) containing characters like this ("\n" marks a line break):

ABCDE\n
FGHIJ\n
KLMNO\n

Using pandas.read_csv('MyTextFile.txt') returns a 3x1 array in which each element contains 5 characters. But I need a 15x1 array ([A,B,C,D,E,F,G,H,I,J,K,L,M,N,O], with line breaks ignored). Is there a simple way to achieve this?

Each file contains about 250 million characters, and I have 25 files to read, so efficiency is critical for me.

Thanks.

You could use:

# Collect every character except line breaks
res = []

# Edited from https://www.geeksforgeeks.org/python-program-to-read-character-by-character-from-a-file/
# The with block closes the file automatically, even if an error occurs
with open('MyTextFile.txt', 'r') as file:
    while True:
        # Read one character at a time
        char = file.read(1)
        # An empty string means the end of the file
        if not char:
            break
        # Keep the character unless it is a line break
        if char != '\n':
            res.append(char)

# Print out the results
print(res)

Yields: ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O']
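
Since the question mentions about 250 million characters per file, calling read(1) once per character may be slow. Here is a minimal sketch of an alternative, assuming the file is plain text and each character should become one list element: read the file in large chunks and drop the line breaks from each chunk. The names read_chars and chunk_size are hypothetical, and the temporary file only stands in for MyTextFile.txt.

```python
import os
import tempfile


def read_chars(path, chunk_size=1 << 20):
    """Return every character in the file as a list, skipping line breaks."""
    chars = []
    # Text mode with default newline handling turns "\r\n" into "\n"
    with open(path, "r") as f:
        while True:
            # Read up to chunk_size characters in one call
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Remove line breaks, then add the remaining characters one by one
            chars.extend(chunk.replace("\n", ""))
    return chars


# Stand-in for MyTextFile.txt (assumption: same layout as in the question)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("ABCDE\nFGHIJ\nKLMNO\n")
    sample_path = tmp.name

result = read_chars(sample_path)
os.remove(sample_path)
print(result)
```

If a pandas object is needed, the resulting list can be passed to pandas.Series(result). Whether this is fast enough for 25 files of that size has not been measured here.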

