Pandas.read_csv(): how to read each character as a new element

Question

I have a huge text file (MyTextFile.txt) containing characters like this ("\n" marks the line break):

ABCDE\n
FGHIJ\n
KLMNO\n

Using pandas.read_csv('MyTextFile.txt') returns a 3x1 array in which each element contains 5 characters. But I need a 15x1 array ([A,B,C,D,E,F,G,H,I,J,K,L,M,N,O], with line breaks ignored). Is there a simple way to achieve this?

There are about 250 million characters in each file, and I have 25 files to read, so efficiency is quite critical for me.

Thanks.

Answer 1

You could use:

# Collect the characters here
res = []

# Edited from https://www.geeksforgeeks.org/python-program-to-read-character-by-character-from-a-file/
# The with block closes the file automatically, even if an error occurs
with open('example.txt', 'r') as file:
    while True:
        # Read one character at a time
        char = file.read(1)
        # An empty string means the end of the file has been reached
        if not char:
            break
        # Otherwise keep the character, skipping line breaks
        if char != '\n':
            res.append(char)

# Print out the results
print(res)

Yields: ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O']
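
Since the question stresses efficiency, one variation that may be worth trying is to read the whole file in a single call and drop the line breaks from the resulting string, rather than calling read(1) once per character. The sketch below is not part of the original answer; the helper name read_chars is hypothetical, and the demo writes the three sample lines from the question to a temporary file so that it runs on its own. Whether a 250-million-character file fits comfortably in memory this way depends on the machine.

```python
import os
import tempfile


def read_chars(path):
    """Return all characters of the file at `path` as a list, skipping line breaks.

    Hypothetical helper for illustration; not from the original answer.
    """
    # Read the whole file in one call; text mode translates '\r\n' to '\n'.
    with open(path, 'r') as f:
        text = f.read()
    # Remove the line breaks, then split the string into single characters.
    return list(text.replace('\n', ''))


# Demo: recreate the sample data from the question in a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, 'MyTextFile.txt')
    with open(sample, 'w') as f:
        f.write('ABCDE\nFGHIJ\nKLMNO\n')
    chars = read_chars(sample)

print(chars)
```

If a pandas object is needed afterwards, the list can be passed to pandas.Series(chars).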

Answer 1 (0, accepted, 2020-10-21 03:49:40)
