Python: Restore array to a single line during import from .txt file

A program writes lists to a .txt file in the following manner:

[ 3.  6.  3.  1.  1.  1.  0.  1.  2.  2.  9.  2.  5.  2.  2.  1.  0.  0.
  4.  6.  1.  1.  1.  0.  5.  2.  0.  0.  0.  0.  0.  0.  0.  0.]
[  4.   9.   8.   7.   2.   4.   1.   7.   5.   3.   7.   2.   6.   0.   9.
  5.   6.  10.   6.   2.   1.   5.   0.]
[  3.   5.   9.   1.   1.   1.   0.   1.   1.   4.   8.   5.   5.   3.   3.
   7.   6.  12.   9.   2.   1.   0.   0.   4.   6.   1.   1.   1.   0.   5.
   0.   0.   0.   0.   0.   0.   0.   0.   0.]

That is, each list is wrapped across several lines. I want to create a histogram for each of these lists. How do I import the values as integers, making sure that each whole list (and not just one line of it) is read as a single list? I have tried:

data = [line.strip() for line in open('n.txt', 'r')]

But data[0] yields only the first line of the file, not the whole first list. Any suggestions?

If you control how the file is written, there are easier formats to write this data in. But if you're stuck with this one, here's one way to load it:

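import ast

with open('test.txt', 'r') as f:
    data = []
    curList = []
    for line in f:
        # Pad the brackets with spaces so split() yields them as separate tokens
        line = line.replace('[', ' [ ').replace(']', ' ] ')
        items = line.split()
        for item in items:
            if item == "[":
                # Opening bracket: start collecting a new list
                curList = []
            elif item == "]":
                # Closing bracket: the current list is complete
                data.append(curList)
            else:
                # "3." is a valid float literal, so this yields 3.0
                curList.append(ast.literal_eval(item))

print(data)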

OUTPUT:

[[3.0, 6.0, 3.0, 1.0, 1.0, 1.0, 0.0, 1.0, 2.0, 2.0, 9.0, 2.0, 5.0, 2.0, 2.0, 1.0, 0.0, 0.0, 4.0, 6.0, 1.0, 1.0, 1.0, 0.0, 5.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], 
 [4.0, 9.0, 8.0, 7.0, 2.0, 4.0, 1.0, 7.0, 5.0, 3.0, 7.0, 2.0, 6.0, 0.0, 9.0, 5.0, 6.0, 10.0, 6.0, 2.0, 1.0, 5.0, 0.0], 
 [3.0, 5.0, 9.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 4.0, 8.0, 5.0, 5.0, 3.0, 3.0, 7.0, 6.0, 12.0, 9.0, 2.0, 1.0, 0.0, 0.0, 4.0, 6.0, 1.0, 1.0, 1.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
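
As a sketch of the "easier formats" mentioned at the top of this answer, and assuming you do control the writer: one option is to write one JSON array per line with the standard json module, so that no list is ever wrapped. The file name lists.jsonl and the helper names write_lists and read_lists are made up for illustration.

```python
import json

def write_lists(path, lists):
    # One JSON array per physical line, so no list is ever wrapped.
    # int() also turns float-like values such as 3.0 into plain ints.
    with open(path, 'w') as f:
        for values in lists:
            f.write(json.dumps([int(v) for v in values]) + '\n')

def read_lists(path):
    # Each non-empty line is parsed back into one complete list
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

if __name__ == '__main__':
    write_lists('lists.jsonl', [[3.0, 6.0, 3.0], [4.0, 9.0, 10.0]])
    print(read_lists('lists.jsonl'))
```

With this layout, the one-line-per-list read from the question would work, because each line really is a whole list.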

Crude, but this joins the wrapped lines while reading the file one line at a time, so each list ends up on a single line.

a_lines = []
str_line = ''

with open('data.txt') as f:
    for line in f:
        # Accumulate physical lines until the closing bracket is seen
        str_line += line.rstrip()
        if str_line.endswith(']'):
            a_lines.append(str_line)
            str_line = ''

my_data = '\n'.join(a_lines)

Here with regex:

import re

p = re.compile(r'([^\]])\n', re.MULTILINE)
my_data = ''

with open('data.txt') as my_file:
    my_data = p.sub(r'\1', my_file.read())

Both code samples leave the data in a single string, my_data.
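
Since both samples stop at a string, one more step is needed to get numbers out of it. A minimal sketch, assuming my_data holds one bracketed list per line and that every value is a whole number written like 3. (the function name parse_joined is made up for illustration):

```python
def parse_joined(my_data):
    # Convert a string with one bracketed list per line into lists of ints
    result = []
    for row in my_data.splitlines():
        row = row.strip()
        if not row:
            continue
        # Drop the surrounding brackets, then split on whitespace
        tokens = row.strip('[]').split()
        # float() accepts the trailing dot, e.g. int(float('3.')) == 3
        result.append([int(float(tok)) for tok in tokens])
    return result

print(parse_joined('[ 3.  6.  3.]\n[  4.   9.  10.]'))
```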

A different approach, based on the fact that a new list is indicated by [:

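data = []
with open("n.txt") as fh:
    for line in fh:
        # Drop the trailing dots and closing brackets so int() can parse the tokens
        line = line.replace('.', '').replace(']', '')
        line = line.split()
        if not line:
            continue  # skip blank lines
        if line[0] == '[':
            # A leading '[' token starts a new list
            data.append(list(map(int, line[1:])))
        else:
            # Otherwise the line continues the most recent list
            data[-1].extend(map(int, line))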

The dots are removed so that int works later on. This relies on there being at least one space after each [ (which is true in your short example); if that's not the case, you can easily adapt it, for instance by using the replace from Brionius' answer.
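
Once data is a list of integer lists, the histogram the question asks for comes down to counting how often each value occurs in a list. A minimal sketch using collections.Counter from the standard library; the function name histogram and the sample data are for illustration only, and plotting the counts is left to whatever plotting library you use.

```python
from collections import Counter

def histogram(values):
    # Map each distinct value to its number of occurrences, sorted by value
    counts = Counter(values)
    return sorted(counts.items())

data = [[3, 6, 3, 1, 1], [4, 9, 4]]
for row in data:
    print(histogram(row))
```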
