
.readlines() list in a file not indexing values

I have a txt file with content in the form of lists like this:

[1,2,3,4]
[5,6,7,8]

I've put these lists into a list using the following code:

t = open('filename.txt', 'r+w')
contents = t.readlines()

alist = []

for i in contents:
    alist.append(i)

When I run

alist[0]

I get

[1,2,3,4]

but when I run

for a in alist:
    print a[0]

I get

[

instead of the first value in the list.

.readlines() reads lines as strings. The first character of that string is a [.
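A minimal sketch of what is going on, written for Python 3 and using an in-memory io.StringIO object as a stand-in for the question's text file (that stand-in is an assumption made only so the example is self-contained):

```python
import io

# Stand-in for the text file from the question (an assumption for this sketch).
fake_file = io.StringIO("[1,2,3,4]\n[5,6,7,8]\n")

lines = fake_file.readlines()
first_line = lines[0]

print(type(first_line).__name__)  # str
print(repr(first_line))           # '[1,2,3,4]\n'
print(first_line[0])              # [
```

Each element is a plain string, trailing newline included, so indexing it returns a character rather than a number.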

If you want to read the text file and "deserialize" it into data structures, the easiest way is to use Python's built-in eval() function. A safer way is to use ast.literal_eval(), which accepts only Python literals rather than arbitrary expressions.

http://docs.python.org/2/library/ast.html?highlight=literal#ast.literal_eval
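To illustrate why ast.literal_eval() is the safer choice, here is a small Python 3 sketch. The hostile_text string is a made-up example of a line that eval() would happily execute; ast.literal_eval() refuses it because it is a function call, not a literal.

```python
import ast

safe_text = "[1,2,3,4]"
hostile_text = "__import__('os').getcwd()"  # hypothetical malicious line

parsed = ast.literal_eval(safe_text)
print(parsed)  # [1, 2, 3, 4]

try:
    ast.literal_eval(hostile_text)
    rejected = False
except (ValueError, SyntaxError):
    # literal_eval only accepts literals, so the call expression is refused
    rejected = True

print(rejected)  # True
```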

Suggested code:

import ast

with open("filename.txt") as f:
    alist = [ast.literal_eval(line) for line in f]

print(type(alist[0]))  # prints: <type 'list'> (Python 2) or <class 'list'> (Python 3)
print(alist[0])  # prints: [1, 2, 3, 4]

We almost never want to call .readlines(); it slurps in all the lines from the file, so if the file is very large, it will cause your program's memory usage to go way up. An open file handle object (in my example, f) can be used as an iterator, and it will yield one line from the file each time it is iterated. So a for loop or a list comprehension will pull one line at a time from the file. Thus, this example program does not keep the raw text of the whole file in memory; it holds just one line of text at a time while building the list. If this program called .readlines() it would keep all the lines and also the list, so the peak memory usage would be higher. (It doesn't matter for such a small input file as this example, of course. But it's easy to do things the memory-efficient way, so why not?)
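A rough Python 3 sketch of that iterator behaviour follows. The temporary file and its contents are assumptions standing in for filename.txt.

```python
import os
import tempfile

# Throwaway file standing in for filename.txt (name and contents are assumptions).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("[1,2,3,4]\n[5,6,7,8]\n")
    path = tmp.name

with open(path) as f:
    is_own_iterator = iter(f) is f   # a file object is its own iterator
    first = next(f)                  # pulls only the first line
    rest = [line for line in f]      # continues from where next() stopped

os.remove(path)

print(is_own_iterator)  # True
print(repr(first))      # '[1,2,3,4]\n'
print(rest)             # ['[5,6,7,8]\n']
```

Because the handle hands out lines on demand, nothing forces the full list of raw lines to exist at once.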

It is always good practice to use with to open a file. Then you know the file will be properly closed when you are done with it.

We use a list comprehension to build a list of the results of ast.literal_eval(), which for the given input file returns a list per line, so alist will be a list of lists.

If you just inherited or downloaded these files and can't do anything about the format, and you know they're supposed to be treated as lines of Python lists, ast.literal_eval is the best answer, as steveha explained:

import ast

t = open('filename.txt', 'r')
contents = t.readlines()
alist = []
for i in contents:
    alist.append(ast.literal_eval(i))

If you inherited or downloaded these files, and are just guessing at the format, it's possible that they're actually intended to be read as lines of JSON, because they definitely are valid JSON just as they are valid Python literals. In that case:

import json

t = open('filename.txt', 'r')
contents = t.readlines()
alist = []
for i in contents:
    alist.append(json.loads(i))
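The two readings agree on lines like the ones in the question, but they are not interchangeable in general. A short Python 3 sketch, where the json_only string is an invented example of a line that is valid JSON but not a valid Python literal:

```python
import ast
import json

line = "[1,2,3,4]\n"
same = ast.literal_eval(line) == json.loads(line)
print(same)  # True

json_only = "[1, 2, true, null]"  # hypothetical line using JSON-only keywords
from_json = json.loads(json_only)
print(from_json)  # [1, 2, True, None]

try:
    ast.literal_eval(json_only)
    python_accepts = True
except ValueError:
    # 'true' and 'null' are not Python literals
    python_accepts = False

print(python_accepts)  # False
```

So it is worth finding out which format the files are actually meant to be before picking a parser.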

But if you're the one who created these files in the first place, you should instead create them in a way that's designed for serialization.

For example, instead of this:

t = open('filename.txt', 'w')
for i in alist:
    print >>t, i

Do something like this:

t = open('filename.txt', 'w')
json.dump(alist, t)

Then you can write your reading code like this:

t = open('filename.txt', 'r')
alist = json.load(t)

The whole point of serialization formats like JSON, YAML, or Pickle is that they're specifically designed so that you can write a value and later read back that same value.
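As a sketch of that round trip in Python 3, using a temporary directory and a made-up file name so the example does not touch any real file:

```python
import json
import os
import tempfile

alist = [[1, 2, 3, 4], [5, 6, 7, 8]]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "filename.json")  # hypothetical file name

    # Write the whole structure as a single JSON document.
    with open(path, "w") as out_file:
        json.dump(alist, out_file)

    # Read it back; no per-line parsing is needed.
    with open(path) as in_file:
        restored = json.load(in_file)

print(restored == alist)  # True
print(restored[0][0])     # 1
```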

Functions like print, str, etc. are not designed for that; they're designed so you can display a value in the nicest human-readable form, even if that's difficult or impossible to read back later.

The function repr is somewhere in between. It's designed to be readable to humans playing with the interactive prompt, so if possible it gives you a string that you could type into the prompt to get the same value back. This means that, in some cases, ast.literal_eval is the inverse of repr, just as json.load is the inverse of json.dump. But you shouldn't rely on this, even when dealing with types where it works.
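A small Python 3 sketch of where the repr and ast.literal_eval pairing holds and where it breaks. The sample values are arbitrary choices for illustration.

```python
import ast

value = [1, 2.5, "three", (4, 5)]
round_tripped = ast.literal_eval(repr(value))
print(round_tripped == value)  # True

infinity = float("inf")
print(repr(infinity))  # inf

try:
    ast.literal_eval(repr(infinity))
    survived = True
except ValueError:
    # 'inf' parses as a bare name, not a literal, so it is rejected
    survived = False

print(survived)  # False
```

A float that repr prints as inf cannot be read back this way, which is one reason not to treat repr as a serialization format.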


A few side notes about your code:

t = open('filename.txt', 'r+w')

If you're only going to read the file, don't try to open it for writing. Also, if you do want to open for both reading and writing, the right mode string is r+, not r+w. (The way you've done it is technically an error, but most versions of Python 2 will ignore the w, so you get away with it; Python 3 rejects the mode with a ValueError.)

And if the mode is r, you don't need to specify it at all, because that's the default.
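These points about mode strings can be checked with a short sketch. It assumes Python 3, where the r+w mode is rejected outright, and uses a temporary file with a made-up name.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "filename.txt")  # hypothetical file name
    with open(path, "w") as out_file:
        out_file.write("[1,2,3,4]\n")

    # No mode argument: the file is opened read-only in text mode.
    with open(path) as default_handle:
        default_mode = default_handle.mode

    # Python 3 refuses a mode that names both 'r' and 'w'.
    try:
        open(path, "r+w").close()
        mode_accepted = True
    except ValueError:
        mode_accepted = False

print(default_mode)   # r
print(mode_accepted)  # False
```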

Meanwhile, you never close the file. The easiest way to do this is to use a with statement.

contents = t.readlines()

There is almost never a good reason to call readlines(). This gives you a sequence of lines, but the file itself can already be iterated line by line. All you're doing is making an extra copy of it.

alist = []

for i in contents:
    alist.append(i)

This pattern (creating an empty list and then appending to it in a loop) is so common that Python has a shortcut for it, called a list comprehension. Comprehensions are less verbose, more readable, harder to get wrong, and faster than explicit loops, so it's worth using them most of the time.
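For comparison, here is the same transformation written both ways in Python 3. The sample lines and the choice of strip() as the per-item operation are only for illustration.

```python
lines = ["[1,2,3,4]\n", "[5,6,7,8]\n"]

# Explicit loop: create an empty list, then append in a loop.
stripped_loop = []
for line in lines:
    stripped_loop.append(line.strip())

# List comprehension: the same result in one expression.
stripped_comprehension = [line.strip() for line in lines]

print(stripped_loop == stripped_comprehension)  # True
print(stripped_comprehension)  # ['[1,2,3,4]', '[5,6,7,8]']
```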

Finally, it's better to give meaningful names to your variables, especially if you want someone else (or yourself, six months later) to be able to debug your code. If it's working perfectly, we can tell what the variables mean by what they do; but if it's not, we can't fix it unless we can guess what they're supposed to mean, and names are the best way to signal that.

So, putting it all together, your original code could be written as:

with open('filename.txt') as textfile:
    alist = [line for line in textfile]

And the various fixed versions are:

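# Each line is a Python list literal (needs: import ast)
with open('filename.txt') as textfile:
    alist = [ast.literal_eval(line) for line in textfile]

# Each line is a JSON array (needs: import json)
with open('filename.txt') as textfile:
    alist = [json.loads(line) for line in textfile]

# The whole file is one JSON document written with json.dump (needs: import json)
with open('filename.txt') as textfile:
    alist = json.load(textfile)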

What you have is a list of character strings. A character string with brackets and commas in it is not magically a list; it is merely a string with brackets and commas in it.

alist is the list. In your loop, a is an item from that list: first, it is alist[0], then alist[1], and so on. Thus, a[0] is asking for alist[0][0], alist[1][0], and so on: the first character from each line. And so that's what you get.

If you want to convert it to an actual Python list, use ast.literal_eval().
