Python not iterating by line with readlines()

I have a text file with a single string on each row. I want Python to look at each row and check whether that string is already in a list: if it is not, add it; otherwise skip to the next line. Later I will use collections to count the total occurrences of each list item.

testset = ['2']
# '2' is just a "sanity check" value that lets me know I am extending list

file = open('icecream.txt')

filelines = file.readlines()

for i in filelines:
    if i not in testset:
        testset.extend(i)
    else:
        print(i, "is already in set")

print(testset)

I was expecting to get:

testset = ['2', 'chocolate', 'vanilla', 'AmericaConeDream', 'cherrygarcia', ...]

Instead I got:

testset = ['2', 'c', 'h', 'o', 'c', 'o' ....]        

Not sure what is happening here. I have also tried running this with: for i in file:

I believe I read in another post that the object returned by open() is an iterator in and of itself. Can someone enlighten me as to how to get this iteration to work?

extend() iterates over the elements (in this case, the characters) of its argument and adds each of them individually to the list. Use append() instead:

    testset.append(i)
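
To see the difference concretely, here is a minimal sketch. It assumes an in-memory io.StringIO as a stand-in for icecream.txt (the contents are made up so the example runs without the file), and it also strips the trailing newline that each line carries, which the original loop does not do.

```python
import io

# Stand-in for open('icecream.txt'); the contents are hypothetical.
fake_file = io.StringIO("chocolate\nvanilla\nchocolate\ncherrygarcia\n")

demo = ['2']
demo.extend('abc')   # extend iterates over the string: adds 'a', 'b', 'c'
demo.append('abc')   # append adds the whole string as one element
print(demo)          # ['2', 'a', 'b', 'c', 'abc']

testset = ['2']
for line in fake_file:              # a file-like object yields one line at a time
    flavour = line.rstrip('\n')     # drop the trailing newline before comparing
    if flavour not in testset:
        testset.append(flavour)
    else:
        print(flavour, "is already in set")

print(testset)  # ['2', 'chocolate', 'vanilla', 'cherrygarcia']
```

Without the rstrip call the entries would be 'chocolate\n', 'vanilla\n' and so on, which is probably not what the expected output in the question intends.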

If you don't care about the order in which the lines appear in testset, you could use a set instead of a list. The following one-liner will create a set containing every unique line in the file:

testset = set(open('icecream.txt'))

You can think of extend as append for an iterable of values rather than just one. Because you plan to use a Counter to count the lines anyway, I would do the following to get the unique values as keys:

from collections import Counter

with open('text.txt') as text:
    data = Counter(i for i in text)  # try data.keys()
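
As a rough illustration of the Counter approach, the sketch below again uses io.StringIO with made-up contents in place of the real file (an assumption, not part of the original answer). It strips whitespace from each line so that a final line without a trailing newline is counted together with identical earlier lines.

```python
import io
from collections import Counter

# Stand-in for the real file; contents are hypothetical.
# Note the last line has no trailing newline.
text = io.StringIO("chocolate\nvanilla\nchocolate")

counts = Counter(line.strip() for line in text)
print(counts['chocolate'])  # 2
print(sorted(counts))       # ['chocolate', 'vanilla']
```

The keys of the Counter are the unique lines, so a separate list or set is not needed if the counts are wanted anyway.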

EDIT: look at NPE's answer: it's basically the same, but more elegant and Pythonic.

Try reading, splitting, and reducing in one go:

textset = set(file.read().split('\n'))
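
One caveat with split('\n'): if the file ends with a newline, the result includes an empty string. The sketch below uses a hypothetical string in place of file.read() to show this, and shows str.splitlines() as one possible alternative that avoids it.

```python
# Hypothetical file contents, standing in for file.read().
contents = "chocolate\nvanilla\nchocolate\n"

with_split = set(contents.split('\n'))       # trailing newline yields ''
with_splitlines = set(contents.splitlines()) # no empty trailing entry

print('' in with_split)       # True
print('' in with_splitlines)  # False
```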
