
Python not iterating by line with readlines()

I have a text file with a single string on each row. I want Python to look at each row and check whether that string is already in a list: if it is not, add it; otherwise skip to the next line. Later I will use collections to count the total occurrences of each list item.

testset = ['2']
# '2' is just a "sanity check" value that lets me know I am extending list

file = open('icecream.txt')

filelines = file.readlines()

for i in filelines:
    if i not in testset:
        testset.extend(i)
    else:
        print(i, "is already in set")

print(testset)

I was expecting to get:

testset = ['2', 'chocolate', 'vanilla', 'AmericaConeDream', 'cherrygarcia', ...]

instead I got:

testset = ['2', 'c', 'h', 'o', 'c', 'o' ....]        

I am not sure what is happening here. I have also tried running this using: for i in file:

I believe I read in another post that the object returned by open() is an iterator in and of itself. Can someone enlighten me as to how I get this iteration to work?

extend() iterates over the elements of its argument (in this case, the characters of the string) and adds each of them individually to the list. Use append() instead:

    testset.append(i)
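To make the difference concrete, here is a minimal sketch. The sample lines are made up for illustration and are shaped like what readlines() returns, with each string keeping its trailing newline; stripping that newline before comparing is an assumption about what the asker wants, not something stated in the question.

```python
# Hypothetical sample data, shaped like the result of file.readlines().
filelines = ["chocolate\n", "vanilla\n", "chocolate\n"]

extended = ["2"]
extended.extend("chocolate")    # adds 'c', 'h', 'o', ... one character at a time

testset = ["2"]
for line in filelines:
    flavor = line.strip()       # drop the trailing newline before comparing
    if flavor not in testset:
        testset.append(flavor)  # adds the whole string as a single element
    else:
        print(flavor, "is already in set")

print(extended)
print(testset)
```

With this sample data, extended ends up holding individual characters, while testset holds whole flavor names.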

If you don't care about the order in which the lines appear in testset, you could use a set instead of a list. The following one-liner creates a set containing every unique line in the file (each line keeps its trailing newline):

testset = set(open('icecream.txt'))
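As a rough illustration of the set approach, the sketch below writes a small temporary file standing in for icecream.txt (its contents are invented) and builds the set twice: once directly from the file object, and once with the newlines stripped. The with blocks are an addition here so that the file is closed promptly.

```python
import os
import tempfile

# Stand-in for icecream.txt; the contents are made up for illustration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("chocolate\nvanilla\nchocolate\n")
    path = tmp.name

with open(path) as f:
    raw = set(f)                            # each element still ends with '\n'

with open(path) as f:
    flavors = {line.strip() for line in f}  # newline removed from each element

os.remove(path)
print(sorted(raw))
print(sorted(flavors))
```

Either way the duplicates collapse; the second form is usually easier to compare against plain strings such as 'chocolate'.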

You can think of extend as append for an iterable of values rather than just one. Because you plan to use a Counter to count the lines anyway, I would do the following and take the unique values from its keys:

from collections import Counter

with open('text.txt') as text:
    data = Counter(i for i in text)  # try data.keys()
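A small sketch of how the Counter covers both needs, unique values and per-value counts. io.StringIO stands in for an open text file here, and the flavor names are invented sample data; stripping the newline is an assumption about the desired keys.

```python
from collections import Counter
import io

# io.StringIO behaves like an open text file; the contents are hypothetical.
text = io.StringIO("chocolate\nvanilla\nchocolate\n")

# Iterating a file-like object yields one line at a time.
counts = Counter(line.strip() for line in text)

print(list(counts))          # the unique values, in first-seen order
print(counts["chocolate"])   # how many times one value occurred
print(counts.most_common(1)) # the most frequent value with its count
```

This avoids maintaining a separate list of unique items, since the Counter's keys already are that list.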

EDIT: Look at NPE's answer: it's basically the same, but more elegant and Pythonic.

Try reading, splitting, and reducing to unique values in one go:

textset = set(file.read().split('\n'))
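One detail worth checking with this approach: if the file ends with a newline, split('\n') produces a trailing empty string, which then lands in the set. The sketch below compares it with str.splitlines(), using a made-up string in place of file.read().

```python
# Hypothetical file contents, ending with a newline as most text files do.
content = "chocolate\nvanilla\nchocolate\n"

with_split = set(content.split("\n"))        # also contains '' from the final newline
with_splitlines = set(content.splitlines())  # no empty string for the final newline

print(sorted(with_split))
print(sorted(with_splitlines))
```

Note also that a file object already consumed by readlines() returns an empty string from read(), so this needs a freshly opened file.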


 