Putting column values from text file into a list in python

I have a text file like this:

a    w
b    x
c,d  y
e,f  z

I want to get the values of the first column into a list without duplicates. For now I read the first column like this:

f=open("file.txt","r")
lines=f.readlines()
firstCol=[]
for x in lines:
    firstCol.append(x.split('   ')[0])
f.close()

In the next step I want to separate the values by a comma delimiter the same way I did before, but then I get an output like this:

[['a'], ['b'], ['c', 'd'], ['e', 'f']]

How can I convert this into a one-dimensional list so that I can remove duplicates afterwards? I am a beginner in Python.

You can use itertools.chain to flatten your list of lists, and then use the built-in set class to remove the duplicates:

from itertools import chain

l = [['a'], ['b'], ['c', 'd'], ['e', 'f']]
set(chain.from_iterable(l))
# {'a', 'b', 'c', 'd', 'e', 'f'}

To flatten your list you can also use a list comprehension:

my_l = [e for i in l for e in i]
# ['a', 'b', 'c', 'd', 'e', 'f']

The same with two simple for loops:

my_l = []

for i in l:
    for e in i:
        my_l.append(e)
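
Note that a set does not keep the original order of the items. If the order matters, one possible approach (a minimal sketch, assuming Python 3.7 or later, where dicts preserve insertion order) is to flatten with chain.from_iterable and deduplicate with dict.fromkeys. The trailing ['a'] sublist is made up here to show a duplicate being dropped:

```python
from itertools import chain

# Sample data; the trailing ['a'] is a made-up duplicate for illustration.
nested = [['a'], ['b'], ['c', 'd'], ['e', 'f'], ['a']]

# dict keys are unique and keep insertion order, so this drops
# duplicates while preserving the first-seen order.
unique_in_order = list(dict.fromkeys(chain.from_iterable(nested)))
print(unique_in_order)
# ['a', 'b', 'c', 'd', 'e', 'f']
```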

You can split on the comma immediately after the first split. You then have to use extend instead of append, so that the individual values are added to the list rather than a nested list.

f=open("file.txt","r")
lines=f.readlines()
firstCol=[]
for x in lines:
    firstCol.extend(x.split(' ')[0].split(','))
f.close()

print(firstCol)

Result

['a', 'b', 'c', 'd', 'e', 'f']
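
The same idea can be sketched so that it splits on any run of whitespace, which should work whether the columns are separated by tabs or by a varying number of spaces. The helper name first_column_values is hypothetical, and the sample list stands in for the contents of file.txt:

```python
def first_column_values(lines):
    """Collect the comma-separated values of the first column, in order."""
    values = []
    for line in lines:
        fields = line.split()      # split on any run of whitespace
        if not fields:             # skip blank lines
            continue
        values.extend(fields[0].split(','))
    return values

# Stand-in for the contents of file.txt; with a real file you could write
#   with open("file.txt") as f:
#       first_col = first_column_values(f)
sample = ["a    w\n", "b    x\n", "c,d  y\n", "e,f  z\n"]
first_col = first_column_values(sample)
print(first_col)
# ['a', 'b', 'c', 'd', 'e', 'f']
```

Passing the open file object directly works because iterating over a file yields its lines, and the with statement closes the file even if an error occurs.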

Or, if you want to keep firstCol as well:

f=open("file.txt","r")
lines=f.readlines()
firstCol=[]
for x in lines:
    firstCol.append(x.split(' ')[0])
f.close()

one_dimension = []
for col in firstCol:
    one_dimension.extend(col.split(','))

print(firstCol)
print(one_dimension)

Result

['a', 'b', 'c,d', 'e,f']
['a', 'b', 'c', 'd', 'e', 'f']

Possible solution 1

If you are happy with your code, you can keep it as it is and remove duplicates from the list of lists like this:

import itertools

firstCol.sort()
firstCol = list(x for x,_ in itertools.groupby(firstCol))
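
To see what this does, here is a small sketch with a made-up list that contains a repeated sublist. groupby only merges adjacent equal items, which is why the list is sorted first. Note that this removes duplicate sublists, not values that are repeated across different sublists:

```python
import itertools

# Made-up sample with a repeated ['a'] sublist.
first_col = [['c', 'd'], ['a'], ['b'], ['a']]

first_col.sort()  # puts equal sublists next to each other
deduped = [key for key, _ in itertools.groupby(first_col)]
print(deduped)
# [['a'], ['b'], ['c', 'd']]
```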

Possible solution 2

If you want to convert the list of lists into one list of items:

firstCol = [x for y in firstCol for x in y]

If you want to also remove duplicates:

firstCol = list(set([x for y in firstCol for x in y]))
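
Because a set is unordered, the order of the resulting list is not guaranteed. If a predictable order is wanted, one option is to sort the result. A sketch with made-up data:

```python
# Made-up sample in which 'a' and 'd' each appear twice.
nested = [['c', 'd'], ['a'], ['b'], ['a', 'd']]

# The set comprehension removes duplicates; sorted() accepts any
# iterable and returns a new sorted list.
unique_sorted = sorted({x for y in nested for x in y})
print(unique_sorted)
# ['a', 'b', 'c', 'd']
```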


 