I have a text file like this:
a w
b x
c,d y
e,f z
I want to collect the values of the first column into a list without duplicates. So far I extract the first column like this:
f=open("file.txt","r")
lines=f.readlines()
firstCol=[]
for x in lines:
    firstCol.append(x.split(' ')[0])
f.close()
In the next step I want to separate the values by a comma delimiter the same way I did before, but then I get an output like this:
[['a'], ['b'], ['c', 'd'], ['e', 'f']]
How can I convert this into a one-dimensional list so that I can remove duplicates afterwards? I am a beginner in Python.
You can use itertools.chain to flatten your list of lists, and then use the built-in class set to remove the duplicates:
from itertools import chain
l = [['a'], ['b'], ['c', 'd'], ['e', 'f']]
set(chain.from_iterable(l))
# {'a', 'b', 'c', 'd', 'e', 'f'}
To flatten your list, you can also use a list comprehension:
my_l = [e for i in l for e in i]
# ['a', 'b', 'c', 'd', 'e', 'f']
The same with two simple for loops:
my_l = []
for i in l:
    for e in i:
        my_l.append(e)
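Note that a set does not keep the original order of the items. If the order matters, here is a minimal sketch that uses dict.fromkeys instead (dict keys are unique and keep insertion order in Python 3.7+). The duplicate 'a' in the sample data is added only for illustration:

```python
from itertools import chain

# Sample data from the question, with a duplicate ['a'] added for illustration.
l = [['a'], ['b'], ['c', 'd'], ['a'], ['e', 'f']]

# dict keys are unique and keep insertion order (Python 3.7+),
# so this drops duplicates while preserving first-seen order.
unique_in_order = list(dict.fromkeys(chain.from_iterable(l)))
print(unique_in_order)
# ['a', 'b', 'c', 'd', 'e', 'f']
```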
You can split on the comma immediately after the first split. Use extend instead of append so that the pieces are added individually rather than as a nested list.
f=open("file.txt","r")
lines=f.readlines()
firstCol=[]
for x in lines:
    firstCol.extend(x.split(' ')[0].split(','))
f.close()
print(firstCol)
Result
['a', 'b', 'c', 'd', 'e', 'f']
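The same idea could also be written with a with statement, which closes the file automatically. This is only a sketch: the temporary file stands in for file.txt so the example is self-contained, and skipping blank lines is an added assumption, not something the question asked for:

```python
import os
import tempfile

# Write the sample data from the question to a temporary file
# (a stand-in for file.txt so the sketch runs on its own).
sample = "a w\nb x\nc,d y\ne,f z\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

firstCol = []
# The with statement closes the file automatically, even if an error occurs.
with open(path, "r") as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines (assumption: they carry no data)
        # First split on the space, then split the first column on commas.
        firstCol.extend(line.split(' ')[0].split(','))

os.remove(path)
print(firstCol)
# ['a', 'b', 'c', 'd', 'e', 'f']
```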
Or, if you want to keep firstCol as it is:
f=open("file.txt","r")
lines=f.readlines()
firstCol=[]
for x in lines:
    firstCol.append(x.split(' ')[0])
f.close()
one_dimension = []
for col in firstCol:
    one_dimension.extend(col.split(','))
print(firstCol)
print(one_dimension)
Result
['a', 'b', 'c,d', 'e,f']
['a', 'b', 'c', 'd', 'e', 'f']
If you are fine with your code, you can keep it as it is and remove duplicate sublists from the list of lists as follows (the sort is needed because groupby only merges adjacent equal elements):
import itertools
firstCol.sort()
firstCol = list(x for x,_ in itertools.groupby(firstCol))
If you want to convert the list of lists into one list of items:
firstCol = [x for y in firstCol for x in y]
If you want to also remove duplicates:
firstCol = list(set([x for y in firstCol for x in y]))
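To make the difference between these steps concrete, here is a small sketch on made-up sample data (the list below is hypothetical, not from the question). Removing duplicate sublists with groupby does not remove duplicate items, which is why the set step is still needed after flattening:

```python
import itertools

# Hypothetical sample: a list of lists with a repeated sublist (['a'])
# and a repeated item ('d' appears in two different sublists).
firstCol = [['c', 'd'], ['a'], ['b'], ['a'], ['d']]

# groupby only merges adjacent equal elements, hence the sort first.
firstCol.sort()
unique_sublists = [x for x, _ in itertools.groupby(firstCol)]
print(unique_sublists)
# [['a'], ['b'], ['c', 'd'], ['d']]

# Flatten, then drop duplicate items; sorted() gives a predictable order,
# since a set itself has no guaranteed ordering.
unique_items = sorted(set(x for y in unique_sublists for x in y))
print(unique_items)
# ['a', 'b', 'c', 'd']
```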