Putting column values from a text file into a list in Python
I have a text file like this:
a w
b x
c,d y
e,f z
I want to get the values of the first column into a list without duplicates. For now I get the values from the first column like this:
f=open("file.txt","r")
lines=f.readlines()
firstCol=[]
for x in lines:
    firstCol.append(x.split(' ')[0])
f.close()
In the next step I want to split the values on the comma delimiter the same way I did before, but then I get output like this:
[['a'], ['b'], ['c', 'd'], ['e', 'f']]
How can I convert this into a one-dimensional list so that I can remove duplicates afterwards? I am a beginner in Python.
You can use itertools.chain to flatten your list of lists, and then use the built-in class set to remove the duplicates:
from itertools import chain
l = [['a'], ['b'], ['c', 'd'], ['e', 'f']]
set(chain.from_iterable(l))
# {'a', 'b', 'c', 'd', 'e', 'f'}
To flatten your list you can also use a list comprehension:
my_l = [e for i in l for e in i]
# ['a', 'b', 'c', 'd', 'e', 'f']
The same with two simple for loops:
my_l = []
for i in l:
    for e in i:
        my_l.append(e)
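Flattening alone does not remove duplicates, and a set does not keep the original order of the values. If the order matters, one option is dict.fromkeys. The following is a minimal sketch; the variable names and the extra repeated values in the sample data are assumptions added for illustration:

```python
from itertools import chain

# Sample data from the question, with a repeated 'a' and 'c' added
# (an assumption, to make the effect of deduplication visible).
nested = [['a'], ['b'], ['c', 'd'], ['a'], ['e', 'f', 'c']]

# dict keys are unique and keep insertion order (Python 3.7+),
# so this drops duplicates while preserving first-seen order.
unique_in_order = list(dict.fromkeys(chain.from_iterable(nested)))

print(unique_in_order)
# ['a', 'b', 'c', 'd', 'e', 'f']
```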
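You can split on the comma immediately after the first split; in that case you must use extend instead of append, because extend adds each item of the split result to the list rather than adding the result as a single nested list.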
f=open("file.txt","r")
lines=f.readlines()
firstCol=[]
for x in lines:
    firstCol.extend(x.split(' ')[0].split(','))
f.close()
print(firstCol)
Result:
['a', 'b', 'c', 'd', 'e', 'f']
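The same idea can also be written with a with block, which closes the file automatically even if an error occurs. This is only a sketch: it assumes the file has one row per line as in the question, writes the sample data to a temporary file so the example is self-contained, and uses illustrative names (sample, path, first_col) that are not from the original post.

```python
import os
import tempfile

# Write the sample data from the question to a temporary file so the
# sketch runs on its own; in practice you would open "file.txt" directly.
sample = "a w\nb x\nc,d y\ne,f z\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

first_col = []
with open(path, "r") as f:      # the file is closed when the block ends
    for line in f:
        parts = line.split()    # split on any run of whitespace
        if not parts:           # skip blank lines
            continue
        first_col.extend(parts[0].split(","))

os.remove(path)                 # clean up the temporary file
print(first_col)
# ['a', 'b', 'c', 'd', 'e', 'f']
```

Calling split() without an argument also tolerates tabs or repeated spaces between columns, which split(' ') does not.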
Or, if you want to keep firstCol:
f=open("file.txt","r")
lines=f.readlines()
firstCol=[]
for x in lines:
    firstCol.append(x.split(' ')[0])
f.close()
one_dimension = []
for col in firstCol:
    one_dimension.extend(col.split(','))
print(firstCol)
print(one_dimension)
Result:
['a', 'b', 'c,d', 'e,f']
['a', 'b', 'c', 'd', 'e', 'f']
If you are fine with your code, you can keep it as it is and remove duplicates from the list of lists with the following:
import itertools
firstCol.sort()
firstCol = list(x for x,_ in itertools.groupby(firstCol))
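itertools.groupby only merges consecutive equal items, which is why the list is sorted first. Note that this removes duplicate rows (whole inner lists), not duplicate individual values. A small sketch with made-up data, where the names rows and unique_rows are illustrative:

```python
import itertools

# Made-up data: the row ['a'] appears twice and is not adjacent.
rows = [['a'], ['c', 'd'], ['a'], ['b']]

# Lists compare element by element, so sorting puts equal rows next to each other.
rows.sort()

# groupby yields one (key, group) pair per run of equal rows; keep only the keys.
unique_rows = [key for key, _ in itertools.groupby(rows)]

print(unique_rows)
# [['a'], ['b'], ['c', 'd']]
```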
If you want to convert the list of lists into a single flat list of items:
firstCol = [x for y in firstCol for x in y]
If you also want to remove duplicates (note that a set does not preserve the original order):
firstCol = list(set([x for y in firstCol for x in y]))