Putting column values from a text file into a list in Python

I have a text file like this:

a    w
b    x
c,d  y
e,f  z

I want to collect the values of the first column into a list without duplicates. So far I can read the first column, like this:

f=open("file.txt","r")
lines=f.readlines()
firstCol=[]
for x in lines:
    firstCol.append(x.split('   ')[0])
f.close()

In the next step I split each value on the comma delimiter in the same way, but then I get output like this:

[['a'], ['b'], ['c', 'd'], ['e', 'f']]

How can I flatten this into a one-dimensional list so that I can remove duplicates afterwards? I am a beginner in Python.

You can use itertools.chain to flatten your list of lists, and then use the built-in set class to remove the duplicates:

from itertools import chain

l = [['a'], ['b'], ['c', 'd'], ['e', 'f']]
set(chain.from_iterable(l))
# {'a', 'b', 'c', 'd', 'e', 'f'}

To flatten your list you can also use a list comprehension:

my_l = [e for i in l for e in i]
# ['a', 'b', 'c', 'd', 'e', 'f']

The same with two simple for loops:

my_l = []

for i in l:
    for e in i:
        my_l.append(e)
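
A set removes duplicates but does not keep the original order of the values. If the order matters, one possible variation (a sketch, not part of the original answer) is to pass the flattened values through dict.fromkeys, since dict keys are unique and keep insertion order in Python 3.7+. The extra ['a', 'c'] entry is made-up data added only to show a duplicate being dropped:

```python
from itertools import chain

# Nested list from the question, plus a hypothetical extra entry with repeats.
nested = [['a'], ['b'], ['c', 'd'], ['e', 'f'], ['a', 'c']]

# dict keys are unique and keep insertion order (Python 3.7+),
# so this keeps only the first occurrence of each value.
unique_in_order = list(dict.fromkeys(chain.from_iterable(nested)))

print(unique_in_order)
# ['a', 'b', 'c', 'd', 'e', 'f']
```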

You can split on the comma immediately after the first split. You then need extend instead of append, because extend adds each item of the split result individually, whereas append would add the whole list as a single element.

firstCol = []
# The with statement closes the file automatically.
with open("file.txt", "r") as f:
    for x in f:
        # Take the first space-separated field, then split it on commas.
        firstCol.extend(x.split(' ')[0].split(','))

print(firstCol)

Result:

['a', 'b', 'c', 'd', 'e', 'f']
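
Putting the pieces together, the following is a minimal end-to-end sketch under a few assumptions that are not in the original post: the columns are separated by any run of whitespace, blank lines should be skipped, and the first occurrence of each value should be kept. An io.StringIO object stands in for open("file.txt") so the example runs on its own; the blank line and the trailing "a    v" row are made-up additions to show those cases.

```python
import io

# Stand-in for open("file.txt"); mirrors the sample file, plus a blank line
# and a hypothetical repeated value to exercise the edge cases.
sample = io.StringIO("a    w\nb    x\nc,d  y\ne,f  z\n\na    v\n")

first_col = []
seen = set()
for line in sample:
    fields = line.split()      # split on any run of whitespace
    if not fields:             # skip blank lines
        continue
    for value in fields[0].split(','):
        if value not in seen:  # keep only the first occurrence
            seen.add(value)
            first_col.append(value)

print(first_col)
# ['a', 'b', 'c', 'd', 'e', 'f']
```

Calling split() without an argument avoids depending on the exact number of spaces between the columns.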

Or, if you also want to keep firstCol:

firstCol = []
# The with statement closes the file automatically.
with open("file.txt", "r") as f:
    for x in f:
        # Keep the raw first field, e.g. 'c,d'.
        firstCol.append(x.split(' ')[0])

one_dimension = []
for col in firstCol:
    one_dimension.extend(col.split(','))

print(firstCol)
print(one_dimension)

Result:

['a', 'b', 'c,d', 'e,f']
['a', 'b', 'c', 'd', 'e', 'f']

Possible solution 1

If you are happy with your code, you can keep it as it is and remove duplicates from the list of lists as follows:

import itertools

firstCol.sort()
firstCol = list(x for x,_ in itertools.groupby(firstCol))
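
To illustrate why the sort comes first: itertools.groupby only collapses equal items that are adjacent, so sorting is what brings identical sub-lists next to each other. The input below is hypothetical sample data, and note that this approach removes duplicate sub-lists, not duplicate individual values (['a'] and ['a', 'c'] would both survive).

```python
import itertools

# Hypothetical input containing repeated sub-lists.
first_col = [['c', 'd'], ['a'], ['b'], ['a'], ['c', 'd']]

# Lists compare element by element, so equal sub-lists end up adjacent.
first_col.sort()

# groupby yields one (key, group) pair per run of equal items; keep the keys.
deduped = [key for key, _ in itertools.groupby(first_col)]

print(deduped)
# [['a'], ['b'], ['c', 'd']]
```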

Possible solution 2

If you want to convert the list of lists into one flat list of items:

firstCol = [x for y in firstCol for x in y]

If you also want to remove duplicates (note that a set does not preserve the original order):

firstCol = list(set([x for y in firstCol for x in y]))
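
Because the iteration order of a set is arbitrary, the resulting list may come out in a different order between runs. If a predictable order is wanted, one option (a small sketch with made-up sample data) is to sort the de-duplicated values:

```python
# Hypothetical nested input with repeated values.
first_col = [['b'], ['a'], ['c', 'd'], ['a', 'd']]

# The set comprehension flattens and de-duplicates in one step;
# sorted() then returns a list in alphabetical order.
unique_sorted = sorted({x for y in first_col for x in y})

print(unique_sorted)
# ['a', 'b', 'c', 'd']
```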
