Python script to concatenate values by row and delete identical values

I am using Python 2.7, and I have a text file that looks like this:

id     value
---    ----
1      x
2      a
1      z
1      y
2      b

I am trying to get output that looks like this:

id     value
---    ----
1      x,z,y
2      a,b

Any help is much appreciated!

The simplest solution is to use collections.defaultdict together with collections.OrderedDict. If you don't care about the order, you can also use a set instead of the OrderedDict.

from collections import defaultdict, OrderedDict

# Keeps all unique values for each id
dd = defaultdict(OrderedDict)
# Keeps the unique ids in order of appearance
ids = OrderedDict()

with open(yourfilename) as f:
    # skip the two header lines
    next(f), next(f)
    for line in f:
        if not line.strip():
            continue  # ignore blank lines, e.g. an empty line at the end of the file
        id_, value = line.split()  # split() with no arguments splits on whitespace and drops empty strings
        dd[id_][value] = None  # dicts need a value, but here it doesn't matter which one...
        ids[id_] = None

print('id     value')
print('---    ----')
for id_ in ids:
    print('{}      {}'.format(id_, ','.join(dd[id_])))

Result:

id     value
---    ----
1      x,z,y
2      a,b
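
If the order of the values does not matter, the set-based variant mentioned in this answer might look roughly like the sketch below. The `rows` list is made-up sample data standing in for the parsed file, and `groups` is just an illustrative name; the output is sorted only to make it predictable, since sets have no defined order.

```python
from collections import defaultdict

# Sample (id, value) pairs standing in for the parsed file contents (an assumption)
rows = [('1', 'x'), ('2', 'a'), ('1', 'z'), ('1', 'y'), ('2', 'b'), ('1', 'x')]

groups = defaultdict(set)
for id_, value in rows:
    groups[id_].add(value)  # adding a value that is already present has no effect

for id_ in sorted(groups):
    # sort the values too, because a set does not remember insertion order
    print('{}      {}'.format(id_, ','.join(sorted(groups[id_]))))
```

Note that with a set the values come out as x,y,z rather than in the x,z,y order of the input, which is the trade-off compared with OrderedDict.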

If you want to write the result to another file instead, just join the lines I printed with \n and write them to the file.
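
A minimal sketch of that idea, assuming `ids` and `dd` have the same shape as in the code above. The helper name `format_rows`, the sample data, and the output location `output.txt` in the temp directory are placeholders for illustration, not part of the original answer.

```python
import os
import tempfile
from collections import OrderedDict, defaultdict


def format_rows(ids, dd):
    """Hypothetical helper: return the header plus one line per id."""
    lines = ['id     value', '---    ----']
    for id_ in ids:
        lines.append('{}      {}'.format(id_, ','.join(dd[id_])))
    return lines


# Small stand-in for the data parsed from the input file
dd = defaultdict(OrderedDict)
ids = OrderedDict()
for id_, value in [('1', 'x'), ('2', 'a'), ('1', 'z'), ('1', 'y'), ('2', 'b')]:
    dd[id_][value] = None
    ids[id_] = None

lines = format_rows(ids, dd)

# Placeholder output path; replace it with wherever the result should go
out_path = os.path.join(tempfile.gettempdir(), 'output.txt')
with open(out_path, 'w') as out:
    out.write('\n'.join(lines) + '\n')
```

Collecting the lines in a list first keeps the formatting in one place, so the same lines can be printed or written as needed.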

I think this might work as well, although the other answers seem more sophisticated:

rows = ['1,x',
        '2,a',
        '1,z',
        '1,y',
        '2,b',
        '2,a',  # added extra values to show duplicates won't be added
        '1,z',
        '1,y']

output = {}

for row in rows:
    id_, value = row.split(",")
    if id_ not in output:
        output[id_] = value
    # compare against the comma-separated values stored so far;
    # list(output[id_]) would break the string into single characters
    elif value not in output[id_].split(","):
        output[id_] += "," + value

You end up with a dictionary that is close to what you asked for.
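
To turn such a dictionary into the table from the question, something like the following sketch could be used. The helper name `render` is hypothetical, and the `output` literal is just an example of the dictionary's shape. The ids are sorted as strings here because plain dicts are unordered in Python 2.7, so '10' would sort before '2'.

```python
# Example dictionary of the shape built above: id -> comma-joined values
output = {'1': 'x,z,y', '2': 'a,b'}


def render(output):
    """Hypothetical helper: format the dictionary as the requested table."""
    lines = ['id     value', '---    ----']
    for id_ in sorted(output):  # string sort; dict order is not guaranteed in Python 2.7
        lines.append('{}      {}'.format(id_, output[id_]))
    return '\n'.join(lines)


table = render(output)
print(table)
```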

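# read
fp = open('', 'r')  # the file path is left empty here, as in the original
lines = fp.read().split("\n")
fp.close()

# split every line into its whitespace-separated fields, skipping blank lines
d = [line.split() for line in lines if line.strip()]

# map each id to the list of distinct values seen for it
m = {}
for i in d:
    if i[0] not in m:
        m[i[0]] = [i[1]]
    else:
        if i[1] not in m[i[0]]:
            m[i[0]].append(i[1])
for i in m:
    print("{} {}".format(i, ",".join(m[i])))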

