
How to find the first unique item in a list of values in a dict (Python 3.4)

Hello all, I have a dict:

dat = {
       '2018-01':['jack', 'jhon','mary','mary','jack'],
       '2018-02':['Oliver', 'Connor','mary','Liam','jack','Oliver'],
       '2018-03':['Jacob', 'jhon','Reece','mary','jack'],
       '2018-04':['George', 'jhon','mary','Alexander','Richard'],
}

I want output like this:

    Output = {
              '2018-01':['jack','jhon','mary'],
              '2018-02':['Oliver', 'Connor','Liam'],
              '2018-03':['Jacob','Reece'],
              '2018-04':['George','Alexander','Richard'] 
}

My code is a nested for loop that inserts the items into a list:

lis = []
for key,value in dat.iteritems():   
    for va in value:
        if va not in lis:
            val = key,va
            lis.append(val)

But the lists in my dict "dat" contain a great many items. How can I do this without a nested for loop? It is consuming a lot of time.

Thanks in advance.

What you are trying to do is this:

dat = {
       '2018-01':['jack', 'jhon','mary','mary','jack'],
       '2018-02':['Oliver', 'Connor','mary','Liam','jack','Oliver'],
       '2018-03':['Jacob', 'jhon','Reece','mary','jack'],
       '2018-04':['George', 'jhon','mary','Alexander','Richard'],
}

unique = set()
res = {}
for key, values in dat.items():
    res[key] = []
    for value in values:
        if value not in unique:
            res[key].append(value)
            unique.add(value)

which produces:

{'2018-01': ['jack', 'jhon', 'mary'], 
 '2018-02': ['Oliver', 'Connor', 'Liam'], 
 '2018-03': ['Jacob', 'Reece'], 
 '2018-04': ['George', 'Alexander', 'Richard']}

BUT

the order of dictionaries was not guaranteed before Python 3.7, and this makes the above code dangerous. The reason is that with the same input you might end up with several different outputs.

To understand why, take a look at this:

list1 = ['foo', 'bar', 'foobar']
list2 = ['bar']
  1. If I use list1 to eliminate all duplicates I would end up with:

     list1 = ['foo', 'bar', 'foobar']
     list2 = []

  2. If I use list2 to eliminate all duplicates I would end up with:

     list1 = ['foo', 'foobar']
     list2 = ['bar']

So depending on which list I start with, I end up with different results. With the dict from your example, which list you start with is anyone's guess.
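
The order dependence can be reproduced directly. The following is a minimal sketch; the helper name first_seen and its explicit order argument are illustrative assumptions, not part of the code above:

```python
def first_seen(data, order):
    """Keep each value only in the first list (following `order`) that contains it."""
    seen = set()
    result = {}
    for key in order:
        result[key] = []
        for value in data[key]:
            if value not in seen:
                result[key].append(value)
                seen.add(value)
    return result


data = {'list1': ['foo', 'bar', 'foobar'], 'list2': ['bar']}

# Processing list1 first leaves nothing unique for list2.
print(first_seen(data, ['list1', 'list2']))

# Processing list2 first removes 'bar' from list1 instead.
print(first_seen(data, ['list2', 'list1']))
```

Passing the key order explicitly makes the result reproducible regardless of how the dict happens to iterate.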


There is still hope, however,

because you can start with an OrderedDict (from collections):

from collections import OrderedDict

dat = OrderedDict([('2018-01', ['jack', 'jhon', 'mary', 'mary', 'jack']),
                   ('2018-02', ['Oliver', 'Connor', 'mary', 'Liam', 'jack', 'Oliver']), 
                   ('2018-03', ['Jacob', 'jhon', 'Reece', 'mary', 'jack']), 
                   ('2018-04', ['George', 'jhon', 'mary', 'Alexander', 'Richard'])])

and then continue with the rest of the code as before.

Another take on @Ev. Kounis's approach, using sets and OrderedDict (and pprint for the sake of pretty printing):

import pprint
from collections import OrderedDict

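# Build from a list of pairs: a plain dict literal would not keep
# its order before Python 3.7, which defeats the purpose of OrderedDict.
dat = OrderedDict([
    ('2018-01', ['jack', 'jhon', 'mary', 'mary', 'jack']),
    ('2018-02', ['Oliver', 'Connor', 'mary', 'Liam', 'jack', 'Oliver']),
    ('2018-03', ['Jacob', 'jhon', 'Reece', 'mary', 'jack']),
    ('2018-04', ['George', 'jhon', 'mary', 'Alexander', 'Richard']),
])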

exist = set()
output = OrderedDict()

for k, v in dat.items():
    output[k] = set(v) - exist
    exist.update(v)

pprint.pprint(output)

# OrderedDict([('2018-01', {'mary', 'jack', 'jhon'}),
#             ('2018-02', {'Connor', 'Oliver', 'Liam'}),
#             ('2018-03', {'Jacob', 'Reece'}),
#             ('2018-04', {'George', 'Alexander', 'Richard'})])

You can do something like this:

l=[]
for k,v in dat.items():
    dat[k] = list(set([i for i in v if i not in l]))
    l = l + v

Now dat will be:

{
    '2018-01': ['jhon', 'mary', 'jack'],
    '2018-02': ['Oliver', 'Liam', 'Connor'],
    '2018-03': ['Jacob', 'Reece'],
    '2018-04': ['George', 'Alexander', 'Richard']
}

If you don't mind about the order in the lists of values, this can be a solution. Note that the output of this solution may differ depending on the Python version: dicts are guaranteed to keep insertion order only from Python 3.7 (in CPython 3.6 this is an implementation detail).

dat = {
'2018-01': ['jack', 'jhon', 'mary', 'mary', 'jack'],
'2018-02': ['Oliver', 'Connor', 'mary', 'Liam', 'jack', 'Oliver'],
'2018-03': ['Jacob', 'jhon', 'Reece', 'mary', 'jack'],
'2018-04': ['George', 'jhon', 'mary', 'Alexander', 'Richard'],
}

s = set()
d = {}
for k,v in dat.items():
    d[k] = list(set(v) - s)
    s.update(d[k])

#{'2018-01': ['jack', 'jhon', 'mary'], '2018-02': ['Connor', 'Oliver', 'Liam'], '2018-03': ['Reece', 'Jacob'], '2018-04': ['Richard', 'Alexander', 'George']}
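
If the order inside each list does matter, a variant of the same idea can keep it. This is only a sketch: the function name unique_in_order is made up for illustration, and it assumes the keys should be processed in sorted order, as the date-like keys here suggest. OrderedDict.fromkeys drops repeats within a list while keeping first-seen order:

```python
from collections import OrderedDict


def unique_in_order(data):
    """Keep each value only under the first (sorted) key where it appears."""
    seen = set()
    result = OrderedDict()
    for key in sorted(data):  # assumption: keys sort into the intended order
        # fromkeys removes duplicates within the list, preserving first-seen order
        fresh = [v for v in OrderedDict.fromkeys(data[key]) if v not in seen]
        result[key] = fresh
        seen.update(fresh)
    return result


dat = {
    '2018-01': ['jack', 'jhon', 'mary', 'mary', 'jack'],
    '2018-02': ['Oliver', 'Connor', 'mary', 'Liam', 'jack', 'Oliver'],
    '2018-03': ['Jacob', 'jhon', 'Reece', 'mary', 'jack'],
    '2018-04': ['George', 'jhon', 'mary', 'Alexander', 'Richard'],
}

for key, values in unique_in_order(dat).items():
    print(key, values)
```

Membership tests go against a set, so each value is checked in roughly constant time rather than by scanning a growing list.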

I think this is what you need; I just edited your code:

dat = {
       '2018-01':['jack', 'jhon','mary','mary','jack'],
       '2018-02':['Oliver', 'Connor','mary','Liam','jack','Oliver'],
       '2018-03':['Jacob', 'jhon','Reece','mary','jack'],
       '2018-04':['George', 'jhon','mary','Alexander','Richard'],
}

lis= dat.values()
lis = list(set([item for sublist in lis for item in sublist]))
out_val = []
for key, value in dat.items():  # dict.iteritems() does not exist in Python 3
    res = []
    for i in value :
        if i in lis :
            res.append(i)
            lis.remove(i)
    out_val.append(res)

your_output=dict(zip( dat.keys(), out_val))

Output:

{'2018-01': ['jack', 'jhon', 'mary'], 
'2018-03': ['Jacob', 'Reece'], 
'2018-02': ['Oliver', 'Connor', 'Liam'], 
'2018-04': ['George', 'Alexander', 'Richard']}

Assuming the order is given by the keys ['2018-01', '2018-02', '2018-03', '2018-04'], you could loop over the keys in that order, like this:

d = {'2018-01': ['jack', 'jhon', 'mary', 'mary', 'jack'],
     '2018-02': ['Oliver', 'Connor', 'mary', 'Liam', 'jack', 'Oliver'],
     '2018-03': ['Jacob', 'jhon', 'Reece', 'mary', 'jack'],
     '2018-04': ['George', 'jhon', 'mary', 'Alexander', 'Richard']}

result = {}
found = set()
for i in sorted(d):
    result[i] = list(set(d[i]).difference(found))
    found.update(d[i])

for i in sorted(result):
     print(i, result[i])

Output

2018-01 ['mary', 'jhon', 'jack']
2018-02 ['Oliver', 'Liam', 'Connor']
2018-03 ['Reece', 'Jacob']
2018-04 ['Alexander', 'Richard', 'George']

Try this.

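tmp_list1 = []  # every value kept so far, across all keys
for key, value in dat.items():  # items() rather than the Python 2 iteritems()
    tmp_list2 = []
    # Drop duplicates within this list (set order is arbitrary)
    dat[key] = list(set(value))
    for val in dat[key]:
        if val not in tmp_list1:
            tmp_list2.append(val)
    dat[key] = tmp_list2
    tmp_list1 = tmp_list1 + tmp_list2

print(dat)
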
import itertools
for i in d:
     d[i].sort()
     d[i] = list(i for i, _ in itertools.groupby(d[i]))

# Print the dict containing unique lists for keys.
for i in d:
     print(i, "->", d[i])
