
Fastest way of finding common elements between two list of lists in python

I have a list as follows.

mylist = 
[  
   [  
      [  
         "chocolate_pudding",
         920.8000000000001
      ],
      [  
         "caramel_pudding",
         345.59999999999997
      ],
      [  
         "pudding",
         248.0
      ],
      [  
         "banana_pudding",
         27.599999999999998
      ]
   ],
   [  
      [  
         "biscuits",
         190.8
      ],
      [  
         "chocolates",
         33.599999999999994
      ],
      [  
         "chocolate_pudding",
         920.8000000000001
      ]
   ],
   [  
      [  
         "tiramusu",
         145.8
      ]
   ],
   [  
      [  
         "cakes",
         139.29999999999998
      ]
   ],
   [  
      [  
         "butter_cakes",
         133.0
      ]
   ],
   [  
      [  
         "chocolate_pudding",
         920.8000000000001
      ]
   ]
]

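I want to find the elements that occur more than once across the sub-lists (for example, ["chocolate_pudding", 920.8000000000001]) and remove the repeated ["chocolate_pudding", 920.8000000000001] entries while keeping the first one.

So my output should look like this.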

mylist = 
[  
   [  
      [  
         "chocolate_pudding",
         920.8000000000001
      ],
      [  
         "caramel_pudding",
         345.59999999999997
      ],
      [  
         "pudding",
         248.0
      ],
      [  
         "banana_pudding",
         27.599999999999998
      ]
   ],
   [  
      [  
         "biscuits",
         190.8
      ],
      [  
         "chocolates",
         33.599999999999994
      ]
   ],
   [  
      [  
         "tiramusu",
         145.8
      ]
   ],
   [  
      [  
         "cakes",
         139.29999999999998
      ]
   ],
   [  
      [  
         "butter_cakes",
         133.0
      ]
   ]
]

The code I have been trying is as follows.

mylist_copy = mylist

for item in mylist:
    myindex = mylist.index(item)
    #print(item)

    for single_item in item:
        #print(single_item)
        for item_copy in mylist_copy:
            if mylist_copy.index(item_copy) != myindex:
                if single_item in item_copy:
                    print(single_item)

Since it has so many for loops, I would like a more efficient way. Note: I also tried:

mylist_copy = mylist

for item in mylist:
    myindex = mylist.index(item)
    for item_copy in mylist_copy:
          if mylist_copy.index(item_copy) != myindex:
                print(set(item).intersection(item_copy))

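However, intersection does not support lists.

Is there a simple and fast way to do this in Python?

Using a set() object and preserving the order of the sub-lists: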
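The failure comes from set(item): a set can only hold hashable objects and lists are not hashable, so the call raises TypeError before intersection is even reached. Here is a minimal sketch of the cause and one workaround, assuming each inner entry can be converted to a tuple. The names first, second, raised, and common are illustrative only and do not come from the code above.

```python
first = [["chocolate_pudding", 920.8000000000001], ["pudding", 248.0]]
second = [["biscuits", 190.8], ["chocolate_pudding", 920.8000000000001]]

# Lists are unhashable, so building a set of lists raises TypeError.
try:
    set(first)
    raised = False
except TypeError:
    raised = True

# Tuples are hashable, so converting each inner list makes intersection work.
common = {tuple(pair) for pair in first} & {tuple(pair) for pair in second}
print(raised, common)
```

This only finds the shared entries; removing them from the nested structure still needs a pass over the sub-lists.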

mylist = [[["chocolate_pudding", 920.8000000000001], ["caramel_pudding", 345.59999999999997], 
          ["pudding", 248.0], ["banana_pudding", 27.599999999999998]], [["biscuits", 190.8], 
          ["chocolates", 33.599999999999994], ["chocolate_pudding", 920.8000000000001]], 
          [["tiramusu", 145.8]], [["cakes", 139.29999999999998]], [["butter_cakes", 133.0]], 
          [["chocolate_pudding", 920.8000000000001]]]

result, foods = [], set()
for sub_l in mylist:
    new_sublist = []
    for i in sub_l:
        if i[0] not in foods:     # on the 1st occurrence of `foodstuff` name
            new_sublist.append(i)
            foods.add(i[0])       # add `foodstuff` into set of unique foods
    if new_sublist: result.append(new_sublist)

print(result)

Output:

[[['chocolate_pudding', 920.8000000000001], ['caramel_pudding', 345.59999999999997], ['pudding', 248.0], ['banana_pudding', 27.599999999999998]], [['biscuits', 190.8], ['chocolates', 33.599999999999994]], [['tiramusu', 145.8]], [['cakes', 139.29999999999998]], [['butter_cakes', 133.0]]]

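You can flatten the inner lists and put them all into a set. A set cannot contain duplicates, so you don't even have to check for them; the set takes care of that for you, and quickly. The only caveat is that a set cannot contain lists, so they need to be converted to tuples first. If you are fine with these two type conversions, it can be done in a simple set comprehension and should be fairly fast: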

no_duplicates = {tuple(inner) for outer in mylist for inner in outer}

Or change the types back afterwards:

no_dupe_lists = list(map(list, no_duplicates))
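One thing to keep in mind with the set comprehension is that the result is flat and unordered, so both the grouping into sub-lists and the original order are gone. If a flat result is acceptable but first-seen order matters, a possible variant is to use dict.fromkeys instead of a set, since dict keys are unique and keep insertion order (Python 3.7+). This is a sketch on a shortened copy of the data; unique_pairs and ordered_flat are illustrative names.

```python
mylist = [
    [["chocolate_pudding", 920.8000000000001], ["caramel_pudding", 345.59999999999997]],
    [["biscuits", 190.8], ["chocolate_pudding", 920.8000000000001]],
    [["chocolate_pudding", 920.8000000000001]],
]

# dict keys are unique and keep insertion order, so the first occurrence wins.
unique_pairs = dict.fromkeys(tuple(inner) for outer in mylist for inner in outer)

# Convert the tuple keys back to lists, still in first-seen order.
ordered_flat = [list(pair) for pair in unique_pairs]
print(ordered_flat)
```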

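You didn't ask about this, but if you want to copy a list you have to use a proper copying technique: mylist_copy = list(mylist), mylist_copy = mylist[:], or mylist_copy = [element for element in mylist]. The first one is recommended.

Since your list contains nested lists, those need to be copied as well: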

mylist_copy = [[list(inner) for inner in outer] for outer in mylist]
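To see why the distinction matters, the sketch below compares plain assignment, a shallow copy, and a full copy after an innermost list is mutated. It uses copy.deepcopy from the standard library as an alternative to the nested comprehension; the variable names alias, shallow, and deep are illustrative.

```python
import copy

mylist = [[["pudding", 248.0]], [["cakes", 139.29999999999998]]]

alias = mylist                  # same object under a second name, not a copy
shallow = list(mylist)          # new outer list, but the inner lists are shared
deep = copy.deepcopy(mylist)    # every nesting level is copied

mylist[0][0][1] = 0.0           # mutate an innermost list in place

print(alias is mylist)          # the alias is the original list itself
print(shallow[0][0][1])         # the shallow copy sees the mutation
print(deep[0][0][1])            # the deep copy keeps the old value
```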

A great man once said: just take what you want, why delete? Along those lines:

mylist = [[["chocolate_pudding", 920.8000000000001], ["caramel_pudding", 345.59999999999997],
          ["pudding", 248.0], ["banana_pudding", 27.599999999999998]], [["biscuits", 190.8],
          ["chocolates", 33.599999999999994], ["chocolate_pudding", 920.8000000000001]],
          [["tiramusu", 145.8]], [["cakes", 139.29999999999998]], [["butter_cakes", 133.0]],
          [["chocolate_pudding", 920.8000000000001]]]


result = []
track = []
for i in mylist:
    sublist = []
    for k in i:
        if k not in track:
            track.append(k)
            sublist.append(k)
    if sublist:
        result.append(sublist)

print(result)

Output:

[[['chocolate_pudding', 920.8000000000001], ['caramel_pudding', 345.59999999999997], ['pudding', 248.0], ['banana_pudding', 27.599999999999998]], [['biscuits', 190.8], ['chocolates', 33.599999999999994]], [['tiramusu', 145.8]], [['cakes', 139.29999999999998]], [['butter_cakes', 133.0]]]
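Because track is a list, each `k not in track` test scans everything seen so far, which can get slow on large inputs. A possible variant, sketched here under the assumption that every entry can be turned into a hashable tuple, keeps the seen entries in a set instead. The function name drop_repeated_entries and the shortened sample data are hypothetical and only for illustration.

```python
def drop_repeated_entries(nested):
    """Keep the first occurrence of each [name, value] entry and drop empty groups."""
    seen = set()
    result = []
    for group in nested:
        kept = []
        for entry in group:
            key = tuple(entry)  # lists are unhashable, tuples can go in a set
            if key not in seen:
                seen.add(key)
                kept.append(entry)
        if kept:                # skip groups that became empty
            result.append(kept)
    return result


sample = [
    [["chocolate_pudding", 920.8000000000001], ["pudding", 248.0]],
    [["biscuits", 190.8], ["chocolate_pudding", 920.8000000000001]],
    [["chocolate_pudding", 920.8000000000001]],
]
deduped = drop_repeated_entries(sample)
print(deduped)
```

Unlike the name-only check in the set() answer, this compares the whole entry, so two entries with the same name but different values would both be kept.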
