Remove dictionary from list

If I have a list of dictionaries, say:

[{'id': 1, 'name': 'paul'},
 {'id': 2, 'name': 'john'}]

and I would like to remove the dictionary with id of 2 (or name 'john'), what is the most efficient way to go about this programmatically (that is to say, I don't know the index of the entry in the list, so it can't simply be popped)?

thelist[:] = [d for d in thelist if d.get('id') != 2]
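The slice assignment matters when other names refer to the same list. A minimal sketch of the difference between assigning to `thelist[:]` and rebinding `thelist` (the name `alias` is hypothetical, added only for illustration):

```python
thelist = [{'id': 1, 'name': 'paul'},
           {'id': 2, 'name': 'john'}]
alias = thelist  # hypothetical second reference to the same list object

# Slice assignment replaces the contents of the existing list in place,
# so every reference to it sees the change.
thelist[:] = [d for d in thelist if d.get('id') != 2]
in_place_result = list(alias)

# Plain assignment only rebinds the name; the old list object is untouched.
thelist = [d for d in thelist if d.get('id') != 1]
rebound_result = list(alias)
print(in_place_result, rebound_result)
```

If nothing else holds a reference to the list, plain rebinding works just as well.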

Edit: as some doubts have been expressed in a comment about the performance of this code (some based on misunderstanding Python's performance characteristics, some on assuming, beyond the given specs, that there is exactly one dict in the list with a value of 2 for key 'id'), I wish to offer reassurance on this point.

On an old Linux box, measuring this code:

$ python -mtimeit -s"lod=[{'id':i, 'name':'nam%s'%i} for i in range(99)]; import random" "thelist=list(lod); random.shuffle(thelist); thelist[:] = [d for d in thelist if d.get('id') != 2]"
10000 loops, best of 3: 82.3 usec per loop

of which about 57 microseconds are for the random.shuffle (needed to ensure that the element to remove is not ALWAYS at the same spot;-) and 0.65 microseconds for the initial copy (whoever worries about performance impact of shallow copies of Python lists is most obviously out to lunch;-), needed to avoid altering the original list in the loop (so each leg of the loop does have something to delete;-).

When it is known that there is exactly one item to remove, it's possible to locate and remove it even more expeditiously:

$ python -mtimeit -s"lod=[{'id':i, 'name':'nam%s'%i} for i in range(99)]; import random" "thelist=list(lod); random.shuffle(thelist); where=(i for i,d in enumerate(thelist) if d.get('id')==2).next(); del thelist[where]"
10000 loops, best of 3: 72.8 usec per loop

(use the next builtin rather than the .next method if you're on Python 2.6 or better, of course) -- but this code breaks down if the number of dicts that satisfy the removal condition is not exactly one. Generalizing this, we have:

$ python -mtimeit -s"lod=[{'id':i, 'name':'nam%s'%i} for i in range(33)]*3; import random" "thelist=list(lod); where=[i for i,d in enumerate(thelist) if d.get('id')==2]; where.reverse()" "for i in where: del thelist[i]"
10000 loops, best of 3: 23.7 usec per loop

where the shuffling can be removed because there are already three equispaced dicts to remove, as we know. And the listcomp, unchanged, fares well:

$ python -mtimeit -s"lod=[{'id':i, 'name':'nam%s'%i} for i in range(33)]*3; import random" "thelist=list(lod); thelist[:] = [d for d in thelist if d.get('id') != 2]"
10000 loops, best of 3: 23.8 usec per loop

totally neck and neck, with even just 3 elements of 99 to be removed. With longer lists and more repetitions, this holds even more of course:

$ python -mtimeit -s"lod=[{'id':i, 'name':'nam%s'%i} for i in range(33)]*133; import random" "thelist=list(lod); where=[i for i,d in enumerate(thelist) if d.get('id')==2]; where.reverse()" "for i in where: del thelist[i]"
1000 loops, best of 3: 1.11 msec per loop
$ python -mtimeit -s"lod=[{'id':i, 'name':'nam%s'%i} for i in range(33)]*133; import random" "thelist=list(lod); thelist[:] = [d for d in thelist if d.get('id') != 2]"
1000 loops, best of 3: 998 usec per loop

All in all, it's obviously not worth deploying the subtlety of making and reversing the list of indices to remove, vs the perfectly simple and obvious list comprehension, to possibly gain 100 nanoseconds in one small case -- and lose 113 microseconds in a larger one;-). Avoiding or criticizing simple, straightforward, and perfectly performance-adequate solutions (like list comprehensions for this general class of "remove some items from a list" problems) is a particularly nasty example of Knuth's and Hoare's well-known thesis that "premature optimization is the root of all evil in programming"!-)
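For readers on Python 3, where the `.next` method no longer exists, the two approaches compared above might be sketched as follows. The function names are hypothetical and no timings are claimed here; the sketch only checks that both give the same result when exactly one dict matches.

```python
def remove_with_listcomp(dicts, key, value):
    """Remove every dict whose `key` equals `value`, in place."""
    dicts[:] = [d for d in dicts if d.get(key) != value]


def remove_first_with_next(dicts, key, value):
    """Remove only the first matching dict; do nothing if none matches."""
    # The second argument to next() is returned when the generator is empty.
    where = next((i for i, d in enumerate(dicts) if d.get(key) == value), None)
    if where is not None:
        del dicts[where]


lod = [{'id': i, 'name': 'nam%s' % i} for i in range(99)]
a, b = list(lod), list(lod)
remove_with_listcomp(a, 'id', 2)
remove_first_with_next(b, 'id', 2)
print(len(a), len(b), a == b)
```

Passing a default to `next` avoids the `StopIteration` that would otherwise be raised when no dict matches.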

Here's a way to do it with a list comprehension (assuming you name your list 'foo'):

[x for x in foo if not (2 == x.get('id'))]

Substitute 'john' == x.get('name') or whatever as appropriate.

filter also works:

filter(lambda x: x.get('id')!=2, foo)

And if you want a generator you can use itertools:

itertools.ifilter(lambda x: x.get('id')!=2, foo)

However, as of Python 3, filter will return an iterator anyway, so the list comprehension is really the best choice, as Alex suggested.
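A small sketch of that Python 3 behaviour, assuming the list is named `foo` as above: `filter` yields a lazy iterator, so it has to be wrapped in `list()` to get a list back.

```python
foo = [{'id': 1, 'name': 'paul'},
       {'id': 2, 'name': 'john'}]

lazy = filter(lambda x: x.get('id') != 2, foo)  # an iterator in Python 3
as_list = list(lazy)                            # materialize it

by_comprehension = [x for x in foo if x.get('id') != 2]
print(as_list == by_comprehension)
```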

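This isn't a proper answer (as I think you already have some quite good ones), but... have you considered having a dictionary of <id>:<name> instead of a list of dictionaries?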
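A minimal sketch of that idea, assuming the `id` values are unique (the variable names are made up for illustration). With a dict keyed by id, removal no longer needs a scan:

```python
records = [{'id': 1, 'name': 'paul'},
           {'id': 2, 'name': 'john'}]

# Build the <id>: <name> mapping once; assumes ids are unique.
names_by_id = {d['id']: d['name'] for d in records}

# Removing by id is a single dict operation; the default avoids a KeyError
# when the id is not present.
removed = names_by_id.pop(2, None)
print(removed, names_by_id)
```

Removing by name would still require a search over the values, so this layout only helps when lookups go through the id.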

# assume ls contains your list
for i in range(len(ls)):
    if ls[i]['id'] == 2:
        del ls[i]
        break

This will probably be faster than the list comprehension methods on average because it doesn't traverse the whole list if it finds the item in question early on.

You can try the following:

a = [{'id': 1, 'name': 'paul'},
     {'id': 2, 'name': 'john'}]

for e in range(len(a) - 1, -1, -1):
    if a[e]['id'] == 2:
        a.pop(e)

If you can't pop from the beginning, pop from the end; it won't ruin the for loop.
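To illustrate why the direction matters, here is a sketch with two adjacent matches (sample data invented for the example). Popping at index `e` only shifts the items after `e`, which a backwards loop has already visited:

```python
a = [{'id': 2, 'name': 'john'},
     {'id': 2, 'name': 'johnny'},
     {'id': 1, 'name': 'paul'}]

# Walk the indices from the last one down to 0.
for e in range(len(a) - 1, -1, -1):
    if a[e]['id'] == 2:
        a.pop(e)  # items before index e keep their positions

print(a)
```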

Supposing your Python version is 3.6 or greater, and that you don't need the deleted item, this would be less expensive...

If the dictionaries in the list are unique:

for i in range(len(dicts)):
    if dicts[i].get('id') == 2:
        del dicts[i]
        break

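If you want to remove all matched items:

# Iterate backwards: deleting while counting up would skip items
# and eventually raise IndexError as the list shrinks.
for i in range(len(dicts) - 1, -1, -1):
    if dicts[i].get('id') == 2:
        del dicts[i]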

You can also do this to be sure that getting the id key won't raise a KeyError, regardless of the Python version (None is already the default for .get, so this is equivalent to .get('id')):

if dicts[i].get('id', None) == 2

You could try something along the following lines:

def destructively_remove_if(predicate, items):
    # Delete the first element matching the predicate, in place.
    for k in range(len(items)):
        if predicate(items[k]):
            del items[k]
            break
    return items

people = [
    {'id': 1, 'name': 'John'},
    {'id': 2, 'name': 'Karl'},
    {'id': 3, 'name': 'Desdemona'},
]

print("Before:", people)
destructively_remove_if(lambda p: p["id"] == 2, people)
print("After:", people)

Unless you build something akin to an index over your data, I don't think that you can do better than doing a brute-force "table scan" over the entire list. If your data is sorted by the key you are using, you might be able to employ the bisect module to find the object you are looking for somewhat faster.
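As a rough sketch of the `bisect` idea, assuming the list is already sorted by `id` and the ids are unique: a parallel list of keys is kept so that `bisect_left` can binary-search it. The helper name `remove_by_sorted_id` is hypothetical.

```python
import bisect


def remove_by_sorted_id(items, ids, target):
    """Delete the dict with id == target from `items`.

    `ids` must be the sorted list of ids, parallel to `items`.
    Returns True if something was removed.
    """
    pos = bisect.bisect_left(ids, target)
    if pos < len(ids) and ids[pos] == target:
        del items[pos]
        del ids[pos]
        return True
    return False


items = [{'id': 1, 'name': 'John'},
         {'id': 2, 'name': 'Karl'},
         {'id': 3, 'name': 'Desdemona'}]
ids = [d['id'] for d in items]  # must stay in sync with items

found = remove_by_sorted_id(items, ids, 2)
print(found, items)
```

The search itself takes logarithmic time, though `del` on a list still shifts the later elements, so the removal as a whole remains linear in the worst case.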

Since PEP 448 on additional unpacking generalizations (Python 3.5 and onwards), while iterating over a list of dicts with a temporary variable, say row, you can unpack the dict of the current iteration with **row, merge new keys in, or use a boolean condition to filter dicts out of your list of dicts.

Keep in mind that {**row} creates a new dictionary.

For example, your starting list of dicts:

data = [{'id': 1, 'name': 'paul'},{'id': 2, 'name': 'john'}]

If we want to filter out id 2:

data = [{**row} for row in data if row['id']!=2]

If you want to filter out john:

data = [{**row} for row in data if row['name']!='john']

Not directly related to the question, but if you want to add a new key:

data = [{**row, 'id_name':str(row['id'])+'_'+row['name']} for row in data]

It's also a tiny bit faster than the accepted solution.
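Because `{**row}` builds a new dictionary, the filtered list holds shallow copies rather than the original dict objects. A small sketch of the difference (the names `original`, `copied` and `shared` are added for illustration):

```python
original = [{'id': 1, 'name': 'paul'}, {'id': 2, 'name': 'john'}]

copied = [{**row} for row in original if row['id'] != 2]
shared = [row for row in original if row['id'] != 2]

copied[0]['name'] = 'changed'   # does not touch the original dict
same_object = shared[0] is original[0]
print(original[0]['name'], same_object)
```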

Try this; for example, to remove 'john' from the list:

for index, element in enumerate(dictionary):
    if element['name'] == 'john':
        del dictionary[index]
        break  # stop here: continuing after a deletion would skip the next item

