Python: Removing elements from a list inside an iterator?

I am trying to remove elements from a list in Python. Most answers seem to suggest that using a list iterator is best, but I don't think it's possible (or at least elegant) for my problem.

I want to iterate over the test_data list and delete any items that meet both of the following conditions: (1) they have an attribute total:sum, and (2) they have an attribute (pagePath) that starts with, but is not equal to, any element in the list mystrings.

Here is my list of strings, and my test data:

    mystrings = [u'/calculate-state-pension', u'/check-uk-visa']
    test_data = [
        {
            "pagePath": "/check-uk-visa",
            "total:sum": 2.0
        },
        {
            "pagePath": "/check-uk-visa/y",
            "total:sum": 3.0
        },
        {
            "pagePath": "/check-uk-visa/n",
            "total:sum": 4.0
        },
        {
            "pagePath": "/bank-holidays",
            "total:sum": 2.0
        },
        {
            "pagePath": "/check-uk-visa",
            "searchUniques:sum": 2.0
        }
    ]

So I would like to end up with this list:

    results = [
        {
            "pagePath": "/check-uk-visa",
            "total:sum": 2.0
        },
        {
            "pagePath": "/bank-holidays",
            "total:sum": 2.0
        },
        {
            "pagePath": "/check-uk-visa",
            "searchUniques:sum": 2.0
        }
    ]

This is my code:

    results = test_data[:]
    for r in results_copy:
        for s in mystrings:
            if 'total:sum' in r and r['pagePath'].startswith(s) \
                 and r['pagePath'] != s:
                results.remove(r)
    return results

But this doesn't seem to work. It removes the element with /check-uk-visa/y but not the one with /check-uk-visa/n.

What am I doing wrong? I think it's something to do with the deleting and the iterator - it looks like it's skipping elements.

You want to match any item whose "pagePath" value starts with a string in your string list but is not equal to that string. Iterate over a copy (test_data[:]) so that removing from test_data does not disturb the loop:

    for dic in test_data[:]:
        s = dic.get("pagePath", "")
        if "total:sum" in dic and any(s.startswith(y) and s != y for y in mystrings):
            test_data.remove(dic)

Output:

    [{'total:sum': 2.0, 'pagePath': '/check-uk-visa'}, {'total:sum': 2.0, 'pagePath': '/bank-holidays'}, {'searchUniques:sum': 2.0, 'pagePath': '/check-uk-visa'}]
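
The slice copy in the loop header is what avoids the skipping described in the question. As a rough illustration (using a made-up list of letters rather than the question's data, and two hypothetical helper functions), removing from the same list that is being iterated shifts the later items left, so the loop steps over the item that follows each removal:

```python
def remove_in_place(items, unwanted):
    """Buggy on purpose: mutates the list while iterating over it."""
    for item in items:
        if item in unwanted:
            items.remove(item)
    return items


def remove_via_copy(items, unwanted):
    """Iterates over a shallow copy, so removals do not disturb the loop."""
    for item in items[:]:
        if item in unwanted:
            items.remove(item)
    return items


# Hypothetical sample data: "b" and "c" are adjacent and both unwanted.
skipped = remove_in_place(["a", "b", "c", "d"], {"b", "c"})
complete = remove_via_copy(["a", "b", "c", "d"], {"b", "c"})
print(skipped)   # "c" survives because the loop stepped over it
print(complete)  # both "b" and "c" are gone
```

This matches the symptom in the question, where the /check-uk-visa/n item directly follows the removed /check-uk-visa/y item.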

One caveat: if mystrings contains overlapping entries, a pagePath can be equal to one entry while also starting with another, shorter entry, and the check above would then remove it. In that case, make mystrings a set so that membership tests are O(1), and use not in to keep any path that exactly matches an entry:

    mystrings = {u'/calculate-state-pension', u'/check-uk-visa'}

    for dic in test_data[:]:
        s = dic.get("pagePath", "")
        if "total:sum" in dic and any(s.startswith(y) for y in mystrings) and s not in mystrings:
            test_data.remove(dic)
    print(test_data)
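
To make the difference between the two checks concrete, here is a small sketch with a hypothetical pair of overlapping prefixes (they are not in the question's data). It evaluates only the path condition of each version for a single path:

```python
# Hypothetical prefixes: "/check" is itself a prefix of "/check-uk-visa".
mystrings = {"/check", "/check-uk-visa"}
path = "/check-uk-visa"

# First version: removed if the path starts with, but differs from, ANY entry.
removed_by_first = any(path.startswith(y) and path != y for y in mystrings)

# Set version: a path that exactly equals some entry is always kept.
removed_by_second = (any(path.startswith(y) for y in mystrings)
                     and path not in mystrings)

print(removed_by_first)   # True: "/check" is a proper prefix of the path
print(removed_by_second)  # False: the path is itself an entry in the set
```

Which behaviour is right depends on whether exact matches should always survive; the set version assumes they should.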

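The easiest way to filter something like this is usually to use the filter function. filter keeps the items for which the function returns True, so the removal condition is negated here. In Python 3, wrap the call in list() if you need a list rather than an iterator.

    results_copy = filter(lambda r: not ('total:sum' in r
                                         and any(r['pagePath'].startswith(s)
                                                 for s in mystrings)
                                         and r['pagePath'] not in mystrings),
                          results)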

Alternatively, you could use a list comprehension. It is sometimes easier to read when you want to do some processing in addition to filtering. As with filter, the condition selects the items to keep, so the removal condition is negated:

    results_copy = [r for r in results
                    if not ('total:sum' in r
                            and any(r['pagePath'].startswith(s)
                                    for s in mystrings)
                            and r['pagePath'] not in mystrings)]
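
Building a new list avoids mutating anything during iteration. The sketch below wraps that idea in a function and runs it on the question's data. The name drop_subpaths is hypothetical, and passing a tuple to str.startswith is a substitution for the any(...) call, not something the answers above use:

```python
def drop_subpaths(rows, prefixes):
    """Return a new list without rows that have 'total:sum' and a pagePath
    that starts with, but is not equal to, one of the prefixes."""
    prefix_tuple = tuple(prefixes)  # str.startswith accepts a tuple of prefixes
    exact = set(prefixes)           # exact matches are always kept
    return [
        row for row in rows
        if not ("total:sum" in row
                and row.get("pagePath", "").startswith(prefix_tuple)
                and row.get("pagePath", "") not in exact)
    ]


mystrings = ["/calculate-state-pension", "/check-uk-visa"]
test_data = [
    {"pagePath": "/check-uk-visa", "total:sum": 2.0},
    {"pagePath": "/check-uk-visa/y", "total:sum": 3.0},
    {"pagePath": "/check-uk-visa/n", "total:sum": 4.0},
    {"pagePath": "/bank-holidays", "total:sum": 2.0},
    {"pagePath": "/check-uk-visa", "searchUniques:sum": 2.0},
]

results = drop_subpaths(test_data, mystrings)
print([row["pagePath"] for row in results])
# ['/check-uk-visa', '/bank-holidays', '/check-uk-visa']
```

The input list is left untouched, which may or may not be what you want if other code holds a reference to test_data.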
