
Updating a list based on the values in other lists

I have a list of lists. Each inner list contains four elements, which represent id, age, val1, and val2. I manipulate each list so that its val1 and val2 values always depend on the most recent values seen in the previous lists. The previous lists for a given list are the earlier lists whose age differs from it by no more than timeDelta. The list of lists is sorted by age.

My code works correctly, but it is slow. I suspect that the line marked #******** creates too many lists of lists. I think this could be avoided by deleting lists from the beginning once I know that their age difference from the next list is more than timeDelta.

myList = [
          [1,   20, '',     'x'],
          [1,   25, 's',    ''],
          [1,   26, '',     'e'],
          [1,   30, 'd',    's'],
          [1,   50, 'd',    'd'],
          [1,   52, 'f',    'g']
          ]


age_Idx =1
timeDelta = 10

for i in range(len(myList))[1:]:
    newList = myList[:i+1] #Subset of lists.  #********
    respList = newList.pop(-1) 
    currage = float(respList[age_Idx])
    retval = collapseListTogether(newList, age_Idx, currage, timeDelta)
    if(len(retval) == 0):
        continue
    retval[0:2] = respList[0:2]
    print(retval)

def collapseListTogether(li, age_Idx, currage, timeDelta):
    finalList = []
    for xl in reversed(li) :
        #print(xl)
        oldage = float(xl[age_Idx])
        if ((currage-timeDelta) <= oldage < currage):
            finalList.append(xl)
        else:
            break
    return([reduce(lambda a, b: b or a, tup) for tup in zip(*finalList[::-1])])

Example

[1, 20, '',     'x']   ==>  Not dependent on anything. Skip this list
[1, 25, 's',    '']    ==>  [1, 25, '', 'x']
[1, 26, '',     'e']   ==>  [1, 26, 's', 'x']
[1, 30, 'd',    's']   ==>  [1, 30, 's', 'e']
[1, 50, 'd',    'd']   ==>  Age difference (50 - 30 = 20) is more than 10. Skip this list
[1, 52, 'f',    'g']   ==>  [1, 52, 'd', 'd']

I'm just rewriting your data structure and your code:

from collections import namedtuple
from functools import reduce  # reduce is not a builtin in Python 3

Record = namedtuple('Record', ['id', 'age', 'val1', 'val2'])
myList = [
      Record._make([1,   20, '',     'x']),
      Record._make([1,   25, 's',    '']),
      Record._make([1,   26, '',     'e']),
      Record._make([1,   30, 'd',    's']),
      Record._make([1,   50, 'd',    'd']),
      Record._make([1,   52, 'f',    'g'])
]

timeDelta = 10

def collapseListTogether(lst, age, tdelta):
    # Keep the earlier records whose age lies in [age - tdelta, age).
    finalLst = [ele for ele in lst if 0 < age - float(ele.age) <= tdelta]
    # lst is sorted by age, so "b or a" keeps the most recent non-empty value per column.
    return [reduce(lambda a, b: b or a, tup) for tup in zip(*finalLst)]

for i in range(1, len(myList)):
    subList = myList[:i]  # all records before the current one
    rec = myList[i]
    age = float(rec.age)
    retval = collapseListTogether(subList, age, timeDelta)
    if len(retval) == 0:
        continue
    # namedtuples are immutable, so build a new Record instead of assigning fields.
    retval = Record(rec.id, rec.age, retval[2], retval[3])
    print(retval)

Your code is not readable to me. I did not change the logic; I only modified a few places for performance.

One option is to replace your 4-element lists with tuples, or even better with namedtuple, a well-known lightweight container in Python. Also, explicit for loops are best avoided in interpreted languages: in Python, one would use comprehensions instead where possible to improve performance. Your list is not very large, so the time saved by the comprehension should outweigh the time saved by breaking out of the loop early.

To me, your code should not work, but I am not sure.

Assuming your example is correct, I see no reason you can't do this in a single pass, since the lists are sorted by age. If the last sublist you inspected differs too much in age, you know that nothing earlier will count either, so you should just leave the current sublist unmodified.

previous_age = None
previous_val1 = ''
previous_val2 = ''

for sublist in myList:
    age = sublist[1]
    latest_val1 = sublist[2]
    latest_val2 = sublist[3]
    if previous_age is not None and ((age - previous_age) <= timeDelta):
        # there is at least one previous list            
        sublist[2] = previous_val1
        sublist[3] = previous_val2
    previous_age = age
    previous_val1 = latest_val1 or previous_val1
    previous_val2 = latest_val2 or previous_val2

When tested, that code produces this modified value for your initial myList:

[[1, 20, '', 'x'],
 [1, 25, '', 'x'],
 [1, 26, 's', 'x'],
 [1, 30, 's', 'e'],
 [1, 50, 'd', 'd'],
 [1, 52, 'd', 'd']]

It's a straightforward modification to build a new list rather than edit one in place, or to entirely omit the skipped lines rather than just leave them unchanged.
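As a rough sketch of that modification, the same single-pass logic can be wrapped in a function that returns a new list and drops the skipped rows. The function name propagate_latest and its parameter names are made up for illustration; they do not come from the question.

```python
def propagate_latest(records, time_delta):
    """Return new rows; rows without a close-enough predecessor are omitted."""
    result = []
    previous_age = None
    previous_val1 = previous_val2 = ''
    for rec_id, age, val1, val2 in records:
        # Same check as the in-place loop: compare only with the previous row.
        if previous_age is not None and age - previous_age <= time_delta:
            result.append([rec_id, age, previous_val1, previous_val2])
        previous_age = age
        # Remember the latest non-empty value seen so far in each column.
        previous_val1 = val1 or previous_val1
        previous_val2 = val2 or previous_val2
    return result
```

Because it only reads from records, the original myList is left untouched and can be reused.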

reduce and list comprehensions are powerful tools, but they're not right for all problems.
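The idea raised in the question, dropping lists from the front once they are too old, can also be sketched with collections.deque. This is only one possible approach, not code from the question or the answers. It assumes ages are strictly increasing, and the name collapse_with_window is hypothetical. Unlike the single-pass loop above, it looks only at rows inside the timeDelta window, so values never carry over across a gap.

```python
from collections import deque

def collapse_with_window(records, time_delta):
    """Yield [id, age, val1, val2] built from earlier rows within time_delta."""
    window = deque()  # earlier rows still within time_delta of the current age
    for rec in records:
        age = rec[1]
        # Rows are sorted by age, so a row that is too old for this row
        # is too old for every later row and can be dropped for good.
        while window and window[0][1] < age - time_delta:
            window.popleft()
        if window:
            merged = list(rec[:2])
            for col in (2, 3):
                # Most recent non-empty value in the window, '' if there is none.
                merged.append(next((r[col] for r in reversed(window) if r[col]), ''))
            yield merged
        window.append(rec)
```

Calling list(collapse_with_window(myList, timeDelta)) on the data from the question should give the rows listed in its Example section, without copying a prefix of the list for every row.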
