Write a loop in a Pythonic way

I have a list of dicts that stores audits for tickets. Each audit records a user_id (author_id), the date the changes happened (created_at), and a list of events; each event has a few attributes such as type and field_name, among others.

Based on that information, I need to extract the events for each date and convert them into another dict. Note: I need to keep only the last event for each field_name.

I've written a "super" loop that does what I need, but the code looks pretty weird and is not optimized:

Sample dict:

data = {
    "audits": [
        {
            "id": 1234,
            "ticket_id": 1111,
            "created_at": "2019-04-07T01:09:40Z",
            "author_id": 9876543,           
            "events": [{
                    "id": 1234,
                    "type": "Random"
                },
                {
                    "id": 765456,
                    "type": "Create",
                    "value": "Lovely form",
                    "field_name": "subject"
                },              
                {
                    "id": 356765,
                    "type": "Create",
                    "value": None,
                    "field_name": "priority"
                },              
                {
                    "id": 2345432,
                    "type": "Change",                   
                    "value": "normal",
                    "field_name": "priority",
                    "previous_value": None
                }
            ]
        }
    ]
}

Code:

field_history = []

for audit in data['audits']:
    user_id = audit['author_id']
    updated = audit['created_at']

    base_info = {
        'user_id': user_id,
        'updated': updated
    }

    # Iterate to get distinct value (last found on dict)
    fields = [d for d in audit['events'] if (d['type'] == 'Create' or d['type'] == 'Change') and d['field_name'] != 'tags']        
    updated_fields = [] # this list is being used to keep history by updated
    for field in fields:
        distincts = [d for d in audit['events'] if d.get('field_name', '') == field['field_name']]        
        distinct = distincts[-1]
        # remove older values and keep only the last one found on list
        updated_fields[:] = [d for d in updated_fields if d['updated'] == updated and d.get('field_name') != distinct['field_name']]
        updated_fields.append({**base_info, **distinct}) # add always the last element on list

    field_history = field_history + updated_fields

What is the proper way to write this loop so that it is optimized to handle large datasets?

I like to start by making some simple functions to handle the transformations and filtering, so that the top level stays clean:

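def event_valid(event):
    # Keep only Create/Change events, and skip the 'tags' field
    return (
        event['type'] in ('Create', 'Change')
        and event['field_name'] not in ('tags',)
    )


# The original snippet used `audit` and `return` without an enclosing function;
# the name summarize_audit is illustrative.
def summarize_audit(audit):
    events = [event for event in audit['events'] if event_valid(event)]

    # Assuming the list is ordered... If not then sort it before next statement
    # This trick filters to only the latest event for each distinct field_name:
    # a later event with the same key overwrites the earlier one
    events = list({
        event['field_name']: event for event in events
    }.values())

    return {
        'user_id': audit['author_id'],
        'updated': audit['created_at'],
        'events': events,
    }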
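To get the same flat shape as the field_history list in the question, the pieces above could be combined roughly as follows. This is only a sketch under a few assumptions: the helper names latest_events and build_field_history are made up for illustration, the events in each audit are already in chronological order, and events without a field_name are skipped with .get() rather than raising KeyError. Note that only Create/Change events are considered when picking the last one, which is slightly stricter than the loop in the question.

```python
def event_valid(event):
    """Keep Create/Change events that name a field other than 'tags'."""
    return (
        event["type"] in ("Create", "Change")
        # Assumption: an event with no field_name is ignored.
        and event.get("field_name") not in (None, "tags")
    )


def latest_events(audit):
    """Return the last valid event per field_name (hypothetical helper)."""
    latest = {}
    for event in audit["events"]:
        if event_valid(event):
            # A later event for the same field overwrites the earlier one.
            latest[event["field_name"]] = event
    return list(latest.values())


def build_field_history(audits):
    """Flatten audits into one row per (audit, field_name) (hypothetical helper)."""
    history = []
    for audit in audits:
        base_info = {
            "user_id": audit["author_id"],
            "updated": audit["created_at"],
        }
        history.extend({**base_info, **event} for event in latest_events(audit))
    return history


# Reduced version of the sample data from the question.
sample = {
    "audits": [
        {
            "id": 1234,
            "created_at": "2019-04-07T01:09:40Z",
            "author_id": 9876543,
            "events": [
                {"id": 1234, "type": "Random"},
                {"id": 765456, "type": "Create", "value": "Lovely form",
                 "field_name": "subject"},
                {"id": 356765, "type": "Create", "value": None,
                 "field_name": "priority"},
                {"id": 2345432, "type": "Change", "value": "normal",
                 "field_name": "priority", "previous_value": None},
            ],
        }
    ]
}

field_history = build_field_history(sample["audits"])
```

Each audit is walked once, so the work grows linearly with the number of events, instead of rescanning audit['events'] for every field as the original loop does. Using list.extend also avoids rebuilding field_history with + on every audit.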

