Is there a way to filter a list of dictionaries based on a value in one dictionary being less than the same key in another?

I apologize for the convoluted title. I need to filter a list of dictionaries by a fairly specific criterion.

Normally, I would use a list comprehension, but I'm not sure about the logic.

Here's an example list:

list_dict = [{'item_id': '000354', 'ts_created': '11/12/2013', 'item_desc': 'a product'},
             {'item_id': '000354', 'ts_created': '11/13/2013', 'item_desc': 'a product'},
             {'item_id': '000355', 'ts_created': '11/12/2013', 'item_desc': 'a different product'}]

You'll notice that the first two dictionaries are identical apart from 'ts_created'.

I want to create a new list that keeps, for each item, the dictionary with the earliest timestamp and discards the rest.

Edit: Removed 'elegantly' from title as it seemed to offend some.

Edit 2: Tried to improve title.

Edit 3 (focus?): I'm really not sure how to focus this question any more than it already is, but I'll try. In reference to the example code above (the actual list is much larger), there are duplicate dictionaries within the list. The only difference between them is the 'ts_created' value. I want to keep only one dictionary per unique 'item_id', specifically the one with the earliest 'ts_created'. The resulting list would look like this:

list_dict = [{'item_id': '000354', 'ts_created': '11/12/2013', 'item_desc': 'a product'},
             {'item_id': '000355', 'ts_created': '11/12/2013', 'item_desc': 'a different product'}]

You can filter the dictionaries using a dictionary of dictionaries keyed on the item_id. As you populate that index dictionary, keep only the item with the greatest timestamp for each key (to keep the earliest instead, as the question asks, reverse the comparison from > to <). Since your timestamps are strings that are not in ISO 8601 format, you will need to convert them to actual dates to compare them. A second dictionary (also keyed on the item_id) can be used to keep track of the converted timestamps.
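To see why the conversion matters, here is a small sketch comparing two 'mm/dd/yyyy' strings directly and as parsed dates. The sample values and the helper name parse are made up for illustration; they are not from the question.

```python
from datetime import datetime

a = '01/05/2014'   # 5 January 2014 (illustrative value)
b = '11/12/2013'   # 12 November 2013 (illustrative value)

# Raw strings compare character by character, so the month digits decide.
string_says_a_is_earlier = a < b

def parse(ts):
    """Hypothetical helper: turn an 'mm/dd/yyyy' string into a datetime."""
    return datetime.strptime(ts, '%m/%d/%Y')

# Parsed dates compare chronologically.
date_says_a_is_earlier = parse(a) < parse(b)

print(string_says_a_is_earlier)  # True, although a is the later date
print(date_says_a_is_earlier)    # False
```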

list_dict = [{'item_id': '000354', 'ts_created': '11/12/2013', 'item_desc': 'a product'},
             {'item_id': '000354', 'ts_created': '11/13/2013', 'item_desc': 'a product'},
             {'item_id': '000355', 'ts_created': '11/12/2013', 'item_desc': 'a different product'}]

from datetime import datetime
maxDates = dict()  # association between item and timestamp
result   = dict()  # indexed single instance result (dictionary of dictionaries)
for d in list_dict:
    key       = d['item_id']
    timestamp = datetime.strptime(d['ts_created'], '%m/%d/%Y') # usable timestamp
    if key not in result or timestamp > maxDates[key]:         # keep only latest
        result[key]   = d
        maxDates[key] = timestamp
result = list(result.values())    # convert back to a list of dictionaries

print(result)
        
[{'item_id': '000354', 'ts_created': '11/13/2013', 'item_desc': 'a product'},
 {'item_id': '000355', 'ts_created': '11/12/2013', 'item_desc': 'a different product'}]
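The code above keeps the latest entry per item_id, while the question asks for the earliest. A minimal sketch of the earliest-first variant follows; the function name keep_earliest and its parameters are assumptions introduced here, and it stores the parsed timestamp next to each record instead of using a second dictionary.

```python
from datetime import datetime

def keep_earliest(records, key_field='item_id', ts_field='ts_created',
                  fmt='%m/%d/%Y'):
    """Return one record per key_field value, keeping the earliest ts_field.

    Hypothetical helper; assumes every record has both fields and that
    ts_field matches fmt.
    """
    earliest = {}  # key -> (parsed timestamp, original record)
    for rec in records:
        key = rec[key_field]
        ts = datetime.strptime(rec[ts_field], fmt)
        # '<' keeps the earliest; on a tie the first record seen wins.
        if key not in earliest or ts < earliest[key][0]:
            earliest[key] = (ts, rec)
    return [rec for _, rec in earliest.values()]

list_dict = [{'item_id': '000354', 'ts_created': '11/12/2013', 'item_desc': 'a product'},
             {'item_id': '000354', 'ts_created': '11/13/2013', 'item_desc': 'a product'},
             {'item_id': '000355', 'ts_created': '11/12/2013', 'item_desc': 'a different product'}]

result = keep_earliest(list_dict)
print(result)
```

With the example list this should yield the two dictionaries dated 11/12/2013, which is the result shown in the question.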

If uniqueness is determined by multiple fields (as opposed to just the item_id), you will need to combine all the values into a single key.

For example (for all fields except the timestamp):

key = tuple(d[k] for k in sorted(d) if k != 'ts_created')
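As a rough sketch of how that composite key could be used, the snippet below deduplicates on every field except the timestamp and keeps the earliest entry. The helper name identity_key and the sample records (including the 'a renamed product' entry) are invented for illustration, and the approach assumes all dictionaries share the same keys and hold hashable values.

```python
from datetime import datetime

def identity_key(d, ignore=('ts_created',)):
    """Hypothetical helper: tuple of all values except the ignored fields.

    Sorting the field names makes the key independent of dict ordering.
    """
    return tuple(d[k] for k in sorted(d) if k not in ignore)

# Illustrative data, not from the original question.
records = [
    {'item_id': '000354', 'ts_created': '11/13/2013', 'item_desc': 'a product'},
    {'item_id': '000354', 'ts_created': '11/12/2013', 'item_desc': 'a product'},
    {'item_id': '000354', 'ts_created': '11/10/2013', 'item_desc': 'a renamed product'},
]

best = {}  # composite key -> (parsed timestamp, original record)
for d in records:
    key = identity_key(d)
    ts = datetime.strptime(d['ts_created'], '%m/%d/%Y')
    if key not in best or ts < best[key][0]:   # keep the earliest per key
        best[key] = (ts, d)

deduped = [d for _, d in best.values()]
print(deduped)
```

Here the third record survives even though it shares an item_id with the others, because its item_desc differs and so its composite key is different.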

You can use pandas.DataFrame, sort by date, and then drop all the duplicates.

import pandas

df = pandas.DataFrame(list_dict)
# To datetime
df['ts_created'] = pandas.to_datetime(df['ts_created'], format='%m/%d/%Y')
# Sort by item_id, then by date
df.sort_values(by=['item_id', 'ts_created'], inplace=True)
# Drop duplicates, leaving only the first item_id
df.drop_duplicates(subset=['item_id'], keep='first', inplace=True)
# Convert the dates back to the original format
df['ts_created'] = df.ts_created.dt.strftime('%m/%d/%Y')
# Create the list again
result = df.to_dict(orient='records')
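If pulling in pandas is not an option, the same sort-then-keep-first idea can be sketched with the standard library alone. This is an illustrative alternative rather than part of the answer above; the helper name parsed is an assumption.

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter

list_dict = [{'item_id': '000354', 'ts_created': '11/12/2013', 'item_desc': 'a product'},
             {'item_id': '000354', 'ts_created': '11/13/2013', 'item_desc': 'a product'},
             {'item_id': '000355', 'ts_created': '11/12/2013', 'item_desc': 'a different product'}]

def parsed(d):
    """Hypothetical helper: the record's timestamp as a datetime."""
    return datetime.strptime(d['ts_created'], '%m/%d/%Y')

# Sort by item_id, then by date, so the earliest entry leads each group.
ordered = sorted(list_dict, key=lambda d: (d['item_id'], parsed(d)))

# groupby only merges adjacent items, which is why the sort comes first;
# next(group) takes the first (earliest) record of each item_id.
first_per_item = [next(group)
                  for _, group in groupby(ordered, key=itemgetter('item_id'))]

print(first_per_item)
```

Unlike the pandas version, this leaves 'ts_created' as the original string, so no conversion back is needed.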


