How to return all dictionaries with matching values from a list of dictionaries

Question

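I looked around for a similar question, since this seems pretty basic, but couldn't find anything. If one already exists, sorry for asking a new one!

I'm struggling to work out how to approach my problem:

I have a list of dictionaries: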

[{'name':'Josh', 'age':'39','Date of Birth':'1983-02-22','Time of Birth':'11:25:03'},
{'name':'Tyrell', 'age':'24', 'Date of Birth':'1998-01-27','Time of Birth':'01:23:54'},
{'name':'Jannell', 'age':'39', 'Date of Birth':'1983-02-27','Time of Birth':'11:21:34'},
{'name':'David', 'age':'24', 'Date of Birth':'1998-01-20','Time of Birth':'01:27:24'},
{'name':'Matthew', 'age':'24','Date of Birth':'1998-03-31','Time of Birth':'01:26:41'},
{'name':'Tylan', 'age':'24','Date of Birth':'1998-01-22','Time of Birth':'01:23:16'}
]

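From this list, I want to extract the 'name' values of all dictionaries that have exactly the same age, dates of birth within 10 days of each other, and times of birth within 10 minutes of each other. So, from the above:

Age 39: [Josh, Jannell], or age 24: [Tyrell, David, Tylan], or [] for any other age.

If someone shows me how to extract any one of these cases successfully, I'm sure I can figure out the rest myself.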

My attempt

My current attempt looks like this:

# dicts = the list of dictionaries from the question
ages = [d['age'] for d in dicts]
ages = list(set(ages))

groupedlist = []
for age in ages:
    sameagelist = []
    # compare against the current age, not the whole list of ages
    for dict_ in [x for x in dicts if x['age'] == age]:
        sameagelist.append(dict_)
    groupedlist.append(sameagelist)

print(groupedlist)

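This has proved cumbersome, though, because all I have now is a list of lists of dictionaries, and the next step, filtering by time of birth and date of birth, looks even harder and more complicated.

I'm stumped, but I have a feeling the answer will be simple. Thanks to anyone who can give me the nudge I need to get over the edge!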

Answer 1

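Given the conditions you set ("exactly the same age, dates of birth within 10 days, times of birth within 10 minutes") and the data you provided, 'Tyrell', 'David' and 'Tylan' should, if I'm not mistaken, end up in the same group.

But there could be a case where, say, David was born 9 days before Tyrell and Tylan 9 days after Tyrell. Tylan and David would then be 18 days apart, so that pair would not meet the requirement even though each of them matches Tyrell.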
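
To make that concern concrete, here is a minimal sketch using only the standard library. The dates are hypothetical (they are not the dates from the question), `within_days` is a name invented for this illustration, and treating the 10-day limit as inclusive is an assumption. It shows that two matching pairs do not guarantee that the third pair matches.

```python
from datetime import date


def within_days(a: date, b: date, limit: int = 10) -> bool:
    """Return True if the two dates are at most `limit` days apart."""
    return abs((a - b).days) <= limit


# Hypothetical birth dates, chosen so that Tyrell sits in the middle.
david = date(1998, 1, 9)
tyrell = date(1998, 1, 18)  # 9 days after David
tylan = date(1998, 1, 27)   # 9 days after Tyrell

print(within_days(david, tyrell))  # True
print(within_days(tyrell, tylan))  # True
print(within_days(david, tylan))   # False: 18 days apart
```

Because the relation is not transitive, a single flat group per age can contain people who do not match each other, which is why a per-person grouping is worth considering.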

One idea is to build a group for each person. The code at the end of this answer outputs:

[['Josh', 'Jannell'], ['Tyrell', 'David', 'Tylan'], ['David', 'Tylan']]

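Here the first name in each sublist is the "focal" (main) person of that group. For the group ['Tyrell', 'David', 'Tylan'], this means David and Tylan are both within Tyrell's bounds. To know whether David and Tylan are also within each other's bounds, one of them has to be the focal person, hence the group ['David', 'Tylan'].

To make the calculations easier, I used:

pandas (a library for working with tabular data: https://pandas.pydata.org/docs/ )
datetime (a module that facilitates date/time operations: https://docs.python.org/3/library/datetime.html )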

import pandas as pd 
import datetime

dicts = [{'name':'Josh', 'age':'39','Date of Birth':'1983-02-22','Time of Birth':'11:25:03'},
{'name':'Tyrell', 'age':'24', 'Date of Birth':'1998-01-27','Time of Birth':'01:23:54'},
{'name':'Jannell', 'age':'39', 'Date of Birth':'1983-02-27','Time of Birth':'11:21:34'},
{'name':'David', 'age':'24', 'Date of Birth':'1998-01-20','Time of Birth':'01:27:24'},
{'name':'Matthew', 'age':'24','Date of Birth':'1998-03-31','Time of Birth':'01:26:41'},
{'name':'Tylan', 'age':'24','Date of Birth':'1998-01-22','Time of Birth':'01:23:16'}
]

# create the dataframe directly from the list of dicts
# (DataFrame.append was removed in pandas 2.0)
df = pd.DataFrame(dicts)

# convert strings to datetime formats for easy date calculations
df["Date of Birth"] = pd.to_datetime(df["Date of Birth"], format="%Y-%m-%d")
# a placeholder date is filled in for every row here; ignore it, only the time matters
df["Time of Birth"] = pd.to_datetime(df["Time of Birth"], format="%H:%M:%S")

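# function that checks conditions
# row: a DataFrame row with columns name, age, "Date of Birth", "Time of Birth"
def check_birth(row1, row2):  # returns True if all conditions are met
    delta_days = abs(row1["Date of Birth"] - row2["Date of Birth"])
    # abs() is needed here too, otherwise any negative difference would pass
    delta_minutes = abs(row1["Time of Birth"] - row2["Time of Birth"])
    # compare ages explicitly: people born a few days apart can still differ in age
    return (row1["age"] == row2["age"]
            and delta_days < datetime.timedelta(days=10)
            and delta_minutes < datetime.timedelta(minutes=10))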

groups = [] #keep track of groups

#for each member check if other members meet the condition
for i in range(df.shape[0]): 
    track = [df.iloc[i,0]]
    for j in range(i+1, df.shape[0]):  #loop starting at i+1 to avoid duplicate groups 
        if check_birth(df.iloc[i,:], df.iloc[j,:]): 
            track.append(df.iloc[j, 0])
    if len(track) >1: groups.append(track) #exclude groups of one member

print(groups)

Answer 2

To group by age, you can create a dictionary of lists with the age as the key.

from collections import defaultdict

grouped_by_age = defaultdict(list)

for item in dicts:
    grouped_by_age[item['age']].append(item['name'])

print(grouped_by_age)
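
As a runnable illustration of the same idea, the sketch below repeats the question's people, trimmed to the two keys this step needs, and shows the resulting mapping. Converting to a plain `dict` is only for a cleaner printout, and the age '50' is a made-up lookup to show the "[] for any other age" case.

```python
from collections import defaultdict

# Same people as in the question, reduced to the keys used here.
dicts = [
    {'name': 'Josh', 'age': '39'},
    {'name': 'Tyrell', 'age': '24'},
    {'name': 'Jannell', 'age': '39'},
    {'name': 'David', 'age': '24'},
    {'name': 'Matthew', 'age': '24'},
    {'name': 'Tylan', 'age': '24'},
]

grouped_by_age = defaultdict(list)
for item in dicts:
    grouped_by_age[item['age']].append(item['name'])

print(dict(grouped_by_age))
# {'39': ['Josh', 'Jannell'], '24': ['Tyrell', 'David', 'Matthew', 'Tylan']}

# Looking up an age nobody has: .get() returns the fallback
# without adding a new key to the defaultdict.
print(grouped_by_age.get('50', []))  # []
```

Note that this only groups by age. Matthew is still in the '24' list, so the date and time conditions have to be applied within each group afterwards.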

Answer 3

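I'm not sure whether you were also asking for a complete solution, but here it is. The explanation is in the code comments, and it should also handle overlapping cases such as ['Alan', 'Betty'], ['Betty', 'Cooper']: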

# importing all the necessary modules
import operator
import itertools
import datetime


data = [
    {'name': 'Josh', 'age': '39', 'Date of Birth': '1983-02-22', 'Time of Birth': '11:25:03'},
    {'name': 'Tyrell', 'age': '24', 'Date of Birth': '1998-01-27', 'Time of Birth': '01:23:54'},
    {'name': 'Jannell', 'age': '39', 'Date of Birth': '1983-02-27', 'Time of Birth': '11:21:34'},
    {'name': 'David', 'age': '24', 'Date of Birth': '1998-01-20', 'Time of Birth': '01:27:24'},
    {'name': 'Matthew', 'age': '24', 'Date of Birth': '1998-03-31', 'Time of Birth': '01:26:41'},
    {'name': 'Tylan', 'age': '24', 'Date of Birth': '1998-01-22', 'Time of Birth': '01:23:16'}
]

# creating a key for sorting, basically it will first sort by age, then by date, then by time
key = operator.itemgetter('age', 'Date of Birth', 'Time of Birth')
data = sorted(data, key=key)


# a convenience function to get a person's date and time of birth as a datetime object
# for time manipulations such as subtraction
def get_datetime(p):
    iso_format = f'{p["Date of Birth"]}T{p["Time of Birth"]}'
    t = datetime.datetime.fromisoformat(iso_format)
    return t


# going over the grouped list by age
for age, group in itertools.groupby(data, key=operator.itemgetter('age')):
    print(f'Age: {age}')
    # convert generator to a list to not exhaust it
    group = list(group)
    previous_match = [None]
    # going over the group while also keeping the current index for later use
    for index, person in enumerate(group):
        # creating a list of people that match the conditions of days and minutes
        # and adding the current person as the first item there
        match = [person['name']]
        time1 = get_datetime(person)
        # going over the group starting from the next person to check if they
        # match that condition of days and minutes
        for other_person in itertools.islice(group, index + 1, None):
            time2 = get_datetime(other_person)
            # subtracting time of both people
            delta = time2 - time1
            # checking if they are in the ten day range and if they are in the ten minute range
            if delta.days <= 10 and (delta.seconds <= 10 * 60 or 24 * 3600 - delta.seconds <= 10 * 60):
                # if they match the conditions of days and minutes append them to the match
                match.append(other_person['name'])
        # check if any other person got matched and check if any new person has appeared
        # this is to check for that case of [Alan, Betty], [Betty, Cooper]
        if len(match) > 1 and match[-1] != previous_match[-1]:
            previous_match = match
            print(match)

All of the modules used above (operator, itertools, datetime) are part of the Python standard library.
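
The second half of the condition in the inner loop, `24 * 3600 - delta.seconds <= 10 * 60`, is easier to follow once you see how a `timedelta` splits a difference into whole days plus leftover seconds. The sketch below uses hypothetical timestamps (not taken from the question) in which the later-born person has an earlier clock time.

```python
import datetime

# Hypothetical births: 5 calendar days apart, clock time 4 minutes earlier.
time1 = datetime.datetime.fromisoformat('1998-01-20T01:27:00')
time2 = datetime.datetime.fromisoformat('1998-01-25T01:23:00')

delta = time2 - time1
print(delta.days)     # 4      (not 5: the last day is not complete)
print(delta.seconds)  # 86160  (24 hours minus 4 minutes, not 240)

# Distance of the leftover seconds to the nearest whole day.
clock_gap = min(delta.seconds, 24 * 3600 - delta.seconds)
print(clock_gap)             # 240
print(clock_gap <= 10 * 60)  # True
```

Writing the check as `min(...) <= 600` is equivalent to the `or` of the two comparisons in the answer's code. Without the wraparound term, this pair would be rejected even though the clock times are only 4 minutes apart.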

Answer 4

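Once you have the set of ages, you can group the people who share an age. Then you need to iterate over every member of each group and find the others in the same group who meet your conditions: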

from datetime import datetime
dicts = [{'name':'Josh', 'age':'39','Date of Birth':'1983-02-22','Time of Birth':'11:25:03'},
{'name':'Tyrell', 'age':'24', 'Date of Birth':'1998-01-27','Time of Birth':'01:23:54'},
{'name':'Jannell', 'age':'39', 'Date of Birth':'1983-02-27','Time of Birth':'11:21:34'},
{'name':'David', 'age':'24', 'Date of Birth':'1998-01-20','Time of Birth':'01:27:24'},
{'name':'Matthew', 'age':'24','Date of Birth':'1998-03-31','Time of Birth':'01:26:41'},
{'name':'Tylan', 'age':'24','Date of Birth':'1998-01-22','Time of Birth':'01:23:16'}
]
ages = set([d['age'] for d in dicts])
grouped_list = [[each_person for each_person in dicts if each_person['age'] == each_age] for each_age in ages]
grouped_people = []
for each_group in grouped_list:
    for each_person in each_group:
        new_group_people = [each_one['name'] for each_one in each_group if abs(datetime.strptime(each_one['Date of Birth'], '%Y-%m-%d') - datetime.strptime(each_person['Date of Birth'], '%Y-%m-%d')).days <= 10 and abs(datetime.strptime(each_one['Time of Birth'], '%H:%M:%S') - datetime.strptime(each_person['Time of Birth'], '%H:%M:%S')).seconds <= 10*60]
        if len(new_group_people) > 1 and new_group_people not in grouped_people:
            grouped_people.append(new_group_people)

If it is easier for you to follow, you can also unroll the list comprehension into an explicit loop:

ages = set([d['age'] for d in dicts])
grouped_list = [[each_person for each_person in dicts if each_person['age'] == each_age] for each_age in ages]
grouped_people = []
for each_group in grouped_list:
    for each_person in each_group:
        #new_group_people = [each_one['name'] for each_one in each_group if abs((datetime.strptime(each_one['Date of Birth'], '%Y-%m-%d') - datetime.strptime(each_person['Date of Birth'], '%Y-%m-%d')).days) <= 10 and abs(datetime.strptime(each_one['Time of Birth'], '%H:%M:%S') - datetime.strptime(each_person['Time of Birth'], '%H:%M:%S')).seconds <= 10*60]
        new_group_people = []
        for each_one in each_group:
            if abs((datetime.strptime(each_one['Date of Birth'], '%Y-%m-%d') - datetime.strptime(each_person['Date of Birth'], '%Y-%m-%d')).days) <= 10 and abs(datetime.strptime(each_one['Time of Birth'], '%H:%M:%S') - datetime.strptime(each_person['Time of Birth'], '%H:%M:%S')).seconds <= 10*60:
               new_group_people.append(each_one['name'])
        if len(new_group_people) > 1 and new_group_people not in grouped_people:
            grouped_people.append(new_group_people)
print(grouped_people)

Output:

[['Tyrell', 'David', 'Tylan'], ['Josh', 'Jannell']]
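
The one-line condition above is hard to read, so one possible refactoring is to move it into a helper, as sketched below. `parse` and `is_close` are names invented here, and the inclusive `<=` limits follow this answer's code rather than anything stated in the question. `sorted(...)` replaces the bare `set` only so that the order of the groups is reproducible between runs.

```python
from datetime import datetime

dicts = [
    {'name': 'Josh', 'age': '39', 'Date of Birth': '1983-02-22', 'Time of Birth': '11:25:03'},
    {'name': 'Tyrell', 'age': '24', 'Date of Birth': '1998-01-27', 'Time of Birth': '01:23:54'},
    {'name': 'Jannell', 'age': '39', 'Date of Birth': '1983-02-27', 'Time of Birth': '11:21:34'},
    {'name': 'David', 'age': '24', 'Date of Birth': '1998-01-20', 'Time of Birth': '01:27:24'},
    {'name': 'Matthew', 'age': '24', 'Date of Birth': '1998-03-31', 'Time of Birth': '01:26:41'},
    {'name': 'Tylan', 'age': '24', 'Date of Birth': '1998-01-22', 'Time of Birth': '01:23:16'},
]


def parse(person, key, fmt):
    """Parse one string field of a person dict into a datetime."""
    return datetime.strptime(person[key], fmt)


def is_close(a, b, max_days=10, max_seconds=10 * 60):
    """True if a and b are within both the day limit and the clock-time limit."""
    day_gap = abs(parse(a, 'Date of Birth', '%Y-%m-%d') - parse(b, 'Date of Birth', '%Y-%m-%d'))
    time_gap = abs(parse(a, 'Time of Birth', '%H:%M:%S') - parse(b, 'Time of Birth', '%H:%M:%S'))
    return day_gap.days <= max_days and time_gap.seconds <= max_seconds


grouped_people = []
for age in sorted({d['age'] for d in dicts}):
    same_age = [d for d in dicts if d['age'] == age]
    for person in same_age:
        # Each person is compared with everyone of the same age, including themselves.
        group = [other['name'] for other in same_age if is_close(person, other)]
        if len(group) > 1 and group not in grouped_people:
            grouped_people.append(group)

print(grouped_people)
# [['Tyrell', 'David', 'Tylan'], ['Josh', 'Jannell']]
```

Keeping the two limits as keyword arguments makes it easy to try other thresholds without touching the loop.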


Answer 1: score 2, accepted, 2022-03-16 23:15:13
Answer 2: score 0, 2022-03-16 21:58:00
Answer 3: score 0, 2022-03-16 23:05:46
Answer 4: score 0, 2022-03-17 00:32:00
