How to return all dicts with matching values from within a list of dicts

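I looked around for a similar question, since this seems pretty basic, but couldn't find anything. Apologies if this has already been asked!

I'm struggling to find a way to solve my problem.

I have a list of dicts: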

[{'name':'Josh', 'age':'39','Date of Birth':'1983-02-22','Time of Birth':'11:25:03'},
{'name':'Tyrell', 'age':'24', 'Date of Birth':'1998-01-27','Time of Birth':'01:23:54'},
{'name':'Jannell', 'age':'39', 'Date of Birth':'1983-02-27','Time of Birth':'11:21:34'},
{'name':'David', 'age':'24', 'Date of Birth':'1998-01-20','Time of Birth':'01:27:24'},
{'name':'Matthew', 'age':'24','Date of Birth':'1998-03-31','Time of Birth':'01:26:41'},
{'name':'Tylan', 'age':'24','Date of Birth':'1998-01-22','Time of Birth':'01:23:16'}
]

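From this list I want to extract the 'name' values of all dicts that have exactly the same age, dates of birth within 10 days of each other, and times of birth within 10 minutes of each other. So, from the above:

age 39: [Josh, Jannell], age 24: [Tyrell, David, Tylan], and [] for any other age.

If someone shows me how to extract any one of these cases, I'm sure I can work out the rest myself.

My attempt at a solution

My current attempt looks like this: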

# dicts = the list of dicts from the question above
def group_by_age(dicts):
    # collect the distinct ages
    ages = list(set(d['age'] for d in dicts))

    groupedlist = []
    for age in ages:
        # compare against the current age, not the whole list of ages
        sameagelist = [x for x in dicts if x['age'] == age]
        groupedlist.append(sameagelist)

    return groupedlist

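This is proving cumbersome, though: all I end up with is a list of lists of dicts, and the next step, filtering on time of birth and date of birth, looks even harder and more complicated.

I'm stumped, but I have a feeling the answer will be pretty simple. Thanks to anyone who can give me the nudge that gets me over the edge!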

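Given the conditions you set ("exactly the same age, dates of birth within 10 days, and times of birth within 10 minutes") and the data you provided, 'Tyrell', 'David' and 'Tylan' should, if I'm not mistaken, all be in the same group.

But there could be a case where David was born 9 days before Tyrell and Tylan 9 days after him. David and Tylan would then be 18 days apart, so that pair would not meet the requirement.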
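To make that edge case concrete, here is a minimal sketch with made-up birth dates (they are not the dates from the question, and `within_ten_days` is a hypothetical helper). It only shows that "within 10 days" does not carry over from one pair to the next:

```python
from datetime import date

# Hypothetical birth dates, chosen only to illustrate the point.
births = {
    "David": date(1998, 1, 1),
    "Tyrell": date(1998, 1, 10),  # 9 days after David
    "Tylan": date(1998, 1, 19),   # 9 days after Tyrell
}


def within_ten_days(a, b):
    """Return True if the two named people were born at most 10 days apart."""
    return abs((births[a] - births[b]).days) <= 10


print(within_ten_days("Tyrell", "David"))  # True
print(within_ten_days("Tyrell", "Tylan"))  # True
print(within_ten_days("David", "Tylan"))   # False (18 days apart)
```

So a single group built around Tyrell can contain two people who do not match each other.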

One idea is to build a group for each person. The code below outputs:

[['Josh', 'Jannell'], ['Tyrell', 'David', 'Tylan'], ['David', 'Tylan']]

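The first name in each sublist is the "focal" (main) person of that group. So in the group ['Tyrell', 'David', 'Tylan'], David and Tylan are both within Tyrell's bounds. To find out whether David and Tylan are within each other's bounds, one of them has to be the focal person, hence the separate group ['David', 'Tylan'].

To make the calculations easier, I used pandas: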

import pandas as pd 
import datetime

dicts = [{'name':'Josh', 'age':'39','Date of Birth':'1983-02-22','Time of Birth':'11:25:03'},
{'name':'Tyrell', 'age':'24', 'Date of Birth':'1998-01-27','Time of Birth':'01:23:54'},
{'name':'Jannell', 'age':'39', 'Date of Birth':'1983-02-27','Time of Birth':'11:21:34'},
{'name':'David', 'age':'24', 'Date of Birth':'1998-01-20','Time of Birth':'01:27:24'},
{'name':'Matthew', 'age':'24','Date of Birth':'1998-03-31','Time of Birth':'01:26:41'},
{'name':'Tylan', 'age':'24','Date of Birth':'1998-01-22','Time of Birth':'01:23:16'}
]

#create dataframe directly from the list of dicts (DataFrame.append was removed in pandas 2.0)
df = pd.DataFrame(dicts)

#convert strings to datetime formats for easy date calculations
df["Date of Birth"] = pd.to_datetime(df["Date of Birth"], format="%Y-%m-%d")
df["Time of Birth"] = pd.to_datetime(df["Time of Birth"], format="%H:%M:%S") #every time gets the same placeholder date; only the time of day matters here

# function that checks conditions
# row: a DataFrame row with the columns name, age, Date of Birth, Time of Birth
def check_birth(row1, row2): #returns true if all conditions are met
    delta_days = abs(row1["Date of Birth"] - row2["Date of Birth"])
    delta_minutes = abs(row1["Time of Birth"] - row2["Time of Birth"]) #abs() so the order of the two rows does not matter
    #check age explicitly: two people born a few days apart can still differ in age by one year
    same_age = row1["age"] == row2["age"]
    return same_age and delta_days<datetime.timedelta(days=10) and delta_minutes<datetime.timedelta(minutes=10)

groups = [] #keep track of groups

#for each member check if other members meet the condition
for i in range(df.shape[0]): 
    track = [df.iloc[i,0]]
    for j in range(i+1, df.shape[0]):  #loop starting at i+1 to avoid duplicate groups 
        if check_birth(df.iloc[i,:], df.iloc[j,:]): 
            track.append(df.iloc[j, 0])
    if len(track) >1: groups.append(track) #exclude groups of one member

print(groups)

To group by age, you can create a dictionary of lists and use the age as the key.

from collections import defaultdict

grouped_by_age = defaultdict(list)

for item in dicts:
    grouped_by_age[item['age']].append(item['name'])

print(grouped_by_age)
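Building on the grouping by age, the date and time conditions could then be checked pair by pair inside each age group. The sketch below is one possible way to do that, not the only one: `close_enough`, `people_by_age` and `matching_pairs` are names invented for this example, and it assumes "within 10 days / 10 minutes" means "at most", with the time of day compared without wrapping around midnight.

```python
from collections import defaultdict
from datetime import datetime
from itertools import combinations

dicts = [
    {'name': 'Josh', 'age': '39', 'Date of Birth': '1983-02-22', 'Time of Birth': '11:25:03'},
    {'name': 'Tyrell', 'age': '24', 'Date of Birth': '1998-01-27', 'Time of Birth': '01:23:54'},
    {'name': 'Jannell', 'age': '39', 'Date of Birth': '1983-02-27', 'Time of Birth': '11:21:34'},
    {'name': 'David', 'age': '24', 'Date of Birth': '1998-01-20', 'Time of Birth': '01:27:24'},
    {'name': 'Matthew', 'age': '24', 'Date of Birth': '1998-03-31', 'Time of Birth': '01:26:41'},
    {'name': 'Tylan', 'age': '24', 'Date of Birth': '1998-01-22', 'Time of Birth': '01:23:16'},
]


def close_enough(a, b, max_days=10, max_minutes=10):
    """Hypothetical helper: compare the date and the time of day separately."""
    d1 = datetime.strptime(a['Date of Birth'], '%Y-%m-%d')
    d2 = datetime.strptime(b['Date of Birth'], '%Y-%m-%d')
    t1 = datetime.strptime(a['Time of Birth'], '%H:%M:%S')
    t2 = datetime.strptime(b['Time of Birth'], '%H:%M:%S')
    return (abs((d1 - d2).days) <= max_days
            and abs((t1 - t2).total_seconds()) <= max_minutes * 60)


# Same idea as above, but keep the whole record so the dates stay available.
people_by_age = defaultdict(list)
for item in dicts:
    people_by_age[item['age']].append(item)

# For every age, list each pair of names that satisfies both conditions.
matching_pairs = {
    age: [(a['name'], b['name'])
          for a, b in combinations(people, 2) if close_enough(a, b)]
    for age, people in people_by_age.items()
}

print(matching_pairs)
# {'39': [('Josh', 'Jannell')],
#  '24': [('Tyrell', 'David'), ('Tyrell', 'Tylan'), ('David', 'Tylan')]}
```

Listing pairs rather than groups sidesteps the question of what to do when A matches B and B matches C but A does not match C.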

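I'm not sure whether you also wanted a complete solution, but here is one. The explanation is in the code comments, and it should also handle the case of ['Alan', 'Betty'], ['Betty', 'Cooper']: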

# importing all the necessary modules
import operator
import itertools
import datetime


data = [
    {'name': 'Josh', 'age': '39', 'Date of Birth': '1983-02-22', 'Time of Birth': '11:25:03'},
    {'name': 'Tyrell', 'age': '24', 'Date of Birth': '1998-01-27', 'Time of Birth': '01:23:54'},
    {'name': 'Jannell', 'age': '39', 'Date of Birth': '1983-02-27', 'Time of Birth': '11:21:34'},
    {'name': 'David', 'age': '24', 'Date of Birth': '1998-01-20', 'Time of Birth': '01:27:24'},
    {'name': 'Matthew', 'age': '24', 'Date of Birth': '1998-03-31', 'Time of Birth': '01:26:41'},
    {'name': 'Tylan', 'age': '24', 'Date of Birth': '1998-01-22', 'Time of Birth': '01:23:16'}
]

# creating a key for sorting, basically it will first sort by age, then by date, then by time
key = operator.itemgetter('age', 'Date of Birth', 'Time of Birth')
data = sorted(data, key=key)


# a convenience function to get a person's date and time of birth as a datetime object
# for time manipulations such as subtraction
def get_datetime(p):
    iso_format = f'{p["Date of Birth"]}T{p["Time of Birth"]}'
    t = datetime.datetime.fromisoformat(iso_format)
    return t


# going over the grouped list by age
for age, group in itertools.groupby(data, key=operator.itemgetter('age')):
    print(f'Age: {age}')
    # convert generator to a list to not exhaust it
    group = list(group)
    previous_match = [None]
    # going over the group while also keeping the current index for later use
    for index, person in enumerate(group):
        # creating a list of people that match the conditions of days and minutes
        # and adding the current person as the first item there
        match = [person['name']]
        time1 = get_datetime(person)
        # going over the group starting from the next person to check if they
        # match that condition of days and minutes
        for other_person in itertools.islice(group, index + 1, None):
            time2 = get_datetime(other_person)
            # subtracting time of both people
            delta = time2 - time1
            # checking if they are in the ten day range and if they are in the ten minute range
            if delta.days <= 10 and (delta.seconds <= 10 * 60 or 24 * 3600 - delta.seconds <= 10 * 60):
                # if they match the conditions of days and minutes append them to the match
                match.append(other_person['name'])
        # check if any other person got matched and check if any new person has appeared
        # this is to check for that case of [Alan, Betty], [Betty, Cooper]
        if len(match) > 1 and match[-1] != previous_match[-1]:
            previous_match = match
            print(match)

Resources: the modules used above (operator, itertools and datetime) are all part of the Python standard library.
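The part of the condition above that reads `24 * 3600 - delta.seconds <= 10 * 60` appears to treat the time of birth as a point on a 24-hour clock, so that 23:58 and 00:03 count as five minutes apart. Whether the question wants that is an assumption. A small sketch of the idea in isolation, using a hypothetical helper `time_of_day_gap`:

```python
from datetime import datetime

SECONDS_PER_DAY = 24 * 3600


def time_of_day_gap(t1, t2):
    """Smallest gap in seconds between two 'HH:MM:SS' strings.

    The gap is measured around the clock, so it never exceeds 12 hours.
    """
    a = datetime.strptime(t1, '%H:%M:%S')
    b = datetime.strptime(t2, '%H:%M:%S')
    diff = int(abs((a - b).total_seconds()))
    # Going the other way around midnight may be shorter.
    return min(diff, SECONDS_PER_DAY - diff)


print(time_of_day_gap('01:23:54', '01:27:24'))  # 210
print(time_of_day_gap('23:58:00', '00:03:00'))  # 300
```

If times on either side of midnight should not match, dropping the `min(...)` and returning `diff` gives the plain difference instead.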

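Once you have the set of ages, you can group the people who share the same age. Then you need to iterate over every member of each group and find the others in the same group who meet your conditions: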

from datetime import datetime
dicts = [{'name':'Josh', 'age':'39','Date of Birth':'1983-02-22','Time of Birth':'11:25:03'},
{'name':'Tyrell', 'age':'24', 'Date of Birth':'1998-01-27','Time of Birth':'01:23:54'},
{'name':'Jannell', 'age':'39', 'Date of Birth':'1983-02-27','Time of Birth':'11:21:34'},
{'name':'David', 'age':'24', 'Date of Birth':'1998-01-20','Time of Birth':'01:27:24'},
{'name':'Matthew', 'age':'24','Date of Birth':'1998-03-31','Time of Birth':'01:26:41'},
{'name':'Tylan', 'age':'24','Date of Birth':'1998-01-22','Time of Birth':'01:23:16'}
]
ages = set([d['age'] for d in dicts])
grouped_list = [[each_person for each_person in dicts if each_person['age'] == each_age] for each_age in ages]
grouped_people = []
for each_group in grouped_list:
    for each_person in each_group:
        new_group_people = [
            each_one['name'] for each_one in each_group
            if abs(datetime.strptime(each_one['Date of Birth'], '%Y-%m-%d') - datetime.strptime(each_person['Date of Birth'], '%Y-%m-%d')).days <= 10
            and abs(datetime.strptime(each_one['Time of Birth'], '%H:%M:%S') - datetime.strptime(each_person['Time of Birth'], '%H:%M:%S')).seconds <= 10*60
        ]
        if len(new_group_people) > 1 and new_group_people not in grouped_people:
            grouped_people.append(new_group_people)

If it is easier for you to follow, you can also unroll one of the loops:

ages = set([d['age'] for d in dicts])
grouped_list = [[each_person for each_person in dicts if each_person['age'] == each_age] for each_age in ages]
grouped_people = []
for each_group in grouped_list:
    for each_person in each_group:
        #new_group_people = [each_one['name'] for each_one in each_group if abs((datetime.strptime(each_one['Date of Birth'], '%Y-%m-%d') - datetime.strptime(each_person['Date of Birth'], '%Y-%m-%d')).days) <= 10 and abs(datetime.strptime(each_one['Time of Birth'], '%H:%M:%S') - datetime.strptime(each_person['Time of Birth'], '%H:%M:%S')).seconds <= 10*60]
        new_group_people = []
        for each_one in each_group:
            if abs((datetime.strptime(each_one['Date of Birth'], '%Y-%m-%d') - datetime.strptime(each_person['Date of Birth'], '%Y-%m-%d')).days) <= 10 and abs(datetime.strptime(each_one['Time of Birth'], '%H:%M:%S') - datetime.strptime(each_person['Time of Birth'], '%H:%M:%S')).seconds <= 10*60:
               new_group_people.append(each_one['name'])
        if len(new_group_people) > 1 and new_group_people not in grouped_people:
            grouped_people.append(new_group_people)
print(grouped_people)

output:

[['Tyrell', 'David', 'Tylan'], ['Josh', 'Jannell']]
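The loops above call `strptime` on the same strings many times and iterate over a `set` of ages, so the order of the groups in the output can vary between runs. As a possible variation (a sketch, not a drop-in replacement), each record could be parsed once and the people visited in input order. The name `parsed` is introduced only for this example.

```python
from datetime import datetime

dicts = [
    {'name': 'Josh', 'age': '39', 'Date of Birth': '1983-02-22', 'Time of Birth': '11:25:03'},
    {'name': 'Tyrell', 'age': '24', 'Date of Birth': '1998-01-27', 'Time of Birth': '01:23:54'},
    {'name': 'Jannell', 'age': '39', 'Date of Birth': '1983-02-27', 'Time of Birth': '11:21:34'},
    {'name': 'David', 'age': '24', 'Date of Birth': '1998-01-20', 'Time of Birth': '01:27:24'},
    {'name': 'Matthew', 'age': '24', 'Date of Birth': '1998-03-31', 'Time of Birth': '01:26:41'},
    {'name': 'Tylan', 'age': '24', 'Date of Birth': '1998-01-22', 'Time of Birth': '01:23:16'},
]

# Parse every record once: (name, age, date of birth, time of birth).
parsed = [
    (d['name'], d['age'],
     datetime.strptime(d['Date of Birth'], '%Y-%m-%d'),
     datetime.strptime(d['Time of Birth'], '%H:%M:%S'))
    for d in dicts
]

grouped_people = []
for _, age, day, time in parsed:
    # Everyone (including the person themselves) who matches this person.
    group = [
        name for name, other_age, other_day, other_time in parsed
        if other_age == age
        and abs((other_day - day).days) <= 10
        and abs((other_time - time).total_seconds()) <= 10 * 60
    ]
    # Skip people with no match and groups that were already found.
    if len(group) > 1 and group not in grouped_people:
        grouped_people.append(group)

print(grouped_people)
# [['Josh', 'Jannell'], ['Tyrell', 'David', 'Tylan']]
```

The groups are the same as in the output above. Only their order differs, because here it follows the order of the input list.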
