
I have multiple dictionaries in a Python list. I want to find the values they share and then compare them while looping through the list

OK, so here's an example dataset:

returntime= '9:00'


data1 = {'Name':'jim', 'cardriven': '20123', 'time':'7:30'}
data2 = {'Name':'bob', 'cardriven': '20123', 'time':'10:30'}
data3 = {'Name':'jim', 'cardriven': '201111', 'time':'8:30'}
data4 = {'Name':'bob', 'cardriven': '201314', 'time':'9:30'}

My problem is that I need to loop over these dictionaries, find the car that both people have driven, and then compare the times at which they returned it to see who returned the car closest to 9:00.

I have tried many loops, built lists, and so on, but I know there has to be a simple way to just say:

for [data1, data2, ...]: who returned the car closest to the time, and here is the info from that record.

Thanks in advance.

Maybe you can try using a single dict in which each entry is another dict, keyed by something like the driver's name or an ID code.
Then you can loop over that dict and find out which entries drove the same car.

Here's a simplified example of what I mean:

returntime= '9:00'


data1 = {'Name':'jim', 'cardriven': '20123', 'time': "7:30"}
data2 = {'Name':'bob', 'cardriven': '20123', 'time': "10:30"}
data3 = {'Name':'jim', 'cardriven': '201111', 'time': "8:30"}

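# Named `records` rather than `dict` so the built-in dict type is not shadowed.
records = {}
records[0] = data1
records[1] = data2
records[2] = data3


for i in range(len(records)):
    if records[i]["cardriven"] == '20123':
        print(records[i]["Name"])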

Output:

jim
bob
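The example above hard-codes the car ID '20123'. As a rough sketch of how the shared car could be found instead, the records can be grouped by car and filtered to cars with more than one distinct driver. The names records, drivers_by_car and shared_cars are made up for this illustration and do not come from the answer.

```python
from collections import defaultdict

# Sample records copied from the question (keys quoted so they are valid Python).
records = [
    {"Name": "jim", "cardriven": "20123", "time": "7:30"},
    {"Name": "bob", "cardriven": "20123", "time": "10:30"},
    {"Name": "jim", "cardriven": "201111", "time": "8:30"},
    {"Name": "bob", "cardriven": "201314", "time": "9:30"},
]

# Collect the set of driver names seen for each car.
drivers_by_car = defaultdict(set)
for record in records:
    drivers_by_car[record["cardriven"]].add(record["Name"])

# A car counts as shared when more than one distinct driver used it.
shared_cars = sorted(car for car, drivers in drivers_by_car.items() if len(drivers) > 1)
print(shared_cars)
```

Using a set per car means the same driver taking a car twice would not count as a shared car, which seems to match what the question asks for.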

Also, a pro tip: you can store the time in the dict as a datetime object, which makes comparing times much easier.
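A minimal sketch of that tip, assuming the times are 'H:MM' strings as in the question. The helper name parse_time and the variables target, record and gap_minutes are invented for this example.

```python
from datetime import datetime

def parse_time(text):
    """Parse an 'H:MM' string; strptime fills in a default date of 1900-01-01."""
    return datetime.strptime(text, "%H:%M")

target = parse_time("9:00")

# Store the parsed datetime in the record instead of the raw string.
record = {"Name": "jim", "cardriven": "20123", "time": parse_time("7:30")}

# Subtracting two datetimes gives a timedelta, and abs() works on it directly.
gap = abs(record["time"] - target)
gap_minutes = gap.total_seconds() / 60
print(gap_minutes)
```

Because every parsed value gets the same default date, the subtraction only reflects the time of day.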

This iterates through the data you provided and stores each car in a dictionary, keeping for every car whichever record has the time closest to the goal.

import datetime

returntime = "09:00"

data = [
    dict(name="Jim", cardriven="20123", time="7:30"),
    dict(name="Bob", cardriven="20123", time="10:30"),
    dict(name="Jim", cardriven="201111", time="8:30"),
    dict(name="Bob", cardriven="201314", time="9:30"),
]

def parsedelta(s):
    # Parse "H:MM" as hours and minutes (not as minutes and seconds).
    t = datetime.datetime.strptime(s, "%H:%M")
    return datetime.timedelta(hours=t.hour, minutes=t.minute)

deltareturn = parsedelta(returntime)

def diffreturn(s):
    return abs(deltareturn.seconds - parsedelta(s).seconds)

cars = {}
for datum in data:
    car = datum["cardriven"]
    if car not in cars:
        cars[car] = datum
        continue
    if diffreturn(datum["time"]) < diffreturn(cars[car]["time"]):
        cars[car] = datum

print(cars)

Since we want to find a car that both of them drove, we can build a dictionary in which each key is a car and each value is a list of (name, time) pairs, along with a list of the cars that more than one person drove. Then we compare the times to see who returned each car closest to returntime .

from datetime import datetime

# Assumes data1..data4 (dicts with the string keys 'Name', 'cardriven', 'time')
# and returntime (a string such as '9:00') are already defined.
temp = {}
both_drove = []
for data in [data1, data2, data3, data4]:
    if data['cardriven'] in temp:
        temp[data['cardriven']].append((data['Name'], data['time']))
        both_drove.append(data['cardriven'])
    else:
        temp[data['cardriven']] = [(data['Name'], data['time'])]

returntime = datetime.strptime(returntime, '%H:%M')

for car in both_drove:
    # This unpacking assumes exactly two entries per shared car.
    p1, p2 = temp[car]
    if abs(datetime.strptime(p1[1], '%H:%M') - returntime) > abs(datetime.strptime(p2[1], '%H:%M') - returntime):
        print(p2)
    else:
        print(p1)

Output:

('jim', '7:30')

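NB: For car 20123 the result is a tie, because 7:30 and 10:30 are both 90 minutes away from returntime . The comparison above uses > , so on a tie it prints the first entry.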
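The loop above unpacks exactly two entries per car. As a sketch of one way to lift that restriction and make ties visible, the records can be grouped per car and every record tied for the smallest gap returned. The helpers minutes and closest_returns are hypothetical names for this illustration, and times are assumed to be 'H:MM' strings within a single day.

```python
from collections import defaultdict

def minutes(text):
    """Convert an 'H:MM' string to minutes since midnight."""
    hours, mins = text.split(":")
    return int(hours) * 60 + int(mins)

def closest_returns(records, target):
    """Map each shared car to the list of records tied for closest to target."""
    by_car = defaultdict(list)
    for record in records:
        by_car[record["cardriven"]].append(record)

    result = {}
    for car, group in by_car.items():
        if len({r["Name"] for r in group}) < 2:
            continue  # only one driver used this car, so there is nothing to compare
        gaps = [abs(minutes(r["time"]) - minutes(target)) for r in group]
        best = min(gaps)
        # Keep every record whose gap equals the minimum, so ties are not hidden.
        result[car] = [r for r, gap in zip(group, gaps) if gap == best]
    return result

records = [
    {"Name": "jim", "cardriven": "20123", "time": "7:30"},
    {"Name": "bob", "cardriven": "20123", "time": "10:30"},
    {"Name": "jim", "cardriven": "201111", "time": "8:30"},
    {"Name": "bob", "cardriven": "201314", "time": "9:30"},
]
winners = closest_returns(records, "9:00")
print(winners)
```

With the sample data, only car 20123 is shared, and both of its records come back because they are equally far from 9:00.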

The test data is a bit odd for the question. You are basically looking for a group-by-and-sort approach, but 2 of the 3 groups in your test data have only a single entry. Furthermore, for car 20123 , the two times are equally distant ( delta_min in my answer below) from the returntime. In that case, the sort_values step below has no basis for ordering the two entries. If you know how equally distant entries should be ranked, that is a next step you can work on.

Nevertheless, I think the best course of action is to convert the data into a pandas DataFrame and build a pipeline. For this data:
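As a sketch of what such a ranking rule could look like in plain Python, a tuple sort key can put the distance first and a tie-breaker second. The rule used here (on equal distance, an early return ranks ahead of a late one) is purely an assumption for illustration and is not stated in the question. The names minutes and rank_key are also made up.

```python
def minutes(text):
    """Convert an 'H:MM' string to minutes since midnight."""
    hours, mins = text.split(":")
    return int(hours) * 60 + int(mins)

def rank_key(record, target="9:00"):
    """Sort key: distance from target first, then early returns before late ones."""
    gap = minutes(record["time"]) - minutes(target)
    # Assumed tie-break rule: False (early or on time) sorts before True (late).
    return (abs(gap), gap > 0)

group = [
    {"Name": "bob", "cardriven": "20123", "time": "10:30"},
    {"Name": "jim", "cardriven": "20123", "time": "7:30"},
]
ranked = sorted(group, key=rank_key)
print([r["Name"] for r in ranked])
```

Swapping the second tuple element for a different expression would change the tie-break without touching the distance logic.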

data1 = {"Name":'jim', "cardriven": '20123', "time":'7:30'}
data2 = {"Name":'bob', "cardriven": '20123', "time":'10:30'}
data3 = {"Name":'jim', "cardriven": '201111', "time":'8:30'}
data4 = {"Name":'bob', "cardriven": '201314', "time":'9:30'}

We can design a pipeline that uses a modified version of the excellent parsedelta function proposed in ljmc's answer.

import datetime
import pandas as pd

data = pd.DataFrame([data1, data2, data3, data4])
#   Name cardriven   time
# 0  jim     20123   7:30
# 1  bob     20123  10:30
# 2  jim    201111   8:30
# 3  bob    201314   9:30


def timedelta(time):
    t = datetime.datetime.strptime(time, "%H:%M")
    return datetime.timedelta(hours=t.hour, minutes=t.minute).seconds / 60

returntime= '9:00'

latest_entries = (
    data
    .assign(delta_min=lambda d: abs(d["time"].apply(timedelta) - timedelta(returntime)))
    .sort_values("delta_min")
    .drop("delta_min", axis = 1) # comment this out if you want the minute difference
    .drop_duplicates(subset="cardriven")
    
)
print(latest_entries)

Which gives us

    Name cardriven  time
  2  jim    201111  8:30
  0  jim     20123  7:30
  3  bob    201314  9:30

Going further, we can simplify the pipeline by passing a distance function as the key parameter of the sort_values step. To do that, we split timedelta into a parsing helper ( _timedelta ) and a function that returns the absolute difference from the return time.

def _timedelta(tm):
    t = datetime.datetime.strptime(tm, "%H:%M")
    return datetime.timedelta(hours=t.hour, minutes=t.minute).seconds / 60


def timedelta(time, rtrn_time):
    return abs(_timedelta(time) - _timedelta(rtrn_time))


returntime= '9:00'

latest_entries = (
    data
    .sort_values("time", key=lambda d: d.apply(timedelta, rtrn_time=returntime))
    .drop_duplicates(subset="cardriven")
    
)
print(latest_entries)

    Name cardriven  time
  2  jim    201111  8:30
  0  jim     20123  7:30
  3  bob    201314  9:30
