
Iterate through a list of dictionaries and identify similar values in Python

Suppose I have a list of dictionaries as below:

[{'name': 'User_ORDERS1234', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_2']}, {'name': 'User_ORDERS1235', 'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}], 'users': ['User_1']}, {'name': 'User_ORDERS1236', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_3']}]

While iterating over this list, I want to check whether the value of the expressions key (a sub-list) is the same as the expressions value of another dictionary in the list. In the above case, the dictionary whose users key has the value User_2 has the same expressions as the dictionary for User_3. In this case I want to delete the entire dictionary of User_3 and append the value User_3 to the users list of User_2 (as 'users': ['User_2', 'User_3']).

Expected output:

[{'name': 'User_ORDERS1234', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_2','User_3']}, {'name': 'User_ORDERS1235', 'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}], 'users': ['User_1']}]

You can use enumerate to get the index and value of each order in the list of orders. scanned_exp is a dictionary whose keys are the unique expressions and whose values are the indices in the list of orders at which each expression first occurred. When iterating, we check whether the current expression has already been scanned, i.e., whether it is in scanned_exp. If it has, we extend the list of users at the index of the first occurrence of that expression with the list of users from the current order. We also record the index of the current order in duplicates and delete those orders once the loop has finished; calling remove inside the loop would shift the remaining items and make the loop skip elements.

scanned_exp = {}
duplicates = []
for idx, order in enumerate(d):
    exp = order["expressions"][0]["exp"]
    if exp in scanned_exp:
        # Merge this order's users into the first order with the same expression.
        d[scanned_exp[exp]]["users"].extend(order["users"])
        duplicates.append(idx)
    else:
        scanned_exp[exp] = idx

# Delete from the end so that earlier indices stay valid.
for idx in reversed(duplicates):
    del d[idx]

Your output then becomes:

[
    {
        'name': 'User_ORDERS1234', 
        'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 
        'users': ['User_2', 'User_3']
    }, 
    {
        'name': 'User_ORDERS1235', 
        'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}], 
        'users': ['User_1']
    }
]

Edit

Okay, let's make this dynamic. Firstly, the keys of a dictionary cannot be lists (unhashable type), so using the whole expressions list as the key would break the original implementation above. A collection that can be used as a key is a tuple (unless the tuple contains unhashable types, i.e., list or dict). What we can do is build a tuple that contains all of the strings that appear as the value of the exp key.
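To make the hashability point concrete, here is a small sketch. The variable names (seen, error_message, nested_ok) are made up for illustration and are not part of the answer.

```python
seen = {}

# A list cannot be used as a dictionary key: Python raises TypeError.
try:
    seen[["a", "b"]] = 0
except TypeError as err:
    error_message = str(err)

# A tuple of strings is hashable, so it works as a key.
seen[("a", "b")] = 0

# A tuple that contains a list is still unhashable.
try:
    seen[("a", ["b"])] = 1
except TypeError:
    nested_ok = False
else:
    nested_ok = True

print(seen, nested_ok)
```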

So, you can replace this:

exp = order["expressions"][0]["exp"]

with this:

exp = tuple(e["exp"] for e in order["expressions"])
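Putting the pieces together, the tuple key could be used roughly like this. This is only a sketch: merge_by_expressions, first_seen and the sample data are hypothetical, and it returns a new list instead of modifying the input in place.

```python
def merge_by_expressions(orders):
    """Merge orders whose 'exp' strings all match, keeping the first occurrence."""
    first_seen = {}  # tuple of exp strings -> position in `merged`
    merged = []
    for order in orders:
        key = tuple(e["exp"] for e in order["expressions"])
        if key in first_seen:
            merged[first_seen[key]]["users"].extend(order["users"])
        else:
            first_seen[key] = len(merged)
            # Copy the order and its users list so the input is left untouched.
            merged.append({**order, "users": list(order["users"])})
    return merged


# Hypothetical sample with more than one expression per order.
sample = [
    {"name": "A", "expressions": [{"exp": "x = 1"}, {"exp": "y = 2"}], "users": ["User_2"]},
    {"name": "B", "expressions": [{"exp": "x = 1"}], "users": ["User_1"]},
    {"name": "C", "expressions": [{"exp": "x = 1"}, {"exp": "y = 2"}], "users": ["User_3"]},
]
result = merge_by_expressions(sample)
print(result)
```

Orders A and C share both expressions, so C's user ends up in A's users list, while B stays separate because it only has one of the two expressions.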
orders = [{
    'name': 'User_ORDERS1234',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}],
    'users': ['User_2']
},{
    'name': 'User_ORDERS1235',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}],
    'users': ['User_1']
},{
    'name': 'User_ORDERS1236',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}],
    'users': ['User_3']
}]

for i, order in enumerate(orders):                # loop through the orders:
    exp1 = order['expressions']                   # 'expressions' list of the order

    for next_order in orders[i+1:]:               # loop through the following orders:
        exp2 = next_order['expressions']          # 'expressions' list of a following order

        if exp1 == exp2:                          # if the 'expressions' lists are equal:
            order['users'] += next_order['users'] # add its 'users' to the order's 'users'
            next_order['users'] = []              # empty the 'users' of the following order

orders = [o for o in orders if o['users']]        # keep only the orders that still have 'users'

print(orders)

Output

[{
    'name': 'User_ORDERS1234',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}],
    'users': ['User_2', 'User_3']
},{
    'name': 'User_ORDERS1235',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}],
    'users': ['User_1']
}]
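
Comparing the whole expressions lists with == works for any dictionary content, but it needs a nested loop. If the expressions are JSON-serialisable (an assumption), one possible alternative is to turn each list into a canonical string and use that as a dictionary key. The names expressions_key and merge_orders below are hypothetical.

```python
import json


def expressions_key(order):
    # sort_keys=True makes dicts with the same content serialise identically.
    return json.dumps(order["expressions"], sort_keys=True)


def merge_orders(orders):
    merged = {}  # canonical string -> merged order (insertion order is preserved)
    for order in orders:
        key = expressions_key(order)
        if key in merged:
            merged[key]["users"] += order["users"]
        else:
            # Copy so the caller's dictionaries are not modified.
            merged[key] = {**order, "users": list(order["users"])}
    return list(merged.values())


# Hypothetical data: the two matching orders list their keys in a different order.
sample_orders = [
    {"name": "A", "expressions": [{"exp": "x", "kind": "filter"}], "users": ["User_2"]},
    {"name": "B", "expressions": [{"exp": "y", "kind": "filter"}], "users": ["User_1"]},
    {"name": "C", "expressions": [{"kind": "filter", "exp": "x"}], "users": ["User_3"]},
]
merged_orders = merge_orders(sample_orders)
print(merged_orders)
```
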
dictt = [{'name': 'User_ORDERS1234', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_2']}, {'name': 'User_ORDERS1235', 'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}], 'users': ['User_1']}, {'name': 'User_ORDERS1236', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_3']}]



def sorting_it(d):
    for n,c in enumerate([x['expressions'] for x in dictt]):
        if c == d['expressions'] and dictt[n] != d and d['users']:
            d['users'] = d['users'] + dictt[n]['users']
            del dictt[n]
f = list(map(sorting_it,dictt))

print(dictt)

>>> [{'name': 'User_ORDERS1234', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_2', 'User_3']}, {'name': 'User_ORDERS1235', 'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}], 'users': ['User_1']}]

Explanation:

f = list(map(sorting_it,dictt))

Using the map function, every dictionary in dictt is passed through the function sorting_it one at a time as the variable d, so the first one is:

{'name': 'User_ORDERS1234', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_2']}

Now I'm looping through the values of the key 'expressions'; [x['expressions'] for x in dictt] is the list of these values.

If the value of key 'expressions' in the dictionary d is equal to a value in [x['expressions'] for x in dictt], then I take its index n, use it to find the corresponding dictionary in dictt, and add the values for key 'users' together.

I then do del dictt[n], since the users of that dictionary have already been added to another dictionary; in this case the dictionary for 'User_3' is deleted because its user was added to the dictionary for 'User_2'.

Also, dictt[n] != d makes sure I'm not comparing a dictionary with itself, and d['users'] skips dictionaries whose users list is empty.
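
One thing to keep in mind with this approach: deleting from a list while it is still being iterated shifts the later items down, so elements can be skipped. The sketch below first shows the effect on a made-up list of numbers, then a variant that builds a new list instead. merge_users and the sample data are hypothetical.

```python
# Removing during iteration: the second 2 is skipped and survives.
items = [1, 2, 2, 3]
for x in items:
    if x == 2:
        items.remove(x)
print(items)  # [1, 2, 3]


def merge_users(orders):
    """Build a new list, merging users of orders with equal 'expressions'."""
    kept = []
    for order in orders:
        # Find an already kept order with the same expressions, if any.
        match = next((k for k in kept if k["expressions"] == order["expressions"]), None)
        if match is None:
            kept.append({**order, "users": list(order["users"])})
        else:
            match["users"].extend(order["users"])
    return kept


# Three orders with the same expressions (hypothetical data).
sample = [
    {"name": "A", "expressions": [{"exp": "x"}], "users": ["u1"]},
    {"name": "B", "expressions": [{"exp": "x"}], "users": ["u2"]},
    {"name": "C", "expressions": [{"exp": "x"}], "users": ["u3"]},
]
merged_sample = merge_users(sample)
print(merged_sample)
```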

def function_1(values):
  duplicates = set()  # indices of dictionaries already merged into an earlier one
  for j in range(len(values)):
    if j in duplicates:
      continue
    for k in range(j + 1, len(values)):
      if values[j]['expressions'] == values[k]['expressions']:
        values[j]['users'] = values[j]['users'] + values[k]['users']
        duplicates.add(k)
  # Drop the merged duplicates so that only the first occurrence remains
  return [v for i, v in enumerate(values) if i not in duplicates]

# Sample input

list_values = [{'name': 'User_ORDERS1234', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_2']}, {'name': 'User_ORDERS1235', 'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}], 'users': ['User_1']}, {'name': 'User_ORDERS1236', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_3']}]

# Call the function

function_1(list_values)

[{'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}],
  'name': 'User_ORDERS1234',
  'users': ['User_2', 'User_3']},
 {'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}],
  'name': 'User_ORDERS1235',
  'users': ['User_1']}]
