Fill dictionary with value from the same row, but different column

Lately I've been trying to map some values, so I'm trying to create a dictionary to do so. The odd thing is my DataFrame has a column made of lists, and DataFrames are always a bit awkward with lists. The DataFrame has the following structure:

    rules          procedure
['10','11','12']       1
['13','14']            2
['20','21','22','24']  3

So I want to create a dictionary that maps '10' to 1, '14' to 2, and so on. I tried the following:

dicc=dict()
for j in df['rules']:
    for i,k in zip(j,df.procedure):
        dicc[i]=k

But that doesn't produce the mapping I want. It probably has something to do with indexes. What am I missing?

Edit: I'm trying to create a dictionary that maps the values '10', '11', '12' to 1; '13', '14' to 2; and '20', '21', '22', '24' to 3, so that dicc['10'] returns 1 and dicc['22'] returns 3. Obviously, the actual DataFrame is much bigger and I can't do it manually.

You can do it like this:

import pandas as pd

data = [[['10', '11', '12'], 1],
        [['13', '14'], 2],
        [['20', '21', '22', '24'], 3]]

df = pd.DataFrame(data=data, columns=['rules', 'procedure'])

d = {r : p for rs, p in df[['rules', 'procedure']].values for r in rs}
print(d)

Output

{'20': 3, '10': 1, '11': 1, '24': 3, '14': 2, '22': 3, '13': 2, '12': 1, '21': 3}

Notes:

  • The code {r : p for rs, p in df[['rules', 'procedure']].values for r in rs} is a dictionary comprehension, the dictionary counterpart of a list comprehension.
  • df[['rules', 'procedure']].values is equivalent to zip(df.rules, df.procedure): each iteration yields a (list, int) pair, so rs is a list and p is an integer.
  • Finally, the second for loop iterates over the values of rs, and each of them becomes a key mapped to p.
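To see why the loop in the question goes wrong, here is a minimal sketch that runs the original attempt next to a row-by-row version. It assumes plain Python lists named rules and procedure as stand-ins for df['rules'] and df['procedure'], so it runs without pandas; the names wrong, rule_list and proc are only illustrative.

```python
# Plain lists stand in for df['rules'] and df['procedure'] (an assumption,
# made so the snippet runs without pandas).
rules = [['10', '11', '12'], ['13', '14'], ['20', '21', '22', '24']]
procedure = [1, 2, 3]

# Original attempt: zip(j, procedure) pairs the elements *inside* one list
# with successive procedure values, so '11' is paired with 2 and '12' with 3.
wrong = {}
for j in rules:
    for i, k in zip(j, procedure):
        wrong[i] = k

# Row-by-row version: zip the two columns first, then loop over each row's list.
dicc = {}
for rule_list, proc in zip(rules, procedure):
    for rule in rule_list:
        dicc[rule] = proc

print(wrong['11'])    # 2, not the intended 1
print('24' in wrong)  # False: zip stops after three procedure values
print(dicc['11'])     # 1
print(dicc['24'])     # 3
```

The nested loop is the same logic as the dictionary comprehension, just written out step by step.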

UPDATE

As suggested by @piRSquared, you can use zip:

d = {r : p for rs, p in zip(df.rules, df.procedure) for r in rs}

Help from cytoolz

from cytoolz.dicttoolz import merge

merge(*map(dict.fromkeys, df.rules, df.procedure))

{'10': 1,
 '11': 1,
 '12': 1,
 '13': 2,
 '14': 2,
 '20': 3,
 '21': 3,
 '22': 3,
 '24': 3}

Note

I updated my post to mimic how @jpp passed multiple iterables to map. @jpp's answer is very good. Though I'd advocate for upvoting all useful answers, I wish I could upvote their answer again (-:
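For readers without cytoolz, the following sketch shows what passing two iterables to map does here, using only built-ins. It assumes plain lists rules and procedure in place of the DataFrame columns, and the update loop is a hand-written stand-in for merge, not the library's implementation.

```python
# Plain lists stand in for df.rules and df.procedure (an assumption).
rules = [['10', '11', '12'], ['13', '14'], ['20', '21', '22', '24']]
procedure = [1, 2, 3]

# With two iterables, map calls dict.fromkeys(rules[0], procedure[0]),
# dict.fromkeys(rules[1], procedure[1]), and so on: one small dict per row.
per_row = list(map(dict.fromkeys, rules, procedure))
print(per_row[1])  # {'13': 2, '14': 2}

# Hand-written stand-in for merge: fold the per-row dicts into one.
merged = {}
for row_dict in per_row:
    merged.update(row_dict)  # on duplicate keys, the later row wins

print(merged['22'])  # 3
```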

Using collections.ChainMap:

from collections import ChainMap

res = dict(ChainMap(*map(dict.fromkeys, df['rules'], df['procedure'])))

print(res)

{'10': 1, '11': 1, '12': 1, '13': 2, '14': 2,
 '20': 3, '21': 3, '22': 3, '24': 3}

For many uses, the final dict conversion is not necessary. As the documentation for ChainMap says:

"A ChainMap class is provided for quickly linking a number of mappings so they can be treated as a single unit. It is often much faster than creating a new dictionary and running multiple update() calls."

See also the question "What is the purpose of collections.ChainMap?"
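As a small illustration of skipping the dict conversion, the sketch below looks keys up directly on the ChainMap. It assumes plain lists rules and procedure in place of the DataFrame columns; the dup example is hypothetical and only shows how duplicate keys are resolved.

```python
from collections import ChainMap

# Plain lists stand in for df['rules'] and df['procedure'] (an assumption).
rules = [['10', '11', '12'], ['13', '14'], ['20', '21', '22', '24']]
procedure = [1, 2, 3]

# No dict(...) call: lookups search the underlying per-row dicts in order.
chained = ChainMap(*map(dict.fromkeys, rules, procedure))
print(chained['10'])  # 1
print(chained['22'])  # 3

# If a key appeared in more than one row, the first mapping would win.
dup = ChainMap({'x': 1}, {'x': 2})
print(dup['x'])  # 1
```

This precedence is worth keeping in mind if the same rule can appear in several rows, since a plain loop with update() would let the last row win instead.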

You can flatten the lists and repeat each procedure value once per rule:

dict(zip(sum(df.rules.tolist(),[]),df.procedure.repeat(df.rules.str.len())))
Out[60]: 
{'10': 1,
 '11': 1,
 '12': 1,
 '13': 2,
 '14': 2,
 '20': 3,
 '21': 3,
 '22': 3,
 '24': 3}
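The one-liner above builds two aligned sequences and zips them together. A plain-Python sketch of those two pieces, assuming lists rules and procedure in place of the DataFrame columns (flat_rules and repeated are illustrative names), might look like this:

```python
# Plain lists stand in for df.rules and df.procedure (an assumption).
rules = [['10', '11', '12'], ['13', '14'], ['20', '21', '22', '24']]
procedure = [1, 2, 3]

# Counterpart of sum(df.rules.tolist(), []): one flat list of all rules.
flat_rules = [r for rs in rules for r in rs]

# Counterpart of df.procedure.repeat(df.rules.str.len()):
# each procedure value repeated once per rule in its row.
repeated = [p for rs, p in zip(rules, procedure) for _ in rs]
print(repeated)  # [1, 1, 1, 2, 2, 3, 3, 3, 3]

# The two sequences line up element by element, so zip pairs them correctly.
d = dict(zip(flat_rules, repeated))
print(d['14'])  # 2
```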

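Using itertools.chain and DataFrame.itertuples:

from itertools import chain

# Each row yields (rule, procedure) pairs; chain flattens them into one stream.
dict(
    chain.from_iterable(
        ((rule, row.procedure) for rule in row.rules) for row in df.itertuples()
    )
)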
