
How to populate multiple dictionaries with common keys into a pandas DataFrame?

I have a list of dictionaries whose keys are identical but whose values differ, and the order of each dictionary is strictly preserved. I am trying to find an automatic way to add these dictionaries to a pandas DataFrame as new columns, but I didn't get the expected output.

Original data on gist

Here is the data I have: old data on gist.

My attempt

Here is my attempt to populate multiple dictionaries with the same keys but different values (binary values). My goal is to write a handy function that vectorizes the code. Here is my inefficient but working code on gist:

import pandas as pd

dat = pd.read_csv('old_data.csv', encoding='utf-8')

# typ, anim, bov, cat and foo are the dictionaries described above (code -> label)
dat['type'] = dat['code'].astype(str).map(typ)
dat['anim'] = dat['code'].astype(str).map(anim)
dat['bovin'] = dat['code'].astype(str).map(bov)
dat['catg'] = dat['code'].astype(str).map(cat)
dat['foot'] = dat['code'].astype(str).map(foo)

My code works, but it is not vectorized (not efficient, I think). I am wondering how I can turn these few lines into a simple function. Any ideas? How can we do this as efficiently as possible?

Here is my current and desired output:

I get the correct output, but the code is not very efficient. This is my current output on gist.

If you restructure your dictionaries into a dictionary of dictionaries, you can reduce the whole thing to a single short loop:

for keys in values.keys():
    dat[keys] = dat['code'].astype(str).map(values[keys])
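
Since the question asks for a reusable function, the loop above can be wrapped in one. The following is only a minimal sketch: the helper name `add_mapped_columns` and the tiny in-memory frame standing in for `old_data.csv` are assumptions made for illustration, not part of the original post.

```python
import pandas as pd


def add_mapped_columns(df, key_col, mappings):
    """Return a copy of df with one new column per entry in mappings.

    mappings is a dict of dicts: {new_column_name: {key: value}}.
    Keys missing from an inner dict produce NaN in that column.
    """
    out = df.copy()
    keys = out[key_col].astype(str)  # convert once instead of once per column
    for col, mapping in mappings.items():
        out[col] = keys.map(mapping)
    return out


# Toy data standing in for old_data.csv (hypothetical values).
dat = pd.DataFrame({"code": [20230, 20321, 99999]})
values = {
    "typ": {"20230": "A", "20321": "B"},
    "anim": {"20230": "AOB", "20321": "AOC"},
}
result = add_mapped_columns(dat, "code", values)
print(result)
```

Note that the outer keys become the column names, so reproducing the question's columns (`type`, `bovin`, `catg`, `foot`) would require naming the outer keys accordingly.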

Full code:

values = {"typ" :{
    '20230' : 'A',
    '20130' : 'A',
    '20220' : 'A',
    '20120' : 'A',
    '20329' : 'A',
    '20322' : 'A',
    '20321' : 'B',
    '20110' : 'B',
    '20210' : 'B',
    '20311' : 'B'
    } ,

"anim" :{
    '20230' : 'AOB',
    '20130' : 'AOB',
    '20220' : 'AOB',
    '20120' : 'AOB',
    '20329' : 'AOC',
    '20322' : 'AOC',
    '20321' : 'AOC',
    '20110' : 'AOB',
    '20210' : 'AOB',
    '20311' : 'AOC'
    } ,

"bov" :{
    '20230' : 'AOD',
    '20130' : 'AOD',
    '20220' : 'AOD',
    '20120' : 'AOD',
    '20329' : 'AOE',
    '20322' : 'AOE',
    '20321' : 'AOE',
    '20110' : 'AOD',
    '20210' : 'AOD',
    '20311' : 'AOE'
    } ,

"cat" :{
    '20230' : 'AOF',
    '20130' : 'AOG',
    '20220' : 'AOF',
    '20120' : 'AOG',
    '20329' : 'AOF',
    '20322' : 'AOF',
    '20321' : 'AOF',
    '20110' : 'AOG',
    '20210' : 'AOF',
    '20311' : 'AOG'
    } ,

"foo" :{
    '20230' : 'AOL',
    '20130' : 'AOL',
    '20220' : 'AOM',
    '20120' : 'AOM',
    '20329' : 'AOL',
    '20322' : 'AOM',
    '20321' : 'AOM',
    '20110' : 'AOM',
    '20210' : 'AOM',
    '20311' : 'AOM'
    } 
}




import pandas as pd

dat = pd.read_csv('old_data.csv', encoding='utf-8')
# Each outer key becomes a new column; its inner dict maps code -> label.
for keys in values.keys():
    dat[keys] = dat['code'].astype(str).map(values[keys])
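
As a possible alternative to the loop, a dictionary of dictionaries can be turned into a lookup table and attached with a single join. This is a sketch under stated assumptions, not the answerer's method: the shortened `values` dict, the toy `dat` frame, and the temporary column name `code_str` are all made up for illustration.

```python
import pandas as pd

# Shortened version of the dict of dicts (illustrative subset only).
values = {
    "typ": {"20230": "A", "20321": "B"},
    "anim": {"20230": "AOB", "20321": "AOC"},
}

# A dict of dicts becomes a frame: outer keys -> columns, inner keys -> index.
lookup = pd.DataFrame(values)

# Toy data standing in for old_data.csv (hypothetical values).
dat = pd.DataFrame({"code": [20321, 20230, 20230]})

# Left-join the string form of 'code' against the lookup index,
# then drop the temporary helper column.
joined = (
    dat.assign(code_str=dat["code"].astype(str))
    .join(lookup, on="code_str")
    .drop(columns="code_str")
)
print(joined)
```

The join adds every lookup column in one step and keeps the row order of `dat`; codes absent from the lookup would end up as NaN, just as with `map`.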
