Insert Values into a Dictionary From a DF Column - Pandas (Python)

I have a list of values that I am trying to match against a column of a pandas DataFrame. I would then like to create a dictionary that has the list values as keys and, as its values, the corresponding entries from a different column of the DataFrame.

This is my list:

sample_list = [101,105,112]

My DataFrame:

import pandas as pd

sample_df = pd.DataFrame([[101, "NJ"], [105, "CA"], [111, "MO"], [101, "NJ"], [112, "NB"], [101, "NJ"], [105, "CA"]],
                         columns=["Col1", "Col2"])

It looks like this:

    Col1    Col2
0   101     NJ
1   105     CA
2   111     MO
3   101     NJ
4   112     NB
5   101     NJ
6   105     CA

Now I am trying to iterate over the list values (which are the keys of my new_dict), match them against Col1, and, where they match, extract the corresponding Col2 values as my dictionary values. This is my code so far:

new_dict = {}
for value in sample_list:
    for i in sample_df['Col1']:
        if value == i:
            new_dict[value] = [i for i in sample_df['Col2']]

However, my new_dict looks like this:

{101: ['NJ', 'CA', 'MO', 'NJ', 'NB', 'NJ', 'CA'],
 105: ['NJ', 'CA', 'MO', 'NJ', 'NB', 'NJ', 'CA'],
 112: ['NJ', 'CA', 'MO', 'NJ', 'NB', 'NJ', 'CA']}

I need my output to look like this:

{101: ['NJ'],
 105: ['CA'],
 112: ['NB']}

How can I get to my desired output? Any help would be appreciated.

Do this:

new_dict = {i: [sample_df[sample_df['Col1']==i]['Col2'].values[0]] for i in sample_list}
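For reference, the loop in the question fills every key with the whole column because its inner comprehension, [i for i in sample_df['Col2']], never filters on the match. A minimal sketch of a filtered loop follows. The name matches and the guard for list values that never occur in Col1 are additions for illustration, not part of the original answer (the one-liner above would raise an IndexError for such a value).

```python
import pandas as pd

sample_list = [101, 105, 112]
sample_df = pd.DataFrame(
    [[101, "NJ"], [105, "CA"], [111, "MO"], [101, "NJ"],
     [112, "NB"], [101, "NJ"], [105, "CA"]],
    columns=["Col1", "Col2"],
)

new_dict = {}
for value in sample_list:
    # Keep only the Col2 entries whose Col1 equals the current key.
    matches = sample_df.loc[sample_df["Col1"] == value, "Col2"]
    # Assumption: keys that never occur in Col1 are skipped rather than
    # raising, since .iloc[0] fails on an empty Series.
    if not matches.empty:
        new_dict[value] = [matches.iloc[0]]

print(new_dict)  # {101: ['NJ'], 105: ['CA'], 112: ['NB']}
```

This trades the compactness of the comprehension for an explicit place to decide what should happen to unmatched keys.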

Alt 1

If you insist, here is another solution that should be efficient: it uses isin() to create a mask that filters out the unwanted rows.

m = sample_df['Col1'].isin(sample_list)
sample_df[m].drop_duplicates().groupby('Col1')['Col2'].apply(list).to_dict()

Returns: {101: ['NJ'], 105: ['CA'], 112: ['NB']}

Note: if a key has more than one distinct Col2 value, all of them will be in the list. Use {k:[v] for k,v in sample_df[m].groupby('Col1')['Col2'].first().items()} if you only want the first.
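The difference described in the note only shows up when a key maps to more than one distinct value, which the sample data never does. The sketch below uses hypothetical data in which 101 appears with both "NJ" and "NY"; the frame df and the names unique_pairs and first_only are made up for illustration.

```python
import pandas as pd

sample_list = [101, 105, 112]
# Hypothetical data: 101 is paired with two different states.
df = pd.DataFrame(
    [[101, "NJ"], [105, "CA"], [111, "MO"], [101, "NY"],
     [112, "NB"], [101, "NJ"]],
    columns=["Col1", "Col2"],
)
m = df["Col1"].isin(sample_list)

# drop_duplicates() removes only fully identical (Col1, Col2) rows,
# so both states for 101 survive.
unique_pairs = (
    df[m].drop_duplicates().groupby("Col1")["Col2"].apply(list).to_dict()
)

# first() keeps a single value per key: the first one encountered.
first_only = {k: [v] for k, v in df[m].groupby("Col1")["Col2"].first().items()}

print(unique_pairs)  # {101: ['NJ', 'NY'], 105: ['CA'], 112: ['NB']}
print(first_only)    # {101: ['NJ'], 105: ['CA'], 112: ['NB']}
```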


Alt 2

If you want one item per key rather than all of them, why not store just the value instead of a one-element list?

m = sample_df['Col1'].isin(sample_list)
sample_df[m].set_index('Col1')['Col2'].to_dict()

Returns: {101: 'NJ', 105: 'CA', 112: 'NB'}
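One caveat with this variant: Col1 contains repeated keys, and when a Series with a duplicated index is converted with to_dict(), later rows overwrite earlier ones, so the last value per key is kept. That makes no difference for the sample data, where each key always has the same state. The sketch below uses hypothetical data (101 paired with "NJ" and then "NY") to show the effect, and one possible way to keep the first value instead; the names last_wins and first_wins are illustrative.

```python
import pandas as pd

sample_list = [101, 105, 112]
# Hypothetical data: 101 appears first with "NJ", later with "NY".
df = pd.DataFrame(
    [[101, "NJ"], [105, "CA"], [111, "MO"], [101, "NY"], [112, "NB"]],
    columns=["Col1", "Col2"],
)
m = df["Col1"].isin(sample_list)

# Duplicate index labels collapse into one dict key; the last row wins.
last_wins = df[m].set_index("Col1")["Col2"].to_dict()

# Dropping repeated Col1 values first (keep="first" is the default)
# keeps the earliest value for each key instead.
first_wins = (
    df[m].drop_duplicates(subset="Col1").set_index("Col1")["Col2"].to_dict()
)

print(last_wins)   # {101: 'NY', 105: 'CA', 112: 'NB'}
print(first_wins)  # {101: 'NJ', 105: 'CA', 112: 'NB'}
```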


Alt 3

Or, if you want all the items, including repeats:

m = sample_df['Col1'].isin(sample_list)
sample_df[m].groupby('Col1')['Col2'].apply(list).to_dict()

Returns: {101: ['NJ', 'NJ', 'NJ'], 105: ['CA', 'CA'], 112: ['NB']}
