简体   繁体   中英

How to copy unique keys and values from another dictionary in Python

I have a dataframe df with transactions where the values in the column Col can be repeated. I use Counter dictionary1 to count the frequency for each Col value, then I would like to run a for loop on a subset of the data and obtain a value pit . I want to create a new dictionary dict1 where the key is the key from dictionary1 and the value is the value of pit . This is the code I have so far:

dictionary1 = Counter(df['Col'])
dict1 = defaultdict(int)

for i in range(len(dictionary1)):       
    temp = df[df['Col'] == dictionary1.keys()[i]]
    b = temp['IsBuy'].sum()
    n = temp['IsBuy'].count()
    pit = b/n
    dict1[dictionary1.keys()[i]] = pit

My question is, how can i assign the key and value for dict1 based on the key of dictionary1 and the value obtained from the calculation of pit . In other words, what is the correct way to write the last line of code in the above script.

Thank you.

Since you're using pandas , I should point out that the problem you're facing is common enough that there's a built-in way to do it. We call collecting "similar" data into groups and then performing operations on them a groupby operation. It's probably wortwhile reading the tutorial section on the groupby split-apply-combine idiom -- there are lots of neat things you can do!

The pandorable way to compute the pit values would be something like

df.groupby("Col")["IsBuy"].mean()

For example:

>>> # make dummy data
>>> N = 10**4
>>> df = pd.DataFrame({"Col": np.random.randint(1, 10, N), "IsBuy": np.random.choice([True, False], N)})
>>> df.head()
   Col  IsBuy
0    3  False
1    6   True
2    6   True
3    1   True
4    5   True
>>> df.groupby("Col")["IsBuy"].mean()
Col
1      0.511709
2      0.495697
3      0.489796
4      0.510658
5      0.507491
6      0.513183
7      0.522936
8      0.488688
9      0.490498
Name: IsBuy, dtype: float64

which you could turn into a dictionary from a Series if you insisted:

>>> df.groupby("Col")["IsBuy"].mean().to_dict()
{1: 0.51170858629661753, 2: 0.49569707401032703, 3: 0.48979591836734693, 4: 0.51065801668211308, 5: 0.50749063670411987, 6: 0.51318267419962338, 7: 0.52293577981651373, 8: 0.48868778280542985, 9: 0.49049773755656106}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM