Fill NaN values in a pandas dataframe with values of another dataframe
Fill nan values with random value from another DataFrame pandas
I have a DataFrame with millions of rows and many NaN values. A sample:
index  Company    Area
0      Google     Technology
1      Coca Cola  Drinks
2      NaN        Drinks
3      Apple      Technology
4      NaN        Technology
5      Gatorade   Drinks
6      Dell       Technology
7      Apple      Technology
8      Coca Cola  Drinks
9      NaN        Drinks
10     Google     Technology
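My idea is to fill each NaN in Company with one of the two most common companies in its Area.
For example: if the most frequent companies in the Technology area are Apple and Google, I want to fill the NaN values where df['Area'] == 'Technology' with one of those two values, chosen at random.
I have already created a group-by DataFrame with the most common values, which looks like this: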
Area        Company
Technology  Google
Technology  Apple
Drinks      Coca Cola
Drinks      Pepsi
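For reference, a lookup table like the one above could be built roughly as follows. This is only a sketch under assumptions: the sample frame is the 11 rows shown earlier, and the names top2 and df1 are illustrative. Because Pepsi never appears in that small sample, the Drinks candidates come out as Coca Cola and Gatorade here, not Coca Cola and Pepsi as in the full data.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (11 rows only, not the full dataset).
df = pd.DataFrame({
    "Company": ["Google", "Coca Cola", np.nan, "Apple", np.nan, "Gatorade",
                "Dell", "Apple", "Coca Cola", np.nan, "Google"],
    "Area": ["Technology", "Drinks", "Drinks", "Technology", "Technology",
             "Drinks", "Technology", "Technology", "Drinks", "Drinks",
             "Technology"],
})

# For each Area, keep the two most frequent non-missing companies.
top2 = (
    df.dropna(subset=["Company"])
      .groupby("Area")["Company"]
      .apply(lambda s: s.value_counts().head(2).index.tolist())
)

# Flatten to one row per (Area, Company) pair, like the table in the question.
df1 = top2.explode().rename("Company").reset_index()
print(df1)
```

Ties in the counts are broken in whatever order value_counts returns them, so the order of the two companies within an Area should not be relied on.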
The result should look like this:
index  Company    Area
0      Google     Technology
1      Coca Cola  Drinks
2      Pepsi      Drinks
3      Apple      Technology
4      Google     Technology
5      Gatorade   Drinks
6      Dell       Technology
7      Apple      Technology
8      Coca Cola  Drinks
9      Pepsi      Drinks
10     Google     Technology
I hope you can help me.
Thanks!!!
I came up with this solution using random.choice:
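import random
# df1 is the lookup table (Area, Company); collect its candidates into one list per Area
candidates = df1.groupby('Area').Company.apply(list)
# look up the candidate list for every row of df, then pick one company at random
s = candidates.reindex(df.Area).apply(lambda x: random.choice(x))
s.index = df.index  # realign with df so fillna matches row by row
df.Company = df.Company.fillna(s)  # only the NaN entries are replaced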
df
Out[200]:
    index   Company        Area
0       0    Google  Technology
1       1  CocaCola      Drinks
2       2  CocaCola      Drinks
3       3     Apple  Technology
4       4    Google  Technology
5       5  Gatorade      Drinks
6       6      Dell  Technology
7       7     Apple  Technology
8       8  CocaCola      Drinks
9       9     Pepsi      Drinks
10     10    Google  Technology
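Since the draws are random, the filled values differ from run to run. If repeatable results matter, a variant along these lines might help. It is a sketch, not the answer's method: it assumes the sample data from the question, uses a seeded NumPy generator (the seed 0 is arbitrary), and draws only for the rows that are actually missing, which avoids a random draw per row on a frame with millions of rows. The names rng, choices and mask are illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame and lookup table as given in the question.
df = pd.DataFrame({
    "Company": ["Google", "Coca Cola", np.nan, "Apple", np.nan, "Gatorade",
                "Dell", "Apple", "Coca Cola", np.nan, "Google"],
    "Area": ["Technology", "Drinks", "Drinks", "Technology", "Technology",
             "Drinks", "Technology", "Technology", "Drinks", "Drinks",
             "Technology"],
})
df1 = pd.DataFrame({
    "Area": ["Technology", "Technology", "Drinks", "Drinks"],
    "Company": ["Google", "Apple", "Coca Cola", "Pepsi"],
})

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for repeatability

# One list of candidate companies per Area.
choices = df1.groupby("Area")["Company"].apply(list)

# Draw a replacement only for rows whose Company is missing.
mask = df["Company"].isna()
df.loc[mask, "Company"] = [
    str(rng.choice(choices[area])) for area in df.loc[mask, "Area"]
]
print(df)
```

This assumes every Area with a missing Company also appears in df1; an Area without candidates would raise a KeyError in the lookup.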