Pandas dataframe fillna by some value
I have this data:
import numpy as np
import pandas as pd
group = {'gender': ['male', 'female', 'female', 'male', 'female', 'male', 'male'],
'height': [175, 168, np.nan, 170, 167, np.nan, 190],
}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
df = pd.DataFrame(group, index=labels)
df2 = df.groupby('gender')['height'].mean()
and I want to fill the NaN values in height with the per-gender mean from df2.
Code:
import pandas as pd
import numpy as np
group = {'gender': ['male', 'female', 'female', 'male', 'female', 'male', 'male'],
'height': [175, 168, np.nan, 170, 167, np.nan, 190],
}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
df = pd.DataFrame(group, index=labels)
df2 = df.groupby('gender')['height'].mean()
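# Assign the result back: fillna(..., inplace=True) on a selected column
# may not update df in newer pandas versions.
df['height'] = df['height'].fillna(df['gender'].map(df2))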
# print(df2)
print(df)
Output:
   gender      height
a    male  175.000000
b  female  168.000000
c  female  167.500000
d    male  170.000000
e  female  167.000000
f    male  178.333333
g    male  190.000000
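To make the lookup step explicit, here is a minimal sketch of the same idea split into its parts. The names group_means, row_means and filled are illustrative only and do not appear in the original code; the data is the same as in the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "gender": ["male", "female", "female", "male", "female", "male", "male"],
        "height": [175, 168, np.nan, 170, 167, np.nan, 190],
    },
    index=list("abcdefg"),
)

# One mean per gender; NaN heights are skipped by mean() by default.
group_means = df.groupby("gender")["height"].mean()

# Map each row's gender label to its group mean (same index as df).
row_means = df["gender"].map(group_means)

# Fill only the missing heights and keep the result as a new Series.
filled = df["height"].fillna(row_means)
print(filled)
```

Rows that already have a height are left untouched, because fillna only uses row_means where height is NaN.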
You can use groupby + transform with mean, then fillna with the resulting Series. transform returns one value per original row, aligned on the index of df, so it can be passed straight to fillna:
means = df.groupby('gender')['height'].transform('mean')
df['height'] = df['height'].fillna(means)
print(df)
   gender      height
a    male  175.000000
b  female  168.000000
c  female  167.500000
d    male  170.000000
e  female  167.000000
f    male  178.333333
g    male  190.000000
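As a quick sanity check, the sketch below compares the transform approach with the map approach on the same data. It is only an illustration: the variable names via_transform and via_map are made up here, and np.allclose is used for the comparison as a cautious choice for floating-point values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "gender": ["male", "female", "female", "male", "female", "male", "male"],
        "height": [175, 168, np.nan, 170, 167, np.nan, 190],
    },
    index=list("abcdefg"),
)

# transform("mean") broadcasts each group's mean back onto every row,
# so the result has the same index as df.
means = df.groupby("gender")["height"].transform("mean")
via_transform = df["height"].fillna(means)

# The map approach: compute one mean per group, then look it up per row.
group_means = df.groupby("gender")["height"].mean()
via_map = df["height"].fillna(df["gender"].map(group_means))

same_index = means.index.equals(df.index)
same_values = np.allclose(via_transform, via_map)
print(same_index, same_values)
```

Both routes fill the gaps with the same per-gender means; transform simply skips the explicit lookup step.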