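Python DataFrame: group by one column and fill null values in another column using the last non-null value
I have a DataFrame that holds values for several names across multiple dates. I added a set of rows with null values for each name on a new date, and I want to fill the nulls in a column with the last non-null value recorded for that name.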
import pandas as pd

# Two dated observations per name; all values are stored as strings
data = {'name': ['Alex', 'Ben', 'Marry', 'Alex', 'Ben', 'Marry'],
        'job': ['teacher', 'doctor', 'engineer', 'teacher', 'doctor', 'engineer'],
        'age': ['27', '32', '78', '27', '32', '78'],
        'weight': ['160', '209', '130', '164', '206', '132'],
        'date': ['6-12-2022', '6-12-2022', '6-12-2022', '6-13-2022', '6-13-2022', '6-13-2022']
        }
df = pd.DataFrame(data)
df
After adding the null values:
|   |name   |job        |age|weight |date
|---|-------|-----------|---|-------|--------
|0 |Alex |teacher |27 |160 |6-12-2022
|1 |Ben |doctor |32 |209 |6-12-2022
|2 |Marry |engineer |78 |130 |6-12-2022
|3 |Alex |teacher |27 |164 |6-13-2022
|4 |Ben |doctor |32 |206 |6-13-2022
|5 |Marry |engineer |78 |132 |6-13-2022
|6 |Alex |NaN |NaN|NaN |6-14-2022
|7 |Ben |NaN |NaN|NaN |6-14-2022
|8 |Marry |NaN |NaN|NaN |6-14-2022
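The question does not show how the placeholder rows were created. One possible way, sketched here on the assumption that each distinct name gets a single new row dated 6-14-2022, is to build a small frame holding only name and date and concatenate it onto the original. pd.concat fills the columns missing from the new frame (job, age, weight) with NaN. The variable new_rows is a name introduced just for this illustration.

```python
import pandas as pd

data = {'name': ['Alex', 'Ben', 'Marry', 'Alex', 'Ben', 'Marry'],
        'job': ['teacher', 'doctor', 'engineer', 'teacher', 'doctor', 'engineer'],
        'age': ['27', '32', '78', '27', '32', '78'],
        'weight': ['160', '209', '130', '164', '206', '132'],
        'date': ['6-12-2022', '6-12-2022', '6-12-2022', '6-13-2022', '6-13-2022', '6-13-2022']}
df = pd.DataFrame(data)

# One placeholder row per distinct name for the new date (assumed value);
# only 'name' and 'date' are supplied, so the other columns will be NaN.
new_rows = pd.DataFrame({'name': df['name'].unique(), 'date': '6-14-2022'})

# ignore_index=True renumbers the rows 0..8 as in the table above.
df = pd.concat([df, new_rows], ignore_index=True)
print(df)
```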
Now I need to fill the null values in job and age with the last value entered for that name.
Thanks for your help.
If I understand correctly, you can use .groupby() followed by .ffill():
df[["job", "age", "weight"]] = df.groupby("name")[["job", "age", "weight"]].ffill()
print(df)
Prints:
name job age weight date
0 Alex teacher 27.0 160.0 6-12-2022
1 Ben doctor 32.0 209.0 6-12-2022
2 Marry engineer 78.0 130.0 6-12-2022
3 Alex teacher 27.0 164.0 6-13-2022
4 Ben doctor 32.0 206.0 6-13-2022
5 Marry engineer 78.0 132.0 6-13-2022
6 Alex teacher 27.0 164.0 6-14-2022
7 Ben doctor 32.0 206.0 6-14-2022
8 Marry engineer 78.0 132.0 6-14-2022
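Note that the snippet above also fills weight, while the question only asks for job and age. A variation, offered as a sketch rather than as part of the original answer, restricts the column list and first sorts by name and date, since ffill follows row order and "last value" only means "most recent" when each name's rows are in chronological order. The small frame below is made-up sample data with rows deliberately out of order, and converting date with pd.to_datetime is an assumption of this sketch.

```python
import numpy as np
import pandas as pd

# Made-up sample: the NaN row for Alex appears before his real observation.
df = pd.DataFrame({
    'name': ['Alex', 'Ben', 'Alex', 'Ben'],
    'job': [np.nan, 'doctor', 'teacher', np.nan],
    'age': [np.nan, '32', '27', np.nan],
    'weight': [np.nan, '209', '160', np.nan],
    'date': ['6-14-2022', '6-12-2022', '6-12-2022', '6-14-2022'],
})

# Parse the month-day-year strings so sorting is chronological, not lexical.
df['date'] = pd.to_datetime(df['date'], format='%m-%d-%Y')
df = df.sort_values(['name', 'date']).reset_index(drop=True)

# Forward-fill only the requested columns within each name; weight is untouched.
cols = ['job', 'age']
df[cols] = df.groupby('name')[cols].ffill()
print(df)
```

Here job and age are carried forward for each name, while weight stays NaN on the 6-14-2022 rows.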