
Pandas groupby: fill missing values from other group members

I think this is best shown with an example. What I'm trying to do is find the non-null number in a group and propagate it to the rest of the group.

In [52]: df = pd.DataFrame.from_dict({1:{'i_id': 2, 'i_num':1}, 2: {'i_id': 2, 'i_num': np.nan}, 3: {'i_id': 2, 'i_num': np.nan}, 4: {'i_id': 3, 'i_num': np.nan}, 5: {'i_id': 3, 'i_num': 5}}, orient='index')

In [53]: df
Out[53]:
   i_num  i_id
1      1     2
2    NaN     2
3    NaN     2
4    NaN     3
5      5     3

The DataFrame would look something like this. What I want is to take all the rows with i_id == 2 and set their i_num to 1, and all the rows with i_id == 3 and set their i_num to 5 (so each row matches the non-null value of its group neighbors).

So the end result would be this:

   i_num  i_id
1      1     2
2      1     2
3      1     2
4      5     3
5      5     3

The groupby aggregation first returns the first non-null value in each group. You can use it to fill in the other values in each group like this:

df['i_num'] = df.groupby('i_id')['i_num'].transform('first')

This produces the column as required:

   i_num  i_id
1      1     2
2      1     2
3      1     2
4      5     3
5      5     3

Bear in mind that this replaces every value in the group with the group's first non-null value, not just the NaN values (though that seems to be what you're looking for here).
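
To make that caveat concrete, here is a minimal sketch on a hypothetical frame (demo is not from the question; its group 2 is assumed to hold two different non-null values, 1 and 7). With transform('first'), the 7 is overwritten along with the NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical data: group 2 contains two different non-null values (1 and 7).
demo = pd.DataFrame(
    {"i_id": [2, 2, 2, 3, 3], "i_num": [1, np.nan, 7, np.nan, 5]},
    index=[1, 2, 3, 4, 5],
)

# transform('first') broadcasts each group's first non-null value to every row,
# so the existing 7 in group 2 is replaced by 1.
overwritten = demo.groupby("i_id")["i_num"].transform("first")
print(overwritten.tolist())  # [1.0, 1.0, 1.0, 5.0, 5.0]
```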

Alternatively - and to respect any other non-null values in the group - you can use fillna in the following way:

# make a column of first values for each group
x = df['i_id'].map(df.groupby('i_id')['i_num'].first())
# fill only NaN values using new column x
df['i_num'] = df['i_num'].fillna(x)
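
As a rough check of the fillna approach, the sketch below runs the same two steps on a hypothetical frame (demo is an assumption, not data from the question) whose group 2 already holds a second non-null value, 7. Only the NaN entries change:

```python
import numpy as np
import pandas as pd

# Hypothetical data: group 2 contains two different non-null values (1 and 7).
demo = pd.DataFrame(
    {"i_id": [2, 2, 2, 3, 3], "i_num": [1, np.nan, 7, np.nan, 5]},
    index=[1, 2, 3, 4, 5],
)

# One first non-null value per group, looked up for every row by its i_id.
x = demo["i_id"].map(demo.groupby("i_id")["i_num"].first())

# fillna touches only the missing entries, so the existing 7 is kept.
filled = demo["i_num"].fillna(x)
print(filled.tolist())  # [1.0, 1.0, 7.0, 5.0, 5.0]
```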
