Pandas dataframe merging rows to remove NaN

I have a dataframe with some NaNs:

hostname  period    Teff
51 Peg    4.2293    5773
51 Peg    4.231     NaN
51 Peg    4.23077   NaN
55 Cnc    44.3787   NaN
55 Cnc    44.373    NaN
55 Cnc    44.4175   NaN
55 Cnc    NaN       5234
61 Vir    NaN       5577
61 Vir    38.021    NaN
61 Vir    123.01    NaN

The rows with the same "hostname" all refer to the same object, but as you can see, some entries have NaNs under various columns. I'd like to merge all the rows under the same hostname so that I retain the first finite value in each column (and drop the row if all values are NaN). So the result should look like this:

hostname  period    Teff
51 Peg    4.2293    5773
55 Cnc    44.3787   5234
61 Vir    38.021    5577

How would you go about doing this?

Use groupby.first; it takes the first non-NA value in each column of each group:

df.groupby('hostname')[['period', 'Teff']].first().reset_index()
#  hostname   period  Teff
#0      Cnc  44.3787  5234
#1      Peg   4.2293  5773
#2      Vir  38.0210  5577
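
For anyone who wants to try this without the original data file, here is a minimal self-contained sketch. It assumes the table from the question is typed in by hand with the full hostnames (for example "51 Peg"), so the hostname column in the result differs from the output shown above; the variable name merged is only for illustration.

```python
import numpy as np
import pandas as pd

# Data from the question, entered by hand with the full hostnames.
df = pd.DataFrame({
    "hostname": ["51 Peg"] * 3 + ["55 Cnc"] * 4 + ["61 Vir"] * 3,
    "period": [4.2293, 4.231, 4.23077,
               44.3787, 44.373, 44.4175, np.nan,
               np.nan, 38.021, 123.01],
    "Teff": [5773, np.nan, np.nan,
             np.nan, np.nan, np.nan, 5234,
             5577, np.nan, np.nan],
})

# first() skips NaN within each group, column by column.
merged = df.groupby("hostname")[["period", "Teff"]].first().reset_index()
print(merged)
```

Because Teff contains NaN, the column is stored as float, so its values come back as floats rather than integers.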

Or manually do this with a custom aggregation function:

df.groupby('hostname')[['period', 'Teff']].agg(lambda x: x.dropna().iat[0]).reset_index()

This requires that each group have at least one non-NA value in every aggregated column; otherwise dropna() returns an empty Series and .iat[0] raises an IndexError.

Write your own function to handle the edge case:

import numpy as np

def first_(g):
    # Return the first non-NA value in the group, or NaN if there is none.
    non_na = g.dropna()
    return non_na.iat[0] if len(non_na) > 0 else np.nan

df.groupby('hostname')[['period', 'Teff']].agg(first_).reset_index()

#  hostname   period  Teff
#0      Cnc  44.3787  5234
#1      Peg   4.2293  5773
#2      Vir  38.0210  5577
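
The question also asks to drop a row when all of its values are NaN. One possible sketch of that step is shown below, using made-up data: the hostnames "host A" and "host B" and the names cols and merged are hypothetical and not from the original post. It relies on groupby.first returning NaN for a group with no non-NA values, followed by dropna with how="all".

```python
import numpy as np
import pandas as pd

# Hypothetical data: "host B" has no finite value in either column.
df = pd.DataFrame({
    "hostname": ["host A", "host A", "host B", "host B"],
    "period": [np.nan, 1.5, np.nan, np.nan],
    "Teff": [5000.0, np.nan, np.nan, np.nan],
})

cols = ["period", "Teff"]

# An all-NaN group simply yields NaN here instead of raising an error.
merged = df.groupby("hostname")[cols].first()

# Drop hosts whose merged columns are all still NaN, then restore hostname as a column.
merged = merged.dropna(how="all", subset=cols).reset_index()
print(merged)
```

Using how="all" keeps hosts that have a value in at least one column; switching to how="any" would keep only hosts with a value in every column.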

Is this what you need?

pd.concat([ df1.apply(lambda x: sorted(x, key=pd.isnull)) for _, df1 in df.groupby('hostname')]).dropna()
Out[343]: 
   hostname   period    Teff
55      Cnc  44.3787  5234.0
51      Peg   4.2293  5773.0
61      Vir  38.0210  5577.0
