Pandas - Replacing NaN by aggregate of non-null values

Suppose I have a DataFrame with some NaN values -

import pandas as pd
l = [{'C1':-6,'C3':2},
     {'C2':-6,'C3':3},
     {'C1':-6.3,'C2':8,'C3':9},
     {'C2':-7}]
df1 = pd.DataFrame(l,
    index=['R1','R2','R3','R4'],
    columns=['C1','C2','C3'])  # pin the column order; recent pandas keeps dict insertion order
print(df1)

     C1   C2   C3
R1 -6.0  NaN  2.0
R2  NaN -6.0  3.0
R3 -6.3  8.0  9.0
R4  NaN -7.0  NaN

Problem - If any cell in a row is NaN, it has to be replaced by the aggregate (here, the mean) of the non-null values from the same row. For instance, in the first row, the value of (R1,C2) should be (-6+2)/2 = -2.

Expected output -

     C1   C2   C3
R1 -6.0 -2.0  2.0
R2 -1.5 -6.0  3.0
R3 -6.3  8.0  9.0
R4 -7.0 -7.0 -7.0
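
For reference, the per-row aggregate in the worked example is the mean of the non-null values. A minimal sketch of how those row means can be computed, assuming the same data as above (the names `rows` and `row_means` are only illustrative, and `columns=` is passed to keep a fixed column order):

```python
import pandas as pd

rows = [{'C1': -6, 'C3': 2},
        {'C2': -6, 'C3': 3},
        {'C1': -6.3, 'C2': 8, 'C3': 9},
        {'C2': -7}]
df1 = pd.DataFrame(rows,
                   index=['R1', 'R2', 'R3', 'R4'],
                   columns=['C1', 'C2', 'C3'])

# mean(axis=1) aggregates across the columns of each row;
# NaN cells are skipped by default (skipna=True)
row_means = df1.mean(axis=1)
print(row_means)
```

Here `row_means['R1']` is (-6+2)/2 = -2, which is the value the NaN at (R1,C2) should take.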

Use apply with axis=1 to process the DataFrame row by row:

df1 = df1.apply(lambda x: x.fillna(x.mean()), axis=1)
print(df1)

     C1   C2   C3
R1 -6.0 -2.0  2.0
R2 -1.5 -6.0  3.0
R3 -6.3  8.0  9.0
R4 -7.0 -7.0 -7.0
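
One edge case worth knowing with this approach: a row that contains only NaN has a mean of NaN, so it is left unchanged. A small sketch, assuming a hypothetical helper `fill_row_mean` and a made-up all-NaN row `R5` that is not part of the original data:

```python
import numpy as np
import pandas as pd

def fill_row_mean(frame: pd.DataFrame) -> pd.DataFrame:
    """Fill each NaN with the mean of the non-null values in its row."""
    return frame.apply(lambda row: row.fillna(row.mean()), axis=1)

df = pd.DataFrame(
    {'C1': [-6.0, np.nan, np.nan],
     'C2': [np.nan, -6.0, np.nan],
     'C3': [2.0, 3.0, np.nan]},
    index=['R1', 'R2', 'R5'],  # R5 is a hypothetical all-NaN row
)

filled = fill_row_mean(df)
print(filled)
```

If all-NaN rows need a fallback value, a second `fillna` on the result is one option.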

This also works:

df1 = df1.T.fillna(df1.mean(1)).T
print(df1)
     C1   C2   C3
R1 -6.0 -2.0  2.0
R2 -1.5 -6.0  3.0
R3 -6.3  8.0  9.0
R4 -7.0 -7.0 -7.0

The transpose is needed because calling fillna with a Series and axis=1 directly raises an error:

df1 = df1.fillna(df1.mean(1), axis=1)
print(df1)

NotImplementedError: Currently only can fill with dict/Series column by column
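
Since fillna cannot take a per-row Series with axis=1, a transpose-free alternative is `DataFrame.mask` with `axis=0`, which aligns the replacement Series on the row index. This is a sketch of a different technique than the ones in the answers here, under the assumption that the same sample data is used (`rows`, `row_means` and `filled` are illustrative names):

```python
import pandas as pd

rows = [{'C1': -6, 'C3': 2},
        {'C2': -6, 'C3': 3},
        {'C1': -6.3, 'C2': 8, 'C3': 9},
        {'C2': -7}]
df1 = pd.DataFrame(rows,
                   index=['R1', 'R2', 'R3', 'R4'],
                   columns=['C1', 'C2', 'C3'])

# Mean of the non-null values in each row
row_means = df1.mean(axis=1)

# Where a cell is NaN, take the value from row_means;
# axis=0 aligns row_means with the row labels R1..R4
filled = df1.mask(df1.isna(), row_means, axis=0)
print(filled)
```

The equivalent spelling with `where` is `df1.where(df1.notna(), row_means, axis=0)`.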

You could do this: transpose, call fillna(), then transpose back.

>>> df1 = df1.T.fillna(df1.mean(axis=1)).T
>>> print(df1)
     C1   C2   C3
R1 -6.0 -2.0  2.0
R2 -1.5 -6.0  3.0
R3 -6.3  8.0  9.0
R4 -7.0 -7.0 -7.0
