Move last non-null value of each row - Pandas

I'm working on a DataFrame that comes from a NoSQL table, which means the rows don't all have the same length. I need to retrieve the last non-null value of each row, move it to a new column 'h', and remove it from its initial position.

My initial DataFrame is:

      a           b     c     d   e     f     g
0  1635  01/01/2018  Null  Null  95   120    80
1  7364  01/15/2018   178   182  99  Null  Null
2  8947  01/20/2018  Null   190  92  Null  Null
3  6473  01/24/2018    45   122  99    32  Null

And I'd like to get this result:

      a           b     c     d     e     f     g   h
0  1635  01/01/2018  Null  Null    95   120  Null  80
1  7364  01/15/2018   178   182  Null  Null  Null  99
2  8947  01/20/2018  Null   190  Null  Null  Null  92
3  6473  01/24/2018    45   122    99  Null  Null  32

Use DataFrame.ne along with DataFrame.cumsum and DataFrame.idxmax along axis=1 to get the columns containing the last non-null value, then use DataFrame.lookup to get the values corresponding to cols. Note that DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the second line only runs on older versions:

cols = df.ne('Null').cumsum(axis=1).idxmax(axis=1)
df['h'] = df.lookup(df.index, cols)

Result:

# print(df)
      a           b     c     d   e     f     g   h
0  1635  01/01/2018  Null  Null  95   120    80  80
1  7364  01/15/2018   178   182  99  Null  Null  99
2  8947  01/20/2018  Null   190  92  Null  Null  92
3  6473  01/24/2018    45   122  99    32  Null  32
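
The result above copies the value into h but leaves it in its original cell, and DataFrame.lookup is not available in pandas 2.0 or later. The following is a minimal sketch, not the answerer's own code, of one way to do the same lookup with NumPy positional indexing and then clear the original cell. It assumes the missing entries are the literal string "Null", as in the question; the names row_pos, col_pos, values, last_values and result are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data from the question; missing entries are the literal string "Null"
df = pd.DataFrame({
    "a": [1635, 7364, 8947, 6473],
    "b": ["01/01/2018", "01/15/2018", "01/20/2018", "01/24/2018"],
    "c": ["Null", 178, "Null", 45],
    "d": ["Null", 182, 190, 122],
    "e": [95, 99, 92, 99],
    "f": [120, "Null", "Null", 32],
    "g": [80, "Null", "Null", "Null"],
})

# Label of the last non-"Null" column in each row (same idea as the answer)
cols = df.ne("Null").cumsum(axis=1).idxmax(axis=1)

# Positional row/column indices, used in place of DataFrame.lookup
row_pos = np.arange(len(df))
col_pos = df.columns.get_indexer(cols)

# Work on an object-dtype copy so the original df is left untouched
values = df.to_numpy(dtype=object, copy=True)
last_values = values[row_pos, col_pos].copy()

# Clear the cell the value came from
values[row_pos, col_pos] = "Null"

result = pd.DataFrame(values, index=df.index, columns=df.columns)
result["h"] = last_values
print(result)
```

Because the values pass through an object array, every column in result has object dtype; convert columns back afterwards if numeric dtypes matter.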

As another solution, you can use last_valid_index. However, you first have to convert all the Null values to np.nan.

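import numpy as np

# Replace the "Null" placeholder strings with real missing values
df[df == "Null"] = np.nan

# For each row, take the value at the label of its last non-missing entry
df["h"] = df.apply(lambda x: x[x.last_valid_index()], axis=1)
df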

Output:

      a           b    c    d   e    f    g   h
0  1635  01/01/2018  NaN  NaN  95  120   80  80
1  7364  01/15/2018  178  182  99  NaN  NaN  99
2  8947  01/20/2018  NaN  190  92  NaN  NaN  92
3  6473  01/24/2018   45  122  99   32  NaN  32
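
This approach also leaves the value in its original column. As a rough sketch of how the last_valid_index idea could be extended to remove it as well, the row-wise helper below (a hypothetical name, move_last_valid, not part of pandas) copies the last non-missing value into h and then blanks the source cell. It assumes every row has at least one non-missing value, which holds here because a and b are always filled.

```python
import numpy as np
import pandas as pd

# Sample data from the question; missing entries are the literal string "Null"
df = pd.DataFrame({
    "a": [1635, 7364, 8947, 6473],
    "b": ["01/01/2018", "01/15/2018", "01/20/2018", "01/24/2018"],
    "c": ["Null", 178, "Null", 45],
    "d": ["Null", 182, 190, 122],
    "e": [95, 99, 92, 99],
    "f": [120, "Null", "Null", 32],
    "g": [80, "Null", "Null", "Null"],
})

# Replace the "Null" placeholders with real missing values
df[df == "Null"] = np.nan


def move_last_valid(row):
    """Return a copy of the row with its last non-missing value moved to 'h'."""
    last = row.last_valid_index()  # assumed not None: each row has a value
    row = row.copy()
    row["h"] = row[last]
    row[last] = np.nan  # clear the cell the value came from
    return row


result = df.apply(move_last_valid, axis=1)
print(result)
```

Row-wise apply is easy to read but slow on large frames; for many rows, a vectorized lookup by position is likely to scale better.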
