Pandas: get the value from the highest-numbered non-null column in each row of a DataFrame with a variable number of columns

Question

I have a DataFrame with the following sample data, where the number of columns named in the Col.x format is unknown:

Col.1,Col.2,Col.3
Val1, 
Val2,Val3
Val3,
Val4,Val2,Val3

I need a separate column whose value is taken from the non-null column with the largest x. For example:

Col.1,Col.2,Col.3,Latest
Val1,,,Val1
Val2,Val3,,Val3
Val3,,,Val3
Val4,Val2,Val3,Val3

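I was able to solve this with the code below, but that solution a) depends on knowing the exact column names and b) does not handle a variable number of columns in a scalable way: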

df["Latest"] = np.where(df["Col.3"].isnull(),np.where(df["Col.2"].isnull(),df["Col.1"],df["Col.2"]),df["Col.3"])

I can solve a) with...

cols = [col for col in df.columns if 'Col' in col]

...but I need help with part b).
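
For reference, the setup described in the question can be reproduced roughly as follows. This is only a sketch: it assumes the empty cells in the sample are read as NaN, and it builds the frame by hand rather than parsing the ragged sample text.

```python
import numpy as np
import pandas as pd

# Sample data from the question; empty cells are assumed to be NaN.
df = pd.DataFrame({
    "Col.1": ["Val1", "Val2", "Val3", "Val4"],
    "Col.2": [np.nan, "Val3", np.nan, "Val2"],
    "Col.3": [np.nan, np.nan, np.nan, "Val3"],
})

# Part a): collect the candidate columns without hard-coding their names.
cols = [col for col in df.columns if "Col" in col]

# The hard-coded nested np.where from the question, which only works
# for exactly three columns with these exact names.
df["Latest"] = np.where(
    df["Col.3"].isnull(),
    np.where(df["Col.2"].isnull(), df["Col.1"], df["Col.2"]),
    df["Col.3"],
)
```

The nesting depth grows with every extra column, which is the scalability problem in part b).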

Answer 1

We can use filter to extract certain columns. Its like and regex parameters are two powerful options for this.

Given:

    Col1  Col2  Col3  Ignore_me
0   18.0   NaN  40.0       82.0
1    6.0   NaN   NaN       92.0
2  100.0   NaN  19.0       43.0
3   38.0  98.0   NaN        8.0

Doing:

df['Latest'] = (df[df.filter(like='Col') # Using filter to select certain columns.
                     .columns
                     .sort_values(ascending=False)] # Sort them descending.
                  .bfill(axis=1) # backfill values
                  .iloc[:,0]) # take the first column, 
                              # This has the first non-nan value.

Output; note that Ignore_me was not used:

    Col1  Col2  Col3  Ignore_me  Latest
0   18.0   NaN  40.0       82.0    40.0
1    6.0   NaN   NaN       92.0     6.0
2  100.0   NaN  19.0       43.0    19.0
3   38.0  98.0   NaN        8.0    98.0
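
One caveat with sorting the column names as strings: the order is lexicographic, so a name like Col10 would sort before Col2. A possible variant, sketched below, selects columns with regex and sorts them by their numeric suffix instead. The Col10 column and the pattern used here are illustrative assumptions, not part of the original data.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical frame with a two-digit suffix to show the ordering issue.
df = pd.DataFrame({
    "Col1": [18.0, 6.0, 100.0, 38.0],
    "Col2": [np.nan, np.nan, np.nan, 98.0],
    "Col10": [40.0, np.nan, 19.0, np.nan],
    "Ignore_me": [82.0, 92.0, 43.0, 8.0],
})

# regex keeps only names of the form "Col<digits>", so Ignore_me is skipped.
value_cols = df.filter(regex=r"^Col\d+$").columns

# Sort by the integer suffix, highest first, rather than alphabetically.
ordered = sorted(
    value_cols,
    key=lambda name: int(re.search(r"\d+$", name).group()),
    reverse=True,
)

# Backfill across each row, then take the first column, which now holds
# the value from the highest-numbered non-null column.
df["Latest"] = df[ordered].bfill(axis=1).iloc[:, 0]
```

With a plain descending string sort, the order would be Col2, Col10, Col1, and row 3 would happen to give the same answer while other data could silently pick the wrong column.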

Answer 2

Use fillna together with functools.reduce:

# sort column names by suffix in reverse order
cols = sorted(
   (col for col in df.columns if col.startswith('Col')), 
   key=lambda col: -int(col.split('.')[1])
)
cols
# ['Col.3', 'Col.2', 'Col.1']

from functools import reduce
df['Latest'] = reduce(lambda x, y: x.fillna(y), [df[col] for col in cols])

df
#  Col.1 Col.2 Col.3 Latest
#0  Val1   NaN   NaN   Val1
#1  Val2   NaN  Val3   Val3
#2  Val3   NaN   NaN   Val3
#3  Val4  Val2  Val3   Val3
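
The same idea can be wrapped in a small reusable function. The sketch below is one way to do it; the function name latest_non_null and its prefix parameter are hypothetical, and the input frame is rebuilt from the question's expected output with empty cells assumed to be NaN.

```python
from functools import reduce

import numpy as np
import pandas as pd


def latest_non_null(df, prefix="Col."):
    """Return, per row, the value of the highest-numbered non-null column.

    Only columns named <prefix><integer> are considered (an assumption).
    """
    cols = sorted(
        (c for c in df.columns
         if c.startswith(prefix) and c[len(prefix):].isdigit()),
        key=lambda c: int(c[len(prefix):]),
        reverse=True,
    )
    if not cols:
        raise ValueError(f"no columns found with prefix {prefix!r}")
    # Start from the highest-numbered column and fill its gaps from
    # each lower-numbered column in turn.
    return reduce(lambda acc, nxt: acc.fillna(nxt), (df[c] for c in cols))


df = pd.DataFrame({
    "Col.1": ["Val1", "Val2", "Val3", "Val4"],
    "Col.2": [np.nan, "Val3", np.nan, "Val2"],
    "Col.3": [np.nan, np.nan, np.nan, "Val3"],
})
df["Latest"] = latest_non_null(df)
```

Because the suffix is compared as an integer, the function keeps working if more Col.x columns are added later, including two-digit ones.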


Solution 1
1 2022-07-16 18:48:53

Solution 2
0 2022-07-16 18:08:16
