Find the non-zero result value in a group of columns

Question

I have a DataFrame as below:

    SYS     Date_Time                Col_1 Col_2   Col_3  Col_4   Col_5  Col_6
0   SYSTEM1 2021-01-07 09:15:00      0     0       0      0       Y      0
1   SYSTEM1 2021-01-07 09:20:00      0     0       0      0       0      0
2   SYSTEM1 2021-01-07 09:25:00      R     0       0      0       0      0
3   SYSTEM1 2021-01-07 09:30:00      0     0       0      0       0      0
4   SYSTEM1 2021-01-07 09:35:00      0     0       0      0       0      0
5   SYSTEM1 2021-01-07 09:40:00      0     R       0      0       0      0
6   SYSTEM1 2021-01-07 09:45:00      0     0       0      0       0      0
7   SYSTEM1 2021-01-07 09:50:00      0     0       0      0       0      0
8   SYSTEM1 2021-01-07 09:55:00      0     0       0      0       0      0
9   SYSTEM1 2021-01-07 10:00:00      0     0       0      0       0      0
10  SYSTEM1 2021-01-07 10:05:00      0     0       0      0       0      0
11  SYSTEM1 2021-01-07 10:10:00      0     0       0      0       0      0
12  SYSTEM1 2021-01-07 10:15:00      0     0       0      0       0      G
13  SYSTEM1 2021-01-07 10:20:00      0     0       0      0       R      0
14  SYSTEM1 2021-01-07 10:25:00      0     0       0      0       0      0

I need to find the result color across the column group (Col_1, Col_2, Col_3, Col_4, Col_5, Col_6), i.e. the one value in that group that is not zero.

Two conditions are possible in the above DataFrame:

Only one of the 6 columns is non-zero.
If all columns are zero, then the result is zero.

I want the output as below:

    SYS     Date_Time                Col_1 Col_2   Col_3  Col_4   Col_5  Col_6 Result
0   SYSTEM1 2021-01-07 09:15:00      0     0       0      0       Y      0     Y
1   SYSTEM1 2021-01-07 09:20:00      0     0       0      0       0      0     0
2   SYSTEM1 2021-01-07 09:25:00      R     0       0      0       0      0     R
3   SYSTEM1 2021-01-07 09:30:00      0     0       0      0       0      0     0
4   SYSTEM1 2021-01-07 09:35:00      0     0       0      0       0      0     0
5   SYSTEM1 2021-01-07 09:40:00      0     R       0      0       0      0     R
6   SYSTEM1 2021-01-07 09:45:00      0     0       0      0       0      0     0
7   SYSTEM1 2021-01-07 09:50:00      0     0       0      0       0      0     0
8   SYSTEM1 2021-01-07 09:55:00      0     0       0      0       0      0     0
9   SYSTEM1 2021-01-07 10:00:00      0     0       0      0       0      0     0
10  SYSTEM1 2021-01-07 10:05:00      0     0       0      0       0      0     0
11  SYSTEM1 2021-01-07 10:10:00      0     0       0      0       0      0     0
12  SYSTEM1 2021-01-07 10:15:00      0     0       0      0       0      G     G
13  SYSTEM1 2021-01-07 10:20:00      0     0       0      0       R      0     R
14  SYSTEM1 2021-01-07 10:25:00      0     0       0      0       0      0     0

Answer 1

You can get it using the following code:

Join all the columns that have 'Col_' in their name, using a lambda function in apply.
Replace every non-alphabetic character with '', which keeps only the letters.
Lastly, replace '' with 0 to get exactly your output.

df['result'] = df[[c for c in df.columns if 'Col_' in c]].apply(lambda row: ''.join(row.values.astype(str)), axis=1).str.replace('[^a-zA-Z]', '', regex=True).replace('', 0)

which prints:

        SYS  Date_Time Col_1 Col_2  Col_3  Col_4 Col_5 Col_6 result
0   SYSTEM1 2021-01-07     0     0      0      0     Y     0      Y
1   SYSTEM1 2021-01-07     0     0      0      0     0     0      0
2   SYSTEM1 2021-01-07     R     0      0      0     0     0      R
3   SYSTEM1 2021-01-07     0     0      0      0     0     0      0
4   SYSTEM1 2021-01-07     0     0      0      0     0     0      0
5   SYSTEM1 2021-01-07     0     R      0      0     0     0      R
6   SYSTEM1 2021-01-07     0     0      0      0     0     0      0
7   SYSTEM1 2021-01-07     0     0      0      0     0     0      0
8   SYSTEM1 2021-01-07     0     0      0      0     0     0      0
9   SYSTEM1 2021-01-07     0     0      0      0     0     0      0
10  SYSTEM1 2021-01-07     0     0      0      0     0     0      0
11  SYSTEM1 2021-01-07     0     0      0      0     0     0      0
12  SYSTEM1 2021-01-07     0     0      0      0     0     G      G
13  SYSTEM1 2021-01-07     0     0      0      0     R     0      R
14  SYSTEM1 2021-01-07     0     0      0      0     0     0      0

There is probably a better, more Pythonic way to do this, but this is a one-liner and does the trick.
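For readers who find the one-liner hard to follow, here is a minimal sketch of the same three steps spelled out one at a time. The small frame and the intermediate names (cols, joined, letters, result) are made up for illustration, and the cast to object before the final replace is an extra precaution of this sketch, not part of the answer above.

```python
import pandas as pd

# Hypothetical miniature of the question's frame: zeros are ints, colors are strings.
df = pd.DataFrame({
    "SYS": ["SYSTEM1"] * 4,
    "Col_1": [0, 0, "R", 0],
    "Col_2": ["Y", 0, 0, 0],
    "Col_3": [0, 0, 0, "G"],
})

# Step 1: pick the columns whose name contains 'Col_'.
cols = [c for c in df.columns if "Col_" in c]

# Step 2: join each row's values into a single string, e.g. "0Y0".
joined = df[cols].astype(str).apply(lambda row: "".join(row), axis=1)

# Step 3: drop everything that is not a letter; all-zero rows become "".
letters = joined.str.replace(r"[^a-zA-Z]", "", regex=True)

# Step 4: turn the empty strings back into the integer 0.
result = letters.astype(object).replace("", 0)

df["result"] = result
print(df)
```

Passing regex=True explicitly matters on newer pandas versions, where Series.str.replace treats the pattern as a literal string by default.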

Answer 2

You can filter the Col-like columns, then change the dtype of these columns to str and take the max along axis=1. The idea used here is that max('0', some_letter) always returns some_letter, because '0' sorts before every letter in string comparison:

m = df.filter(like='Col').astype(str).max(axis=1)
df['Result'] = m.where(m.ne('0'), 0) # replace '0' with 0

0     Y
1     0
2     R
3     0
4     0
5     R
6     0
7     0
8     0
9     0
10    0
11    0
12    G
13    R
14    0
Name: Result, dtype: object
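
The string-comparison idea can be checked on its own before trusting it on the full frame. The following is only a sketch on a made-up three-row frame. It assumes, as the question states, that at most one column per row is non-zero and that the non-zero values are letters.

```python
import pandas as pd

# Plain Python: '0' has code point 48 and 'A' has 65, so any letter wins the max.
assert ord("0") < ord("A")
assert max("0", "R") == "R"

# Hypothetical rows: one with 'Y', one with 'R', one all zeros.
df = pd.DataFrame({
    "Col_1": [0, "R", 0],
    "Col_2": ["Y", 0, 0],
    "Col_3": [0, 0, 0],
})

# Row-wise max over the string versions of the Col columns.
m = df.filter(like="Col").astype(str).max(axis=1)

# Rows whose max is still the string '0' had no color, so map them to the integer 0.
result = m.where(m.ne("0"), 0)
print(result.tolist())
```

If a row could hold two different colors, the max would silently keep the alphabetically later one, so this approach leans on the one-non-zero-per-row guarantee.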

Answer 1
1 · Accepted · 2021-01-10 14:41:44

Answer 2
1 · 2021-01-10 14:48:01
