Find the non-zero value in a group of columns

I have a DataFrame as below:

    SYS     Date_Time                Col_1 Col_2   Col_3  Col_4   Col_5  Col_6
0   SYSTEM1 2021-01-07 09:15:00      0     0       0      0       Y      0
1   SYSTEM1 2021-01-07 09:20:00      0     0       0      0       0      0
2   SYSTEM1 2021-01-07 09:25:00      R     0       0      0       0      0
3   SYSTEM1 2021-01-07 09:30:00      0     0       0      0       0      0
4   SYSTEM1 2021-01-07 09:35:00      0     0       0      0       0      0
5   SYSTEM1 2021-01-07 09:40:00      0     R       0      0       0      0
6   SYSTEM1 2021-01-07 09:45:00      0     0       0      0       0      0
7   SYSTEM1 2021-01-07 09:50:00      0     0       0      0       0      0
8   SYSTEM1 2021-01-07 09:55:00      0     0       0      0       0      0
9   SYSTEM1 2021-01-07 10:00:00      0     0       0      0       0      0
10  SYSTEM1 2021-01-07 10:05:00      0     0       0      0       0      0
11  SYSTEM1 2021-01-07 10:10:00      0     0       0      0       0      0
12  SYSTEM1 2021-01-07 10:15:00      0     0       0      0       0      G
13  SYSTEM1 2021-01-07 10:20:00      0     0       0      0       R      0
14  SYSTEM1 2021-01-07 10:25:00      0     0       0      0       0      0

I need to find the resulting color in the column group (Col_1, Col_2, Col_3, Col_4, Col_5, Col_6), that is, the value in that group that is not zero.

Two conditions can occur in the above DataFrame:

  1. Only one of the 6 columns is non-zero.
  2. If all columns are zero, the result is zero.
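
Both conditions together mean that every row holds at most one non-zero cell. A minimal sketch of how that could be checked before computing the result, assuming a small hypothetical sample with the same column naming (the names cols, nonzero_per_row and assumption_holds are illustrative only):

```python
import pandas as pd

# Hypothetical three-row sample using the same 'Col_' naming as the frame above.
df = pd.DataFrame({
    "Col_1": ["0", "0", "R"],
    "Col_2": ["0", "0", "0"],
    "Col_3": ["Y", "0", "0"],
})

# Compare as strings so that the integer 0 and the string '0' are treated alike.
cols = df.filter(like="Col_").astype(str)

# Number of non-zero cells in each row.
nonzero_per_row = cols.ne("0").sum(axis=1)

# True when every row has zero or one non-zero cell.
assumption_holds = bool((nonzero_per_row <= 1).all())
print(nonzero_per_row.tolist(), assumption_holds)
```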

I want the output as below:

    SYS     Date_Time                Col_1 Col_2   Col_3  Col_4   Col_5  Col_6 Result
0   SYSTEM1 2021-01-07 09:15:00      0     0       0      0       Y      0     Y
1   SYSTEM1 2021-01-07 09:20:00      0     0       0      0       0      0     0
2   SYSTEM1 2021-01-07 09:25:00      R     0       0      0       0      0     R
3   SYSTEM1 2021-01-07 09:30:00      0     0       0      0       0      0     0
4   SYSTEM1 2021-01-07 09:35:00      0     0       0      0       0      0     0
5   SYSTEM1 2021-01-07 09:40:00      0     R       0      0       0      0     R
6   SYSTEM1 2021-01-07 09:45:00      0     0       0      0       0      0     0
7   SYSTEM1 2021-01-07 09:50:00      0     0       0      0       0      0     0
8   SYSTEM1 2021-01-07 09:55:00      0     0       0      0       0      0     0
9   SYSTEM1 2021-01-07 10:00:00      0     0       0      0       0      0     0
10  SYSTEM1 2021-01-07 10:05:00      0     0       0      0       0      0     0
11  SYSTEM1 2021-01-07 10:10:00      0     0       0      0       0      0     0
12  SYSTEM1 2021-01-07 10:15:00      0     0       0      0       0      G     G
13  SYSTEM1 2021-01-07 10:20:00      0     0       0      0       R      0     R
14  SYSTEM1 2021-01-07 10:25:00      0     0       0      0       0      0     0

You can get it using the following code:

  • Join all the columns that have 'Col_' in their name, using a lambda function in apply.
  • Replace every non-alphabetic character with '', which keeps only the letters.
  • Lastly, replace '' with 0 to get exactly your output.

# regex=True is needed on pandas 2.0 and later, where str.replace treats the pattern as a literal string by default
df['result'] = df[[c for c in df.columns if 'Col_' in c]].apply(lambda row: ''.join(row.values.astype(str)), axis=1).str.replace('[^a-zA-Z]', '', regex=True).replace('', 0)

which prints:

        SYS  Date_Time Col_1 Col_2  Col_3  Col_4 Col_5 Col_6 result
0   SYSTEM1 2021-01-07     0     0      0      0     Y     0      Y
1   SYSTEM1 2021-01-07     0     0      0      0     0     0      0
2   SYSTEM1 2021-01-07     R     0      0      0     0     0      R
3   SYSTEM1 2021-01-07     0     0      0      0     0     0      0
4   SYSTEM1 2021-01-07     0     0      0      0     0     0      0
5   SYSTEM1 2021-01-07     0     R      0      0     0     0      R
6   SYSTEM1 2021-01-07     0     0      0      0     0     0      0
7   SYSTEM1 2021-01-07     0     0      0      0     0     0      0
8   SYSTEM1 2021-01-07     0     0      0      0     0     0      0
9   SYSTEM1 2021-01-07     0     0      0      0     0     0      0
10  SYSTEM1 2021-01-07     0     0      0      0     0     0      0
11  SYSTEM1 2021-01-07     0     0      0      0     0     0      0
12  SYSTEM1 2021-01-07     0     0      0      0     0     G      G
13  SYSTEM1 2021-01-07     0     0      0      0     R     0      R
14  SYSTEM1 2021-01-07     0     0      0      0     0     0      0

There is probably a better, more Pythonic way to do this, but this is a one-liner and does the trick.
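
For readability, the one-liner can be unpacked into its three steps. This is only a sketch on a small hypothetical sample; the intermediate names colour_cols, joined and letters are illustrative and not part of the original answer, and the cast to object dtype is an added precaution so that the integer 0 can sit next to the letters:

```python
import pandas as pd

# Hypothetical sample; real data may store zero as an int or as the string '0'.
df = pd.DataFrame({
    "SYS": ["SYSTEM1"] * 3,
    "Col_1": [0, 0, "R"],
    "Col_2": [0, 0, 0],
    "Col_3": ["Y", 0, 0],
})

# Step 1: pick the columns whose name contains 'Col_'.
colour_cols = [c for c in df.columns if "Col_" in c]

# Step 2: join each row's values into a single string, e.g. "00Y".
joined = df[colour_cols].apply(lambda row: "".join(row.astype(str)), axis=1)

# Step 3: drop everything that is not a letter, which leaves "" for all-zero rows.
letters = joined.str.replace(r"[^a-zA-Z]", "", regex=True)

# Step 4: turn the empty strings into the integer 0.
df["result"] = letters.astype(object).replace("", 0)
print(df)
```

If a row ever held two colors, this approach would concatenate them (for example "RY") rather than pick one.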

You can filter the Col-like columns, change the dtype of these columns to str, and take the max along axis=1. The idea used here is that max('0', some_letter) always returns some_letter, because every letter sorts after '0' in string comparison:

# row-wise maximum over the string values of the Col columns
m = df.filter(like='Col').astype(str).max(axis=1)
df['Result'] = m.where(m.ne('0'), 0)  # replace '0' with 0

0     Y
1     0
2     R
3     0
4     0
5     R
6     0
7     0
8     0
9     0
10    0
11    0
12    G
13    R
14    0
Name: Result, dtype: object
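
The max trick relies on string ordering and on the stated condition that a row has at most one non-zero cell. A short sketch of both points, using a hypothetical row that is not part of the data above:

```python
import pandas as pd

# Strings compare by code point, and every ASCII letter sorts after the digit '0'.
largest = max("0", "R")
print(largest)      # R
print("0" < "G")    # True

# Hypothetical row with two colors, which violates the stated condition:
# max silently returns the alphabetically larger letter.
row = pd.Series(["R", "0", "Y"])
picked = row.max()
print(picked)       # Y
```

So if the data could ever contain more than one color per row, it may be worth checking for that before relying on max.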
