Find Result as Non Zero Value in Group of Columns
I have a DataFrame as below:
SYS Date_Time Col_1 Col_2 Col_3 Col_4 Col_5 Col_6
0 SYSTEM1 2021-01-07 09:15:00 0 0 0 0 Y 0
1 SYSTEM1 2021-01-07 09:20:00 0 0 0 0 0 0
2 SYSTEM1 2021-01-07 09:25:00 R 0 0 0 0 0
3 SYSTEM1 2021-01-07 09:30:00 0 0 0 0 0 0
4 SYSTEM1 2021-01-07 09:35:00 0 0 0 0 0 0
5 SYSTEM1 2021-01-07 09:40:00 0 R 0 0 0 0
6 SYSTEM1 2021-01-07 09:45:00 0 0 0 0 0 0
7 SYSTEM1 2021-01-07 09:50:00 0 0 0 0 0 0
8 SYSTEM1 2021-01-07 09:55:00 0 0 0 0 0 0
9 SYSTEM1 2021-01-07 10:00:00 0 0 0 0 0 0
10 SYSTEM1 2021-01-07 10:05:00 0 0 0 0 0 0
11 SYSTEM1 2021-01-07 10:10:00 0 0 0 0 0 0
12 SYSTEM1 2021-01-07 10:15:00 0 0 0 0 0 G
13 SYSTEM1 2021-01-07 10:20:00 0 0 0 0 R 0
14 SYSTEM1 2021-01-07 10:25:00 0 0 0 0 0 0
I need to find the result color in the column group (Col_1, Col_2, Col_3, Col_4, Col_5, Col_6), that is, the value in those columns that is not zero.
Two possible conditions can exist in the above DataFrame:
I want the output as below:
SYS Date_Time Col_1 Col_2 Col_3 Col_4 Col_5 Col_6 Result
0 SYSTEM1 2021-01-07 09:15:00 0 0 0 0 Y 0 Y
1 SYSTEM1 2021-01-07 09:20:00 0 0 0 0 0 0 0
2 SYSTEM1 2021-01-07 09:25:00 R 0 0 0 0 0 R
3 SYSTEM1 2021-01-07 09:30:00 0 0 0 0 0 0 0
4 SYSTEM1 2021-01-07 09:35:00 0 0 0 0 0 0 0
5 SYSTEM1 2021-01-07 09:40:00 0 R 0 0 0 0 R
6 SYSTEM1 2021-01-07 09:45:00 0 0 0 0 0 0 0
7 SYSTEM1 2021-01-07 09:50:00 0 0 0 0 0 0 0
8 SYSTEM1 2021-01-07 09:55:00 0 0 0 0 0 0 0
9 SYSTEM1 2021-01-07 10:00:00 0 0 0 0 0 0 0
10 SYSTEM1 2021-01-07 10:05:00 0 0 0 0 0 0 0
11 SYSTEM1 2021-01-07 10:10:00 0 0 0 0 0 0 0
12 SYSTEM1 2021-01-07 10:15:00 0 0 0 0 0 G G
13 SYSTEM1 2021-01-07 10:20:00 0 0 0 0 R 0 R
14 SYSTEM1 2021-01-07 10:25:00 0 0 0 0 0 0 0
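For anyone who wants to try the answers below, the sample frame can be rebuilt with a sketch along these lines. It assumes the zero cells are the integer 0 and the colors are single-letter strings; the `colored` helper dictionary is only an illustration and is not part of the question.

```python
import pandas as pd

# Rows that carry a color, keyed by row position: (column, color).
# All other cells are assumed to be the integer 0.
colored = {0: ("Col_5", "Y"), 2: ("Col_1", "R"), 5: ("Col_2", "R"),
           12: ("Col_6", "G"), 13: ("Col_5", "R")}

cols = [f"Col_{i}" for i in range(1, 7)]
# Object dtype lets integers and strings share a column.
df = pd.DataFrame(0, index=range(15), columns=cols).astype(object)
for row, (col, color) in colored.items():
    df.loc[row, col] = color

# Timestamps every 5 minutes, from 09:15 to 10:25.
df.insert(0, "Date_Time", pd.date_range("2021-01-07 09:15", periods=15, freq="5min"))
df.insert(0, "SYS", "SYSTEM1")
print(df)
```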
You can get it using the following code:
1. join all the columns that have 'Col_' in them, using a lambda function in apply
2. replace every non-letter character with '', which keeps only the letters (regex=True is needed on pandas 2.0 and later, where it is no longer the default)
3. replace '' with 0 to get exactly your output.
df['result'] = df[[c for c in df.columns if 'Col_' in c]].apply(lambda row: ''.join(row.values.astype(str)), axis=1).str.replace('[^a-zA-Z]', '', regex=True).replace('', 0)
which prints:
SYS Date_Time Col_1 Col_2 Col_3 Col_4 Col_5 Col_6 result
0 SYSTEM1 2021-01-07 0 0 0 0 Y 0 Y
1 SYSTEM1 2021-01-07 0 0 0 0 0 0 0
2 SYSTEM1 2021-01-07 R 0 0 0 0 0 R
3 SYSTEM1 2021-01-07 0 0 0 0 0 0 0
4 SYSTEM1 2021-01-07 0 0 0 0 0 0 0
5 SYSTEM1 2021-01-07 0 R 0 0 0 0 R
6 SYSTEM1 2021-01-07 0 0 0 0 0 0 0
7 SYSTEM1 2021-01-07 0 0 0 0 0 0 0
8 SYSTEM1 2021-01-07 0 0 0 0 0 0 0
9 SYSTEM1 2021-01-07 0 0 0 0 0 0 0
10 SYSTEM1 2021-01-07 0 0 0 0 0 0 0
11 SYSTEM1 2021-01-07 0 0 0 0 0 0 0
12 SYSTEM1 2021-01-07 0 0 0 0 0 G G
13 SYSTEM1 2021-01-07 0 0 0 0 R 0 R
14 SYSTEM1 2021-01-07 0 0 0 0 0 0 0
There is probably a better, more Pythonic way to do this, but this is a one-liner and does the trick.
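The one-liner is easier to follow when it is split into its three steps. The following is only a sketch on a small, made-up frame (the column values are hypothetical, chosen to mirror the question's layout), and the intermediate names `joined` and `letters` are introduced here for illustration.

```python
import pandas as pd

# Small hypothetical frame with the same layout as the question's data.
df = pd.DataFrame({
    "SYS": ["SYSTEM1"] * 4,
    "Col_1": [0, 0, "R", 0],
    "Col_2": [0, 0, 0, 0],
    "Col_3": ["Y", 0, 0, "G"],
})

color_cols = [c for c in df.columns if "Col_" in c]

# Step 1: join each row's cells into one string, e.g. "00Y".
joined = df[color_cols].astype(str).apply("".join, axis=1)

# Step 2: drop everything that is not a letter.
letters = joined.str.replace("[^a-zA-Z]", "", regex=True)

# Step 3: rows without a color are now empty strings; turn them into 0.
# Casting to object first lets strings and the integer 0 share the column.
df["result"] = letters.astype(object).replace("", 0)
print(df)
```

Note that if a row ever held more than one color, this approach would concatenate them (for example "RG") rather than pick one.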
You can filter the Col-like columns, then change the dtype of these columns to str and take the max along axis=1. The idea used here is that when you take max('0', some_alphabet), the returned max value will always be some_alphabet:
m = df.filter(like='Col').astype(str).max(axis=1)  # row-wise max of the string values
df['Result'] = m.where(m.ne('0'), 0)  # replace '0' with 0
0 Y
1 0
2 R
3 0
4 0
5 R
6 0
7 0
8 0
9 0
10 0
11 0
12 G
13 R
14 0
Name: Result, dtype: object
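The string-comparison idea behind this answer can be checked on its own. The sketch below uses a small, made-up frame (the values are hypothetical) and relies on the fact that the character '0' sorts before every ASCII letter.

```python
import pandas as pd

# Plain Python check: "0" compares lower than any letter.
assert max("0", "R") == "R"

# Small hypothetical frame with the same layout as the question's data.
df = pd.DataFrame({
    "Col_1": [0, 0, "R", 0],
    "Col_2": [0, 0, 0, 0],
    "Col_3": ["Y", 0, 0, "G"],
})

# After astype(str) every cell is a string, so the row-wise max is the
# color when one is present and "0" otherwise.
m = df.filter(like="Col").astype(str).max(axis=1)

# Restore the integer 0 for rows without a color; object dtype lets
# strings and integers share the column.
df["Result"] = m.astype(object).where(m.ne("0"), 0)
print(df)
```

If a row held two different colors, the max would return whichever letter sorts last, so this approach fits best when each row has at most one color.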