I have data like:
import pandas as pd
df = pd.DataFrame(data=[[1,-2,3,0,0], [0,0,0,4,0], [0,0,0,0,5]]).T
df.columns = ['col1', 'col2', 'col3']
> df
col1 col2 col3
1 0 0
-2 0 0
3 0 0
0 4 0
0 0 5
I want to create a fourth ("Col4") that takes the col that is non-zero.
So result would be:
col1 col2 col3 col4
1 0 0 1
-2 0 0 -2
3 0 0 3
0 4 0 4
0 0 5 5
EDIT: If two non-zero, always use col1
. Also, the numbers may be negative. I have updated the df
to reflect this.
Using the maximum of the columns is a possibility
df['col4'] = df.max(axis=1)
Here's an example:
def func(a):
a = set(a)
assert len(a)==2 # 0 and another number
for i in a:
if i!=0:
return i
df['col4'] = df.apply(func,axis=1)
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.